
What is the expected throughput? #7

Closed
makorne opened this issue Aug 20, 2021 · 15 comments

makorne commented Aug 20, 2021

Hi!
Thank you for your great crate!

I am testing sqlxmq_stress and I don't see high load on any of the cores.

My results:

num_jobs = 1000; set_concurrency(50, 1000)

min: 8.296434179s
max: 9.840498547s
median: 8.851534467s
95th percentile: 9.600073887s
throughput: 99.19286908241159/s

num_jobs = 10000; set_concurrency(50, 1000)

It took more than 2 hours and was still running on a Ryzen 5900HX / SSD.

Could it have hung?
How can such situations be prevented, and what is the expected throughput on recent hardware?


makorne commented Aug 20, 2021

I did this test several times with num_jobs = 3000; set_concurrency(50, 100)

One run gave this result:

min: 3.804157088s
max: 40.62486232s
median: 34.72999443s
95th percentile: 39.781787279s
throughput: 72.50754711855541/s

Another run gave this result:

min: 8.497855039s
max: 23.348883659s
median: 18.169892469s
95th percentile: 22.225424099s
throughput: 127.19332590396806/s

In the other 4 runs, the jobs in the mq_msgs table finished, but the program kept running endlessly.

@makorne makorne closed this as completed Aug 29, 2021

Diggsey commented Aug 29, 2021

Hi @makorne, sorry I didn't get around to investigating this earlier - I would like to figure out what the problem is though.

@Diggsey Diggsey reopened this Aug 29, 2021

Diggsey commented Aug 29, 2021

I can't seem to reproduce it. When I try with the parameters that caused it to hang for you (num_jobs = 10000, concurrency = [50, 1000]) I get these results:

min: 0.0021288s
max: 0.3401738s
median: 0.0037295s
95th percentile: 0.1190739s
throughput: 1427.4979093579368/s

Did you ever figure out what caused this?


imbolc commented Sep 18, 2021

I've tried it too, with different concurrency settings up to (5000, 10000), but I couldn't reproduce it on my laptop.


imbolc commented Sep 18, 2021

Though after a couple of runs with (num_jobs = 100000, concurrency = [5000, 10000]) the process hangs with no activity and empty mq_payloads and mq_msgs tables.


imbolc commented Sep 18, 2021

I've tried to narrow down the bug and got these numbers:

  180s total: 100000, started:  61975, got json:  61975, completed:  55702, sent:  55702, payloads:  13924, msgs:  13924
  185s total: 100000, started:  66977, got json:  66977, completed:  60204, sent:  60204, payloads:   9670, msgs:   9670
  190s total: 100000, started:  71977, got json:  71977, completed:  64772, sent:  64772, payloads:   5102, msgs:   5102
  195s total: 100000, started:  76099, got json:  76099, completed:  70456, sent:  70456, payloads:    399, msgs:    399
  200s total: 100000, started:  76099, got json:  76099, completed:  72049, sent:  72049, payloads:      0, msgs:      0
  205s total: 100000, started:  76099, got json:  76099, completed:  72049, sent:  72049, payloads:      0, msgs:      0
  hanging ...

Here's the code I used: imbolc@3399b41


imbolc commented Sep 20, 2021

Got it: at some point, sqlxmq_stress::start_job results in a PoolTimedOut error. So it hangs because some tasks just aren't scheduled.


Diggsey commented Sep 20, 2021

Ah, nice find! We should probably just abort if sending fails.


imbolc commented Sep 20, 2021

Sure, but I'm new to async and couldn't find a way to pass the error back from a task without sacrificing performance.


Diggsey commented Sep 20, 2021

I've addressed this in 0.3.0.

@Diggsey Diggsey closed this as completed Sep 20, 2021
sbeckeriv commented

@Diggsey Sorry to comment on a closed issue. I am not seeing the expected throughput on the stress test either. My system specs are at the bottom. I am running Postgres 12.8 installed via the tool asdf, if that matters.

With or without --release I get about the same results. I even tried editing main to use [50, 1000].

min: 0.075423397s
max: 37.357557689s
median: 29.706058712s
95th percentile: 34.275748728s
throughput: 266.1852449625013/s

I know benchmarks depend on a lot of things and are really good for relative changes. I am wondering if there is anything you can think of that would cause the large difference?

Thanks for your hard work! I am excited about the project.
Becker

System info (neofetch output, ASCII art omitted):

becker
OS: Ubuntu 21.04 x86_64
Host: HP Z2 Tower G5 Workstation
Kernel: 5.11.0-41-generic
Uptime: 45 days, 22 hours, 12 mins
Packages: 1957 (dpkg), 9 (snap)
Shell: bash 5.1.4
Resolution: 3840x2160
WM: Mutter
WM Theme: Adwaita
Theme: Yaru [GTK3]
Icons: Adwaita [GTK3]
Terminal: /dev/pts/2
CPU: Intel i9-10900K (20) @ 5.300GHz
GPU: Intel CometLake-S GT2 [UHD Graphics 630]
Memory: 6744MiB / 31882MiB


Diggsey commented Feb 5, 2022

@sbeckeriv I'm not sure, to be honest; this queue is not really designed for high throughput, but I do see much higher throughput than you're getting with much worse system specs. You are using an SSD, right?

sbeckeriv commented

@Diggsey Yes: a Samsung PM981a NVMe 1024GB (15302129), Ext4 filesystem with full-disk encryption.

I know it doesn't have the symbols; I am working on that. It looks like there is a long pause for some reason.
[flamegraph image]

I will keep digging and let you know what I find.


sbeckeriv commented Feb 17, 2022

Hello again,

I got a flame graph to report things, but I don't know what to make of it. Maybe something in it will spark an idea for you. Thanks again for your work on this.

[flamegraph image]

https://gist.githubusercontent.com/sbeckeriv/8b97f44a88364afdd1ba0d2b87f9527e/raw/bd706be7105e05f15eb4f1d91541e0aeadbd099d/flame.svg

GitHub does something funky with the SVG file; I can zoom in on it locally. The gist file at least supports hover.
Becker


makorne commented Apr 18, 2022

I've tried to locate the bug somehow and got this numbers:

  180s total: 100000, started:  61975, got json:  61975, completed:  55702, sent:  55702, payloads:  13924, msgs:  13924
  185s total: 100000, started:  66977, got json:  66977, completed:  60204, sent:  60204, payloads:   9670, msgs:   9670
  190s total: 100000, started:  71977, got json:  71977, completed:  64772, sent:  64772, payloads:   5102, msgs:   5102
  195s total: 100000, started:  76099, got json:  76099, completed:  70456, sent:  70456, payloads:    399, msgs:    399
  200s total: 100000, started:  76099, got json:  76099, completed:  72049, sent:  72049, payloads:      0, msgs:      0
  205s total: 100000, started:  76099, got json:  76099, completed:  72049, sent:  72049, payloads:      0, msgs:      0
  hanging ...

Here's the code I used: imbolc@3399b41

I tried your code on PostgreSQL 14 and the latest sqlxmq.
It hangs too, on a Ryzen 5900HX / NVMe SSD.
It looks like the bug still exists.

const MIN_CONCURRENCY: usize = 50;
const MAX_CONCURRENCY: usize = 1000;

32556s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32561s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32566s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32571s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32576s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32581s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32587s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32592s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32597s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32602s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0
32607s total: 100000, started:  82306, got json:  82306, completed:  81357, sent:  81357, payloads:      0, msgs:      0

const MIN_CONCURRENCY: usize = 50;
const MAX_CONCURRENCY: usize = 100;

  591s total: 100000, started:  81555, got json:  81555, completed:  81555, sent:  81555, payloads:      0, msgs:      0
  596s total: 100000, started:  81555, got json:  81555, completed:  81555, sent:  81555, payloads:      0, msgs:      0
  601s total: 100000, started:  81555, got json:  81555, completed:  81555, sent:  81555, payloads:      0, msgs:      0
  606s total: 100000, started:  81555, got json:  81555, completed:  81555, sent:  81555, payloads:      0, msgs:      0
  611s total: 100000, started:  81555, got json:  81555, completed:  81555, sent:  81555, payloads:      0, msgs:      0
  616s total: 100000, started:  81555, got json:  81555, completed:  81555, sent:  81555, payloads:      0, msgs:      0
  621s total: 100000, started:  81555, got json:  81555, completed:  81555, sent:  81555, payloads:      0, msgs:      0
  626s total: 100000, started:  81555, got json:  81555, completed:  81555, sent:  81555, payloads:      0, msgs:      0
  631s total: 100000, started:  81555, got json:  81555, completed:  81555, sent:  81555, payloads:      0, msgs:      0
