
Add partition count config and improved defaults (#720) #722

Conversation

chriscrabtree

The user can explicitly set the number of chromedriver partitions now.

But even if the user does not set this config, Wallaby will make an improved choice as to how many partitions it fires up.

This will lead to better performance and fewer false negative test results on machines with high CPU-core counts.

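For readers who want to pin the partition count explicitly, a config sketch along these lines should work. The key names below are an assumption on my part, not confirmed by this thread; check `lib/wallaby/chrome.ex` in the merged PR for the actual option names.

```elixir
# config/test.exs -- illustrative sketch only; the key Wallaby actually
# reads may be named differently (see lib/wallaby/chrome.ex).
import Config

config :wallaby,
  chromedriver: [
    # Explicit chromedriver partition count. When omitted, Wallaby now
    # picks an improved default rather than using every available core.
    partitions: 4
  ]
```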
Member

@mhanberg mhanberg left a comment


I left some comments on this PR months ago, but never clicked "Submit review"...

Sorry about that.

Review comments (now resolved) on:
lib/wallaby/chrome.ex
lib/wallaby/chrome/default_partition_count.ex
Without this change, running mix test on Wallaby itself on a 56-core
machine produces the same undesirable side effects we are trying to
remedy with this PR.

Specifically, all the CPU cores fire up, tests complete slowly (at
least 45 seconds for a single test), and many of them error.

We believe this is due to something related to how chromedriver
operates, but the workaround of limiting concurrency to 10 processes
works well enough for us to defer going down that rabbit hole.
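The 10-process cap described in the commit message boils down to a one-line heuristic; `partitions` here is just a local name for illustration:

```elixir
# Use one partition per online scheduler (roughly, per CPU core),
# but never more than 10, since chromedriver misbehaves beyond that.
partitions = min(System.schedulers_online(), 10)
```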
@chriscrabtree
Author

Understood, and done. I did find that I had to implement the same min(System.schedulers_online(), 10) idea in a few different places so that running mix test on Wallaby itself on a high-core-count machine would run smoothly.

While I felt a strong urge to factor that out into a meaningfully named function, lol, I went with the direct approach in accordance with the earlier guidance.

@mhanberg
Member

@chriscrabtree I pushed up some changes, can you confirm that it works alright with your computer with tons of cores, and I will merge and release this.

@chriscrabtree
Author

@mhanberg Hey, I appreciate you thinking about the possible consequences--good catch!

Indeed, it seems that at least one of the three configurations removed is in fact necessary for a smooth test run on the 56-core machine.

This is the recording of the run after I merged the latest commits. What this recording doesn't show is spiking all the cores while it slowly grinds through the test suite with multiple failures.

And this is the recording after I reapplied the essential 10-max logic in those three same places, which is both 100% successful and ~4x faster.

I can play with identifying the critical set of changes only to allow it to run smoothly.

And just so I'm clear: we no longer want to have concurrency as a configurable option?

@chriscrabtree
Author

@mhanberg Okay, I determined that adding the 10-max governor on PartitionSupervisor is the one truly necessary change. Just pushed a commit reflecting that single thing which allows a smooth mix test experience.

It's possible that as the test suite grows over time, we might need to adjust ExUnit :max_cases and the :wallaby_pool size as well, but after multiple runs with the current suite, I noted no failures. So we can cross that bridge if and when we ever come to it.
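If that adjustment ever becomes necessary, `:max_cases` is a real ExUnit option that could take the same cap in `test_helper.exs`. Pairing it with the 10-cap is my sketch of the idea, not something this PR changes:

```elixir
# test_helper.exs -- sketch: apply the same 10-cap to ExUnit's own
# concurrency limit (:max_cases defaults to System.schedulers_online() * 2).
ExUnit.start(max_cases: min(System.schedulers_online(), 10))
```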

@chriscrabtree
Author

Realizing now those asciinema recordings may have been private by default. I've now set them public.

I had missed that the `:partitions` option was changed to `:concurrency`
when defining the partition supervisor. This was not doing anything, which
is why making the change in PartitionSupervisor was necessary.

Now that the supervisor is defined with the correct option, the change
in the (vendored) PartitionSupervisor module is not necessary.
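For context, Elixir's standard `PartitionSupervisor` accepts `:partitions` (not `:concurrency`); per the commit message, the vendored module simply didn't pick up the unrecognized key. A minimal sketch of a correctly defined supervisor, with illustrative module names:

```elixir
# With the :concurrency key, the partition count was never set, so the
# supervisor fell back to its default (System.schedulers_online()).
# The correct option name is :partitions.
children = [
  {PartitionSupervisor,
   child_spec: Task.Supervisor,
   name: MyApp.TaskSupervisors,  # illustrative name
   partitions: min(System.schedulers_online(), 10)}
]

Supervisor.start_link(children, strategy: :one_for_one)
```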
@mhanberg
Member

mhanberg commented Apr 2, 2023

@chriscrabtree I had missed something when I pushed up my original patches while simplifying the solution. You can see the commit message for more details.

Can you pull this down again and just re-confirm that it still works? (It should)

@chriscrabtree
Author

@mhanberg Confirmed!

@mhanberg mhanberg closed this in 831fa2c Apr 3, 2023
@mhanberg
Member

mhanberg commented Apr 3, 2023

This has been released with v0.30.3
