1184.feature - Add 'singlecollect' distribution mode #1184
base: master
This adds a new 'singlecollect' distribution mode that only collects tests on the first worker node and skips redundant collection on other nodes. This can significantly improve startup time for large test suites with expensive collection.

Key features:
- Only the first worker performs test collection
- Other workers skip collection verification entirely
- Tests are distributed using the same algorithm as 'load' mode
- Handles worker failures gracefully, including the collecting worker
- Solves issues with floating parameters in pytest collection
96175cb to 5808348
for more information, see https://pre-commit.ci
You say this skips redundant collection on other nodes, but I don't think that's right -- how can a node execute a test (Item) it hasn't collected? Can you be more precise about what is being saved here? Is it just the |
I'm under the impression we should name this differently, as it's something that applies to most schedulers
@bluetech For example:
Worker 1 might collect: test_example[1], test_example[3], test_example[2]...
This forces developers to either avoid using unordered collections in parametrization or manually convert them to ordered sequences. The workers still need to have the test items to execute them. What's happening in the singlecollect mode is:
The key part in the code is:
Other nodes receive tests from the master node during test distribution. @RonnyPfannschmidt |
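To make the ordering problem described above concrete, here is a minimal sketch in plain Python. It does not use pytest or xdist; `collected_ids` is a hypothetical stand-in for collection, and the second worker's order is an assumption chosen for illustration. It shows two workers producing the same test ids in different orders, and how sorting the parameters first makes the sequences agree.

```python
def collected_ids(params):
    """Build test ids in the iteration order of `params`.

    Hypothetical stand-in for pytest collecting a parametrized test.
    """
    return [f"test_example[{p}]" for p in params]


# Same parameter values, iterated in different orders on two workers.
# Worker 1's order follows the example above; worker 2's is assumed.
worker_1 = collected_ids([1, 3, 2])
worker_2 = collected_ids([2, 1, 3])

# Same set of tests, but the collected sequences differ.
assert sorted(worker_1) == sorted(worker_2)
assert worker_1 != worker_2

# Converting to an ordered sequence before parametrizing gives
# every worker the same collection order.
stable_1 = collected_ids(sorted([1, 3, 2]))
stable_2 = collected_ids(sorted([2, 1, 3]))
assert stable_1 == stable_2
```

In a real suite the equivalent workaround would be passing `sorted(values)` rather than a set to the parametrization.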
If the only difference here is ignoring a different test order on other nodes, then it is a completely unacceptable no-go, as schedulers currently talk in terms of indexes into the collection |
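The concern about indexes can be illustrated with a small sketch. This is not xdist's scheduler code; `run_by_index` and both collections are hypothetical, and the only assumption taken from the discussion is that work is handed out as positions in the collected list. If two workers hold the same tests in different orders, the same index refers to different tests.

```python
def run_by_index(collection, indexes):
    """Return the test ids a worker would run for the given indexes."""
    return [collection[i] for i in indexes]


# Collection order on the collecting worker (also the controller's view).
worker_a = ["test_example[1]", "test_example[3]", "test_example[2]"]
# A second worker that collected the same tests in another order (assumed).
worker_b = ["test_example[2]", "test_example[1]", "test_example[3]"]

# The controller splits the three indexes between the two workers.
ran_on_a = run_by_index(worker_a, [0, 1])
ran_on_b = run_by_index(worker_b, [2])

executed = ran_on_a + ran_on_b
never_run = set(worker_a) - set(executed)

# test_example[3] runs twice and test_example[2] is never executed.
assert executed.count("test_example[3]") == 2
assert never_run == {"test_example[2]"}
```

Skipping the order check is therefore only safe if work is addressed by something other than a bare index, or if the orders are guaranteed to match.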
I understand your concern about schedulers using collection indexes. |
There are a number of edge cases to be aware of
Come to mind off hand |