Skip to content

Conversation

@deependujha
Copy link
Collaborator

Before submitting
  • Was this discussed/agreed via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Fixes # (issue).

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@deependujha deependujha changed the title ci: xdist increase number of processes [experiment] ci: xdist increase number of processes Aug 8, 2025
@codecov
Copy link

codecov bot commented Aug 8, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84%. Comparing base (a543725) to head (17772f5).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@         Coverage Diff         @@
##           main   #681   +/-   ##
===================================
  Coverage    84%    84%           
===================================
  Files        52     52           
  Lines      7021   7021           
===================================
+ Hits       5900   5902    +2     
+ Misses     1121   1119    -2     
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@deependujha
Copy link
Collaborator Author

deependujha commented Aug 8, 2025

Result with -n $CORES

action: https://github.com/Lightning-AI/litData/actions/runs/16823796244/job/47655723453?pr=681

ignoring failing tests on windows for now.

  • mac (3 cpu in CI)

completes 5-8 min early

OS & Python Version main: -n 2 PR: -n 3
macos-14: py: 3.09 26m 02s 19m 19s
macos-14: py: 3.10 29m 33s 23m 46s
macos-14: py: 3.11 28m 51s 22m 24s

  • ubuntu (4 cpu in CI)

very similar, no significant difference

OS & Python Version main: -n 2 PR: -n 3
ubuntu-22.04: py: 3.09 27m 50s 27m 53s
ubuntu-22.04: py: 3.10 26m 48s 26m 38s
ubuntu-22.04: py: 3.11 27m 16s 27m 19s
ubuntu-22.04: py: 3.12 31m 26s 33m 23s
ubuntu-22.04: py: 3.13 26m 34s 26m 53s

my bad, earlier I said, github ci has 2 cpus, and my logic was:

CORES=$(python -c "import os; print(max(1, os.cpu_count() // 2))")
echo "Using $CORES cores for pytest"

I forgot about /2 and said github ci has 2 cpus. :)

  • ubuntu & windows have 4 cpus (cores)
  • mac: 3 cpus

cc: @Borda

@Borda Borda requested a review from bhimrazy August 8, 2025 11:37
@Borda
Copy link
Collaborator

Borda commented Aug 8, 2025

btw, seems to be a bit flaky on Windows:

>               raise AssertionError(f"Test left zombie thread: {thread}")
E               AssertionError: Test left zombie thread: <Timer(pytest_timeout tests/streaming/test_dataset.py::test_streaming_dataset_distributed_full_shuffle_even[zstd-True], started 1296)>

@bhimrazy
Copy link
Collaborator

Changing to n=3 also doesn’t seem to help much at the moment.

Next, I think we might need to look into the ignored tests for parallel execution and also improve performance at the test level, especially based on the latest top 100 test timing results. On Windows, some tests take over 200 seconds, and other runtimes have several tests above 60 seconds.

@Borda
Copy link
Collaborator

Borda commented Aug 12, 2025

Changing to n=3 also doesn’t seem to help much at the moment.

so let's close this experiment

@Borda Borda closed this Aug 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants