Document the minimal resources needed to run tests for DataFusion #9398

Closed
SteveLauC opened this issue Feb 29, 2024 · 5 comments · Fixed by #9402
Labels
enhancement New feature or request

Comments

@SteveLauC
Contributor

SteveLauC commented Feb 29, 2024

Is your feature request related to a problem or challenge?

As a contributor, I would like to run more tests locally to reduce CI failures. I tried running cargo test at the project root, but the tests consumed all my RAM and the kernel got killed, forcing me to restart my PC.

I have 32 GB of memory; it looks like I need at least 64 GB or 128 GB :D

Describe the solution you'd like

Document the minimum hardware spec needed to run the DataFusion tests in the contributors' guide.

We should also document which tests are resource-heavy so that contributors can skip them locally.

Describe alternatives you've considered

No

Additional context

No response

SteveLauC added the enhancement (New feature or request) label Feb 29, 2024

@Jefffrey
Contributor

I think the doctests are the main culprit, see #5347

Agree on documenting this as part of contributor/developer guide 👍
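
If the doctests are indeed the main memory consumer, one way to keep them out of a local run is to select test targets explicitly, since Cargo only runs doctests by default or when --doc is passed. The following is a minimal sketch, not something taken from the DataFusion guide: the function names are hypothetical, and whether skipping doctests is enough to fit in a given amount of RAM is an assumption that would need checking on your machine.

```shell
# Hypothetical helper: run library, binary, and integration tests only.
# Passing explicit target flags means Cargo does not run doctests.
test_without_doctests() {
    cargo test --lib --bins --tests "$@"
}

# Hypothetical helper: run only the doctests, e.g. as a separate later step.
test_doctests_only() {
    cargo test --doc "$@"
}
```

Extra arguments are passed through to cargo test, so a call such as test_without_doctests -- --test-threads=1 combines both approaches.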

@devinjdangelo
Contributor

Thank you for bringing this up, @SteveLauC. I had the same experience when I first started working on DataFusion. You can work around memory limitations by running cargo test -- --test-threads=1, which runs only a single test at a time. It is slower but consumes substantially less memory. I think it would be a good idea to document this workaround, since many new contributors won't have enough RAM to run all tests at maximum parallelism on their systems.

The precise memory requirements to run cargo test will vary over time and depend on your exact development setup. Running in WSL on Windows, for example, is a bit more demanding, as you need to reserve some memory for Windows. I personally upgraded from 32 GB to 64 GB, and that was plenty running natively on Linux.
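
Because the right thread count depends on the machine, it can help to make it a parameter rather than always using 1. The sketch below wraps the command from this comment in a small shell function; the function name and its argument convention are hypothetical, and no particular thread count is known to be safe for a given amount of RAM.

```shell
# Hypothetical helper: run tests with a caller-chosen number of test threads.
# Usage: test_with_threads [THREADS] [extra cargo test args...]
# THREADS defaults to 1, the most memory-friendly (and slowest) setting.
test_with_threads() {
    threads="${1:-1}"
    if [ "$#" -gt 0 ]; then
        shift
    fi
    # Everything after "--" goes to the test harness, not to Cargo itself.
    cargo test "$@" -- --test-threads="$threads"
}
```

Calling test_with_threads with no arguments is equivalent to cargo test -- --test-threads=1; a middle value such as test_with_threads 4 trades some memory for speed.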

@SteveLauC
Contributor Author

cargo test -- --test-threads=1

Thanks for showing me this! This indeed makes the memory usage controllable:)

@devinjdangelo
Contributor

Glad that worked for you! I opened #9402 to document this in the contributors guide.

There does not seem to be a way to configure Cargo to default to --test-threads 1 other than setting environment variables, so documentation may be our best bet for now... see rust-lang/cargo#8430
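
The comment does not name the environment variable; presumably it is RUST_TEST_THREADS, which Rust's built-in test harness reads when --test-threads is not given. A minimal sketch of using it for a single invocation follows; the function name is hypothetical.

```shell
# Hypothetical helper: limit the test harness to one thread via the
# RUST_TEST_THREADS environment variable instead of a command-line flag.
# The assignment is placed on the cargo invocation itself.
run_tests_low_mem() {
    RUST_TEST_THREADS=1 cargo test "$@"
}
```

To make this the default for every run in a shell session, export RUST_TEST_THREADS=1 (for example from a shell profile) has the same effect without a wrapper.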

@Omega359
Contributor

I had my computer upgraded to 64 GB of RAM. I used the workaround mentioned above as well, but the time it took to run the tests that way was horrendous. I'm using Linux under WSL2 on Windows here.

4 participants