-
Notifications
You must be signed in to change notification settings - Fork 406
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: address some integration test bloat of disk usage for development #1552
Conversation
ACTION NEEDED delta-rs follows the Conventional Commits specification for release automation. The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification. |
Unfortunately `cargo +nightly udeps --tests --no-default-features --features integration_test,datafusion,s3-native-tls` did not produce more unused dependencies which means there is fewer cruft to chop out of these dependency trees at the moment. Related to delta-io#1550 Sponsored-by: Databricks In
Since each of these unit tests are quite large, removing just two of them would result in a 1GB+ reduction in disk space utilization. This pull request removes a few of the sets of tests which rely on tests/data/ but do not require any non-default features. Sponsored-by: Databricks Inc
The previous use of "1" would still result in significant file sizes. When building integration tests locally a setting of "1" leads to a total target/ directory size of 3.1GB, whereas "line-tables-only" results in a total clean build target/ directory size of 2.1GB. The "line-tables-only" feature provides: > Generates the minimal amount of debug info for backtraces with filename/line > number info, but not anything else, i.e. no variable or function parameter info. And was introduced in 1.70: <https://blog.rust-lang.org/2023/06/01/Rust-1.70.0.html#named-levels-of-debug-information> Setting this to "0" does reduce the size further, but for panics and stacktraces in CI, we need some level of information to be present in order for the backtraces to be useful. Sponsored-by: Databricks Inc
01df558
to
20da846
Compare
Thanks for this PR @rtyler. As it has optimized not just on disk usage, but also significantly reduced RAM consumption and total test runtime. Peace for local development especially for newcomers to this project (like myself). My test run comparisons as below for reference:
Below is the new test command that I've constructed from this PR's CI yaml, for reference: RUSTFLAGS="-C debuginfo=line-tables-only" CARGO_INCREMENTAL=0 RUST_BACKTRACE=1 CARGO_NET_GIT_FETCH_WITH_CLI="true" cargo test -p deltalake --no-default-features --features integration_test,s3-native-tls,datafusion |
… project Based on [this comment](delta-io#1552 (comment)) I believe this will be a valuable setting for making local development cycles faster and easier for contributors to run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for doing this. LGTM!
Description
This change attempts to further reduce our build/link time by changing
RUSTFLAGS
for CI and consolidating some integration tests into singular files, while moving more unit-style tests (those which just rely ontests/data/
) into unit tests.The previous use of "debuginfo=1" would still result in significant file sizes. When building integration tests locally a setting of "1" leads to a total target/ directory size of 3.1GB, whereas "line-tables-only" results in a total clean build target/ directory size of 2.1GB.
The "line-tables-only" feature provides:
Setting this to "0" does reduce the size further, but for panics and stacktraces in CI, we need some level of information to be present in order for the backtraces to be useful.
Related Issue(s)
Closes #1550
Documentation
The
line-tables-only
value fordebuginfo
was introduced in Rust 1.70