Polars build times #847
Comments
Me too! My two cents: to mitigate compile times, I am actively introducing feature gates, both for the data types that should be compiled and for the operations that should be compiled.
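For readers unfamiliar with the mechanism, here is a minimal sketch of what feature gating an operation can look like in Rust. The feature name `pivot` and the functions below are hypothetical and only illustrate the idea; they are not Polars' actual feature list or API. The sketch assumes the crate's Cargo.toml declares the feature under `[features]` (for example `pivot = []`) and that it is switched on with `cargo build --features pivot`.

```rust
// Hypothetical example: the feature name "pivot" and these functions are
// illustrative only, not taken from Polars.

/// Always compiled: part of the core build.
pub fn sum_i64(values: &[i64]) -> i64 {
    values.iter().sum()
}

/// Only compiled when the crate is built with `--features pivot`.
/// Without the feature, this function (and everything only it pulls in)
/// is skipped by the compiler, which is where the build time is saved.
#[cfg(feature = "pivot")]
pub fn pivot(values: &[i64]) -> Vec<Vec<i64>> {
    values.iter().map(|v| vec![*v]).collect()
}

/// Reports which operations were compiled into this build.
pub fn compiled_ops() -> Vec<&'static str> {
    let mut ops = vec!["sum_i64"];
    // `cfg!` evaluates to a compile-time boolean for the given feature.
    if cfg!(feature = "pivot") {
        ops.push("pivot");
    }
    ops
}

fn main() {
    println!("sum = {}", sum_i64(&[1, 2, 3]));
    println!("compiled operations: {:?}", compiled_ops());
}
```

The trade-off is that downstream users have to opt in to each gated data type or operation they need, in exchange for not paying the compile cost of the ones they do not use.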
I am curious how this differs so much between machines. I can compile [...] Anyway, I will try to do a bloat scan in the coming weeks and see if I can find some low-hanging fruit.
For context, the mentioned compile times are with https://github.com/nushell/nushell/blob/55cab9eb4ff4ee3ee73efc6f8973901b2a91c921/Cargo.toml#L162
The dataframe feature depends on polars, which consumes a lot of memory and time to build. Enabling both LTO and the dataframe feature may cause the CI build to fail by running out of memory. pola-rs/polars#847 Signed-off-by: nibon7 <nibon7@163.com>
Build times improved recently due to compiler improvements in Rust nightly. Compile time for optimized Polars (Python) went down from 30 minutes to 15 minutes.
Yeah, we can close this. It is something we constantly evaluate.
Are you using Python or Rust?
Rust
Which feature gates did you use?
Default and also ["serde", "rows"]
What version of polars are you using?
git = "https://github.com/pola-rs/polars"
rev = "f60d86bc0921bd42635e8a33e7aad28ebe62dc3e"
version = "0.14.2"
What operating system are you using polars on?
Linux
Describe your bug.
Build times, both standalone and as part of Nushell, are quite high. I did a cargo bloat run of a default build of polars: https://gist.github.com/jonathandturner/82cb8304996cc6c9ebc912b193eacce7
When we enable dataframe support in Nushell, which is built on polars and arrow, the build time for the release build increases 3x. On my machine, it goes from 10 minutes pre-dataframe to 30 minutes with dataframe.
I'm hoping we can work together to figure out how we can improve build times.
What are the steps to reproduce the behavior?
In polars:
cargo bloat
In Nushell:
cargo build --release --all --features=extra,dataframe
What is the expected behavior?
While we know that polars will add some build time, we're seeing large amounts of memory usage during builds and hoping we can work together to lower the memory usage. In theory, this should cause less memory thrashing during build, yielding faster build times.
cc @elferherrera