Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

duckdb cache miss #2870

Closed
max-sixty opened this issue Jun 20, 2023 · 7 comments
Closed

duckdb cache miss #2870

max-sixty opened this issue Jun 20, 2023 · 7 comments

Comments

@max-sixty
Copy link
Member

What's up?

I spent some time trying to work out what was going on. I have a hypothesis — that ~/.cargo/registry/src is required for duckdb, but no other crates, and the cache action removes it.

No need to read the full description below unless you're particularly interested; I'll post a reference to this on the rust cache repo.


Our build now takes 2-3x longer, because duckdb-0.8.0 needs to be re-compiled, even when the build has successfully cached.

I thought that the cache action might be deleting important files. So I tried reproducing the deletions it makes. It's a JS app, so I couldn't easily run it locally, so instead I ran a debug build of the job at https://github.com/PRQL/prql/actions/runs/5316798564/jobs/9627453030, and downloaded the file as deletions.log.

I then ran this to delete the equivalent paths on my system.

CARGO_INCREMENTAL=0 RUSTFLAGS="-C debuginfo=0" CARGO_TERM_COLOR=always cargo build --all-targets --target=(default-target)

# Two main groups of paths — in the specific target path, and the general `debug` path.

cat deletions.log | rg 'x86_64-unknown-linux-gnu/([^-]*)\b-?' -o -r  (pwd)/'target/aarch64-apple-darwin/$1-*' | sort | uniq | rargs fd -uu --prune --full-path --glob {} > deletions.txt

cat deletions.log | rg 'target/debug/([^-]*)\b-?'  -o -r  (pwd)/'target/debug/$1-*' | sort | uniq | rargs fd -uu --prune --full-path --glob {} > deletions.txt >> deletions.txt

cat deletions.txt | rargs rm -rf {}

...after doing this, the problem doesn't reproduce — it compiles quickly.

BUT! There is another path that does seem to matter though! Specifically removing ~/.cargo/registry/src means we can reproduce the slow compilation:

CARGO_INCREMENTAL=0 RUSTFLAGS="-C debuginfo=0" CARGO_TERM_COLOR=always cargo build --all-targets --target=(default-target)
   Compiling libduckdb-sys v0.8.0  # Wut!
   Compiling duckdb v0.8.0
   Compiling prql-compiler v0.8.1 (/Users/maximilian/workspace/prql/prql-compiler)
    Finished dev [unoptimized + debuginfo] target(s) in 46.41s

...so this is likely the culprit.


An aside — I confirmed that this ordering doesn't matter — running each one after cargo clean:

  • cargo build -p prql-compiler && cargo clippy -p prql-compiler takes 28s
  • cargo clippy -p prql-compiler && cargo build -p prql-compiler also takes 28s,
max-sixty added a commit to max-sixty/prql that referenced this issue Jun 21, 2023
Some small chance this helps with PRQL#2870 (which is becoming a bottleneck on iterations, I'm finding)
@max-sixty
Copy link
Member Author

If we can't fix this, we could remove integration testing from the PR CI, and run it only on test-all.yaml on merges. Not ideal but CI latency is much higher now, both because of the longer CI and the CI queue which the longer CI creates...

@eitsupi
Copy link
Member

eitsupi commented Jun 21, 2023

Is it possible to remove duckdb-rs from the default development dependencies?
Obviously this crate is difficult to build and could be an obstacle to contribution. (See #2894 (comment))

@max-sixty
Copy link
Member Author

Is it possible to remove duckdb-rs from the default development dependencies?

ref #2899

@max-sixty
Copy link
Member Author

An update here — lots of work happened on this, but unresolved as of yet. duckdb-rs is no longer in the PR CI path; we're waiting on Swatinem/rust-cache#150.

@eitsupi
Copy link
Member

eitsupi commented Sep 17, 2023

This can be closed?

@max-sixty
Copy link
Member Author

Yes!

@max-sixty
Copy link
Member Author

(I'm not quite sure how this resolved, but thank you to everyone who spent time on it — it was quite an effort in the end. Very happy it's working now)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants