cargo ./target fills with outdated artifacts as toolchains are updated/changed #5026

Open
alexheretic opened this issue Feb 9, 2018 · 31 comments
Labels
A-caching Area: caching of dependencies, repositories, and build artifacts A-layout Area: target output directory layout, naming, and organization C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-clean S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. Z-gc Nightly: garbage collection

Comments

@alexheretic
Member

Cargo has to recompile dependencies once a toolchain has been updated or changed, and it leaves all previous artifacts intact. As a result, the target dir just grows and grows as you update toolchains.

Currently you can cargo clean and rebuild, but it would be nice to be able to clean only the dependencies that are not compatible with the current toolchain. Perhaps cargo clean --incompatible?

For example building a small crate

# cargo +nightly-2018-02-07 build; du -hd2 ./target
57M   ./target/debug/deps
4.0K  ./target/debug/examples
528K  ./target/debug/.fingerprint
4.0K  ./target/debug/native
18M   ./target/debug/incremental
7.6M  ./target/debug/build
82M   ./target/debug
82M   ./target

# cargo +nightly-2018-02-08 build; du -hd2 ./target
113M  ./target/debug/deps
4.0K  ./target/debug/examples
1.1M  ./target/debug/.fingerprint
4.0K  ./target/debug/native
35M   ./target/debug/incremental
16M   ./target/debug/build
163M  ./target/debug
163M  ./target

# cargo +nightly build; du -hd2 ./target
169M  ./target/debug/deps
4.0K  ./target/debug/examples
1.6M  ./target/debug/.fingerprint
4.0K  ./target/debug/native
52M   ./target/debug/incremental
23M   ./target/debug/build
245M  ./target/debug
245M  ./target

This can obviously be much more problematic on larger crates.

@alexcrichton
Member

This is currently a feature of Cargo that it aggressively caches, but we could always add a command to delete otherwise stale artifacts!

@alexheretic
Member Author

Yes, I don't think the current behaviour is wrong. It would just be nice to be able to say "get rid of the old cached stuff now, I'm not going back to rust 1.22 any time soon". Particularly for things like RLS, where we can expect users to update the toolchain fairly often but don't really inform them about cleaning ./target/rls/.

@alexcrichton alexcrichton added the C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` label Feb 9, 2018
@seanmonstar

Since I always develop on nightly, this is true every day. I'm kind of in the habit now of deleting ./target each morning.

However, it seems like this also affects the cache used by Travis CI, as noted in seanmonstar/reqwest#259

@shepmaster
Member

I've experienced this problem both for local development and in Travis.

For example, my Travis CI cache has 56 copies of every dependency; sometimes the dependency is updated and sometimes the compiler is updated. For one crate I looked at, each build was 250K. In total, my cache weighed in at 1.4 GB. This caused the cache to take about 7-9 minutes to download and upload, compared to a build time of ~2 minutes! Some of that cache is from other parts of the build, but after clearing the cache and rebuilding, it's only ~175MB. Roughly 1.2 GB of the Travis cache was composed of these build artifacts.

I also primarily locally develop using nightly, and often find my laptop's disk running out of space because of multiple outdated gigabytes worth of build artifacts.

I agree that cargo clean --outdated would be a good first step. Right now, I do some complicated shell-scriptery to find all the directories that contain a Cargo.toml and run cargo clean in them.

A potential second step would be to do a bit of inspection during a build to note that there are (many? / large?) artifacts lying around that were not used and warn the user about them. A future thing would be akin to git's garbage collection, where unused artifacts older than some date are just automatically removed.
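
The "shell-scriptery" mentioned above could be approximated in Rust roughly as follows. This is only a sketch under assumptions: the names `find_cargo_projects` and `clean_projects` are made up for illustration, skipping directories named `target` and `.git` is an arbitrary choice, and `cargo` is assumed to be on `PATH`. It is not the script the commenter actually uses.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Collects every directory under `root` (including `root`) that contains a
/// `Cargo.toml`. Directories named `target` or `.git` are not descended into.
fn find_cargo_projects(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        if dir.join("Cargo.toml").is_file() {
            found.push(dir.clone());
        }
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let name = entry.file_name();
            // DirEntry::file_type does not follow symlinks, which avoids cycles.
            if entry.file_type()?.is_dir() && name != "target" && name != ".git" {
                stack.push(entry.path());
            }
        }
    }
    found.sort();
    Ok(found)
}

/// Runs `cargo clean` in each project directory and returns the directories
/// where the command could not be started or exited with a failure status.
fn clean_projects(projects: &[PathBuf]) -> Vec<PathBuf> {
    let mut failed = Vec::new();
    for dir in projects {
        let ok = Command::new("cargo")
            .arg("clean")
            .current_dir(dir)
            .status()
            .map(|status| status.success())
            .unwrap_or(false);
        if !ok {
            failed.push(dir.clone());
        }
    }
    failed
}
```

Passing the result of `find_cargo_projects(Path::new("."))` to `clean_projects` would run `cargo clean` once per project. Note that this removes all artifacts in each project, not only the stale ones, which is exactly the limitation this issue is about.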

@Manishearth
Member

Aside from CI, how common is switching compilers for a given project? I feel like it's not very common.

Can we have an on-by-default pref that specifies whether or not these things are deleted? Travis's image can turn the pref off.

We can also implement something where we record whenever an artifact was last used, and unnecessary artifacts that haven't been touched for a couple days get deleted. But I'd rather go the flat deletion route.
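
The "last used" idea above can be sketched as a pure selection step. Everything here is hypothetical: the `Artifact` record, the `stale_artifacts` function, and the choice to keep artifacts whose timestamp lies in the future are assumptions made for illustration, not anything Cargo is known to do.

```rust
use std::path::PathBuf;
use std::time::{Duration, SystemTime};

/// A cached artifact together with the last time a build used it.
/// (Hypothetical record; how "last used" would be tracked is left open.)
#[derive(Debug, Clone, PartialEq)]
struct Artifact {
    path: PathBuf,
    last_used: SystemTime,
}

/// Returns the artifacts whose last use lies more than `max_age` before `now`.
fn stale_artifacts(artifacts: &[Artifact], now: SystemTime, max_age: Duration) -> Vec<&Artifact> {
    artifacts
        .iter()
        .filter(|artifact| match now.duration_since(artifact.last_used) {
            Ok(age) => age > max_age,
            // `last_used` is later than `now` (for example clock skew): keep it.
            Err(_) => false,
        })
        .collect()
}
```

Keeping the selection separate from the deletion makes it easy to show the user what would be removed before anything is actually deleted.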

@shepmaster
Member

> how common is switching compilers for a given project? I feel like it's not very common.

Ideally it happens every 6 weeks at a minimum (stable), or perhaps every few days (nightly).

@Manishearth
Member

I mean, switching back and forth between compilers.

If you update your stable or nightly you don't need the old artifacts anymore. It's only an issue if you're switching back and forth, which I feel is rare.

@dwijnand
Member

Switching between current stable and current nightly is at least fairly common.

@Manishearth
Member

Off of CI? I find that a bit hard to believe.

Either way, this would at least work as a preference, preferably on by default.

@alexheretic alexheretic changed the title cargo target fills will outdated artifacts as toolchains are updated/changed cargo ./target fills with outdated artifacts as toolchains are updated/changed Sep 29, 2018
@ghost

ghost commented Sep 30, 2018

> Switching between current stable and current nightly is at least fairly common.

probably only before clippy was in stable, though

@dwijnand
Member

I was thinking of cargo dev, actually. Perhaps it's more niche than I thought.

@Manishearth
Member

Manishearth commented Sep 30, 2018 via email

@dbaron

dbaron commented Sep 30, 2018

My hopefully slightly-less-niche use case was that I'd built software in rust that runs on a server (in my case, an IRC bot that I run on a VM in AWS), set up a cron job to keep the rust compiler up-to-date ($HOME/.cargo/bin/rustup update), and set up the script that starts the bot to update the source code and rebuild, so that restarting the bot would update it to the latest version. The combination of these things ended up filling up the disk space on my VM (i.e., filling up 2GB of storage out of 5GB configured) in about 14 months (April 2017 - June 2018).

@joelgallant
Contributor

To put in two cents from a non-compiler person, I do a lot of cross compiling, and I'd like for the caching to be more aggressive. cargo check seems to utilize a non-target dependent folder, and forces rebuilds when moving from a host build to target build. I'm not sure if auto-gc'ing or something like that is actually intuitive for a build system to do. As a user I'm fine with a cargo clean once in a while - but I can definitely see how this could be a problem in CI.

Maybe a cargo build --throw-away-outdated (excuse the bad arg name) might be the middle ground, and CI runners can always opt for that.

@Manishearth
Member

I feel like that's an orthogonal problem -- cargo check should be able to reuse metadata from already compiled crates -- it just doesn't.

If cargo check --target foo isn't reusing the target folder I feel like that's a bug.

I don't think what I'm proposing here affects your use case at all -- I'm proposing cleaning up after compiler upgrades, cargo check works differently. It's imperfect but it's not made worse by this proposal.

@shepmaster
Member

> cargo check should be able to reuse metadata from already compiled crates

Specifically it's #3501

@Manishearth
Member

@alexcrichton I might try and implement this, do you have any pointers as to what I should look at?

@alexcrichton
Member

It sort of depends on the method of implementing, but it'd likely all be around src/cargo/core/compiler/*

@ehuss
Contributor

ehuss commented Dec 6, 2018

Just a rough idea: One option is to completely remove rustc from the metadata hash (here). It is still included in the fingerprint, so when rustc changes cargo will recompile everything.

A more ambitious approach would be to make it so that you can keep multiple release channels cached at the same time. This would require extracting the channel from the version information and only including that in the metadata ("stable", "beta", "nightly"). The version string doesn't explicitly include the channel, so it might take some rough interpretation.

Also, I would be very careful to very fully test things in the rustc repo. Since it builds with multiple stages, I would double check that it won't suddenly start clobbering artifacts. There are some tricky issues involving __cargo_default_lib_metadata.

I don't know if this approach will work, but offhand I can't think of any major issues. I'm not sure how good rustc itself is at cleaning up its incremental directory and dealing with reusing it across versions.
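
The "rough interpretation" of the version string mentioned above might look something like this. It is a sketch under assumptions: `Channel` and `channel_from_version` are hypothetical names, and the input is assumed to have the shape printed by `rustc -V`, for example `rustc 1.44.0-nightly (<hash> <date>)`. Anything that does not match is reported as unknown rather than guessed.

```rust
/// Release channel guessed from a rustc version line.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Channel {
    Stable,
    Beta,
    Nightly,
}

/// Guesses the channel from a line such as `rustc 1.44.0-nightly (<hash> <date>)`.
/// Returns `None` when the line has no version token or an unrecognized suffix.
fn channel_from_version(version_line: &str) -> Option<Channel> {
    // The second whitespace-separated token is the version, e.g. `1.44.0-beta.2`.
    let version = version_line.split_whitespace().nth(1)?;
    match version.split_once('-') {
        // No pre-release suffix at all: treat as stable.
        None => Some(Channel::Stable),
        Some((_, suffix)) if suffix.starts_with("beta") => Some(Channel::Beta),
        Some((_, suffix)) if suffix.starts_with("nightly") => Some(Channel::Nightly),
        // Unknown suffix (for example a locally built compiler): do not guess.
        Some(_) => None,
    }
}
```

Only the resulting channel name, rather than the full version, would then feed into the metadata, so that successive nightlies share file names while stable, beta, and nightly stay separate.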

@Hi-Angel

I want to add another problem that these artifacts are causing. The generated code usually ends up somewhere under target/debug/build/crate_name-id, and sometimes you may need to take a look at it. When there are a bunch of crate_name-id directories with different ids, it's tough to figure out which one is current.

@crusty-dave

crusty-dave commented Dec 7, 2019

I was going to open this as an issue when I found this one.

I was thinking we need:
cargo clean --stale
Maybe there needs to be:
cargo clean --analyze
To let you see how bad things have gotten. If each incremental has a way to easily detect the version, then perhaps:
cargo clean --stale --version 1.34.0
Regards,
-Dave

@fenhl
Contributor

fenhl commented Feb 19, 2020

Looks like this has been implemented as cargo-sweep.

@ehuss
Contributor

ehuss commented Apr 5, 2020

I have proposed a change in #8073 so that nightly and beta artifacts use the same filenames between versions. Stable artifacts still use different versions as I'm not comfortable with doing that yet (imagine someone testing stable and then their msrv, it could be annoying to trigger rebuilds or break CI caches). My intent is that some kind of gc will get built-in to cargo at some point in the future to address that.

@shepmaster
Member

> My intent is that some kind of gc will get built-in to cargo at some point in the future to address that.

At that point in time, will the change proposed in #8073 be reverted?

@ehuss
Contributor

ehuss commented Apr 8, 2020

I added a -Zseparate-nightlies flag to disable the new behavior just in case.

> At that point in time, will the change proposed in #8073 be reverted?

Probably. It is desired that the gc will be rustup-aware and know which toolchains are still installed. One downside of that approach is that it probably won't run on every build (if there is a performance issue), but I imagine we'll figure things out when the time comes.

@llogiq
Contributor

llogiq commented Dec 19, 2021

Another side effect is that sometimes after updating toolchains, I get a "This crate was built by an incompatible rust version" error. Couldn't cargo just detect the version from the crate metadata, compare with the current rustc version and clean all outdated artifacts if those don't match?

@bjorn3
Member

bjorn3 commented Dec 19, 2021

Both rustc and cargo normally ignore files compiled with an incompatible rustc version. Rustc only gives a "This crate was built by an incompatible rust version" error if the only files matching a crate name are built with a different rustc version. If there is one with the right rustc version, rustc will pick that one. Cargo can't read the rustc version from the files themselves, but it does keep a "fingerprint" for each file with the rustc version it was built with. In addition, changing the rustc version changes the hash in the filename. The only case in which this error should be possible would be when clearing the target dir and building again would result in an error that the crate doesn't exist at all.

> Couldn't cargo just detect the version from the crate metadata, compare with the current rustc version and clean all outdated artifacts if those don't match?

This would mean that switching between stable and nightly will recompile everything every time, which would be very annoying when you are using a nightly only tool but otherwise use stable.
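
To illustrate the two mechanisms described above (a hash in the file name that depends on the rustc version, plus a fingerprint that records it), here is a toy model. It is emphatically not Cargo's real hashing or fingerprint format: `artifact_file_name` and `is_fresh` are hypothetical names, and the standard library's `DefaultHasher` stands in for whatever Cargo actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Builds an artifact file name whose hash suffix depends on the compiler
/// version, so artifacts from different compilers never overwrite each other.
fn artifact_file_name(crate_name: &str, crate_version: &str, rustc_version: &str) -> String {
    let mut hasher = DefaultHasher::new();
    crate_name.hash(&mut hasher);
    crate_version.hash(&mut hasher);
    rustc_version.hash(&mut hasher);
    format!("lib{}-{:016x}.rlib", crate_name, hasher.finish())
}

/// Toy fingerprint check: an artifact is reusable only if the compiler
/// version recorded when it was built matches the current one.
fn is_fresh(recorded_rustc: &str, current_rustc: &str) -> bool {
    recorded_rustc == current_rustc
}
```

In this model, a new compiler version produces a new file name next to the old one, which is why old artifacts accumulate instead of being replaced.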

@Manishearth
Member

> This would mean that switching between stable and nightly will recompile everything every time, which would be very annoying when you are using a nightly only tool but otherwise use stable.

I think a slightly better model might be one where cargo keeps track of when artifacts were last used, and if they haven't been used in a while, prompt to delete them, and provide a cargo clean --outdated or something to do so. Perhaps even delete things automatically.

Also perhaps for toolchains that aren't in rustup anymore, though this may not work well in cases where other non-rustup toolchains are being used.
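
The rustup-based variant could be sketched like this, with the caveat about non-rustup toolchains built in. All names are hypothetical (`CachedArtifact`, `removable`), and the sketch assumes that each artifact records which rustup toolchain built it, with `None` for compilers not managed by rustup. The set of installed toolchains is taken as input; it might come from something like `rustup toolchain list`, but that step is not shown.

```rust
use std::collections::HashSet;

/// A cached artifact tagged with the rustup toolchain that produced it.
/// `None` means it was built by a compiler that rustup does not manage.
#[derive(Debug, PartialEq)]
struct CachedArtifact {
    file: String,
    rustup_toolchain: Option<String>,
}

/// Selects artifacts built by rustup toolchains that are no longer installed.
/// Artifacts from non-rustup compilers are never selected, since there is no
/// way to tell here whether that compiler still exists.
fn removable<'a>(
    artifacts: &'a [CachedArtifact],
    installed: &HashSet<String>,
) -> Vec<&'a CachedArtifact> {
    artifacts
        .iter()
        .filter(|artifact| match &artifact.rustup_toolchain {
            Some(name) => !installed.contains(name),
            None => false,
        })
        .collect()
}
```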

@ehuss
Contributor

ehuss commented Dec 19, 2021

> I think a slightly better model might be one where cargo keeps track of when artifacts were last used

FWIW, I'm currently working on this. It's been a bit difficult to hit the kind of performance goals I want, though.

@Manishearth
Member

Oooh neat!

@kdy1

kdy1 commented Dec 2, 2022

This is a bit off-topic as it's not about rustc version, but I made https://github.com/dudykr/ddt which can remove artifacts for old dependencies.

It uses cargo metadata --all-features to detect the current dependency graph, and removes an rlib if it is from an old version.

Note: I want to add a feature to remove artifacts if the rustc version is different, but I'm not sure how I can do it.

@epage epage added the S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. label Oct 17, 2023
@ehuss ehuss added A-caching Area: caching of dependencies, repositories, and build artifacts A-layout Area: target output directory layout, naming, and organization labels Nov 28, 2023
@epage epage added the Z-gc Nightly: garbage collection label Jan 11, 2024