cargo ./target fills with outdated artifacts as toolchains are updated/changed #5026

Open
alexheretic opened this Issue Feb 9, 2018 · 20 comments

Member

alexheretic commented Feb 9, 2018

Cargo has to recompile dependencies whenever the toolchain is updated or changed, and it leaves all previous artifacts intact. As a result, the target dir just grows and grows as you update toolchains.

Currently you can cargo clean and rebuild, but it would be nice to be able to clean only the dependencies that are not compatible with the current toolchain. Perhaps cargo clean --incompatible ?

For example, building a small crate:

# cargo +nightly-2018-02-07 build; du -hd2 ./target
57M   ./target/debug/deps
4.0K  ./target/debug/examples
528K  ./target/debug/.fingerprint
4.0K  ./target/debug/native
18M   ./target/debug/incremental
7.6M  ./target/debug/build
82M   ./target/debug
82M   ./target

# cargo +nightly-2018-02-08 build; du -hd2 ./target
113M  ./target/debug/deps
4.0K  ./target/debug/examples
1.1M  ./target/debug/.fingerprint
4.0K  ./target/debug/native
35M   ./target/debug/incremental
16M   ./target/debug/build
163M  ./target/debug
163M  ./target

# cargo +nightly build; du -hd2 ./target
169M  ./target/debug/deps
4.0K  ./target/debug/examples
1.6M  ./target/debug/.fingerprint
4.0K  ./target/debug/native
52M   ./target/debug/incremental
23M   ./target/debug/build
245M  ./target/debug
245M  ./target

This can obviously be much more problematic on larger crates.

Member

alexcrichton commented Feb 9, 2018

Aggressive caching is currently a feature of Cargo, but we could always add a command to delete otherwise stale artifacts!

Member Author

alexheretic commented Feb 9, 2018

Yes, I don't think the current behaviour is wrong. It would just be nice to be able to say "get rid of the old cached stuff now, I'm not going back to rust 1.22 any time soon". Particularly for things like RLS, where we can expect users to update the toolchain fairly often but don't really inform them about cleaning ./target/rls/.

seanmonstar commented Feb 17, 2018

Since I always develop on nightly, this is true every day. I'm kind of in the habit now of deleting ./target each morning.

However, it seems like this also affects the cache used by Travis CI, as noted in seanmonstar/reqwest#259

Member

shepmaster commented Mar 12, 2018

I've experienced this problem both for local development and in Travis.

For example, my Travis CI cache has 56 copies of every dependency; sometimes the dependency is updated and sometimes the compiler is updated. For one crate I looked at, each build was 250K. In total, my cache weighed in at 1.4 GB. This caused the cache to take about 7-9 minutes to download and upload, compared to a build time of ~2 minutes! Some of that cache is from other parts of the build, but after clearing the cache and rebuilding, it's only ~175MB. Roughly 1.2 GB of the Travis cache was composed of these build artifacts.

I also develop locally mostly on nightly, and often find my laptop's disk running out of space because of multiple gigabytes' worth of outdated build artifacts.

I agree that cargo clean --outdated would be a good first step. Right now, I do some complicated shell-scriptery to find all the directories that contain a Cargo.toml and run cargo clean in them.
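The shell-scriptery described above might look roughly like the following. This is a minimal sketch, not the script actually in use: the function name `clean_all_projects` is made up for illustration, and it assumes every directory holding a Cargo.toml can be cleaned on its own.

```sh
# Run `cargo clean` in every directory under ROOT that contains a Cargo.toml.
# Hypothetical helper; assumes `cargo` is on PATH.
clean_all_projects() {
    cap_root="${1:-.}"
    # Skip manifests inside target/ (e.g. crates unpacked by `cargo package`).
    find "$cap_root" -name Cargo.toml ! -path '*/target/*' -print |
    while IFS= read -r cap_manifest; do
        cap_dir=$(dirname "$cap_manifest")
        echo "cleaning $cap_dir"
        # Subshell so the `cd` does not leak into the next iteration.
        (cd "$cap_dir" && cargo clean) </dev/null
    done
}
```

Calling `clean_all_projects ~/src` would wipe every target dir below `~/src`, outdated or not, which is exactly why something more selective than a full clean would help.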

A potential second step would be to do a bit of inspection during a build to note that there are (many? / large?) artifacts lying around that were not used and warn the user about them. A future thing would be akin to git's garbage collection, where unused artifacts older than some date are just automatically removed.

Member

Manishearth commented Sep 28, 2018

Aside from CI, how common is switching compilers for a given project? I feel like it's not very common.

Can we have an on-by-default pref that specifies whether or not these things are deleted? Travis's image can turn the pref off.

We can also implement something where we record whenever an artifact was last used, and unnecessary artifacts that haven't been touched for a couple days get deleted. But I'd rather go the flat deletion route.
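To get a feel for what an age-based policy would remove, something like this can list the candidates today. It is only a sketch: Cargo records no last-use time, so this leans on the filesystem's access time, which is not updated on volumes mounted with noatime and is coarse under relatime. The name `list_stale_artifacts` is hypothetical.

```sh
# List files under TARGET_DIR that have not been accessed for more than DAYS
# days (find counts age in whole 24-hour periods). Prints only, never deletes.
list_stale_artifacts() {
    lsa_target="${1:-./target}"
    lsa_days="${2:-2}"
    find "$lsa_target" -type f -atime +"$lsa_days" -print
}
```

For example, `list_stale_artifacts ./target 2 | wc -l` gives a rough count of files that a "couple of days" cutoff would touch.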

Member

shepmaster commented Sep 28, 2018

> how common is switching compilers for a given project? I feel like it's not very common.

Ideally it happens every 6 weeks at a minimum (stable), or perhaps every few days (nightly).

Member

Manishearth commented Sep 28, 2018

I mean, switching back and forth between compilers.

If you update your stable or nightly you don't need the old artifacts anymore. It's only an issue if you're switching back and forth, which I feel is rare.

Member

dwijnand commented Sep 28, 2018

Switching between current stable and current nightly is at least fairly common.

Member

Manishearth commented Sep 28, 2018

Off of CI? I find that a bit hard to believe.

Either way, this would at least work as a preference, preferably on by default.

alexheretic changed the title from "cargo target fills will outdated artifacts as toolchains are updated/changed" to "cargo ./target fills with outdated artifacts as toolchains are updated/changed" on Sep 29, 2018

yory8 commented Sep 30, 2018

> Switching between current stable and current nightly is at least fairly common.

probably only before clippy was in stable, though

Member

dwijnand commented Sep 30, 2018

I was thinking of cargo dev, actually. Perhaps it's more niche than I thought.

Member

Manishearth commented Sep 30, 2018

dbaron commented Sep 30, 2018

My hopefully slightly-less-niche use case was that I'd built software in rust that runs on a server (in my case, an IRC bot that I run on a VM in AWS), set up a cron job to keep the rust compiler up-to-date ($HOME/.cargo/bin/rustup update), and set up the script that starts the bot to update the source code and rebuild, so that restarting the bot would update it to the latest version. The combination of these things ended up filling up the disk space on my VM (i.e., filling up 2GB of storage out of 5GB configured) in about 14 months (April 2017 - June 2018).
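Until Cargo grows something built in, a start script like this one could clean only when the compiler has actually changed. The following is a rough sketch under a few assumptions: the project uses the default ./target directory, the stamp file name `.rustc-version` is invented here, and `clean_if_toolchain_changed` is a hypothetical helper rather than anything Cargo or rustup provides.

```sh
# Run `cargo clean` in PROJECT_DIR only if `rustc -V` differs from the version
# recorded after the previous clean. A missing stamp counts as a change.
clean_if_toolchain_changed() {
    citc_project="${1:-.}"
    citc_stamp="$citc_project/target/.rustc-version"  # hypothetical stamp file
    citc_current=$(rustc -V)
    if [ -f "$citc_stamp" ] && [ "$(cat "$citc_stamp")" = "$citc_current" ]; then
        return 0  # same compiler as last time, keep the cache
    fi
    (cd "$citc_project" && cargo clean) || return 1
    mkdir -p "$citc_project/target"
    printf '%s\n' "$citc_current" > "$citc_stamp"
}
```

Calling `clean_if_toolchain_changed /path/to/bot` right before `cargo build` in the restart script would drop the old artifacts once per compiler update instead of letting them pile up.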

Contributor

joelgallant commented Sep 30, 2018

To put in two cents from a non-compiler person, I do a lot of cross compiling, and I'd like for the caching to be more aggressive. cargo check seems to utilize a non-target dependent folder, and forces rebuilds when moving from a host build to target build. I'm not sure if auto-gc'ing or something like that is actually intuitive for a build system to do. As a user I'm fine with a cargo clean once in a while - but I can definitely see how this could be a problem in CI.

Maybe a cargo build --throw-away-outdated (excuse the bad arg name) might be the middle ground, and CI runners can always opt for that.

Member

Manishearth commented Sep 30, 2018

I feel like that's an orthogonal problem -- cargo check should be able to reuse metadata from already compiled crates -- it just doesn't.

If cargo check --target foo isn't reusing the target folder I feel like that's a bug.

I don't think what I'm proposing here affects your use case at all -- I'm proposing cleaning up after compiler upgrades, cargo check works differently. It's imperfect but it's not made worse by this proposal.

Member

shepmaster commented Oct 1, 2018

> cargo check should be able to reuse metadata from already compiled crates

Specifically it's #3501

Member

Manishearth commented Dec 4, 2018

@alexcrichton I might try to implement this. Do you have any pointers as to what I should look at?

Member

alexcrichton commented Dec 6, 2018

It sort of depends on the method of implementing, but it'd likely all be around src/cargo/core/compiler/*

Contributor

ehuss commented Dec 6, 2018

Just a rough idea: One option is to completely remove rustc from the metadata hash (here). It is still included in the fingerprint, so when rustc changes cargo will recompile everything.
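A toy model of that split, for anyone who has not looked at the two hashes before. This is not Cargo's actual hashing: `cksum` stands in for the real hasher, the inputs are reduced to a package name, a package version, and a rustc version, and all three function names are hypothetical.

```sh
# Toy model only: Cargo's real hashes cover far more inputs than this.
short_hash() {
    printf '%s' "$1" | cksum | cut -d ' ' -f 1
}

# File name hash: package name and version only, so the name stays the same
# when rustc changes and the new build overwrites the old file.
artifact_name() {
    echo "lib$1-$(short_hash "$1 $2").rlib"
}

# Fingerprint: also covers the rustc version, so a compiler change still
# marks the artifact as dirty and forces a rebuild.
fingerprint() {
    short_hash "$1 $2 $3"
}
```

With rustc left out of `artifact_name`, two compilers produce the same file name (so nothing accumulates), while `fingerprint foo 0.1.0 1.24.0` and `fingerprint foo 0.1.0 1.25.0` still differ (so nothing stale gets reused).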

A more ambitious approach would be to make it so that you can keep multiple release channels cached at the same time. This would require extracting the channel from the version information and only including that in the metadata ("stable", "beta", "nightly"). The version string doesn't explicitly include the channel, so it might take some rough interpretation.
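The "rough interpretation" could start from the suffix on the version number. A minimal sketch, assuming the `rustc -V` output carries `-nightly` or `-beta` after the version and nothing at all for stable; `channel_of` is a hypothetical name, and locally built compilers or other suffixes are not handled here and would wrongly fall through to "stable".

```sh
# Guess the release channel from a `rustc -V` style string.
# Known gap: anything without a -nightly or -beta suffix is reported as stable.
channel_of() {
    case "$1" in
        *-nightly*) echo nightly ;;
        *-beta*)    echo beta ;;
        *)          echo stable ;;
    esac
}
```

Typical use would be `channel_of "$(rustc -V)"`, with the result fed into the metadata in place of the full version string.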

Also, I would be very careful to test things thoroughly in the rustc repo. Since it builds with multiple stages, I would double check that it won't suddenly start clobbering artifacts. There are some tricky issues involving __cargo_default_lib_metadata.

I don't know if this approach will work, but offhand I can't think of any major issues. I'm not sure how good rustc itself is at cleaning up its incremental directory and dealing with reusing it across versions.

Hi-Angel commented Jan 20, 2019

I want to add another problem that these artifacts cause. Generated code usually ends up somewhere under target/debug/build/crate_name-id, and sometimes you need to take a look at it. When there's a bunch of crate_name-id directories with different ids, it's tough to figure out which one is current.
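A stopgap that sometimes helps is to pick the most recently modified output directory. This is a heuristic sketch, not a guarantee: it assumes build script output sits in an `out` subdirectory, that the newest directory mtime belongs to the current build, and that no other crate shares the `crate_name-` prefix (a `crate_name-sys` would also match). `newest_build_dir` is a made-up helper.

```sh
# Print the most recently modified BUILD_ROOT/CRATE-<id>/out directory.
# Heuristic only; prints nothing if no such directory exists.
newest_build_dir() {
    nbd_root="$1"
    nbd_crate="$2"
    # ls -t sorts newest first; -d lists the directories themselves.
    ls -td "$nbd_root/$nbd_crate"-*/out 2>/dev/null | head -n 1
}
```

For instance, `newest_build_dir target/debug/build crate_name` prints the single best guess instead of the whole pile.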
