cargo build --dependencies-only #2644
Comments
@nagisa, |
I do not remember exactly why, but I do remember that I ended up just running rustc manually. |
Sometimes you add a bunch of dependencies to your project, know it will take a while to compile next time you |
As described in #3615 this is useful with |
@gregwebs out of curiosity do you want to cache compiled dependencies or just downloaded dependencies? Caching compiled dependencies isn't implemented today (but would be with a command such as this) but downloading dependencies is available via |
Generally, as with my caching use case, the dependencies change infrequently and it makes sense to cache the compilation of them. The Haskell tool stack went through all this, and they seem to have generally decided to merge things into a single command where possible. For |
@gregwebs ok thanks for the info! |
@alexcrichton, |
@KalitaAlexey I personally wouldn't be convinced just yet, but it'd be good to canvass opinions from others on @rust-lang/tools as well |
@alexcrichton, |
I don't see much of a use case - you can just do |
What's the API? |
Implement an |
I wasn't able to find any information about an |
Docs are a little thin, but start here: cargo/src/cargo/ops/cargo_rustc/mod.rs Lines 62 to 64 in 609371f
You can look at the RLS for an example of how to use them: https://github.com/rust-lang-nursery/rls/blob/master/src/build.rs#L288 |
A question on Stack Overflow wanted this feature. In that case, the OP wanted to build the dependencies for a Docker layer. A similar situation exists for the playground, where I compile all the crates once. In my case, I just put in a dummy |
@shepmaster unfortunately the proposed solution wouldn't satisfy that question because a |
I ended up here because I am also thinking about the Docker case. To do a good docker build I want to:
COPY Cargo.toml Cargo.lock /mything
RUN cargo build-deps --release # creates a layer that is cached
COPY src /mything/src
RUN cargo build --release # only rebuild this when src files change
This means the dependencies would be cached between docker builds as long as I understand |
The dockerfile template in shepmaster's linked stackoverflow post above SOLVES this problem
I came to this thread because I also wanted the docker image to be cached after building the dependencies. After later resolving this issue, I posted something explaining docker caching, and was informed that the answer was already linked in the stackoverflow post. I made this mistake, someone else made this mistake, it's time to clarify.
After building, changing only the source and rebuilding starts from the cached image with dependencies already built. |
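For readers new to layer caching, here is a toy model of why copying the manifests before the sources helps. It is only a sketch under stated assumptions: the `Step` type and the `layer_keys`, `reusable_layers`, and `dockerfile` functions are hypothetical names made up for illustration, the hashing is not how Docker computes its cache keys, and `cargo build-deps` is the hypothetical subcommand from the comment above. The only idea it borrows is that a layer can be reused when its parent layer, its instruction, and the files it copies in are all unchanged.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One build step: the instruction text plus the bytes of any files it copies in.
/// (Hypothetical type, for illustration only.)
struct Step<'a> {
    instruction: &'a str,
    copied: &'a [u8],
}

/// Toy cache keys: each layer's key depends on the parent layer's key,
/// the instruction, and the copied file contents.
fn layer_keys(steps: &[Step<'_>]) -> Vec<u64> {
    let mut keys = Vec::with_capacity(steps.len());
    let mut parent: u64 = 0;
    for step in steps {
        let mut hasher = DefaultHasher::new();
        parent.hash(&mut hasher);
        step.instruction.hash(&mut hasher);
        step.copied.hash(&mut hasher);
        parent = hasher.finish();
        keys.push(parent);
    }
    keys
}

/// Number of leading layers whose keys match a previous build.
fn reusable_layers(old: &[u64], new: &[u64]) -> usize {
    old.iter().zip(new).take_while(|(a, b)| a == b).count()
}

/// The four steps from the comment above, with file contents passed in.
fn dockerfile<'a>(manifests: &'a [u8], src: &'a [u8]) -> Vec<Step<'a>> {
    vec![
        Step { instruction: "COPY Cargo.toml Cargo.lock /mything", copied: manifests },
        // `cargo build-deps` is hypothetical; it stands for "build dependencies only".
        Step { instruction: "RUN cargo build-deps --release", copied: b"" },
        Step { instruction: "COPY src /mything/src", copied: src },
        Step { instruction: "RUN cargo build --release", copied: b"" },
    ]
}
```

In this model, a build that changes only the sources still matches the first two layers of the previous build (the manifest copy and the dependency build), while a build that changes the manifests matches none of them.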
someone needs to relax... |
Also @KarlFish what you're proposing is not actually working. If using
|
Here's a better version.
FROM rust:1.20.0
WORKDIR /usr/src
# Create blank project
RUN USER=root cargo new umar
# We want dependencies cached, so copy those first.
COPY Cargo.toml Cargo.lock /usr/src/umar/
WORKDIR /usr/src/umar
# This is a dummy build to get the dependencies cached.
RUN cargo build --release
# Now copy in the rest of the sources
COPY src /usr/src/umar/src/
# This is the actual build.
RUN cargo build --release \
&& mv target/release/umar /bin \
&& rm -rf /usr/src/umar
WORKDIR /
EXPOSE 3000
CMD ["/bin/umar"] |
You can always review the complete Dockerfile for the playground. |
Hi! |
I agree that it would be really cool to have a --deps-only option so that we could cache our filesystem layers better in Docker. I haven't tried replicating this yet, but it looks very promising. This is in glibc and not musl, by the way. My main priority is to get to a build that doesn't take 3-5 minutes every time, not a 5 MB alpine-based image. |
Speaking for myself, I deploy 100% of my services in containers, built with In concrete terms, my container builds always look something like this:
FROM rust:1.68 AS builder
WORKDIR /repo
# Cache downloaded+built dependencies
COPY Cargo.toml Cargo.lock /repo/
RUN \
mkdir /repo/src && \
echo 'fn main() {}' > /repo/src/main.rs && \
cargo build --release && \
rm -Rvf /repo/src
# Build our actual code
COPY src /repo/src
RUN \
touch src/main.rs && \
cargo build --release
FROM gcr.io/distroless/cc-debian11
COPY --from=builder /repo/target/release/whatever /usr/local/bin/
ENTRYPOINT ["/usr/local/bin/whatever"]
EDIT to add: It's worth noting that this workflow works fine; while it makes for a slightly verbose |
I think these examples are both off the mark:
If I'm using a containerized Rust development environment, I'm running
That can be done without If folks are trying to use Docker's layer cache, they're by definition running I don't think I also think they're often (not always) doing these builds on a stateless CI system, where there's the additional complication of how you save/restore the layer cache between invocations. There are various options but they all seem to have (different) caveats. |
In this case, you're most likely to use something like I understand the desire to have a more efficient solution also, but it's been nearly 7 years since this issue was opened and the described use-cases still require a third-party tool and some googling. If I started to implement this feature close to originally described, would it be likely to be accepted? |
I'd like to cache all sorts of commands that I repeatedly execute in CI. For me, this has nothing to do with my local development machine. But I honestly don't understand the distinction. Why is just Whether the size of the resulting cache is large is all relative. If it's many gigabytes but cuts CI time by 90%, that's a win in my book. BTW, I run |
Fair enough, for a development use-case the user will probably indeed have a long-lived container where caching is not such a concern.
That's true, the services use-case is maybe performed with docker compose or something similar, what I meant was using
What do you mean by "close to originally described"? Something akin to Implementing a command for building dependencies only, while ignoring all source files, could in theory remove the need to create dummy files, but users would still need to copy all
That's one of the questions that we wanted to know the answer to, whether users care about the cached layer size or not. My personal expectation is that users don't care about it and just want caching to work. |
After going back and forth for a while, I've landed at the following
(including at work):
1. If you can build statically-linked binaries with minimal/simple
dependencies (libc, rustls, etc.) like a lot of network services, then
do builds outside the container, and copy them into a minimal runtime
container like alpine or even scratch;
2. If you have complex or dynamically linked dependencies, build
_inside_ the container for 2 primary (but related) reasons:
reproducibility between development environments and ensuring your
distributed artifact in the container is more likely to work the first
time/across upgrades.
If you're working with something like imagemagick, for example, it's a
pain to get all the developers working with the same versions installed
across platforms. It's also easy to make a mistake in your build and
copy an artifact into a container that has the wrong version of
imagemagick, or is missing it completely. You also have to maintain and
synchronize 2 environments, at a minimum: CI outside the Docker build
and the Dockerfile dependency installation.
In general, I've found the stub approach to be sufficient, especially
when not using workspaces, but it definitely confuses people on first
encounter, and feels a bit like a magic art to get everything set up
with efficient caching.
I mostly just wanted to weigh in with what I think is the single most
critical reason to run cargo build in a container: ensuring you're
building against your runtime dependencies.
…On 4/6/23 00:30, Jakub Beránek wrote:
> If I'm using a containerized Rust development environment, I'm running cargo build from within a relatively long-lived docker run invocation. I'll have the source code and target dir in a volume mount. There's no layer caching of my builds or any need for it.
Fair enough, for a development use-case the user will probably indeed have a long-lived container where caching is not such a concern.
> That can be done without cargo running in the same Docker container as those external services, or any Docker container at all.
That's true, the services use-case is maybe performed with docker compose or something similar, what I meant was using `cargo build` in a Docker image as a means of preparing a production image (the external service thing is related, but probably orthogonal).
> If I started to implement this feature close to originally described, would it be likely to be accepted?
What do you mean by "close to originally described"? Something akin to `--deps-only` has been implemented in this issue several times, but it has been shown repeatedly that it does not help with Docker layer caching.
Implementing a command for building dependencies only could in theory remove the need to create dummy files, but users would still need to copy all `Cargo.toml`, `config.toml` files and `Cargo.lock`. We have examined this and found that it would require a non-trivial refactoring of Cargo to make this work, so currently this solution does not seem viable, since it wouldn't even solve the problem completely, it would just remove a part of the friction.
> Whether the size of the resulting cache is large is all relative. If it's many gigabytes but cuts CI time by 90%, that's a win in my book.
That's one of the questions that we wanted to know the answer to, whether users care about the cached layer size or not. My personal expectation is that users don't care about it and just want caching to work.
|
I want to thank @Kobzol for writing a nice doc laying out the problem nicely and why I've described my use case and alternative approach above. Below is my opinion about what cargo changes might be helpful. I'd still like to see I'm not sure
I do think
|
btw, I think a relatively slick outside-cargo solution would be to have a single tool, let's call it
RUN \
--mount=type=cache,id=target,target=.../target \
--mount=type=cache,id=cargo,target=/root/.cargo \
cargo gha-cache load; \
cargo clippy --release --workspace && \
cargo fmt --release --all && \
cargo build --release --workspace && \
cargo test --release; \
rv=$?; \
cargo gha-cache save; \
exit $rv
The tool would:
As a user, if you're going to need an extra tool, better to have one that doesn't need much config. For Docker, I think you need to plumb in the right env variables for the cache API access but otherwise it could be as easy as outside Docker. It wouldn't be as fun to be the maintainer of the tool. It'd have to know about both GHA's API and |
One more thought: maybe the most minimal thing
My hypothetical gha cache helper command could use this so it doesn't have to know And have I mentioned #6529 enough yet? ;-) |
Our use case is running a production build, but that's trivializing it. For us, the main use case for running In the cases where you don't have native builder instances available and cross-compilation is actually necessary, running Cargo in Docker can still help simplify things (see https://github.com/tonistiigi/xx for example). |
I ran into a use case for this requested feature today. I am working on a non-Rust project, but it depends on a CLI tool that can be installed by cargo. It would be nice if I could specify all cargo dependencies in a virtual manifest and install them using the cargo command. |
@UlyssesZh it sounds like you'd want something like #5120 though that was closed because the given use case would be better served by #2267. |
Hi, we've created an Earthly function that significantly eases caching cargo builds on CI. Even if you are not into Earthly, I think these implementation details might be useful for you:
|
Cargo generates a bunch of `.fingerprint` directories for each of the packages that it builds. This tracks changes to files and allows it to skip compilation on things that have not changed [^1]. The reason that this was not being picked up *even though* the contents of `main.rs` were different in the two stages seems to be that the fingerprint is based on the `mtime` (modification time) of a file and not on its contents [^2]. Relevant discussion here: rust-lang/cargo#2644 (comment)
[^1]: https://doc.rust-lang.org/stable/nightly-rustc/cargo/core/compiler/fingerprint/index.html#
[^2]: rust-lang/cargo#6529
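To make the mtime point concrete, here is a simplified sketch of the two freshness checks. It is not Cargo's fingerprint code: the `Snapshot` type and the `stale_by_mtime` and `stale_by_content` functions are hypothetical, timestamps are plain integers rather than real file metadata, and it assumes the real `main.rs` arrives in the image with an mtime older than the moment the stub was built. That assumption is what makes the `touch src/main.rs` step in the Dockerfile earlier in the thread matter.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// What a build tool might record about a source file.
/// (Hypothetical type, not Cargo's actual fingerprint format.)
#[derive(Clone, Copy)]
struct Snapshot {
    mtime_secs: u64,
    content_hash: u64,
}

/// Hash the file contents so two versions can be compared cheaply.
fn content_hash(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Build a snapshot from an mtime (seconds) and the file's bytes.
fn snapshot(mtime_secs: u64, bytes: &[u8]) -> Snapshot {
    Snapshot { mtime_secs, content_hash: content_hash(bytes) }
}

/// mtime-based check: rebuild only if the file is newer than what was recorded.
fn stale_by_mtime(recorded: Snapshot, current: Snapshot) -> bool {
    current.mtime_secs > recorded.mtime_secs
}

/// Content-based check: rebuild whenever the bytes differ.
fn stale_by_content(recorded: Snapshot, current: Snapshot) -> bool {
    current.content_hash != recorded.content_hash
}
```

With a stub recorded at time 2000 and a real `main.rs` whose mtime is 1000, `stale_by_mtime` reports the file as fresh even though its contents changed, while `stale_by_content` reports it as stale. Bumping the real file's mtime past 2000, which is what `touch` does, makes the mtime check notice the change too.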
cargo team notes:
There should be an option to only build dependencies.