cargo build --dependencies-only #2644
Comments
@nagisa, …
I do not remember exactly why, but I do remember that I ended up just running rustc manually.
Sometimes you add a bunch of dependencies to your project, know it will take a while to compile next time you …
As described in #3615 this is useful with …
@gregwebs out of curiosity, do you want to cache compiled dependencies or just downloaded dependencies? Caching compiled dependencies isn't implemented today (but it would be with a command such as this), while downloading dependencies is available via …
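(For reference, independent of whatever the truncated comment originally named: `cargo fetch` is the existing command that downloads every dependency in `Cargo.lock` without compiling anything. A minimal sketch of using it to cache the download step as a Docker layer; the image tag and the stub-target step are my own illustration, since cargo needs at least one build target to load a package manifest:)

```dockerfile
FROM rust:1.70
WORKDIR /app
# Only the manifest and lockfile: this layer is invalidated only
# when the dependency set changes.
COPY Cargo.toml Cargo.lock ./
# Cargo refuses to load a package with no targets, so give it a stub.
RUN mkdir src && echo 'fn main() {}' > src/main.rs
# Downloads (but does not compile) everything in Cargo.lock.
RUN cargo fetch
```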
Generally, as with my caching use case, the dependencies change infrequently and it makes sense to cache their compilation. The Haskell tool stack went through all of this, and they generally decided to merge things into a single command where possible. For …
@gregwebs ok, thanks for the info!
@alexcrichton, …
@KalitaAlexey I personally wouldn't be convinced just yet, but it'd be good to canvass opinions from others on @rust-lang/tools as well.
@alexcrichton, …
I don't see much of a use case - you can just do …
What's the API?
Implement an …
I wasn't able to find any information about an …
Docs are a little thin, but start here: `cargo/src/cargo/ops/cargo_rustc/mod.rs`, lines 62 to 64 (at commit 609371f).
You can look at the RLS for an example of how to use them: https://github.com/rust-lang-nursery/rls/blob/master/src/build.rs#L288
A question on Stack Overflow wanted this feature. In that case, the OP wanted to build the dependencies for a Docker layer. A similar situation exists for the playground, where I compile all the crates once. In my case, I just put in a dummy …
@shepmaster unfortunately the proposed solution wouldn't satisfy that question, because a …
I ended up here because I am also thinking about the Docker case. To do a good docker build I want to:

```dockerfile
COPY Cargo.toml Cargo.lock /mything
RUN cargo build-deps --release  # creates a layer that is cached
COPY src /mything/src
RUN cargo build --release       # only rebuild this when src files change
```

This means the dependencies would be cached between docker builds, as long as I understand …
The Dockerfile template in shepmaster's linked Stack Overflow post above SOLVES this problem. I came to this thread because I also wanted the docker image to be cached after building the dependencies. After later resolving this issue, I posted something explaining docker caching, and was informed that the answer was already linked in the Stack Overflow post. I made this mistake, someone else made this mistake; it's time to clarify.
After building, changing only the source and rebuilding starts from the cached image with dependencies already built.
someone needs to relax...
Also @karlfish, what you're proposing is not actually working. If using …
Here's a better version.

```dockerfile
FROM rust:1.20.0
WORKDIR /usr/src

# Create blank project
RUN USER=root cargo new umar

# We want dependencies cached, so copy those first.
COPY Cargo.toml Cargo.lock /usr/src/umar/
WORKDIR /usr/src/umar

# This is a dummy build to get the dependencies cached.
RUN cargo build --release

# Now copy in the rest of the sources
COPY src /usr/src/umar/src/

# This is the actual build.
RUN cargo build --release \
    && mv target/release/umar /bin \
    && rm -rf /usr/src/umar

WORKDIR /
EXPOSE 3000
CMD ["/bin/umar"]
```
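One caveat worth noting with this pattern (my note, not from the original comment): cargo's freshness check is timestamp-based, so after the later `COPY src` step cargo can still consider the dummy build's artifacts fresh and ship the dummy binary. A commonly used hedge is to bump the source timestamps before the real build, roughly:

```dockerfile
# After copying the real sources over the dummy ones...
COPY src /usr/src/umar/src/
# ...force cargo to treat the crate's own code as changed, so the
# final binary is rebuilt while the dependency layer stays cached.
RUN touch src/main.rs \
    && cargo build --release
```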
You can always review the complete Dockerfile for the playground.
Hi!
I agree that it would be really cool to have a --deps-only option so that we could cache our filesystem layers better in Docker. I haven't tried replicating this yet, but it looks very promising. This is with glibc and not musl, by the way. My main priority is to get to a build that doesn't take 3-5 minutes every time, not a 5 MB alpine-based image.
My interpretation (I'm also just a subscribed lurker here): maintainers have indicated that this feature isn't as easy as it may seem at first glance, and have no interest in implementing it themselves right now. To push this forward, someone needs to come up with a proposal that maintainers could evaluate, one that takes into account the complexities and proposes resolutions and a path forward. That's how open source is supposed to work: if you want work done, you need to get involved. But instead of any in-depth discussion, what I see is hundreds of "I need this, why isn't this done yet?" comments, which drown out any remaining signal in this thread.
To add to intgr's comments: the cargo team is a small group of volunteers. So small, in fact, that we have this statement in our contribution docs: …
With some new team members, we are starting to get a little more breathing room and are discussing how to communicate our mentorship capacity. As for where we contribute with our limited time, that is mostly driven by our own priorities. There are many different important things, and it's OK that we each have different needs and priorities. For me, my priorities are lowering the barrier for contributors and improving the MSRV experience.
So in your eyes, what would this issue need before it was explicitly accepted by the cargo team? Obviously there are thousands of open cargo issues at the moment, and the team's priorities will not align with all of them. In particular, I see this issue as a way to improve the experience of using Rust in production environments.
Considering it would take time for a cargo team member to catch up on this when it isn't any of our priorities, what would be most helpful is someone summarizing the current state of this issue, including …
From there, as time allows, we could provide guidance on what next steps someone could take to move this along.
Here's my attempt to sum up the current status of this issue; forgive anything I miss, as this thread is long! At the time of writing it is 7 years old with 104 participants.

### Overview

This issue has converged on a feature request that adds a build flag to …

The majority of participants are interested in this feature so that Rust projects can leverage Docker's layer caching system. Right now, when rebuilding a Rust project in Docker, you must recompile every dependency even if your change touched only the source files. There were a couple of other use cases discussed in this thread that don't have as much community support: …
Therefore I will focus on the first use case I outlined for the remainder of this summary.

### Requirements

A hypothetical …
### Current Solution

The fundamental solution that most of the active work builds on is outlined nicely in a blog post by @LukeMathWalker here. The basic idea consists of: …
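(For concreteness, a sketch of mine, not part of the original summary: that basic idea is what cargo-chef automates. Its multi-stage Dockerfile looks roughly like the cargo-chef README example below; the image tag and stage names are illustrative, and the exact flags may have changed since this was written.)

```dockerfile
FROM rust:1.70 AS chef
RUN cargo install cargo-chef
WORKDIR /app

FROM chef AS planner
COPY . .
# Produce a "recipe" describing only the dependency graph.
RUN cargo chef prepare --recipe-path recipe.json

FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
# Build dependencies only; this layer stays cached as long as
# recipe.json (i.e. the dependency set) is unchanged.
RUN cargo chef cook --release --recipe-path recipe.json
# Now copy the real sources and build the application itself.
COPY . .
RUN cargo build --release
```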
### Existing tools

There has been some community effort put into this since the issue was opened, largely deriving from the solution above but in a more scalable fashion: …
### Prior Art

There are several examples of build systems and package managers allowing for this type of integration. This table was originally created by @mroth: …
### Rust in Production

I would say that the overall goal of this thread is to lower the friction of deploying Rust in production systems. There do exist tools that ease the pain (listed above). That said, I think this is a worthwhile endeavor to bring into the standard tooling of the Rust ecosystem, for the following reasons: …

Thanks all!
@bkolligs while that is a good summary of that specific solution, I was asking for a further step back, to make sure the problems are understood, that we've identified the needs behind them, and that we've explored alternatives. For example, how does RUN caching fit into the solution space? Were there any challenges raised in the thread that need addressing? What are the trade-offs of those workarounds with each other and with the built-in solution? You mention one in passing, buried in the "Rust in Production" section, related to sccache, but that is an important topic to address, either for people to work around this in the meantime or for prioritizing mentoring people on solving this. As for the prior art, …
And as a heads up: with our team's capacity, we can't regularly provide this level of hand-holding for areas we aren't focused on; we need people to step up, take a wider view, and tackle things like this. I made an exception here to try to help turn this thread around, but I'm likely to bow out at this point.
Thank you for the feedback. I'll reflect on some of the questions you posed (and invite others to do the same).
As this thread has become hot again, I've decided to try my hand at implementing the feature. It's actually my first contribution to cargo (and my first dive into the code), and I'm quite new to the ecosystem, so I may have missed some points. Please tell me if I'm off the mark. @epage I understand you are part of the cargo team, so can I consult you about opening a PR?
@wyfo I prefer not to look at PRs until the design phase is resolved. See my above comments on that topic.
Does this mean that the issue has finally entered a design phase? And how can this design phase be driven if, as you pointed out, the cargo team is overwhelmed and you may yourself bow out? This feature isn't accepted yet, but it is still the most popular feature request ever, by far. No criticism here; I'm just not familiar with cargo's design process, and I truly wonder how this issue can leave its stale state this way. Anyway, I'll leave my POC here, hoping it can demonstrate that the feature is quite simple in terms of implementation (again, if I haven't missed anything).
It's unfortunately not so simple, and your changes won't solve this issue. I think that is also a problem with this issue in general: it seems like it should be a tiny change to cargo, and that's possibly why people are frustrated with the cargo team not implementing it. But in reality, as has already been mentioned in this thread several times, the issue is much more complicated. Just avoiding the build of the final crates won't help you with Docker. The core of the problem isn't avoiding the build of the leaf crate; it's how to cooperate with Docker in a way that allows proper layer caching. Even with your change, there is no easy way to copy only the files that define all the workspace dependencies into Docker so that the layer is not busted when only leaf code changes. For that, you'd probably need to export some kind of build plan from Cargo that would allow it to build the project (potentially without the leaf crate); this is more or less what cargo chef does. And this has a lot of edge cases and complications (in any case, it's not a trivial feature!). This is much harder for cargo than e.g. for npm, simply because they define dependencies in a different way: npm basically contains a flat list of dependencies in a JSON file, and there are no workspaces in npm! That's mostly trivial to resolve (although …

Regarding the design phase: I don't think that it has to be done by the cargo team :) But this would indeed need a thorough design document.
Actually, I should have added a test to show the case where source code is not present; it's done now (38302f8), and it works fine. Therefore, I think this POC plays nicely with Docker, as it doesn't require source code to be added, only manifest(s). However, there is one caveat: it requires explicit targets (cf. the commit linked above for an example), so inferred …

IMO, this is not a big issue, as it is still fewer additional things to add compared to using …

So it seemed to me that this solution fulfilled all the requirements listed in the summary above (keeping the manifest untouched was not listed as a requirement), but I should have mentioned the case without source, mea culpa. Did I miss something else?

P.S. Regarding inferred targets, it's not really hard to add them, but it requires a lot more code modification. However, it may introduce a small edge case with workspaces: if a workspace member has a target with the same name as another member, wrongly inferring a target for this second member will cause a name collision. I admit this edge case is quite convoluted, so it might not be an issue if it's properly documented.
Again, to reiterate: there are many edge cases that would need to be resolved (with a properly written design, not an implementation) before this could be considered for inclusion in cargo. A "solution" that works for one particular use case, requires changes in the Cargo.toml of end crates, and doesn't deal with edge cases is fine for a cargo crate, but not for an official cargo command. If you wanted to demonstrate that your approach solves the issues described here, you should show various situations (normal crate, workspace, patch sections, etc.) and how they would work with Docker. Your current solution doesn't seem to make life any easier for workspace projects, for example: you still need to copy the Cargo.toml files manually. It's important to think about the use case; unless the command helps the Docker use case (or the other use cases mentioned here), it's not useful on its own. This should really be sketched in some design document first.
I don't think anyone who's used to writing Dockerfiles would consider that a problem at all. If anything, that's a Dockerfile issue for not having support for better COPY directives.
Yes, but in that case the latest implementation presented here (the simple "just don't build the leaf crate(s)" code) doesn't help you at all. To get Docker layer caching working for a (workspace) cargo project, you'd need to: …

And you'd need to do these things regardless of whether you use …

Btw, all this has been discovered and mentioned in this long issue thread several times over. That's why it's not beneficial to offer simplistic implementations; rather, a design should be proposed, one which takes into account this whole issue thread, realistic use cases, implementations from cargo chef and other tools (npm ci etc.), and sketches out how the feature would need to work to actually enable Docker layer caching without requiring the user to do all these hacks. I know that it's tempting to think that the simple implementation will help, but as has been demonstrated repeatedly in this thread, the "naive" solution of simply building only the dependencies doesn't help with Docker caching, because of how Cargo projects are structured.
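(To make the scale of those manual steps concrete, here is a sketch of the dummy-file workaround for a hypothetical two-member workspace. This is my illustration, not code from the thread; the member names `api` and `core`, the image tag, and the paths are all invented.)

```dockerfile
FROM rust:1.70 AS builder
WORKDIR /app
# Copy only the manifests and the lockfile.
COPY Cargo.toml Cargo.lock ./
COPY api/Cargo.toml api/Cargo.toml
COPY core/Cargo.toml core/Cargo.toml
# Create dummy targets so cargo can load the workspace and
# build the full dependency graph.
RUN mkdir -p api/src core/src \
    && echo 'fn main() {}' > api/src/main.rs \
    && touch core/src/lib.rs
# This layer stays cached until a manifest or the lockfile changes.
RUN cargo build --release
# Copy the real sources, then bump timestamps so cargo does not
# mistake the dummy artifacts for fresh ones.
COPY . .
RUN touch api/src/main.rs core/src/lib.rs \
    && cargo build --release
```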
I need to clarify that I wrote my POC a few hours before the start of the last discussion (the cargo team's intervention asking for a design phase, and the impressive summary written by @bkolligs). I discovered all these comments just before writing mine, but as the implementation was ready, I decided to still post it. I didn't want to skip the design phase on purpose. But I should have presented it more clearly, because wrong assumptions were made about it. @Kobzol I think you misunderstood my previous explanation: with the latest implementation presented here, you don't have to create a dummy …

Even if this implementation is simplistic (and again, written before the start of the design discussion), I do believe it helps with Docker caching. However, I don't want to present it as the solution; I just think it's one possible solution, and if I keep discussing it in particular, that's mostly because it seems to me it was not correctly understood (especially the "works without source" part). But I agree that the last sentence of my second comment was inappropriate, and I apologize. That being said, let's talk about design now.
Here is a quick and subjective attempt at comparing both in terms of Docker integration (I've implemented the latter, but use the former daily in production): …

Personally, both are fine to me. I find "manifests only" more straightforward, and slightly more cache-efficient (no regeneration), but at the cost of any Cargo.toml change invalidating the cache. Going into implementation detail, "manifests only" seems a lot simpler (without inferred targets, it's definitely simple), although inferred-target handling may sound quite hacky; "build plan", on the other hand, would require both plan generation and parsing/compilation. However, I haven't seen it mentioned, but the more familiar I get with this part of the cargo code, the more I think …
I did not want to diminish your work; I just wanted to note that implementations similar to the one you posted have already been proposed (and implemented) in this issue, and then abandoned, as they wouldn't solve the Docker issue. I don't want to speak for the Cargo team, but I think that something akin to an RFC should be written and discussed first, before we get to an implementation. Some discussion about this, in which the author of cargo chef participated, has happened on Zulip. I think that creating a HackMD document or something like that, with an analysis of the current state of affairs, would be a good next step. I'll try to create it if I can find some time in April. I don't want to continue the endless stream of comments debating the usefulness of the simple approach, but to present a counter-point to …

Maybe to put it another way, which is how I interpret it: even if some implementation (e.g. yours) magically solved all of the problems in this issue, I think that it still wouldn't get merged before the author could describe exactly what problems it solves, what the state was before, how it differs after the new implementation, and whether there are any alternatives.
That's demonstrably false. You'd quickly run into roadblocks for all but the simplest projects, which is why this issue has a lot of interest. Otherwise most of us would just use that workaround and call it a day (it's not a feasible workaround).
I agree that it's not feasible. That's why I'd like to see a solution that removes the need for manual copying, and also resolves the other issues mentioned here.
I'm very confused by this. The problem is that you need to create dummy files and then figure out how to invalidate the cache of just those dummy files. Are there other problems with building in Docker? Dynamic linking preventing …
As @wyfo mentions, the calculation of what to build needs to be based on some part of the source code, but not all of it, since that would invalidate the cache and make the whole feature moot (for the Docker use case). However, if there is a better approach, e.g. based on the unit graph, or even better, simply using a single …

Regarding the "design doc", I'm not totally sure that having this discussion in some other place would be better than here; in both places you need people to comment and provide feedback, but it would possibly be easier to keep track of / pin the freshest conclusions, spec, and open issues there. In any case, I feel the person most knowledgeable about those would be best placed to create it and paste the relevant content into it, which I would encourage @Kobzol to do if you feel it would keep the info better organized.
Ok, I started writing the document; I'll post it here once it's presentable. It will take some time to try all the workarounds mentioned here, find out how other languages do it, examine the existing 3rd-party tools, etc.
I put my findings here. It contains an overview of the problem, a description of possible workarounds, and some potential solutions. I'll follow up with the cargo team to find out what the next steps could be.
After discussing this with @epage, we think that the next step should be to better understand how people are using Cargo with Docker, primarily whether they are using some of the workarounds mentioned in the document (primarily …

If you're using Cargo with Docker and you have a problem with dependencies being rebuilt unnecessarily because of missing Docker layer caching, could you please answer the questions below in a comment on this issue? The …
Tried it but had to give up, because I'd need the cargo build artifacts to be available in the built image in order to actually use them in CI runs (e.g. Docker's cache mount does not provide a way to keep the cached path in the image, and trying to copy manually (in the …

It also has serious limitations with locking.
**Why is Docker critical to your workflow?**

When deploying an AWS lambda, a microservice against AWS ECS / …, a helm chart in Kubernetes, or almost any time people talk about serverless, docker enters the picture. Many organizations rely on hyperscaler managed services to deploy and operate software. In this context, one common pattern is to use docker as part of the development process, so that the "works on my machine" risk is mitigated. Rust is no exception. When building a microservice, it is more reliable to use docker-compose to set up the scene of the different services (i.e. …

The former is more reliable because (this is docker-compose specific, but the same considerations apply to Helm charts): …

When Rust does not have a good story around Docker (compared with other languages), people are forced to use the latter, which: …

Alternatively, if they use the former, it: …

I.e. developers need to choose between two "bad" choices. This makes the case for Rust less compelling when compared to other languages. Coincidentally, the two "bad" choices compromise exactly what Rust aims to improve: reliability, productivity, and security. This is the reason why this GH issue is so popular: people who value these aspects gravitate towards Rust, and those same people are trying to explain that Rust does not fulfill these when they integrate it with Docker (for the reasons summarized in the OP's summary).

**Have you tried using …**
I also encountered this; it's pretty annoying. The solutions to this (AFAIK) are: …
My two cents: …

Deployment to AWS EC2/ECS, so a Docker image is the intended build product of my GitHub Actions CI infrastructure. Also, we use crusty third-party/proprietary native libraries that work best with a particular distro version, so it's important to build and test our Rust binary within that Docker image, rather than build it and then copy it into a Docker image.

I haven't tried …

The reason I say this setup is better for us than …

Docker cache mounts don't get saved on the host at a consistent location, so there's no straightforward way to save and restore them in a GHA cache. We'd have a fresh cache on each CI invocation, which is not useful.
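(For context on the cache-mount approach being discussed, a minimal sketch of mine using standard BuildKit syntax; it assumes the default cargo home of the official rust image and a build in /app:)

```dockerfile
# syntax=docker/dockerfile:1
FROM rust:1.70
WORKDIR /app
COPY . .
# BuildKit cache mounts persist the registry and target dir between
# builds on the same builder, but the caches live outside the image
# and outside any fixed host path, which is the GHA problem above.
RUN --mount=type=cache,target=/usr/local/cargo/registry \
    --mount=type=cache,target=/app/target \
    cargo build --release
```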
I just verified the unstable … I ran:

```console
RUSTC_BOOTSTRAP=1 cargo doc --keep-going -Z unstable-options
```

Tracking issue is #10496
cargo team notes: …

There should be an option to only build dependencies.