# Compile time #113
Comments
yeah this is definitely on our radar. Thanks for flagging it. Other folks: please 👍🏻 this issue to prioritize compile times if they become an issue for you. |
You could add a demo example that depends on all of the SDKs, for compile-time measurement. |
We kind of get that right now through the CI in smithy-rs where one of the actions compiles every single SDK as a single Cargo workspace. For example, the cargo test AWS SDK step, which currently shows 14 minutes for all of the supported services. This data gets muddied every time we add a new service though. It would be nice to have it split out by service so that we could, for example, compare EC2's compile time across builds. It would also be nice to track with |
This is also my biggest complaint in using the SDK. One idea that a team member here had is that EC2 functionality could be opted into using Cargo features. Often you will be using only a few calls from the API, but you have to compile all of it. Another idea would be to use dynamically dispatched trait objects instead of trait-bound generics. I don't know how much that would help, but I mention it because Rust for Rustaceans mentions faster compile times as a benefit of using dynamic dispatch. |
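As a general illustration of that trade-off (a minimal sketch of the language mechanism, not of the SDK's actual internals): a generic function is monomorphized into a fresh copy for every concrete type it is used with, while a `dyn` trait object is compiled once and dispatched through a vtable at runtime, which is why Rust for Rustaceans lists it as a compile-time win.

```rust
use std::io::{self, Write};

// Monomorphized: rustc compiles a separate copy of this function
// for every concrete `W` it is instantiated with.
fn send_generic<W: Write>(mut out: W, payload: &[u8]) -> io::Result<()> {
    out.write_all(payload)
}

// Dynamically dispatched: compiled exactly once; calls go through
// a vtable at runtime instead of being duplicated at compile time.
fn send_dyn(out: &mut dyn Write, payload: &[u8]) -> io::Result<()> {
    out.write_all(payload)
}

fn main() -> io::Result<()> {
    let mut buf = Vec::new();
    send_generic(&mut buf, b"hello")?; // instantiates send_generic::<&mut Vec<u8>>
    send_dyn(&mut buf, b"world")?;     // single shared code path
    Ok(())
}
```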
I've investigated this a teeny bit and at least part of the problem seems to stem from the fact that EC2 has a significant number (118) of paginated APIs. The paginators return |
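For context, this is roughly what calling one of those generated paginators looks like (a sketch based on the current aws-sdk-ec2 surface; exact return types and accessors vary by SDK version). Each of those 118 operations gets its own paginator machinery like this for rustc to chew through.

```rust
use aws_sdk_ec2 as ec2;

async fn print_instance_ids(client: &ec2::Client) -> Result<(), ec2::Error> {
    // Each generated paginator adds more code to monomorphize and check.
    let mut pages = client.describe_instances().into_paginator().send();
    while let Some(page) = pages.next().await {
        let page = page?;
        for reservation in page.reservations() {
            for instance in reservation.instances() {
                println!("{:?}", instance.instance_id());
            }
        }
    }
    Ok(())
}
```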
For my purposes I actually switched off of using this SDK to instead use a thin wrapper around the AWS CLI, and shaved off many, many minutes of build time, especially when you consider the separate rebuilds involved for e.g. tests/clippy/release. Obviously this is not an option, or ideal, in most cases where the SDK is used; I'm simply sharing the difference it made for me. I would love to use the SDK instead, but for my light purposes it wasn't worth the cost of many minutes of build time. I sincerely believe that it would be a great idea to get in touch with the Rust compiler optimization team about the compile times incurred by these crates, because I think it would serve as a great case study and benchmark on the impact of e.g. monomorphization, and optimizing that would benefit the broader Rust ecosystem. |
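The thin-wrapper idea can be as small as this (a minimal sketch, assuming the `aws` CLI is installed and credentials are already configured; the command and flags are standard AWS CLI, everything else is illustrative):

```rust
use std::process::Command;

// Shell out to the AWS CLI and return its JSON output as a string,
// instead of linking any of the SDK crates at all.
fn describe_instances_json() -> Result<String, Box<dyn std::error::Error>> {
    let output = Command::new("aws")
        .args(["ec2", "describe-instances", "--output", "json"])
        .output()?;
    if !output.status.success() {
        return Err(String::from_utf8_lossy(&output.stderr).into_owned().into());
    }
    Ok(String::from_utf8(output.stdout)?)
}
```

From there, callers parse the JSON with whatever they already depend on (e.g. serde_json), trading type safety for near-zero compile cost.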
Did anything come of these investigations? For a short term fix I am considering forking the generated ec2 crate and removing all the code that is unused by my project. |
I think we were all expecting at least some progress on this before this crate went 1.0. :-( Any plans to even start addressing this issue? |
I think what is needed is for the code generation to include a bunch of feature gates on EC2 API functionality so that we can enable the things we need in Cargo.toml. |
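If that existed, it might look something like this in a consumer's Cargo.toml (entirely hypothetical; aws-sdk-ec2 does not expose per-operation features today, and the feature names here are made up):

```toml
# Hypothetical: opt into only the EC2 operations this project calls.
[dependencies]
aws-sdk-ec2 = { version = "1", default-features = false, features = [
    "run-instances",
    "describe-instances",
    "terminate-instances",
] }
```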
Compile times of an async CLI application went from 5 to 15 seconds just by referencing aws-sdk-ec2 and calling into 2 different areas of its functionality. I was also expecting some progress on this before the stable release; please address this ASAP. |
Just to chime in here on the build times: I'm looking into build failures (well, more like Jenkins dying during builds) and noticed a couple of things that seem to line up with the behaviour here in more detail. I've been tinkering with more verbose compiler logging/tracing for each stage and noticed some pretty interesting results for the AWS SDKs in particular, which are miles more expensive than any other crate in shotover. The important results here are as follows.

Full log is here:

### Crate Highlights

I'd like to note that I'm aware that RSS-based metrics for memory usage aren't the most accurate, but they are decent indicators. To that end, there are some seriously big numbers here.

#### AWS IAM

Most notably in the IAM crate, with type checking and borrow checking. The first round is ok, but the second round of it is what concerns me the most.

```
time:   7.677; rss:  463MB ->  686MB ( +223MB)  type_check_crate
time:   6.148; rss:  686MB ->  886MB ( +200MB)  MIR_borrow_checking
time:  36.337; rss: 1743MB -> 2523MB ( +780MB)  type_check_crate
time:  30.141; rss: 2523MB -> 3476MB ( +953MB)  MIR_borrow_checking
```

This crate doesn't even finish compiling, and by the time the stage after

#### AWS EC2

Similar behaviour in the EC2 crate as well, but it seems to be more centred around translation to IR and codegen. It seems the EC2 crate has a smaller footprint in terms of checkable usage.

```
[2024-01-09T01:45:11.935Z] time:   9.141; rss:   49MB -> 1016MB ( +967MB)  expand_crate
[2024-01-09T01:45:11.935Z] time:   9.142; rss:   49MB -> 1016MB ( +967MB)  macro_expand_crate
[2024-01-09T01:46:00.366Z] time:   4.718; rss:  813MB -> 1577MB ( +764MB)  codegen_to_LLVM_IR
[2024-01-09T01:46:00.366Z] time: 117.170; rss:  637MB -> 1577MB ( +940MB)  codegen_crate
```

### Crate Structure Considerations

Immediately, this rings a bell to me as poorly structured code that relies on insane amounts of context passed around, or transitive overuse of contexts between function calls.

### Another avenue

Looking at the IAM crate in particular, the first thing is that the crate seems to match that: from what I can see, it relies on async state passed around through invocations of various kinds. Within

```rust
#[::tokio::main]
async fn main() -> Result<(), aws_sdk_iam::Error> {
    let config = aws_config::load_from_env().await;
    let client = aws_sdk_iam::Client::new(&config);
    // ... make some calls with the client
    Ok(())
}
```

I feel like this presents a nightmare situation for the borrow checker with production-level usage of this crate. Could be wrong, but the compilation data seems to indicate something in this realm. I haven't looked too far into the EC2 crate characteristics, though. |
The crates are all fundamentally identical in terms of structure; these are all interesting statistics. We're planning on spending time this year to see if there are any quick wins that are achievable. We have some inherent challenges: EC2 has literally hundreds of APIs, and we need to generate all of them. But some things can probably be improved. |
Well said. I have run out of memory on a 16GB laptop just trying to compile something that used one EC2 API call. It's unacceptable. AWS developers, thank you for the efforts to make this usable. However, your team really needs to address this, even if it means releasing v2.0.0 of the crates with an entirely different approach. |
I totally agree, especially about memory. The compile time itself I expect to be improved by the upcoming rustc parallel frontend, but memory is very difficult to work around and can make compiling the Rust SDK in CI prohibitively expensive. We're prioritizing investigating this and will keep folks posted. We're also very cognizant of the fact that without decisive action, this issue is only going to get worse as more and more APIs are added to these existing crates.

I think, however, a lot of this comes from a very fundamental limitation: EC2 (edit: and other large crates like S3, SQS, IAM, etc.) currently have hundreds of operations (EC2 has 615 as of today). They each need their own serializers and deserializers, inner types, etc., which is unavoidably an absurdly large amount of code. There are definitely things we can do to shrink how much code we generate; I expect that these will yield improvements in the low double-digit percentages. This is helpful, but I think ultimately an EC2 that needs 12GB of memory and compiles in 1m30s is not that much better than one that needs 16GB and compiles in 2m, assuming the rest of your build only needs 1GB and 10s.

We're concurrently investigating ways that we can allow customers to compile only parts of the EC2 crate. These are unfortunately also limited because it was recently discovered that

Another last-ditch item that we're floating is a way to generate ad-hoc SDKs with only a subset of operations included. This is obviously non-ideal for a number of reasons, but we agree that it's incredibly frustrating to run out of memory on your build because you want to make one HTTP request to start an instance.

In any case, bear with us; this is among our top priorities to improve for this quarter. |
It seems there is a big emphasis on EC2 here but I am curious if the other SDKs are being included as well. I am experimenting with |
that's a good point—I updated the note. EC2 is the biggest, but there are a lot that are close behind. |
What do you mean by "does not scale well on crates.io"?
|
There was a Rust blog post on the feature scalability issue, although 23,000 is quite a bit larger than 600 😄 This is the important bit that prevents us from taking this approach, though:
|
The limitation can be relaxed for a project.
Therefore, I think it's still worth considering the approach, because IMHO it's quite straightforward and would improve the situation a lot immediately. Maybe not as a first resort, but we can give it a try if other ways don't work or require too much effort. |
My understanding of the

**Cargo.toml**

**build.rs**

```rust
// The hypothetical builder API being proposed here: a build script that
// generates a Rust source file containing only the listed operations.
fn main() {
    aws_sdk_builder::generate_rust_file(
        "src/my_aws_api.rs",
        vec![
            "ec2/run_instances",
            "ec2/list_instances",
            "s3/get_object",
        ],
    );
}
```

Into
|
I thought it worth mentioning that I've ported our usage of aws-sdk-ec2 to shell out to the aws CLI as a workaround until the compile time issues are fixed: shotover/aws-throwaway#41 |
I was having a read of the generated code in
What about features for groups of operations? Is there any kind of internal grouping of operations in the underlying model, or otherwise by "commonness of use"? From quickly skimming the operation list, it looked to me like most of the operations are ones I would never need, because they cover use cases that I think of as obscure. I wonder how big the subset is of "stuff that most people use a lot"? I.e. would it be useful to have features for:
🤷♂️ I'd be keen to know if there are particular avenues that are currently favored / considered most promising by the AWS team. Thanks! 🙇 |
I tested it out on EC2 v1.42.0 and it's a big improvement over the current front-end. I ran the following compiles on my work laptop: a 2021 16" M1 MacBook Pro 32GB.
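For anyone wanting to reproduce this, a minimal opt-in sketch (nightly toolchain only; the `-Z threads` flag is the one from the Rust project's parallel front-end announcement, and 8 is just an example thread count):

```toml
# .cargo/config.toml — enable the nightly parallel front-end for this project.
[build]
rustflags = ["-Z", "threads=8"]
```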
It looks like they still have many issues to solve before it can be the default, though. |
Parallel frontend won't help if the CPU is already fully occupied, e.g., busy with compiling other crates together with |
Parallel frontend will also not help with memory usage — I'd expect it to make it worse. For a really meaningful improvement, I think a solution is still needed that allows doing less work, not just trying to do the same amount of work faster. E.g. splitting it up into separate crates, feature flags, etc. If Rust had better support for dynamic linking (e.g. via the proposed crABI and |
**TLDR: a custom wrapper around smithy-rs reduced aws-sdk-* compile times to seconds.**
Update: For our team, here is how we solved aws-sdk compile times:
We wrote a Rust CLI designed to build a custom version of aws-sdk-*.
The CLI program does this:
1. Clones smithy-rs into a temporary dir
2. Strips down the JSON definition files to just the services and APIs we use (a sketch of this step is at the end of this comment)
3. Calls the smithy-rs build process to generate the Rust code
4. Copies the outputted Rust code into a public GitHub repo in our account
5. Tags it with the aws-sdk version
Finally, we updated our project's Cargo.toml to refer to our GitHub version of aws-sdk-*.
This works great because smithy-rs is driven by the JSON definitions, so no additional changes were needed.
Once using our own GitHub version of the aws-sdk-* crates, fresh compile times for our project went from minutes down to seconds, and recompile times went from 20 seconds down to 8.
Using the mold linker brought this down to 1.5 seconds.
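For reference, the usual way to opt into mold on Linux is a small Cargo config like this (assumes clang and mold are installed; adjust the target triple to yours):

```toml
# .cargo/config.toml — link with mold instead of the default linker.
[target.x86_64-unknown-linux-gnu]
linker = "clang"
rustflags = ["-C", "link-arg=-fuse-ld=mold"]
```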
For us it was worth it because the near-tripling of our compile times was affecting the development flow, especially in web dev where you need to iterate more quickly.
Keeping updated with upstream is as simple as running a CLI command and updating the version tags in our project's Cargo.toml.
We considered committing the generated crates to our project directory, but there were two main reasons we elected to use a separate repo instead:
1. cargo clippy DOES NOT like generated aws-sdk code
2. It was still 10MB of source code with only four SDK APIs enabled.
Looking forward to not needing to do this, but it does work well for now.
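A rough sketch of what the model-stripping step (step 2 above) can look like, assuming the Smithy JSON AST layout where a top-level "shapes" map contains a service shape whose "operations" list targets entries like "com.amazonaws.ec2#RunInstances" (the function name here is illustrative, and real models also need transitive shape pruning):

```rust
use serde_json::Value;

// Drop every operation binding from the service shape except the ones we use.
fn retain_operations(model: &mut Value, keep: &[&str]) {
    let Some(shapes) = model.get_mut("shapes").and_then(Value::as_object_mut) else {
        return;
    };
    for shape in shapes.values_mut() {
        // Only service shapes carry the top-level "operations" list.
        if shape.get("type").and_then(Value::as_str) != Some("service") {
            continue;
        }
        if let Some(ops) = shape.get_mut("operations").and_then(Value::as_array_mut) {
            ops.retain(|op| {
                op.get("target")
                    .and_then(Value::as_str)
                    .is_some_and(|target| keep.iter().any(|k| target.ends_with(k)))
            });
        }
    }
}
```

Called as, e.g., `retain_operations(&mut model, &["#RunInstances", "#DescribeInstances"])` before handing the stripped model to the smithy-rs build.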
|
@gahooa Would your team be willing to open source your implementation? |
@vultix yes we would. We are working on updating it to account for recent changes in the aws crates (service definitions moved to another repo) which prevented us from upgrading. |
I just tried aws-sdk-ec2 in my workspace. rusoto_ec2 was always the second-to-last crate to finish and took forever (66s) to build; aws-sdk-ec2 seems to be even worse.
Is compile time something you are going to consider in the future?