Compile time #113

Open
tschuett opened this issue Jun 15, 2021 · 28 comments
Labels
feature-request A feature should be added or improved. p2 This is a standard priority issue

Comments

@tschuett

I just tried the aws-sdk-ec2 in my workspace. rusoto_ec2 was always the second to last crate and took forever (66s) to build. aws-sdk-ec2 seems to be even worse.

>  cargo +nightly build -Z timings

aws-sdk-ec2 v0.0.8-alpha | 95.8s
rusoto_ec2 v0.46.0 | 66.5s

Is compile time something you are going to consider in the future?

@rcoh
Contributor

rcoh commented Jun 15, 2021

yeah this is definitely on our radar. Thanks for flagging it.

Other folks: please 👍🏻 this issue to prioritize compile times if they become an issue for you.

@tschuett
Author

You could add a demo example that depends on all SDKs for compile-time measurement.

@jdisanti
Contributor

We kind of get that right now through the CI in smithy-rs, where one of the actions compiles every single SDK as a single Cargo workspace. For example, the cargo test AWS SDK step currently shows 14 minutes for all of the supported services. This data gets muddied every time we add a new service, though. It would be nice to have it split out by service so that we could, for example, compare EC2's compile time across builds. It would also be nice to track with cargo llvm-lines for a measurement that is less affected by differences in the CI hosts.
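
Something like the following (assuming the cargo-llvm-lines subcommand is installed) would give a rough per-service number:

cargo install cargo-llvm-lines
cd aws-sdk-ec2 && cargo llvm-lines | head -20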

@webern

webern commented Apr 20, 2022

This is also my biggest complaint in using the SDK. One idea that a team member here had is that EC2 functionality could be opted-into using Cargo features. Often you will be using only a few calls from the API, but you have to compile all of it.

Another idea would be to use dynamically dispatched trait objects instead of trait-bound generics. I don't know how much that would help, but I mention it because Rust for Rustaceans lists faster compile times as a benefit of using dynamic dispatch.

@Velfi
Contributor

Velfi commented Apr 20, 2022

I've investigated this a teeny bit and at least part of the problem seems to stem from the fact that EC2 has a significant number (118) of paginated APIs. The paginators return impl tokio_stream::Stream which means we're doing a good amount of monomorphization. We'll investigate feature gating pagination or switching to dynamic dispatch.
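
As a rough illustration of that trade-off (stand-in types only, not the actual generated paginator API):

use std::pin::Pin;
use tokio_stream::Stream;

// Stand-in for a generated page of EC2 output.
struct DescribeInstancesOutput;

// Roughly the current style: an opaque, statically dispatched stream type.
fn paginate_generic() -> impl Stream<Item = DescribeInstancesOutput> {
    tokio_stream::iter(vec![DescribeInstancesOutput])
}

// The dynamically dispatched alternative: a type-erased boxed stream that every
// operation could share, trading a heap allocation and vtable calls for less
// monomorphized code.
fn paginate_boxed() -> Pin<Box<dyn Stream<Item = DescribeInstancesOutput> + Send>> {
    Box::pin(tokio_stream::iter(vec![DescribeInstancesOutput]))
}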

@blaenk

blaenk commented Nov 26, 2022

For my purposes I actually switched from this SDK to a thin wrapper around the AWS CLI and shaved off many, many minutes of build time, especially when you consider the separate rebuilds involved for e.g. tests/clippy/release.

Obviously this is not an option, or not ideal, in most cases where the SDK is used; I'm simply sharing the difference it made for me.

I would love to use the SDK instead, but for my light purposes it wasn't worth the cost of many minutes of build time.
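
A minimal sketch of what such a thin wrapper can look like (illustrative only; error handling and JSON parsing omitted):

use std::process::Command;

// Shell out to the AWS CLI and hand back the raw JSON for the caller to parse.
fn describe_instances(region: &str) -> std::io::Result<String> {
    let output = Command::new("aws")
        .args(["ec2", "describe-instances", "--region", region, "--output", "json"])
        .output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}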

I sincerely believe it would be a great idea to get in touch with the Rust compiler optimization team about the compile times incurred by these crates. I think they would serve as a great case study and benchmark on the impact of e.g. monomorphization, and optimizing that would benefit the broader Rust ecosystem.

@jmklix jmklix added feature-request A feature should be added or improved. p2 This is a standard priority issue labels Nov 28, 2022
@rukai

rukai commented Jul 12, 2023

We'll investigate feature gating pagination or switching to dynamic dispatch

Did anything come of these investigations?

For a short-term fix I am considering forking the generated ec2 crate and removing all the code that is unused by my project.
Edit: I attempted stripping out the bits I didn't need, but I've given up because of how entangled the generated code is.

@johnm

johnm commented Nov 28, 2023

I think we were all expecting at least some progress on this before this crate went 1.0. :-(

Any plans to even start addressing this issue?

@webern

webern commented Nov 28, 2023

I attempted stripping out the bits I didn't need, but I've given up because of how entangled the generated code is.

I think what is needed is for the code generation to include a bunch of feature gates on EC2 API functionality so that we can enable the things we need in Cargo.toml.
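
Purely as a sketch of the idea (neither this feature name nor this module layout exists in the generated crate today):

#[cfg(feature = "describe_instances")]
pub mod describe_instances {
    // generated input/output types, builders, and (de)serializers for this one operation
    pub struct DescribeInstancesInput { /* generated fields */ }
    pub struct DescribeInstancesOutput { /* generated fields */ }
}

Consumers would then enable only what they need in Cargo.toml, e.g. aws-sdk-ec2 = { version = "...", features = ["describe_instances"] }.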

@gahooa

gahooa commented Dec 3, 2023

Compile times of an async CLI application went from 5 to 15 seconds just by referencing aws-sdk-ec2 and calling 2 different areas of the functionality.

I was also expecting some progress on this before the stable release; please address this ASAP.

@Jack-Kilrain

Jack-Kilrain commented Jan 9, 2024

Just to chime in here on the build times: I'm looking into build failures (well, more like Jenkins dying during builds) and noticed a couple of things that seem to line up with the behaviour described here.

I've been tinkering with more verbose compiler logging/tracing for each stage and noticed some pretty interesting results for the AWS SDKs in particular, which are miles more expensive than any other crate in shotover. The important results are as follows.

[2024-01-09T01:45:01.991Z]    Compiling aws-sdk-ec2 v1.3.0
[2024-01-09T01:45:01.991Z] time:   0.001; rss:   41MB ->   44MB (   +3MB)	parse_crate
[2024-01-09T01:45:01.991Z] time:   0.000; rss:   44MB ->   49MB (   +4MB)	crate_injection
[2024-01-09T01:45:11.935Z] time:   9.141; rss:   49MB -> 1016MB ( +967MB)	expand_crate
[2024-01-09T01:45:11.935Z] time:   9.142; rss:   49MB -> 1016MB ( +967MB)	macro_expand_crate
[2024-01-09T01:45:11.935Z] time:   0.258; rss: 1016MB -> 1015MB (   -1MB)	AST_validation
[2024-01-09T01:45:11.935Z] time:   0.027; rss: 1015MB -> 1018MB (   +4MB)	finalize_imports
[2024-01-09T01:45:11.935Z] time:   0.108; rss: 1018MB -> 1018MB (   +0MB)	compute_effective_visibilities
[2024-01-09T01:45:11.935Z] time:   0.116; rss: 1018MB -> 1019MB (   +1MB)	finalize_macro_resolutions
[2024-01-09T01:45:15.203Z] time:   3.400; rss: 1019MB -> 1282MB ( +264MB)	late_resolve_crate
[2024-01-09T01:45:15.203Z] time:   0.163; rss: 1282MB -> 1284MB (   +1MB)	resolve_check_unused
[2024-01-09T01:45:15.459Z] time:   0.297; rss: 1284MB -> 1284MB (   +0MB)	resolve_postprocess
[2024-01-09T01:45:15.459Z] time:   4.117; rss: 1015MB -> 1284MB ( +269MB)	resolve_crate
[2024-01-09T01:45:15.714Z] time:   0.163; rss: 1284MB -> 1282MB (   -2MB)	write_dep_info
[2024-01-09T01:45:15.970Z] time:   0.165; rss: 1282MB -> 1282MB (   +0MB)	complete_gated_feature_checking
[2024-01-09T01:45:24.052Z] time:   0.497; rss: 1940MB -> 1901MB (  -39MB)	drop_ast
[2024-01-09T01:45:24.052Z] time:   8.104; rss: 1282MB -> 1746MB ( +464MB)	looking_for_derive_registrar
[2024-01-09T01:45:24.979Z] time:   9.303; rss: 1282MB -> 1743MB ( +461MB)	misc_checking_1
[2024-01-09T01:45:27.493Z] time:   2.087; rss: 1743MB -> 1823MB (  +80MB)	type_collecting
[2024-01-09T01:45:27.493Z] time:   0.490; rss: 1823MB -> 1819MB (   -4MB)	coherence_checking
[2024-01-09T01:45:42.328Z] time:  13.207; rss: 1819MB -> 1953MB ( +135MB)	wf_checking
[2024-01-09T01:46:00.366Z] time:   4.718; rss:  813MB -> 1577MB ( +764MB)	codegen_to_LLVM_IR
[2024-01-09T01:46:00.366Z] time: 109.242; rss: 1286MB -> 1577MB ( +291MB)	LLVM_passes
[2024-01-09T01:46:00.366Z] time: 117.170; rss:  637MB -> 1577MB ( +940MB)	codegen_crate
[2024-01-09T01:46:00.366Z] time:   0.187; rss: 1577MB ->  939MB ( -638MB)	free_global_ctxt
[2024-01-09T01:46:00.366Z] time:   0.133; rss:  939MB ->  944MB (   +5MB)	link_rlib
[2024-01-09T01:46:00.366Z] time:   0.141; rss:  939MB ->  944MB (   +5MB)	link_binary
[2024-01-09T01:46:00.366Z] time:   0.142; rss:  939MB ->  937MB (   -2MB)	link_crate
[2024-01-09T01:46:00.366Z] time:   0.143; rss:  939MB ->  937MB (   -2MB)	link
[2024-01-09T01:46:00.366Z] time: 132.034; rss:   27MB ->  143MB ( +116MB)	total
[2024-01-09T01:46:00.366Z]    Compiling aws-sdk-iam v1.3.0
[2024-01-09T01:46:00.366Z] time:   0.001; rss:   39MB ->   43MB (   +3MB)	parse_crate
[2024-01-09T01:46:01.293Z] time:  36.337; rss: 1743MB -> 2523MB ( +780MB)	type_check_crate
[2024-01-09T01:46:02.657Z] time:   2.174; rss:   43MB ->  289MB ( +247MB)	expand_crate
[2024-01-09T01:46:02.658Z] time:   2.175; rss:   43MB ->  289MB ( +247MB)	macro_expand_crate
[2024-01-09T01:46:02.658Z] time:   0.059; rss:  289MB ->  289MB (   +0MB)	AST_validation
[2024-01-09T01:46:02.658Z] time:   0.006; rss:  289MB ->  290MB (   +0MB)	finalize_imports
[2024-01-09T01:46:02.658Z] time:   0.015; rss:  290MB ->  290MB (   +0MB)	compute_effective_visibilities
[2024-01-09T01:46:02.658Z] time:   0.021; rss:  290MB ->  291MB (   +1MB)	finalize_macro_resolutions
[2024-01-09T01:46:03.219Z] time:   0.649; rss:  291MB ->  361MB (  +71MB)	late_resolve_crate
[2024-01-09T01:46:03.219Z] time:   0.041; rss:  361MB ->  362MB (   +0MB)	resolve_check_unused
[2024-01-09T01:46:03.475Z] time:   0.081; rss:  362MB ->  362MB (   +0MB)	resolve_postprocess
[2024-01-09T01:46:03.475Z] time:   0.816; rss:  289MB ->  362MB (  +72MB)	resolve_crate
[2024-01-09T01:46:03.475Z] time:   0.044; rss:  362MB ->  362MB (   +0MB)	write_dep_info
[2024-01-09T01:46:03.475Z] time:   0.046; rss:  362MB ->  362MB (   +0MB)	complete_gated_feature_checking
[2024-01-09T01:46:04.839Z] time:   0.107; rss:  493MB ->  494MB (   +0MB)	drop_ast
[2024-01-09T01:46:05.095Z] time:   1.554; rss:  362MB ->  460MB (  +98MB)	looking_for_derive_registrar
[2024-01-09T01:46:05.351Z] time:   1.807; rss:  362MB ->  463MB ( +101MB)	misc_checking_1
[2024-01-09T01:46:05.607Z] time:   0.386; rss:  463MB ->  502MB (  +39MB)	type_collecting
[2024-01-09T01:46:05.862Z] time:   0.113; rss:  502MB ->  515MB (  +13MB)	coherence_checking
[2024-01-09T01:46:09.129Z] time:   2.793; rss:  515MB ->  599MB (  +84MB)	wf_checking
[2024-01-09T01:46:13.294Z] time:   7.677; rss:  463MB ->  686MB ( +223MB)	type_check_crate
[2024-01-09T01:46:19.830Z] time:   6.148; rss:  686MB ->  886MB ( +200MB)	MIR_borrow_checking
[2024-01-09T01:46:21.736Z] time:   2.683; rss:  886MB ->  956MB (  +71MB)	MIR_effect_checking
[2024-01-09T01:46:23.102Z] time:   0.354; rss:  962MB ->  967MB (   +5MB)	module_lints
[2024-01-09T01:46:23.102Z] time:   0.355; rss:  962MB ->  967MB (   +5MB)	lint_checking
[2024-01-09T01:46:23.358Z] time:   0.362; rss:  967MB ->  968MB (   +1MB)	privacy_checking_modules
[2024-01-09T01:46:23.358Z] time:   1.166; rss:  957MB ->  968MB (  +12MB)	misc_checking_3
[2024-01-09T01:46:27.526Z] time:   4.251; rss:  968MB -> 1090MB ( +122MB)	generate_crate_metadata
[2024-01-09T01:46:27.526Z] time:   0.017; rss: 1090MB -> 1090MB (   +0MB)	monomorphization_collector_root_collections
[2024-01-09T01:46:29.418Z] time:   1.944; rss: 1090MB -> 1146MB (  +55MB)	monomorphization_collector_graph_walk
[2024-01-09T01:46:31.307Z] time:   1.705; rss: 1146MB -> 1182MB (  +36MB)	partition_and_assert_distinct_symbols
[2024-01-09T01:46:31.563Z] time:  30.141; rss: 2523MB -> 3476MB ( +953MB)	MIR_borrow_checking
[2024-01-09T01:46:43.763Z] time:  12.219; rss: 3476MB -> 3761MB ( +285MB)	MIR_effect_checking
[2024-01-09T01:46:50.315Z] time:   1.548; rss: 3776MB -> 3782MB (   +6MB)	module_lints
[2024-01-09T01:46:50.315Z] time:   1.549; rss: 3776MB -> 3782MB (   +6MB)	lint_checking
[2024-01-09T01:46:50.877Z] time:   1.663; rss: 3782MB -> 3797MB (  +15MB)	privacy_checking_modules
[2024-01-09T01:46:50.877Z] time:   5.817; rss: 3761MB -> 3797MB (  +36MB)	misc_checking_3
[2024-01-09T01:47:12.764Z] time:  21.787; rss: 3797MB -> 4225MB ( +428MB)	generate_crate_metadata
[2024-01-09T01:47:12.764Z] time:   0.107; rss: 4225MB -> 4225MB (   +0MB)	monomorphization_collector_root_collections
[2024-01-09T01:47:22.707Z] time:   8.825; rss: 4225MB -> 4436MB ( +211MB)	monomorphization_collector_graph_walk
[2024-01-09T01:47:54.736Z] time:  28.788; rss: 4436MB -> 4534MB (  +98MB)	partition_and_assert_distinct_symbols
[2024-01-09T01:50:59.545Z] Cannot contact i-0c01329bc530749c2: java.lang.InterruptedException // Boom - Jenkins agent got nuked by linux watchdog (as well as journald, this build and a few others)
[2024-01-09T01:58:55.094Z] Could not connect to i-0c01329bc530749c2 to send interrupt signal to process

Full log is here:
build.log

Crate Highlights

I'd like to note that I'm aware RSS-based metrics for memory usage aren't the most accurate, but they are decent indicators. To that end, there are some seriously big numbers here.

AWS IAM

Most notably in the IAM crate, with type checking and borrow checking. The first round is OK, but the second round is what concerns me the most.

time:  36.337; rss: 1743MB -> 2523MB ( +780MB)	type_check_crate
time:   6.148; rss:  686MB ->  886MB ( +200MB)	MIR_borrow_checking
time:   7.677; rss:  463MB ->  686MB ( +223MB)	type_check_crate
time:  30.141; rss: 2523MB -> 3476MB ( +953MB)	MIR_borrow_checking

This crate doesn't even finish compiling, and by the time the stage after partition_and_assert_distinct_symbols (whatever it is, since it hasn't been logged yet) is underway, the memory usage goes from 4436MB to 7900MB+.

[screenshot: memory usage during the build]

AWS EC2

Similar behaviour in the EC2 crate as well, but it seems to be more centred around translation to IR and codegen. The EC2 crate seems to have a smaller footprint than IAM in terms of checkable usage for contexts passed around, though.

[2024-01-09T01:45:11.935Z] time:   9.141; rss:   49MB -> 1016MB ( +967MB)	expand_crate
[2024-01-09T01:45:11.935Z] time:   9.142; rss:   49MB -> 1016MB ( +967MB)	macro_expand_crate
[2024-01-09T01:46:00.366Z] time:   4.718; rss:  813MB -> 1577MB ( +764MB)	codegen_to_LLVM_IR
[2024-01-09T01:46:00.366Z] time: 117.170; rss:  637MB -> 1577MB ( +940MB)	codegen_crate

Crate Structure Considerations

Immediately, this rings a bell to me as poorly structured code that relies on insane amounts of context being passed around, or transitive overuse of contexts between function calls. Another possibility is async lifetimes dominating the context transfers on mutable objects.

Looking at the IAM crate in particular, the crate does seem to match that: from what I can see, it relies on async state passed around through invocations of various kinds. Within those methods, transitive state can be constructed or copied for use in other invocations of the IAM API. Adding to that, as the example below shows, everything runs within an async context and most operations are built from the client context itself.

use aws_sdk_iam as iam;

#[tokio::main]
async fn main() -> Result<(), iam::Error> {
    let config = aws_config::load_from_env().await;
    let client = aws_sdk_iam::Client::new(&config);
    // ... make some calls with the client
    Ok(())
}

I feel like this presents a nightmare situation for the borrow checker with production-level usage of this crate. Could be wrong, but the compilation data seems to indicate something in this realm.

I haven't looked too far into the EC2 crate's characteristics, though.

@rcoh
Contributor

rcoh commented Jan 9, 2024

The crates are all fundamentally identical in terms of structure, so these are all interesting statistics. We're planning on spending time this year to see if there are any quick wins that are achievable.

We have some inherent challenges—EC2 has literally hundreds of APIs. We need to generate all of them. But some things can probably be improved.

@gahooa

gahooa commented Jan 10, 2024

Immediately, this rings a bell to me as poorly structured code ...

Well said. I have run out of memory on a 16GB laptop just trying to compile something that used one EC2 API call. It's unacceptable.

AWS Developers, thank you for the efforts to make this usable. However, your team really needs to address this even if it means releasing v2.0.0 of the crates with an entirely different approach.

@rcoh
Contributor

rcoh commented Jan 10, 2024

I totally agree—especially about memory. The compile time itself I expect to be improved by the upcoming rustc parallel frontend, but memory is very difficult to work around and can make compiling the Rust SDK in CI prohibitively expensive. We're prioritizing investigating this and will keep folks posted.

We're also very cognizant of the fact that without decisive action, this issue is only going to get worse as more and more APIs are added to these existing crates.

I think, however, a lot of this comes from a very fundamental limitation: EC2 (edit: and other large crates like S3, SQS, IAM etc.) currently have hundreds of operations (EC2 has 615 as of today). They each need their own serializers and deserializers, inner types, etc.—unavoidably an absurdly large amount of code. There are definitely things we can do to shrink how much code we generate; I expect that these will yield improvements in the low double digit percentages. This is helpful, but I think ultimately an EC2 that needs 12GB of memory and compiles in 1m30s is not that much better than 16GB compiling in 2m assuming the rest of your build only needs 1GB and 10s.

We're concurrently investigating ways that we can allow customers to compile only parts of the EC2 crate. These are unfortunately also limited because it was recently discovered that features do not scale well on crates.io, which would prevent us from doing things like feature-per-operation.

Another last-ditch item that we're floating is a way to generate ad-hoc SDKs with only a subset of operations included. This is obviously non-ideal for a number of reasons, but we agree that it's incredibly frustrating to run out of memory on your build because you want to make one HTTP request to start an instance.

In any case, bear with us, this is among our top priorities to improve for this quarter.

@nkconnor

It seems there is a big emphasis on EC2 here, but I am curious whether the other SDKs are being included as well. I am experimenting with aws-sdk-s3 and aws-sdk-sqs in an established project, and they take considerably longer to build than the other 500 dependencies we have.

@rcoh
Contributor

rcoh commented Jan 10, 2024

that's a good point—I updated the note. EC2 is the biggest, but there are a lot that are close behind.

@chinedufn

These are unfortunately also limited because it was recently discovered that features do not scale well on crates.io, which would prevent us from doing things like feature-per-operation.

What do you mean by "does not scale well on crates.io"?
Is there a source for this that you could link to? I couldn't find anything after about 2 mins or so of searching.

web-sys has over 1500 different features and I haven't noticed anything as a user. I never use more than a couple dozen feature flags, however.
Or do you mean that having hundreds of feature flags in a single crate causes problems for crates.io's internal processes?

@jdisanti
Contributor

jdisanti commented Jan 10, 2024

There was a Rust blog post on the feature scalability issue, although 23,000 is quite a bit larger than 600 😄

This is the important bit that prevents us from taking this approach though:

Now comes the important part: on 2023-10-16 the crates.io team deployed a change limiting the number of features a crate can have to 300 for any new crates/versions being published.

@xxchan

xxchan commented Jan 11, 2024

We are aware of a couple of crates that also have legitimate reasons for having more than 300 features, and we have granted them appropriate exceptions to this rule, but we would like to ask everyone to be mindful of these limitations of our current systems.

The limitation can be relaxed for a project. aws-sdk-ec2 should also count as having "legitimate reasons" (I'm not sure, maybe @Turbo87 can comment on this?), and 600 is also much smaller than web-sys's 1500.

We also invite everyone to participate in finding solutions to the above problems.


Therefore, I think it's still worth considering the approach, because IMHO it's quite straightforward and improves the situation a lot immediately.

Maybe not as a first resort, but we can give it a try if other ways don't work or require too much effort.

@gahooa

gahooa commented Jan 11, 2024

Another last ditch item that we're floating is a way to generate ad-hoc SDKs with only a subset of operations included. This is obviously non-ideal for a number of reasons, but we agree that's it's incredibly frustrating to run out of memory on your build because you want to make one HTTP request to start an instance.

My understanding of the aws-* crates is that they are currently auto-generated from an API definition. Would it be crazy to create one additional aws crate, aws-sdk-builder, which you use in this fashion?

Cargo.toml

[build-dependencies]
aws-sdk-builder = "*"

build.rs

fn main() {
    aws_sdk_builder::generate_rust_file(
        "src/my_aws_api.rs", 
        vec![
            "ec2/run_instances", 
            "ec2/list_instances", 
            "s3/get_object",
        ]
    );
}

Into src/my_aws_api.rs would be placed an exact set of functions and generated structs required to use the 3 APIs mentioned above. No more, no less.

--
I'd like to suggest this as an optional addition to what we already have, not a replacement. It would 100% solve the issue for a number of folks like me, and allow you to give people an "out" while you work on a more formal solution.

--
Note: The output file should be user-specified:

  • Some users would want to commit it
  • Others would place it in the build tree but ignore it
  • Others would want to put it in a temp build dir and include!(...) (sketched below)
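
For the temp-build-dir variant, the usual pattern would be for build.rs to write into OUT_DIR (assuming the generator were pointed there instead of src/) and for the crate to pull the file in with:

include!(concat!(env!("OUT_DIR"), "/my_aws_api.rs"));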

@rukai

rukai commented Jan 28, 2024

I thought it worth mentioning that I've ported our usage of aws-sdk-ec2 to shell out to the aws CLI as a workaround until the compile time issues are fixed: shotover/aws-throwaway#41

@jeffparsons

I think, however, a lot of this comes from a very fundamental limitation: EC2 (edit: and other large crates like S3, SQS, IAM etc.) currently have hundreds of operations (EC2 has 615 as of today). They each need their own serializers and deserializers, inner types, etc.—unavoidably an absurdly large amount of code. There are definitely things we can do to shrink how much code we generate; I expect that these will yield improvements in the low double digit percentages. This is helpful, but I think ultimately an EC2 that needs 12GB of memory and compiles in 1m30s is not that much better than 16GB compiling in 2m assuming the rest of your build only needs 1GB and 10s.

I was having a read of the generated code in aws-sdk-ec2 to get a feeling for whether and how much doing more things dynamically might help. E.g. what if the various request/response types exposed a reflection API (read/write key/values by name paired with embedding the smithy models themselves into the crate), and then there was only one implementation of serialization/deserialization, pagination, etc.? I suppose the reflection code itself could end up costing as much to compile... but maybe not. Is this something that's been considered?
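
To make that concrete, here is a very rough sketch of the kind of reflection surface I mean; none of these traits exist in the SDK today:

// Hypothetical reflection API over generated request/response shapes.
trait Reflect {
    fn shape_name(&self) -> &'static str;
    fn fields(&self) -> Vec<(&'static str, &dyn Reflect)>;
}

// A single shared serializer could then walk any shape dynamically, instead of
// every operation getting its own monomorphized serializer.
fn serialize(shape: &dyn Reflect, out: &mut String) {
    out.push_str(shape.shape_name());
    for (name, value) in shape.fields() {
        out.push_str(name);
        serialize(value, out); // real code would consult the embedded Smithy model here
    }
}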

We're concurrently investigating ways that we can allow customers to compile only parts of the EC2 crate. These are unfortunately also limited because it was recently discovered that features do not scale well on crates.io, which would prevent us from doing things like feature-per-operation.

What about features for groups of operations? Is there any kind of internal grouping of operations in the underlying model?

Or otherwise by "commonness of use"? From quickly skimming the operation list, it looked to me like I would never need most of the operations because they cover use cases that I think of as obscure. I wonder how big the subset of "stuff that most people use a lot" is? I.e. would it be useful to have features for:

  • default: loads of people will need these
  • obscure: very special people use these, but most people won't
  • arcane-dark-magic: there are three people who use these operations, and AWS knows them by name.

🤷‍♂️

I'd be keen to know if there are particular avenues that are currently favored / considered most promising by the AWS team.

Thanks!

🙇

@Velfi
Contributor

Velfi commented May 15, 2024

The Parallel Rustc Working Group is working on releasing a new parallel compiler front-end this year.

I tested it out on EC2 v1.42.0 and it's a big improvement over the current front-end. I ran the following compiles on my work laptop: a 2021 16" M1 MacBook Pro 32GB.

  • Current front-end cargo build -p aws-sdk-ec2 --timings: 148.6s (2m 28.6s)
  • Parallel front-end RUSTFLAGS="-Z threads=8" cargo +nightly build -p aws-sdk-ec2 --timings: 85.9s (1m 25.9s)

It looks like they still have many issues to solve before it can be the default, though.

@xxchan

xxchan commented May 16, 2024

The parallel frontend won't help if the CPU is already fully occupied, e.g., busy compiling other crates alongside aws-sdk-ec2 (interprocess parallelism). So I don't think it makes life better for users of aws-sdk-ec2 :)

@jeffparsons

Parallel frontend will also not help with memory usage — I'd expect it to make it worse.

For a really meaningful improvement, I think a solution is still needed that allows doing less work, not just trying to do the same amount of work faster. E.g. splitting it up into separate crates, feature flags, etc.

If Rust had better support for dynamic linking (e.g. via the proposed crABI and #[export] RFCs) then I'd be pushing for a solution that allows using pre-built binaries...

@gahooa

gahooa commented May 18, 2024 via email

@vultix

vultix commented Jul 26, 2024

@gahooa Would your team be willing to open source your implementation?

@gahooa

gahooa commented Jul 31, 2024

@vultix yes we would. We are working on updating it to account for recent changes in the aws crates (service definitions moved to another repo) which prevented us from upgrading.
