Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace dependencies with long build times when used together (closes #3571) #3773

Merged
merged 9 commits into from Mar 27, 2022

Conversation

chippers
Copy link
Member

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Docs
  • New Binding issue #___
  • Code style update
  • Refactor
  • Build-related changes
  • Other, please describe:

Does this PR introduce a breaking change?

  • Yes, and the changes were approved in issue #___
  • No

Checklist

  • When resolving issues, they are referenced in the PR's title (e.g fix: remove a typo, closes #___, #___)
  • A change file is added if any packages will require a version bump due to this PR per the instructions in the readme.
  • I have added a convincing reason for adding this feature, if necessary

Other information

When ring, zstd, and blake3 where built at the same time, their build scripts that span cc will run concurrently (even if limited by --jobs=1). This would cause massively inflated build times for dependencies that otherwise aren't too bad to build.

The following dependency changes were made after looking into it in #3571

original new note
zstd brotli Hand-picked compression levels were picked for good time vs size ratio
blake3 none We now vendor the blake3 reference implementation inside tauri-codegen
ring getrandom getrandom is also a CSPRNG that utilizes the OS's calls for secure random numbers. It is much more focused and smaller for our requirements.

@chippers chippers requested a review from a team March 24, 2022 20:59
@chippers chippers requested a review from a team as a code owner March 24, 2022 21:19
@chippers
Copy link
Member Author

That udeps failure seems to be a bug on the latest nightly. I also ran into it while comparing old build times and it ran successfully nightly-2022-03-18.

@chippers
Copy link
Member Author

I ran a simple comparison using the API Example, without any changes. rm -rf target && cargo +nightly-2022-03-18 build --timings (and --release for the release profile). I changed nothing else, so there is still LTO and all that stuff enabled, along with using Linux which comes with a couple long building dependencies like gio and gtk.

This was performed on a T410 (2 core 4 thread) w/ 8GiB RAM & an SSD.

profile dev this PR delta
debug 7m 13.9s 6m 52.5s -5.2%
release 10m 46.0s 9m 51.3s -9.3%

This is a decent gain for a clean build without any real downsides IMO. I think it might have even greater effect for when stuff like LTO is disabled and possibly using the LLD linker due to it taking more of the CPU time. I don't really have the time to test all the configuration of that stuff right now however.

@JamesYeoman maybe you would like to compare this branch on your project you opened the issue with.

@lucasfernog
Copy link
Member

Documentation for all three targets is still working, nice!

@JamesYeoman
Copy link

Ooh, nice improvement in performance! How do I use this branch instead of the release tauri crate?

@FabianLars
Copy link
Member

@JamesYeoman in your Cargo.toml:

[build-dependencies]
tauri-build = { git = "https://github.com/tauri-apps/tauri", branch = "feat/no-dep-c-build-script" }

[dependencies]
tauri= { git = "https://github.com/tauri-apps/tauri", branch = "feat/no-dep-c-build-script", features = ["leave this unchanged"] }

Make sure to run cargo update in your src-tauri dir after changing from/to git dependencies.

@JamesYeoman
Copy link

I can confirm, there is a performance improvement 😄
image

Some napkin math puts that at a -20% delta for dev building 🎉 🎉 🎉

Still not sub-2-minutes like I was hoping with such a reduction in time, but still phenominally impressive!

Weird thing though, that screenshot is from a second run using this branch...

This is the result of the first run using this branch
image

Do you know if the windows crate gets cached globally somehow? I don't use sccache, so it can't be that, and global caching is the only explanation I can think of...

@JamesYeoman
Copy link

Wow, and this is with --release!
image

Sure, when my release config is

[profile.release]
panic = "abort"    # Strip expensive panic clean-up logic
lto = "thin"       # Enables link time optimizations
opt-level = "s"    # Optimize for binary size
strip = true       # Strips symbol information from the executable
incremental = true # Ensure incremental building in enabled for the release profile

[profile.release.package."*"]
opt-level = 0
incremental = true

then of course everything other than my binary is gonna compile quickly. But this is for baseline measurements, eliminating as much time cost as possible. The bare minimum execution, eliminating as much variability as possible.

And of course the windows crate had to be compiled from scratch for the first release mode run.
image

I'm really confused by that. What reason could there be for this one-off long build for the windows crate?

@chippers
Copy link
Member Author

chippers commented Mar 25, 2022

I don't have any insight about the windows crate, that does seem weird if you are removing the target/ dir. Only thoughts off the top of my head is that it's a very large crate with a lot of features and perhaps IO cache is kicking in for reading the source? Wouldn't think it would have this much of an effect though...

@amrbashir
Copy link
Member

amrbashir commented Mar 25, 2022

We should look into using windows-sys it has faster compile time and smaller size.

@wravery can webview2-rs use windows-sys too?

@JamesYeoman
Copy link

I guess it could be IO cache tbh. I've just changed my release profile to remove the [profile.release.package."*"] section that was disabling optimisations there, and updated the core release profile as follows

[profile.release]
panic = "abort"     # Strip expensive panic clean-up logic
lto = "thin"        # Enables link time optimizations
opt-level = 2     # Optimize for binary size
strip = "debuginfo"
incremental = true  # Ensure incremental building in enabled for the release profile

(this was the original release config)

[profile.release]
panic = "abort"    # Strip expensive panic clean-up logic
lto = "thin"       # Enables link time optimizations
opt-level = "s"    # Optimize for binary size
strip = true       # Strips symbol information from the executable
incremental = true # Ensure incremental building in enabled for the release profile

and then ran 2 builds. An 8 minute build followed by a 4 minute build, despite doing cargo clean in-between each.

@lucasfernog
Copy link
Member

I'll merge this one, we're making a new release tomorrow if tao+wry can be updated too. We can make further improvements on another PR.

@lucasfernog lucasfernog merged commit 8661e3e into dev Mar 27, 2022
@lucasfernog lucasfernog deleted the feat/no-dep-c-build-script branch March 27, 2022 23:52
@wravery
Copy link
Contributor

wravery commented Mar 30, 2022

We should look into using windows-sys it has faster compile time and smaller size.

@wravery can webview2-rs use windows-sys too?

Doing so would make a lot of callers uglier because the parameter conversions wouldn't be implemented for us. For instance, anytime we pass a &str and windows uses something like IntoParam<PWSTR> to convert to UCS-2/UTF-16 and wrap it in the PWSTR type, we'd have to do our own UTF-16 conversion, null terminate the buffer, and get a raw pointer to the buffer.

The bigger problem would be anything that uses the #[implement] macro to implement a COM interface probably wouldn't work. I don't think the code it generates is compatible with windows-sys. We'd have to hand-build v-tables and such, or use another crate that supports implementing COM interfaces, but without the interface definition which windows builds from the metadata.

@amrbashir
Copy link
Member

The bigger problem would be anything that uses the #[implement] macro to implement a COM interface probably wouldn't work. I don't think the code it generates is compatible with windows-sys. We'd have to hand-build v-tables and such, or use another crate that supports implementing COM interfaces, but without the interface definition which windows builds from the metadata.

Is it worth opening an issue in windows-rs to request #[implement] macro that uses windows-sys types instead of windows? I can only imagine that manually implementing the COM interfaces of Webview2 is a lot of work to do.

A number of crates, has already moved to windows-sys like parking_lot. We also plan to move back to winit instead of tao and they already moved to windows-sys so we are definitely gonna end up with multiple versions of windows-sys in our build pipeline but it should be a problem since the compile time of windows-sys is fast (faster or equal to winapi crate). It would be awesome if wry could also use windows-sys.

@wravery
Copy link
Contributor

wravery commented Mar 30, 2022

Is it worth opening an issue in windows-rs to request #[implement] macro that uses windows-sys types instead of windows? I can only imagine that manually implementing the COM interfaces of Webview2 is a lot of work to do.

There are some COM interfaces implemented in tao and wry as well, particularly around drag-drop support. IIRC, the only interface in webview2-com that is implemented by hand is the environment options interface, almost everything else uses webview2-com-macros to generate boilerplate implementations on top of the windows-macros/windows-implement #[implement] macro for callback interfaces. So if #[implement] were supported with windows-sys (or maybe something like #[implement-sys]), it should be possible to convert everything in webview2-com.

However, post 0.30.0, the #[implement] macro depends on the implement feature in windows and the windows-bindgen crates. It adds more generated code for all of the interface implementation traits (and that's what depends on new features that are only available as of 1.61, and that's why we can't update to 0.34.0 until that version of rustc is available...). Given what I know about how #[implement] works, I suspect adding it on top of windows-sys would add enough code and slow it down almost as much as just using windows, and it would still be harder to consume without the parameter conversion helpers.

It doesn't hurt to ask, the windows crate owners know their code a lot better than I do. But I wouldn't get my hopes up too high that they'll want to invest in that if it doesn't pay off like I suspect.

@amrbashir
Copy link
Member

amrbashir commented Mar 30, 2022

There are some COM interfaces implemented in tao and wry as well, particularly around drag-drop support.

Correct. ITaskbarList, ITaskbarList2 and IDropTarget. Manually defining those is not a big deal tbh considering how small they are.

Given what I know about how #[implement] works, I suspect adding it on top of windows-sys would add enough code and slow it down almost as much as just using windows, and it would still be harder to consume without the parameter conversion helpers.

It doesn't hurt to ask, the windows crate owners know their code a lot better than I do. But I wouldn't get my hopes up too high that they'll want to invest in that if it doesn't pay off like I suspect.

Alright. Thanks anyway. We appreciate your help as always.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants