Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasi-adapter: Implement provider crate that embeds the adapter binaries #8792

Merged
merged 18 commits into from
Jun 18, 2024

Conversation

juntyr
Copy link
Contributor

@juntyr juntyr commented Jun 13, 2024

Fixes #8759

This PR adds a new wasi-preview1-component-adapter-provider crate that embeds the binary contents of the three wasi adapters. As per @alexcrichton's suggestion, this crate contains the adapter files as artefacts, which are checked in CI to match the freshly compiled versions of the wasi-preview1-component-adapter crate. The new crate is intended to be published on crates.io and allow Rust code which programmatically builds WASM components to use the adapter without needing more than a normal crate dependency (instead of having to manually download the file).

@juntyr juntyr requested review from a team as code owners June 13, 2024 05:37
@juntyr juntyr requested review from alexcrichton and removed request for a team June 13, 2024 05:37
pub const WASI_SNAPSHOT_PREVIEW1_COMMAND_ADAPTER: &[u8] =
include_bytes!("../artefacts/wasi_snapshot_preview1.command.wasm");

/// The "proxy" adapter provides implements a HTTP proxy.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexcrichton I didn't find as much information on the proxy online and in code. Does it implement a pre-defined HTTP proxy world?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It corresponds to this world, notably wasi:http/proxy which doesn't have filesystem/env vars/cli arguments/etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think what confused me is that the adapter defines the world inline here:

#[cfg(feature = "proxy")]
wit_bindgen_rust_macro::generate!({
path: "../wasi-http/wit",
inline: r#"
package wasmtime:adapter;
world adapter {
import wasi:clocks/wall-clock@0.2.0;
import wasi:clocks/monotonic-clock@0.2.0;
import wasi:random/random@0.2.0;
import wasi:cli/stdout@0.2.0;
import wasi:cli/stderr@0.2.0;
import wasi:cli/stdin@0.2.0;
}
"#,
std_feature,
raw_strings,
runtime_path: "crate::bindings::wit_bindgen_rt_shim",
skip: ["poll"],
});

which includes most imports but not the incoming HTTP request handler import and not the outgoing HTTP request export.

@juntyr
Copy link
Contributor Author

juntyr commented Jun 13, 2024

@alexcrichton Well at least the CI check works :)

I tried rebuilding the adapters locally with 1.76.0 stable and the pinned wasm-tools version and got the local CI script to pass, but the GitHub CI check still failed. Is there something else I need to do to make the build reproducible and push a matching artefact?

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! For the CI failure you say you're using Rust 1.76 but CI is using Rust 1.78, so that may be the culprit?

In addition to that there's a few more things here I think:

  • By default the version of Rust used to compile the adapter will change whenever we bump the MSRV. This seems a little unfortunate and I think it'd be better to do this as a discrete step for the adapter to reduce the annoyance of updating our MSRV. Could the rust version be explicitly specified for the CI job that builds the adapter as exactly 1.78.0? This might require splitting out the CI job so it's only build the adapter and doing nothing else.
  • Currently there's not an easy way to rebuild the adapter locally and then check in the results. Could the cmp commands in the build script be conditional somehow? For example they should only run on CI but locally you can skip them if you want to regenerate all the adapters.

pub const WASI_SNAPSHOT_PREVIEW1_COMMAND_ADAPTER: &[u8] =
include_bytes!("../artefacts/wasi_snapshot_preview1.command.wasm");

/// The "proxy" adapter provides implements a HTTP proxy.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It corresponds to this world, notably wasi:http/proxy which doesn't have filesystem/env vars/cli arguments/etc

@juntyr
Copy link
Contributor Author

juntyr commented Jun 14, 2024

Thanks for this! For the CI failure you say you're using Rust 1.76 but CI is using Rust 1.78, so that may be the culprit?

In addition to that there's a few more things here I think:

  • By default the version of Rust used to compile the adapter will change whenever we bump the MSRV. This seems a little unfortunate and I think it'd be better to do this as a discrete step for the adapter to reduce the annoyance of updating our MSRV. Could the rust version be explicitly specified for the CI job that builds the adapter as exactly 1.78.0? This might require splitting out the CI job so it's only build the adapter and doing nothing else.
  • Currently there's not an easy way to rebuild the adapter locally and then check in the results. Could the cmp commands in the build script be conditional somehow? For example they should only run on CI but locally you can skip them if you want to regenerate all the adapters.

I'm a bit stumped. I check in CI and it now seems to build with 1.79 but even when I build locally with 1.79 it still produces different artefacts.

I absolutely agree we should explicitly lock down the adapter build settings to ensure that local build and CI builds remain the same unless someone explicitly changes that setting (including the Rust version). Once I have a successful build, I'll implement that.

What we could do perhaps is to offer two (mutually exclusive) options for the CI script:

  • no options just build the adapters to the target folder, as before
  • --replace also copies the newly built adapters into the provider crate
  • --check checks the newly built adapters against the provider crate (used in CI)

I'll add that once CI works.

@juntyr
Copy link
Contributor Author

juntyr commented Jun 14, 2024

I restarted my dev container, cleared my target folder, followed all toolchain setup steps from the job output log, and reran the build script locally, which produced no change in the adapter binaries. So locally the build is reproducible, but something must still be different in CI.

@alexcrichton
Copy link
Member

Ah yesterday it was Rust 1.78 but we upgraded to 1.79 yesterday too, explaining that discrepancy. What os are you on? I've seen that macOS produces different binaries than Linux, for example, and that might be happening here. I can try to dig in today as well and see what's going on.

@juntyr
Copy link
Contributor Author

juntyr commented Jun 14, 2024

I was running on a Linux GitHub codespace container and tried both 1.78 and 1.79 today. I also used the same version of wasm tools as the one in CI and checked the Rust version hash from the CI job log. If you do have a bit of time to spare to investigate, that would be brilliant :)

@alexcrichton
Copy link
Member

Hm interestingly I checked out this branch, ran the script, and it didn't fail. Locally I've got rustc 1.78 and wasm-tools 1.201, as opposed to CI which for the last build used rustc 1.79 and wasm-tools 1.0.27.

I added some debugging in the hopes of seeing what's happening in CI

@alexcrichton
Copy link
Member

Ok now I think it's more clear what's happening:

Reference copy of adapter is not the same as the generated copy of
the adapter
  reference copy: crates/wasi-preview1-component-adapter/provider/artefacts/wasi_snapshot_preview1.command.wasm
      built copy: target/wasm32-unknown-unknown/release/wasi_snapshot_preview1.command.wasm
To automatically update the reference copy set `BLESS=1` in the
environment
--- /dev/fd/63	2024-06-14 18:50:51.726335825 +0000
+++ /dev/fd/62	2024-06-14 18:50:51.726335825 +0000
@@ -1,4 +1,4 @@
-(module $wasi_preview1_component_adapter.command.adapter:
+(module $wasi_preview1_component_adapter.command.adapter:9a152d0ad902adeb995c2ab83556f62119151301
   (type (;0;) (func))
   (type (;1;) (func (param i32)))
   (type (;2;) (func (result i64)))

CI only sets the VERSION environment variable to the current git commit. That I think will no longer be viable with this approach so we'll need to remove that from CI and the script here.

@juntyr
Copy link
Contributor Author

juntyr commented Jun 14, 2024

Ok now I think it's more clear what's happening:

Reference copy of adapter is not the same as the generated copy of
the adapter
  reference copy: crates/wasi-preview1-component-adapter/provider/artefacts/wasi_snapshot_preview1.command.wasm
      built copy: target/wasm32-unknown-unknown/release/wasi_snapshot_preview1.command.wasm
To automatically update the reference copy set `BLESS=1` in the
environment
--- /dev/fd/63	2024-06-14 18:50:51.726335825 +0000
+++ /dev/fd/62	2024-06-14 18:50:51.726335825 +0000
@@ -1,4 +1,4 @@
-(module $wasi_preview1_component_adapter.command.adapter:
+(module $wasi_preview1_component_adapter.command.adapter:9a152d0ad902adeb995c2ab83556f62119151301
   (type (;0;) (func))
   (type (;1;) (func (param i32)))
   (type (;2;) (func (result i64)))

CI only sets the VERSION environment variable to the current git commit. That I think will no longer be viable with this approach so we'll need to remove that from CI and the script here.

Oooh, that makes sense! I never set the version variable locally and of course we can only know it once a commit is made.

I do like the idea that the version points to the last commit when the adapter was changed. Could git be used to check the file tree, excluding the artefacts folder, for the latest commit hash, and then use that?

@alexcrichton
Copy link
Member

Yeah I think that'd also be reasonable to do!

@juntyr
Copy link
Contributor Author

juntyr commented Jun 15, 2024

Yeah I think that'd also be reasonable to do!

I tried, I tried. It unfortunately doesn't work out as any commit that would change the adapter would need to refer to itself to provide the updated artefacts (especially since in CI we run on merge-into-main commits which don't exist yet). So I've reverted to the most basic solution of just using the crate version :/

@juntyr
Copy link
Contributor Author

juntyr commented Jun 15, 2024

If we do want more info in the version, it could also include a hash of the source files (but this would only help to distinguish git versions, though without relation to their commit hash, not published versions)

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's reasonable to remove the hash yeah. One difficulty of using the crate version though is that these binaries would need to be updated whenever we bump the version of the repo which is currently an automated process. It's possible to update that automation, but I think it's probably just easiest to remove the version info entirely for now.

Otherwise can you also move the CI checks for the adapter to their own build job? That way it can pin rustc to a fixed version which isn't updated when we update the rest of CI in the future (done by updating our MSRV right now).

As some minor follow-up work:

  • Can this update scripts/publish.rs as well? I think that'll end up getting caught when full CI runs on merge.
  • Mind updating crates/test-programs/artifacts/build.rs to depend on this crate with a [build-dependencies]`? That otherwise builds the adapters right now and it shouldn't need to any longer after this.

@juntyr
Copy link
Contributor Author

juntyr commented Jun 17, 2024

I think it's reasonable to remove the hash yeah. One difficulty of using the crate version though is that these binaries would need to be updated whenever we bump the version of the repo which is currently an automated process. It's possible to update that automation, but I think it's probably just easiest to remove the version info entirely for now.

What if the adapter crate got its own explicit version, that could be bumped independently of the other crates? I've prototyped this for now, but I can also revert the change and use the workspace version and not add the version to the metadata.

Otherwise can you also move the CI checks for the adapter to their own build job? That way it can pin rustc to a fixed version which isn't updated when we update the rest of CI in the future (done by updating our MSRV right now).

Done - the adapter crate now lists an explicit MSRV which the build script uses.

As some minor follow-up work:

* Can this update `scripts/publish.rs` as well? I think that'll end up getting caught when full CI runs on merge.

Done!

* Mind updating `crates/test-programs/artifacts/build.rs to depend on this crate with a `[build-dependencies]`? That otherwise builds the adapters right now and it shouldn't need to any longer after this.

Done!

Comment on lines 22 to 27
rustup toolchain install $RUST_VERSION --profile minimal
rustup target add wasm32-wasi wasm32-unknown-unknown --toolchain $RUST_VERSION

export RUSTUP_TOOLCHAIN=$RUST_VERSION

cargo --version
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be deferred to CI configuration rather than built-in to this script? It might make sense to double-check that the rustc version activated matches the rust-version on the crate itself but otherwise installation of the toolchain can get unwieldy to work with when running this script locally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@@ -1,6 +1,7 @@
[package]
name = "wasi-preview1-component-adapter"
version.workspace = true
version = "23.0.0+wasi-0.2.0"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While having an explicit version here improves things my worry is that this will basically never get updated, even when it needs to. Currently all other version management in this repository is automated so I would expect that this would never change as a result.

I can think of two other primary alternatives for continuing to embed version information in the adapter:

  • Update the bump-version script/CI configuration to additionally rebuild the adapters. I find this not super appealing though because it adds binary file churn to this repository which feels unfortunate.
  • Add a build.rs to the provider crate which injects version information. To be maximally useful this would probably want to have zero build-dependencies which makes it a bit difficult to do, though.

Overall I'd probably still be most in favor of just dropping the version information in the binary. AFAIK we haven't really ended up using it all that much in practice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let's just remove the version info (for now) then

@juntyr
Copy link
Contributor Author

juntyr commented Jun 18, 2024

@alexcrichton I hope this is now good to go - thank you for all of your rounds of feedback!

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you as well for pushing on this!

@alexcrichton alexcrichton added this pull request to the merge queue Jun 18, 2024
Merged via the queue into bytecodealliance:main with commit 2dbf8f1 Jun 18, 2024
36 checks passed
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Jun 20, 2024
github-merge-queue bot pushed a commit that referenced this pull request Jun 21, 2024
…r binaries (#8792)" (#8856)

* Revert "wasi-adapter: Implement provider crate that embeds the adapter binaries (#8792)"

This reverts commit 2dbf8f1.

* Enable requisite feature for cranelift-frontend testing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Publish wasi-preview1-component-adapter to crates.io
2 participants