Architecture

turbopack

turbopack is currently split into turbopack-core, turbopack-css, turbopack-ecmascript and turbopack (facade).

The turbopack-* crates (except for turbopack-core) add support for a certain type of asset.

Each of them exports a ModuleAsset, an implementation of Asset that can extract references from this kind of asset (and, in the future, generate a version of it optimized for either dev or prod).
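As a rough illustration of what "an implementation of Asset that extracts references" could look like, here is a minimal, std-only sketch. The trait shape, the AssetReference struct, and the CssModuleAsset type are hypothetical stand-ins invented for this example, not the actual turbopack-core API, and the naive line scan stands in for real CSS parsing.

```rust
// Hypothetical, simplified model. These names are NOT the real turbopack-core API.

/// A reference from one asset to another, e.g. the target of an `@import`.
#[derive(Debug, Clone, PartialEq)]
pub struct AssetReference {
    pub request: String,
}

/// Minimal stand-in for the `Asset` trait.
pub trait Asset {
    fn path(&self) -> &str;
    fn content(&self) -> &str;
    /// Extract the references this asset makes to other assets.
    fn references(&self) -> Vec<AssetReference>;
}

/// Stand-in for the `ModuleAsset` a crate like turbopack-css would export.
pub struct CssModuleAsset {
    pub path: String,
    pub source: String,
}

impl Asset for CssModuleAsset {
    fn path(&self) -> &str {
        &self.path
    }

    fn content(&self) -> &str {
        &self.source
    }

    fn references(&self) -> Vec<AssetReference> {
        // Naive scan for lines of the form `@import "...";`.
        // A real implementation would parse the stylesheet instead.
        self.source
            .lines()
            .filter_map(|line| line.trim().strip_prefix("@import \""))
            .filter_map(|rest| rest.split('"').next())
            .map(|request| AssetReference {
                request: request.to_string(),
            })
            .collect()
    }
}
```

Each asset type would supply its own `references` logic (imports for ECMAScript, `@import` and `url()` for CSS), while the rest of the system only needs the shared trait.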

We are currently missing:

  • turbopack-node (node native modules)
  • turbopack-wasm (WASM in node and on the web)
  • turbopack-image (images on the web)
  • probably more? (e.g. turbopack-svg).

turbo-tasks

#[turbo_tasks::value]

#[turbo_tasks::value] is a macro that generates an XxxVc wrapper type for some data (e.g. a struct Xxx).

  • XxxVc is like a Promise to some data stored somewhere.
  • You can read the data via .await? (e.g. let x: XxxVc; let data: Xxx = x.await?;).

turbo-tasks values can also implement traits, see #[turbo_tasks::value_trait] for examples.

#[turbo_tasks::function]

#[turbo_tasks::function] infuses a function with turbo-tasks magic.

This means:

  • the function's result is cached (calling it twice with the same arguments returns the same XxxVc).
  • dependencies are tracked (reading an XxxVc via .await? is tracked).
  • turbo-tasks will take care of re-executing the function when any dependency has changed.
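The three properties above can be modeled in a few lines of plain Rust. This is a toy sketch under stated assumptions, not how turbo-tasks is implemented: the Runtime type, its versioned inputs, and the `sum` task are all hypothetical, and staleness is checked lazily on the next call rather than by invalidation.

```rust
use std::collections::HashMap;

/// Toy runtime (hypothetical): inputs are versioned cells, and a derived
/// task caches its output together with the input versions it read.
#[derive(Default)]
pub struct Runtime {
    /// input name -> (version, value); versions start at 1.
    inputs: HashMap<String, (u64, i64)>,
    /// task key -> (dependencies as (input name, version read), cached output)
    cache: HashMap<String, (Vec<(String, u64)>, i64)>,
    /// Number of times a task body actually ran.
    pub executions: u32,
}

impl Runtime {
    /// Store a new value for an input and bump its version.
    pub fn set_input(&mut self, name: &str, value: i64) {
        let version = self.inputs.get(name).map_or(1, |(v, _)| v + 1);
        self.inputs.insert(name.to_string(), (version, value));
    }

    /// A cached "task": sums the named inputs.
    /// The cache key combines the function name and its arguments.
    pub fn sum(&mut self, names: &[&str]) -> i64 {
        let key = format!("sum({})", names.join(","));

        // Reuse the cached output if every tracked dependency is unchanged.
        if let Some((deps, output)) = self.cache.get(&key) {
            let fresh = deps.iter().all(|(name, version)| {
                self.inputs.get(name).map_or(0, |(current, _)| *current) == *version
            });
            if fresh {
                return *output;
            }
        }

        // Otherwise (re-)execute and record which inputs were read.
        self.executions += 1;
        let mut deps = Vec::new();
        let mut total = 0;
        for name in names {
            // A missing input reads as value 0 at version 0.
            let (version, value) = self.inputs.get(*name).copied().unwrap_or((0, 0));
            deps.push((name.to_string(), version));
            total += value;
        }
        self.cache.insert(key, (deps, total));
        total
    }
}
```

Calling `sum(&["a", "b"])` twice runs the body once; after `set_input("a", ...)` the next call re-executes because a recorded dependency version no longer matches.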

From the outside #[turbo_tasks::function]s will always return an XxxVc (not a Result<XxxVc> or Future<Output = XxxVc>).

From the inside you can write an async fn() -> Result<XxxVc>, and turbo-tasks will hide both the async and the error inside the XxxVc.

turbo-tasks functions are mostly pure. Data is immutable once stored.

Tasks

A combination of a function and its arguments is called a Task (basically an invocation of a function).

It's also possible to store data in a Task via XxxVc::cell(value: Xxx). This returns an XxxVc.

When #[turbo_tasks::value(shared)] is used, let data: Xxx; let x: XxxVc = data.into(); does the same.
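To make the relationship between `cell` and `into()` concrete, here is a hand-written approximation of the two entry points. It is only a sketch: the real XxxVc is generated by the macro and points into turbo-tasks storage, whereas this hypothetical version simply wraps an `Rc`, and the `read` method is an invented synchronous stand-in for `.await?`.

```rust
use std::rc::Rc;

/// Example data type (hypothetical).
#[derive(Debug, PartialEq)]
pub struct Xxx {
    pub n: u32,
}

/// Toy handle to immutable, shared data. Not the generated wrapper type.
#[derive(Clone, Debug)]
pub struct XxxVc(Rc<Xxx>);

impl XxxVc {
    /// Store a value and hand back a handle to it.
    pub fn cell(value: Xxx) -> XxxVc {
        XxxVc(Rc::new(value))
    }

    /// Synchronous stand-in for reading via `.await?`.
    pub fn read(&self) -> &Xxx {
        &self.0
    }
}

/// With a `From` impl, `data.into()` is shorthand for `XxxVc::cell(data)`.
impl From<Xxx> for XxxVc {
    fn from(value: Xxx) -> Self {
        XxxVc::cell(value)
    }
}
```

Both `XxxVc::cell(Xxx { n: 1 })` and `Xxx { n: 1 }.into()` yield a handle from which the same data can be read back.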

Registry

The global registry is defined in registry. For each #[turbo_tasks::value] we create a ValueType.

When serialization is enabled we use ValueType::new_with_any_serialization. This stores Serialization and Deserialization implementations in the ValueType.

There is some Rust generic magic happening, e.g. fn any_as_serialize(...), which casts an Any to a Serialize for a concrete type. In the background, Rust instantiates the Serialize logic for that type based on serde.
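The casting trick can be shown without serde. In this std-only sketch, `Debug` stands in for `Serialize`, and `ValueTypeSketch` is a hypothetical, much reduced stand-in for ValueType. The point is that a generic function, once instantiated for a concrete `T`, becomes an ordinary function pointer that can be stored in a type-erased record.

```rust
use std::any::Any;
use std::fmt::Debug;

/// Cast a `&dyn Any` to `&dyn Debug` for one concrete type `T`.
/// `Debug` is used here in place of serde's `Serialize` (an assumption
/// made to keep the example dependency-free).
fn any_as_debug<T: Any + Debug>(value: &dyn Any) -> &dyn Debug {
    value
        .downcast_ref::<T>()
        .expect("value does not have the registered type")
}

/// Type-erased record, loosely modeled on what a `ValueType` might store.
pub struct ValueTypeSketch {
    pub name: &'static str,
    any_as_debug: fn(&dyn Any) -> &dyn Debug,
}

impl ValueTypeSketch {
    /// Monomorphization instantiates `any_as_debug::<T>` here, so the
    /// stored function pointer knows the concrete type even though the
    /// struct itself is not generic.
    pub fn new<T: Any + Debug>(name: &'static str) -> Self {
        Self {
            name,
            any_as_debug: any_as_debug::<T>,
        }
    }

    /// Format a type-erased value using the stored cast.
    pub fn format(&self, value: &dyn Any) -> String {
        format!("{:?}", (self.any_as_debug)(value))
    }
}
```

A caller holding only a `Box<dyn Any>` and the matching `ValueTypeSketch` can then format (or, in the real system, serialize) the value without naming its type.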

Deserialize follows mostly the same idea, but is a bit more involved in serde. See AnyDeserializeSeed for how it looks.

Why XxxVc instead of Vc<Xxx> and what are all the build scripts for?

Both of these are relevant for persistent caching and serialization of values. We need to deserialize values without knowing the type of the value at compile-time.

We want to deserialize something like Box<dyn Any>. For that we need a map from some kind of type identifier to a concrete deserialization implementation. That is what the register methods do: they instantiate a concrete implementation/type and register it in a global map.

A similar problem exists for #[turbo_tasks::function]s. We need a global map from identifier to the function.

Usually you would use something like ctor for that, to hide all these manual register calls, but that won't work in WebAssembly or when dynamically loading libraries or plugins.

That's why we went with the more manual approach of register methods, which work without special linker logic.
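The manual approach boils down to an explicit map from identifier to implementation, filled by ordinary function calls. The following std-only sketch illustrates that idea; the Registry type, the `demo@TODO::...` identifiers, and the string-based "deserializers" are all hypothetical and only mimic the shape of the real mechanism.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Deserializer for one concrete type, erased to `Box<dyn Any>`.
type DeserializeFn = fn(&str) -> Option<Box<dyn Any>>;

/// Hypothetical registry: global identifier -> deserialization function.
#[derive(Default)]
pub struct Registry {
    by_id: HashMap<String, DeserializeFn>,
}

fn deserialize_u32(raw: &str) -> Option<Box<dyn Any>> {
    raw.parse::<u32>().ok().map(|v| Box::new(v) as Box<dyn Any>)
}

fn deserialize_string(raw: &str) -> Option<Box<dyn Any>> {
    Some(Box::new(raw.to_string()) as Box<dyn Any>)
}

impl Registry {
    pub fn register(&mut self, global_id: &str, f: DeserializeFn) {
        self.by_id.insert(global_id.to_string(), f);
    }

    /// Look up the implementation by identifier. The concrete type is
    /// not known to the caller at compile time.
    pub fn deserialize(&self, global_id: &str, raw: &str) -> Option<Box<dyn Any>> {
        let f = self.by_id.get(global_id)?;
        f(raw)
    }
}

/// Explicit registration calls: no linker tricks, so this also works
/// where constructor-style auto-registration is unavailable.
pub fn register_all(registry: &mut Registry) {
    registry.register("demo@TODO::primitives::U32", deserialize_u32);
    registry.register("demo@TODO::primitives::String", deserialize_string);
}
```

The cost of this design is that something has to call `register_all` for every crate, which is the part a build script can generate.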

Most of that is automated via a build script. It's worth looking into the generated file.

In the identifiers below, TODO is a placeholder that will become a hash of the crate + deps in the future.

// target/debug/build/turbo-tasks-{hash}/out/register.rs

{
crate::nothing::NOTHINGVC_IMPL_NEW_FUNCTION.register(r##"turbo-tasks@TODO::::nothing::NothingVc::new"##);
crate::display::VALUETOSTRING_TRAIT_TYPE.register(r##"turbo-tasks@TODO::::display::ValueToString"##);
crate::primitives::STRING_VALUE_TYPE.register(r##"turbo-tasks@TODO::::primitives::String"##);
crate::primitives::BOOL_VALUE_TYPE.register(r##"turbo-tasks@TODO::::primitives::Bool"##);
crate::nothing::NOTHING_VALUE_TYPE.register(r##"turbo-tasks@TODO::::nothing::Nothing"##);
crate::native_function::NATIVEFUNCTION_VALUE_TYPE.register(r##"turbo-tasks@TODO::::native_function::NativeFunction"##);
crate::completion::COMPLETION_VALUE_TYPE.register(r##"turbo-tasks@TODO::::completion::Completion"##);
}

This code is generated by the build script by looking for all #[turbo_tasks::function] and #[turbo_tasks::value] in the source code.

The string is the global identifier:

{crate_name}@{hash}::{mod_path}::{name}
 ^            ^       --------    ----
 |            |       |           |
 |            |       |           the name of the item
 |            |       the full path of the module the item is in
 |            hash of crate + deps
 the name of the cargo crate the item is in

The hash allows invalidating the cache, and also makes it possible to differentiate between versions when accessing a remote cache.
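For readers who want to experiment with the identifier layout, here is a small sketch that builds and splits such strings. The GlobalId struct and its methods are hypothetical helpers, not part of turbo-tasks. The parser assumes the last `::`-separated segment is the item name, which is a simplification: for an identifier such as the `NothingVc::new` entry in the generated file, it would put `NothingVc` into the module path.

```rust
/// Parts of a global identifier `{crate_name}@{hash}::{mod_path}::{name}`.
/// Hypothetical helper type for illustration only.
#[derive(Debug, PartialEq)]
pub struct GlobalId {
    pub crate_name: String,
    pub hash: String,
    pub mod_path: String,
    pub name: String,
}

impl GlobalId {
    /// Assemble the identifier string from its parts.
    pub fn to_id_string(&self) -> String {
        format!(
            "{}@{}::{}::{}",
            self.crate_name, self.hash, self.mod_path, self.name
        )
    }

    /// Split an identifier into its parts.
    /// Assumes the final `::` segment is the item name.
    pub fn parse(id: &str) -> Option<GlobalId> {
        let (crate_name, rest) = id.split_once('@')?;
        let (hash, path) = rest.split_once("::")?;
        let (mod_path, name) = path.rsplit_once("::")?;
        Some(GlobalId {
            crate_name: crate_name.to_string(),
            hash: hash.to_string(),
            mod_path: mod_path.to_string(),
            name: name.to_string(),
        })
    }
}
```

Because the hash is a separate component, a cache can compare everything except the hash to find the same item across versions, or compare the full string to require an exact match.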