Workers Sites support #54

Open
SeokminHong opened this issue Sep 18, 2021 · 15 comments

Comments

@SeokminHong

Hi 👋

I tried to re-implement Workers Sites in Rust, but I got stuck on KV access.

As far as I know, the kv-asset-handler package uses the STATIC_CONTENT_MANIFEST variable that wrangler generates in the global context. Am I right?

If so, can you provide a way to access the manifest, or a Rust version of the getAssetFromKV function?

@nilslice
Contributor

I don't have a solution in mind for this, but it is a good question.
The team generally recommends using Cloudflare Pages instead of Workers Sites, but I understand the desire to use it.

I'm not super familiar with the asset manifest, or with Workers Sites, but will start tracking this request and see if there is something we will officially support.

For now, the code in the Wrangler codebase (https://github.com/cloudflare/wrangler) may point you in the right direction as to reading/deconstructing the asset manifest for use in a Rust KvAssetHandler.

@Dav1dde
Contributor

Dav1dde commented Jan 6, 2022

You can access static files through the __STATIC_CONTENT KV namespace:

#[event(fetch)]
pub async fn main(req: worker::Request, env: worker::Env) -> worker::Result<worker::Response> {
    let kv = worker::kv::KvStore::from_this(&env, "__STATIC_CONTENT")?;

    let index = kv.get("index.html").text().await?.expect("index html");
    worker::Response::from_html(index)
}

There is still a lot missing, such as caching and MIME types, but you can follow the JS implementation of getAssetFromKV to add that yourself. For example, I have a small helper like this:

pub async fn serve_asset(req: Request, store: KvStore) -> worker::Result<Response> {
    let path = req.path();
    let path = path.trim_start_matches('/');
    let value = match store.get(path).bytes().await? {
        Some(value) => value,
        None => return Response::error("Not Found", 404),
    };
    let mut response = Response::from_bytes(value)?;
    response
        .headers_mut()
        .set("Content-Type", get_mime(path).unwrap_or("text/plain"))?;
    Ok(response)
}

fn get_mime(path: &str) -> Option<&'static str> {
    // Everything after the last '.' is treated as the extension.
    let (_, ext) = path.rsplit_once('.')?;

    let ct = match ext {
        "html" => "text/html",
        "css" => "text/css",
        "js" => "text/javascript",
        "json" => "application/json",
        "png" => "image/png",
        "jpg" | "jpeg" => "image/jpeg",
        "ico" => "image/x-icon",
        "wasm" => "application/wasm",
        _ => return None,
    };

    Some(ct)
}

@nilslice
Contributor

nilslice commented Jan 7, 2022

Thank you for the explanation here, @Dav1dde!

One minor note: you should be able to use the kv method on Env to access a KV namespace. So instead of:

let kv = worker::kv::KvStore::from_this(&env, "__STATIC_CONTENT")?;

you can do:

let kv = env.kv("__STATIC_CONTENT")?;

The caveat is that you're forced to use the worker::kv::KvStore type from the version we have pinned in workers-rs dependencies and you might need another version. If so, please let me know :)

@Dav1dde
Contributor

Dav1dde commented Jan 7, 2022

Thanks, I was looking for something like env.kv("__STATIC_CONTENT") after moving away from the router. I ended up copying the router's implementation instead, so this is a lot nicer!

The 0.5 version of the workers-kv crate has lots of improvements, and the master branch has already been updated to it. Luckily, cargo makes referencing a git revision really easy, but a new release would help here.

Improvements like:

  • A nicer API (imo)
  • Support for actually querying binary data (like e.g. a wasm file or images) <-- that was what I needed

@Dav1dde
Contributor

Dav1dde commented Jan 15, 2022

@nilslice Now that I am actually trying to deploy this on Cloudflare with wrangler, I am running into the issue that the files are hashed, and I don't seem to have access to the manifest, __STATIC_CONTENT_MANIFEST. How do I read the manifest?

@Dav1dde
Contributor

Dav1dde commented Jan 15, 2022

I have a solution now, but I wish it wasn't necessary.

First, a post-processing script writes a small JS module that wraps the manifest:

cat <<EOF > build/worker/assets.mjs
import manifestJSON from '__STATIC_CONTENT_MANIFEST'
const assetManifest = JSON.parse(manifestJSON)

export function get_asset(name) {
    return assetManifest[name];
}
EOF

Then you can access it in Rust:

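use std::borrow::Cow;
use wasm_bindgen::prelude::*;

// Bind to the JS helper module written by the post-processing script.
#[wasm_bindgen(raw_module = "./assets.mjs")]
extern "C" {
    fn get_asset(name: &str) -> Option<String>;
}

// Map a plain asset name to its hashed name, falling back to the plain name.
pub fn resolve(name: &str) -> Cow<'_, str> {
    match get_asset(name) {
        Some(name) => Cow::Owned(name),
        None => Cow::Borrowed(name),
    }
}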

@SeokminHong
Author

SeokminHong commented Apr 6, 2022

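Thanks to @Dav1dde, I also found a solution without using JavaScript directly.

use std::borrow::Cow;
use std::collections::HashMap;
use wasm_bindgen::prelude::*;

// Import the manifest module's default export (a JSON string) directly.
#[wasm_bindgen(module = "__STATIC_CONTENT_MANIFEST")]
extern "C" {
    #[wasm_bindgen(js_name = "default")]
    static MANIFEST: String;
}

// Look the name up in the parsed manifest; fall back to the plain name
// if parsing fails or the name is not listed.
pub fn resolve(name: &str) -> Cow<'_, str> {
    match serde_json::from_str::<HashMap<&str, &str>>(&MANIFEST)
        .ok()
        .and_then(|m| m.get(name).map(|v| v.to_string()))
    {
        Some(val) => Cow::Owned(val),
        None => Cow::Borrowed(name),
    }
}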

@SeokminHong
Author

SeokminHong commented Apr 29, 2022

My solution above won't work with the latest worker-build because of the swc-bundler. The SWC bundler tries to resolve import {default as default0} from "__STATIC_CONTENT_MANIFEST", and it fails during bundling.

Fortunately, the worker-build that uses SWC hasn't been published yet, but I'll try to find a better way.

@allsey87

allsey87 commented Jun 17, 2022

@nilslice There seem to be a couple of good solutions proposed above. Would you be open to accepting a PR for either @Dav1dde's or @SeokminHong's solution?

@nilslice wrote: "The team is generally recommending the use of Cloudflare Pages instead of Worker Sites"

Perhaps I am missing something, but I came to the conclusion today that Cloudflare Pages is not at all compatible with Rust/WebAssembly. It seems to me that functions do not support WebAssembly and even if one were to try to use the legacy worker support via the _worker.js file, this won't include any of the files that it imports (e.g., the WebAssembly module)?

I guess it would be possible to encode the entire module as base64 and inline it into a _worker.js file, but that feels like a very roundabout workflow....

@nilslice
Contributor

Hi @allsey87 - I'm not at Cloudflare anymore and don't have a good line of sight into the priorities here, nor the ability to merge anything.

Maybe @zebp could provide some feedback though.

@allsey87

allsey87 commented Jun 20, 2022

This is my complete solution based on the answers above

// asset.rs
use once_cell::sync::Lazy;
use std::collections::HashMap;
use worker::*;
use worker::wasm_bindgen::prelude::*;

#[wasm_bindgen(module = "__STATIC_CONTENT_MANIFEST")]
extern "C" {
    #[wasm_bindgen(js_name = "default")]
    static MANIFEST: String;
}

static MANIFEST_MAP: Lazy<HashMap<&str, &str>> = Lazy::new(|| {
    serde_json::from_str::<HashMap<&str, &str>>(&MANIFEST)
        .unwrap_or_default()
});

pub async fn serve(context: RouteContext<()>) -> worker::Result<Response> {
    let assets = context.kv("__STATIC_CONTENT")?;
    let asset = context.param("asset")
        .map(String::as_str)
        .unwrap_or("index.html");
    /* if we are using miniflare (or wrangler with --local), MANIFEST_MAP is empty and we just
       fetch the requested name of the asset from the KV store, otherwise, MANIFEST_MAP
       provides the hashed name of the asset */
    let path = MANIFEST_MAP.get(asset).unwrap_or(&asset);
    match assets.get(path).bytes().await? {
        Some(value) => {
            let mut response = Response::from_bytes(value)?;
            response.headers_mut()
                .set("Content-Type", path.rsplit_once(".")
                    .map_or_else(|| "text/plain", |(_, ext)| match ext {
                        "html" => "text/html",
                        "css" => "text/css",
                        "js" => "text/javascript",
                        "json" => "application/json",
                        "png" => "image/png",
                        "jpg" => "image/jpeg",
                        "jpeg" => "image/jpeg",
                        "ico" => "image/x-icon",
                        "wasm" => "application/wasm",
                        _ => "text/plain",
                    })
                )
                .map(|_| response)
        }
        None => Response::error("Not Found", 404),
    }
}

For my router, I then can just write:

// lib.rs
use worker::*;
mod utils;
mod asset;

#[event(fetch)]
pub async fn main(req: Request, env: Env, _: worker::Context) -> Result<Response> {
    utils::set_panic_hook();
    Router::new()
        .get_async("/", |_, context| asset::serve(context)) // for index.html
        .get_async("/:asset", |_, context| asset::serve(context))
        .run(req, env).await
}

@SeokminHong wrote: "This solution wouldn't work with the latest worker-build because of the swc-bundler. SWC bundler tries to resolve import {default as default0} from "__STATIC_CONTENT_MANIFEST", and it will fail during bundling."

This issue wasn't relevant to me since I don't use worker-build or swc-bundler.

@SeokminHong
Author

@allsey87 That's right. I also made a commit to handle the latest worker-build for my own use: 5c7051d

At that time, the cache API hadn't been merged, so I stopped working on the KV asset handler.

@allsey87

@SeokminHong This is not really the place to ask, but what pattern did you use for matching assets in sub-directories? I am finding that .get_async("/:asset", |_, context| asset::serve(context)) doesn't match /images/some_image.png.

@andyredhead

You may not need the leading "/" on "/images/some_image.png"; perhaps just "images/some_image.png" will do.

I haven't tried accessing assets in a Cloudflare Workers Site from Rust/wasm yet (I was just browsing to see if anyone else has done it already), but I have done it from JavaScript, where leaving the leading slash off the asset path worked fine.
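
A minimal sketch of that normalization in plain Rust, following the trim_start_matches('/') step in the serve_asset helper earlier in this thread. The function name asset_key and the index.html default for an empty path are assumptions for illustration; this only covers turning a request path into a KV key, not whether the router pattern matches nested paths in the first place.

```rust
/// Map a request path to the key used for the KV lookup.
///
/// Leading slashes are stripped, so "/images/some_image.png" becomes
/// "images/some_image.png". An empty result (a request for "/") falls
/// back to "index.html"; that default is an assumption of this sketch.
pub fn asset_key(path: &str) -> &str {
    let key = path.trim_start_matches('/');
    if key.is_empty() {
        "index.html"
    } else {
        key
    }
}
```

The returned key could then be passed through a manifest lookup such as resolve before the KV get.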

@armfazh
Contributor

armfazh commented Apr 18, 2023

In #308 I propose a function that translates asset names into their hashed form.
For example, favicon.ico is mangled as favicon.<HASH>.ico.
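
To make the translation concrete, here is a small sketch in plain Rust. It is not the function from #308; it mirrors the resolve helpers earlier in this thread but takes an already parsed manifest as a HashMap, so it can run without wasm_bindgen. The example_manifest helper and the hash value in it are made up for illustration.

```rust
use std::borrow::Cow;
use std::collections::HashMap;

/// Translate a plain asset name into the hashed name it is stored under.
/// Names that are missing from the manifest are returned unchanged, which
/// matches the fallback used by the resolve helpers in this thread.
pub fn resolve<'a>(manifest: &HashMap<String, String>, name: &'a str) -> Cow<'a, str> {
    match manifest.get(name) {
        Some(hashed) => Cow::Owned(hashed.clone()),
        None => Cow::Borrowed(name),
    }
}

/// Hand-written stand-in for the parsed manifest (hypothetical hash value).
pub fn example_manifest() -> HashMap<String, String> {
    let mut manifest = HashMap::new();
    manifest.insert(
        "favicon.ico".to_string(),
        "favicon.0123abcd.ico".to_string(),
    );
    manifest
}
```

With this manifest, resolve maps favicon.ico to the hashed name and leaves an unlisted name such as index.html as it is.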
