New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce the concept of a WASM manifest; use cacache for local storage #39
Conversation
- Use the newly-published tokio version of cacache. - Rework the blob storage module to use cacache. - Extend its api a bunch because cacache lets us do more. - Rework endpoints in the mesh agent to support the new slightly-larger API. Also, take a few minor updates to crates; revert change to the workspace syntax because it is making VSCode angry. Dry-coded a new agent API on top of a new blob API There is a new concept called a "Manifest" that has some of the fields specified in @shinypb's job manifest spec. (I skipped a bunch just so I could get started with the implementation.) The blob storage now knows about wasm executables and manifests and can build type-aware keys for each kind of data. The agent endpoints now look like this: `GET /monitor/ping` cheap health check `POST /v1/jobs/:name/run` body is interpreted as job input `GET /v1/storage/manifests` fetch a list of manifests `POST /v1/storage/manifests` store a new manifest `GET /v1/storage/manifests/:name` get a manifest by name `HEAD /v1/storage/manifests/:name` 404s if manifest not found `PUT /v1/storage/manifests/:name/executable/:version` store a new wasm executable associated with the named task manifest `GET /v1/storage/manifests/:name/executable/:version` get a wasm executable by manifest name & version `GET /v1/monitor/history` very cheap run history; to be deleted `GET /v1/jobs` list all running jobs; not implemented The underlying blob store API supports this external API. Up next: making the CLI use the new API. IT'S ALIVE The CLI can now store wasm jobs in the mesh (okay, in one node) and then run those jobs only passing up input data.
This needs more than the seat-of-the-pants design I've given it just now, but this pass removes redundant information. Also, we now standardize on using the fully-qualified name everywhere. While I was there, introduced some `prettytable` to the CLI to start formatting some displays nicely. This is throwaway work in the long term, but it makes things a little nicer now.
oh oh oh oh
) -> impl IntoResponse { | ||
let mut lock = state.lock().await; | ||
let storage = lock.storage.as_ref().unwrap(); | ||
let Ok(manifest) = storage.manifest(&name).await else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that only nodes that have storage locally available can run jobs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, yes. I think the immediate next followup work is to make a node ask if if has the appropriate binary, and if not, find it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Thoughts on swapping BlobStore
out for an abstract trait that has CacacheBlobStore
as one implementation and RemoteBlobStore
(something that proxies to a serval_storage instance) as the other? If that sounds reasonable to me I am happy to start chipping away at it.
executable: Vec<u8>, | ||
/// Input data | ||
input: Vec<u8>, | ||
// TODO: might have version chosen to run here, plus run options; might also store the input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, something like an Option<u16>
to reference a specific version number (rather "latest") would make sense.
@@ -86,55 +87,56 @@ fn build_url(path: String, version: Option<&str>) -> String { | |||
} | |||
|
|||
fn upload_manifest(manifest_path: PathBuf) -> Result<()> { | |||
println!("Reading manifest at {}", manifest_path.display()); | |||
println!("Reading manifest: {}", manifest_path.display()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this is the equivalent of
println!("Reading manifest: {manifest_path:?}");
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a debugging print, but converting a pointer to a pathbuf to its display version. I think it's cleaner; haven't looked at what the output is with the :? formatter.
pub fn name(&self) -> String { | ||
self.fq_name.clone() | ||
pub fn fq_name(&self) -> String { | ||
let name = self.name.to_ascii_lowercase().replace('-', "_"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the -
support just because of our wasm-samples naming convention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should get a TODO next to it, because IMO it needs to be cleaned more definitively to restrict it to allowed characters.
table.add_row(row!["WASM task name:", manifest.fq_name()]); | ||
table.add_row(row!["Version:", manifest.version()]); | ||
|
||
let url = build_url("storage/manifests".to_string(), Some("1")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We pass Some("1")
around often enough that a build_url_v1
helper function probably wouldn't be a bad idea at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed! It annoyed me.
This looks good to me, modulo the rustfmt error giving you grief. 😅 Calling out that landing this means that |
This PR reworks the agent API and the cli to support a new concept of a "manifest". This is a toml specification of a WASM job type. It includes a name, a namespace, a path to a binary, and some human-readable descriptions.
Here's an example manifest:
Once a WASM manifest + executable has been stored in the system, you can execute that wasm using the
name in the manifest:
pounce run sh.serval.loudify inputfile.txt
The local blob storage has been reimplemented to use the cacache
crate. This is a content-adddressable store that also allows lookup of items by key. In our case
we store manifests with keys like
name.manifest.toml
and wasm executables with names likename.wasm
. Execution of jobs is now more indirect than it was, in that we look up the manifest byname, and from that look up where the binary is by version. This indirection allows us to support
versioning more fully in the long term.
This PR also significantly reworks the agent API and changes the CLI to support this new API. The agent endpoints now look like this:
GET /monitor/ping
: cheap health checkPOST /v1/jobs/:name/run
: body is interpreted as job inputGET /v1/storage/manifests
: fetch a list of manifestsPOST /v1/storage/manifests
: store a new manifestGET /v1/storage/manifests/:name
: get a manifest by nameHEAD /v1/storage/manifests/:name
: 404s if manifest not foundPUT /v1/storage/manifests/:name/executable/:version
: store a new wasm executable associated with the named task manifestGET /v1/storage/manifests/:name/executable/:version
: get a wasm executable by manifest name & versionGET /v1/monitor/history
: very cheap run history; to be deletedGET /v1/jobs
: list all running jobs; not implementedAlso, take a few minor updates to crates; revert change to the workspace syntax because it is making VSCode angry.