Commit

Major refactoring (#125)
itowlson committed Oct 6, 2021
1 parent 9abecad commit c48bdd6
Showing 48 changed files with 3,790 additions and 2,852 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -3,3 +3,4 @@ wagi-cache/
/ssl-example.*
.vscode/
_scratch/
tests_working_dir/
2 changes: 2 additions & 0 deletions Cargo.lock

1 change: 1 addition & 0 deletions Cargo.toml
@@ -35,3 +35,4 @@ wasmtime = "0.30"
wasmtime-wasi = "0.30"
wasmtime-cache = "0.30"
wat = "1.0.37"
chrono = "0.4.19"
85 changes: 82 additions & 3 deletions docs/architecture.md
@@ -55,6 +55,85 @@ In the future, as WASI matures, we will relax the restrictions on outbound netwo
- WAGI does NOT support NPH (Non-Parsed Header) mode
- The value of `args` is NOT escaped for Bourne-style shells (See section 7.2 of CGI spec)

It should be noted that while the daemon (the WAGI server) runs constantly, both the `modules.toml` and the `.wasm` file are loaded for each request, much as they were for CGI.
In the future, the WAGI server may cache the WASM modules to speed loading.
But in the near term, we are less concerned with performance and more concerned with debugging.
In previous releases, although the daemon (the WAGI server) ran constantly,
both the `modules.toml` and the `.wasm` file were loaded from disk on each request, much as they were for CGI.
At the time of writing, the WAGI server reads the WASM modules at startup and keeps
them in memory. This decouples the request-serving code from filesystem interactions, and also
improves performance.
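
As a rough illustration of the "read at startup, keep in memory" idea, here is a minimal sketch. The `ModuleStore` type and its methods are hypothetical names invented for this example and are not taken from the WAGI codebase; the sketch only assumes that each module is identified by a name and a file path.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;
use std::sync::Arc;

/// Hypothetical in-memory store of Wasm module bytes (not a WAGI type).
pub struct ModuleStore {
    modules: HashMap<String, Arc<Vec<u8>>>,
}

impl ModuleStore {
    /// Read every module from disk once, at startup.
    /// The first I/O failure aborts loading and names the module and path involved.
    pub fn load(paths: &[(String, PathBuf)]) -> io::Result<Self> {
        let mut modules = HashMap::new();
        for (name, path) in paths {
            let bytes = fs::read(path).map_err(|e| {
                io::Error::new(
                    e.kind(),
                    format!("failed to read module '{}' from {}: {}", name, path.display(), e),
                )
            })?;
            modules.insert(name.clone(), Arc::new(bytes));
        }
        Ok(Self { modules })
    }

    /// Request-time lookup: no filesystem access, just a cheap `Arc` clone.
    pub fn get(&self, name: &str) -> Option<Arc<Vec<u8>>> {
        self.modules.get(name).cloned()
    }
}
```

With something like this, request-serving code would only ever call `get`, so a module file deleted after startup would not affect requests that are already being served from memory.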

## Design notes

This implementation of WAGI falls into two parts:

* Initialisation (the bulk of `main.rs`)
* Request serving (`wagi_server` and the components it calls)

After initialisation we should know everything we need to know to handle requests,
and we should have failed if we can determine that anything is missing or
invalid. All configuration files have been parsed and validated, all modules have
been downloaded and read, all dependencies have been readied, etc. If initialisation
fails, WAGI stops rapidly with an error message.

**Caveat:** We could probably perform even more validation during the initialisation
phase. For example, at the time of writing, we don't check if route entry points
exist.

Because any failure during initialisation should cause an immediate exit, we do
only minimal tracing during this phase; the exit and error message should provide
enough information to diagnose any problems.

### Principles of the initialisation phase

* Parse, don't validate. That is, convert raw data such as config files into a
form that minimises further checking or special case handling later on.
* Don't make downstream components care about how upstream components got their
data. This is not always practical, but the idea is to minimise how much, say,
the route builder needs to care about whether it is dealing with an OCI reference
in a `modules.toml` or a parcel in a local standalone bindle. Separate the stages;
keep `main()` as simple and as linear as possible.
* Fail fast. Related to the above, check that everything
you need is present, in the right place, and usable. Ideally parse it into
a form such that the next stage doesn't need to repeat the checks.
* Fail informatively. Be generous with error context and values. Rust has
an awful habit of reporting things like "key not in dictionary" and "file
or directory does not exist." Err on the side of saying _which_ thing
went wrong.
* Provide entry points for automated testing.
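
To make "parse, don't validate" and "fail informatively" a little more concrete, here is a small sketch. All of the names (`RawEntry`, `ParsedEntry`, `ModuleRef`, `ConfigError`, `parse_entry`, `parse_all`) are hypothetical and chosen for illustration, and the `oci:` prefix convention is an assumption of the example rather than a statement about how WAGI parses `modules.toml`.

```rust
use std::fmt;

/// Raw entry as it might come out of a config file: every field optional.
pub struct RawEntry {
    pub route: Option<String>,
    pub module: Option<String>,
}

/// Where a module comes from, decided once at parse time (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum ModuleRef {
    File(String),
    Oci(String),
}

/// Parsed entry: later stages never need to re-check for missing fields.
#[derive(Debug, PartialEq)]
pub struct ParsedEntry {
    pub route: String,
    pub module: ModuleRef,
}

/// Error that says which entry and which value went wrong.
#[derive(Debug, PartialEq)]
pub struct ConfigError {
    pub entry_index: usize,
    pub message: String,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "config entry #{}: {}", self.entry_index, self.message)
    }
}

/// Convert one raw entry into its checked form, or explain what is wrong with it.
pub fn parse_entry(index: usize, raw: RawEntry) -> Result<ParsedEntry, ConfigError> {
    let err = |message: String| ConfigError { entry_index: index, message };
    let route = raw.route.ok_or_else(|| err("missing 'route'".to_string()))?;
    if !route.starts_with('/') {
        return Err(err(format!("route '{}' must start with '/'", route)));
    }
    let module = raw
        .module
        .ok_or_else(|| err(format!("route '{}' has no 'module'", route)))?;
    // Assumed convention for this sketch: an "oci:" prefix marks a registry reference.
    let module = if module.starts_with("oci:") {
        ModuleRef::Oci(module["oci:".len()..].to_string())
    } else {
        ModuleRef::File(module)
    };
    Ok(ParsedEntry { route, module })
}

/// Fail fast: the first bad entry stops parsing of the whole configuration.
pub fn parse_all(raw: Vec<RawEntry>) -> Result<Vec<ParsedEntry>, ConfigError> {
    raw.into_iter()
        .enumerate()
        .map(|(index, entry)| parse_entry(index, entry))
        .collect()
}
```

Because `parse_entry` and `parse_all` are plain functions over plain data, they also serve as natural entry points for automated tests.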

### Key types and function groups

* Initialisation is geared to producing a `RoutingTable` which maps routes to handlers.
A `RoutingTable` consists primarily of a vector of `RoutingTableEntry`. ('Map' is
a slight misnomer here, because of ordering and wildcard routes.)
* `RoutingTableEntry` contains a route (represented by `RoutePattern`) and all the data
required to handle that route (represented by the `RouteHandler` enum).
* The types with "handler" in the name can be a bit confusing. We need them because
we have different representations of handlers as we assemble the data we need to
run them.
  - `RouteHandler` is the final, "runnable" form of handler.
  - `WasmRouteHandler` is the data for the interesting case of `RouteHandler`.
  - `WagiHandlerInfo` aggregates the information about a route and associated parcels
    specified in a bindle.
  - `HandlerConfigurationSource` represents the combination of flags passed on the
    command line to say where routing and handling is specified, e.g. a `modules.toml`
    file or a bindle.
  - `HandlerConfiguration` represents the parsed form of whatever the
    `HandlerConfigurationSource` points to. Note that `HandlerConfigurationSource` is the
    _reference_ to the source (e.g. file path or bindle ID); `HandlerConfiguration` is
    _the content of that file or bindle_.
  - `LoadedHandlerConfiguration` is a `HandlerConfiguration` augmented with the binary
    content of the Wasm modules specified in that configuration.
  - Note that all those last three are different _again_ from `WagiConfiguration`
    which contains a whole bunch of other configuration like TLS and stuff.
  - I am very very sorry for everything.
* `WasmModuleSource` represents data that can be instantiated as a Wasm module. At the
time of writing, the only case is `Blob`, which is the raw bytes of the Wasm binary.
In future, this could have an additional case (or have a single different case!) of
a pre-instantiated module - the point of the type is to insulate other code from making
assumptions about the representation.
* The `wasm_runner` module provides services for executing Wasm modules that communicate
via stdin/stdout. This allows commonality between dynamic route discovery and handler
execution. There is scope for more encapsulation here though!
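
The relationship between the routing types described above can be sketched roughly as follows. The type names match the ones in this list, but the fields, the `HealthCheck` case, the `/...` wildcard suffix, the first-match-wins lookup, and the `route_for` method are assumptions made for this example rather than a description of the real definitions.

```rust
/// A route pattern: an exact path, or a prefix wildcard written with a "/..." suffix
/// (the suffix syntax is an assumption of this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum RoutePattern {
    Exact(String),
    Prefix(String),
}

impl RoutePattern {
    pub fn parse(text: &str) -> Self {
        match text.strip_suffix("/...") {
            Some(prefix) => RoutePattern::Prefix(prefix.to_string()),
            None => RoutePattern::Exact(text.to_string()),
        }
    }

    pub fn is_match(&self, path: &str) -> bool {
        match self {
            RoutePattern::Exact(p) => path == p.as_str(),
            // A prefix matches the prefix itself and anything below it,
            // but not a sibling path that merely shares leading characters.
            RoutePattern::Prefix(p) => match path.strip_prefix(p.as_str()) {
                Some(rest) => rest.is_empty() || rest.starts_with('/'),
                None => false,
            },
        }
    }
}

/// Data that can be instantiated as a Wasm module; only raw bytes for now.
pub enum WasmModuleSource {
    Blob(Vec<u8>),
}

/// Hypothetical fields: the module to run and the function to call in it.
pub struct WasmRouteHandler {
    pub module: WasmModuleSource,
    pub entrypoint: String,
}

/// Final, runnable form of a handler. `HealthCheck` is an illustrative second case.
pub enum RouteHandler {
    Wasm(WasmRouteHandler),
    HealthCheck,
}

pub struct RoutingTableEntry {
    pub route: RoutePattern,
    pub handler: RouteHandler,
}

/// An ordered list rather than a map: lookup scans entries in order.
pub struct RoutingTable {
    entries: Vec<RoutingTableEntry>,
}

impl RoutingTable {
    pub fn new(entries: Vec<RoutingTableEntry>) -> Self {
        Self { entries }
    }

    /// Return the first entry whose pattern matches (first-match-wins is assumed here).
    pub fn route_for(&self, path: &str) -> Option<&RoutingTableEntry> {
        self.entries.iter().find(|entry| entry.route.is_match(path))
    }
}
```

Keeping the entries in a `Vec` is what makes ordering and wildcard routes meaningful: under these assumptions, `/static/...` matches `/static/css/site.css` but not `/staticfiles`, and an earlier entry shadows any later entry that would also match.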

We welcome improvements to and tidying of the module structure and placement of
functions.