Milo is a fast and embeddable HTTP/1.1 parser written in Rust.
Milo supports strict HTTP/1.1 message parsing for Rust, C++, and JavaScript via WebAssembly.
| Target | Status | Artifact |
|---|---|---|
| Rust | Supported | milo crate |
| C++ | Supported | milo.h and libmilo.a |
| JavaScript | Supported | @perseveranza-pets/milo WebAssembly package |
Milo intentionally rejects HTTP/0.9, HTTP/1.0, RTSP, obs-fold, bare LF, and bare CR. It does not parse HTTP/2 messages, except for strict PRI * HTTP/2.0 switch-over detection.
For security and ambiguity reduction, Milo rejects request bodies on GET and HEAD. Responses to HEAD are application context: callers must use skip_body when they know a response has no body because it belongs to a HEAD request.
For full parser scope, strictness, and protocol behavior, see Milo Design.
Install it from npm:
npm install @perseveranza-pets/milo
Then create a sample source file:
import { setup } from '@perseveranza-pets/milo'
/*
Milo works using callbacks.
All callbacks have the same signature, which characterizes the payload:
* The current parser
* from: The payload offset.
* size: The payload length.
The payload parameters above are relative to the last data sent to the milo.parse method.
If the current callback has no payload, both values are set to 0.
The callbacks must be provided using setup and are named in snake case.
*/
const milo = setup({
on_data (p, from, size) {
console.log(`Pos=${from} Body: ${message.slice(from, from + size).toString()}`)
}
})
// Prepare a message to parse.
const message = Buffer.from('HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nabc')
// Allocate a memory in the WebAssembly space. This speeds up data copying to the WebAssembly layer.
const ptr = milo.alloc(message.length)
// Create a buffer we can use normally.
const buffer = Buffer.from(milo.memory.buffer, ptr, message.length)
// Create the parser.
const parser = milo.create()
// Toggle on the callbacks you want to receive
milo.setActiveCallbacks(parser, milo.CALLBACK_ACTIVE_ON_DATA)
// Now perform the main parsing using milo.parse. The method returns the number of consumed characters.
buffer.set(message, 0)
milo.parse(parser, ptr, message.length)
// Cleanup used resources.
milo.destroy(parser)
milo.dealloc(ptr, message.length)The default JavaScript entry point uses the SIMD WebAssembly build. Use @perseveranza-pets/milo/no-simd when SIMD is not available, and add /unbundled to either entry point to load the external .wasm file instead of the bundled JavaScript module.
Finally build and execute it using node:
node index.js
# Pos=38 Body: abcAdd milo-parser to your Cargo.toml:
[package]
name = "milo-example"
version = "0.1.0"
edition = "2024"
publish = false
[dependencies]
milo-parser = "0.4.0"Create a sample source file:
use core::ffi::c_void;
use core::slice;
use milo_parser::{Parser, CALLBACK_ACTIVE_ON_DATA};
fn main() {
// Create the parser.
let mut parser = Parser::new();
// Prepare a message to parse.
let message = String::from("HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nabc");
parser.context = message.as_ptr() as *mut c_void;
// Milo works using callbacks.
//
// All callbacks have the same signature, which characterizes the payload:
//
// p: The current parser.
// from: The payload offset.
// size: The payload length.
//
// The payload parameters above are relative to the last data sent to the parse
// method.
//
// If the current callback has no payload, both values are set to 0.
parser.callbacks.on_data = |p: &mut Parser, from: usize, size: usize| {
let message =
unsafe { std::str::from_utf8_unchecked(slice::from_raw_parts(p.context.add(from) as *const u8, size)) };
// Use the callback data.
println!("Pos={} Body: {}", from, message);
};
// Toggle on the callbacks you want to receive
parser.active_callbacks |= CALLBACK_ACTIVE_ON_DATA;
// Now perform the main parsing using milo.parse. The method returns the number
// of consumed characters.
parser.parse(message.as_ptr(), message.len());
}Finally build and execute it using cargo:
cargo run
# Pos=38 Body: abcFirst, let's download Milo release from GitHub.
You will need a static library file (for Linux/Unix/MacOS is libmilo.a) and a header file (milo.h).
Create a sample source file:
#include "milo.h"
#include <cinttypes>
#include <cstdio>
#include <cstring>
int main() {
// Create the parser.
milo_parser::Parser* parser = milo_parser::milo_create();
// Prepare a message to parse.
const char* message = "HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nabc";
parser->context = (char*) message;
/*
Milo works using callbacks.
All callbacks have the same signature, which characterizes the payload:
* p: The current parser.
* from: The payload offset.
* size: The payload length.
The payload parameters above are relative to the last data sent to the milo_parse method.
If the current callback has no payload, both values are set to 0.
*/
parser->callbacks.on_data = [](milo_parser::Parser* p, uintptr_t from, uintptr_t size) {
const char* payload = reinterpret_cast<const char*>(p->context) + from;
printf("Pos=%" PRIuPTR " Body: %.*s\n", from, static_cast<int>(size), payload);
};
// Toggle on the callbacks you want to receive
parser->active_callbacks |= milo_parser::CALLBACK_ACTIVE_ON_DATA;
// Now perform the main parsing using milo.parse. The method returns the number of consumed characters.
milo_parser::milo_parse(parser, reinterpret_cast<const unsigned char*>(message), strlen(message));
// Cleanup used resources.
milo_parser::milo_destroy(parser);
}And then you can compile using your preferred build system. For instance, let's try with Clang:
clang++ -std=c++11 -o example main.cc libmilo.a
./example
# Pos=38 Body: abcIf you want to build it locally, you need the following tools:
- cargo-make
- Rust toolchain - You can install it via rustup.
- rust-cbindgen
Make sure you have the nightly toolchain installed locally:
rustup toolchain install nightlyMake sure you have the wasm32-unknown-unknown target:
rustup target add wasm32-unknown-unknownInstall npm dependencies
pnpm installAfter all the requirements are met, you can then run:
makersThe command above will produce debug and release builds for each language in the top-level dist folder.
The WebAssembly release build uses immediate-abort panics to keep the artifact smaller. Panics trap without unwinding or rich panic messages.
The debug build also enables the on_state_change callback and is more verbose in case of WebAssembly errors.
See the following files, according to the language you are using:
Milo does not aim for byte-for-byte llhttp compatibility when lenient behavior would conflict with Milo's strict parser policy.
- Milo rejects HTTP/0.9, HTTP/1.0, RTSP, obs-fold, bare LF, and bare CR.
- Milo rejects normal HTTP/2 request and response messages.
- Milo rejects request bodies on
GETandHEAD. - Milo requires
skip_bodyfor application-known no-body response contexts such as responses toHEAD. - Milo validates protocol framing and syntax, but leaves application semantics to callers.
Milo validates HTTP/1.1 syntax, message framing, protocol switching, connection management, and data-after-close behavior. Milo does not validate routing, authorization, representation semantics, URI normalization, Host policy, method-specific request-target forms, CONNECT authority-form, or full header field semantics unless they affect safe framing.
Milo leverages Rust's procedural macro, syn and quote crates to allow an easy definition of actions and matchers for the parser.
See the macros internal crate for more information.
The resulting parser is a simple state machine which copies data in only one optional case: automatically handling the unconsumed portion of the input data.
In all other cases, no data is copied and the memory footprint is very small as only a few dozen bool, uintptr_t, or uint64_t fields can represent the entire parser state.
The scope of Milo is to replace llhttp as Node.js main HTTP parser.
This project aims to:
- Make it maintainable and verifiable using easy to read Rust code.
- Be performant by avoiding any unnecessary data copy.
- Be self-contained and dependency-free.
To see the rationale behind the replacement of llhttp, check Paolo's talk at Vancouver's Node Collab Summit in January 2023 (slides).
To see the initial disclosure of milo, check Paolo's talk at NodeConf EU 2023 in November 2023 (slides).
- Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
- Check out the issue tracker to make sure someone already hasn't requested it and/or contributed it.
- Fork the project.
- Start a feature/bugfix branch.
- Commit and push until you are happy with your contribution.
- Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
Copyright (C) 2023 and above Paolo Insogna (paolo@cowtech.it) and NearForm (https://nearform.com).
Licensed under the ISC license, which can be found at https://choosealicense.com/licenses/isc or in the LICENSE.md file.
