A Rust crate + CLI for mocking LLM API endpoints. Fixture-driven, deterministic responses for testing.
Speaks 4 LLM API formats — OpenAI Chat Completions, Anthropic Messages, Gemini generateContent, and OpenAI Responses API — with SSE streaming and failure simulation.
Inspired by llmock. Built in Rust with zero runtime dependencies for users.
Add llmposter to your dev-dependencies (note: reqwest's current release line is 0.12, and the `json` feature is needed for the examples below):

```toml
[dev-dependencies]
llmposter = "0.4"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
reqwest = { version = "0.12", features = ["json"] }
serde_json = "1"
```

Then point a test at the mock server:

```rust
use llmposter::{ServerBuilder, Fixture};

#[tokio::test]
async fn test_llm_response() {
    let server = ServerBuilder::new()
        .fixture(
            Fixture::new()
                .match_user_message("hello")
                .respond_with_content("Hi from the mock!"),
        )
        .build()
        .await
        .unwrap();

    // Point your LLM client at server.url()
    let url = format!("{}/v1/chat/completions", server.url());

    // ... make requests, get deterministic responses
    // Server shuts down when dropped
}
```
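To make the elided request concrete, here is a sketch of the test body continued with the reqwest and serde_json dev-dependencies above. It assumes the mock echoes the standard OpenAI Chat Completions envelope (`choices[0].message.content`); see the provider notes linked at the end for the exact fields.

```rust
// Sketch: one request against the mock, inside the test above
let resp = reqwest::Client::new()
    .post(&url)
    .json(&serde_json::json!({
        "model": "gpt-4o",
        "messages": [{ "role": "user", "content": "hello" }]
    }))
    .send()
    .await
    .unwrap();

// The fixture matched "hello", so the reply is deterministic
let body: serde_json::Value = resp.json().await.unwrap();
assert_eq!(body["choices"][0]["message"]["content"], "Hi from the mock!");
```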
Or use the standalone CLI:

```sh
# Install via Homebrew
brew install SkillDoAI/tap/llmposter

# Or install via Cargo
cargo install llmposter

# Create fixtures
cat > fixtures.yaml << 'EOF'
fixtures:
  - match:
      user_message: "hello"
    response:
      content: "Hi from the mock!"
EOF

# Run server
llmposter --fixtures fixtures.yaml --port 8080

# Point your app at http://127.0.0.1:8080
```

| Route | Provider |
|---|---|
| `POST /v1/chat/completions` | OpenAI Chat Completions |
| `POST /v1/messages` | Anthropic Messages |
| `POST /v1/responses` | OpenAI Responses API |
| `POST /v1beta/models/{model}:generateContent` | Gemini |
| `POST /v1beta/models/{model}:streamGenerateContent` | Gemini (streaming) |
| `GET /code/:status` | HTTP status echo (mini-httpbin) |
All providers support streaming and non-streaming. For OpenAI, Anthropic, and Responses API, just swap the base URL — the paths are identical to the real APIs. Gemini uses separate endpoints for streaming (streamGenerateContent) and non-streaming (generateContent).
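As a sketch of the streaming side, here is one way to consume the SSE stream from the OpenAI-style route. It assumes the CLI server from the quickstart is running on port 8080, and additionally requires reqwest's `stream` feature plus the `futures-util` crate; the chunk handling is illustrative only.

```rust
use futures_util::StreamExt;

let resp = reqwest::Client::new()
    .post("http://127.0.0.1:8080/v1/chat/completions")
    .json(&serde_json::json!({
        "model": "gpt-4o",
        "messages": [{ "role": "user", "content": "hello" }],
        "stream": true // same opt-in flag as the real OpenAI API
    }))
    .send()
    .await
    .unwrap();

// SSE frames arrive as `data: {...}` lines; OpenAI-style streams end with `data: [DONE]`
let mut stream = resp.bytes_stream();
while let Some(chunk) = stream.next().await {
    print!("{}", String::from_utf8_lossy(&chunk.unwrap()));
}
```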
Bearer token enforcement on LLM endpoints — off by default, fully backward compatible.
```rust
let server = ServerBuilder::new()
    .with_bearer_token("test-token-123")      // valid forever
    .with_bearer_token_uses("short-lived", 1) // expires after 1 use
    .fixture(Fixture::new().respond_with_content("hello"))
    .build().await.unwrap();

// Requests must include: Authorization: Bearer test-token-123
```
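On the client side this is an ordinary `Authorization` header; a reqwest sketch continuing from the server above (the exact rejection status is not documented here, so the final comment is an assumption):

```rust
let resp = reqwest::Client::new()
    .post(format!("{}/v1/chat/completions", server.url()))
    .bearer_auth("test-token-123") // sends: Authorization: Bearer test-token-123
    .json(&serde_json::json!({
        "model": "gpt-4o",
        "messages": [{ "role": "user", "content": "hello" }]
    }))
    .send()
    .await
    .unwrap();
assert!(resp.status().is_success());
// A missing, wrong, or used-up token should be rejected (an auth error such as 401)
```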
Full OAuth server via oauth-mock integration — PKCE, device code, token refresh, revocation.

```rust
let server = ServerBuilder::new()
    .with_oauth_defaults() // spawns OAuth server on separate port
    .fixture(Fixture::new().respond_with_content("hello"))
    .build().await.unwrap();

let oauth_url = server.oauth_url().unwrap(); // e.g. http://127.0.0.1:12345
// Point your client's token_url at oauth_url
// Tokens issued by the OAuth server are automatically valid on LLM endpoints
```
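A hypothetical end-to-end sketch follows; the `/token` path and `client_credentials` grant are guesses at oauth-mock's defaults, not anything this README specifies, so check the oauth-mock docs before relying on them.

```rust
// Hypothetical: assumes oauth-mock serves a standard OAuth2 token endpoint at /token
let client = reqwest::Client::new();
let token: serde_json::Value = client
    .post(format!("{}/token", oauth_url))
    .form(&[("grant_type", "client_credentials")])
    .send().await.unwrap()
    .json().await.unwrap();
let access_token = token["access_token"].as_str().unwrap();

// Tokens minted there are accepted by the LLM endpoints
let resp = client
    .post(format!("{}/v1/chat/completions", server.url()))
    .bearer_auth(access_token)
    .json(&serde_json::json!({
        "model": "gpt-4o",
        "messages": [{ "role": "user", "content": "hello" }]
    }))
    .send().await.unwrap();
assert!(resp.status().is_success());
```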
- Getting Started — Installation, first fixture, first test
- Fixtures — YAML format, matching rules, tool calls
- Failure Simulation — Error codes, latency, truncation, disconnect
- CLI Reference — Flags, validate mode, verbose logging
- Library API — Rust `ServerBuilder`, programmatic fixtures
- Spec Deviations — Known gaps from real APIs
- OpenAI Chat Completions — Fields, streaming, error shapes
- Anthropic Messages — Fields, streaming, error shapes
- Gemini generateContent — Fields, streaming, camelCase
- OpenAI Responses API — Fields, streaming events, envelopes
AGPL-3.0