A mock server and in-process faker for OpenAI and Anthropic APIs, built for Python testing. Monkey-patches official LLM client libraries to intercept calls without network overhead.
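The core trick, swapping a client method for a fake at runtime so no request ever leaves the process, can be sketched with the standard library. This is an illustration of the technique only, not llmfaker's internals; `ChatClient` is a made-up stand-in for a real SDK client:

```python
from types import SimpleNamespace
from unittest.mock import patch

class ChatClient:
    """Hypothetical stand-in for a real SDK client."""
    def create(self, model, messages):
        raise RuntimeError("would hit the network")

def fake_create(self, model, messages):
    # Intercept in-process: return a canned response, no network I/O.
    return SimpleNamespace(content="It's sunny!")

client = ChatClient()
with patch.object(ChatClient, "create", fake_create):
    reply = client.create("gpt-4", [{"role": "user", "content": "weather?"}])
    print(reply.content)  # It's sunny!
```

Outside the `with` block the original method is restored, so patching leaks nothing between tests.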
## Features

- In-process patching of `openai`, `anthropic`, `litellm`, and `langchain` clients
- Fluent builder API for configuring responses
- Pattern matching: exact, regex, predicate-based, and template rendering
- Streaming support with realistic SSE emission (OpenAI and Anthropic formats)
- Failure injection: rate limits, timeouts, mid-stream disconnects, malformed JSON
- Latency simulation with configurable TTFT and inter-token delays
- Record/replay cassettes for integration testing
- Multi-turn conversation scripting and tool-call sequences
- Token counting and cost estimation via pricing tables
- Pytest plugin with `llm_faker` and `llm_recording` fixtures
- Standalone mock server mode via CLI
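The streaming feature emits server-sent events in the OpenAI wire format: each chunk is framed as a `data:` line carrying a JSON delta, and the stream ends with a `data: [DONE]` sentinel. A minimal emitter sketch of that framing (an illustration, not llmfaker's actual implementation):

```python
import json

def sse_chunks(text, model="gpt-4"):
    """Yield OpenAI-style SSE frames for `text`, one word-sized delta at a time."""
    for piece in text.split(" "):
        chunk = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [{"index": 0, "delta": {"content": piece + " "}}],
        }
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"

frames = list(sse_chunks("It's sunny!"))
assert frames[-1] == "data: [DONE]\n\n"  # stream terminator
```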
## Installation

```bash
pip install llmfaker
```

## Quick start

```python
from llmfaker import LLMFaker
import openai

client = openai.OpenAI(api_key="fake")

with LLMFaker() as faker:
    faker.when(prompt_contains="weather").respond("It's sunny!")

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "What's the weather?"}],
    )
    print(response.choices[0].message.content)  # "It's sunny!"
```

## Pytest plugin

The `llm_faker` fixture patches clients for the duration of a test and records every intercepted call:

```python
def test_my_feature(llm_faker):
    llm_faker.when(prompt_contains="hello").respond("Hi!")

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "hello"}],
    )

    assert "Hi!" in response.choices[0].message.content
    assert len(llm_faker.calls) == 1
```

The `llm_recording` fixture records real API traffic to a cassette and replays it on later runs:

```python
def test_real_api_behavior(llm_recording):
    # First run: calls the real API and records to a cassette
    # Subsequent runs: replays from the cassette file
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "hello"}],
    )
    assert response.choices[0].message.content
```

## Failure injection

```python
with LLMFaker() as faker:
    with faker.fail(rate=1.0, status=429, retry_after=30):
        # All calls will get a 429 rate limit error
        ...
```

## Standalone mock server

```bash
mockllm start --responses responses.yml --port 8000
```

```yaml
responses:
  "what colour is the sky?": "The sky is blue due to Rayleigh scattering."
  "tell me a joke": "Why don't programmers like nature? Too many bugs!"

defaults:
  unknown_response: "I don't know the answer to that."

settings:
  lag_enabled: true
  lag_factor: 10
```

## Development

```bash
pip install -r requirements.txt
pip install -e .

# Run tests
python -m pytest tests/ -v
```

Inspired by mockllm.
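The token counting and cost estimation feature reduces to a lookup in a per-model pricing table. A minimal sketch of the idea; the per-million-token prices below are placeholders, not real rates:

```python
# Hypothetical pricing table: USD per million tokens (placeholder numbers).
PRICING = {
    "gpt-4": {"input": 30.0, "output": 60.0},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate a call's cost from token counts and the pricing table."""
    p = PRICING[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

cost = estimate_cost("gpt-4", input_tokens=1_000, output_tokens=500)
print(f"${cost:.4f}")  # $0.0600
```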