
llmfaker


A mock server and in-process faker for the OpenAI and Anthropic APIs, built for Python testing. It monkey-patches the official LLM client libraries to intercept calls in-process, with no network overhead.

Features

  • In-process patching of openai, anthropic, litellm, and langchain clients
  • Fluent builder API for configuring responses
  • Pattern matching: exact, regex, predicate-based, and template rendering
  • Streaming support with realistic SSE emission (OpenAI and Anthropic formats)
  • Failure injection: rate limits, timeouts, mid-stream disconnects, malformed JSON
  • Latency simulation with configurable TTFT and inter-token delays
  • Record/replay cassettes for integration testing
  • Multi-turn conversation scripting and tool-call sequences
  • Token counting and cost estimation via pricing tables
  • Pytest plugin with llm_faker and llm_recording fixtures
  • Standalone mock server mode via CLI

Installation

pip install llmfaker

Quick Start

In-process (for unit tests)

from llmfaker import LLMFaker
import openai

client = openai.OpenAI(api_key="fake")

with LLMFaker() as faker:
    faker.when(prompt_contains="weather").respond("It's sunny!")
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "What's the weather?"}],
    )
    print(response.choices[0].message.content)  # "It's sunny!"

Pytest plugin

import openai

def test_my_feature(llm_faker):
    client = openai.OpenAI(api_key="fake")
    llm_faker.when(prompt_contains="hello").respond("Hi!")
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "hello"}],
    )
    assert "Hi!" in response.choices[0].message.content
    assert len(llm_faker.calls) == 1

Cassette record/replay

import openai

def test_real_api_behavior(llm_recording):
    client = openai.OpenAI()  # a real API key is needed on the first (recording) run
    # First run: calls the real API and records to a cassette
    # Subsequent runs: replays from the cassette file
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "hello"}],
    )
    assert response.choices[0].message.content

Failure injection

with LLMFaker() as faker:
    with faker.fail(rate=1.0, status=429, retry_after=30):
        # All calls will get a 429 rate limit error
        ...

Standalone mock server

mockllm start --responses responses.yml --port 8000

YAML Configuration

responses:
  "what colour is the sky?": "The sky is blue due to Rayleigh scattering."
  "tell me a joke": "Why don't programmers like nature? Too many bugs!"

defaults:
  unknown_response: "I don't know the answer to that."

settings:
  lag_enabled: true
  lag_factor: 10

Development

pip install -r requirements.txt
pip install -e .

# Run tests
python -m pytest tests/ -v

License

Apache-2.0


Inspired by mockllm.
