lm15-dev/lm15-python

lm15

PyPI · Python 3.10+ · MIT

One interface for OpenAI, Anthropic, and Gemini. Zero dependencies.

|                            | lm15    | google-genai | litellm |
|----------------------------|---------|--------------|---------|
| install                    | 72ms    | 137ms        | 184ms   |
| import                     | 95ms    | 2,656ms      | 4,534ms |
| total (install → response) | 1,090ms | 3,992ms      | 5,840ms |
| dependencies               | 0       | 25           | 55      |
| disk footprint             | 408K    | 41M          | 155M    |

Median of 10 cold-start runs. Fresh venv, single completion against gemini-3.1-flash-lite-preview. Benchmark source.
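The import-time column can be approximated with the stdlib alone. A minimal sketch of the methodology (not the actual benchmark script, which also times venv creation, installation, and a live completion):

```python
import statistics
import subprocess
import sys
import time

def cold_import_ms(module: str, runs: int = 10) -> float:
    """Median wall-clock time (ms) to start a fresh interpreter and import `module`."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)

print(f"json cold import: {cold_import_ms('json'):.0f}ms")
```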

import lm15

resp = lm15.call("claude-sonnet-4-5", "Hello.")
print(resp.text)

Switch models by changing the string. Same types, same streaming, same tool calling. That's it.

Yes, we know.

Install

pip install lm15

Set at least one provider key:

export OPENAI_API_KEY=sk-...
export ANTHROPIC_API_KEY=sk-ant-...
export GEMINI_API_KEY=...         # or GOOGLE_API_KEY

Or use a .env file and configure once:

import lm15
lm15.configure(env=".env")

# No env= needed on any subsequent call
resp = lm15.call("gpt-4.1-mini", "Hello.")
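Because lm15 has no dependencies, its .env handling is necessarily stdlib-based. A sketch of what a minimal zero-dependency loader looks like (lm15's actual parser may handle quoting and export syntax differently):

```python
import os
import tempfile
from pathlib import Path

def load_env(path: str) -> None:
    """Minimal .env loader: KEY=VALUE lines, '#' comments; existing vars win."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())

# Demo with a throwaway file standing in for your .env
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("# provider keys\nDEMO_API_KEY=sk-demo\n")

load_env(f.name)
print(os.environ["DEMO_API_KEY"])  # sk-demo
```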

Discover what is available:

import lm15

print(lm15.providers_info())
for m in lm15.models(provider="openai")[:5]:
    print(m.id)

Usage

Streaming

for text in lm15.call("gpt-4.1-mini", "Write a haiku."):
    print(text, end="")

Full event access:

for event in lm15.call("gpt-4.1-mini", "Write a haiku.").events():
    match event.type:
        case "text":     print(event.text, end="")
        case "thinking": print(f"💭 {event.text}", end="")
        case "finished": print(f"\n📊 {event.response.usage}")

Tools (auto-execute)

Pass Python functions — schema is inferred, execution is automatic:

def get_weather(city: str) -> str:
    """Get weather by city."""
    return f"22°C in {city}"

resp = lm15.call("gpt-4.1-mini", "Weather in Montreal?", tools=[get_weather])
print(resp.text)  # "It's 22°C in Montreal."
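Schema inference like this is typically built on `inspect` and type hints. A stdlib sketch of the idea (not lm15's actual implementation, which will also handle docstrings, optionals, and containers):

```python
import inspect
from typing import get_type_hints

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def infer_schema(fn) -> dict:
    """Build a JSON Schema for a function's parameters from its type hints."""
    hints = get_type_hints(fn)
    hints.pop("return", None)
    params = inspect.signature(fn).parameters
    return {
        "type": "object",
        "properties": {name: {"type": _JSON_TYPES[hints[name]]} for name in params},
        "required": [n for n, p in params.items() if p.default is inspect.Parameter.empty],
    }

def get_weather(city: str) -> str:
    """Get weather by city."""
    return f"22°C in {city}"

print(infer_schema(get_weather))
# {'type': 'object', 'properties': {'city': {'type': 'string'}}, 'required': ['city']}
```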

Tools (manual)

from lm15 import FunctionTool

weather = FunctionTool(name="get_weather", description="Get weather", parameters={...})
gpt = lm15.model("gpt-4.1-mini")

resp = gpt.call("Weather in Montreal?", tools=[weather])
results = {tc.id: "22°C, sunny" for tc in resp.tool_calls}
resp = gpt.submit_tools(results)
print(resp.text)

Inspect before sending

req = lm15.prepare("gpt-4.1-mini", "Weather?", tools=[get_weather])
print(req.tools[0].name)        # "get_weather"
print(req.tools[0].parameters)  # inferred JSON Schema
print(req.messages)             # constructed messages

resp = lm15.send(req)           # send when ready

Images, audio, video, documents

from lm15 import Part

# Image from URL
resp = lm15.call("gemini-2.5-flash", ["Describe this.", Part.image(url="https://example.com/cat.jpg")])

# Image generation → vision (cross-model)
resp = lm15.call("gpt-4.1-mini", "Draw a cat.", output="image")
resp2 = lm15.call("claude-sonnet-4-5", ["What's this?", resp.image])

# Document
resp = lm15.call("claude-sonnet-4-5", ["Summarize.", Part.document(url="https://example.com/paper.pdf")])

# Upload via provider file API
doc = lm15.upload("claude-sonnet-4-5", "contract.pdf")
resp = lm15.call("claude-sonnet-4-5", ["Find liability clauses.", doc])

Structured output (JSON)

resp = lm15.call("gpt-4.1-mini", "Extract: 'Alice is 30.'",
    system="Return JSON: {name, age}", prefill="{")
data = resp.json  # parsed dict — raises ValueError if not valid JSON
print(data["name"], data["age"])  # Alice 30
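The reason `prefill="{"` helps: the model continues after the prefilled text, so the full JSON is the prefill plus the completion. A stdlib illustration of the recombination (the completion string here is a hypothetical model output, not a real API call):

```python
import json

prefill = "{"
# Hypothetical continuation the model produces after the prefilled "{"
completion = '"name": "Alice", "age": 30}'

data = json.loads(prefill + completion)
print(data["name"], data["age"])  # Alice 30
```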

Image and audio bytes

# Get generated image as raw bytes
resp = lm15.call("gpt-4.1-mini", "Draw a cat.", output="image")
with open("cat.png", "wb") as f:
    f.write(resp.image_bytes)  # decoded bytes, no base64 wrangling

# Same for audio
resp = lm15.call("gpt-4o-mini-tts", "Say hello.", output="audio")
with open("hello.wav", "wb") as f:
    f.write(resp.audio_bytes)

Reasoning

resp = lm15.call("claude-sonnet-4-5", "Prove √2 is irrational.", reasoning=True)
print(resp.thinking)  # chain of thought
print(resp.text)      # final answer

Conversation

gpt = lm15.model("gpt-4.1-mini", system="You remember everything.")

gpt.call("My name is Max.")
gpt.call("I like chess.")
resp = gpt.call("What do you know about me?")
print(resp.text)  # knows both

Prompt caching

Reduces cost and latency for repeated prefixes — system prompts, long documents, agent loops:

agent = lm15.model("claude-sonnet-4-5",
    system="<long system prompt>",
    tools=[read_file, write_file],
    prompt_caching=True,
)

resp = agent.call("Add tests for auth.")
while resp.finish_reason == "tool_call":
    results = execute(resp.tool_calls)
    resp = agent.submit_tools(results)
    print(f"Cache hit: {resp.usage.cache_read_tokens} tokens")

Prefill

resp = lm15.call("claude-sonnet-4-5", "Output JSON for a person.", prefill="{")

Reusable model with config

gpt = lm15.model("gpt-4.1-mini", system="You are terse.", retries=3, cache=True, temperature=0)
resp = gpt.call("Hello.")

# Override per call
resp = gpt.call("Be creative.", temperature=1.5)

# Derive new models
claude = gpt.copy(model="claude-sonnet-4-5")

Config from dicts

config = {"model": "gpt-4.1-mini", "system": "You are terse.", "temperature": 0}
resp = lm15.call(prompt="Summarize DNA.", **config)

Built-in tools

resp = lm15.call("gpt-4.1-mini", "Latest AI news", tools=["web_search"])
for c in resp.citations:
    print(c.title, c.url)

Provider support

Capability OpenAI Anthropic Gemini
complete
stream
embeddings
files
batches
images
audio
prompt caching auto

Architecture

lm15.call / lm15.acall / lm15.model   ← high-level surface
                │
                ▼
          Result / AsyncResult
                │
                ▼
LMRequest ──▶ UniversalLM ──▶ MiddlewarePipeline ──▶ ProviderAdapter ──▶ Transport
                  │                                        │
                  │ resolve_provider(model)                 │ build_request / parse_stream_event
                  ▼                                        ▼
            capabilities.py                         providers/{openai,anthropic,gemini}.py

The high-level surface (lm15.call, lm15.acall, lm15.model, Result) is a thin layer over LMRequest, UniversalLM, and provider adapters. Third parties can still build their own surface on top of the same internals.
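To make the diagram concrete, here is a sketch of what a frozen request paired with an adapter protocol can look like. Names are borrowed from the diagram; the real `LMRequest` and `ProviderAdapter` definitions in lm15 carry far more fields and methods:

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Protocol

@dataclass(frozen=True)
class LMRequest:
    """Immutable request snapshot; safe to log, cache, or replay."""
    model: str
    prompt: str

class ProviderAdapter(Protocol):
    """Each adapter turns a universal request into provider wire format."""
    def build_request(self, req: LMRequest) -> dict: ...

class FakeAdapter:
    def build_request(self, req: LMRequest) -> dict:
        return {"model": req.model, "messages": [{"role": "user", "content": req.prompt}]}

req = LMRequest("gpt-4.1-mini", "Hello.")
print(FakeAdapter().build_request(req))

try:
    req.model = "other"  # frozen dataclass: mutation raises
except FrozenInstanceError:
    print("immutable")
```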

Why this exists

  • Stdlib only. No requests, no httpx, no aiohttp. Transport is urllib or optional pycurl.
  • Frozen dataclasses all the way down. High level: Result out. Low level: LMRequest / LMResponse stay fully accessible. No mutable builder chains.
  • Nothing is hidden. Every internal type is importable. Provider escape hatches are always there.
  • Plugin discovery via entry points. Third-party providers install and register without touching lm15 core.

Docs

| Topic | Path |
|-------|------|
| API v2 spec (legacy) | docs/API_SPEC_V2.md |
| Getting started | docs/GETTING_STARTED.md |
| Core concepts | docs/CONCEPTS.md |
| Architecture | docs/ARCHITECTURE.md |
| Provider contract | docs/CONTRACT.md |
| Portability spec | docs/PORTABILITY.md |
| Transport design | docs/DESIGN_TRANSPORT.md |
| Error handling | docs/ERRORS.md |
| Streaming | docs/STREAMING.md |
| Writing an adapter | docs/ADAPTER_GUIDE.md |
| Adding a provider | docs/ADD_PROVIDER_GUIDE.md |
| Completeness testing | docs/COMPLETENESS.md |
| Production checklist | docs/PRODUCTION_CHECKLIST.md |

Cookbooks v2: docs/COOKBOOKS_V2/ — practical examples + references:

  1. Hello World
  2. Streaming
  3. Tools (auto-execute)
  4. Tools (manual loop)
  5. Multimodal
  6. Reasoning
  7. Conversation
  8. Prompt caching
  9. Model config
  10. Building an agent
  11. call()/acall()/Result reference
  12. Model discovery and provider status
  13. Live sessions (real-time audio & video)

Cookbooks v1 (low-level): docs/COOKBOOKS/ — 8 examples using the internal LMRequest/UniversalLM API directly.

License

MIT

About

Universal LM core with pluggable provider adapters for OpenAI, Anthropic, and Gemini.
