# Quartermaster


Modular AI agent orchestration framework. Build agent workflows as directed graphs, wire them with a fluent Python API, and run them with any LLM provider.

Built by MindMade in Slovenia.

## Install

```bash
# Everything (recommended)
pip install quartermaster-sdk

# With a specific LLM provider
pip install "quartermaster-sdk[openai]"
pip install "quartermaster-sdk[anthropic]"

# From source (for development or running examples)
git clone https://github.com/MindMadeLab/quartermaster-sdk-py.git
cd quartermaster-sdk-py
uv sync
```

## Quick Start

The simplest possible graph, running against a local Ollama in four lines (no `.start()`, no `.end()`, no `.build()`, no `FlowRunner` import):

```python
import quartermaster_sdk as qm

qm.configure(provider="ollama", base_url="http://localhost:11434", default_model="gemma4:26b")

result = qm.run(qm.Graph("chat").user().agent(), "Pozdravljen, koliko je ura?")
print(result.text)
```

`qm.run()` accepts the builder directly and finalises it internally; `.build()` is only needed when you want the validated `GraphSpec` for serialisation or inspection. For single-shot calls, skip the graph entirely:

```python
reply = qm.instruction(system="Respond in Slovenian.", user="Pozdravljen!")
# reply is a str.
```

For typed JSON extraction:

```python
from pydantic import BaseModel

class Classification(BaseModel):
    category: str
    priority: str

data = qm.instruction_form(Classification, system="Classify.", user=email_body)
# data is a Classification instance.
```

For richer flows, you keep the explicit per-node configuration:

```python
agent = (
    qm.Graph("My Agent")
    .user("What can I help you with?")
    .instruction("Respond", model="gpt-4o", system_instruction="You are a helpful assistant.")
)
result = qm.run(agent, "...")
```

### Reading specific node outputs with `capture_as=`

Attach a name to any node and read its output from `result.captures`:

```python
graph = (
    qm.Graph("enrich")
    .agent("Research", tools=[...], capture_as="notes")
    .instruction_form(CustomerData, system="Extract.", capture_as="data")
)
result = qm.run(graph, "VT-Treyd Slovenija")
result["notes"].output_text    # agent's free-text research
result["data"].output_text     # form-parsed JSON
```

### Streaming with filtered iterators

```python
# Typewriter effect -- just the model tokens as they arrive
for token in qm.run.stream(graph, "Hello!").tokens():
    print(token, end="", flush=True)

# Dashboard view -- only the tool-call events
for call in qm.run.stream(graph, "Research Slovenia").tool_calls():
    print(f"[TOOL] {call.tool}({call.args})")

# Live progress cards -- filter by custom event name
for evt in qm.run.stream(graph, "Run the pipeline").custom(name="source_found"):
    ui.add_source(evt.payload["url"])
```

The raw `for chunk in qm.run.stream(...)` loop still works unchanged when you want every chunk type in one place. Streams are single-pass: pick one consumer (`.tokens()`, `.tool_calls()`, `.progress()`, `.custom()`, or raw iteration) per stream.
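The filtered iterators can be pictured as generator adapters over one single-pass event stream. Below is a minimal stand-in sketch in plain Python (not the SDK internals; `Chunk` and its `kind` values are hypothetical):

```python
from dataclasses import dataclass

# Hypothetical event type standing in for the engine's stream chunks.
@dataclass
class Chunk:
    kind: str       # "token" | "tool_call" | "custom"
    payload: str

class Stream:
    """Single-pass stream: every filter consumes the one underlying iterator."""
    def __init__(self, chunks):
        self._it = iter(chunks)

    def __iter__(self):
        return self._it

    def tokens(self):
        # Yield only model tokens, discarding other chunk kinds.
        return (c.payload for c in self._it if c.kind == "token")

    def tool_calls(self):
        return (c for c in self._it if c.kind == "tool_call")

events = [Chunk("token", "Hel"), Chunk("tool_call", "search"), Chunk("token", "lo")]
print("".join(Stream(events).tokens()))  # the tool_call chunk is skipped
```

Because every filter drains the same iterator, consuming `.tokens()` leaves nothing for a later `.tool_calls()` call, which is exactly why the SDK asks you to pick one consumer per stream.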

### Post-mortem `Result.trace`

After a synchronous run (or after draining a stream to its `DoneChunk`), `result.trace` exposes a structured view of every `FlowEvent` the engine emitted:

```python
result = qm.run(graph, "Hello!")

result.trace.text                         # concatenated model output
result.trace.tool_calls                   # list[dict] across every agent node
result.trace.progress                     # list[ProgressEvent]
result.trace.custom(name="source_found")  # filtered CustomEvent list
result.trace.by_node["Researcher"].text   # tokens for a single node
print(result.trace.as_jsonl())            # JSONL export for logs / fixtures
```

## Decision Routing

The LLM classifies the input and picks ONE branch. No merge is needed.

```python
agent = (
    Graph("Router")
    .user("Describe your issue")
    .instruction("Classify", system_instruction="Classify as: Technical or General.")
    .decision("Category?", options=["Technical", "General"])
    .on("Technical")
        .instruction("Tech response", system_instruction="Give a technical answer.")
    .end()
    .on("General")
        .instruction("General response", system_instruction="Give a general answer.")
    .end()
    .end()
)
```

## Parallel Execution

All branches run concurrently, then merge.

```python
agent = (
    Graph("Code Review")
    .user("Paste your code")
    .parallel()
    .branch()
        .instruction("Security audit", system_instruction="Check for vulnerabilities.")
    .end()
    .branch()
        .instruction("Performance check", system_instruction="Check for performance issues.")
    .end()
    .static_merge("Collect results")
    .instruction("Final report", system_instruction="Combine all findings.")
    .end()
)
```

## User Forms and Templates

```python
agent = (
    Graph("Registration")
    .user("Welcome!")
    .user_form("Details", parameters=[
        {"name": "full_name", "type": "text", "label": "Name", "required": True},
        {"name": "email",     "type": "email", "label": "Email", "required": True},
    ])
    .var("Capture name", variable="name", expression="full_name")
    .text("Confirm", template="Thanks {{full_name}}, we'll email {{email}} with details.")
    .end()
)
```
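The `{{variable}}` placeholders in `.text(template=...)` follow the familiar double-brace convention. As an illustration only (a stand-in, not the engine's actual renderer), the substitution amounts to:

```python
import re

def render(template: str, variables: dict) -> str:
    # Replace each {{name}} with the variable's value; leave unknown names intact.
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: str(variables.get(m.group(1), m.group(0))),
        template,
    )

print(render("Thanks {{full_name}}, we'll email {{email}}.",
             {"full_name": "Ana", "email": "ana@example.com"}))
```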

## Custom Tools with `@tool()`

```python
from quartermaster_tools import tool

@tool()
def get_weather(city: str, units: str = "celsius") -> dict:
    """Get current weather for a city.

    Args:
        city: The city name to look up.
        units: Temperature units (celsius or fahrenheit).
    """
    return {"city": city, "temperature": 22, "units": units}

# Call it directly
result = get_weather(city="Amsterdam")

# Export JSON Schema for LLM function calling
schema = get_weather.info().to_input_schema()

# Or register in a ToolRegistry and export all at once
from quartermaster_tools import ToolRegistry

registry = ToolRegistry()
registry.register(get_weather)
schemas = registry.to_json_schema()
```
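To see the kind of work a decorator like `@tool()` does, here is a hedged sketch of deriving a JSON-Schema-style input schema from a function signature. This is not the `quartermaster_tools` implementation, just the general technique:

```python
import inspect

# Map Python annotations to JSON Schema type names (partial, for illustration).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean", dict: "object"}

def input_schema(func):
    sig = inspect.signature(func)
    props, required = {}, []
    for name, param in sig.parameters.items():
        props[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value => required parameter
    return {"type": "object", "properties": props, "required": required}

def get_weather(city: str, units: str = "celsius") -> dict:
    return {"city": city, "temperature": 22, "units": units}

print(input_schema(get_weather))
```

A real implementation would additionally pull per-parameter descriptions out of the docstring, as the `Args:` section in the example above suggests.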

See examples/ for runnable examples covering every pattern.

## Running Your Graph

```python
from quartermaster_engine import run_graph

# Run -- each node uses the provider/model it declares
run_graph(agent, user_input="What is quantum computing?")

# Interactive mode -- pauses at User nodes and prompts on stdin
run_graph(agent)  # no user_input = interactive
```

Nodes declare their own provider and model:

```python
.instruction("Respond", model="claude-haiku-4-5-20251001", provider="anthropic", ...)
.instruction("Fast reply", model="llama-3.3-70b-versatile", provider="groq", ...)
.instruction("Local", model="gemma4:26b", provider="ollama", ...)
```

Set up your API keys in a `.env` file at the project root:

```bash
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GROQ_API_KEY=gsk_...
XAI_API_KEY=xai-...
```

Output streams token by token in real time. Use `show_output=False` on nodes to hide internal steps (variables, conditions) from the output.

## Packages

| Package | Description |
| --- | --- |
| `quartermaster-sdk` | Meta-package -- installs all core packages |
| `quartermaster-graph` | Graph schema, fluent builder API, validation |
| `quartermaster-providers` | LLM provider abstraction (OpenAI, Anthropic, Google, Groq, Ollama, vLLM) |
| `quartermaster-tools` | Tool definition, registry, built-in tools |
| `quartermaster-nodes` | Node execution protocols and 40+ node implementations |
| `quartermaster-engine` | Flow execution, traversal, memory, streaming |
| `quartermaster-mcp-client` | MCP protocol client -- standalone, no framework dependency |
| `quartermaster-code-runner` | Docker-sandboxed code execution -- standalone FastAPI service |

## Architecture

```text
Your Application
       |
       v
quartermaster-engine        Flow execution, traversal, streaming
  |         |         |
  v         v         v
graph     nodes     tools   Schema/builder, node executors, tool registry
            |
            v
         providers          OpenAI, Anthropic, Google, Groq, Ollama, vLLM, ...

quartermaster-mcp-client    Standalone MCP protocol client
quartermaster-code-runner   Standalone Docker code execution
```

## Key Concepts

- **Graph** -- A directed graph of nodes and edges (cycles supported via `connect()` for loops). Built with the fluent `Graph("name").user("Input")...end()` API; the Start node is auto-inserted.
- **GraphSpec** -- The serializable graph model (`GraphSpec` in `quartermaster-graph`). `qm.run(graph, ...)` finalises the builder for you; an explicit `Graph.build()` only matters when you want the validated spec to serialise or inspect. `AgentGraph` remains as a deprecated backward-compat alias.
- **User Node** -- Every graph typically begins with `.user()` to collect user input.
- **Nodes** -- Units of work: LLM calls, decisions, user input, memory, tools, templates.
- **Edges** -- Directed connections between nodes. Decision/IF/Switch edges carry labels.
- **Thoughts** -- Runtime containers that carry text and variables (metadata) between nodes.
- **Memory** -- Flow-scoped persistent storage accessible from any node via `write_memory`/`read_memory`.
- **Providers** -- Pluggable LLM backends. Model names auto-resolve to the right provider.
- **Tools** -- `@tool()` decorator for custom tools, built-in tools, JSON Schema export via `tool.info().to_input_schema()`.
- **Loops** -- `connect("Continue", "Start")` creates back-edges for iterative flows.
- **Streaming** -- Token-by-token output from LLM nodes in real time.
- **Multi-provider** -- Different LLM providers for different nodes in the same graph.
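The back-edges that `connect()` creates can be pictured with a toy traversal over an adjacency map. This is only an illustration of the cycle mechanics with a loop guard, not the engine's traversal code, and the node names are made up:

```python
# Toy graph: "Continue" loops back to "Start"; a counter-based guard
# eventually routes to "Done" (standing in for a real exit condition).
edges = {"Start": "Work", "Work": "Continue", "Continue": "Start"}  # back-edge
visited = []
node, steps = "Start", 0

while node != "Done":
    visited.append(node)
    steps += 1
    if node == "Continue" and steps >= 6:  # exit after two full passes
        node = "Done"
    else:
        node = edges[node]

print(visited)
```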

## Branching Rules

| Node Type | Behavior | Merge Needed? |
| --- | --- | --- |
| `decision()` | LLM picks ONE branch | No |
| `if_node()` | Boolean expression picks ONE branch | No |
| `switch()` | Expression picks ONE branch | No |
| `parallel()` | ALL branches run concurrently | Yes -- use `static_merge()` |
| `connect()` | Manual edge by node name | Creates loops/cycles |
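`if_node()` and `switch()` evaluate an expression against the flow's variables, and the Security docs mention safe eval. A minimal AST-based sketch of safely evaluating a comparison expression (a stand-in for illustration, not the SDK's evaluator):

```python
import ast
import operator

# Whitelisted comparison operators; anything outside this set is rejected.
_OPS = {ast.Eq: operator.eq, ast.NotEq: operator.ne,
        ast.Gt: operator.gt, ast.Lt: operator.lt}

def safe_bool(expression: str, variables: dict) -> bool:
    node = ast.parse(expression, mode="eval").body
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        left = _value(node.left, variables)
        right = _value(node.comparators[0], variables)
        return _OPS[type(node.ops[0])](left, right)
    raise ValueError(f"unsupported expression: {expression!r}")

def _value(node, variables):
    if isinstance(node, ast.Name):
        return variables[node.id]  # plain variable lookup, no builtins exposed
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError("only names and literals allowed")

print(safe_bool("priority == 'high'", {"priority": "high"}))
```

Because the expression is parsed rather than `eval()`ed, function calls, attribute access, and anything else outside the whitelist raise instead of executing.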

## Documentation

| Document | Description |
| --- | --- |
| Getting Started | Installation and first agent |
| Graph Building | Builder API, node types, patterns |
| Architecture | System overview and data flow |
| Providers | LLM providers, including local (Ollama, vLLM) |
| Tools Catalog | All built-in tools with parameters |
| Engine | Execution engine internals |
| Security | Safe eval, sandboxing, API key management |
| Node Reference | Detailed node documentation by category |

## Migrating from 0.4 → 0.5

The Ollama transport fork was collapsed. If you were on the v0.4 paths, apply these renames:

| Removed (v0.4) | Replacement (v0.5) |
| --- | --- |
| `OllamaNativeProvider` | `OllamaProvider` (now inherits from `OpenAICompatibleProvider`) |
| `OllamaProvider.chat(...)` sync shim | `await provider.generate_native_response(...)` |
| `ChatResult` | `NativeResponse` |
| `qm.configure(ollama_tool_protocol="auto"/"native"/"openai_compat")` | Removed -- tool-name hallucinations are now handled globally by the universal prefix strip |
| `from quartermaster_providers import ChatResult` | Removed from `__all__` |
| `model_supports_native_tools(...)` | Removed |

Nothing else is behaviour-breaking. Parallel tool execution needs no opt-in flag: emitting multiple `tool_calls` from the model triggers it. `program_runner(program=<str>)` keeps working; the callable form is an addition.
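The universal prefix strip can be sketched in plain Python. Assuming it keeps only the final segment after a `:` or `.` separator (my reading of the `rsplit` description in the v0.5 release notes, not verified against the source):

```python
def strip_tool_prefix(name: str) -> str:
    # Drop hallucinated namespaces such as "default_api:", "functions.", "mcp:".
    for sep in (":", "."):
        name = name.rsplit(sep, 1)[-1]
    return name

print(strip_tool_prefix("default_api:get_weather"))
```

Note the trade-off under this assumption: a tool whose legitimate name contains a dot would also be truncated to its last segment.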

## What's new

Release notes live in GitHub Discussions, one thread per release with migration tables and known issues. Highlights per version:

**v0.6.0 — legacy cleanup + 7 integrator-requested features**

- Stream cancellation now actually aborts the in-flight httpx call (vLLM slot freed on SSE disconnect). #68
- `.agent(extra_body={...})` / `.instruction_form(extra_body={...})` -- pass-through for Gemma-4's `chat_template_kwargs` and vLLM sampling knobs. #62
- `.agent(retry={"max_attempts": N, "on": predicate})` -- node-level retry primitive. #67
- `qm.parse_partial(text, schema)` -- progressive-degradation parser for structured output. #64
- Sliding-window truncation of the oldest `<tool_result>` blocks when the accumulated prompt exceeds `max_input_tokens`. #66
- Client-side salvage of text-form `<|tool_call|>` blocks for misconfigured vLLM / Ollama servers. #63
- Cleanup: 35 CamelCaseTool aliases dropped; the `AgentGraph`/`AgentVersion`/`_build_registry`/`NodeRegistry` aliases gone; the `.end(stop=)` kwarg removed; lint rules QM002–QM004 pruned. See discussion #70 for the full migration table.

**v0.5.0 — Ollama transport collapse, parallel tools, callable program_runner**

- Ollama provider collapsed into a thin subclass of the OpenAI-compatible client. One transport.
- Parallel tool execution: multiple `tool_calls` in one turn dispatch concurrently.
- `program_runner(program=<callable>)` accepts `@tool()` functions directly.
- Universal tool-name prefix strip (`default_api:`, `functions:`, `mcp:`, ...) via `rsplit` on `:` or `.`.
- `duckduckgo_search` user-agent fix.

**v0.4.0 — timeouts, stream cancellation, per-node tool scoping**

- Application timeouts via `qm.configure(timeout=/connect_timeout=/read_timeout=)`.
- Stream cancellation via `with qm.run.stream(...) as stream:`.
- Per-node tool scoping (`agent(tools=[...])` strictly enforced).
- Inline `@tool` callables in `agent(tools=[my_func])`.
- Circuit breaker, session store, typed custom events, static graph linter.

**v0.3.0 — filtered streams, live progress, structured trace**

- Filtered stream iterators: `stream.tokens()` / `.tool_calls()` / `.progress()` / `.custom(name=...)`.
- Live progress from tools via `qm.current_context().emit_progress(...)`.
- Structured post-mortem `Result.trace` with per-node breakdowns.
- One-line OpenTelemetry instrumentation.

## Development

```bash
# Clone
git clone https://github.com/MindMadeLab/quartermaster-sdk-py.git
cd quartermaster-sdk-py

# Install everything (uv workspace -- one command)
uv sync

# Run an example
uv run examples/01_hello_agent.py

# Run tests for a single package
uv run pytest quartermaster-graph/tests/

# Run all tests
uv run pytest quartermaster-graph/tests/ quartermaster-tools/tests/ quartermaster-engine/tests/
```

See CONTRIBUTING.md for the full development guide.

## License

Apache 2.0 -- Built by MindMade in Slovenia.
