AI interactions, not transactions.
Run AI locally. Route intelligently. Use the cloud only when it matters.
ToastStack Starter is an open-source reference implementation for building local-first, cloud-backed LLM workflows.
Instead of sending every prompt to expensive cloud models, ToastStack routes requests intelligently:
- Local models handle most tasks
- Cloud models handle critical moments
- Routing logic decides automatically
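The routing decision can be sketched as a small classifier. The heuristic below (prompt length plus a few escalation keywords) is purely illustrative — the thresholds, keyword list, and provider names are assumptions, not the gateway's actual logic:

```javascript
// Illustrative routing heuristic: cheap local model by default,
// escalate to a cloud model for long or high-stakes prompts.
// Keywords and the length threshold are assumptions for this sketch.
const ESCALATION_KEYWORDS = ["production", "security", "final review"];

function chooseProvider(prompt) {
  const longPrompt = prompt.length > 2000;
  const highStakes = ESCALATION_KEYWORDS.some((kw) =>
    prompt.toLowerCase().includes(kw)
  );
  return longPrompt || highStakes ? "anthropic" : "ollama";
}
```

In practice the gateway config (not application code) owns this decision; the sketch just shows the shape of a local-first default with cloud escalation.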
Result: 80–95% cost reduction without sacrificing workflow quality
Modern AI development is expensive, unpredictable, and inefficient.
- Every prompt = cost
- Iteration becomes constrained
- Sensitive data leaves your environment
- Teams lack visibility and control
Most setups look like this:
```mermaid
flowchart LR
  IDE["IDE / CLI"] --> CloudLLM["Cloud LLM"]
  CloudLLM --> Cost["High cost"]
```
ToastStack flips the model:
```mermaid
flowchart TD
  Dev["Developer / Agent"] --> GW["ToastStack Gateway"]
  GW --> Local["Local models (Ollama)"]
  GW --> Cloud["Cloud models (Claude / GPT)"]
```
- Local = default
- Cloud = escalation
This is a starter system, not a full platform.
You get:
- Pre-configured LiteLLM gateway
- Local model setup (Ollama)
- Example routing strategies
- Developer workflows (local-first + validation)
- Multi-agent patterns (planner, coder, reviewer)
- Benchmarks (cost, latency, quality)
Clone, run, and you have a working hybrid AI stack (once the setup scripts and gateway config are in place).
Prerequisites: The commands below match the intended layout for this repo. Some paths (Ollama setup scripts, Docker Compose for the gateway, and the sample app entrypoint) may still be stubs on your clone. Add or generate those assets from the docs when they land, or adjust paths to match your environment.
```shell
./local/setup-ollama.sh
./local/pull-models.sh
docker-compose up
node examples/sample-app/index.js
```

Basic routing strategy:
```yaml
routes:
  - match: "simple"
    provider: "ollama"
  - match: "complex"
    provider: "anthropic"
fallback:
  provider: "anthropic"
```

- Local-first by default
- Cloud when needed
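Once the gateway is running, clients talk to it through an OpenAI-compatible chat endpoint. The snippet below assumes LiteLLM's default proxy port (4000) and a model alias (`local-default`) configured in the gateway — both may differ in your setup:

```javascript
// Build an OpenAI-style chat request body for the gateway.
function buildChatRequest(model, prompt) {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
  };
}

// Send the request to the local gateway instead of a cloud API.
// Port and model alias are assumptions; match them to your config.
async function askGateway(prompt, base = "http://localhost:4000") {
  const res = await fetch(`${base}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildChatRequest("local-default", prompt)),
  });
  if (!res.ok) throw new Error(`Gateway error: ${res.status}`);
  return (await res.json()).choices[0].message.content;
}
```

Because the endpoint is OpenAI-compatible, existing SDKs and tools can point at the gateway by changing only the base URL.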
Example scenario: 1000 prompts
| Setup | Cost |
|---|---|
| Cloud-only | $42.00 |
| ToastStack | $4.80 |
| Savings | ~88% |
See benchmarks/ for full breakdowns.
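The ~88% figure can be reproduced with a back-of-envelope model. The rates below (average $0.042/prompt cloud, $0.0007/prompt amortized local, 10% of prompts escalated) are illustrative assumptions, not measured ToastStack numbers:

```javascript
// Rough hybrid cost model; all rates are illustrative assumptions.
function estimateCosts(prompts, cloudRate, localRate, cloudShare) {
  const cloudOnly = prompts * cloudRate;
  const hybrid =
    prompts * (cloudShare * cloudRate + (1 - cloudShare) * localRate);
  return { cloudOnly, hybrid, savings: 1 - hybrid / cloudOnly };
}

// 1000 prompts, 10% escalated to cloud: ~$42 vs ~$4.83, ~88% savings.
const result = estimateCosts(1000, 0.042, 0.0007, 0.1);
```

The savings track the escalation rate almost linearly, which is why routing quality matters more than the exact local model chosen.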
Fast iteration using local models for:
- coding
- debugging
- drafting
Escalate only when needed for:
- final review
- complex reasoning
- production checks
Agent-based workflow:
- Planner: breaks tasks down
- Coder: implements changes
- Reviewer: validates output
These mimic real-world dev workflows.
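A minimal sketch of that planner → coder → reviewer loop, with the model call injected as a function so planning and coding can run locally while review escalates to the cloud. The role prompts and the `callModel` signature are hypothetical, not ToastStack's actual interfaces:

```javascript
// Minimal planner -> coder -> reviewer pipeline. `callModel` is an
// injected (tier, prompt) => string function, so the same pipeline
// can run early stages on a local model and review on a cloud model.
function runPipeline(task, callModel) {
  const plan = callModel("local", `Break this task into steps: ${task}`);
  const code = callModel("local", `Implement the following plan:\n${plan}`);
  const review = callModel("cloud", `Review this change:\n${code}`);
  return { plan, code, review };
}
```

Injecting the model call also makes the pipeline trivially testable with a stub in place of a real gateway request.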
ToastStack Starter is designed for developers and small teams.
As usage grows, teams typically need:
- centralized routing
- usage visibility
- cost tracking
- policy enforcement
This is where ToastStack evolves beyond this repo.
ToastStack is built on one core principle:
Run cheap and private by default.
Escalate to premium intelligence only when necessary.
This creates:
- faster iteration
- lower costs
- better control
- scalable workflows
This repository is:
- NOT a production-ready routing engine
- NOT a policy enforcement system
- NOT a cost optimization platform
- NOT a team-level control plane
It is a reference implementation.
In scope today:
- Local-first routing
- Cloud fallback
- Example workflows

Planned next:
- Smarter routing strategies
- Performance-aware selection
- Cost-aware execution

Further out:
- Team-level policies
- Cost dashboards
- Prompt analytics
- Shared workflows
- Governance layer
AI is becoming infrastructure.
But right now, it is:
- expensive
- fragmented
- hard to control
ToastStack is an attempt to define a better pattern:
Hybrid, local-first AI development
Star the repo.
Share it.
Build on it.
This is not just a starter kit.
It is the beginning of a new standard for how developers work with AI.
Run local. Route smart. Scale intentionally.