Production-grade LLM agents in Go.
Write familiar Go code: tools, prompts, streaming events. Let the runtime handle sessions, state, backpressure, persistence, and optional cluster placement.
Atto is built for people shipping agents, not for lecture slides about concurrency models. You wire `tool.Func`, `agent.NewLLM`, and `atto.New`. You stream `session.Event` values like any other Go iterator.
Below that small surface, each session is an actor. Messages are handled in order. Concurrent callers queue safely. When someone pushes too hard, the runtime gives you an explicit error instead of silent corruption or an OOM. With goakt, the same API can span multiple machines: sticky routing and reload from storage stay in the runtime.
If you have never said "actor" out loud, picture one goroutine's worth of discipline per user, enforced by the framework. You are not maintaining a `map[string]*sync.Mutex` or guessing what happens when two HTTP handlers touch the same chat history.
If you already like actors, Atto is where that idea meets first-class LLM tooling: native OpenAI, Anthropic, and Gemini adapters; JSON Schema tools from plain Go functions; optimistic concurrency on persisted snapshots; and one `atto.Runtime` whether you run single-node or on a cluster.
Status: pre-1.0. APIs may change before `v0.1.0` is tagged. The module path is stable through `v1`.

```shell
go get github.com/tochemey/atto
```

Requires Go 1.26+.
The root `atto` package is the user-facing entry point; the rest of the surface lives under `agent`, `session`, `tool`, `store/...`, and `llm/...`.
- **LLM agent loop.** `agent.NewLLM` streams completions and runs tool round trips until the model answers without more tool calls, or until `WithMaxIterations` stops the loop (default 10). Configure `WithInstruction`, `WithTemperature`, `WithTools`, `WithName` (shown as the event author), and any `llm.LLM` adapter.
- **Typed tools.** `tool.Func` reflects the argument struct into JSON Schema (unless you pass `WithSchema`), attaches an optional description, and wraps ordinary Go functions. The registry validates JSON arguments before dispatch.
- **Session state and history.** Each `atto.Runtime.Run` builds an `agent.Invocation` with the session ID, user `Input`, prior `History`, and a mutable `session.State` plus `StateDelta`. State changes recorded in `Delta` commit atomically with the assistant's final message for that turn.
- **Persistence contract.** `store.SessionStore` saves `Snapshot` values: history, state, `UpdatedAt`, and `Version`. `Save` enforces optimistic concurrency; stale versions return `store.ErrConcurrentWrite`. `Load` and `Delete` round out the interface. Implementations live under `store/inmemory`, `store/bolt`, and `store/postgres`, all checked against `store/storetest`.
- **Streaming events.** `Runtime.Run` yields `iter.Seq2[*session.Event, error]`: text deltas, tool call and result notices, the final assistant message, and terminal errors (`session.EventKind`).
- **One front door, always actor-backed.** `atto.New(ctx, model, build)` builds and starts an internal goakt actor system, registers atto's runtime extension, spawns the per-process model actor, and returns a `*atto.Runtime` ready for invocations. The `build` closure receives the model-actor-backed `llm.LLM`; the agent it returns uses that instance and inherits retry, passivation, and (in cluster mode) placement transparently.
- **Session actors and backpressure.** Behind the front door, a `SessionActor` per session owns the conversation; per-invocation `RunWorker` goroutines stream events through a buffered channel (`atto.WithEventBufferSize`, default 64). While a turn is in flight, extra work for that session queues in the actor stash up to `atto.WithStashBound` (default 32); beyond that you get `session.ErrSessionBacklogFull`. Idle sessions passivate after `atto.WithPassivationAfter` (default 15 minutes) and re-hydrate from the configured store on next use. A failed `Save` on commit rolls the turn back in memory before replying.
- **Model actor.** Every completion runs through a singleton `ModelActor`: streaming via pipe-to tasks, exponential-backoff retries for transient failures (caps configured via `atto.WithModelMaxRetries`, `atto.WithModelBaseBackoff`, `atto.WithModelMaxBackoff`), and cancellation tied to the caller so one stream does not abort another.
- **Cluster wiring.** Pair `atto.WithActorSystem(sys)` with an actor system you constructed yourself when you need goakt features beyond the option set (cluster discovery, custom logger, remote transport). `atto.ClusterKinds` and `atto.RemoteSerializables` register session/model actor types and wire structs on goakt's cluster and remote layers. Dependencies resolve in `PreStart` from `atto.Extension` so actors can relocate without captured globals.
- **First-party model adapters.** Streaming `llm.LLM` implementations for OpenAI-compatible HTTP APIs (`llm/openai`), native Anthropic (`llm/anthropic`, including `Request.CacheKey` for prompt caching), native Gemini (`llm/gemini`, AI Studio and Vertex), plus helpers for Azure, Ollama, and vLLM.
- **Test doubles.** `llm.NewFake` replays scripted chunks; `tool.Fake` records calls. Combine them with `store/inmemory` for end-to-end tests without the network or goakt.
```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"

	"github.com/tochemey/atto"
	"github.com/tochemey/atto/agent"
	"github.com/tochemey/atto/llm"
	"github.com/tochemey/atto/session"
	"github.com/tochemey/atto/tool"
)

type weatherArgs struct {
	City string `json:"city"`
}

func main() {
	ctx := context.Background()

	weather := tool.Func("get_weather",
		func(_ context.Context, a weatherArgs) (string, error) {
			return fmt.Sprintf("18C and cloudy in %s.", a.City), nil
		},
		tool.WithDescription("Look up the weather in a city."),
	)

	// Swap for llm/openai, llm/anthropic, or llm/gemini when you have a key.
	model := llm.NewFake(
		llm.Script{Chunks: []*llm.Chunk{{
			ToolCalls: []session.ToolCall{{
				ID:        "call-1",
				Name:      "get_weather",
				Arguments: json.RawMessage(`{"city":"Lagos"}`),
			}},
			Done: true,
		}}},
		llm.Script{Chunks: []*llm.Chunk{
			{Delta: "The weather in Lagos is 18C and cloudy."},
			{Done: true},
		}},
	)

	build := func(m llm.LLM) agent.Agent {
		return agent.NewLLM(
			agent.WithModel(m),
			agent.WithInstruction("You are a helpful assistant."),
			agent.WithTools(weather),
		)
	}

	rt, err := atto.New(ctx, model, build)
	if err != nil {
		log.Fatal(err)
	}
	defer rt.Stop(ctx)

	for ev, err := range rt.Run(ctx, "user-123", session.UserText("What's the weather in Lagos?")) {
		if err != nil {
			log.Fatal(err)
		}
		if ev.Kind == session.EventTextDelta {
			fmt.Print(ev.TextDelta)
		}
	}
}
```

The surface area stays small: `Agent`, `LLM`, `Tool`, `Event`, `Runtime`. Session lifecycle, mailboxes, stash limits, passivation, and cluster serialisation live inside the runtime. You do not subclass actors or spawn a goroutine per chat by hand.
Go developers can choose from real agent frameworks, not only hand-rolled loops around HTTP APIs. Two widely referenced stacks are LangChainGo, the Go port of LangChain-style composability, and Google's Agent Development Kit for Go, published as `google.golang.org/adk` and documented with the rest of ADK at google.github.io/adk-docs. Many other agent-related modules ship from independent authors on pkg.go.dev; scopes and trade-offs vary, so each module deserves to be judged by its own docs and release cadence.
Across those projects the dominant pattern is composition inside your process: you wire models, tools, and storage using the library's abstractions, then you decide how to enforce per-session ordering, survive restarts, shed load, and run more than one replica. Clustering and sticky sessions are usually application architecture (databases, queues, orchestrators), not one shared runtime primitive baked into the toolkit.
Google ADK for Go sits in that broader ADK ecosystem and tooling story. Atto sits elsewhere: it assumes you want session actors, bounded mailboxes, centralised completion retries, and optional multi-node placement to come from an actor system (GoAkt) behind one stable atto.Runtime API, with store.SessionStore as the persistence contract instead of a session service you have to invent.
Atto is aimed at people who want:
- **A typed, idiomatic Go API.** `context`, streaming results as `iter.Seq2[*session.Event, error]`, and small functional options instead of a framework-shaped DSL.
- **A runtime that scales to multiple nodes without hand-written distribution code.** Placement, supervision, scheduling, and backpressure belong to GoAkt; Atto maps sessions and completions onto actors so `Runtime.Run` stays the same whether atto built the actor system or you handed it a clustered one.
- **Vendor-neutral models and portable tools.** Several `llm.LLM` adapters ship in-tree; tools are ordinary Go functions described with JSON Schema. MCP-first tooling and A2A wire compatibility are planned goals, not part of the initial release focus.
- **Fault semantics that are part of the product, not glue code.** Examples include retries and cancellation around the model actor, per-session ordering, bounded backlog (`session.ErrSessionBacklogFull`), and optimistic concurrency on `store.SessionStore`.
- **Production persistence without a separate session microservice.** `SessionStore` is a Go interface with `inmemory`, `bolt`, and `postgres` implementations and a shared `storetest` suite.
GoAkt already supplies the hard runtime primitives. Atto stays comparatively thin: Agent, LLM, Tool, Event, Runtime, plus adapters and stores, rather than reimplementing scheduling or clustering itself.
That intent shows up in day-to-day behaviour:
- Same user, two concurrent HTTP requests: turns run one after another; history stays coherent.
- Different users: independent session actors run in parallel.
- A caller floods one session: the runtime returns `session.ErrSessionBacklogFull` when the bounded stash is full.
- Idle chats pile up in RAM: configurable passivation frees memory; the next message reloads from your store.
- Rolling restart or node loss: a shared `store.SessionStore` plus sticky placement keeps session state across process boundaries.
- Race-free persistence: `Snapshot.Version` provides optimistic concurrency; conflicting writes come back as `store.ErrConcurrentWrite`.
`atto.New` is the façade. It builds and starts a private goakt `ActorSystem`, registers atto's runtime extension with your `llm.LLM` and `store.SessionStore`, spawns the per-process model actor, and returns a `*atto.Runtime`. The `build` closure receives the model-actor-backed LLM and hands it to the agent that the runtime drives. Per-invocation worker goroutines talk to a `SessionActor` for history and commits and a `ModelActor` for retried completions.

`internal/actor.SessionActor` loads and saves through `store.SessionStore`, applies persisted turn deltas (`CommitTurn`), and uses goakt's stash so overlapping snapshot requests wait in line or error cleanly when the stash overflows.

The model-actor bridge wraps your raw `llm.LLM` so streaming completions go through `ModelActor`, where centralised retry and backoff live. That matters most when you run on a cluster; the wiring is internal, and the agent only ever sees `llm.LLM`.
`store.SessionStore` is the contract behind `inmemory`, `bolt`, and `postgres`. The shared `storetest` suite keeps every backend honest.

Most callers stay on the single-process default `atto.New(ctx, model, build)`. For cluster mode, bring your own actor system configured with goakt's remote and cluster layers, register atto's extension and kinds, then adopt the system with `atto.WithActorSystem`:
```go
sys, _ := actor.NewActorSystem("agents",
	actor.WithRemote(remote.NewConfig("0.0.0.0", 3330,
		remote.WithSerializables(atto.RemoteSerializables()...))),
	actor.WithCluster(actor.NewClusterConfig().
		WithDiscovery(natsProvider).
		WithKinds(atto.ClusterKinds()...).
		WithMinimumPeersQuorum(2)),
	actor.WithExtensions(atto.Extension(
		atto.ExtensionWithStore(postgresStore),
		atto.ExtensionWithLLM(model),
	)),
)
_ = sys.Start(ctx)

rt, _ := atto.New(ctx, model, build, atto.WithActorSystem(sys))
```

`atto.ClusterKinds` registers the session and model actor types on the cluster. `atto.RemoteSerializables` registers every wire message for goakt's remote layer. Use both together so you do not get the classic "works on one node, dies across the cluster" wiring mistake.
Dependencies resolve in `PreStart` through `atto.Extension`, so actors can relocate without holding stale pointers.
| Example | What it shows |
|---|---|
| `examples/quickstart` | Runnable tour with a scripted fake model. No API key required. |
| `examples/gemini/single-agent` | Live Gemini plus a real HTTP tool. |
| `examples/gemini/multi-agent` | Coordinator delegates to a specialist ("agent as tool"). |
`llm.NewFake` replays scripted chunks; `tool.Fake` records invocations. Together with `store/inmemory`, they give you deterministic, fast tests. New store backends plug into `storetest.Run(t, factory)` against the shared contract.
The five-concept public API is the stability target for `v0.1.0`. Cluster mode, three native model adapters, three in-tree stores, and the model actor are already in tree; details live in `CHANGELOG.md`. Workflow agents (`Sequential`, `Parallel`, `Loop`), an `AgentTool` helper for native sub-agents, and OTel spans are tentatively slated for v0.2.
See SECURITY.md for the disclosure process.
Bug fixes, adapters, and stores are welcome. Atto uses Conventional Commits and runs `go test -race`, `go vet`, and `golangci-lint`, matching CI. See `CONTRIBUTING.md`.
In Italian, atto is a noun meaning an act, deed, or action. That fits the spirit of AI agents: they are useful when they can do more than produce text, such as call tools, update state, and carry work forward for a user.
atto is also the metric prefix for 10^-18. That second meaning fits the engineering goal: keep the public surface tiny, let the runtime carry the weight, and make the smallest useful abstraction feel sharp enough for production work.