atto

Production-grade LLM agents in Go.
Write familiar Go code: tools, prompts, streaming events. Let the runtime handle sessions, state, backpressure, persistence, and optional cluster placement.

Atto is built for people shipping agents, not for lecture slides about concurrency models. You wire tool.Func, agent.NewLLM, and runner.Run. You stream session.Event values like any other Go iterator.

Below that small surface, each session is an actor. Messages are handled in order. Concurrent callers queue safely. When someone pushes too hard, the runtime gives you an explicit error instead of silent corruption or an OOM. With goakt, the same API can span multiple machines: sticky routing and reload from storage stay in the runtime.

If you have never said "actor" out loud, picture one goroutine's worth of discipline per user, enforced by the framework. You are not maintaining a map[string]*sync.Mutex or guessing what happens when two HTTP handlers touch the same chat history.

If you already like actors, Atto is where that idea meets first-class LLM tooling: native OpenAI, Anthropic, and Gemini adapters; JSON Schema tools from plain Go functions; optimistic concurrency on persisted snapshots; and one Runner interface whether you run in-process or on a cluster.

Status: pre-1.0. APIs may change before v0.1.0 is tagged. The module path is stable through v1.

Installation

go get github.com/tochemey/atto

Requires Go 1.26+.

Features

The main packages are agent, runner, session, tool, store/..., and llm/....

  • LLM agent loop. agent.NewLLM streams completions and runs tool round trips until the model answers without more tool calls, or until WithMaxIterations stops the loop (default 10). Configure WithInstruction, WithTemperature, WithTools, WithName (shown as the event author), and any llm.LLM adapter.

  • Typed tools. tool.Func reflects the argument struct into JSON Schema (unless you pass WithSchema), attaches an optional description, and wraps ordinary Go functions. The registry validates JSON arguments before dispatch.

  • Session state and history. Each runner.Run builds an agent.Invocation with the session ID, user Input, prior History, and a mutable session.State plus StateDelta. State changes recorded in Delta commit atomically with the assistant's final message for that turn.

  • Persistence contract. store.SessionStore saves Snapshot values: history, state, UpdatedAt, and Version. Save enforces optimistic concurrency; stale versions return store.ErrConcurrentWrite (a retry sketch follows this list). Load and Delete round out the interface. Implementations live under store/inmemory, store/bolt, and store/postgres, all checked against store/storetest.

  • Streaming events. Runner.Run yields iter.Seq2[*session.Event, error]: text deltas, tool call and result notices, the final assistant message, and terminal errors (session.EventKind).

  • In-process runner. runner.New(nil, ...) runs the agent on the caller's goroutine, orders per-session turns with one mutex per session ID (kept for the lifetime of the process), and persists through runner.WithStore (default in-memory). It is useful for tests, demos, and small CLIs.

  • Actor-backed runner. runner.New(actorSystem, ...), with runner.Extension registered on the system, uses a SessionActor per session, RunWorker goroutines per invocation, goakt stashing for snapshot/commit ordering, optional passivation after idle time (runner.WithPassivationAfter, default 15 minutes), and a buffered event pipe (runner.WithEventBufferSize, default 64; see the options sketch after this list). While a turn is in flight, extra work for that session queues in the actor stash up to runner.ExtensionWithStashBound (runtime default 32); beyond that you get session.ErrSessionBacklogFull. A failed Save on commit rolls the turn back in memory before replying.

  • Model actor and proxy. runner.NewModelLLM sends every completion through ModelActor: streaming via pipe-to tasks, exponential backoff retries for transient failures (caps configured on runner.Extension), and CancelCompletion tied to the caller so one stream does not abort another.

  • Cluster wiring. runner.ClusterKinds registers session and model actor types; runner.RemoteSerializables registers wire structs for goakt remote. Dependencies resolve in PreStart from runner.Extension so actors can relocate without captured globals.

  • First-party model adapters. Streaming llm.LLM implementations for OpenAI-compatible HTTP APIs (llm/openai), native Anthropic (llm/anthropic, including Request.CacheKey for prompt caching), native Gemini (llm/gemini, AI Studio and Vertex), plus helpers for Azure, Ollama, and vLLM.

  • Test doubles. llm.NewFake replays scripted chunks; tool.Fake records calls. Combine them with store/inmemory for end-to-end tests without the network or goakt.
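
The persistence contract above implies a familiar reload-and-retry loop. Here is a minimal sketch, assuming Load(ctx, id) and Save(ctx, id, snapshot) method shapes; check store.SessionStore for the real signatures, since the pattern rather than the exact API is the point:

import (
    "context"
    "errors"

    "github.com/tochemey/atto/store"
)

// saveWithRetry reloads and reapplies a change whenever another writer
// commits first, which Save reports as store.ErrConcurrentWrite.
func saveWithRetry(ctx context.Context, s store.SessionStore, id string, mutate func(*store.Snapshot)) error {
    for {
        snap, err := s.Load(ctx, id) // assumed shape
        if err != nil {
            return err
        }
        mutate(snap) // edit history or state on the loaded Version
        err = s.Save(ctx, id, snap) // assumed shape; rejects stale Versions
        if errors.Is(err, store.ErrConcurrentWrite) {
            continue // lost the race: reload the newer snapshot and retry
        }
        return err
    }
}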
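
The actor-backed runner's options are ordinary functional options. A short sketch with illustrative values (sys and a are stand-ins for an ActorSystem and an agent):

r, err := runner.New(sys, a,
    runner.WithPassivationAfter(30*time.Minute), // default 15 minutes idle
    runner.WithEventBufferSize(128),             // default 64 buffered events
)
if err != nil {
    log.Fatal(err)
}

// The stash bound lives on the extension registered with the system, e.g.
// runner.Extension(runner.ExtensionWithStashBound(64)) // runtime default 32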

Five-minute tour

package main

import (
    "context"
    "encoding/json"
    "fmt"
    "log"

    "github.com/tochemey/atto/agent"
    "github.com/tochemey/atto/llm"
    "github.com/tochemey/atto/runner"
    "github.com/tochemey/atto/session"
    "github.com/tochemey/atto/store/inmemory"
    "github.com/tochemey/atto/tool"
)

type weatherArgs struct{ City string `json:"city"` }

func main() {
    ctx := context.Background()

    weather := tool.Func("get_weather",
        func(_ context.Context, a weatherArgs) (string, error) {
            return fmt.Sprintf("18C and cloudy in %s.", a.City), nil
        },
        tool.WithDescription("Look up the weather in a city."),
    )

    // Swap for llm/openai, llm/anthropic, or llm/gemini when you have a key.
    model := llm.NewFake(
        llm.Script{Chunks: []*llm.Chunk{{
            ToolCalls: []session.ToolCall{{
                ID:        "call-1",
                Name:      "get_weather",
                Arguments: json.RawMessage(`{"city":"Lagos"}`),
            }},
            Done: true,
        }}},
        llm.Script{Chunks: []*llm.Chunk{
            {Delta: "The weather in Lagos is 18C and cloudy."},
            {Done: true},
        }},
    )

    a := agent.NewLLM(
        agent.WithModel(model),
        agent.WithInstruction("You are a helpful assistant."),
        agent.WithTools(weather),
    )

    // nil ActorSystem uses the synchronous in-process runner.
    r, err := runner.New(nil, a, runner.WithStore(inmemory.New()))
    if err != nil {
        log.Fatal(err)
    }
    defer r.Stop(ctx)

    for ev, err := range r.Run(ctx, "user-123", session.UserText("What's the weather in Lagos?")) {
        if err != nil {
            log.Fatal(err)
        }
        if ev.Kind == session.EventTextDelta {
            fmt.Print(ev.TextDelta)
        }
    }
}

The surface area stays small: Agent, LLM, Tool, Event, Runner. Session lifecycle, mailboxes, stash limits, passivation, and cluster serialisation live inside the runtime. You do not subclass actors or spawn a goroutine per chat by hand.

Motivation

Go developers can choose from real agent frameworks, not only hand-rolled loops around HTTP APIs. Two widely referenced stacks are LangChainGo, the Go port of LangChain-style composability, and Google's Agent Development Kit for Go, published as google.golang.org/adk and documented with the rest of ADK at google.github.io/adk-docs. Many other agent-related modules ship from independent authors on pkg.go.dev; scopes and trade-offs vary, so each module deserves to be judged by its own docs and release cadence.

Across those projects the dominant pattern is composition inside your process: you wire models, tools, and storage using the library's abstractions, then you decide how to enforce per-session ordering, survive restarts, shed load, and run more than one replica. Clustering and sticky sessions are usually application architecture (databases, queues, orchestrators), not one shared runtime primitive baked into the toolkit.

Google ADK for Go sits in that broader ADK ecosystem and tooling story. Atto sits elsewhere: it assumes you want session actors, bounded mailboxes, centralised completion retries, and optional multi-node placement to come from an actor system (GoAkt) behind one stable Runner API, with store.SessionStore as the persistence contract instead of a session service you have to invent.

Atto is aimed at people who want:

  1. A typed, idiomatic Go API. context, streaming results as iter.Seq2[*session.Event, error], and small functional options instead of a framework-shaped DSL.
  2. A runtime that scales to multiple nodes without hand-written distribution code. Placement, supervision, scheduling, and backpressure belong to GoAkt; Atto maps sessions and completions onto actors so runner.Run stays the same whether you pass nil for the actor system or a clustered ActorSystem.
  3. Vendor-neutral models and portable tools. Several llm.LLM adapters ship in-tree; tools are ordinary Go functions described with JSON Schema. MCP-first tooling and A2A wire compatibility are planned goals, not part of the initial release focus.
  4. Fault semantics that are part of the product, not glue code. Examples include retries and cancellation around ModelActor, per-session ordering, bounded backlog (session.ErrSessionBacklogFull on the actor-backed runner), and optimistic concurrency on store.SessionStore.
  5. Production persistence without a separate session microservice. SessionStore is a Go interface with inmemory, bolt, and postgres implementations and a shared storetest suite.

GoAkt already supplies the hard runtime primitives. Atto stays comparatively thin: Agent, LLM, Tool, Event, Runner, plus adapters and stores, rather than reimplementing scheduling or clustering itself.

That intent shows up in day-to-day behaviour:

  • Same user, two concurrent HTTP requests: turns run one after another; history stays coherent.
  • Different users: independent session actors run in parallel with the goakt-backed runner.
  • A caller floods one session: the actor-backed runner returns session.ErrSessionBacklogFull when the bounded stash is full; the sync runner blocks concurrent callers on a per-session mutex. A handling sketch follows this list.
  • Idle chats pile up in RAM: configurable passivation frees memory; the next message reloads from your store.
  • Rolling restart or node loss: a shared store.SessionStore plus sticky placement keeps session state across process boundaries.
  • Race-free persistence: Snapshot.Version provides optimistic concurrency; conflicting writes come back as store.ErrConcurrentWrite.
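
Because Runner.Run yields errors through the same iterator as events, shedding load is a plain errors.Is check. A sketch inside an HTTP handler (w, req, r, sessionID, and prompt are stand-ins; the status-code policy here is ours, not Atto's):

for ev, err := range r.Run(req.Context(), sessionID, session.UserText(prompt)) {
    if errors.Is(err, session.ErrSessionBacklogFull) {
        // This session's bounded stash is full: tell the client to retry.
        http.Error(w, "session busy, retry shortly", http.StatusTooManyRequests)
        return
    }
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    if ev.Kind == session.EventTextDelta {
        fmt.Fprint(w, ev.TextDelta) // stream deltas straight to the response
    }
}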

Architecture at a glance

runner is the façade. runner.New with nil gives you a synchronous runner: per-session serialisation uses an internal mutex, and options like WithStore apply directly. Hand it a goakt ActorSystem (with Atto's extension registered) and you get the distributed runner: Worker goroutines talk to SessionActor for history and commits and ModelActor for retried completions. The Runner interface is the same either way.

internal/actor.SessionActor loads and saves through store.SessionStore, applies persisted turn deltas (CommitTurn), and uses goakt's stash so overlapping snapshot requests wait in line or error cleanly when the stash overflows.

runner.NewModelLLM wraps your raw llm.LLM so streaming completions go through ModelActor, where centralised retry and backoff live. That matters most when you run on a cluster.

store.SessionStore is the contract behind inmemory, bolt, and postgres. The shared storetest suite keeps every backend honest.

Single node, then a cluster: the same Run call

sys, _ := actor.NewActorSystem("agents",
    actor.WithRemote(remote.NewConfig("0.0.0.0", 3330,
        remote.WithSerializables(runner.RemoteSerializables()...))),
    actor.WithCluster(actor.NewClusterConfig().
        WithDiscovery(natsProvider).
        WithKinds(runner.ClusterKinds()...).
        WithMinimumPeersQuorum(2)),
    actor.WithExtensions(runner.Extension(
        runner.ExtensionWithStore(postgresStore),
        runner.ExtensionWithLLM(model),
    )),
)

r, _ := runner.New(sys, rootAgent)

runner.ClusterKinds registers the session and model actor types on the cluster. runner.RemoteSerializables registers every wire message for goakt's remote layer. Use both together so you do not get the classic "works on one node, dies across the cluster" wiring mistake.

Dependencies resolve in PreStart through runner.Extension, so actors can relocate without holding stale pointers.

Examples

Example                          What it shows
examples/quickstart              Runnable tour with a scripted fake model. No API key required.
examples/gemini/single-agent     Live Gemini plus a real HTTP tool.
examples/gemini/multi-agent      Coordinator delegates to a specialist ("agent as tool").

Testing

llm.NewFake replays scripted chunks; tool.Fake records invocations. Together with store/inmemory, they give you deterministic, fast tests. New store backends plug into storetest.Run(t, factory) against the shared contract.
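
Checking a new backend against the contract is a one-test job. A sketch, assuming the factory hands back a fresh store per call; mystore is a placeholder for your package:

import (
    "testing"

    "github.com/tochemey/atto/store"
    "github.com/tochemey/atto/store/storetest"

    "your.module/mystore" // placeholder for your implementation
)

func TestSessionStore(t *testing.T) {
    storetest.Run(t, func() store.SessionStore {
        return mystore.New() // assumed to return a fresh, empty store
    })
}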

Roadmap and status

The five-concept public API is the stability target for v0.1.0. Cluster mode, three native model adapters, three in-tree stores, and the model actor are already in tree; details live in CHANGELOG.md. Workflow agents (Sequential, Parallel, Loop), an AgentTool helper for native sub-agents, and OTel spans are tentatively slated for v0.2.

Security

See SECURITY.md for the disclosure process.

Community

  • GitHub Discussions
  • GitHub Issues

Contributing

Bug fixes, adapters, and stores are welcome. Atto uses Conventional Commits and runs go test -race, go vet, and golangci-lint as in CI. See CONTRIBUTING.md.

Why the name?

In Italian, atto is a noun meaning an act, deed, or action. That fits the spirit of AI agents: they are useful when they can do more than produce text, such as call tools, update state, and carry work forward for a user.

atto is also the metric prefix for 10^-18. That second meaning fits the engineering goal: keep the public surface tiny, let the runtime carry the weight, and make the smallest useful abstraction feel sharp enough for production work.
