A source-available agent runtime for creative work. Prompt, generate, review, and automate — all in a single binary.
Saker fuses three things that creative teams usually run as separate stacks — an agent runtime, a Web workspace, and a browser-based video editor — into a single Go binary. It owns the full creative loop: write a prompt, plan the work, generate media, review it, and ship it through automation or messaging channels. Everything is local-first, embedded, and works the same way whether you launch it as a CLI, a TUI, an HTTP server, an IM bot, or a Wails desktop app.
| Problem | Typical approach | Saker |
|---|---|---|
| Creative pipeline scattered across tools | Separate backends for prompting, generation, and editing | One binary that embeds the workspace, editor, runtime, and gateways |
| Sandbox is either insecure or impractical | Docker-only or host-only | Five backends — host, Landlock, gVisor, Docker, govm — selected per host |
| Vendor lock-in for models and tools | One provider, one tool table | Multi-provider with failover and routing; 33 builtin tools, MCP servers, remote tools |
| Hard to deploy multi-tenant on a server | Local CLI only | Built-in OAuth/LDAP/Bearer auth, CSRF, CORS, SSRF guards, path-traversal hardening, per-project scopes |
| Observability bolted on later | Wire OTel after the fact | Prometheus metrics, structured slog, OTel spans, request IDs through the stack out of the box |
- Go 1.26 or newer
- Node.js 22 or newer
- pnpm (the repo is a pnpm workspace covering web/, web-editor-next/, packages/)
- Docker (optional; required by the Docker / govm sandbox backends and the e2e suite)
git clone https://github.com/saker-ai/saker.git
cd saker
pnpm install # frontend dependencies
make run # build frontends, embed, start server on :10112

Open http://localhost:10112 for the workspace and http://localhost:10112/editor/ for the video editor.

make saker # build the CLI
export ANTHROPIC_API_KEY=sk-ant-...
./bin/saker --print "Draft a 30-second product video concept"
./bin/saker # interactive TUI

make web-dev # workspace at http://localhost:10111
make web-editor-dev # editor dev server

| Capability | Notes |
|---|---|
| Core loop | Iteration cap, deadline, classified StopReason (completed / max_iterations / max_budget / max_tokens / repeat_loop / aborted variants / model_error) |
| Budget guard | Aborts on cumulative cost or token ceiling |
| Loop detection | Halts on identical repeated tool calls; optional self-correction |
| SSE streaming | Anthropic-compatible SSE with agent-specific event extensions |
| Session history | In-memory ring buffer (default 1000 turns, configurable) |
| Context compaction | compact and microcompact strategies, prompt summarisation, history trimming |
| Profiles | Named profiles isolate settings, memory, and history |
| Subagents | Forked sub-runtimes with optional git worktree, transcript streamed back |
| Checkpoints | Resumable session/run state via memory or file backend |
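
The core loop, budget guard, and loop detection rows can be pictured together as one guarded loop. The following is a minimal sketch under stated assumptions, not Saker's implementation: the `Step`, `Limits`, and `Run` names are hypothetical, only a handful of the stop reasons are modelled, and "identical tool call" is reduced to string equality.

```go
package main

import (
	"fmt"
	"time"
)

// StopReason mirrors a few of the reasons named in the table; the Go
// identifiers are illustrative, not Saker's actual definitions.
type StopReason string

const (
	Completed     StopReason = "completed"
	MaxIterations StopReason = "max_iterations"
	MaxBudget     StopReason = "max_budget"
	RepeatLoop    StopReason = "repeat_loop"
	Aborted       StopReason = "aborted"
)

// Step is one model turn: the tool call it requested (empty when the model
// is done) and what the turn cost. Hypothetical type.
type Step struct {
	ToolCall string
	CostUSD  float64
}

// Limits groups the guards checked on every iteration. Hypothetical type.
type Limits struct {
	MaxIterations int
	MaxCostUSD    float64
	MaxRepeats    int       // consecutive repeats of one call that trigger a halt
	Deadline      time.Time // zero value means no deadline
}

// Run drives next until a guard fires or the model stops asking for tools.
func Run(lim Limits, next func(i int) Step) StopReason {
	var spent float64
	var last string
	repeats := 0
	for i := 0; i < lim.MaxIterations; i++ {
		if !lim.Deadline.IsZero() && time.Now().After(lim.Deadline) {
			return Aborted
		}
		step := next(i)
		spent += step.CostUSD
		if spent > lim.MaxCostUSD {
			return MaxBudget
		}
		if step.ToolCall == "" {
			return Completed
		}
		if step.ToolCall == last {
			repeats++
			if repeats >= lim.MaxRepeats {
				return RepeatLoop
			}
		} else {
			last, repeats = step.ToolCall, 0
		}
	}
	return MaxIterations
}

func main() {
	lim := Limits{
		MaxIterations: 10,
		MaxCostUSD:    1.0,
		MaxRepeats:    2,
		Deadline:      time.Now().Add(time.Second),
	}
	// A model that keeps issuing the same tool call trips loop detection.
	reason := Run(lim, func(int) Step {
		return Step{ToolCall: `grep {"pattern":"TODO"}`, CostUSD: 0.01}
	})
	fmt.Println(reason)
}
```

Returning a classified reason instead of a bare error lets callers tell a finished run apart from one that was cut short by a guard.
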
Backed by Bifrost (pkg/model/bifrost_adapter.go); saker Provider types wrap Bifrost as the SDK-level engine.
| Capability | Notes |
|---|---|
| Providers | 23+ via Bifrost: Anthropic, OpenAI, AWS Bedrock, Google Vertex, Azure OpenAI, Ollama, Cohere, Mistral, Groq, Gemini, XAI, DashScope (via OpenAI-compatible), Cerebras, Fireworks, OpenRouter, HuggingFace, Replicate, … |
| Auth | API key, AWS IAM (Bedrock), GCP service account or IAM role (Vertex), Azure API key or client_secret/tenant (Azure) |
| Failover | Bifrost SDK-level Fallbacks cross-provider routing; observer plugin emits per-request switch events |
| Prompt caching | System and recent-message ephemeral cache_control on Anthropic / Bedrock-Anthropic |
| Observability | Optional ObservationSink plugin captures per-request provider/model/usage/duration; OTel span attrs include cache & total tokens |
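
The failover row describes trying another provider when one fails and emitting an event for each switch. The sketch below illustrates that general idea only. It does not use Bifrost's API: the `Provider` interface, `SwitchEvent`, `CompleteWithFallback`, and the `stub` type are all hypothetical names introduced for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider is a hypothetical minimal interface; Bifrost's real types differ.
type Provider interface {
	Name() string
	Complete(prompt string) (string, error)
}

// SwitchEvent records one hop in the fallback chain, in the spirit of the
// per-request switch events mentioned in the table.
type SwitchEvent struct {
	From, To string
	Cause    error
}

// CompleteWithFallback tries providers in order and returns the first
// successful reply, the provider that produced it, and the switches made.
func CompleteWithFallback(prompt string, chain []Provider) (string, string, []SwitchEvent, error) {
	if len(chain) == 0 {
		return "", "", nil, errors.New("no providers configured")
	}
	var events []SwitchEvent
	var errs []error
	for i, p := range chain {
		out, err := p.Complete(prompt)
		if err == nil {
			return out, p.Name(), events, nil
		}
		errs = append(errs, fmt.Errorf("%s: %w", p.Name(), err))
		if i+1 < len(chain) {
			events = append(events, SwitchEvent{From: p.Name(), To: chain[i+1].Name(), Cause: err})
		}
	}
	return "", "", events, errors.Join(errs...)
}

// stub is a stand-in provider that either fails or echoes the prompt.
type stub struct {
	name string
	fail bool
}

func (s stub) Name() string { return s.name }

func (s stub) Complete(prompt string) (string, error) {
	if s.fail {
		return "", errors.New("rate limited")
	}
	return s.name + ": " + prompt, nil
}

func main() {
	chain := []Provider{stub{"anthropic", true}, stub{"openai", false}}
	out, used, events, err := CompleteWithFallback("hello", chain)
	fmt.Println(out, used, len(events), err)
}
```

Joining the per-provider errors keeps the full failure history available when every provider in the chain fails.
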
Registered builtin tools:

| Group | Tools |
|---|---|
| core_io | bash, file_read, file_write, file_edit, grep, glob |
| bash_mgmt | bash_output, bash_status, kill_task |
| task_mgmt | task (subagent spawn), task_create, task_list, task_get, task_update |
| web | web_fetch, web_search |
| media | image_read, video_sampler, stream_capture, stream_monitor, frame_analyzer, video_summarizer, analyze_video, media_index, media_search |
| interaction | ask_user_question, skill, slash_command |
| canvas | canvas_get_node, canvas_list_nodes, canvas_table_write |
| browser | browser (chromedp), webhook (SSRF-safe) |
| (auto) | memory_save, memory_read (enabled when MemoryDir is set) |

Each runtime mode selects a curated subset via mode presets:
| Preset | Groups | Use case |
|---|---|---|
| cli | core_io, bash_mgmt, task_mgmt, web, media, interaction | Interactive terminal / TUI |
| server_web | core_io, bash_mgmt, task_mgmt, web, media, canvas, browser | Web workspace with UI |
| server_api | core_io, bash_mgmt, task_mgmt, web, media, interaction | API-only backend (no canvas/browser) |
| ci | core_io, bash_mgmt | CI pipelines (minimal) |

Override the preset with Options.ModePreset or the --api-only flag. Filter further with Options.EnabledBuiltinTools (whitelist) or Options.DisallowedTools (blacklist). MCP and remote tools register on top of the preset.
Source of truth: pkg/api/tool_groups.go, pkg/api/runtime_tools_register.go.
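
How a preset, a whitelist, and a blacklist might combine into a final tool set can be sketched as below. The group and preset contents are copied from the tables in this section; the `groups`, `presets`, `Resolve`, and `toSet` names are hypothetical and are not taken from pkg/api/tool_groups.go. Two behaviours are assumptions of the sketch: an empty whitelist means "no whitelist", and the blacklist wins when a tool appears in both lists.

```go
package main

import (
	"fmt"
	"sort"
)

// groups copies the tool-group table; names here are illustrative only.
var groups = map[string][]string{
	"core_io":     {"bash", "file_read", "file_write", "file_edit", "grep", "glob"},
	"bash_mgmt":   {"bash_output", "bash_status", "kill_task"},
	"task_mgmt":   {"task", "task_create", "task_list", "task_get", "task_update"},
	"web":         {"web_fetch", "web_search"},
	"media":       {"image_read", "video_sampler", "stream_capture", "stream_monitor", "frame_analyzer", "video_summarizer", "analyze_video", "media_index", "media_search"},
	"interaction": {"ask_user_question", "skill", "slash_command"},
	"canvas":      {"canvas_get_node", "canvas_list_nodes", "canvas_table_write"},
	"browser":     {"browser", "webhook"},
}

// presets copies the mode-preset table.
var presets = map[string][]string{
	"cli":        {"core_io", "bash_mgmt", "task_mgmt", "web", "media", "interaction"},
	"server_web": {"core_io", "bash_mgmt", "task_mgmt", "web", "media", "canvas", "browser"},
	"server_api": {"core_io", "bash_mgmt", "task_mgmt", "web", "media", "interaction"},
	"ci":         {"core_io", "bash_mgmt"},
}

// toSet turns a slice of names into a lookup set.
func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

// Resolve expands a preset into tool names, keeps only whitelisted names
// (when enabled is non-empty), then drops anything in disallowed.
func Resolve(preset string, enabled, disallowed []string) ([]string, error) {
	groupNames, ok := presets[preset]
	if !ok {
		return nil, fmt.Errorf("unknown preset %q", preset)
	}
	allow, deny := toSet(enabled), toSet(disallowed)
	var tools []string
	for _, g := range groupNames {
		for _, t := range groups[g] {
			if len(allow) > 0 && !allow[t] {
				continue
			}
			if deny[t] {
				continue
			}
			tools = append(tools, t)
		}
	}
	sort.Strings(tools)
	return tools, nil
}

func main() {
	tools, err := Resolve("ci", nil, []string{"bash"})
	fmt.Println(tools, err)
}
```
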
| Capability | Notes |
|---|---|
| Five backends | host, landlock (LSM, helper process), gvisor (runsc, helper process), docker (network off by default), govm (microVM via godeps/govm) |
| Filesystem policy | Allow / deny lists with path mapping (pkg/sandbox/pathmap) and O_NOFOLLOW opens |
| SSRF guard | Blocks loopback, private ranges, link-local, metadata endpoints, plus DNS-rebinding safe close |
| Leak detection | Regex-based secret scanning with severity, masking, and cleanup |
| Permission matrix | Per-tool allow / deny / ask rules from permissions.json, runtime resolver, and approval prompts |
| Auth | Local credentials, OIDC, LDAP, Bearer tokens; per-project / per-user scope middleware |
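
The address classes listed in the SSRF guard row map closely onto predicates in Go's standard net/netip package. The sketch below shows one way such a check could look; `blockedAddr` and `CheckResolved` are hypothetical names, and this is not Saker's guard. It only classifies addresses that have already been resolved. Defending against DNS rebinding additionally requires applying the check to the address that is actually dialed, which the sketch leaves out.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// blockedAddr reports whether ip falls in a range an SSRF guard would
// typically refuse.
func blockedAddr(ip netip.Addr) bool {
	ip = ip.Unmap() // treat ::ffff:127.0.0.1 like 127.0.0.1
	return ip.IsLoopback() ||
		ip.IsPrivate() || // 10/8, 172.16/12, 192.168/16, fc00::/7
		ip.IsLinkLocalUnicast() || // includes the 169.254.169.254 metadata address
		ip.IsLinkLocalMulticast() ||
		ip.IsUnspecified()
}

// CheckResolved rejects a host if any address it resolved to is blocked.
// Checking every address avoids accepting a host that mixes public and
// internal records.
func CheckResolved(host string, addrs []string) error {
	if len(addrs) == 0 {
		return errors.New("no addresses to check")
	}
	for _, a := range addrs {
		ip, err := netip.ParseAddr(a)
		if err != nil {
			return fmt.Errorf("%s: invalid address %q: %w", host, a, err)
		}
		if blockedAddr(ip) {
			return fmt.Errorf("%s resolves to blocked address %s", host, ip)
		}
	}
	return nil
}

func main() {
	fmt.Println(CheckResolved("example.com", []string{"93.184.216.34"}))
	fmt.Println(CheckResolved("metadata.internal", []string{"169.254.169.254"}))
	fmt.Println(CheckResolved("mixed.example", []string{"93.184.216.34", "10.0.0.5"}))
}
```
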
- DAG document with typed nodes and edges (flow / reference / context)
- 40+ node types (Agent, AI, Audio, Composition, Export, ImageGen, LLM, Mask, Prompt, VideoGen, VoiceGen, …)
- Topological executor that dispatches generation nodes back into the agent runtime
- Media index with keyframes and chromem-go vector embeddings; full-text and semantic search
- Audio transcription, video summarisation, and frame-level analysis pipelines
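
A topological executor needs an order in which every node runs after the nodes it depends on. The sketch below uses Kahn's algorithm to compute such an order and to reject cycles. It is an illustration under assumptions, not the canvas executor itself: `Edge` and `TopoOrder` are hypothetical names, node names are assumed unique, and every edge is treated as a dependency regardless of whether it is a flow, reference, or context edge.

```go
package main

import (
	"errors"
	"fmt"
)

// Edge points from a node to one that depends on its output.
type Edge struct{ From, To string }

// TopoOrder returns the nodes in an order where every edge's From comes
// before its To (Kahn's algorithm). It fails if the graph has a cycle.
func TopoOrder(nodes []string, edges []Edge) ([]string, error) {
	indeg := make(map[string]int, len(nodes))
	for _, n := range nodes {
		indeg[n] = 0
	}
	next := make(map[string][]string)
	for _, e := range edges {
		if _, ok := indeg[e.From]; !ok {
			return nil, fmt.Errorf("edge from unknown node %q", e.From)
		}
		if _, ok := indeg[e.To]; !ok {
			return nil, fmt.Errorf("edge to unknown node %q", e.To)
		}
		next[e.From] = append(next[e.From], e.To)
		indeg[e.To]++
	}
	// Seed the queue from the slice, not the map, so the order is stable.
	var queue []string
	for _, n := range nodes {
		if indeg[n] == 0 {
			queue = append(queue, n)
		}
	}
	var order []string
	for len(queue) > 0 {
		n := queue[0]
		queue = queue[1:]
		order = append(order, n)
		for _, m := range next[n] {
			indeg[m]--
			if indeg[m] == 0 {
				queue = append(queue, m)
			}
		}
	}
	if len(order) != len(nodes) {
		return nil, errors.New("graph contains a cycle")
	}
	return order, nil
}

func main() {
	nodes := []string{"prompt", "imagegen", "videogen", "export"}
	edges := []Edge{
		{"prompt", "imagegen"},
		{"prompt", "videogen"},
		{"imagegen", "videogen"},
		{"videogen", "export"},
	}
	fmt.Println(TopoOrder(nodes, edges))
}
```

An executor built on this order could dispatch each node as soon as it leaves the queue, which is also the point where independent nodes could be run in parallel.
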
| Capability | Notes |
|---|---|
| Timeline | Multi-track audio / video / text / effects |
| Animation | Keyframes with Bezier interpolation |
| Effects | Registry, per-effect components, parameter channel animation |
| Subtitles | ASS / SRT parse / build / insert |
| Transcription | LLM-driven audio transcription with diagnostics |
| Preview | Render overlay, zoom, grid, snap |
| WASM rendering | Browser-side media rendering via WebAssembly |
| History | Command-pattern undo / redo with clipboard support |
Derived from OpenCut (MIT). Asset attributions live in web-editor-next/ASSET_LICENSES.md.
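
The Animation row mentions keyframes with Bezier interpolation. As a rough illustration of the math, the sketch below evaluates a one-dimensional cubic Bezier between two keyframes. It is written in Go to match the runtime's language rather than the editor's own code, and the `Keyframe`, `cubicBezier`, and `Interpolate` names are hypothetical. It also simplifies deliberately: the curve parameter is taken to be normalised time, whereas editors commonly use a two-dimensional easing curve and solve for the parameter.

```go
package main

import "fmt"

// Keyframe pins a value at a point in time (seconds). Hypothetical type.
type Keyframe struct {
	Time, Value float64
}

// cubicBezier evaluates the 1-D cubic Bezier with control values p0..p3
// at parameter u in [0, 1].
func cubicBezier(p0, p1, p2, p3, u float64) float64 {
	v := 1 - u
	return v*v*v*p0 + 3*v*v*u*p1 + 3*v*u*u*p2 + u*u*u*p3
}

// Interpolate returns the animated value at time t between keyframes a and
// b. c1 and c2 are the two inner control values of the segment. Times
// outside the segment clamp to the nearest keyframe.
func Interpolate(a, b Keyframe, c1, c2, t float64) float64 {
	if t <= a.Time {
		return a.Value
	}
	if t >= b.Time {
		return b.Value
	}
	u := (t - a.Time) / (b.Time - a.Time)
	return cubicBezier(a.Value, c1, c2, b.Value, u)
}

func main() {
	a := Keyframe{Time: 0, Value: 0}
	b := Keyframe{Time: 2, Value: 10}
	// Control values equal to the endpoints give an ease-in, ease-out shape.
	for _, t := range []float64{0, 0.5, 1, 1.5, 2} {
		fmt.Printf("t=%.1f value=%.3f\n", t, Interpolate(a, b, 0, 10, t))
	}
}
```
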
Saker can bridge to ten chat platforms so users interact with the agent through the apps they already use:
telegram · feishu · discord · slack · dingtalk · wecom · qq · qqbot · line · weixin
./bin/saker --gateway telegram --gateway-token "<bot-token>"
./bin/saker --gateway-config gateway.toml # multi-platform

Channels can also be configured from the TUI (im_config tool) or the workspace settings panel.

For per-package roles see docs/architecture.md.
- Surface: the CLI/TUI/HTTP/IM/ACP entry point parses input and resolves a profile.
- Runtime: pkg/api.Runtime loads settings, builds the sandbox, registers builtin + MCP + remote tools, and attaches persona / memory / sessiondb / skills / subagents / cache.
- Loop: pkg/agent.Agent.Run iterates until a StopReason fires; budget, loop-detect, and compaction guard the loop.
- Model: the pkg/model provider wraps Bifrost; cross-provider failover is handled by Bifrost SDK-level Fallbacks. Calls are instrumented by pkg/metrics, the optional ObservationSink plugin, and (when built with -tags otel) pkg/api/otel.go.
- Tool: the permission is resolved, the PreToolUse hook runs, and the call is dispatched to a builtin / MCP / remote tool. File-touching tools cross the pkg/sandbox boundary.
- Stream: results flow back as StreamEvents for SSE / WebSocket clients, the TUI waterfall, or the IM gateway.
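
The Tool step in the flow above chains three stages: a permission decision, a pre-execution hook, and the dispatch itself. A minimal sketch of that ordering follows. Every name in it (`Decision`, `Call`, `Dispatcher`, `ErrDenied`) is hypothetical, the sandbox boundary is omitted, and treating tools without a rule as "ask" is an assumption of the sketch rather than documented behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

// Decision is the outcome of a per-tool permission rule.
type Decision int

const (
	Allow Decision = iota
	Deny
	Ask
)

// Call is a tool invocation requested by the model.
type Call struct {
	Tool  string
	Input string
}

// Dispatcher wires the three stages together. All fields are hypothetical.
type Dispatcher struct {
	Rules      map[string]Decision // tools without a rule fall back to Ask
	Approve    func(Call) bool     // consulted when the rule is Ask
	PreToolUse func(Call) error    // a non-nil error vetoes the call
	Tools      map[string]func(input string) (string, error)
}

// ErrDenied marks calls rejected by a rule or by the user.
var ErrDenied = errors.New("tool call denied")

// Dispatch resolves the permission, runs the hook, then invokes the tool.
func (d Dispatcher) Dispatch(c Call) (string, error) {
	rule, ok := d.Rules[c.Tool]
	if !ok {
		rule = Ask
	}
	switch rule {
	case Deny:
		return "", fmt.Errorf("%s: %w", c.Tool, ErrDenied)
	case Ask:
		if d.Approve == nil || !d.Approve(c) {
			return "", fmt.Errorf("%s: %w", c.Tool, ErrDenied)
		}
	}
	if d.PreToolUse != nil {
		if err := d.PreToolUse(c); err != nil {
			return "", fmt.Errorf("PreToolUse hook: %w", err)
		}
	}
	run, ok := d.Tools[c.Tool]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", c.Tool)
	}
	return run(c.Input)
}

func main() {
	d := Dispatcher{
		Rules:   map[string]Decision{"grep": Allow, "bash": Deny},
		Approve: func(Call) bool { return false },
		Tools: map[string]func(string) (string, error){
			"grep": func(in string) (string, error) { return "matched " + in, nil },
			"bash": func(in string) (string, error) { return "ran " + in, nil },
		},
	}
	fmt.Println(d.Dispatch(Call{Tool: "grep", Input: "TODO"}))
	fmt.Println(d.Dispatch(Call{Tool: "bash", Input: "rm -rf /tmp/x"}))
}
```

Running the permission check before the hook means a denied call never reaches hook code, which keeps hooks free of authorisation logic.
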
saker/
├── cmd/ # CLI dispatcher (cmd/saker) and Wails desktop (cmd/desktop)
├── pkg/ # Go runtime: api, agent, model, tool, runtime, server, sandbox, security,
│ # canvas, pipeline, media, artifact, sessiondb, memory, persona, project,
│ # storage, config, middleware, metrics, clikit, mcp, acp, im, skillhub …
├── web/ # Next.js 16 web workspace (saker-web)
├── web-editor-next/ # Browser video editor derived from OpenCut (saker-web-editor)
├── packages/ # Shared TS workspace packages (editor-protocol)
├── examples/ # 20 numbered examples (01-basic … 20-realtime-video)
├── test/ # Integration, pipeline, runtime, security suites
├── e2e/ # Docker-based end-to-end suites
├── eval/ # Eval framework (offline + LLM + Terminal-Bench)
├── docs/ # Documentation, ADRs, diagrams (mermaid + rendered SVG)
├── bench/ # Benchmark baselines
└── scripts/ # Repo maintenance scripts
| Document | Description |
|---|---|
| Overview | High-level summary |
| Architecture | Detailed mermaid architecture and request sequence |
| Development guide | Local dev workflow, tests, conventions |
| Configuration | Settings, profiles, env vars |
| Deployment | Production deployment notes |
| Security model | Threat model and defences |
| Observability | Metrics, logs, OTel |
| Testing | Test taxonomy and harness |
| API reference | REST / WS / SSE surface |
| ADRs | Architecture decision records |
| Security policy | Reporting vulnerabilities |
| Third-party notices | Dependency licenses |
| Roadmap | Planned work |
| Changelog | Release history |
make test-short # quick subset, dev loop
make test-unit # unit tests with race detector
make test-pipeline # pipeline integration tests
make lint # golangci-lint
make bench # benchmarks → bench/baseline.txt
make server-dev # Go-only dev server (no embedded frontend)
make server # full build + embed + serve
make build # composite production build (web + editor + binary)
make diagrams # re-render docs/images/*.svg from docs/diagrams/*.mmd

Frontend checks:

pnpm --filter saker-web run test
pnpm --filter saker-web run build
pnpm --filter saker-web-editor run build

Project-local runtime state lives under .saker/ (git-ignored).

ANTHROPIC_API_KEY= # Anthropic
OPENAI_API_KEY= # OpenAI
DASHSCOPE_API_KEY= # DashScope (via OpenAI-compatible)
SAKER_MODEL= # Default model, e.g. claude-sonnet-4-5-20250929
# Optional — Bifrost-backed providers
AWS_REGION= # Bedrock (or AWS_DEFAULT_REGION); IAM role used when AWS_ACCESS_KEY_ID empty
GOOGLE_CLOUD_PROJECT= # Vertex; GOOGLE_CLOUD_REGION optional (defaults us-central1)
AZURE_OPENAI_ENDPOINT= # Azure OpenAI; pair with AZURE_OPENAI_API_KEY or client_secret tuple
OLLAMA_BASE_URL= # Ollama (defaults http://localhost:11434)

Server authentication:

./bin/saker --auth-user admin --auth-pass '<password>'
./bin/saker --server

Issues and pull requests are welcome. Run the relevant tests and builds before submitting, and document setup steps in the PR description. See CONTRIBUTING.md.

Saker is released under the Saker Source License Version 1.0 (SKL-1.0) — source-available, based on Apache 2.0 with additional terms.
| Scenario | Terms |
|---|---|
| Small teams & individuals | Free for production if annual revenue ≤ ¥1,000,000 and registered users ≤ 100 |
| Commercial license required | Annual revenue > ¥1,000,000 or registered users > 100 |
| Non-production use | Always free — evaluation, testing, development, learning, research |
| Derivative works | Must display "Powered by Saker.cc" in product UI and documentation |
Commercial licensing: cinience@hotmail.com.
- Upstream notices live in NOTICE; dependency licenses in docs/third-party-notices.md.
- Code under web-editor-next/ is derived from OpenCut (MIT); asset credits are in web-editor-next/ASSET_LICENSES.md.
- The godeps/* packages (aigo, goim, govm) are remote Go modules resolved through go.mod, not local directories.