Saker

A source-available agent runtime for creative work. Prompt, generate, review, and automate — all in a single binary.


Prompting → Media generation → Review & automation

What is Saker

Saker fuses three things that creative teams usually run as separate stacks — an agent runtime, a Web workspace, and a browser-based video editor — into a single Go binary. It owns the full creative loop: write a prompt, plan the work, generate media, review it, and ship it through automation or messaging channels. Everything is local-first, embedded, and works the same way whether you launch it as a CLI, a TUI, an HTTP server, an IM bot, or a Wails desktop app.

Why Saker

Problem	Typical approach	Saker
Creative pipeline scattered across tools	Separate backends for prompting, generation, and editing	One binary that embeds the workspace, editor, runtime, and gateways
Sandbox is either insecure or impractical	Docker-only or host-only	Five backends — host, Landlock, gVisor, Docker, govm — selected per host
Vendor lock-in for models and tools	One provider, one tool table	Multi-provider with failover and routing; 33 builtin tools, MCP servers, remote tools
Hard to deploy multi-tenant on a server	Local CLI only	Built-in OAuth/LDAP/Bearer auth, CSRF, CORS, SSRF guards, path-traversal hardening, per-project scopes
Observability bolted on later	Wire OTel after the fact	Prometheus metrics, structured slog, OTel spans, request IDs through the stack out of the box

Quick start

Prerequisites

Go 1.26 or newer
Node.js 22 or newer
pnpm (the repo is a pnpm workspace covering web/, web-editor-next/, packages/)
Docker (optional — required by Docker / govm sandbox backends and the e2e suite)

Build and run

git clone https://github.com/saker-ai/saker.git
cd saker

pnpm install                      # frontend dependencies
make run                          # build frontends, embed, start server on :10112

Open http://localhost:10112 for the workspace and http://localhost:10112/editor/ for the video editor.

CLI usage

make saker                        # build the CLI

export ANTHROPIC_API_KEY=sk-ant-...
./bin/saker --print "Draft a 30-second product video concept"
./bin/saker                       # interactive TUI

Frontend dev mode

make web-dev                      # workspace at http://localhost:10111
make web-editor-dev               # editor dev server

Features

Agent runtime

Capability	Notes
Core loop	Iteration cap, deadline, classified `StopReason` (`completed` / `max_iterations` / `max_budget` / `max_tokens` / `repeat_loop` / aborted variants / `model_error`)
Budget guard	Aborts on cumulative cost or token ceiling
Loop detection	Halts on identical repeated tool calls; optional self-correction
SSE streaming	Anthropic-compatible SSE with agent-specific event extensions
Session history	In-memory ring buffer (default 1000 turns, configurable)
Context compaction	`compact` and `microcompact` strategies, prompt summarisation, history trimming
Profiles	Named profiles isolate settings, memory, and history
Subagents	Forked sub-runtimes with optional git worktree, transcript streamed back
Checkpoints	Resumable session/run state via memory or file backend

Models

Backed by Bifrost (pkg/model/bifrost_adapter.go); Saker's Provider types wrap Bifrost as the SDK-level engine.

Capability	Notes
Providers	23+ via Bifrost: Anthropic, OpenAI, AWS Bedrock, Google Vertex, Azure OpenAI, Ollama, Cohere, Mistral, Groq, Gemini, XAI, DashScope (via OpenAI-compatible), Cerebras, Fireworks, OpenRouter, HuggingFace, Replicate, …
Auth	API key, AWS IAM (Bedrock), GCP service account or IAM role (Vertex), Azure API key or client_secret/tenant (Azure)
Failover	Bifrost SDK-level `Fallbacks` cross-provider routing; observer plugin emits per-request switch events
Prompt caching	System and recent-message ephemeral cache_control on Anthropic / Bedrock-Anthropic
Observability	Optional `ObservationSink` plugin captures per-request provider/model/usage/duration; OTel span attrs include cache & total tokens

Tools (33 builtin + memory + MCP)


Group	Tools
`core_io`	`bash`, `file_read`, `file_write`, `file_edit`, `grep`, `glob`
`bash_mgmt`	`bash_output`, `bash_status`, `kill_task`
`task_mgmt`	`task` (subagent spawn), `task_create`, `task_list`, `task_get`, `task_update`
`web`	`web_fetch`, `web_search`
`media`	`image_read`, `video_sampler`, `stream_capture`, `stream_monitor`, `frame_analyzer`, `video_summarizer`, `analyze_video`, `media_index`, `media_search`
`interaction`	`ask_user_question`, `skill`, `slash_command`
`canvas`	`canvas_get_node`, `canvas_list_nodes`, `canvas_table_write`
`browser`	`browser` (chromedp), `webhook` (SSRF-safe)
(auto)	`memory_save`, `memory_read` (enabled when `MemoryDir` is set)

Each runtime mode selects a curated subset via mode presets:

Preset	Groups	Use case
`cli`	core_io, bash_mgmt, task_mgmt, web, media, interaction	Interactive terminal / TUI
`server_web`	core_io, bash_mgmt, task_mgmt, web, media, canvas, browser	Web workspace with UI
`server_api`	core_io, bash_mgmt, task_mgmt, web, media, interaction	API-only backend (no canvas/browser)
`ci`	core_io, bash_mgmt	CI pipelines (minimal)

Override the preset with Options.ModePreset or the --api-only flag. Filter further with Options.EnabledBuiltinTools (allowlist) or Options.DisallowedTools (denylist). MCP and remote tools register on top of the preset.

Source of truth: pkg/api/tool_groups.go, pkg/api/runtime_tools_register.go.
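
To make the preset and filter rules concrete, the sketch below resolves a preset to its tool names and then applies an allowlist and a denylist. The group and preset contents are copied from the tables above; `Resolve` and its exact filter semantics (an empty allowlist means "no restriction", the denylist always wins) are assumptions, so treat pkg/api/tool_groups.go as authoritative.

```go
package main

import (
	"fmt"
	"sort"
)

// groups and presets are transcribed from the tables above.
var groups = map[string][]string{
	"core_io":     {"bash", "file_read", "file_write", "file_edit", "grep", "glob"},
	"bash_mgmt":   {"bash_output", "bash_status", "kill_task"},
	"task_mgmt":   {"task", "task_create", "task_list", "task_get", "task_update"},
	"web":         {"web_fetch", "web_search"},
	"media":       {"image_read", "video_sampler", "stream_capture", "stream_monitor", "frame_analyzer", "video_summarizer", "analyze_video", "media_index", "media_search"},
	"interaction": {"ask_user_question", "skill", "slash_command"},
	"canvas":      {"canvas_get_node", "canvas_list_nodes", "canvas_table_write"},
	"browser":     {"browser", "webhook"},
}

var presets = map[string][]string{
	"cli":        {"core_io", "bash_mgmt", "task_mgmt", "web", "media", "interaction"},
	"server_web": {"core_io", "bash_mgmt", "task_mgmt", "web", "media", "canvas", "browser"},
	"server_api": {"core_io", "bash_mgmt", "task_mgmt", "web", "media", "interaction"},
	"ci":         {"core_io", "bash_mgmt"},
}

func toSet(xs []string) map[string]bool {
	s := make(map[string]bool, len(xs))
	for _, x := range xs {
		s[x] = true
	}
	return s
}

// Resolve is a hypothetical helper: it expands a preset, keeps only tools in
// enabled (when enabled is non-empty), and drops anything in disallowed.
func Resolve(preset string, enabled, disallowed []string) []string {
	allow, deny := toSet(enabled), toSet(disallowed)
	var out []string
	for _, g := range presets[preset] {
		for _, t := range groups[g] {
			if len(allow) > 0 && !allow[t] {
				continue
			}
			if deny[t] {
				continue
			}
			out = append(out, t)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(Resolve("ci", nil, []string{"bash"}))
	fmt.Println(len(Resolve("server_web", nil, nil)))
}
```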

Sandbox & security

Capability	Notes
Five backends	`host`, `landlock` (LSM, helper process), `gvisor` (runsc, helper process), `docker` (network off by default), `govm` (microVM via `godeps/govm`)
Filesystem policy	Allow / deny lists with path mapping (`pkg/sandbox/pathmap`) and `O_NOFOLLOW` opens
SSRF guard	Blocks loopback, private ranges, link-local, metadata endpoints, plus DNS-rebinding safe close
Leak detection	Regex-based secret scanning with severity, masking, and cleanup
Permission matrix	Per-tool `allow / deny / ask` rules from `permissions.json`, runtime resolver, and approval prompts
Auth	Local credentials, OIDC, LDAP, Bearer tokens; per-project / per-user scope middleware

Canvas & media

DAG document with typed nodes and edges (flow / reference / context)
40+ node types (Agent, AI, Audio, Composition, Export, ImageGen, LLM, Mask, Prompt, VideoGen, VoiceGen, …)
Topological executor that dispatches generation nodes back into the agent runtime
Media index with keyframes and chromem-go vector embeddings; full-text and semantic search
Audio transcription, video summarisation, and frame-level analysis pipelines
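
A topological executor needs an ordering in which every node runs after the nodes it depends on. The sketch below uses Kahn's algorithm as one plausible way to get that ordering; it is an illustration and not the canvas package's code. `Edge`, `TopoOrder`, and the node IDs are hypothetical, and the three edge kinds (flow / reference / context) are not modelled.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Edge points from a node to one that depends on its output.
type Edge struct{ From, To string }

// TopoOrder returns node IDs so that every edge's From precedes its To.
func TopoOrder(nodes []string, edges []Edge) ([]string, error) {
	indeg := make(map[string]int, len(nodes))
	next := make(map[string][]string)
	for _, n := range nodes {
		indeg[n] = 0
	}
	for _, e := range edges {
		if _, ok := indeg[e.From]; !ok {
			return nil, fmt.Errorf("unknown node %q", e.From)
		}
		if _, ok := indeg[e.To]; !ok {
			return nil, fmt.Errorf("unknown node %q", e.To)
		}
		indeg[e.To]++
		next[e.From] = append(next[e.From], e.To)
	}
	var ready []string
	for _, n := range nodes {
		if indeg[n] == 0 {
			ready = append(ready, n)
		}
	}
	var order []string
	for len(ready) > 0 {
		sort.Strings(ready) // deterministic order among independent nodes
		n := ready[0]
		ready = ready[1:]
		order = append(order, n)
		for _, m := range next[n] {
			indeg[m]--
			if indeg[m] == 0 {
				ready = append(ready, m)
			}
		}
	}
	if len(order) != len(nodes) {
		return nil, errors.New("canvas graph contains a cycle")
	}
	return order, nil
}

func main() {
	nodes := []string{"export", "composition", "voice_gen", "image_gen", "prompt"}
	edges := []Edge{
		{"prompt", "image_gen"}, {"prompt", "voice_gen"},
		{"image_gen", "composition"}, {"voice_gen", "composition"},
		{"composition", "export"},
	}
	fmt.Println(TopoOrder(nodes, edges))
}
```

Nodes that become ready at the same time are independent of each other, so a real executor could dispatch them concurrently instead of sorting them.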

Browser video editor

Capability	Notes
Timeline	Multi-track audio / video / text / effects
Animation	Keyframes with Bezier interpolation
Effects	Registry, per-effect components, parameter channel animation
Subtitles	ASS / SRT parse / build / insert
Transcription	LLM-driven audio transcription with diagnostics
Preview	Render overlay, zoom, grid, snap
WASM rendering	Browser-side media rendering via WebAssembly
History	Command-pattern undo / redo with clipboard support
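
The command-pattern history in the last row can be sketched as two stacks. The editor itself runs in the browser, so this Go version only illustrates the pattern and is not the editor's code; `Command`, `History`, and `MoveClip` are hypothetical names.

```go
package main

import "fmt"

// Command is a reversible edit.
type Command interface {
	Do()
	Undo()
}

// History keeps executed commands and the ones that were undone.
type History struct {
	done, undone []Command
}

// Exec runs c and clears the redo stack, since a new edit forks the timeline.
func (h *History) Exec(c Command) {
	c.Do()
	h.done = append(h.done, c)
	h.undone = nil
}

// Undo reverts the most recent command; it reports false when nothing is left.
func (h *History) Undo() bool {
	if len(h.done) == 0 {
		return false
	}
	c := h.done[len(h.done)-1]
	h.done = h.done[:len(h.done)-1]
	c.Undo()
	h.undone = append(h.undone, c)
	return true
}

// Redo reapplies the most recently undone command.
func (h *History) Redo() bool {
	if len(h.undone) == 0 {
		return false
	}
	c := h.undone[len(h.undone)-1]
	h.undone = h.undone[:len(h.undone)-1]
	c.Do()
	h.done = append(h.done, c)
	return true
}

// MoveClip shifts a clip's start time on the timeline by DeltaMs.
type MoveClip struct {
	StartMs *int
	DeltaMs int
}

func (m MoveClip) Do()   { *m.StartMs += m.DeltaMs }
func (m MoveClip) Undo() { *m.StartMs -= m.DeltaMs }

func main() {
	start := 1000
	var h History
	h.Exec(MoveClip{StartMs: &start, DeltaMs: 500})
	h.Undo()
	h.Redo()
	fmt.Println(start)
}
```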

Derived from OpenCut (MIT). Asset attributions live in web-editor-next/ASSET_LICENSES.md.

IM gateway

Saker can bridge to ten chat platforms so users interact with the agent through the apps they already use:

telegram · feishu · discord · slack · dingtalk · wecom · qq · qqbot · line · weixin

./bin/saker --gateway telegram --gateway-token "<bot-token>"
./bin/saker --gateway-config gateway.toml          # multi-platform

Channels can also be configured from the TUI (im_config tool) or the workspace settings panel.

Architecture

For per-package roles see docs/architecture.md.

Data flow (one request)

Surface — CLI/TUI/HTTP/IM/ACP entry point parses input and resolves a profile.
Runtime — pkg/api.Runtime loads settings, builds the sandbox, registers builtin + MCP + remote tools, attaches persona / memory / sessiondb / skills / subagents / cache.
Loop — pkg/agent.Agent.Run iterates until a StopReason fires; budget, loop-detect, and compaction guard the loop.
Model — pkg/model provider wraps Bifrost; cross-provider failover is handled by Bifrost SDK-level Fallbacks. Calls are instrumented by pkg/metrics, the optional ObservationSink plugin, and (when built with -tags otel) by pkg/api/otel.go.
Tool — the call's permission is resolved, the PreToolUse hook runs, and the call is dispatched to a builtin / MCP / remote tool. File-touching tools cross the pkg/sandbox boundary.
Stream — results flow back as StreamEvents for SSE / WebSocket clients, the TUI waterfall, or the IM gateway.

Repository structure

saker/
├── cmd/                  # CLI dispatcher (cmd/saker) and Wails desktop (cmd/desktop)
├── pkg/                  # Go runtime: api, agent, model, tool, runtime, server, sandbox, security,
│                         # canvas, pipeline, media, artifact, sessiondb, memory, persona, project,
│                         # storage, config, middleware, metrics, clikit, mcp, acp, im, skillhub …
├── web/                  # Next.js 16 web workspace (saker-web)
├── web-editor-next/      # Browser video editor derived from OpenCut (saker-web-editor)
├── packages/             # Shared TS workspace packages (editor-protocol)
├── examples/             # 20 numbered examples (01-basic … 20-realtime-video)
├── test/                 # Integration, pipeline, runtime, security suites
├── e2e/                  # Docker-based end-to-end suites
├── eval/                 # Eval framework (offline + LLM + Terminal-Bench)
├── docs/                 # Documentation, ADRs, diagrams (mermaid + rendered SVG)
├── bench/                # Benchmark baselines
└── scripts/              # Repo maintenance scripts

Documentation

Document	Description
Overview	High-level summary
Architecture	Detailed mermaid architecture and request sequence
Development guide	Local dev workflow, tests, conventions
Configuration	Settings, profiles, env vars
Deployment	Production deployment notes
Security model	Threat model and defences
Observability	Metrics, logs, OTel
Testing	Test taxonomy and harness
API reference	REST / WS / SSE surface
ADRs	Architecture decision records
Security policy	Reporting vulnerabilities
Third-party notices	Dependency licenses
Roadmap	Planned work
Changelog	Release history

Development

make test-short        # quick subset, dev loop
make test-unit         # unit tests with race detector
make test-pipeline     # pipeline integration tests
make lint              # golangci-lint
make bench             # benchmarks → bench/baseline.txt

make server-dev        # Go-only dev server (no embedded frontend)
make server            # full build + embed + serve
make build             # composite production build (web + editor + binary)
make diagrams          # re-render docs/images/*.svg from docs/diagrams/*.mmd

Frontend checks:

pnpm --filter saker-web        run test
pnpm --filter saker-web        run build
pnpm --filter saker-web-editor run build

Configuration

Project-local runtime state lives under .saker/ (git-ignored).

ANTHROPIC_API_KEY=    # Anthropic
OPENAI_API_KEY=       # OpenAI
DASHSCOPE_API_KEY=    # DashScope (via OpenAI-compatible)
SAKER_MODEL=          # Default model, e.g. claude-sonnet-4-5-20250929

# Optional — Bifrost-backed providers
AWS_REGION=           # Bedrock (or AWS_DEFAULT_REGION); IAM role used when AWS_ACCESS_KEY_ID empty
GOOGLE_CLOUD_PROJECT= # Vertex; GOOGLE_CLOUD_REGION optional (defaults us-central1)
AZURE_OPENAI_ENDPOINT=  # Azure OpenAI; pair with AZURE_OPENAI_API_KEY or client_secret tuple
OLLAMA_BASE_URL=      # Ollama (defaults http://localhost:11434)
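
As an illustration of the defaults and fallbacks noted in the comments above, the sketch below reads the same variables with a small helper. It is not Saker's config loader: `ProviderEnv`, `Load`, and `firstEnv` are hypothetical, and preferring AWS_REGION over AWS_DEFAULT_REGION when both are set is an assumption.

```go
package main

import (
	"fmt"
	"os"
)

// firstEnv returns the first non-empty value among keys, or def if none is set.
func firstEnv(get func(string) string, def string, keys ...string) string {
	for _, k := range keys {
		if v := get(k); v != "" {
			return v
		}
	}
	return def
}

// ProviderEnv holds a subset of the settings listed above.
type ProviderEnv struct {
	Model         string
	AWSRegion     string
	VertexRegion  string
	OllamaBaseURL string
}

// Load reads the variables through get so the lookup can be swapped in tests.
func Load(get func(string) string) ProviderEnv {
	return ProviderEnv{
		Model:         firstEnv(get, "", "SAKER_MODEL"),
		AWSRegion:     firstEnv(get, "", "AWS_REGION", "AWS_DEFAULT_REGION"),
		VertexRegion:  firstEnv(get, "us-central1", "GOOGLE_CLOUD_REGION"),
		OllamaBaseURL: firstEnv(get, "http://localhost:11434", "OLLAMA_BASE_URL"),
	}
}

func main() {
	cfg := Load(os.Getenv)
	fmt.Printf("%+v\n", cfg)
}
```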

Server authentication:

./bin/saker --auth-user admin --auth-pass '<password>'
./bin/saker --server

Contributing

Issues and pull requests are welcome. Run the relevant tests and builds before submitting and document setup steps in the PR description. See CONTRIBUTING.md.

License

Saker is released under the Saker Source License Version 1.0 (SKL-1.0) — source-available, based on Apache 2.0 with additional terms.

Scenario	Terms
Small teams & individuals	Free for production if annual revenue ≤ ¥1,000,000 and registered users ≤ 100
Commercial license required	Annual revenue > ¥1,000,000 or registered users > 100
Non-production use	Always free — evaluation, testing, development, learning, research
Derivative works	Must display "Powered by Saker.cc" in product UI and documentation

Commercial licensing: cinience@hotmail.com.

Upstream notices live in NOTICE; dependency licenses in docs/third-party-notices.md.
Code under web-editor-next/ is derived from OpenCut (MIT); asset credits in web-editor-next/ASSET_LICENSES.md.
The godeps/* packages (aigo, goim, govm) are remote Go modules resolved through go.mod, not local directories.

Built by Saker.cc
