Super AI Graph Ecosystem
A unified Go SDK for streaming AI agents, knowledge graphs, and RAG pipelines.
- Streaming-first agent loop with 15 typed delta events and parallel tool execution
- Conversation tree with branching, checkpoints, rewind, and RLHF feedback
- Sub-agent delegation — child agents as tools, deltas forwarded with attribution
- Human-in-the-loop markers — gate tool execution pending approval
- Knowledge graph construction — LLM-powered entity extraction, fuzzy dedup, temporal tracking
- Multi-retriever RAG — vector + BM25 + graph retrieval fused via Reciprocal Rank Fusion
- Reranking — MMR diversity and cross-encoder scoring built in
- 4 LLM providers (Ollama, OpenAI, Anthropic, Google) behind one Provider interface
- Provider resilience — retry + fallback composition out of the box
- Structured output — constrain LLM responses to JSON schema
Agent orchestration, knowledge graphs, and RAG pipelines are deeply interconnected — RAG benefits from graph retrieval, agents need both for grounded responses, and all three share providers and embedders. saige unifies them under shared Provider, Embedder, and Tool interfaces, eliminating the wiring complexity of combining separate libraries.
go get github.com/urmzd/saige
import (
"github.com/urmzd/saige/agent"
"github.com/urmzd/saige/agent/types"
"github.com/urmzd/saige/agent/provider/ollama"
)
client := ollama.NewClient("http://localhost:11434", "qwen2.5", "nomic-embed-text")
a := agent.NewAgent(agent.AgentConfig{
Name: "assistant",
SystemPrompt: "You are a helpful assistant.",
Provider: ollama.NewAdapter(client),
Tools: types.NewToolRegistry(myTool),
})
stream := a.Invoke(ctx, []types.Message{types.NewUserMessage("Hello!")})
for delta := range stream.Deltas() {
switch d := delta.(type) {
case types.TextContentDelta:
fmt.Print(d.Content)
}
}
import (
"github.com/urmzd/saige/knowledge"
"github.com/urmzd/saige/knowledge/types"
"github.com/urmzd/saige/agent/provider/ollama"
)
client := ollama.NewClient("http://localhost:11434", "qwen2.5", "nomic-embed-text")
graph, _ := knowledge.NewGraph(ctx,
knowledge.WithSurrealDB("ws://localhost:8000", "default", "knowledge", "root", "root"),
knowledge.WithExtractor(knowledge.NewOllamaExtractor(client)),
knowledge.WithEmbedder(knowledge.NewOllamaEmbedder(client)),
)
defer graph.Close(ctx)
graph.IngestEpisode(ctx, &types.EpisodeInput{
Name: "meeting-notes",
Body: "Alice presented the Q4 roadmap. Bob raised concerns about the timeline.",
})
results, _ := graph.SearchFacts(ctx, "Who presented the roadmap?")
import (
"github.com/urmzd/saige/rag"
"github.com/urmzd/saige/rag/types"
"github.com/urmzd/saige/rag/memstore"
)
pipe, _ := rag.NewPipeline(
rag.WithStore(memstore.New()),
rag.WithContentExtractor(myExtractor),
rag.WithEmbedders(myEmbedderRegistry),
rag.WithRecursiveChunker(512, 50),
rag.WithBM25(nil),
rag.WithMMR(0.7),
)
defer pipe.Close(ctx)
pipe.Ingest(ctx, &types.RawDocument{
SourceURI: "https://example.com/paper.pdf",
Data: pdfBytes,
})
result, _ := pipe.Search(ctx, "attention mechanism", types.WithLimit(5))
fmt.Println(result.AssembledContext.Prompt) // context with citations
- agent — AI Agent Framework (providers, deltas, tools, sub-agents, markers, feedback/RLHF, compaction, tree, TUI)
- kg — Knowledge Graph SDK
- rag — RAG Pipeline SDK
- Examples
- Agent Skill
- Architecture
Streaming-first agent loop with parallel tool execution, sub-agent delegation, human-in-the-loop markers, conversation tree persistence, and multi-provider resilience.
Implement one method to integrate any LLM backend:
type Provider interface {
ChatStream(ctx context.Context, messages []Message, tools []ToolDef) (<-chan Delta, error)
}
Built-in providers:
| Provider | Package | Structured Output | Content Negotiation | Embedder |
|---|---|---|---|---|
| Ollama | agent/provider/ollama | yes | JPEG, PNG | yes |
| OpenAI | agent/provider/openai | yes | JPEG, PNG, GIF, WebP, PDF | yes |
| Anthropic | agent/provider/anthropic | yes | JPEG, PNG, GIF, WebP, PDF | — |
| Google | agent/provider/google | yes | JPEG, PNG, GIF, WebP, PDF | yes |
Three roles. Tool results are content blocks, not a separate role.
| Type | Role | Content Types |
|---|---|---|
| SystemMessage | system | TextContent, ToolResultContent, ConfigContent |
| UserMessage | user | TextContent, ToolResultContent, ConfigContent, FileContent |
| AssistantMessage | assistant | TextContent, ToolUseContent |
15 concrete types across six categories: LLM-side, execution-side, marker, feedback, metadata, and terminal:
| Type | Category | Purpose |
|---|---|---|
| TextStartDelta | LLM | Text block opened |
| TextContentDelta | LLM | Text chunk |
| TextEndDelta | LLM | Text block closed |
| ToolCallStartDelta | LLM | Tool call generation started |
| ToolCallArgumentDelta | LLM | JSON argument chunk |
| ToolCallEndDelta | LLM | Tool call complete |
| ToolExecStartDelta | Execution | Tool began executing |
| ToolExecDelta | Execution | Streaming delta from tool/sub-agent |
| ToolExecEndDelta | Execution | Tool finished |
| MarkerDelta | Marker | Tool gated pending approval |
| FeedbackDelta | Feedback | RLHF rating recorded on a node |
| UsageDelta | Metadata | Token usage + wall-clock timing |
| ErrorDelta | Terminal | Provider or tool error |
| DoneDelta | Terminal | Stream complete |
tool := &types.ToolFunc{
Def: types.ToolDef{
Name: "greet",
Description: "Greet a person",
Parameters: types.ParameterSchema{
Type: "object",
Required: []string{"name"},
Properties: map[string]types.PropertyDef{
"name": {Type: "string", Description: "Person's name"},
},
},
},
Fn: func(ctx context.Context, args map[string]any) (string, error) {
return fmt.Sprintf("Hello, %s!", args["name"]), nil
},
}
When the LLM requests multiple tool calls, all tools execute concurrently.
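A minimal sketch of concurrent tool dispatch, assuming nothing about saige's internals: `ToolCall`, `ToolResult`, `ToolFn`, and `ExecuteParallel` are hypothetical names used only for illustration. Each call runs in its own goroutine and writes into its own slot, so results keep the order of the requested calls.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Hypothetical types for illustration; they are not the saige API.
type ToolCall struct {
	ID   string
	Name string
	Args map[string]any
}

type ToolResult struct {
	ID     string
	Output string
	Err    error
}

type ToolFn func(ctx context.Context, args map[string]any) (string, error)

// ExecuteParallel runs every requested call in its own goroutine and
// returns the results in the same order as the calls.
func ExecuteParallel(ctx context.Context, tools map[string]ToolFn, calls []ToolCall) []ToolResult {
	results := make([]ToolResult, len(calls))
	var wg sync.WaitGroup
	for i, call := range calls {
		wg.Add(1)
		go func(i int, call ToolCall) {
			defer wg.Done()
			fn, ok := tools[call.Name]
			if !ok {
				results[i] = ToolResult{ID: call.ID, Err: fmt.Errorf("unknown tool %q", call.Name)}
				return
			}
			out, err := fn(ctx, call.Args)
			// Each goroutine writes only to its own index, so no mutex is needed.
			results[i] = ToolResult{ID: call.ID, Output: out, Err: err}
		}(i, call)
	}
	wg.Wait()
	return results
}

// runDemo dispatches two calls to a single "greet" tool.
func runDemo() []ToolResult {
	tools := map[string]ToolFn{
		"greet": func(ctx context.Context, args map[string]any) (string, error) {
			return fmt.Sprintf("Hello, %s!", args["name"]), nil
		},
	}
	calls := []ToolCall{
		{ID: "id-1", Name: "greet", Args: map[string]any{"name": "Alice"}},
		{ID: "id-2", Name: "greet", Args: map[string]any{"name": "Bob"}},
	}
	return ExecuteParallel(context.Background(), tools, calls)
}

func main() {
	for _, r := range runDemo() {
		fmt.Println(r.ID, r.Output, r.Err)
	}
}
```

An unknown tool name produces an error result for that call only; the other calls still complete.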
Sub-agents are registered as tools and execute within parallel tool dispatch. Their deltas are forwarded through the parent's stream:
a := agent.NewAgent(agent.AgentConfig{
Provider: adapter,
SubAgents: []agent.SubAgentDef{
{
Name: "researcher",
Description: "Searches the web for information",
SystemPrompt: "You are a research assistant.",
Provider: adapter,
Tools: types.NewToolRegistry(searchTool),
},
},
})
Gate tool execution pending consumer approval:
safeTool := types.WithMarkers(myTool,
types.Marker{Kind: "human_approval", Message: "This modifies production data."},
)
// Consumer resolves:
stream.ResolveMarker(d.ToolCallID, approved, nil)
Constrain LLM responses to a JSON schema:
schema := types.SchemaFrom[MyResponse]()
a := agent.NewAgent(agent.AgentConfig{
Provider: adapter,
ResponseSchema: schema,
})
import (
"github.com/urmzd/saige/agent/provider/retry"
"github.com/urmzd/saige/agent/provider/fallback"
)
provider := fallback.New(
retry.New(primary, retry.DefaultConfig()),
retry.New(backup, retry.DefaultConfig()),
)
Data-driven context management:
| Strategy | Behavior |
|---|---|
| CompactNone | No compaction |
| CompactSlidingWindow | Keep system prompt + last N messages |
| CompactSummarize | Summarize older messages via the provider |
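The sliding-window strategy can be sketched as follows. This is an illustration under stated assumptions, not saige's compactor: `Message` and `SlidingWindow` are hypothetical, and a real implementation would likely also need to keep tool calls paired with their results.

```go
package main

import "fmt"

// Message is a hypothetical, simplified message type (not saige's types.Message).
type Message struct {
	Role string
	Text string
}

// SlidingWindow keeps every system message plus the last n non-system messages.
func SlidingWindow(msgs []Message, n int) []Message {
	if n < 0 {
		n = 0
	}
	var system, rest []Message
	for _, m := range msgs {
		if m.Role == "system" {
			system = append(system, m)
		} else {
			rest = append(rest, m)
		}
	}
	if len(rest) > n {
		rest = rest[len(rest)-n:] // drop the oldest turns
	}
	return append(system, rest...)
}

// sampleHistory returns a system prompt followed by four turns.
func sampleHistory() []Message {
	return []Message{
		{Role: "system", Text: "You are a helpful assistant."},
		{Role: "user", Text: "u1"},
		{Role: "assistant", Text: "a1"},
		{Role: "user", Text: "u2"},
		{Role: "assistant", Text: "a2"},
	}
}

func main() {
	for _, m := range SlidingWindow(sampleHistory(), 2) {
		fmt.Printf("%s: %s\n", m.Role, m.Text)
	}
}
```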
Persistent branching conversation graph with checkpoints, rewind, and archive:
tr := a.Tree()
tr.Branch(nodeID, "experiment", msg)
tr.Checkpoint(branchID, "before-refactor")
tr.Rewind(checkpointID)
Attach positive/negative ratings and comments to any node in the conversation tree. Feedback is stored as permanent leaf nodes branching off the target; it is never sent to the LLM and remains available for post-analysis and training.
// Rate an assistant response.
tip, _ := a.Tree().Tip(a.Tree().Active())
a.Feedback(tip.ID, types.RatingPositive, "Clear and helpful")
a.Feedback(tip.ID, types.RatingNegative, "Too verbose")
// Collect all feedback across the tree.
for _, entry := range a.FeedbackSummary() {
fmt.Printf("node=%s rating=%d comment=%q\n",
entry.TargetNodeID, entry.Rating, entry.Comment)
}
Feedback nodes have NodeFeedback state: they cannot have children added, so they form dead-end branches that don't interfere with the conversation flow. During Replay, feedback emits FeedbackDelta for consumers that track ratings.
Automatic URI resolution and content negotiation for multi-modal input:
a := agent.NewAgent(agent.AgentConfig{
Provider: adapter,
Resolvers: map[string]types.Resolver{
"file": myFileResolver,
"s3": myS3Resolver,
},
Extractors: map[types.MediaType]types.Extractor{
types.MediaPDF: myPDFExtractor,
},
})
Two display modes for streaming agent progress:
import "github.com/urmzd/saige/agent/tui"
// Non-interactive (works in pipes/CI)
result := tui.StreamVerbose(header, stream.Deltas(), os.Stdout)
// Interactive (bubbletea)
model := tui.NewStreamModel(header, stream.Deltas())
tea.NewProgram(model).Run()
import "github.com/urmzd/saige/agent/agenttest"
provider := &agenttest.ScriptedProvider{
Responses: [][]types.Delta{
agenttest.ToolCallResponse("id-1", "greet", map[string]any{"name": "Alice"}),
agenttest.TextResponse("Hello, Alice!"),
},
}
Build and query knowledge graphs with LLM-powered entity extraction, fuzzy deduplication, and hybrid search.
type Graph interface {
ApplyOntology(ctx, ontology) error
IngestEpisode(ctx, episode) (*IngestResult, error)
GetEntity(ctx, uuid) (*Entity, error)
SearchFacts(ctx, query, opts...) (*SearchFactsResult, error)
GetGraph(ctx) (*GraphData, error)
GetNode(ctx, uuid, depth) (*NodeDetail, error)
GetFactProvenance(ctx, factID) ([]Episode, error)
Close(ctx) error
}
| Type | Purpose |
|---|---|
| Entity | Node — UUID, Name, Type, Summary, Embedding |
| Relation | Edge — Source/Target UUID, Type, Fact, ValidAt/InvalidAt |
| Fact | Relation with resolved source/target entities |
| Episode | Text input with Name, Body, Source, GroupID, Metadata |
| Ontology | Schema constraints — EntityTypes, RelationTypes |
Combines vector similarity (HNSW) and full-text (BM25) via Reciprocal Rank Fusion:
results, _ := graph.SearchFacts(ctx, "Who works at Acme?",
types.WithLimit(10),
types.WithGroupID("project-alpha"),
)
for _, fact := range knowledge.FactsToStrings(results.Facts) {
fmt.Println(fact) // "Alice -> Acme Corp: works at"
}
- Exact match by (name, type) pair
- Fuzzy match via Levenshtein distance (threshold 0.8)
- Relation dedup by text similarity (threshold 0.92)
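The fuzzy step above can be illustrated with a normalized Levenshtein similarity. This is a sketch, assuming the 0.8 threshold is applied to `1 - distance/maxLen`; the README does not say how saige normalizes the distance, and `Similarity` and `IsDuplicate` are hypothetical names.

```go
package main

import "fmt"

func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// levenshtein returns the edit distance between a and b, counted in runes.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr := make([]int, len(rb)+1)
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			// deletion, insertion, substitution
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev = curr
	}
	return prev[len(rb)]
}

// Similarity maps the distance to [0, 1]; 1 means identical strings.
// The normalization by the longer length is an assumption of this sketch.
func Similarity(a, b string) float64 {
	maxLen := len([]rune(a))
	if l := len([]rune(b)); l > maxLen {
		maxLen = l
	}
	if maxLen == 0 {
		return 1
	}
	return 1 - float64(levenshtein(a, b))/float64(maxLen)
}

// IsDuplicate applies the 0.8 threshold from the list above.
func IsDuplicate(a, b string) bool {
	return Similarity(a, b) >= 0.8
}

func main() {
	fmt.Println(IsDuplicate("Acme Corp", "Acme Corp."))
	fmt.Println(IsDuplicate("Alice", "Bob"))
}
```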
detail, _ := graph.GetNode(ctx, entityUUID, 2) // BFS to depth 2
sub := knowledge.Subgraph(detail) // extract visualization data
Automatic schema provisioning with HNSW vector index (768D cosine), BM25 fulltext indexes, unique constraints, and temporal tracking.
Multi-modal document ingestion with pluggable chunking, retrieval, reranking, and context assembly.
Document (fingerprint for dedup, metadata, source URI)
└── Section[] (ordered by index, optional heading)
└── ContentVariant[] (text, image, table, audio — each with bytes, embedding, MIME)
Every ContentVariant has a .Text field that is always populated, enabling uniform search and entity extraction.
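The hierarchy above might be modelled roughly as below. The struct and field names are hypothetical mirrors of the diagram, not the definitions in rag/types. The point of the sketch is that an always-populated `Text` field lets one loop treat text, image, and table variants uniformly.

```go
package main

import "fmt"

// Hypothetical mirrors of the document model; not the actual rag/types definitions.
type ContentVariant struct {
	MIME      string
	Text      string // always populated, even for non-text variants
	Data      []byte
	Embedding []float32
}

type Section struct {
	Index    int
	Heading  string // optional
	Variants []ContentVariant
}

type Document struct {
	SourceURI   string
	Fingerprint string // used for dedup
	Sections    []Section
}

// AllText collects the Text of every variant, walking sections in slice order
// (assumed here to already be sorted by Index).
func AllText(doc Document) []string {
	var out []string
	for _, s := range doc.Sections {
		for _, v := range s.Variants {
			out = append(out, v.Text)
		}
	}
	return out
}

// sampleDocument builds one section holding a text and an image variant.
func sampleDocument() Document {
	return Document{
		SourceURI: "https://example.com/paper.pdf",
		Sections: []Section{{
			Index:   0,
			Heading: "Results",
			Variants: []ContentVariant{
				{MIME: "text/plain", Text: "Revenue grew in Q4."},
				{MIME: "image/png", Text: "Bar chart of Q4 revenue", Data: []byte{0x89}},
			},
		}},
	}
}

func main() {
	for _, t := range AllText(sampleDocument()) {
		fmt.Println(t)
	}
}
```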
type Pipeline interface {
Ingest(ctx, raw) (*IngestResult, error)
Search(ctx, query, opts...) (*SearchPipelineResult, error)
Lookup(ctx, variantUUID) (*SearchHit, error)
Update(ctx, documentUUID, raw) (*IngestResult, error)
Delete(ctx, documentUUID) error
Reconstruct(ctx, documentUUID) (*Document, error)
Close(ctx) error
}
| Strategy | Description |
|---|---|
| Recursive | Tries separators ("\n\n", "\n", ". ", " ") with configurable overlap |
| Semantic | Splits where embedding similarity drops below threshold |
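A simplified sketch of the recursive strategy, under stated assumptions: sizes are counted in bytes (so the input is treated as ASCII), overlap is taken as the tail of the previous chunk, and `Chunk` and `splitRecursive` are hypothetical names rather than saige's chunker.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRecursive breaks text into pieces of at most maxSize bytes, trying the
// coarsest separator first and falling back to finer ones, then to a hard cut.
func splitRecursive(text string, seps []string, maxSize int) []string {
	if len(text) <= maxSize {
		return []string{text}
	}
	if len(seps) == 0 {
		var out []string
		for len(text) > maxSize {
			out = append(out, text[:maxSize])
			text = text[maxSize:]
		}
		if text != "" {
			out = append(out, text)
		}
		return out
	}
	var out []string
	for _, part := range strings.SplitAfter(text, seps[0]) {
		if part == "" {
			continue
		}
		out = append(out, splitRecursive(part, seps[1:], maxSize)...)
	}
	return out
}

// Chunk greedily merges the pieces into chunks of at most maxSize bytes.
// Each new chunk starts with up to `overlap` trailing bytes of the previous one.
func Chunk(text string, maxSize, overlap int) []string {
	if maxSize <= 0 {
		return []string{text}
	}
	if overlap < 0 {
		overlap = 0
	}
	pieces := splitRecursive(text, []string{"\n\n", "\n", ". ", " "}, maxSize)
	var chunks []string
	current := ""
	for _, p := range pieces {
		if current != "" && len(current)+len(p) > maxSize {
			chunks = append(chunks, current)
			tail := current
			if overlap < len(tail) {
				tail = tail[len(tail)-overlap:]
			}
			if len(tail)+len(p) > maxSize {
				tail = "" // drop the overlap rather than exceed maxSize
			}
			current = tail
		}
		current += p
	}
	if current != "" {
		chunks = append(chunks, current)
	}
	return chunks
}

func main() {
	for _, c := range Chunk("aaaa bbbb cccc dddd", 10, 5) {
		fmt.Printf("%q\n", c)
	}
}
```

With an overlap of 0, concatenating the chunks reproduces the input exactly, since separators are kept at the end of each piece.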
rag.WithRecursiveChunker(512, 50) // maxSize, overlap
rag.WithSemanticChunker(0.1, 100, 1000) // threshold, minSize, maxSize
| Retriever | Description |
|---|---|
| Vector | Embed query, cosine similarity search |
| BM25 | In-memory inverted index with configurable K1/B |
| Graph | Knowledge graph facts resolved to document variants via episode provenance |
| Parent | Wraps any retriever, expands hits to full parent section context |
Multiple retrievers are combined via Reciprocal Rank Fusion.
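For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch. Each document scores the sum of `1/(k + rank)` over the lists it appears in. The constant `k = 60` is the value commonly used for RRF and is an assumption here, as is the `FuseRRF` name; the README does not state what saige uses.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfK is the smoothing constant; 60 is a common default, assumed for this sketch.
const rrfK = 60.0

// FuseRRF merges several ranked ID lists (best first) into one ranking.
// score(id) = sum over lists of 1 / (rrfK + rank), with rank starting at 1.
func FuseRRF(rankings ...[]string) []string {
	scores := map[string]float64{}
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (rrfK + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b] // deterministic tie-break
	})
	return ids
}

func main() {
	vector := []string{"doc-a", "doc-b", "doc-c"}
	bm25 := []string{"doc-b", "doc-d", "doc-a"}
	fmt.Println(FuseRRF(vector, bm25))
}
```

Because only ranks are used, the raw scores of the individual retrievers (cosine similarity, BM25) never need to be put on a common scale.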
rag.WithBM25(nil) // default K1=1.2, B=0.75
rag.WithParentContext() // expand to parent sections
| Reranker | Description |
|---|---|
| MMR | Maximal Marginal Relevance — balances relevance and diversity |
| Cross-Encoder | Pair-wise scoring via custom Scorer interface |
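To make the relevance/diversity trade-off concrete, here is a small MMR sketch. `Candidate` and `RerankMMR` are hypothetical, and the scoring `lambda*relevance - (1-lambda)*maxSimilarityToSelected` is the textbook formulation, which may differ in detail from saige's reranker.

```go
package main

import (
	"fmt"
	"math"
)

// Candidate is a hypothetical search hit with a query relevance score and an embedding.
type Candidate struct {
	ID        string
	Relevance float64
	Vec       []float64
}

// cosine returns the cosine similarity of two equal-length vectors (0 for zero vectors).
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// RerankMMR greedily picks up to k candidates, penalizing each one by its
// highest similarity to anything already selected (negative similarity is
// treated as zero penalty).
func RerankMMR(cands []Candidate, lambda float64, k int) []string {
	remaining := append([]Candidate(nil), cands...)
	var selected []Candidate
	var order []string
	for len(remaining) > 0 && len(order) < k {
		bestIdx, bestScore := 0, math.Inf(-1)
		for i, c := range remaining {
			maxSim := 0.0
			for _, s := range selected {
				if sim := cosine(c.Vec, s.Vec); sim > maxSim {
					maxSim = sim
				}
			}
			score := lambda*c.Relevance - (1-lambda)*maxSim
			if score > bestScore {
				bestIdx, bestScore = i, score
			}
		}
		selected = append(selected, remaining[bestIdx])
		order = append(order, remaining[bestIdx].ID)
		remaining = append(remaining[:bestIdx], remaining[bestIdx+1:]...)
	}
	return order
}

// sampleCandidates: "b" is a near-duplicate of "a"; "c" is less relevant but different.
func sampleCandidates() []Candidate {
	return []Candidate{
		{ID: "a", Relevance: 0.90, Vec: []float64{1, 0}},
		{ID: "b", Relevance: 0.85, Vec: []float64{1, 0}},
		{ID: "c", Relevance: 0.60, Vec: []float64{0, 1}},
	}
}

func main() {
	fmt.Println(RerankMMR(sampleCandidates(), 0.7, 3))
}
```

With lambda = 0.7 the duplicate "b" drops behind the less relevant but novel "c"; with lambda = 1 the order falls back to pure relevance.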
rag.WithMMR(0.7) // lambda=0.7
rag.WithCrossEncoder(myScorer) // custom scorer
Built-in citation support:
// Default: numbered citations with source URIs
// Compressing: LLM-based extraction of relevant sentences
rag.WithCompression(myLLM)
HyDE (Hypothetical Document Embeddings) generates hypothetical documents via an LLM for better retrieval:
rag.WithHyDE(myLLM, 3) // generate 3 hypothetical docs
import "github.com/urmzd/saige/rag/rageval"
precision := rageval.ContextPrecision(results, relevantUUIDs)
recall := rageval.ContextRecall(results, relevantUUIDs)
faithfulness, _ := rageval.Faithfulness(ctx, llm, query, answer, context)
relevancy, _ := rageval.AnswerRelevancy(ctx, embedder, query, answer)
5 tools for integrating RAG into agent workflows:
import "github.com/urmzd/saige/rag/adktool"
tools := adktool.NewTools(pipeline)
// rag_search, rag_lookup, rag_update, rag_delete, rag_reconstruct
| Example | Path | Description |
|---|---|---|
| Basic Agent | examples/agent/basic/ | Single tool with Ollama |
| Sub-agents | examples/agent/subagents/ | Parent delegating to researcher |
| Resilient | examples/agent/resilient/ | Retry + fallback composition |
| Streaming | examples/agent/streaming/ | All delta types with ANSI output |
| Multimodal | examples/agent/multimodal/ | File pipeline with file:// resolver |
| TUI | examples/agent/tui/ | Interactive and verbose modes |
| Runner | examples/agent/runner/ | Multi-turn conversation loop |
| Concurrent | examples/agent/concurrent-subagents/ | Parallel sub-agent execution |
| Knowledge Graph | examples/knowledge/basic/ | Build and query a knowledge graph |
| RAG | examples/rag/arxiv/ | Full pipeline with arXiv papers |
go run ./examples/agent/basic/
go run ./examples/knowledge/basic/
go run ./examples/rag/arxiv/
npx skills add urmzd/saige
graph TB
subgraph agent["agent/ -- AI Agent Framework"]
agenttypes["agent/types/<br/>Provider, Tool, Delta,<br/>Message, Node, WAL"]
agentloop["agent/<br/>Agent loop, streaming,<br/>sub-agents"]
providers["agent/provider/<br/>ollama, openai,<br/>anthropic, google"]
resilience["agent/provider/<br/>retry, fallback"]
tree["agent/tree/<br/>Conversation graph"]
tui["agent/tui/<br/>Terminal UI"]
agenttest["agent/agenttest/<br/>Test utilities"]
end
subgraph kg["knowledge/ -- Knowledge Graph"]
kgtypes["knowledge/types/<br/>Graph, Store, Extractor"]
engine["knowledge/internal/engine/<br/>Extraction, dedup"]
surrealdb["knowledge/surrealdb/<br/>SurrealDB backend"]
end
subgraph rag["rag/ -- RAG Pipeline"]
ragtypes["rag/types/<br/>Pipeline, Store, Retriever"]
pipeline["rag/internal/pipeline/<br/>Ingest, search, RRF"]
retrievers["rag/vector, bm25,<br/>parent, graph retrievers"]
rerankers["rag/reranker/<br/>MMR, cross-encoder"]
chunkers["rag/chunker/<br/>Recursive, semantic"]
adktool["rag/adktool/<br/>Agent tool bindings"]
end
agentloop --> agenttypes
providers --> agenttypes
resilience --> providers
tree --> agenttypes
tui --> agentloop
engine --> kgtypes
surrealdb --> kgtypes
pipeline --> ragtypes
retrievers --> ragtypes
rerankers --> ragtypes
chunkers --> ragtypes
adktool --> ragtypes
adktool -.->|integrates| agenttypes
retrievers -.->|graphretriever| kgtypes
| Package | Files | Purpose |
|---|---|---|
| agent/ | agent.go, stream.go, subagent.go, aggregator.go, runner.go | Agent loop, streaming, sub-agent delegation |
| agent/types/ | message.go, delta.go, content.go, provider.go, tool.go, errors.go, marker.go, compactor.go, node.go | Sealed types, interfaces, error classification, feedback |
| agent/tree/ | tree.go, flatten.go, compact.go, diff.go | Branching conversation tree with feedback leaf nodes |
| agent/provider/ | ollama/, openai/, anthropic/, google/, retry/, fallback/ | LLM adapters and resilience wrappers |
| agent/tui/ | stream.go, styles.go, runner.go | Bubbletea + verbose streaming UI |
| agent/agenttest/ | agenttest.go | ScriptedProvider, MockTool, assertions |
| knowledge/ | config.go, query.go, ollama.go | Knowledge graph public API |
| knowledge/types/ | types.go | Core knowledge graph types and interfaces |
| knowledge/surrealdb/ | store.go, schema.go, records.go | SurrealDB store implementation |
| knowledge/internal/ | engine/, extraction/, fuzzy/ | Engine orchestration, LLM extraction, dedup |
| rag/ | config.go, version.go | RAG pipeline configuration |
| rag/types/ | types.go | Core RAG types and interfaces |
| rag/internal/ | pipeline/pipeline.go | Pipeline engine (ingest, search, RRF) |
| rag/chunker/ | chunker.go, semantic.go | Recursive and semantic chunking |
| rag/bm25retriever/ | retriever.go | In-memory BM25 lexical search |
| rag/vectorretriever/ | retriever.go | Vector similarity search |
| rag/graphretriever/ | retriever.go | Knowledge graph retrieval |
| rag/parentretriever/ | retriever.go | Parent context expansion |
| rag/reranker/ | mmr.go, crossencoder.go | MMR + cross-encoder reranking |
| rag/hyde/ | transformer.go | HyDE query expansion |
| rag/contextassembler/ | compressing.go | LLM-based context compression |
| rag/rageval/ | eval.go | Evaluation metrics |
| rag/adktool/ | tools.go | Agent tool bindings |
| rag/memstore/ | store.go | In-memory store for testing |
Apache 2.0 — see LICENSE.
