blackwell-systems/knowing: System of record for software system relationships: content-addressed, runtime-grounded, built for agents and teams

The system of record for how software systems behave, change, and relate over time.

Vision

Git is the system of record for source code. knowing is the system of record for what that source code means in the context of a running organization.

Software organizations have no single place that captures how their systems actually connect, who owns what, what changed since the last deploy, or whether production behavior matches what the code declares. That knowledge lives in people's heads, incident postmortems, and tribal memory. When someone leaves, it's gone; when an incident happens at 3 AM, it's out of reach.

knowing builds a versioned, content-addressed ledger of software system relationships: code, infrastructure, ownership, and runtime behavior. Every state is a hash. Every edge has provenance. Every question has an auditable answer.

Agents are the first consumer, but the audience is anyone who needs to understand a software organization as a system: platform teams, SREs, architects, security, and compliance.

The Problem

Agents today are blind at repository boundaries. LSP tells you where a symbol is used inside one workspace. Code search finds matching text. Dependency graphs tell you which packages depend on which.

None of them answer the questions that actually matter:

If I change this symbol, what breaks across the rest of the system?
Is this route actually called in production?
When did this cross-repo dependency appear?
Who do I need to notify?
What did the system look like when we deployed on Tuesday?

What knowing Does

knowing builds a boundary-aware relationship graph across repositories, services, and infrastructure. It fuses static analysis with runtime observation to create a single, trustworthy model of how a software system actually works.

It is a persistent daemon with three components:

Indexer: crawls repositories in any supported language and parses source into syntax trees (go/packages for Go, which adds full type resolution; tree-sitter for everything else, which is syntactic only). It computes content hashes, resolves cross-repo symbol references, and builds the graph. The graph model is language-agnostic: extractors produce nodes and edges, and the graph doesn't care what language they came from. The indexer watches for git changes and re-indexes incrementally.
Graph store: owns the content-addressed graph in SQLite. Manages the snapshot chain, runs garbage collection, and handles traversal queries with a multi-tier cache.
MCP server: exposes the graph to agents over stdio or HTTP.
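
The language-agnostic split between extractors and the graph can be pictured with a small sketch. Everything here is an assumption made for illustration: the `Node`, `Edge`, `Extractor`, and `Graph` names are hypothetical and are not knowing's actual types, and `toyExtractor` stands in for a real go/packages or tree-sitter extractor.

```go
package main

import (
	"fmt"
	"strings"
)

// Node and Edge are hypothetical shapes for a language-agnostic graph model.
type Node struct {
	ID   string // for example, a fully qualified symbol name
	Kind string // "function", "route", "service", ...
}

type Edge struct {
	From, To string
	Kind     string // "calls", "imports", ...
}

// Extractor is an assumed interface: each language plugs in its own parser
// but emits the same node and edge types.
type Extractor interface {
	Language() string
	Extract(path string, src []byte) ([]Node, []Edge, error)
}

// Graph stores nodes and edges without knowing which language produced them.
type Graph struct {
	Nodes map[string]Node
	Edges []Edge
}

func NewGraph() *Graph { return &Graph{Nodes: map[string]Node{}} }

// Index runs one extractor over one file and merges the result into the graph.
func (g *Graph) Index(x Extractor, path string, src []byte) error {
	nodes, edges, err := x.Extract(path, src)
	if err != nil {
		return fmt.Errorf("%s extractor on %s: %w", x.Language(), path, err)
	}
	for _, n := range nodes {
		g.Nodes[n.ID] = n
	}
	g.Edges = append(g.Edges, edges...)
	return nil
}

// toyExtractor stands in for a real parser: it treats every "a -> b" line
// as a call edge from a to b.
type toyExtractor struct{}

func (toyExtractor) Language() string { return "toy" }

func (toyExtractor) Extract(path string, src []byte) ([]Node, []Edge, error) {
	var nodes []Node
	var edges []Edge
	for _, line := range strings.Split(string(src), "\n") {
		from, to, ok := strings.Cut(line, " -> ")
		if !ok {
			continue
		}
		nodes = append(nodes, Node{ID: from, Kind: "function"}, Node{ID: to, Kind: "function"})
		edges = append(edges, Edge{From: from, To: to, Kind: "calls"})
	}
	return nodes, edges, nil
}

func main() {
	g := NewGraph()
	if err := g.Index(toyExtractor{}, "demo.toy", []byte("a -> b\nb -> c")); err != nil {
		panic(err)
	}
	fmt.Println(len(g.Nodes), "nodes,", len(g.Edges), "edges")
}
```

Adding a language in this sketch means adding another `Extractor` implementation; `Graph.Index` stays unchanged.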

Unlike tools that maintain mutable current-state graphs, knowing is content-addressed: every node, edge, and graph snapshot is a hash. This means:

History: the graph has a full audit trail; every previous state is queryable
Staleness: a hash mismatch is a structural fact, not a heuristic guess
Integrity: any graph state is provably derived from specific source commits
Runtime ground truth: production traces fused with static analysis tell you what actually runs, not just what the code declares
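
A minimal sketch of why staleness becomes a structural fact under content addressing: if an edge records the content hashes of its endpoints at extraction time, a later mismatch is a plain comparison. The `Edge` layout, the `Stale` method, and the choice of SHA-256 are assumptions for illustration; the source does not specify the hash function or the record format.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashBytes returns the hex SHA-256 of its input. SHA-256 is an assumption;
// the source only says that every node and edge is a hash.
func hashBytes(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// Edge is a hypothetical record that pins the content hashes of both
// endpoints as they were when the edge was extracted.
type Edge struct {
	From, To         string
	FromHash, ToHash string
}

// ID derives the edge's own address from its fields, so identical edges
// always get identical IDs.
func (e Edge) ID() string {
	return hashBytes([]byte(e.From + "\x00" + e.To + "\x00" + e.FromHash + "\x00" + e.ToHash))
}

// Stale reports whether either endpoint's current content hash differs
// from the hash recorded on the edge.
func (e Edge) Stale(current map[string]string) bool {
	return current[e.From] != e.FromHash || current[e.To] != e.ToHash
}

func main() {
	current := map[string]string{
		"billing.Charge": hashBytes([]byte("func Charge(amount int) error")),
		"api.Checkout":   hashBytes([]byte("func Checkout() { billing.Charge(100) }")),
	}
	e := Edge{
		From:     "api.Checkout",
		To:       "billing.Charge",
		FromHash: current["api.Checkout"],
		ToHash:   current["billing.Charge"],
	}
	fmt.Println("stale before change:", e.Stale(current))

	// The callee's signature changes, so its content hash changes.
	current["billing.Charge"] = hashBytes([]byte("func Charge(amount int, currency string) error"))
	fmt.Println("stale after change:", e.Stale(current))
}
```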

The Git analogy is exact: Git is a content-addressed graph of source code. knowing is a content-addressed graph of source code relationships.

What It Answers

For agents:

"I'm changing this function signature. Which other repos call it?"
"What is the blast radius of this change, and how confident are we in each edge?"
"What is the full data flow of this value across functions, services, queues, and repositories?"

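The blast-radius question above can be sketched as a walk over reverse call edges that carries a confidence score along each path. This is only an illustration under stated assumptions: the `Caller` type, the `BlastRadius` function, and the rule that path confidence is the product of edge confidences are hypothetical, not knowing's documented behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// Caller is a hypothetical reverse edge: some symbol is called by From,
// with the given confidence in (0, 1].
type Caller struct {
	From       string
	Confidence float64
}

// BlastRadius walks reverse call edges from the changed symbol and returns,
// for every transitively affected symbol, the highest path confidence found.
// Path confidence is the product of edge confidences along the path.
func BlastRadius(callers map[string][]Caller, changed string) map[string]float64 {
	best := map[string]float64{changed: 1.0}
	queue := []string{changed}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, c := range callers[cur] {
			conf := best[cur] * c.Confidence
			// Revisit a symbol only when a strictly better path is found,
			// which also guarantees termination on cyclic graphs.
			if conf > best[c.From] {
				best[c.From] = conf
				queue = append(queue, c.From)
			}
		}
	}
	delete(best, changed)
	return best
}

func main() {
	callers := map[string][]Caller{
		"billing.Charge": {{From: "api.Checkout", Confidence: 1.0}, {From: "jobs.Retry", Confidence: 0.5}},
		"api.Checkout":   {{From: "web.Handler", Confidence: 0.8}},
	}
	radius := BlastRadius(callers, "billing.Charge")

	names := make([]string, 0, len(radius))
	for name := range radius {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s confidence=%.2f\n", name, radius[name])
	}
}
```
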
For platform teams and SREs:

"What did the dependency graph look like when we deployed on Tuesday?"
"When did this cross-repo incompatibility first appear?"
"Is this route actually called in production, or just declared in code?"
"Static analysis says 47 callers; how many are active in production?"

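The "47 callers; how many are active" question amounts to intersecting static call edges with edges observed at runtime. The sketch below assumes runtime traces have already been reduced to a per-edge call count for some observation window; `CallEdge`, `ActiveCallers`, and that count map are hypothetical names, not part of knowing's interface.

```go
package main

import "fmt"

// CallEdge is a hypothetical static call relationship.
type CallEdge struct {
	Caller, Callee string
}

// ActiveCallers splits the statically known callers of callee into those
// with at least one observed runtime call and those with none.
func ActiveCallers(static []CallEdge, observed map[CallEdge]int, callee string) (active, dormant []string) {
	for _, e := range static {
		if e.Callee != callee {
			continue
		}
		if observed[e] > 0 {
			active = append(active, e.Caller)
		} else {
			dormant = append(dormant, e.Caller)
		}
	}
	return active, dormant
}

func main() {
	static := []CallEdge{
		{Caller: "api.Checkout", Callee: "billing.Charge"},
		{Caller: "jobs.Retry", Callee: "billing.Charge"},
		{Caller: "legacy.Import", Callee: "billing.Charge"},
	}
	// Assumed result of trace ingestion: call counts per edge in the window.
	observed := map[CallEdge]int{
		{Caller: "api.Checkout", Callee: "billing.Charge"}: 1240,
		{Caller: "jobs.Retry", Callee: "billing.Charge"}:   3,
	}
	active, dormant := ActiveCallers(static, observed, "billing.Charge")
	fmt.Println("active: ", active)
	fmt.Println("dormant:", dormant)
}
```
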
For architects and tech leads:

"This PR adds 3 new cross-repo dependencies and spans 3 teams. Here's who to notify."
"What edges in the graph are stale after this week's changes?"
"This proto field has zero runtime reads in 90 days. Safe to deprecate."

For security and compliance:

"Prove that this graph was derived from these specific source commits."
"Show me every service that touches PII, traced through the actual runtime call graph."
"What changed in the system's dependency structure between these two audit dates?"

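Questions that compare two points in time reduce to a set difference once snapshots are sets of content-addressed edge IDs. The following sketch assumes a snapshot is just a set of edge ID strings and that its hash is SHA-256 over the sorted IDs; `Snapshot`, `NewSnapshot`, and `Diff` are hypothetical names, and the real snapshot chain may be structured differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// Snapshot is a hypothetical immutable set of edge IDs.
type Snapshot map[string]struct{}

func NewSnapshot(edgeIDs ...string) Snapshot {
	s := Snapshot{}
	for _, id := range edgeIDs {
		s[id] = struct{}{}
	}
	return s
}

// sorted returns the snapshot's edge IDs in lexical order.
func (s Snapshot) sorted() []string {
	ids := make([]string, 0, len(s))
	for id := range s {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

// Hash addresses the snapshot by its content: the same set of edges yields
// the same hash regardless of insertion order.
func (s Snapshot) Hash() string {
	sum := sha256.Sum256([]byte(strings.Join(s.sorted(), "\n")))
	return hex.EncodeToString(sum[:])
}

// Diff returns the edge IDs present only in after (added) and only in
// before (removed), each in sorted order.
func Diff(before, after Snapshot) (added, removed []string) {
	for _, id := range after.sorted() {
		if _, ok := before[id]; !ok {
			added = append(added, id)
		}
	}
	for _, id := range before.sorted() {
		if _, ok := after[id]; !ok {
			removed = append(removed, id)
		}
	}
	return added, removed
}

func main() {
	tuesday := NewSnapshot("api->billing", "api->auth")
	friday := NewSnapshot("api->billing", "api->ledger")
	added, removed := Diff(tuesday, friday)
	fmt.Println("added:  ", added)
	fmt.Println("removed:", removed)
	fmt.Println("unchanged graph:", tuesday.Hash() == friday.Hash())
}
```
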
MCP Tools

Tool	Purpose
`cross_repo_callers`	All callers of a symbol across indexed repos
`blast_radius`	Full impact analysis for a proposed change
`trace_dataflow`	Follow a value across function and service boundaries
`repo_graph`	Repository and package-level dependency relationships
`stale_edges`	Edges invalidated by recent source changes (hash mismatch)
`ownership`	Who owns the code/service/consumers affected by a change
`snapshot_diff`	What changed in the graph between two points in time
`semantic_diff`	Relationship-level diff between any two snapshots
`pr_impact`	Semantic diff specialized for a PR (resolves base/head from git)
`index_repo`	Add a repo to the graph
`graph_query`	Raw graph query (Cypher or similar)
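
For orientation, this is roughly what invoking one of these tools looks like on the wire. MCP uses JSON-RPC 2.0, and tools are invoked with the `tools/call` method carrying a tool `name` and an `arguments` object. The argument name `symbol` below is a guess made for illustration; the actual input schema for `cross_repo_callers` is not given in this README.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the JSON-RPC 2.0 envelope of an MCP tools/call request.
type toolCall struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// NewToolCall encodes a tools/call request for the named tool.
func NewToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(toolCall{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: tool, Arguments: args},
	})
}

func main() {
	// "symbol" is a hypothetical argument name, not a documented one.
	req, err := NewToolCall(1, "cross_repo_callers", map[string]any{"symbol": "billing.Charge"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(req))
}
```

Over stdio the encoded request would be written to the server's standard input; over HTTP it would be sent as a request body. Both transports are named in the component list, but their exact endpoints and framing are not described here.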

Relationship to agent-lsp

agent-lsp gives agents live semantic awareness inside a workspace: diagnostics, rename execution, edit simulation, symbol navigation.

knowing gives agents (and humans) persistent system-level awareness across repositories: relationships, impact, ownership, staleness, and runtime behavior.

Where agent-lsp answers "where is this symbol used in this repo?", knowing answers "where is this contract used across the system, who owns the consumers, and is it actually called in production?"

Roadmap

Five parallel workstreams, not sequential phases. See docs/roadmap.md for the full breakdown with dependency graph and parallelization notes.

Workstream	Focus
Graph Core	Content-addressed store, language-agnostic extractor framework, Go + tree-sitter extractors, traversal cache, MCP server, daemon
Edge Types	SCIP, protobuf/gRPC, HTTP routes, events, schemas, infrastructure, ownership
Runtime Intelligence	OpenTelemetry trace ingestion, runtime symbol resolution, confidence decay
Developer Visibility	Semantic PR diff, graph-native test selection, ownership routing, staleness dashboard
Agent Coordination	Pending mutations, temporal reasoning, federated graphs

Documentation

Architecture: design decisions, system overview, schemas, interfaces
Roadmap: workstreams, dependencies, parallelization notes

Tech Stack

Go (indexer, graph store, MCP server)
tree-sitter (multi-language AST parsing)
SCIP (ingest external indices)
SQLite (content-addressed persistent store)
MCP over stdio/HTTP

License

MIT
