TypeGraph is a TypeScript SDK that gives AI agents ingest and retrieval for RAG, graph and memory in a single composable package. One SDK, one path, one Postgres database.
Building a context layer for AI agents in TypeScript today means cobbling together a vector DB, a graph DB, an embedding API, a caching layer, consolidation logic, and a conversation manager. The leading frameworks (Graphiti, Mem0, MemOS) are Python-first and do not unify ingest and retrieval across RAG, graph, and memory.
TypeGraph closes that gap:
- Ingest + retrieval in one SDK - not two separate tools bolted together
- TypeScript-native - no Python runtime, no managed service, no vendor lock-in
- Any Postgres provider - Neon, Supabase, Amazon RDS, Nile, Prisma, self-hosted - production-ready with pgvector
- Vercel AI SDK integration - memory tools and middleware for `generateText()`/`streamText()`
- Composable - works alongside your stack, not inside a framework
- Per-bucket embedding models, chunking, and graph extraction rules - different models for different content, merged at query time via RRF
TypeGraph is a focused primitive - it stores, indexes, and retrieves so you can focus on building.
TypeGraph separates infrastructure provisioning from runtime initialization:
| Method | When to call | What it does |
|---|---|---|
| `deploy(config)` | Once (setup script, CI/CD) | Creates tables and extensions. Idempotent. |
| `initialize(config)` | Every app boot / cold start | Loads state, registers adapters. Lightweight, no DDL. |
| `undeploy()` | Intentional teardown | Drops all TypeGraph tables. Refuses if any table has data. |
| `destroy()` | App shutdown | Closes adapter connections. |
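A sketch of how the split is meant to be used, reusing the `d` client from the query examples below (the wiring here is illustrative, not the exact API):

```ts
// setup.ts - run once from a setup script or CI/CD; safe to re-run (deploy is idempotent)
await d.deploy(config)

// app.ts - run on every boot / cold start; no DDL, just state and adapters
await d.initialize(config)

// graceful shutdown: close adapter connections
process.on('SIGTERM', () => d.destroy())
```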
More setup options: Self-Hosted Initialization | Simple RAG
TypeGraph uses composable query signals - the caller chooses which retrieval systems to activate:
| Signal | What It Does | Default |
|---|---|---|
| `semantic` | Semantic embedding search against chunk embeddings | On |
| `keyword` | BM25 keyword search, fused with semantic via RRF | Off |
| `graph` | PPR graph traversal via entity embeddings | Off |
| `memory` | Cognitive memory recall (facts, episodes) | Off |
Signals compose freely - any combination works. The default (`{ semantic: true }`) gives fast semantic-only search (~10-30ms). Add signals for richer retrieval:
```ts
// Default: fast semantic search
d.query('sso')

// Semantic + keyword (hybrid)
d.query('how do I configure SSO?', { signals: { semantic: true, keyword: true } })

// All signals: semantic + keyword + graph + memory
d.query('what did Alice say about the SSO migration?', {
  signals: { semantic: true, keyword: true, graph: true, memory: true },
  userId: 'alice',
  tenantId: 'org1',
})

// Graph-only: entity-centric associative retrieval
d.query('how are Alice and Acme Corp connected?', {
  signals: { graph: true },
})
```

When `graph` and `llm` are configured, document indexing automatically builds a knowledge graph:
- Triple extraction - each chunk is analyzed to extract entities (people, organizations, places, works, etc.) and their relationships as subject-predicate-object triples
- Entity resolution - entities are deduplicated across chunks using a multi-tier resolver: exact match, trigram Jaccard fuzzy matching, and vector similarity with type guards
- Predicate normalization - relationship types are canonicalized via a predicate ontology (~150 types) and synonym groups to prevent graph fragmentation
- Cross-chunk context - entity context accumulates across chunks within a document, improving extraction consistency
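The trigram Jaccard tier of the entity resolver can be illustrated in isolation. A toy sketch, not TypeGraph's actual implementation:

```ts
// Toy trigram Jaccard similarity, as used in fuzzy entity matching.
// Two surface forms of the same entity overlap heavily in trigrams.
function trigrams(s: string): Set<string> {
  const padded = `  ${s.toLowerCase()} `; // pad so short strings still yield trigrams
  const grams = new Set<string>();
  for (let i = 0; i <= padded.length - 3; i++) grams.add(padded.slice(i, i + 3));
  return grams;
}

function jaccard(a: Set<string>, b: Set<string>): number {
  let inter = 0;
  for (const g of a) if (b.has(g)) inter++;
  return inter / (a.size + b.size - inter); // |A ∩ B| / |A ∪ B|
}

// 'Acme Corp' vs 'Acme Corporation' scores 0.5; vs 'Alice' it is under 0.1,
// so a threshold in between merges aliases without conflating distinct entities.
const sim = jaccard(trigrams('Acme Corp'), trigrams('Acme Corporation'));
```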
At query time, enabling the `graph` signal seeds a Personalized PageRank walk from entities mentioned in the query, traversing the graph to surface associatively connected passages across documents and memory. When combined with the semantic and keyword signals, results are fused via RRF, enabling multi-hop reasoning in a single retrieval step. Composite score weights are configurable per query via `scoreWeights`, and graph result filtering is tunable via `graphReinforcement` (`'only'`, `'prefer'`, or `'off'`).
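The RRF fusion step itself is simple. A self-contained sketch of standard reciprocal rank fusion (k = 60 is the common default from the literature; TypeGraph's internal constants and weights are not specified here):

```ts
// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank_of_d)
// Documents ranked well by multiple signals rise to the top of the fused list.
function rrf(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + i + 1));
    });
  }
  return scores;
}

const semantic = ['doc-a', 'doc-b', 'doc-c'];
const keyword = ['doc-b', 'doc-d', 'doc-a'];
// doc-b (ranks 2 and 1) edges out doc-a (ranks 1 and 3); both beat single-list hits
const fused = [...rrf([semantic, keyword]).entries()]
  .sort((x, y) => y[1] - x[1])
  .map(([id]) => id);
```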
The extraction pipeline supports configurable LLMs - using a reasoning model for extraction produces dramatically higher-quality graphs (fewer entities, richer predicate vocabulary, zero noise edges) at the cost of slower ingestion.
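For illustration only, a hypothetical config shape (these field names are not TypeGraph's documented API) showing the idea of separate models for extraction and generation:

```ts
// Hypothetical shape - field names are illustrative, not TypeGraph's API
const config = {
  llm: {
    extraction: { model: 'some-reasoning-model' }, // slower ingest, cleaner graph
    generation: { model: 'some-fast-model' },      // cheap query-time answers
  },
}
```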
Deep dive: Graph RAG Guide - hybrid search, per-model fan-out, embedding providers, architecture
TypeGraph is evaluated on published academic benchmarks using the exact methodology (chunk sizes, scoring functions, context windows) from each source paper.
Standard information retrieval benchmarks using semantic + keyword signals (BM25 with RRF fusion). Metrics are BEIR-standard at cutoff 10.
| Dataset | nDCG@10 | Baseline | Delta | Source |
|---|---|---|---|---|
| Australian Tax Guidance | 0.7519 | 0.7431 | +0.0088 | MLEB Leaderboard |
| MLEB-ScaLR | 0.6607 | 0.6528 | +0.0079 | MLEB Leaderboard |
| License TLDR | 0.6485 | 0.5985 | +0.0500 | MLEB Leaderboard |
| MultiHop-RAG | 0.6429 | - | - | COLM 2024 |
| Legal RAG Bench | 0.3348 | 0.3704 | -0.0356 | MLEB Leaderboard |
Baselines are text-embedding-3-small on the MLEB Leaderboard (Isaacus). TypeGraph uses the same embedding model with chunked retrieval + document-level deduplication.
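The nDCG@10 figures above reward ranking highly relevant documents early. One common formulation (the exponential-gain variant; the exact variant used by BEIR tooling may differ) can be sketched as:

```ts
// DCG@k with exponential gain: sum of (2^rel - 1) / log2(position + 1).
// rels: graded relevance of retrieved docs, in ranked order.
function dcg(rels: number[], k: number): number {
  return rels.slice(0, k).reduce((s, r, i) => s + (2 ** r - 1) / Math.log2(i + 2), 0);
}

// nDCG@k normalizes by the DCG of the ideal (relevance-sorted) ranking.
function ndcg(rels: number[], allRels: number[], k = 10): number {
  const ideal = dcg([...allRels].sort((a, b) => b - a), k);
  return ideal === 0 ? 0 : dcg(rels, k) / ideal;
}

// A perfect ranking scores 1.0; demoting a relevant doc lowers the score.
const perfect = ndcg([3, 2, 0], [3, 2, 0]); // 1
const swapped = ndcg([0, 3, 2], [3, 2, 0]); // < 1
```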
GraphRAG-Bench evaluates graph-augmented retrieval on long-form question answering over 20 Project Gutenberg novels. Scoring uses LLM-as-judge answer correctness (0.75 x factuality + 0.25 x semantic similarity) - a continuous 0.0-1.0 metric matching the paper's evaluation code.
| Rank | System | Fact Retrieval | Complex Reasoning | Contextual Summarize | Creative Generation | Overall |
|---|---|---|---|---|---|---|
| #1 | TypeGraph neural | 61.7 | 53.1 | 60.4 | 47.7 | 58.4 |
| #2 | HippoRAG2 | 60.1 | 53.4 | 64.1 | 48.3 | 56.5 |
| #3 | Fast-GraphRAG | 57.0 | 48.5 | 56.4 | 46.2 | 52.0 |
| #4 | GraphRAG (local) | 49.3 | 50.9 | 64.4 | 39.1 | 50.9 |
| #5 | RAG w/ rerank | 60.9 | 42.9 | 51.3 | 38.3 | 48.4 |
| #6 | LightRAG | 58.6 | 49.1 | 48.9 | 23.8 | 45.1 |
TypeGraph's overall accuracy (58.4%) exceeds HippoRAG2's (56.5%) with statistical significance at the 95% level [CI: 57.2%, 59.5%]. Full eval: 2,009 queries, GPT-5.4-mini generation. Baselines from arXiv:2506.05690 Table 3 (GPT-4o-mini generation). See benchmarks/ for methodology and reproduction.
| Guide | What you'll learn |
|---|---|
| Self-Hosted Setup | Neon Postgres + pgvector, AI Gateway, hybrid search internals |
| Getting Started (Local Dev) | SQLite + AI Gateway - minimal infrastructure setup |
| TypeGraph Cloud | Hosted API - just an API key |
| Agentic RAG | Retrieval architecture, embedding providers, landscape analysis |
| Agentic Memory | Cognitive memory system, lifecycle, extraction, landscape analysis |
TypeGraph is open source and contributions are welcome - new integrations, adapters, bug fixes, or documentation.
- Fork the repo
- Create a feature branch (`git checkout -b feat/my-feature`)
- Make your changes
- Run `pnpm build && pnpm typecheck` to verify
- Open a PR
```bash
pnpm install    # Install dependencies
pnpm build      # Build all packages (Turborepo)
pnpm test       # Run tests
pnpm typecheck  # Type checking
```