graphrag-core

A domain-agnostic framework for building governed, auditable Knowledge Graphs from documents using LLM-powered extraction, provenance-native storage, and multi-agent orchestration.

Architecture

YOUR DOMAIN LAYER (Layer 2)
  Ontology, domain tools, domain agents, templates
                    |
                    | imports
                    v
graphrag-core (Layer 1)

  Ingestion   Extraction   Graph Store   Search
  Curation    Registry     Tool Library  Orchestration

Install

pip install graphrag-core                    # core (in-memory backends)
pip install "graphrag-core[neo4j]"           # + Neo4j graph store and search
pip install "graphrag-core[anthropic]"       # + Claude LLM client
pip install "graphrag-core[all]"             # everything (quotes keep shells like zsh from expanding the brackets)

Quick Start

import asyncio
from graphrag_core import (
    TextParser, TokenChunker, IngestionPipeline,
    InMemoryGraphStore, InMemorySearchEngine,
    LLMExtractionEngine, OntologySchema, NodeTypeDefinition,
    PropertyDefinition, RelationshipTypeDefinition,
    ToolLibrary, register_core_tools,
)
from graphrag_core.models import ChunkConfig, DocumentChunk, GraphNode, ImportRun
from datetime import datetime

async def main():
    # 1. Ingest a document
    pipeline = IngestionPipeline(parser=TextParser(), chunker=TokenChunker())
    chunks = await pipeline.ingest(b"Alice works at Acme Corp.", "text/plain")

    # 2. Define your domain schema
    schema = OntologySchema(
        node_types=[
            NodeTypeDefinition(
                label="Person",
                properties=[PropertyDefinition(name="name", type="string", required=True)],
                required_properties=["name"],
            ),
            NodeTypeDefinition(
                label="Company",
                properties=[PropertyDefinition(name="name", type="string", required=True)],
                required_properties=["name"],
            ),
        ],
        relationship_types=[
            RelationshipTypeDefinition(type="WORKS_AT", source_types=["Person"], target_types=["Company"]),
        ],
    )

    # 3. Extract entities (requires an LLMClient implementation)
    # engine = LLMExtractionEngine(llm_client=your_client)
    # result = await engine.extract(chunks, schema, import_run)

    # 4. Store in graph
    store = InMemoryGraphStore()
    await store.merge_node(GraphNode(id="p1", label="Person", properties={"name": "Alice"}), "run-1")
    await store.merge_node(GraphNode(id="c1", label="Company", properties={"name": "Acme Corp"}), "run-1")

    # 5. Search
    search = InMemorySearchEngine(
        nodes=[await store.get_node("p1"), await store.get_node("c1")],
    )
    results = await search.fulltext_search("Acme", top_k=5)
    print(results)

    # 6. Wire up tools for agents
    library = ToolLibrary()
    register_core_tools(library, store, search)
    result = await library.execute("get_entity", entity_id="p1")
    print(result)

asyncio.run(main())
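
Step 1 turns raw bytes into chunks before extraction. To illustrate what token-based chunking with overlap generally looks like, the sketch below splits on whitespace and slides a fixed-size window over the tokens. The names `chunk_tokens`, `chunk_size`, and `overlap` are assumptions made for this example; the real `TokenChunker` may tokenize and be configured differently.

```python
def chunk_tokens(text: str, chunk_size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into overlapping windows of whitespace-separated tokens.

    Illustrative only: names and defaults are assumptions, not the
    graphrag-core TokenChunker API.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    tokens = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once the current window already reaches the last token.
        if start + chunk_size >= len(tokens):
            break
    return chunks


text = "Alice works at Acme Corp. Bob joined Acme Corp in 2021 as an engineer."
chunks = chunk_tokens(text, chunk_size=6, overlap=2)
for chunk in chunks:
    print(chunk)
```

Overlap repeats the last few tokens of one chunk at the start of the next, so an entity mention that straddles a boundary still appears whole in at least one chunk.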

Building Blocks

#	Block	Interface	Implementation	Status
1	Document Ingestion	`DocumentParser`, `Chunker`	PDF, DOCX, Text, Markdown parsers; TokenChunker	Done
2	Entity Extraction	`ExtractionEngine`, `LLMClient`	LLMExtractionEngine, AnthropicLLMClient	Done
3	Knowledge Graph	`GraphStore`	InMemoryGraphStore, Neo4jGraphStore	Done
4	Hybrid Search	`SearchEngine`	InMemorySearchEngine, Neo4jHybridSearch (RRF)	Done
5	Governed Curation	`DetectionLayer`	DeterministicDetectionLayer, CurationPipeline	Done (detection layer)
6	Entity Registry	`EntityRegistry`	InMemoryEntityRegistry (fuzzy matching)	Done
7	Tool Library	`ToolLibrary`	4 core tools (get_entity, search, audit_trail, related)	Done
8	Orchestration	`Agent`, `Orchestrator`	SequentialOrchestrator, AgentContext	Done
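
The table lists `Neo4jHybridSearch` as using RRF (reciprocal rank fusion) to merge result lists. As a rough illustration of the general technique, not of this library's implementation, the sketch below fuses a full-text ranking with a vector ranking. The function name `rrf_fuse`, the constant `k=60`, and the sample IDs are assumptions chosen for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists with reciprocal rank fusion.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used value (assumption).
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


fulltext_hits = ["c1", "p1", "d7"]  # hypothetical full-text ranking
vector_hits = ["p1", "x9", "c1"]    # hypothetical vector ranking
fused = rrf_fuse([fulltext_hits, vector_hits])
print([item_id for item_id, _ in fused])
```

Because RRF only looks at rank positions, it can combine scores from sources whose raw values are not comparable, such as a keyword relevance score and a cosine similarity.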

The following protocols have no default implementation yet:

LLMCurationLayer, ApprovalGateway (BB5 layers 2-3)
ReportRenderer (BB8)
EmbeddingModel (cross-cutting)
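
Because these interfaces are protocols, a domain layer can presumably supply its own implementation without subclassing anything. The sketch below shows the general mechanism with Python's `typing.Protocol`. The `EmbeddingModelLike` protocol, its `embed` method, and the `HashEmbedding` class are hypothetical stand-ins written for this example, not graphrag-core's actual signatures.

```python
import asyncio
import hashlib
from typing import Protocol, runtime_checkable


@runtime_checkable
class EmbeddingModelLike(Protocol):
    """Hypothetical interface; the real EmbeddingModel protocol may differ."""

    async def embed(self, texts: list[str]) -> list[list[float]]: ...


class HashEmbedding:
    """Toy deterministic embedding for tests; not semantically meaningful."""

    def __init__(self, dim: int = 8) -> None:
        if not 1 <= dim <= 32:
            raise ValueError("dim must be between 1 and 32 (SHA-256 digest size)")
        self.dim = dim

    async def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode("utf-8")).digest()
            # Map the first `dim` bytes to floats in [0, 1].
            vectors.append([byte / 255 for byte in digest[: self.dim]])
        return vectors


model = HashEmbedding(dim=4)
# Structural typing: no inheritance is needed, only a matching method name.
assert isinstance(model, EmbeddingModelLike)
vectors = asyncio.run(model.embed(["Alice", "Acme Corp"]))
print(len(vectors), len(vectors[0]))
```

Note that `isinstance` against a runtime-checkable protocol only verifies that the method exists, not that its signature matches, so a static type checker is still the better guard.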

Extension Pattern

# AgentResult is used in step 3; its top-level export is assumed here
from graphrag_core import AgentResult, OntologySchema, ToolLibrary, Tool

# 1. Define your domain ontology
schema = OntologySchema(node_types=[...], relationship_types=[...])

# 2. Register domain-specific tools (my_handler is your own callable that implements the tool)
library = ToolLibrary()
library.register(Tool(name="my_tool", description="...", parameters={}, handler=my_handler))

# 3. Implement domain agents
class MyAgent:
    name = "analyst"
    async def execute(self, context):
        result = await context.tool_library.execute("my_tool")
        context.workflow_state["analysis"] = result.data
        return AgentResult(agent_name=self.name, success=True)
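
The pattern above relies on two ideas: tools are looked up by name at call time, and agents hand results to each other through shared workflow state. The following self-contained sketch mimics that flow with toy classes. `ToyTool`, `ToyToolLibrary`, `ToyResult`, `ToyContext`, and `run_sequentially` are illustrative stand-ins written for this example and do not reproduce graphrag-core's real classes.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable


@dataclass
class ToyResult:
    success: bool
    data: Any = None


@dataclass
class ToyTool:
    name: str
    description: str
    handler: Callable[..., Awaitable[Any]]


class ToyToolLibrary:
    """Name-to-tool registry; a stand-in, not the real ToolLibrary."""

    def __init__(self) -> None:
        self._tools: dict[str, ToyTool] = {}

    def register(self, tool: ToyTool) -> None:
        if tool.name in self._tools:
            raise ValueError(f"tool already registered: {tool.name}")
        self._tools[tool.name] = tool

    async def execute(self, name: str, **kwargs: Any) -> ToyResult:
        tool = self._tools.get(name)
        if tool is None:
            return ToyResult(success=False, data=f"unknown tool: {name}")
        return ToyResult(success=True, data=await tool.handler(**kwargs))


@dataclass
class ToyContext:
    tool_library: ToyToolLibrary
    workflow_state: dict[str, Any] = field(default_factory=dict)


async def count_words(text: str) -> int:
    return len(text.split())


class AnalystAgent:
    name = "analyst"

    async def execute(self, context: ToyContext) -> bool:
        result = await context.tool_library.execute(
            "count_words", text="Alice works at Acme Corp."
        )
        context.workflow_state["word_count"] = result.data
        return result.success


class ReporterAgent:
    name = "reporter"

    async def execute(self, context: ToyContext) -> bool:
        # Reads what the previous agent left in the shared state.
        count = context.workflow_state.get("word_count")
        context.workflow_state["report"] = f"Document has {count} words."
        return count is not None


async def run_sequentially(agents: list[Any], context: ToyContext) -> list[str]:
    """Run agents in order and stop at the first failure."""
    completed = []
    for agent in agents:
        if not await agent.execute(context):
            break
        completed.append(agent.name)
    return completed


library = ToyToolLibrary()
library.register(ToyTool("count_words", "Count whitespace-separated words", count_words))
context = ToyContext(tool_library=library)
completed = asyncio.run(run_sequentially([AnalystAgent(), ReporterAgent()], context))
print(completed, context.workflow_state["report"])
```

Keeping agents ignorant of each other and coupling them only through the tool library and the shared state is what lets a domain layer add or reorder agents without touching the core.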

Development

# Clone and install
git clone https://github.com/cdel1/graphrag-core.git
cd graphrag-core
uv sync --all-extras

# Run unit tests
uv run pytest tests/ -x -q

# Run integration tests (requires Neo4j)
docker run -d --name neo4j-test -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/development neo4j:5-community
uv run pytest tests/ -x --run-integration

# Build
uv build

License

MIT
