Turn any text into a knowledge graph. Find the gaps. Get AI-grounded insight.
Self-hostable. AGPL. Algorithm parity with the academic state-of-the-art on every benchmark we've measured.
Reading isn't the bottleneck anymore. Synthesizing is.
You have a hundred research notes, a dozen tabs of articles on the same topic, a transcript of an interview, a draft you can't quite finish — and somewhere in that pile is a structure you can't see. Which concepts are central? Which are isolated? What bridges what? What's missing?
Noosphere answers those questions. Feed it text — anything from a paragraph to a corpus — and it returns a concept graph with cluster structure, structural gaps, centrality rankings, and (if you want) LLM-generated advice grounded in the graph itself rather than hallucinated.
Same idea as InfraNodus. Self-hosted. AGPL. No subscription. Real algorithmic rigour underneath.
| Use case | What Noosphere gives you |
|---|---|
| Personal knowledge management | Map your Obsidian/Logseq/Roam notes as a network. See dominant concepts, isolated ideas, bridges between sub-topics. |
| Research / literature review | Drop in 50 paper abstracts, get a cluster map of sub-fields and the gaps where research is sparse. |
| Content & SEO strategy | webSearchVsIntent returns the gap between what users ask and what top-ranking content covers — a content brief in JSON. |
| Discourse analysis | Cluster and bridge structure of interview transcripts, focus groups, and social-media discourse — for qualitative researchers. |
| Writing | Paste your draft. See which concepts are central, which are underdeveloped, where bridges are weak. |
| Teaching | Students map their understanding; teachers see what's covered vs. missing. |
| AI workflows | Wire Noosphere as an MCP server (planned) so Claude/Cursor/Zed can read your notes as a graph and answer "what's the bridge between my project notes about X and Y?" |
- 14 InfraNodus-compatible REST endpoints under `/api/v1/`
- Graph engine — co-occurrence graph from sliding-window text analysis with deterministic UUIDv5 node ids
- Two community-detection algorithms: Louvain + Leiden refinement (default) and Infomap (opt-in, better ground-truth recovery on benchmarks). Multi-resolution sweep, 64 deterministic restarts in parallel.
- Five centralities: betweenness (Brandes), degree, closeness, eigenvector, PageRank
- Structural gap detection between communities (Burt's structural holes)
- Force-directed layout (Fruchterman–Reingold, Barnes–Hut quadtree) for visualization coordinates
- Eight built-in language modules with auto-detection: English, German, French, Spanish, Portuguese, Russian, Japanese, Chinese
- Web search integration via SearXNG — fetch SERP results / search-intent suggestions, build graphs from them, compare them
- LLM advice through any OpenAI-compatible endpoint (OpenAI, LM Studio, Ollama, vLLM, llama.cpp server) — six "lens" modes: develop, reinforce, gaps, latent, imagine, optimize
- Comparison engine — merge / overlap / difference between any number of texts, optionally with AI advice grounded in the comparison
- API key auth + Redis rate limiting with optional Authentik OIDC layered on
- Multilingual — script-aware tokenization handles Latin, Cyrillic, Hiragana/Katakana, and CJK ideographs without whitespace heuristics
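The graph-engine bullet above — sliding-window co-occurrence with deterministic UUIDv5 node ids — can be sketched in a few lines of Python. This is an illustrative sketch, not the C# implementation; the window size and namespace UUID are assumptions:

```python
import uuid
from itertools import combinations
from collections import Counter

# Illustrative namespace; the project's actual namespace UUID is an assumption here.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "noosphere.example")

def cooccurrence_graph(tokens, window=4):
    """Slide a fixed-size window over the token stream; every pair of
    distinct tokens sharing a window gets (or strengthens) an edge."""
    edges = Counter()
    for i in range(len(tokens)):
        for a, b in combinations(sorted(set(tokens[i:i + window])), 2):
            edges[(a, b)] += 1
    # Deterministic node ids: uuid5 is a pure function of (namespace, name),
    # so the same concept maps to the same id on every run and every machine.
    nodes = {t: str(uuid.uuid5(NAMESPACE, t)) for t in tokens}
    return nodes, dict(edges)

nodes, edges = cooccurrence_graph("graphs reveal structure structure hides gaps".split())
```

Determinism is what makes saved graphs comparable across runs: re-analyzing the same text never produces new node identities.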
We test against canonical benchmarks with hand-crafted edge lists, real datasets from Mark Newman's network archive, and ground-truth-labeled graphs:
| Benchmark | Metric | Our value | Literature reference |
|---|---|---|---|
| Zachary's Karate Club | Modularity | 0.4198 | 0.4188 (NetworkX) |
| Karate Club | Communities | 4 | 4 |
| Lusseau Dolphins | Modularity | 0.5277 | ~0.52 |
| NCAA Football | Modularity | 0.6046 | ~0.60 |
| NCAA Football | ARI vs. ground truth (Louvain) | 0.807 | 0.80–0.85 |
| NCAA Football | ARI vs. ground truth (Infomap) | 0.897 | 0.85–0.92 |
We match or exceed published values on every benchmark we've measured.
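The ARI rows in the table use the adjusted Rand index: agreement between detected communities and ground-truth labels, corrected for chance. A minimal stdlib sketch of the metric (not the project's implementation):

```python
from math import comb
from collections import Counter

def adjusted_rand_index(labels_a, labels_b):
    """ARI between two partitions of the same items: 1.0 for identical
    partitions (up to relabeling), about 0.0 for chance-level agreement."""
    n = len(labels_a)
    contingency = Counter(zip(labels_a, labels_b))
    sum_ij = sum(comb(c, 2) for c in contingency.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    return (sum_ij - expected) / (max_index - expected)

adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])  # identical up to relabeling: 1.0
```

Because ARI is relabel-invariant, it measures whether the algorithm found the same *groupings* as the ground truth, regardless of what it named the clusters.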
```shell
# Prerequisites: .NET 10 SDK, Docker
git clone https://github.com/CySpiegel/noosphere
cd noosphere
dotnet run --project Noosphere.AppHost
```

The Aspire dashboard opens with health and telemetry for Postgres, Redis, the API, and (when configured) SearXNG. The API is reachable on the URL the dashboard prints.
```shell
curl -X POST http://localhost:<port>/api/v1/graphAndStatements \
  -H "Content-Type: application/json" \
  -d '{"text": "Knowledge graphs organize concepts into clusters. Networks reveal patterns. Bridges expose gaps."}'
```

You get back the full Graphology JSON — nodes with centralities, edges with weights, communities, gaps, and structural metrics.
Noosphere talks to any OpenAI-compatible chat-completions endpoint. Set in appsettings.json:
```json
{
  "Llm": {
    "BaseUrl": "https://api.openai.com",  // or http://localhost:1234 for LM Studio
    "DefaultModel": "gpt-4o",
    "ApiKey": "sk-..."
  }
}
```

Verified live against OpenAI, LM Studio, and Ollama. `MaxTokens = 0` lets the server decide.
Run any SearXNG instance, then set WebSearch:BaseUrl in appsettings.json (or WebSearch__BaseUrl env var). Endpoints 9–14 (/api/v1/import/webSearch*) light up automatically; without the config they stay disabled.
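A minimal `appsettings.json` fragment — the section name comes from the `WebSearch:BaseUrl` key above, but the port is an assumption (use whatever your SearXNG instance listens on):

```json
{
  "WebSearch": {
    "BaseUrl": "http://localhost:8080"
  }
}
```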
All endpoints under /api/v1/:
| Endpoint | What it does |
|---|---|
| `POST /graphAndStatements` | Text → graph + statements + summary |
| `POST /graphAndAdvice` | Text → graph + AI advice |
| `POST /dotGraph` | Graphology JSON → DOT (Graphviz) |
| `POST /dotGraphFromText` | Text → DOT |
| `POST /graphAiAdvice` | Existing graph → AI advice |
| `POST /listGraphs` | List your saved graphs (filterable) |
| `POST /search` | Search statement content across saved graphs → build graph from results |
| `POST /compareGraphs` | Multi-context merge / overlap / difference |
| `POST /graphsAndAiAdvice` | Same comparison + AI advice grounded in the merged graph |
| `POST /import/webSearchResultsGraph` | SERP results → graph |
| `POST /import/webSearchResultsAiAdvice` | SERP results → graph + advice |
| `POST /import/webSearchIntentGraph` | Related-queries / "people also ask" → graph |
| `POST /import/webSearchIntentAiAdvice` | Intent → graph + advice |
| `POST /import/webSearchVsIntentGraph` | Content vs. demand gap — what users want that content doesn't cover |
| `POST /import/webSearchVsIntentAiAdvice` | Same + AI advice |
Response shape is the standard Graphology JSON wrapped in InfraNodus-compatible envelope keys.
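For orientation, a trimmed, illustrative response. The node/edge layout follows Graphology's serialized format; the envelope key names here are assumptions — the pinned schema-conformance tests are the source of truth:

```json
{
  "graph": {
    "nodes": [
      { "key": "…uuid…", "attributes": { "label": "knowledge", "community": 0, "betweenness": 0.42 } }
    ],
    "edges": [
      { "source": "…uuid…", "target": "…uuid…", "attributes": { "weight": 2 } }
    ]
  },
  "statements": ["Knowledge graphs organize concepts into clusters."],
  "summary": "…"
}
```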
| Concern | Self-host (Noosphere) | Hosted alternatives |
|---|---|---|
| Cost | Free | $14–$45 / month |
| Privacy | Your text never leaves your machine | Sent to a third party |
| LLM choice | Any OpenAI-compatible endpoint, including local | Their LLM, their pricing |
| Customization | Source-available, AGPL — extend or fork | Closed |
| Search backend | Self-hosted SearXNG (no API key) | Google API key required |
| Multi-tenancy | API keys + per-user rate limits built in | Per-seat pricing |
- MCP server — expose every endpoint as MCP tools so Claude/Cursor/Zed can read your notes as a knowledge graph (Phase 5)
- React frontend — interactive Sigma.js graph canvas with cluster/gap/statement panels (Phase 5)
- Obsidian plugin — analyze the active note or your whole vault (planned, separate repo)
- Docker production stack + Traefik + EF migrations + Authentik OIDC (Phase 6)
- Statement-aware community detection prior — bias clusters by which sentences tokens co-occur in (Phase 6)
- Overlapping communities — concepts that legitimately belong to multiple clusters (Phase 6)
See docs/phases/ for the full roadmap and what's done.
| Layer | Tech |
|---|---|
| Runtime | .NET 10 |
| Orchestration | .NET Aspire 13 |
| API | ASP.NET Core Minimal API |
| Database | PostgreSQL 17 + EF Core 10 |
| Cache / rate limiting | Redis 7 |
| LLM | Any OpenAI-compatible endpoint |
| Web search | SearXNG (optional) |
| Frontend (planned) | React 19 + TypeScript + Sigma.js v3 |
Zero external NLP dependencies. Tokenization, stopwords, lemmatization, POS tagging — all pure C# with pluggable language modules. No Python sidecar, no spaCy, no NLTK.
Zero external graph library. TextGraph is a custom adjacency-list with O(1) lookups. No Neo4j, no JGraphT.
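The dict-of-dicts idea behind an adjacency list with O(1) lookups, sketched in Python (the real `TextGraph` is C#; the class and method names here are illustrative):

```python
from collections import defaultdict

class AdjacencyGraph:
    """Undirected weighted graph: a dict of dicts gives O(1) average-case
    node lookup, edge-existence checks, and weight access."""
    def __init__(self):
        self.adj = defaultdict(dict)   # node -> {neighbor: weight}

    def add_edge(self, u, v, weight=1):
        # Store both directions so lookups are symmetric.
        self.adj[u][v] = self.adj[u].get(v, 0) + weight
        self.adj[v][u] = self.adj[v].get(u, 0) + weight

    def has_edge(self, u, v):          # O(1) average
        return v in self.adj[u]

    def weight(self, u, v):
        return self.adj[u].get(v, 0)

    def degree(self, u):
        return len(self.adj[u])

g = AdjacencyGraph()
g.add_edge("graph", "cluster")
g.add_edge("graph", "gap", weight=2)
```

Hash-map adjacency trades the cache locality of an adjacency matrix for O(1) edge checks on sparse graphs, which is the common case for text co-occurrence networks.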
547 non-live tests + 9 live integration tests against real Postgres, Redis, OpenAI-compatible LLMs, and SearXNG. CI-friendly: live tests skip silently when their backends are unreachable.
| Suite | Tests | Verifies |
|---|---|---|
| Algorithm correctness | Karate / Florentine / Krackhardt | Match academic literature on canonical small graphs |
| Ground-truth parity | Dolphins / NCAA Football | Match published modularity + ARI on real datasets |
| Self-consistency invariants | Multiple | Determinism, conservation laws, comparison-engine identities |
| Schema conformance | Per endpoint | InfraNodus-compatible JSON shape pinned |
| Property-based | Erdős–Rényi, Barabási–Albert, two-cliques | Universal graph properties hold for randomly-generated input |
| NLP edge cases | 22 | Emoji, RTL, mixed scripts, very long sentences, hashtags |
| Full stack E2E | API + Postgres + Redis + fake LLM | /listGraphs, /search, saved-graph compare end-to-end |
| Live LLM | OpenAI-compatible | Real API calls against your configured endpoint |
| Live web search | SearXNG | Real SERPs + intent extraction |
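The two-cliques property-test idea — the endpoints of the bridge edge must carry the highest betweenness — can be reproduced with a compact Brandes sketch. This is an illustrative Python version of the standard algorithm, not the project's C# code:

```python
from collections import deque

def brandes_betweenness(adj):
    """Unweighted betweenness centrality (Brandes 2001) for an
    undirected graph given as {node: [neighbors]}."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, pred = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}; sigma[s] = 1   # shortest-path counts
        dist = {v: -1 for v in adj}; dist[s] = 0
        q = deque([s])
        while q:                                    # BFS from s
            v = q.popleft(); stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1; q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]; pred[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                                # dependency accumulation
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    for v in bc:                                    # undirected: pairs counted twice
        bc[v] /= 2
    return bc

# Two 3-cliques joined by the single bridge edge 2-3.
two_cliques = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
bc = brandes_betweenness(two_cliques)
# Nodes 2 and 3 carry every inter-clique shortest path; all others carry none.
```

This is the shape of the property-based suite: build a graph whose correct answer is known by construction, then assert the invariant rather than a magic number.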
- docs/ARCHITECTURE.md — full system design
- docs/CODING_STANDARDS.md — required style and conventions
- docs/phases/ — phased implementation roadmap with completion status
- CLAUDE.md — concise guide for AI assistants working in this repo
- docs/API_REFERENCE.md (planned, Phase 6)
- docs/ALGORITHMS.md (planned, Phase 6)
- docs/DEPLOYMENT.md (planned, Phase 6)
AGPL-3.0. Use it, fork it, run it on your own infrastructure. Network use counts as distribution — if you host a public instance, your modifications must be source-available too.
Issues and PRs welcome. The codebase is structured around small, well-tested slices — see docs/CODING_STANDARDS.md for what "well-tested" means here (every algorithmic claim is pinned by a test against a published reference value).