Centralized, governed memory for AI agents on OpenShift AI. MemoryHub gives every agent in your organization a shared, persistent memory layer with multi-tier scoping (user / project / role / organizational / enterprise), version history, semantic search via pgvector, an immutable audit trail, and OAuth 2.1 authorization.
It works with any agent framework that speaks MCP — Claude Code, kagenti-deployed agents (LangGraph, CrewAI, AG2, …), LlamaStack workflows, custom Python agents — and ships a typed Python SDK and a CLI for direct use.
- Governed memory operations. Every write, read, update, and deletion is access-controlled by five-tier scope isolation enforced at the SQL level. Memories carry version history with provenance branches, contradiction detection, and a three-layer curation rules engine with inline secrets/PII scanning. Enterprise-scope memories require human approval. This is the substrate that makes all other capabilities trustworthy.
- Shared agent memory. Agents don't just remember for themselves; they build an organizational hive mind. Project-scoped memories surface for every agent working in that context, with auto-enrollment on first write to open projects so agents can start contributing without manual membership setup. Campaign scoping enables bounded cross-project initiatives where knowledge discovered by one project's agent is available to all enrolled projects. Domain tags enable cross-cutting retrieval. Real-time push notifications keep agent swarms current. A planned promotion pipeline will lift patterns discovered by individual agents into organizational knowledge.
- Inference cost optimization. Cache-optimized assembly returns memories in a deterministic, epoch-locked order designed for KV cache prefix hits across vLLM (2x throughput, 152x TTFT), Anthropic (90% cost reduction), OpenAI (50%), and Gemini (75-90%). The key insight: the first agent pays full inference cost; subsequent agents with overlapping memory contexts get the cached prefix nearly free. Token budget caps and weight-based stub/full injection keep context windows lean. Governed context compaction is on the roadmap.
- Compliance-ready architecture. Version history, provenance branches, and a planned immutable audit trail position MemoryHub for EU AI Act transparency requirements (enforcement begins August 2026), GDPR data governance, HIPAA, and financial regulations. Compaction will use readable summaries, not opaque tokens, so the compliance team can inspect what was kept.
- Framework-agnostic integration. Works with any agent framework that speaks MCP. Includes a typed Python SDK, a CLI, a project config wizard that generates agent rule files, and designed integration paths for kagenti and LlamaStack.
- Kubernetes-native on OpenShift AI. Single PostgreSQL backend handling relational, vector, and graph queries. FIPS compliance by delegation. Air-gap deployable with on-cluster embedding models. Red Hat UBI images. An llm-d integration path for automatic cache-aware routing at the infrastructure level.
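To make the cache-optimized assembly idea concrete, here is a minimal sketch of deterministic ordering with stub/full injection under a budget. It is an illustration only, not MemoryHub's implementation: the `Memory` class, `assemble_context`, the thresholds, and the use of character counts as a stand-in for tokens are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    memory_id: str  # hypothetical stable identifier
    weight: float   # assumed to lie in [0, 1]
    content: str


def assemble_context(memories, epoch, budget_chars=2000,
                     full_threshold=0.7, stub_chars=40):
    """Render memories so that equal inputs give byte-identical text."""
    # Sort on (-weight, id) only. Arrival order or timestamps would differ
    # between agents and break prefix reuse in a KV cache.
    ordered = sorted(memories, key=lambda m: (-m.weight, m.memory_id))
    lines = [f"# memory epoch {epoch}"]
    used = len(lines[0])
    for m in ordered:
        # High-weight memories go in full; the rest become short stubs.
        body = m.content if m.weight >= full_threshold else m.content[:stub_chars]
        line = f"[{m.memory_id}] {body}"
        if used + len(line) + 1 > budget_chars:
            break  # stop at the cap instead of reordering, so the prefix stays stable
        lines.append(line)
        used += len(line) + 1
    return "\n".join(lines)
```

Two agents that hold the same memories in a different arrival order get the same text, which is the property a prefix cache needs.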
Status (2026-04-15). Core memory operations, OAuth 2.1 + JWT auth with service-layer RBAC, the dashboard UI, the published Python SDK, the agent-memory-ergonomics work (search shape, session focus vector with cross-encoder reranking, project config + rule generation), and cache-optimized memory assembly with compilation epochs are all shipped. The Kubernetes operator and the curator-as-background-agent layer are still on the roadmap. See docs/SYSTEMS.md for the per-subsystem status table.
| Component | Path | What it is |
|---|---|---|
| MCP server | memory-hub-mcp/ | FastMCP 3 server exposing 14 tools (search, read, write, update, delete, similarity, relationships, curation, contradiction, session registration, session focus, project discovery) over streamable-HTTP. The primary agent surface. |
| Server-side library | src/memoryhub_core/ | SQLAlchemy models, service layer, embedding integration, RBAC enforcement (core/authz.py). Distribution name memoryhub-core; import name memoryhub_core. The MCP server, BFF, alembic migrations, and the seed-OAuth-clients script all import from here. |
| Python SDK | sdk/ | pip install memoryhub: typed async client wrapping the MCP tools. OAuth 2.1 token management is automatic. See sdk/README.md. |
| CLI | memoryhub-cli/ | pip install memoryhub-cli: terminal client for search/read/write/delete plus memoryhub config init for generating project-level .memoryhub.yaml and .claude/rules/memoryhub-loading.md rule files. |
| Dashboard UI | memoryhub-ui/ | React + PatternFly 6 frontend behind a FastAPI BFF, deployed as a single container. Six panels: Memory Graph, Status Overview, Users & Agents, Client Management, Curation Rules, Contradiction Log. OAuth-proxy sidecar in front of OpenShift login. |
| Auth service | memoryhub-auth/ | Standalone OAuth 2.1 authorization server. FastAPI with client_credentials and refresh_token grants, RSA-2048 JWT signing, JWKS endpoint, admin client management API. |
| Database migrations | alembic/ | Schema migrations for the server-side library. PostgreSQL with the pgvector extension. |
| Design docs | docs/ | Subsystem designs, the agent-memory-ergonomics design cluster, package layout, auth and identity model. Start at docs/ARCHITECTURE.md. |
| Planning | planning/ | In-flight designs for unimplemented features (operator, observability, org-ingestion, session-persistence) and the kagenti/LlamaStack integration plans. |
| Research | research/ | Investigations and explorations: FIPS storage analysis, agent-memory-ergonomics research notes. |
| Demos | demos/ | Conference demo scripts (HIMSS, RSA, IACP, IAEM, World AgriTech) and the RHOAI dashboard demo material. |
| Retrospectives | retrospectives/ | Per-session retros documenting decisions, gaps, and patterns. Read these for the "why" behind major design choices. |
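
The auth service row mentions the client_credentials grant. As a rough illustration of what such a request looks like on the wire, the sketch below builds (but does not send) a standard OAuth token request using only the standard library. The token URL, the `build_token_request` helper, and the choice of HTTP Basic client authentication are assumptions for this example; check the auth service docs for the real endpoint and the supported client authentication method.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) a client_credentials token request."""
    body = urlencode({"grant_type": "client_credentials"}).encode("ascii")
    # Assumption: the server accepts HTTP Basic client authentication,
    # i.e. base64("client_id:client_secret") in the Authorization header.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Hypothetical URL and credentials, for illustration only.
request = build_token_request("https://auth.example.com/token", "my-agent", "s3cret")
```

In normal use the SDK handles token management for you, so a hand-built request like this is mainly useful for debugging.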
MemoryHub installs to an OpenShift cluster with Red Hat OpenShift AI (RHOAI) already running. A single make install brings up PostgreSQL + pgvector, runs migrations, and builds and deploys the MCP server, the OAuth 2.1 auth service, the dashboard UI, and the RHOAI Applications tile.
git clone https://github.com/redhat-ai-americas/memory-hub.git
cd memory-hub
make check-prereqs # verify cluster state (non-destructive)
make install # full stack deploy

At the end of make install, the summary banner prints the UI Route, MCP endpoint, auth endpoint, and pointers to the API-key setup. Expect 10–15 minutes on a first install, because the MCP server, auth service, and UI each go through an OpenShift BuildConfig.
To remove everything:
make uninstall # prompts for confirmation; use --yes for CI

Use make uninstall --skip-db to preserve the database across a reinstall (useful when testing config changes without losing memories).
Prerequisites: oc and podman on your PATH, cluster-admin on a cluster with RHOAI installed, a default StorageClass. make check-prereqs verifies all of these — run it first. See CONTRIBUTING.md for the "new contributor no-deploy" rule: if you're onboarding to this codebase, work against a local SQLite or Podman PostgreSQL instead of deploying to a cluster.
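
If you want a quick local check of the PATH prerequisites before touching a cluster, a sketch like the following works. It is not what make check-prereqs runs; the `missing_tools` helper is hypothetical, and it covers only the two executables named above, not cluster-admin rights, RHOAI, or the StorageClass.

```python
import shutil


def missing_tools(required=("oc", "podman"), which=shutil.which):
    """Return the required executables that are not found on PATH."""
    return [tool for tool in required if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```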
Partial installs (advanced): make deploy-db, make deploy-mcp, make deploy-auth, make deploy-ui, make deploy-tile — each skips the others. make help lists everything.
The deployed server exposes a streamable-HTTP MCP endpoint. Add it to your agent's MCP configuration:
claude mcp add --transport http \
  -s project \
  memoryhub \
  https://memory-hub-mcp-memory-hub-mcp.apps.<your-cluster>.com/mcp/

Then run memoryhub config init (from the CLI, see below) to generate a .claude/rules/memoryhub-loading.md that tells the agent when and how to call the tools. The generated rule covers session start, working-set loading, pivot detection, memory hygiene, and contradiction handling, all parameterized by your project's session shape (focused / broad / adaptive).
pip install memoryhub

import asyncio

from memoryhub import MemoryHubClient


async def main():
    client = MemoryHubClient.from_env()  # reads MEMORYHUB_URL, MEMORYHUB_AUTH_URL, MEMORYHUB_CLIENT_ID, MEMORYHUB_CLIENT_SECRET
    async with client:
        results = await client.search(
            "deployment patterns",
            focus="OpenShift",  # optional session focus (Layer 2)
            max_results=10,
        )
        for memory in results.results:
            print(f"[{memory.scope}] {memory.content[:80]}")


asyncio.run(main())

The SDK auto-discovers .memoryhub.yaml from the current working directory and applies its retrieval_defaults to outbound search calls. See sdk/README.md for the full API surface.
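
Because from_env() depends on four environment variables, it can help to fail early with a clear message when one is missing. The sketch below uses only the standard library; the `missing_env` helper is hypothetical and is not part of the SDK, which may report missing variables in its own way.

```python
import os

# Variable names taken from the from_env() comment in the quickstart.
REQUIRED_VARS = (
    "MEMORYHUB_URL",
    "MEMORYHUB_AUTH_URL",
    "MEMORYHUB_CLIENT_ID",
    "MEMORYHUB_CLIENT_SECRET",
)


def missing_env(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env()
    if missing:
        raise SystemExit("Set these before creating a client: " + ", ".join(missing))
```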
pip install memoryhub-cli
memoryhub login # one-time credential setup
memoryhub search "deployment patterns" # search
memoryhub read <memory-id> # read by ID
memoryhub write "Use Podman, not Docker" --scope user --weight 0.9
memoryhub config init # set up .memoryhub.yaml + agent rule file

MemoryHub splits configuration into two files with different lifecycles: project-level policy lives in .memoryhub.yaml at the repo root (committed, shared across all contributors), while per-developer connection params and secrets live in ~/.config/memoryhub/config.json (not committed, managed by memoryhub login).
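
The two-file split can be pictured with a small helper that reports where each file is expected and whether it exists yet. This is a sketch under the paths stated above; `config_locations` is a hypothetical name, and the real SDK and CLI may resolve these files differently.

```python
from pathlib import Path


def config_locations(repo_root, home=None):
    """Return each config path together with whether it currently exists."""
    home = Path.home() if home is None else Path(home)
    paths = {
        # Committed, shared by every contributor.
        "project_policy": Path(repo_root) / ".memoryhub.yaml",
        # Per developer, never committed.
        "user_credentials": home / ".config" / "memoryhub" / "config.json",
    }
    return {name: (path, path.is_file()) for name, path in paths.items()}


if __name__ == "__main__":
    for name, (path, exists) in config_locations(".").items():
        print(f"{name}: {path} ({'found' if exists else 'missing'})")
```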
memoryhub config init is an interactive wizard that asks about session shape, loading pattern, focus source, and retrieval defaults, then writes .memoryhub.yaml and .claude/rules/memoryhub-loading.md. Both files are meant to be committed so every contributor's agent inherits the same loading pattern. On first run, any legacy .claude/rules/memoryhub-integration.md is backed up to .bak before the new rule file is written.
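
The backup step can be sketched as follows. This is an assumption about the behaviour, not the wizard's actual code: the `backup_legacy_rule` helper is hypothetical, the backup name (the original file name with .bak appended) is a guess, and the real wizard may copy the file rather than rename it.

```python
from pathlib import Path


def backup_legacy_rule(project_root):
    """Rename the legacy rule file to a .bak sibling; return the backup path or None."""
    legacy = Path(project_root) / ".claude" / "rules" / "memoryhub-integration.md"
    if not legacy.is_file():
        return None  # nothing to back up
    # Assumed naming: memoryhub-integration.md -> memoryhub-integration.md.bak
    backup = legacy.with_name(legacy.name + ".bak")
    legacy.replace(backup)
    return backup
```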
From inside Claude Code, you can run the same wizard without leaving the prompt:
# One-time install of the slash command
curl -o ~/.claude/commands/memoryhub-init.md \
  https://raw.githubusercontent.com/redhat-ai-americas/memory-hub/main/tools/claude-commands/memoryhub-init.md

Then type /memoryhub-init in any project. It runs memoryhub config init and prints remaining setup steps.
After hand-editing .memoryhub.yaml, run memoryhub config regenerate to re-render the rule file from the YAML without touching the YAML itself.
The YAML has two top-level keys — memory_loading (when and how agents load memory) and retrieval_defaults (defaults applied to SDK/agent search calls):
memory_loading:
  mode: focused  # focused | broad
  pattern: lazy_with_rebias  # eager | lazy | lazy_with_rebias | jit
  focus_source: auto  # auto | declared | directory | first_turn
  session_focus_weight: 0.4
  on_topic_shift: rebias  # rebias | warn | ignore
retrieval_defaults:
  max_results: 20
  max_response_tokens: 4000
  default_mode: full  # full | index | full_only

See docs/agent-memory-ergonomics/design.md for the full schema, field reference, and rule file templates.
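
After hand-editing the YAML, a quick sanity check on the enumerated fields can catch typos before you regenerate the rule file. The sketch below validates an already-parsed dict against the values listed in the comments of the example config. The `validate_config` function is hypothetical, the [0, 1] range for session_focus_weight is an assumption, and the authoritative schema is the design doc referenced above.

```python
# Allowed values copied from the comments in the example config.
ALLOWED = {
    ("memory_loading", "pattern"): {"eager", "lazy", "lazy_with_rebias", "jit"},
    ("memory_loading", "focus_source"): {"auto", "declared", "directory", "first_turn"},
    ("memory_loading", "on_topic_shift"): {"rebias", "warn", "ignore"},
    ("retrieval_defaults", "default_mode"): {"full", "index", "full_only"},
}


def validate_config(config):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for (section, key), allowed in ALLOWED.items():
        value = config.get(section, {}).get(key)
        if value is not None and value not in allowed:
            problems.append(f"{section}.{key}: {value!r} not in {sorted(allowed)}")
    weight = config.get("memory_loading", {}).get("session_focus_weight")
    # Assumption: the weight is a fraction between 0 and 1.
    if weight is not None and not 0.0 <= weight <= 1.0:
        problems.append(f"memory_loading.session_focus_weight: {weight!r} outside [0, 1]")
    for key in ("max_results", "max_response_tokens"):
        value = config.get("retrieval_defaults", {}).get(key)
        if value is not None and (not isinstance(value, int) or value <= 0):
            problems.append(f"retrieval_defaults.{key}: {value!r} is not a positive integer")
    return problems
```

Parsing the file itself is left out so the sketch has no third-party dependency; any YAML loader that returns a nested dict would feed it.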
              ┌─────────────────────────────────────────┐
              │  Consumer surfaces                      │
              │  • Agents over MCP (streamable-HTTP)    │
              │  • Python SDK (memoryhub on PyPI)       │
              │  • CLI (memoryhub-cli)                  │
              │  • Dashboard UI (React + PF6 + BFF)     │
              └────────────────┬────────────────────────┘
                               │
                    ┌──────────▼──────────┐
                    │   memory-hub-mcp    │
                    │   (FastMCP 3)       │
                    │   14 tools          │
                    └──────────┬──────────┘
                               │
        ┌──────────────────────┼──────────────────────┐
        │                      │                      │
┌───────▼───────┐    ┌─────────▼─────────┐   ┌────────▼──────────┐
│ authz / RBAC  │    │ services / models │   │ embedding model   │
│ (JWT verify,  │    │ (memoryhub_core)  │   │ + cross-encoder   │
│   scope match)│    │                   │   │ (RHOAI vLLM)      │
└───────┬───────┘    └─────────┬─────────┘   └───────────────────┘
        │                      │
┌───────▼─────────┐   ┌────────▼─────────┐
│ memoryhub-auth  │   │ PostgreSQL +     │
│ (OAuth 2.1 AS)  │   │ pgvector         │
└─────────────────┘   └──────────────────┘
Every memory operation flows through the MCP server, which delegates to the service layer in src/memoryhub_core/. The service layer enforces authorization via core/authz.py (JWT-first, session-fallback). The OAuth 2.1 authorization server runs as a separate service. PostgreSQL with pgvector handles relational, vector, and graph queries. An external all-MiniLM-L6-v2 embedding model and an ms-marco-MiniLM-L12-v2 cross-encoder reranker both run on OpenShift AI's vLLM serving. The reranker is optional: when it is unavailable, ranking falls back gracefully to cosine similarity.
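
The optional-reranker behaviour can be sketched as a ranking function that always computes a cosine order and only swaps in reranker scores when the reranker call succeeds. This is an illustration of the fallback pattern, not the service layer's code: `cosine`, `rank`, and the reranker callable (assumed here to be already bound to the query and to return one score per text) are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, candidates, reranker=None):
    """Order (text, vector) candidates; use the reranker if it works, else cosine."""
    by_cosine = sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    texts = [text for text, _ in by_cosine]
    if reranker is None:
        return texts
    try:
        scores = reranker(texts)  # hypothetical: one relevance score per text
    except Exception:
        # Reranker unreachable or failing: keep the cosine order instead of erroring.
        return texts
    paired = sorted(zip(scores, texts), key=lambda p: p[0], reverse=True)
    return [text for _, text in paired]
```

The caller gets a ranked list either way, so a reranker outage degrades result quality rather than availability.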
For the full design and the deployment topology, see docs/ARCHITECTURE.md. For the per-subsystem map, see docs/SYSTEMS.md.
Full documentation lives in four top-level directories. Start with docs/README.md for a guided tour, or jump straight to whichever area matches your need:
- docs/: Shipped architecture and reference material. Subsystem designs (memory tree, storage layer, governance, curator, MCP server), agent memory ergonomics, auth, identity model, admin operations.
- planning/: In-flight designs, open questions, and integration roadmaps (Kubernetes operator, observability, session persistence, kagenti and LlamaStack integrations).
- research/: Investigations and benchmarks that informed shipped decisions (FIPS storage evaluation, two-vector retrieval ranking, pivot detection, FastMCP push notifications, Claude Code JWT limitations).
- demos/: Conference demo scripts and scenario material (HIMSS, RSA, IACP, IAEM, World AgriTech, and the RHOAI dashboard tile demo).
Package-specific docs live in each package's own README:
- Python SDK — quickstart, API reference, project config, authentication
- CLI — commands, project config, credential setup
- MCP server — tool list, deployment, testing
- Auth service — standalone OAuth 2.1 authorization server
For LLM agents crawling this repo: llms.txt at the repo root follows the llmstxt.org convention and is the most direct entry point.
memory-hub/
├── src/memoryhub_core/   # Server-side library (services, storage, models, authz)
├── memory-hub-mcp/       # FastMCP 3 MCP server (deployed)
├── memoryhub-auth/       # OAuth 2.1 authorization server (deployed)
├── memoryhub-ui/         # Dashboard: React + PatternFly 6 frontend, FastAPI BFF (deployed)
│   ├── backend/
│   └── frontend/
├── sdk/                  # Python SDK published to PyPI as `memoryhub`
├── memoryhub-cli/        # CLI client (`pip install memoryhub-cli`)
├── alembic/              # Database migrations
├── tests/                # Server-side library tests
├── docs/                 # Shipped architecture and subsystem designs
├── planning/             # In-flight designs for unimplemented features
├── research/             # Investigations and explorations
├── demos/                # Conference demo scripts and dashboard demo material
├── retrospectives/       # Session retros; read for design context
├── deploy/               # Top-level deploy assets (PostgreSQL manifests)
└── benchmarks/           # Empirical benchmark results (e.g. two-vector-retrieval/)
A note on the package layout: the server-side library at src/memoryhub_core/ is published locally as memoryhub-core (used by the MCP server, BFF, alembic, and the root test suite), while the client SDK at sdk/src/memoryhub/ is published to PyPI as memoryhub. Distinct distribution names, distinct import names. See docs/package-layout.md for the rationale.
Set up the server-side library:
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest tests/ -v

Each subproject has its own venv and pytest:
# MCP server
cd memory-hub-mcp && make install && .venv/bin/pytest tests/ -q --ignore=tests/examples/
# SDK
cd sdk && .venv/bin/pytest tests/ -q --ignore=tests/test_rbac_live.py
# CLI
cd memoryhub-cli && .venv/bin/pytest tests/ -q
# Dashboard BFF
cd memoryhub-ui/backend && .venv/bin/pytest tests/ -q

See CLAUDE.md for project conventions, the issue-tracker workflow, and the MCP-server scaffold rules.
Issues and PRs are welcome. Start with CONTRIBUTING.md for the local dev setup, coding conventions, and PR flow. Use the /issue-tracker slash command (or follow CLAUDE.md) when filing — every issue references a design document and follows the Backlog → In Progress → Done flow.
Most contributions do not need access to the demo OpenShift cluster — local SQLite or a podman PostgreSQL container is enough. If you do need cluster access, see docs/contributor-cluster-access.md for the access policy, GitHub IdP setup, and the no-deploy rule for new contributors.
Maintainers inviting new contributors should follow the checklist in docs/inviting-new-contributors.md.
Apache 2.0 — see LICENSE.