A general memory system for agents, powered by deep-research
中文文档 | English
General Agentic Memory (GAM) provides a next-generation memory framework for AI agents, combining long-term retention with dynamic reasoning. Following the Just-in-Time (JIT) principle, it preserves full contextual fidelity offline while performing deep research online to build adaptive, high-utility context. With its dual-agent architecture of Memorizer and Researcher, GAM integrates structured memory with iterative retrieval and reflection, achieving state-of-the-art performance across the LoCoMo, HotpotQA, LongBench v2, and LongCodeBench benchmarks.
- Paper: https://arxiv.org/abs/2511.18423
- Hugging Face: https://huggingface.co/papers/2511.18423
- **🔧 Just-in-Time (JIT) Memory Optimization**
  Unlike conventional Ahead-of-Time (AOT) systems, GAM performs intensive Memory Deep Research at runtime, dynamically retrieving and synthesizing high-utility context to meet real-time agent needs.
- **🔄 Dual-Agent Architecture: Memorizer & Researcher**
  A cooperative framework where the Memorizer constructs structured memory from raw sessions, and the Researcher performs iterative retrieval, reflection, and summarization to deliver precise, adaptive context (see the sketch after this list).
- **📊 Superior Performance Across Benchmarks**
  Achieves state-of-the-art results on LoCoMo, HotpotQA, LongBench v2, and LongCodeBench, surpassing prior systems such as A-MEM, Mem0, and MemoryOS in both F1 and BLEU-1 metrics.
- **🧩 Modular & Extensible Design**
  Built to support flexible plug-ins for memory construction, retrieval strategies, and reasoning tools, facilitating easy integration into multi-agent frameworks or standalone LLM deployments.
- **🌐 Cross-Model Compatibility**
  Compatible with leading LLMs such as GPT-4, GPT-4o-mini, and Qwen2.5, supporting both cloud-based and local deployments for research or production environments.
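To make the JIT retrieve-reflect cycle concrete, here is a minimal, illustrative sketch. It is not the GAM API (see the Quick Start below for the real one); `retrieve`, `reflect`, and `synthesize` are hypothetical stand-ins for the steps described above.

```python
# Illustrative sketch only: `retrieve`, `reflect`, and `synthesize` are
# hypothetical stand-ins, not actual GAM methods (see Quick Start below).

def jit_research(question: str, memory, max_rounds: int = 3) -> str:
    """Iteratively retrieve evidence and reflect until it suffices."""
    context: list[str] = []
    query = question
    for _ in range(max_rounds):
        # Retrieve candidate evidence for the current query.
        context.extend(memory.retrieve(query))
        # Reflect: is the gathered context enough to answer the question?
        done, follow_up = memory.reflect(question, context)
        if done:
            break
        query = follow_up  # refine the query and search again
    # Synthesize a high-utility answer from the gathered context.
    return memory.synthesize(question, context)
```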
- 2025-11: Released GAM framework with modular evaluation suite
- 2025-11: Support for HotpotQA, NarrativeQA, LoCoMo, and RULER benchmarks
- ✨ Features
- 🔥 News
- 🏗️ Project Structure
- 🎯 Quick Start
- 🔬 Reproducing Paper Results
- 📚 Documentation
- 📝 Citation
- 🤝 Community
```
general-agentic-memory/
├── gam/                              # Core GAM package
│   ├── __init__.py
│   ├── agents/                       # Agent implementations
│   │   ├── memory_agent.py           # MemoryAgent - memory construction
│   │   └── research_agent.py         # ResearchAgent - deep research
│   ├── generator/                    # LLM generators
│   │   ├── openai_generator.py       # OpenAI API generator
│   │   └── vllm_generator.py         # vLLM local generator
│   ├── retriever/                    # Retrievers
│   │   ├── index_retriever.py        # Index retrieval
│   │   ├── bm25.py                   # BM25 keyword retrieval
│   │   └── dense_retriever.py        # Dense semantic retrieval
│   ├── prompts/                      # Prompt templates
│   ├── schemas/                      # Data models
│   └── config/                       # Configuration management
├── eval/                             # Evaluation suite
│   ├── __init__.py
│   ├── run.py                        # Unified CLI entry
│   ├── README.md                     # Evaluation documentation
│   ├── QUICKSTART.md                 # Quick start guide
│   ├── datasets/                     # Dataset adapters
│   │   ├── base.py                   # Base evaluation class
│   │   ├── hotpotqa.py               # HotpotQA multi-hop QA
│   │   ├── narrativeqa.py            # NarrativeQA narrative QA
│   │   ├── locomo.py                 # LoCoMo conversation memory
│   │   └── ruler.py                  # RULER long-context eval
│   └── utils/                        # Evaluation utilities
│       ├── chunking.py               # Text chunking
│       └── metrics.py                # Evaluation metrics
├── scripts/                          # Shell scripts
│   ├── eval_hotpotqa.sh
│   ├── eval_narrativeqa.sh
│   ├── eval_locomo.sh
│   ├── eval_ruler.sh
│   └── eval_all.sh
├── examples/                         # Usage examples
│   └── quickstart/                   # Quick start examples
│       ├── README.md                 # Examples documentation
│       ├── basic_usage.py            # Basic usage example
│       └── model_usage.py            # Model selection example
├── assets/                           # Resource files
├── docs/                             # Documentation
├── setup.py                          # Installation config
├── pyproject.toml                    # Modern project config
├── requirements.txt                  # Dependencies
└── README.md                         # This file
```
```bash
# Clone the repository
git clone https://github.com/VectorSpaceLab/general-agentic-memory.git
cd general-agentic-memory

# Install dependencies
pip install -r requirements.txt

# Install the package
pip install -e .
```

```python
import os

from gam import (
    MemoryAgent,
    ResearchAgent,
    OpenAIGenerator,
    OpenAIGeneratorConfig,
    InMemoryMemoryStore,
    InMemoryPageStore,
    DenseRetriever,
    DenseRetrieverConfig,
)

# 1. Configure and create generator
gen_config = OpenAIGeneratorConfig(
    model="gpt-4o-mini",
    api_key=os.getenv("OPENAI_API_KEY"),
    temperature=0.3
)
generator = OpenAIGenerator(gen_config)

# 2. Create memory and page stores
memory_store = InMemoryMemoryStore()
page_store = InMemoryPageStore()

# 3. Create MemoryAgent
memory_agent = MemoryAgent(
    generator=generator,
    memory_store=memory_store,
    page_store=page_store
)

# 4. Memorize documents
documents = [
    "Artificial Intelligence is a branch of computer science...",
    "Machine Learning is a subset of AI...",
    "Deep Learning uses neural networks..."
]
for doc in documents:
    memory_agent.memorize(doc)

# 5. Get memory state
memory_state = memory_agent.get_memory_state()
print(f"Built {len(memory_state.events)} memory events")

# 6. Create ResearchAgent for Q&A
retriever_config = DenseRetrieverConfig(
    model_path="BAAI/bge-base-en-v1.5"
)
retriever = DenseRetriever(
    config=retriever_config,
    memory_store=memory_store,
    page_store=page_store
)
research_agent = ResearchAgent(
    generator=generator,
    retriever=retriever
)

# 7. Perform research
result = research_agent.research(
    question="What is the difference between ML and DL?",
    top_k=3
)
print(f"Answer: {result.final_answer}")
```

For detailed examples and advanced usage:
- `examples/quickstart/basic_usage.py` - Complete workflow with memory building and research
- `examples/quickstart/model_usage.py` - Model selection and configuration
- `examples/quickstart/README.md` - Examples documentation
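The project tree above also lists a vLLM-based generator (`gam/generator/vllm_generator.py`) for running GAM against local models such as Qwen2.5. The snippet below sketches how swapping it in might look; the class and field names (`VLLMGenerator`, `VLLMGeneratorConfig`, `model_path`) are assumptions, so check that module for the actual interface.

```python
# Hypothetical sketch: the class/field names below are assumptions;
# consult gam/generator/vllm_generator.py for the real interface.
from gam import VLLMGenerator, VLLMGeneratorConfig  # assumed exports

gen_config = VLLMGeneratorConfig(
    model_path="Qwen/Qwen2.5-7B-Instruct",  # any locally served model
    temperature=0.3,
)
generator = VLLMGenerator(gen_config)

# The rest of the pipeline is unchanged: pass this generator to
# MemoryAgent and ResearchAgent exactly as in the Quick Start above.
```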
We provide a complete evaluation framework to reproduce the experimental results in the paper.
```bash
# 1. Prepare datasets
mkdir -p data
# Place your datasets in the data/ directory

# 2. Set environment variables
export OPENAI_API_KEY="your_api_key_here"

# 3. Run evaluations
# HotpotQA
bash scripts/eval_hotpotqa.sh --data-path data/hotpotqa.json

# NarrativeQA
bash scripts/eval_narrativeqa.sh --data-path narrativeqa --max-samples 100

# LoCoMo
bash scripts/eval_locomo.sh --data-path data/locomo.json

# RULER
bash scripts/eval_ruler.sh --data-path data/ruler.jsonl --dataset-name niah_single_1

# Or run all evaluations
bash scripts/eval_all.sh
```

You can also invoke the unified evaluation CLI directly:

```bash
python -m eval.run \
    --dataset hotpotqa \
    --data-path data/hotpotqa.json \
    --generator openai \
    --model gpt-4 \
    --retriever dense \
    --max-samples 100
```

For complete evaluation documentation:
- eval/README.md - Evaluation framework guide
- eval/QUICKSTART.md - Quick start guide
| Dataset | Task Type | Metrics | Documentation |
|---|---|---|---|
| HotpotQA | Multi-hop QA | F1 | View |
| NarrativeQA | Narrative QA | F1 | View |
| LoCoMo | Conversation Memory | F1, BLEU-1 | View |
| RULER | Long Context | Accuracy | View |
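As a reference point for the metrics column, token-level F1 on QA benchmarks is conventionally computed SQuAD-style, as in the sketch below. This is the standard formula, not necessarily the exact code in `eval/utils/metrics.py`.

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """Standard SQuAD-style token-level F1 between two answer strings."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Count tokens shared between prediction and reference.
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```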
More detailed documentation is coming soon. Check these resources in the meantime:
- Examples Documentation - Usage examples and tutorials
- Evaluation Guide - Evaluation framework documentation
- Quick Start Guide - Quick start for evaluations
If you find this project useful, please consider citing our paper:
- GitHub Issues: Report bugs or request features
- Email: your-email@example.com
Contributions are welcome! Please feel free to submit issues or pull requests.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
We thank the authors of the following datasets:
- HotpotQA
- NarrativeQA
- LoCoMo
- RULER
This is a research project. Please use it responsibly and ethically.
Made with ❤️ by the GAM Team