
Virtual Memory for LLMs

Paging, Swapping, and Cache Hierarchies for Agentic Context Management

This repository contains the full working code from the blog post "Virtual Memory for LLMs: Implementing Paging, Swapping, and Cache Hierarchies in Agentic Context Management".

What This Does

Treats the LLM context window as the L1 cache in a four-layer memory hierarchy, borrowing virtual-memory concepts from operating systems to handle documents and corpora arbitrarily larger than any context window. The table below lists the tiers; a small lookup sketch follows it.

Layer  Name                   Technology                                  Access Time
L1     Active Context         LLM prompt window (in-process dict)         ~1ms
L2     RAM — Short-Term       Redis KV store with TTL                     ~5ms
L3     Swap Space — Mid-Term  ChromaDB (vector DB, semantic retrieval)    ~50ms
L4     Disk — Long-Term       Local filesystem / object storage           ~200ms+
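
How a lookup walks these tiers can be shown with a minimal read-through sketch. Everything below is illustrative: plain dicts stand in for Redis, ChromaDB, and the filesystem, and the class and method names are assumptions, not this repo's actual API (see memory_layers.py for the real implementations).

```python
# Illustrative only: plain dicts stand in for Redis, ChromaDB, and disk,
# and TieredMemory/get are hypothetical names, not this repo's API.
class TieredMemory:
    def __init__(self):
        self.l1 = {}  # active context (in-process dict)
        self.l2 = {}  # stand-in for the Redis KV store
        self.l3 = {}  # stand-in for the ChromaDB vector store
        self.l4 = {}  # stand-in for filesystem / object storage

    def get(self, page_id: str):
        # Probe tiers in order of access time; promote hits back to L1.
        for name, tier in (("L1", self.l1), ("L2", self.l2),
                           ("L3", self.l3), ("L4", self.l4)):
            if page_id in tier:
                self.l1[page_id] = tier[page_id]  # promote on hit
                return name, tier[page_id]
        return None, None  # miss in every tier: a page fault

mem = TieredMemory()
mem.l3["page-42"] = "chunk text..."
print(mem.get("page-42"))  # ('L3', 'chunk text...'), now also cached in L1
```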

Key mechanisms:

  • Dynamic Context Paging — documents are chunked into 2,000-token pages; only the pages relevant to the current query enter L1
  • Page Fault Handling — the agent signals missing information; the pager loads the page from a lower tier automatically
  • LRU Eviction — the least-recently-used pages are evicted from L1 when capacity is needed (see the paging sketch after this list)
  • Abstractive Compression — evicted pages are compressed roughly 10:1 by gpt-4o-mini before archival (see the compression sketch below)
  • Context Serialization — a compact agent-to-agent hand-off protocol that passes pointers, not payloads (sketched after the file list below)
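
The paging and eviction logic can be sketched with an OrderedDict acting as the L1 page set. This is a hypothetical reduction of what context_pager.py does; the names and the capacity handling are assumptions.

```python
# Hypothetical sketch of L1 paging with LRU eviction; the actual
# context_pager.py may structure this differently.
from collections import OrderedDict

class L1Pager:
    def __init__(self, capacity_pages: int = 4):
        self.capacity = capacity_pages
        self.pages = OrderedDict()  # page_id -> page text, in LRU order

    def swap_in(self, page_id: str, text: str):
        # Evict the least-recently-used page if L1 is at capacity.
        evicted = None
        if page_id not in self.pages and len(self.pages) >= self.capacity:
            evicted = self.pages.popitem(last=False)  # (id, text) to compress
        self.pages[page_id] = text
        self.pages.move_to_end(page_id)  # newly loaded page is most recent
        return evicted

pager = L1Pager(capacity_pages=2)
pager.swap_in("p1", "...")
pager.swap_in("p2", "...")
print(pager.swap_in("p3", "..."))  # ('p1', '...'): p1 was least recently used
```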
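Compression at eviction time might look like the following. The ~10:1 ratio, the 2,000-token pages, and the gpt-4o-mini model come from this README; the prompt wording and the function shape are assumptions, and compressor.py may differ.

```python
# Hedged sketch of abstractive page compression. Ratio, page size, and
# model come from the README; the prompt and function are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def compress_page(text: str, ratio: int = 10) -> str:
    target_tokens = 2000 // ratio  # pages are ~2,000 tokens
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Summarize the passage abstractively, preserving "
                        "key facts, names, and numbers."},
            {"role": "user", "content": text},
        ],
        max_tokens=target_tokens,
    )
    return resp.choices[0].message.content
```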

Files

File                   Description
main.py                Entry point — ingests a large document and runs queries
agent.py               VirtualMemoryAgent with page fault handling and hand-off
context_pager.py       Core paging engine — swap-in, swap-out, LRU eviction
memory_layers.py       L1/L2/L3/L4 layer implementations
page_table.py          Central index mapping page IDs to tiers and metadata
compressor.py          LLM-based abstractive page compression
context_serializer.py  Agent-to-agent context snapshot protocol
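
To illustrate the "pointers, not payloads" idea behind context_serializer.py, here is a minimal snapshot type. All field names are hypothetical; the repo's actual protocol may carry different metadata.

```python
# Illustrative only: a pointer-based hand-off snapshot with hypothetical
# field names; see context_serializer.py for the actual protocol.
import json
from dataclasses import asdict, dataclass, field

@dataclass
class ContextSnapshot:
    task: str
    l1_page_ids: list[str] = field(default_factory=list)  # pointers, not text
    page_table_ref: str = ""  # where the receiver resolves those IDs

    def dumps(self) -> str:
        return json.dumps(asdict(self))

snap = ContextSnapshot(task="summarize section 3",
                       l1_page_ids=["p17", "p18"],
                       page_table_ref="page_table.json")
print(snap.dumps())  # compact: the receiver pages content back in by ID
```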

Setup

1. Create a virtual environment

python -m venv venv
source venv/bin/activate   # Linux/macOS
venv\Scripts\activate      # Windows

2. Install dependencies

pip install -r requirements.txt

3. Start Redis (optional — falls back to an in-memory store if unavailable; see the sketch after step 5)

docker run -d -p 6379:6379 redis:alpine

4. Set your OpenAI API key

export OPENAI_API_KEY="your-key-here"

5. Run

python main.py
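
Step 3 noted that Redis is optional. A minimal sketch of that fallback, assuming the redis-py client; memory_layers.py may implement it differently.

```python
# Sketch of the optional-Redis fallback, assuming redis-py is installed;
# the repo's memory_layers.py may implement this differently.
import redis

def make_l2_store():
    try:
        client = redis.Redis(host="localhost", port=6379)
        client.ping()  # raises ConnectionError if no server is listening
        return client
    except redis.exceptions.ConnectionError:
        return {}  # plain dict as the in-memory fallback
```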

Requirements

  • Python 3.11+
  • OpenAI API key (GPT-4o for the agent, GPT-4o-mini for compression)
  • Redis (optional, recommended for production)

Blog Post

Virtual Memory for LLMs
