A lab project for experimenting with context window management, conversation visualization, and RAG operations for LLM interactions.
```
lab/
├── chatgpt_client.py      # Simple ChatGPT client (the "boss")
├── smithers.py            # Assistant LLM with context management tools
├── context_manager.py     # CRUD operations for context windows
├── visualizer.py          # Conversation graph visualizer
├── benchmarks/            # Long-context benchmarking suite
│   ├── base_benchmark.py      # Base class for benchmarks
│   ├── needle_in_haystack.py  # Needle in haystack benchmark
│   ├── oolong.py              # OOLONG benchmark
│   ├── oolong_pairs.py        # OOLONG PAIRS benchmark
│   ├── codeqa.py              # CodeQA benchmark
│   ├── browsecomp.py          # BrowseComp+ benchmark
│   └── benchmark_runner.py    # Benchmark execution engine
├── examples/              # Example scripts
│   ├── example_usage.py
│   └── demo_visualizer.py
├── tests/                 # Test files
│   └── test_all.py
├── requirements.txt       # Python dependencies
├── .env                   # API keys (create this file)
└── README.md              # This file
```
## Setup

1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. Create a `.env` file:

   ```bash
   echo "OPENAI_API_KEY=your_api_key_here" > .env
   ```
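For illustration, `python-dotenv` (listed in `requirements.txt`) does essentially the following at startup; this stdlib-only stand-in shows the mechanism, not the library's actual implementation:

```python
import os

def load_env_file(path=".env"):
    """Minimal stand-in for python-dotenv's load_dotenv: parse KEY=value lines."""
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    os.environ.setdefault(key.strip(), value.strip())
    except FileNotFoundError:
        pass  # no .env present; fall back to the existing environment

load_env_file()
api_key = os.environ.get("OPENAI_API_KEY")
```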
## chatgpt_client.py

Simple client for direct ChatGPT interactions.

Usage:

```bash
# Command line
python chatgpt_client.py "Hello, how are you?"

# Interactive mode
python chatgpt_client.py
```

## context_manager.py

Core library for managing context windows with full CRUD operations.
Features:
- Create, Read, Update, Delete context entries
- Search context by text
- Compact/summarize context ranges
- Save/load context to JSON
- Statistics and analytics
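The search and persistence features can be pictured in miniature as follows. This is a standalone toy class, not the actual `ContextManager` implementation; the method names mirror the CLI commands:

```python
import json

class MiniContextManager:
    """Toy stand-in for context_manager.ContextManager (illustration only)."""

    def __init__(self):
        self.entries = []  # each entry: {"role": ..., "content": ...}

    def create(self, role, content):
        self.entries.append({"role": role, "content": content})

    def search(self, query):
        # Case-insensitive substring match over entry content.
        return [e for e in self.entries if query.lower() in e["content"].lower()]

    def save(self, filepath):
        with open(filepath, "w") as f:
            json.dump(self.entries, f, indent=2)

    def load(self, filepath):
        with open(filepath) as f:
            self.entries = json.load(f)

cm = MiniContextManager()
cm.create("user", "Tell me about RAG")
cm.create("assistant", "RAG retrieves documents before generating.")
hits = cm.search("rag")  # matches both entries
```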
Usage:

```python
from context_manager import ContextManager

cm = ContextManager()
cm.create("user", "Hello")
cm.create("assistant", "Hi there!")
entry = cm.read(index=0)
cm.update(0, content="Hello world")
cm.delete(index=1)
```

## smithers.py

Assistant LLM that helps manage context windows and performs RAG operations.
Features:
- Chat with automatic context management
- RAG (Retrieval-Augmented Generation)
- Context compaction using AI
- Full CRUD operations via CLI
- Integration with visualizer
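The RAG step amounts to retrieving relevant stored context and prepending it to the prompt before calling the model. A hypothetical sketch of that flow (Smithers' actual retrieval may differ; the function names here are illustrative):

```python
def retrieve(entries, query, k=2):
    """Rank stored context entries by word overlap with the query (toy retriever)."""
    q_words = set(query.lower().split())
    scored = sorted(
        entries,
        key=lambda e: len(q_words & set(e["content"].lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_rag_prompt(entries, query):
    """Prepend the top-k retrieved snippets to the user question."""
    context = "\n".join(e["content"] for e in retrieve(entries, query))
    return f"Context:\n{context}\n\nQuestion: {query}"

entries = [
    {"role": "user", "content": "context windows limit how much text a model sees"},
    {"role": "assistant", "content": "RAG retrieves relevant documents at query time"},
    {"role": "user", "content": "the weather is nice today"},
]
prompt = build_rag_prompt(entries, "how does RAG use context windows")
```

The augmented prompt would then be sent to the chat model in place of the raw question.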
Usage:

```bash
python smithers.py
```

Commands:

- `chat <message>` - Chat with Smithers
- `create <role> <content>` - Add context entry
- `read [index|start:end|role]` - Read context
- `update <index> <field> <value>` - Update entry
- `delete <index|start:end|role>` - Delete entries
- `search <query>` - Search context
- `compact [start:end]` - Compact context
- `stats` - Show context statistics
- `save <filepath>` - Save context to file
- `load <filepath>` - Load context from file
- `rag <query>` - RAG-enhanced chat
- `visualize [output.png] [start:end]` - Visualize conversation graph
- `exit`/`quit` - Exit
Programmatic Usage:

```python
from smithers import Smithers

smithers = Smithers()
response = smithers.chat("What is 2+2?")
cm = smithers.get_context_manager()
```

## visualizer.py

Creates graph visualizations of conversations with node sizes based on word count.
Features:
- Top-down hierarchical layout
- Node size = word count
- Color-coded by role (user/assistant/system)
- Save to PNG/PDF or display interactively
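The word-count sizing and chain layout can be sketched without the plotting libraries. A minimal illustration (the scaling constants are hypothetical, not the values `ConversationVisualizer` uses):

```python
def node_size(content, base=300, per_word=50):
    """Node area grows with the message's word count (hypothetical constants)."""
    return base + per_word * len(content.split())

def conversation_edges(messages):
    """Chain each message to its predecessor for a top-down layout."""
    return [(i - 1, i) for i in range(1, len(messages))]

messages = ["Hi", "Tell me about long context windows", "They cap the model's input."]
sizes = [node_size(m) for m in messages]       # [350, 600, 550]
edges = conversation_edges(messages)           # [(0, 1), (1, 2)]
```

Values like these would typically be passed to networkx/matplotlib drawing calls as node sizes and edge lists.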
Usage:

```bash
# From saved context file
python visualizer.py context.json output.png
```

```python
# Programmatically
from visualizer import ConversationVisualizer
from context_manager import ContextManager

cm = ContextManager()
# ... add context ...
viz = ConversationVisualizer(cm)
viz.visualize(output_file="graph.png")
```

From Smithers:
```bash
python smithers.py
> visualize conversation.png
> visualize output.png 0:10   # Specific range
```

## Quick Start
1. Set up environment:

   ```bash
   pip install -r requirements.txt
   echo "OPENAI_API_KEY=your_key" > .env
   ```

2. Try the simple client:

   ```bash
   python chatgpt_client.py "Hello!"
   ```

3. Use Smithers with context management:

   ```bash
   python smithers.py
   > chat Hello, I'm working on a project
   > chat Tell me about context windows
   > stats
   > visualize conversation.png
   ```

4. Run benchmarks:

   ```bash
   python benchmarks/benchmark_runner.py --benchmark needle_in_haystack --context-lengths 1000 5000
   ```

5. See examples:

   ```bash
   python examples/example_usage.py
   python examples/demo_visualizer.py
   ```
```bash
# 1. Start Smithers
python smithers.py

# 2. Have a conversation
> chat What are context windows?
> chat How do I manage large contexts?
> chat Explain RAG

# 3. Check your context
> stats
> read

# 4. Search for specific topics
> search RAG

# 5. Visualize the conversation
> visualize conversation_graph.png

# 6. Save for later
> save my_conversation.json

# 7. Load and continue
> load my_conversation.json
> chat Continue our discussion
```

## Dependencies

- `openai` - OpenAI API client
- `python-dotenv` - Environment variable management
- `networkx` - Graph creation (for visualizer)
- `matplotlib` - Plotting (for visualizer)
Note: Visualization features are optional. The rest of the system works without networkx and matplotlib installed.
## Benchmarks

Comprehensive benchmarking system for evaluating long-context window performance.
Available Benchmarks:
- Needle in Haystack: Find specific information in very long contexts
- OOLONG: Out-Of-LOng-context Needle - information at various positions
- OOLONG PAIRS: Find and relate pairs of information across long contexts
- CodeQA: Answer questions about code in long contexts
- BrowseComp+: Browse and comprehend information across long documents
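At its core, a needle-in-haystack test case buries one fact in filler text and checks the model's answer for it. A minimal sketch of the idea (the real `NeedleInHaystackBenchmark` generation and scoring are more elaborate; the needle text and sizing here are illustrative):

```python
def generate_test_case(context_length=1000, needle_position="middle"):
    """Bury one 'needle' fact in filler prose of roughly context_length words."""
    needle = "The secret number is 7481."
    filler = "The sky was a pale shade of grey that morning. "
    n_fill = max(context_length // len(filler.split()), 1)
    pos = {"start": 0, "middle": n_fill // 2, "end": n_fill}[needle_position]
    parts = [filler] * n_fill
    parts.insert(pos, needle + " ")
    return {
        "context": "".join(parts),
        "question": "What is the secret number?",
        "expected_answer": "7481",
    }

def evaluate(response, expected_answer):
    """Simplest scoring rule: does the reply contain the expected answer?"""
    return expected_answer in response

case = generate_test_case(context_length=500, needle_position="middle")
```

Accuracy at each context length is then just the fraction of runs where `evaluate` returns `True`.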
Usage:

```bash
# Run all benchmarks
python benchmarks/benchmark_runner.py --benchmark all --context-lengths 1000 5000 10000 20000

# Run specific benchmark
python benchmarks/benchmark_runner.py --benchmark needle_in_haystack --context-lengths 5000 10000 --runs 5

# Save results
python benchmarks/benchmark_runner.py --benchmark all --output results.json
```

Programmatic Usage:
```python
from benchmarks.benchmark_runner import BenchmarkRunner

runner = BenchmarkRunner(model="gpt-4o")
results = runner.run_benchmark(
    "needle_in_haystack",
    context_lengths=[1000, 5000, 10000],
    num_runs=3,
)
runner.save_results(results, "results.json")
```

Individual Benchmark Usage:
```python
from benchmarks.needle_in_haystack import NeedleInHaystackBenchmark

benchmark = NeedleInHaystackBenchmark()
test_case = benchmark.generate_test_case(context_length=5000, needle_position="middle")
# Use test_case["context"] and test_case["question"]
# Evaluate with: benchmark.evaluate(response, test_case["expected_answer"])
```

## Notes

- This is a lab project for experimentation with long-context windows
- Context is stored in memory by default (use `save`/`load` for persistence)
- All tools are designed to be simple and understandable
- The visualizer requires networkx and matplotlib (install separately if needed)
- Benchmarks are designed to test context window limits and retrieval capabilities
## File Overview

- `chatgpt_client.py`: Simple ChatGPT interface (the "boss")
- `smithers.py`: Assistant with context management tools
- `context_manager.py`: Core CRUD operations for context arrays
- `visualizer.py`: Graph visualization of conversations
- `benchmarks/base_benchmark.py`: Base class for all benchmarks
- `benchmarks/needle_in_haystack.py`: Needle in haystack benchmark
- `benchmarks/oolong.py`: OOLONG benchmark
- `benchmarks/oolong_pairs.py`: OOLONG PAIRS benchmark
- `benchmarks/codeqa.py`: CodeQA benchmark
- `benchmarks/browsecomp.py`: BrowseComp+ benchmark
- `benchmarks/benchmark_runner.py`: Benchmark execution engine
- `examples/example_usage.py`: Code examples for all features
- `examples/demo_visualizer.py`: Demo script for visualization
- `tests/test_all.py`: Comprehensive test suite