gwt-context

Global Workspace Theory-inspired context management for LLM agents. MCP server with bounded workspace selection, broadcast-bus subscribers, specialist competition, and multi-hop reasoning.

What it does

LLMs lose information in long contexts — multi-hop reasoning degrades (Sequential-NIAH best: 63.15%), information aggregation suffers (LongBench Pro T6: 57.72%). This project focuses on the architectural parts of GWT that are useful for agent memory: bounded selection, global availability, recurrent activation, and independent post-broadcast processors.

gwt-context implements a concrete selection-broadcast runtime as an MCP server. Specialist processors compete to surface relevant information into a capacity-limited workspace, then a broadcast bus fans the selected content out to independent subscribers that can propose recall, exact resolution, contradiction flags, or follow-up actions. It is not a biological GWT simulator; it is a practical agent-memory architecture with explicit GWT markers.

GWT markers implemented

Marker	Implementation
Global availability	Workspace broadcast — all items visible simultaneously
Functional concurrency	6 specialists score independently
Coordinated selection	CompetitionEngine — single arbitration point
Capacity limitation	Workspace capacity = 7 (Miller's 7±2)
Persistence with controlled update	Items persist until displaced by competition
Goal-modulated arbitration	×1.3 multiplicative boost by goal relevance
Post-broadcast processors	BroadcastBus fan-out with accepted/inhibited proposals

Install

pip install gwt-context

Requirements: Python 3.11+, sentence-transformers (all-MiniLM-L6-v2, downloaded on first run).

Usage

With Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "gwt-context": {
      "command": "python",
      "args": ["-m", "gwt_context"]
    }
  }
}

With any MCP client

python -m gwt_context

Local readiness smoke

Use hash embeddings when you want a fully local startup check without downloading a sentence-transformer model:

GWT_EMBEDDING_PROVIDER=hash GWT_EMBEDDING_MODEL=hash python -m gwt_context.smoke

The same command is available after installation as gwt-context-smoke.

Run a real stdio MCP client smoke against the packaged server entrypoint:

gwt-context-mcp-smoke

Add it to Codex as a local MCP server:

codex mcp add gwt-context \
  --env GWT_EMBEDDING_PROVIDER=hash \
  --env GWT_EMBEDDING_MODEL=hash \
  --env GWT_EMBEDDING_DIM=32 \
  --env GWT_DATA_DIR=/home/netsky/.gwt-context-codex/projects/gwt-context \
  -- python -m gwt_context

New Codex MCP tools are available to new Codex sessions after the config is loaded. Use gwt-global with /home/netsky/.gwt-context-codex/global for cross-project memory. See docs/codex-mcp.md for namespace setup and cleanup.

MCP Tools

Tool	Description
`gwt_store`	Store content in long-term memory (embed + index + buffer)
`gwt_set_goal`	Set active goal — biases competition toward goal-relevant items
`gwt_broadcast`	Run selection-broadcast cycle — returns workspace content
`gwt_compete`	Competition round without broadcast (dry-run)
`gwt_query`	Semantic search over long-term memory, optionally admitted to competition
`gwt_attend`	One-call goal-directed attention pass with semantic, structured, graph, hybrid, or auto planning
`gwt_resolve`	Resolve a question against runtime structured memory without broadcasting
`gwt_collection_query`	Run exact count/filter/top-k/average/sum/distinct/min/max/compare operations over runtime structured memory
`gwt_trace_explain`	Explain the most recent explicit attention trace
`gwt_memory_profile`	Inspect active namespace, persisted counts, embedding settings, and structured read-model state
`gwt_readiness_check`	Return compact runtime readiness checks, counts, namespace, and bus health
`gwt_backup_memory`	Produce a JSONL backup payload with namespace metadata
`gwt_export_memory`	Export persisted memory as JSONL without embeddings
`gwt_import_memory`	Import JSONL memory into the active namespace and re-embed records
`gwt_restore_memory`	Restore JSONL memory in merge or confirmed replace mode
`gwt_compact_working_memory`	Dry-run or confirmed compaction of old working-memory records
`gwt_reset`	Clear runtime, workspace, or confirmed persistent memory with explicit confirmation
`gwt_evict`	Manual eviction from workspace
`gwt_link`	Bidirectional link between items (enables multi-hop chains)
`gwt_inspect`	Observe workspace, buffer, goals, stats

How it works

 ┌─────────────────────────────────────────────────┐
 │                  Long-Term Memory               │
 │            (SQLite + Vector Index)               │
 └──────────────────┬──────────────────────────────┘
                    │ candidates
                    ▼
 ┌──────────────────────────────────────────────────┐
 │              Specialist Processors               │
 │                                                  │
 │  Relevance (0.35)    Recency (0.20)              │
 │  Novelty (0.15)      Frequency (0.10)            │
 │  Structural Linkage (0.10)  Goal Linkage (0.10)  │
 └──────────────────┬───────────────────────────────┘
                    │ scored candidates
                    ▼
 ┌──────────────────────────────────────────────────┐
 │            Competition Engine                    │
 │     weighted scores + goal modulation (×1.3)     │
 │     top-N admitted, losers evicted               │
 └──────────────────┬───────────────────────────────┘
                    │ winners
                    ▼
 ┌──────────────────────────────────────────────────┐
 │          Global Workspace (capacity=7)           │
 │                                                  │
 │    Broadcast → formatted text returned to LLM    │
 └──────────────────────────────────────────────────┘

Multi-hop reasoning

Items can be linked bidirectionally via gwt_link. The GoalLinkageSpecialist weights each link by how relevant the linked item is to the current goal — multi-hop chains are boosted only when they lead toward the goal. The StructuralLinkageSpecialist preserves chains across minor goal shifts.

Goal switching

When the goal changes, GoalLinkageSpecialist re-weights all links by relevance to the new goal. Items linked to now-irrelevant content lose their boost and get evicted, making room for goal-relevant items.

Explicit attention control

gwt_context.application.attention.AttentionController provides a reusable path for deterministic selection: set the goal, resolve an evidence plan, query/admit matching memories, then broadcast. Production planning supports semantic lookup, exact structured collection evidence, relation-graph continuation, hybrid mode, and auto mode. The controller itself depends only on application ports.

SelectionBroadcastCycle publishes each workspace broadcast to a subscriber bus. Structured resolve, semantic recall, relation continuation, contradiction checking, and plan critique subscribers read the same broadcast and return proposals. gwt_attend applies accepted proposals through public ports: follow-up memory queries, deterministic answer resolution, contradiction flags, and follow-up flags are recorded in the trace. Repeated proposals are inhibited across broadcasts.

Conscious items also reactivate their gwt_link targets into the preconscious buffer for the next cycle, so recurrent attention can follow explicit memory links instead of only parsing names from rendered broadcast text.

The strict state/admission rules are documented in docs/gwt-runtime-contracts.md. Start from docs/quickstart.md for local usage and docs/demo-scenarios.md for reproducible demos, and docs/external-subscribers.md for LLM/NLI subscriber adapters.

MCP clients can call gwt_attend(question, keywords?, k?, planner?) for this path without manually sequencing gwt_set_goal, gwt_query(admit=true), and gwt_broadcast. They can also call gwt_resolve or gwt_collection_query when they need an exact runtime answer without a broadcast. The most recent attention trace is available at gwt://attention/last and summarized by gwt_trace_explain. gwt_bus_inspect exposes the latest cycle-level bus result, including subscriber statuses and accepted/inhibited proposal counts.

Architecture

src/gwt_context/
├── domain/           # Pure domain, no I/O
│   ├── models.py     # MemoryItem, Goal, WorkspaceSlot, etc.
│   ├── workspace.py  # GlobalWorkspace (capacity-limited slots)
│   ├── specialists.py # 6 specialist scoring functions
│   ├── competition.py # CompetitionEngine (scoring + eviction)
│   └── broadcast.py  # BroadcastAssembler (workspace → text)
├── application/      # Orchestration
│   ├── broadcast_bus.py # Post-broadcast subscribers and proposal arbitration
│   ├── attention.py  # Explicit attention controller
│   ├── structured.py # Runtime collection and relation evidence
│   ├── cycle.py      # SelectionBroadcastCycle + PreconsciousBuffer
│   ├── ingestion.py  # Content → MemoryItem pipeline
│   └── goal_manager.py
├── infrastructure/   # Storage, embeddings
│   ├── storage.py    # SQLiteMemoryStore
│   ├── vector_index.py # Numpy cosine similarity index
│   ├── embeddings.py # SentenceTransformerEmbedder
│   └── config.py     # GWTConfig
├── mcp/              # MCP interface
│   ├── tools.py      # 12 tool definitions
│   ├── resources.py  # MCP resources
│   └── prompts.py    # System + multi-hop prompts
└── server.py         # FastMCP wiring + entry point

Tests

pip install -e ".[dev]"
pytest tests/unit/ tests/integration/ -q

Tests cover domain logic, storage, vector index, MCP boundaries, and full selection-broadcast cycles.

Benchmarks

Benchmark harness for evaluation against OpenAI-compatible APIs (Qwen, Llama, etc. via vLLM/TGI):

pip install -e ".[dev,bench]"
cp .env.example .env

python -m tests.benchmarks.ruler_multi_hop

python -m tests.benchmarks.longbench_pro

See tests/benchmarks/README.md for the full variable matrix, command examples, and reproducible output behavior. See docs/attention-controller.md for the architecture note behind the controlled/hybrid design. See docs/release-readiness.md for current release gates and Qwen smoke status. See docs/honest-gwt-report.md for the current GWT claim and limitations. See docs/dogfood-report.md for the latest real MCP/Qwen dogfood evidence. See docs/benchmark-report-v0.3.md for the current v0.3 benchmark summary. See docs/mcp-tool-contracts.md for stable MCP response shapes.

Each benchmark runs GWT mode (with tools) and baseline mode (all context in prompt) for comparison. Results are saved as JSON in BENCHMARK_RESULTS_DIR (default tests/benchmarks/results/) using deterministic filenames:

{benchmark}_{model}_{timestamp}_{config_hash}.json

Benchmark modes include prompt-only baseline, model-controlled tools, production generic attend, deterministic controlled, and hybrid mode where GWT selection is deterministic and the model only performs final synthesis.

Analyze failures and runtime metrics with:

python -m tests.benchmarks.analyze_results tests/benchmarks/results

Render trace-heavy results as HTML with:

python -m tests.benchmarks.render_trace tests/benchmarks/results/<result>.json

The HTML report groups bus proposals by subscriber/kind, lists inhibited proposal keys and rationale, and summarizes workspace changes across trace phases.

Run a small local MCP-facing scenario without downloading embedding models:

python examples/mcp_demo.py

Run the post-release usage loop that exercises store, attend, bus inspection, trace explanation, and workspace inspection:

python examples/real_usage_loop.py

Run the external subscriber proof-of-concept:

python examples/external_subscriber_poc.py

Run the deterministic benchmark smoke used by npm test:

npm run benchmark:smoke

Run a tiny model-backed Qwen/OpenAI-compatible smoke while keeping local GWT embeddings deterministic:

GWT_EMBEDDING_PROVIDER=hash GWT_EMBEDDING_MODEL=hash \
python -m tests.benchmarks.ruler_multi_hop \
    --hops 2 --distractors 3 --tasks-per-config 1 --max-tasks 1 \
    --gwt-mode attend

Or use the bounded sanity wrapper:

npm run qwen:sanity -- --run --max-tasks 1

RunPod endpoint

The benchmark entrypoints load .env automatically if it exists. The repository now includes .env.example with the current RunPod-compatible defaults:

BENCHMARK_API_BASE=https://example-openai-compatible-endpoint/v1
BENCHMARK_API_PATH=/v1
BENCHMARK_MODEL=qwen3.6-35b-a3b
BENCHMARK_API_KEY=test
BENCHMARK_TIMEOUT_SECONDS=30
BENCHMARK_MAX_RETRIES=2
BENCHMARK_CONCURRENCY=16
BENCHMARK_RESULTS_DIR=tests/benchmarks/results

.env is ignored by git, while .env.example is tracked so the shared setup stays visible. You can still override everything explicitly on the CLI:

python -m tests.benchmarks.ruler_multi_hop \
    --api-base "$BENCHMARK_API_BASE" \
    --api-path "$BENCHMARK_API_PATH" \
    --model "$BENCHMARK_MODEL" \
    --api-key "$BENCHMARK_API_KEY" \
    --max-tasks 3

Configuration

Environment variables:

Variable	Default	Description
`GWT_WORKSPACE_CAPACITY`	`7`	Max items in workspace
`GWT_BUFFER_SIZE`	`50`	Preconscious buffer size
`GWT_GOAL_MODULATION`	`0.3`	Goal boost strength (0-1)
`GWT_MIN_ACTIVATION`	`0.2`	Ignition threshold for admitting new workspace candidates
`GWT_EMBEDDING_PROVIDER`	`sentence-transformer`	`sentence-transformer` or deterministic local `hash`
`GWT_EMBEDDING_MODEL`	`all-MiniLM-L6-v2`	Sentence transformer model
`GWT_EMBEDDING_DIM`	`384`	Vector dimension for storage/search
`GWT_DATA_DIR`	`~/.gwt-context`	Storage directory
`GWT_DB_PATH`	unset	Optional exact SQLite DB path override
`GWT_VECTOR_INDEX_PATH`	unset	Optional exact vector index path override
`GWT_MAX_BROADCAST_TOKENS`	`4000`	Max tokens per broadcast
`GWT_MAX_VECTOR_ELEMENTS`	`100000`	Max vector index capacity setting

References

Baars, B.J. (1988). A Cognitive Theory of Consciousness
Hsieh et al. (2024). RULER: What's the Real Context Size of Your Long-Context Language Models?
Dehaene & Naccache (2001). Towards a cognitive neuroscience of consciousness

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 67 Commits
.github/workflows		.github/workflows
docs		docs
examples		examples
research		research
scripts		scripts
src/gwt_context		src/gwt_context
tests		tests
.env.example		.env.example
.gitignore		.gitignore
AGENTS.md		AGENTS.md
ARCHITECTURE.md		ARCHITECTURE.md
CHANGELOG.md		CHANGELOG.md
README.md		README.md
ROADMAP.md		ROADMAP.md
package-lock.json		package-lock.json
package.json		package.json
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

gwt-context

What it does

GWT markers implemented

Install

Usage

With Claude Desktop

With any MCP client

Local readiness smoke

MCP Tools

How it works

Multi-hop reasoning

Goal switching

Explicit attention control

Architecture

Tests

Benchmarks

RunPod endpoint

Configuration

References

License

About

Uh oh!

Releases 3

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

gwt-context

What it does

GWT markers implemented

Install

Usage

With Claude Desktop

With any MCP client

Local readiness smoke

MCP Tools

How it works

Multi-hop reasoning

Goal switching

Explicit attention control

Architecture

Tests

Benchmarks

RunPod endpoint

Configuration

References

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages