Gen3 Admin Contextual Memory

Contextual Memory

Start Here

Open Contextual Memory in the Control Panel sidebar (Super Admin only).
Enable the platform layer, select embedding and summary models, and set TTL and auto-inject options as needed.
Click Save Contextual Memory settings before expecting tenant chat or GT Helper to index or recall prior messages.
After changing the embedding model, use Force re-embed all to rebuild stored embeddings and summaries (review the cost estimate in the confirmation dialog).
Monitor pipeline jobs on this page and in tenant Observability → Contextual Memory; configure per-agent defaults in Building Agents.

Control Panel Contextual Memory configuration

Why this matters

Contextual Memory is GT AI OS’s three-layer recall system for chat and helper threads. The Control Panel page controls the platform layer only: indexing, embedding, summarization, optional auto-inject into helper context, and deployment-wide pipeline operations.

When the platform layer is disabled or has no embedding model selected, the brain control in GT Chat is hidden and memory tools are unavailable, even if an agent default would otherwise enable recall. Tenant operators still choose per-conversation scope in chat; agents set defaults for new conversations in Building Agents.

Details

Three layers of memory

Layer	Where configured	What it controls
1. Platform	Control Panel Contextual Memory (this page)	Indexing, embeddings, summaries, TTL, auto-inject, force re-embed
2. Agent default	Tenant Building Agents → agent configuration	Default recall mode for new conversations (`this_conversation`, `this_agent`, `all_agents`)
3. Per-conversation	GT Chat brain control in the composer	User toggles memory on/off and recall scope for that thread

The platform layer must be enabled with a valid embedding model before layers 2 and 3 affect runtime behavior.
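The gating between the three layers can be sketched as a small resolver. This is an illustrative model only, not the product's implementation: the `PlatformSettings` class and the `effective_recall` function are hypothetical names, and the rule that a per-conversation choice takes precedence over the agent default is inferred from the table above. The three mode identifiers are the ones listed for the agent default.

```python
from dataclasses import dataclass
from typing import Optional

# Recall modes as listed for the agent default layer.
RECALL_MODES = ("this_conversation", "this_agent", "all_agents")


@dataclass
class PlatformSettings:
    """Hypothetical stand-in for the platform layer (layer 1)."""
    enabled: bool
    embedding_model: Optional[str]


def effective_recall(
    platform: PlatformSettings,
    agent_default: str,
    conversation_scope: Optional[str] = None,
    memory_on: bool = True,
) -> Optional[str]:
    """Return the recall scope in effect, or None when memory is unavailable."""
    # Layer 1: without an enabled platform layer and an embedding model,
    # layers 2 and 3 have no effect.
    if not platform.enabled or not platform.embedding_model:
        return None
    # Layer 3: the user can switch memory off for this thread.
    if not memory_on:
        return None
    # Layer 3 scope (if chosen) overrides the layer 2 agent default.
    mode = conversation_scope or agent_default
    if mode not in RECALL_MODES:
        raise ValueError(f"unknown recall mode: {mode!r}")
    return mode
```

For example, with the platform layer disabled, `effective_recall(PlatformSettings(False, None), "all_agents")` returns `None` regardless of the agent default.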

Configuration

Setting	Purpose
Enable Contextual Memory	Master switch for tenant indexing and recall tooling
Embedding model	Model used to embed chat and helper messages (required when enabled)
Summary model	Chat-capable model for rolling conversation summaries
Message embedding TTL (days)	Retention for message-level embeddings
Summary TTL (days)	Retention for generated summaries
Auto-inject into helper context	Prepend relevant memory excerpts into GT Helper / CTP Helper inference
Auto-inject max runes	Upper bound on injected memory text per helper turn

Deployment-wide default chat and embedding models for non-memory flows remain on Models → Default Models. This page uses dedicated memory model IDs so you can tune recall without changing general chat defaults.

Saving is blocked when Contextual Memory is enabled but no embedding model is selected.
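As a rough sketch of the save rule, the settings on this page can be modelled as a single record with one validation check. The field names and default values below are hypothetical placeholders chosen for illustration; they are not the product's actual schema or defaults.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemorySettings:
    """Hypothetical mirror of the settings table; defaults are placeholders."""
    enabled: bool = False
    embedding_model: Optional[str] = None
    summary_model: Optional[str] = None
    message_embedding_ttl_days: int = 30   # placeholder value
    summary_ttl_days: int = 90             # placeholder value
    auto_inject: bool = False
    auto_inject_max_runes: int = 2000      # placeholder value


def validate(settings: MemorySettings) -> List[str]:
    """Return a list of problems that would block saving (empty means OK)."""
    errors: List[str] = []
    # The one rule stated on this page: enabled requires an embedding model.
    if settings.enabled and not settings.embedding_model:
        errors.append("Select an embedding model before enabling Contextual Memory.")
    return errors
```

A disabled configuration with no embedding model passes this check, which matches the behavior described above: the embedding model is required only when Contextual Memory is enabled.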

Force re-embed all

Use Force re-embed all when you:

Change the memory embedding model
Need to recover from a bad backfill or widespread embedding failures
Migrate after a major model catalog change

The workflow:

Click Force re-embed all (enabled only when the platform layer is on and an embedding model is set).
Review the estimate: embeddable characters, estimated tokens, and estimated USD cost.
Confirm in the dialog to enqueue a full backfill.

The operation deletes existing message embeddings and summaries, then enqueues workers to rebuild tenant chat and helper threads. Expect elevated embedding traffic until the pipeline drains.
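To sanity-check the figures in the confirmation dialog, the chain from characters to tokens to cost can be approximated by hand. The sketch below is not how the product computes its estimate: the four-characters-per-token ratio is a common rule of thumb, and the price per million tokens is a made-up placeholder that you would replace with your embedding provider's rate.

```python
import math


def estimate_reembed(
    embeddable_chars: int,
    chars_per_token: float = 4.0,           # assumption: rough heuristic
    usd_per_million_tokens: float = 0.02,   # assumption: placeholder price
) -> dict:
    """Approximate token count and USD cost for re-embedding a corpus."""
    if embeddable_chars < 0 or chars_per_token <= 0:
        raise ValueError("character count must be >= 0 and ratio must be > 0")
    tokens = math.ceil(embeddable_chars / chars_per_token)
    usd = tokens / 1_000_000 * usd_per_million_tokens
    return {
        "embeddable_chars": embeddable_chars,
        "estimated_tokens": tokens,
        "estimated_usd": usd,
    }


if __name__ == "__main__":
    print(estimate_reembed(4_000_000))
```

Under these assumed parameters, 4,000,000 embeddable characters come to about 1,000,000 tokens. Summaries are rebuilt as well, so summary-model usage would add to the embedding cost sketched here.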

Pipeline status (Control Panel)

When the platform layer is enabled or recent pipeline activity exists, the page shows Contextual Memory pipeline status for the Control Panel helper stream:

Job counts: pending, running, succeeded, failed
Message embeddings and summaries: active, expired, total
Jobs by kind table
Recent failures with error text

Use Refresh status after a re-embed or when investigating stuck jobs.
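When reading the job counts after a re-embed, two simple checks are usually enough: whether the queue has drained, and what share of finished jobs failed. The helper below is a hypothetical illustration that takes the four counts as a plain dictionary; the page does not describe an API for fetching them, so none is assumed here.

```python
from typing import Mapping


def pipeline_drained(counts: Mapping[str, int]) -> bool:
    """True when no jobs are pending or running."""
    return counts.get("pending", 0) == 0 and counts.get("running", 0) == 0


def failure_rate(counts: Mapping[str, int]) -> float:
    """Share of finished jobs (succeeded + failed) that failed."""
    finished = counts.get("succeeded", 0) + counts.get("failed", 0)
    if finished == 0:
        return 0.0
    return counts.get("failed", 0) / finished


if __name__ == "__main__":
    # Counts copied by hand from the status panel (example values).
    snapshot = {"pending": 0, "running": 0, "succeeded": 95, "failed": 5}
    print(pipeline_drained(snapshot), failure_rate(snapshot))
```

A drained pipeline with a non-trivial failure rate points to the Recent failures list rather than to a stuck queue.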

Tenant observability

Tenant roles review memory pipeline metrics under Management → Observability, on the Contextual Memory tab:

Scope follows tenant role (owner-wide, managed-group, or personal)
Objective metrics: job counts, stored artifacts, usage signals, recent failures
Complements billing breakdowns on the Billing tab when financial controls expose memory spend

Control Panel pipeline status focuses on the operator helper stream; tenant observability covers the deployment scope your role can see.

Agent and chat behavior

In Building Agents, set default memory mode for new chats:

This conversation only — no cross-thread recall
Just conversations with this agent — agent-scoped memory search
All my conversations — user-wide recall when platform layer allows

In GT Chat, users override per conversation with the brain control when the platform layer is enabled: toggle memory on/off and choose recall scope for that thread.

Agents may invoke memory search tools when scope and platform settings allow. Activity labels such as Searching Contextual Memory appear in the chat timeline during recall.

Prerequisites

Before enabling Contextual Memory in production:

Configure a reachable embedding provider on Models (for example, Ollama with embeddinggemma, as described in Ollama host setup).
Set the deployment default embedding model if datasets also depend on embeddings.
Select memory-specific embedding and summary models on this page.
Plan a maintenance window before Force re-embed all on large tenants.

Troubleshooting

Symptom	What to check
Brain control missing in GT Chat	Platform layer disabled or embedding model unset on this page
Memory tools fail in agent chat	Agent default mode, per-conversation scope, and platform enablement
High failed job count	Recent failures table on this page or tenant Contextual Memory observability tab
Stale recall after model change	Run Force re-embed all and monitor pipeline until succeeded counts stabilize
Helper auto-inject too large	Lower Auto-inject max runes or disable auto-inject
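For the auto-inject row above, it can help to see what a rune limit means in practice. The sketch assumes that "runes" refers to Unicode code points (the sense used in Go), so the cap counts characters rather than bytes or tokens; the page does not state this explicitly, and `cap_runes` is a hypothetical helper, not a product function.

```python
def cap_runes(text: str, max_runes: int) -> str:
    """Truncate text to at most max_runes Unicode code points."""
    if max_runes < 0:
        raise ValueError("max_runes must be >= 0")
    # Python strings are indexed by code point, so slicing never splits
    # a multi-byte character.
    return text[:max_runes]


if __name__ == "__main__":
    excerpt = "日本語テキスト and some English"
    capped = cap_runes(excerpt, 3)
    # The byte length differs from the rune count for non-ASCII text.
    print(capped, len(capped), len(capped.encode("utf-8")))
```

Because the limit counts code points, the same setting admits more bytes (and often more tokens) for non-ASCII text than for plain ASCII, which is worth keeping in mind when choosing a value.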

Related pages

Models
Ollama host setup
Observability (tenant Contextual Memory tab)
Building Agents
GT Helper Settings
Financial Controls
