Commit 4793075 (#7733 WIP)
1 parent bdacaf1 commit 4793075

1 file changed: learn/blog/context-engineering-revolution.md
Lines changed: 284 additions & 20 deletions

@@ -141,19 +141,19 @@ Here's a simplified view of how this works:

```javascript
function initializeToolMapping() {
    const openApiDocument = yaml.load(fs.readFileSync(openApiFilePath, 'utf8'));

    for (const pathItem of Object.values(openApiDocument.paths)) {
        for (const operation of Object.values(pathItem)) {
            if (operation.operationId) {
                // Build Zod schema from OpenAPI definition
                const inputZodSchema = buildZodSchema(openApiDocument, operation);

                // Convert to JSON Schema for MCP clients
                const inputJsonSchema = zodToJsonSchema(inputZodSchema, {
                    target: 'openApi3',
                    $refStrategy: 'none'
                });

                // Store tool definition
                toolMapping[operation.operationId] = {
                    name: operation.operationId,
```
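The hunk above ends mid-definition, but the shape it builds is straightforward: a dictionary keyed by `operationId`, which the server can use to validate and dispatch incoming tool calls. A minimal sketch of that consumption side, using a made-up `listComponents` tool and handler since the real entries are generated from the OpenAPI document:

```javascript
// Illustrative sketch of the mapping's shape and how an MCP server can
// dispatch on it. The tool name, schema, and handler are made up for the
// example; real entries come from the OpenAPI document.
const toolMapping = {};

// What one pass of the loop above effectively stores:
toolMapping['listComponents'] = {
    name: 'listComponents',
    // inputJsonSchema would hold the converted Zod schema
    inputJsonSchema: {type: 'object', properties: {}},
    handler: async () => ['Button', 'Grid', 'TreeList']
};

async function callTool(name, args = {}) {
    const tool = toolMapping[name];
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return tool.handler(args);
}
```

The lookup-by-`operationId` design means adding a new tool is purely declarative: extend the OpenAPI document and the mapping picks it up at startup.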

@@ -186,14 +186,14 @@ Here's the startup flow from `DatabaseService.mjs`:

```javascript
async initAsync() {
    await super.initAsync();

    // Wait for ChromaDB to be available
    await DatabaseLifecycleService.ready();

    logger.info('[Startup] Checking knowledge base status...');
    const knowledgeBasePath = aiConfig.dataPath;
    const kbExists = await fs.pathExists(knowledgeBasePath);

    try {
        if (!kbExists) {
            logger.info('[Startup] Knowledge base file not found. Starting full synchronization...');
```

@@ -224,11 +224,11 @@ Here's how `ensureHealthy()` works:

```javascript
async ensureHealthy() {
    const health = await this.healthcheck();

    if (health.status !== 'healthy') {
        const details = health.details.join('\n - ');
        const statusMsg = health.status === 'unhealthy'
            ? 'not available'
            : 'not fully operational';
        throw new Error(`Knowledge Base is ${statusMsg}:\n - ${details}`);
    }
```
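A caller-side sketch of how this gate might be used: every tool handler awaits `ensureHealthy()` first, so an unhealthy Knowledge Base fails fast with a descriptive error instead of returning stale results. The wrapper and the stub service below are illustrative, not the actual server code:

```javascript
// Illustrative wrapper (names are assumptions): gate every tool call on
// ensureHealthy(), which throws with details when the KB is unhealthy.
async function handleToolCall(service, toolName, args) {
    await service.ensureHealthy();
    return service[toolName](args);
}

// Minimal stub standing in for the real Knowledge Base service
const stubService = {
    async ensureHealthy() { /* healthy: resolves without throwing */ },
    async query({text}) { return [`result for: ${text}`]; }
};
```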

@@ -270,28 +270,28 @@ const queryWords = queryLower

```javascript
results.metadatas[0].forEach((metadata, index) => {
    let score = (results.metadatas[0].length - index) * queryScoreWeights.baseIncrement;

    queryWords.forEach(queryWord => {
        const keyword = queryWord;
        const keywordSingular = keyword.endsWith('s')
            ? keyword.slice(0, -1)
            : keyword;

        if (keywordSingular.length > 2) {
            // Path matching - highest weight
            if (sourcePathLower.includes(`/${keywordSingular}/`))
                score += queryScoreWeights.sourcePathMatch; // +40

            // Filename matching
            if (fileName.includes(keywordSingular))
                score += queryScoreWeights.fileNameMatch; // +30

            // Class name matching
            if (metadata.className?.toLowerCase().includes(keywordSingular))
                score += queryScoreWeights.classNameMatch; // +20

            // Content type bonuses
            if (metadata.type === 'guide')
                score += queryScoreWeights.guideMatch; // +50
        }
    });
```
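The scoring pass in this hunk can be exercised in isolation. Below is a minimal, self-contained sketch: the named weights mirror the inline comments (+40/+30/+20/+50), while `baseIncrement` is assumed to be 1 and the fixture object is made up for the example:

```javascript
// Self-contained version of the keyword re-ranking above. Weight values
// follow the inline comments; baseIncrement = 1 is an assumption.
const queryScoreWeights = {
    baseIncrement: 1,
    sourcePathMatch: 40,
    fileNameMatch: 30,
    classNameMatch: 20,
    guideMatch: 50
};

function scoreResult({sourcePath, fileName, className, type}, queryWords, rank, total) {
    // Earlier vector-search ranks start with a higher base score
    let score = (total - rank) * queryScoreWeights.baseIncrement;

    const sourcePathLower = sourcePath.toLowerCase();

    for (const word of queryWords) {
        // Naive singularization: "buttons" also matches "button"
        const singular = word.endsWith('s') ? word.slice(0, -1) : word;
        if (singular.length <= 2) continue;

        if (sourcePathLower.includes(`/${singular}/`)) score += queryScoreWeights.sourcePathMatch;
        if (fileName.includes(singular)) score += queryScoreWeights.fileNameMatch;
        if (className?.toLowerCase().includes(singular)) score += queryScoreWeights.classNameMatch;
        if (type === 'guide') score += queryScoreWeights.guideMatch;
    }

    return score;
}

const score = scoreResult(
    {sourcePath: '/src/button/Base.mjs', fileName: 'base.mjs', className: 'Neo.button.Base', type: 'code'},
    ['button'], 0, 10
);
// base 10 + path 40 + class name 20 = 70 ('base.mjs' does not contain 'button')
```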

@@ -451,13 +451,277 @@ If the Knowledge Base is the AI's understanding of the *project*, the **Memory C

Every interaction—every prompt, thought process, and response—is captured and stored as a "memory." This is not just a chat log; it's a structured, queryable history of the agent's own work. When a new session begins, the Memory Core automatically analyzes and summarizes all previous, unsummarized sessions. This creates a high-level "recap" of past work, allowing the agent to remember what it did, what decisions it made, and why.

#### The Save-Then-Respond Protocol

At the heart of the Memory Core is the **transactional memory protocol**. Every agent interaction follows a strict three-part structure defined in the OpenAPI spec:

```yaml
AddMemoryRequest:
  type: object
  required:
    - prompt
    - thought
    - response
  properties:
    prompt:
      type: string
      description: The user's verbatim prompt to the agent
    thought:
      type: string
      description: The agent's internal reasoning process
    response:
      type: string
      description: The agent's final, user-facing response
    sessionId:
      type: string
      description: Session ID for grouping (auto-generated if not provided)
```

This isn't just logging—it's a **mandatory save-then-respond loop**. The agent protocol requires that before delivering any response to the user, the agent must call `add_memory` with its complete reasoning chain. This creates an honest, unfiltered record of the agent's thought process.
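From the agent's side, the loop can be sketched in a few lines. In this self-contained example, `callTool` stands in for a real MCP client call and writes to an in-memory array; the function names are illustrative:

```javascript
// Sketch of the save-then-respond loop. callTool stands in for an MCP
// client call; here it writes to an in-memory array for demonstration.
const memoryStore = [];

async function callTool(name, args) {
    if (name === 'add_memory') {
        memoryStore.push({...args, timestamp: new Date().toISOString()});
        return {saved: true};
    }
    throw new Error(`Unknown tool: ${name}`);
}

async function respond({prompt, thought, response, sessionId}) {
    // Transactional: persist the complete turn first...
    await callTool('add_memory', {prompt, thought, response, sessionId});
    // ...and only then deliver the answer to the user
    return response;
}
```

Because the save happens before the response is returned, a crash mid-turn can lose the answer but never the record of the reasoning that produced it.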

Here's how it works in `MemoryService.mjs`:

```javascript
async addMemory({ prompt, response, thought, sessionId }) {
    const collection = await ChromaManager.getMemoryCollection();
    const combinedText = `User Prompt: ${prompt}\nAgent Thought: ${thought}\nAgent Response: ${response}`;
    const timestamp = new Date().toISOString();
    const memoryId = `mem_${timestamp}`;

    // Generate semantic embedding for the entire interaction
    const embedding = await TextEmbeddingService.embedText(combinedText);

    await collection.add({
        ids: [memoryId],
        embeddings: [embedding],
        metadatas: [{
            prompt,
            response,
            thought,
            sessionId,
            timestamp,
            type: 'agent-interaction'
        }],
        documents: [combinedText]
    });

    return { id: memoryId, sessionId, timestamp, message: "Memory successfully added" };
}
```

The key innovation here is that we embed the **entire interaction**—prompt, thought, and response—as a single vector. This means when the agent searches its memory later, it's searching not just for what it said, but for *why* it said it and what problem it was solving.

#### Autonomous Session Summarization

The real magic happens at startup. Just like the Knowledge Base server, the Memory Core is **self-maintaining**. When the server starts, it automatically discovers and summarizes any unsummarized sessions from previous work.

From `SessionService.mjs`:

```javascript
async initAsync() {
    await super.initAsync();
    await DatabaseLifecycleService.ready();

    // Initialize collections
    this.memoryCollection = await ChromaManager.getMemoryCollection();
    this.sessionsCollection = await ChromaManager.getSummaryCollection();

    // Skip if GEMINI_API_KEY is missing
    if (!this.model) return;

    logger.info('[Startup] Checking for unsummarized sessions...');

    try {
        const result = await this.summarizeSessions({});

        if (result.processed > 0) {
            logger.info(`✅ [Startup] Summarized ${result.processed} session(s):`);
            result.sessions.forEach(session => {
                logger.info(`   - ${session.title} (${session.memoryCount} memories)`);
            });
        }
    } catch (error) {
        logger.warn('⚠️ [Startup] Session summarization failed:', error.message);
    }
}
```
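For `summarizeSessions({})` to work, the service needs to know which sessions still lack a summary. One plausible, self-contained way to compute that set, given that summaries are stored under `summary_<sessionId>` ids; the helper name and inputs are illustrative, and the real `SessionService` may do this differently:

```javascript
// Illustrative helper: compare the sessionIds present in the memory
// collection against the ids already in the summary collection.
function findUnsummarizedSessions(memoryMetadatas, summaryIds) {
    const summarized = new Set(summaryIds.map(id => id.replace(/^summary_/, '')));
    const pending = new Set();

    for (const {sessionId} of memoryMetadatas) {
        if (!summarized.has(sessionId)) pending.add(sessionId);
    }

    return [...pending];
}
```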

The summarization process uses **Gemini 2.5 Flash** to analyze the entire session and extract structured metadata:

```javascript
async summarizeSession(sessionId) {
    const memories = await this.memoryCollection.get({
        where: {sessionId},
        include: ['documents', 'metadatas']
    });

    if (memories.ids.length === 0) return null;

    // Aggregate all memories from the session
    const aggregatedContent = memories.documents.join('\n\n---\n\n');

    const summaryPrompt = `
Analyze the following development session and provide a structured summary in JSON format:

- "summary": A detailed summary of the session
- "title": A concise, descriptive title (max 10 words)
- "category": One of: 'bugfix', 'feature', 'refactoring', 'documentation', 'new-app', 'analysis', 'other'
- "quality": Score 0-100 rating the session's flow and focus
- "productivity": Score 0-100 indicating if primary goals were achieved
- "impact": Score 0-100 estimating the significance of changes
- "complexity": Score 0-100 rating the task's complexity
- "technologies": Array of key technologies involved

${aggregatedContent}
`;

    const result = await this.model.generateContent(summaryPrompt);
    const summaryData = JSON.parse(result.response.text());

    // Embed the summary for semantic search
    const embeddingResult = await this.embeddingModel.embedContent(summaryData.summary);

    await this.sessionsCollection.upsert({
        ids: [`summary_${sessionId}`],
        embeddings: [embeddingResult.embedding.values],
        metadatas: [{
            sessionId,
            timestamp: new Date().toISOString(),
            memoryCount: memories.ids.length,
            ...summaryData
        }],
        documents: [summaryData.summary]
    });

    return { sessionId, title: summaryData.title, memoryCount: memories.ids.length };
}
```

#### The Two-Stage Query Protocol

The Memory Core implements a sophisticated **two-stage query strategy** for recalling past work:

**Stage 1: Query Summaries** (Fast)
When you need high-level context about past work, query the summary collection first:

```javascript
async querySummaries({ query, nResults, category }) {
    const collection = await ChromaManager.getSummaryCollection();
    const embedding = await TextEmbeddingService.embedText(query);

    const queryArgs = {
        queryEmbeddings: [embedding],
        nResults,
        // 'distances' must be requested so relevance scores can be derived
        include: ['metadatas', 'documents', 'distances']
    };

    if (category) {
        queryArgs.where = { category };
    }

    const searchResult = await collection.query(queryArgs);

    // ChromaDB nests results per query embedding; unwrap the first set
    const ids = searchResult.ids[0] ?? [];
    const distances = searchResult.distances[0] ?? [];
    const metadatas = searchResult.metadatas[0] ?? [];
    const documents = searchResult.documents[0] ?? [];

    // Calculate relevance scores from vector distances
    const summaries = ids.map((id, index) => {
        const metadata = metadatas[index];
        const distance = Number(distances[index] ?? 0);
        const relevanceScore = Number((1 / (1 + distance)).toFixed(6));

        return {
            id,
            sessionId: metadata.sessionId,
            title: metadata.title,
            summary: documents[index],
            category: metadata.category,
            quality: Number(metadata.quality),
            productivity: Number(metadata.productivity),
            impact: Number(metadata.impact),
            complexity: Number(metadata.complexity),
            technologies: metadata.technologies.split(','),
            distance,
            relevanceScore
        };
    });

    return { query, count: summaries.length, results: summaries };
}
```
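The distance-to-relevance mapping buried in that method is worth isolating: `1 / (1 + distance)` turns an unbounded vector distance into a bounded score, which is easier to threshold and sort on:

```javascript
// The distance-to-relevance mapping used above, in isolation. A vector
// distance of 0 (an exact match) maps to 1.0, and relevance decays
// smoothly toward 0 as distance grows.
const toRelevance = distance => Number((1 / (1 + distance)).toFixed(6));

// toRelevance(0) → 1
// toRelevance(1) → 0.5
// toRelevance(3) → 0.25
```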

**Stage 2: Query Raw Memories** (Deep)
Once you've identified relevant sessions, drill down into the raw interaction data:

```javascript
async queryMemories({ query, nResults, sessionId }) {
    const collection = await ChromaManager.getMemoryCollection();
    const embedding = await TextEmbeddingService.embedText(query);

    const queryArgs = {
        queryEmbeddings: [embedding],
        // 'distances' must be requested so relevance scores can be derived
        nResults,
        include: ['metadatas', 'distances']
    };

    // Optional: Filter to specific session
    if (sessionId) {
        queryArgs.where = { sessionId };
    }

    const searchResult = await collection.query(queryArgs);

    // ChromaDB nests results per query embedding; unwrap the first set
    const memories = searchResult.metadatas[0] ?? [];
    const distances = searchResult.distances[0] ?? [];

    return {
        query,
        count: memories.length,
        results: memories.map((memory, index) => ({
            ...memory,
            distance: distances[index],
            relevanceScore: 1 / (1 + distances[index])
        }))
    };
}
```

This two-stage approach is powerful because:

1. **Summaries are fast** - Pre-processed, high-level overviews for quick context
2. **Memories are detailed** - Full reasoning chains for deep investigation
3. **Categories enable filtering** - Find all "refactoring" or "bugfix" sessions instantly
4. **Quality metrics enable sorting** - Prioritize high-productivity sessions
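Put together, a recall looks like summaries first, then a drill-down into only the best-matching session. The sketch below takes the two query functions as parameters so the flow runs without a live ChromaDB; in the real server they would be `querySummaries` and `queryMemories`:

```javascript
// End-to-end sketch of two-stage recall with injectable query functions.
async function recall(query, querySummaries, queryMemories) {
    // Stage 1: cheap, high-level pass over session summaries
    const {results: summaries} = await querySummaries({query, nResults: 3});
    if (summaries.length === 0) return {summaries: [], memories: []};

    // Stage 2: drill into the raw turns of the best-matching session only
    const {sessionId} = summaries[0];
    const {results: memories} = await queryMemories({query, nResults: 5, sessionId});

    return {summaries, memories};
}
```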

#### Structured Session Metadata

The quality metrics generated by the summarization process provide valuable insights:

```javascript
{
    "quality": 85,          // Was the session focused and productive?
    "productivity": 90,     // Were the goals achieved?
    "impact": 75,           // How significant were the changes?
    "complexity": 60,       // How difficult was the task?
    "category": "feature",  // What type of work was this?
    "technologies": ["neo.mjs", "chromadb", "nodejs"]
}
```

These aren't just numbers—they enable **performance analysis over time**. The agent (and we) can ask:

- "Show me all high-complexity sessions where productivity was low" (areas for improvement)
- "What features did I build with impact > 80?" (highlight reel)
- "Which refactoring sessions had quality < 50?" (sessions that went off-track)
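Each of those questions reduces to a simple filter over the summarized sessions. A sketch with made-up fixture data, run client-side over the kind of objects `querySummaries` returns:

```javascript
// Made-up session fixtures shaped like querySummaries results.
const sessions = [
    {title: 'Fix router bug', category: 'bugfix', complexity: 80, productivity: 40},
    {title: 'Add grid sorting', category: 'feature', complexity: 55, productivity: 90, impact: 85},
    {title: 'Refactor VDOM worker', category: 'refactoring', complexity: 85, productivity: 35}
];

// "High-complexity sessions where productivity was low"
const needsReview = sessions.filter(s => s.complexity > 70 && s.productivity < 50);

// "What features did I build with impact > 80?"
const highlights = sessions.filter(s => s.category === 'feature' && s.impact > 80);
```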

This capability is critical for several reasons:

1. **Learning & Self-Correction:** By querying its own history, the agent can identify patterns in its work, recall past solutions to similar problems, and avoid repeating mistakes. It can ask itself, "How did I solve that bug last week?" and get a concrete answer from its own experience.
2. **Contextual Continuity:** An agent with memory can maintain context across days or even weeks. It can pick up a complex refactoring task exactly where it left off, without needing to be re-briefed on the entire history.
3. **Performance Analysis:** The session summaries include metrics on quality, productivity, and complexity. This allows us (and the agent itself) to analyze its performance over time, identifying areas for improvement in its own problem-solving strategies.
4. **Transactional Integrity:** The protocol for saving memories is transactional and mandatory. The agent *must* save a consolidated record of its entire turn (prompt, thought, response) before delivering its final answer. This "save-then-respond" loop, enforced by the `add_memory` tool, guarantees that no experience is ever lost, creating a rich and honest record of the entire problem-solving process.

#### The Neo.mjs Backbone

Like all three MCP servers, the Memory Core is built using the **official MCP SDK** for protocol compliance, but its internal architecture is pure **Neo.mjs**. Every service—`MemoryService`, `SessionService`, `SummaryService`, `HealthService`—is a Neo.mjs singleton that extends `Neo.core.Base`.

This demonstrates a key design principle: **Neo.mjs isn't just for browsers**. The same class system that powers complex frontend applications also provides:

- **Singleton pattern** for service management
- **Async initialization** (`initAsync()`) for startup sequences
- **Observable pattern** for event-driven architecture
- **Configuration management** via `static config`
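To make the pattern concrete, here is a generic, framework-free illustration of the singleton-plus-`initAsync()` shape these services share. This is not actual Neo.mjs source (the real services extend `Neo.core.Base` and are registered through the framework's class system); it only mirrors the structure described above:

```javascript
// Generic illustration of the pattern, NOT actual Neo.mjs code.
class ServiceBase {
    async initAsync() {
        // subclasses extend this for their own startup sequences
        this.isReady = true;
    }
}

class HealthService extends ServiceBase {
    static config = {
        className: 'AI.service.HealthService', // illustrative name
        singleton: true
    };

    async initAsync() {
        await super.initAsync();
        this.status = 'healthy';
    }
}

// A module-level instance gives singleton semantics on import
const healthService = new HealthService();
```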

The Memory Core is the foundation for an agent that doesn't just execute tasks, but grows, learns, and improves with every interaction. It's the key to building a partner that truly understands the long-term narrative of the project.

## The GitHub Workflow Server: Closing the Loop
