feat: port persistent memory, BM25 Okapi, and time awareness to Go#4
feat: port persistent memory, BM25 Okapi, and time awareness to Go#4cybersecua merged 1 commit intomainfrom
Conversation
Implements three core intelligence features ported from the Python
adaptive_agent framework into the CyberStrikeAI Go codebase.
### Persistent Memory (internal/agent/persistent_memory.go)
- SQLite-backed key-value store that survives conversation compression
and server restarts (table: agent_memories)
- Five categories: credential, target, vulnerability, fact, note
- Four new MCP agent tools: store_memory, retrieve_memory,
list_memories, delete_memory
- Memory context block auto-injected into every system prompt
- Configurable via agent.memory.{enabled, max_entries}
### Corpus-Level BM25 Okapi (internal/knowledge/bm25.go)
- Full BM25 Okapi implementation with real IDF:
IDF(t) = log((N - n(t) + 0.5) / (n(t) + 0.5) + 1)
- BM25CorpusIndexer rebuilt from all knowledge chunks on startup
- BM25CorpusIndexer.ScoreText() replaces the previous per-document
approximation in retriever.go
- Configurable k1, b, and delta (BM25+) parameters
- Score normalised via tanh for hybrid blending compatibility
### Time Awareness (internal/agent/time_awareness.go)
- Current date/time, timezone, and session age injected into every
system prompt via <time_context> XML block
- New get_current_time MCP tool for on-demand queries
- Configurable timezone (IANA) via agent.time_awareness.{enabled, timezone}
- Defaults to UTC; backward-compatible (enabled by default for new installs)
### Wiring & Config
- config.go: TimeAwarenessConfig and MemoryConfig structs with defaults
- app.go: initialisation + tool registration before agent variable creation
(avoids package-name shadowing)
- builtin/constants.go: 5 new tool name constants
- config.yaml: documented new sections with inline comments
- Version bumped to v1.4.0
### Docs
- README.md: new Persistent Memory and Time Awareness sections,
updated Highlights, Knowledge Base BM25 description, and
Configuration Reference
- ROADMAP.md: marked shipped items, added Memory UI and BM25
persistence as near-term items
https://claude.ai/code/session_01Uie29eQJ2a6E2M4a3rm5vj
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the agent's intelligence capabilities by integrating persistent memory, advanced knowledge retrieval with BM25 Okapi, and time awareness. These additions enable the agent to maintain context and learn across sessions, improve the accuracy of information retrieval, and make more time-sensitive decisions, ultimately leading to a more robust and capable AI security testing platform. Highlights
Changelog
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces significant features like persistent memory, corpus-level BM25 Okapi, and time awareness, enhancing the agent's intelligence. However, the implementation of persistent memory has critical security flaws, including global memory sharing that could lead to data leakage of sensitive information, and a lack of escaping when injecting memories into the system prompt, enabling persistent indirect prompt injection. Additionally, a missing safety check in the store_memory tool handler can cause a remote denial-of-service via application panic. Beyond these security concerns, the review also identified a critical issue with unsafe type assertions in a tool handler that could cause a panic, and areas for improvement in database transaction handling, error logging, and configuration logic to enhance robustness.
| func (pm *PersistentMemory) BuildContextBlock() string { | ||
| entries, err := pm.List("", 100) | ||
| if err != nil || len(entries) == 0 { | ||
| return "" | ||
| } |
There was a problem hiding this comment.
The persistent memory implementation lacks conversation isolation. The BuildContextBlock method retrieves memories globally from the database using pm.List("", 100) and injects them into the system prompt of every conversation. This leads to sensitive data leakage (e.g., credentials stored in one session being visible in another) and cross-conversation poisoning. In a multi-user or multi-target environment, this allows one session to access or influence the data of another.
| for _, item := range items { | ||
| sb.WriteString(fmt.Sprintf(" • %s: %s\n", item.Key, item.Value)) | ||
| } |
There was a problem hiding this comment.
Memory entries are injected into the system prompt without escaping. An attacker can craft a memory entry containing XML closing tags (e.g., </persistent_memory>) to break out of the context block and inject malicious instructions into the agent's system prompt. This is a form of Indirect Prompt Injection that persists across sessions.
|
|
||
| // initialize PersistentMemory before creating the agent variable | ||
| var persistentMem *agent.PersistentMemory | ||
| memEnabled := cfg.Agent.Memory.Enabled || cfg.Agent.Memory.MaxEntries == 0 |
There was a problem hiding this comment.
The logic to determine if persistent memory is enabled, memEnabled := cfg.Agent.Memory.Enabled || cfg.Agent.Memory.MaxEntries == 0, can be confusing and lead to unexpected behavior. According to the configuration comments, max_entries: 0 means 'unlimited', which implies the feature is enabled. However, this logic means a user cannot disable the feature by setting enabled: false if they also set max_entries: 0. For example, memory: { enabled: false, max_entries: 0 } would result in memEnabled being true. This is counter-intuitive. A clearer approach would be to respect the enabled flag explicitly, similar to how taEnabled is handled for time awareness.
| cat := agent.MemoryCategory(args["category"].(string) + "") | ||
| if cat == "" { | ||
| cat = agent.MemoryCategoryFact | ||
| } |
There was a problem hiding this comment.
The store_memory tool handler performs an unsafe type assertion on the category argument. Since this argument is optional in the tool's schema (not included in the required list on line 1298), an LLM call that omits it will cause args["category"] to be nil, leading to a runtime panic and application crash, potentially causing a remote denial-of-service. This issue also highlights a general problem with unsafe type assertions for parameters like key and value. It is crucial to use the two-value form of type assertion (val, ok := ...) to handle missing or incorrect types gracefully and prevent application crashes.
| cat := agent.MemoryCategory(args["category"].(string) + "") | |
| if cat == "" { | |
| cat = agent.MemoryCategoryFact | |
| } | |
| catStr, _ := args["category"].(string) | |
| cat := agent.MemoryCategory(catStr) | |
| if cat == "" { | |
| cat = agent.MemoryCategoryFact | |
| } |
| func (pm *PersistentMemory) migrate() error { | ||
| createTable := ` | ||
| CREATE TABLE IF NOT EXISTS agent_memories ( | ||
| id TEXT PRIMARY KEY, | ||
| key TEXT NOT NULL, | ||
| value TEXT NOT NULL, | ||
| category TEXT NOT NULL DEFAULT 'fact', | ||
| conversation_id TEXT, | ||
| created_at DATETIME NOT NULL, | ||
| updated_at DATETIME NOT NULL | ||
| ); | ||
| CREATE INDEX IF NOT EXISTS idx_agent_memories_category ON agent_memories(category); | ||
| CREATE INDEX IF NOT EXISTS idx_agent_memories_key ON agent_memories(key); | ||
| CREATE INDEX IF NOT EXISTS idx_agent_memories_conversation ON agent_memories(conversation_id); | ||
| ` | ||
| if _, err := pm.db.Exec(createTable); err != nil { | ||
| return err | ||
| } | ||
| return nil | ||
| } |
There was a problem hiding this comment.
The migrate function executes multiple SQL statements (CREATE TABLE and CREATE INDEX) in a single db.Exec call. This is not transactional. If the table creation succeeds but one of the index creations fails, the database will be left in an inconsistent state. It's better to wrap these operations in a transaction to ensure atomicity.
func (pm *PersistentMemory) migrate() error {
tx, err := pm.db.Begin()
if err != nil {
return err
}
defer tx.Rollback() // The rollback will be ignored if the tx has been committed.
createTable := `
CREATE TABLE IF NOT EXISTS agent_memories (
id TEXT PRIMARY KEY,
key TEXT NOT NULL,
value TEXT NOT NULL,
category TEXT NOT NULL DEFAULT 'fact',
conversation_id TEXT,
created_at DATETIME NOT NULL,
updated_at DATETIME NOT NULL
);`
if _, err := tx.Exec(createTable); err != nil {
return err
}
indexes := []string{
"CREATE INDEX IF NOT EXISTS idx_agent_memories_category ON agent_memories(category);",
"CREATE INDEX IF NOT EXISTS idx_agent_memories_key ON agent_memories(key);",
"CREATE INDEX IF NOT EXISTS idx_agent_memories_conversation ON agent_memories(conversation_id);",
}
for _, index := range indexes {
if _, err := tx.Exec(index); err != nil {
return err
}
}
return tx.Commit()
}| if t, err := time.Parse(time.RFC3339Nano, createdAt); err == nil { | ||
| e.CreatedAt = t | ||
| } | ||
| if t, err := time.Parse(time.RFC3339Nano, updatedAt); err == nil { | ||
| e.UpdatedAt = t | ||
| } |
There was a problem hiding this comment.
Errors from time.Parse are silently ignored. If a malformed date string is present in the database for any reason, the CreatedAt or UpdatedAt fields will be zero-valued without any indication of a problem. This could lead to subtle bugs related to data ordering or display. It would be better to log these parsing errors.
if t, err := time.Parse(time.RFC3339Nano, createdAt); err == nil {
e.CreatedAt = t
} else if createdAt != "" {
pm.logger.Warn("failed to parse CreatedAt timestamp for memory entry", zap.String("value", createdAt), zap.Error(err))
}
if t, err := time.Parse(time.RFC3339Nano, updatedAt); err == nil {
e.UpdatedAt = t
} else if updatedAt != "" {
pm.logger.Warn("failed to parse UpdatedAt timestamp for memory entry", zap.String("value", updatedAt), zap.Error(err))
}| id, _ := args["id"].(string) | ||
| if id == "" { | ||
| return &mcp.ToolResult{ | ||
| Content: []mcp.Content{{Type: "text", Text: "Error: id parameter is required"}}, | ||
| IsError: true, | ||
| }, nil | ||
| } |
There was a problem hiding this comment.
The type assertion id, _ := args["id"].(string) is not fully safe. While it won't panic if the key is missing (it will result in an empty string which is checked), it's not idiomatic Go. It's better to use the two-value form of type assertion (val, ok := ...) to explicitly check if the key exists and has the correct type. This makes the code clearer and more robust.
| id, _ := args["id"].(string) | |
| if id == "" { | |
| return &mcp.ToolResult{ | |
| Content: []mcp.Content{{Type: "text", Text: "Error: id parameter is required"}}, | |
| IsError: true, | |
| }, nil | |
| } | |
| id, ok := args["id"].(string) | |
| if !ok || id == "" { | |
| return &mcp.ToolResult{ | |
| Content: []mcp.Content{{Type: "text", Text: "Error: id parameter is required"}}, | |
| IsError: true, | |
| }, nil | |
| } |
Implements three core intelligence features ported from the Python adaptive_agent framework into the CyberStrikeAI Go codebase.
Persistent Memory (internal/agent/persistent_memory.go)
Corpus-Level BM25 Okapi (internal/knowledge/bm25.go)
Time Awareness (internal/agent/time_awareness.go)
Wiring & Config
Docs