Skip to content

feat: port persistent memory, BM25 Okapi, and time awareness to Go#4

Merged
cybersecua merged 1 commit intomainfrom
claude/port-python-to-go-MRiU5
Mar 4, 2026
Merged

feat: port persistent memory, BM25 Okapi, and time awareness to Go#4
cybersecua merged 1 commit intomainfrom
claude/port-python-to-go-MRiU5

Conversation

@cybersecua
Copy link
Owner

Implements three core intelligence features ported from the Python adaptive_agent framework into the CyberStrikeAI Go codebase.

Persistent Memory (internal/agent/persistent_memory.go)

  • SQLite-backed key-value store that survives conversation compression and server restarts (table: agent_memories)
  • Five categories: credential, target, vulnerability, fact, note
  • Four new MCP agent tools: store_memory, retrieve_memory, list_memories, delete_memory
  • Memory context block auto-injected into every system prompt
  • Configurable via agent.memory.{enabled, max_entries}

Corpus-Level BM25 Okapi (internal/knowledge/bm25.go)

  • Full BM25 Okapi implementation with real IDF: IDF(t) = log((N - n(t) + 0.5) / (n(t) + 0.5) + 1)
  • BM25CorpusIndexer rebuilt from all knowledge chunks on startup
  • BM25CorpusIndexer.ScoreText() replaces the previous per-document approximation in retriever.go
  • Configurable k1, b, and delta (BM25+) parameters
  • Score normalised via tanh for hybrid blending compatibility

Time Awareness (internal/agent/time_awareness.go)

  • Current date/time, timezone, and session age injected into every system prompt via <time_context> XML block
  • New get_current_time MCP tool for on-demand queries
  • Configurable timezone (IANA) via agent.time_awareness.{enabled, timezone}
  • Defaults to UTC; backward-compatible (enabled by default for new installs)

Wiring & Config

  • config.go: TimeAwarenessConfig and MemoryConfig structs with defaults
  • app.go: initialisation + tool registration before agent variable creation (avoids package-name shadowing)
  • builtin/constants.go: 5 new tool name constants
  • config.yaml: documented new sections with inline comments
  • Version bumped to v1.4.0

Docs

  • README.md: new Persistent Memory and Time Awareness sections, updated Highlights, Knowledge Base BM25 description, and Configuration Reference
  • ROADMAP.md: marked shipped items, added Memory UI and BM25 persistence as near-term items

Implements three core intelligence features ported from the Python
adaptive_agent framework into the CyberStrikeAI Go codebase.

### Persistent Memory (internal/agent/persistent_memory.go)
- SQLite-backed key-value store that survives conversation compression
  and server restarts (table: agent_memories)
- Five categories: credential, target, vulnerability, fact, note
- Four new MCP agent tools: store_memory, retrieve_memory,
  list_memories, delete_memory
- Memory context block auto-injected into every system prompt
- Configurable via agent.memory.{enabled, max_entries}

### Corpus-Level BM25 Okapi (internal/knowledge/bm25.go)
- Full BM25 Okapi implementation with real IDF:
  IDF(t) = log((N - n(t) + 0.5) / (n(t) + 0.5) + 1)
- BM25CorpusIndexer rebuilt from all knowledge chunks on startup
- BM25CorpusIndexer.ScoreText() replaces the previous per-document
  approximation in retriever.go
- Configurable k1, b, and delta (BM25+) parameters
- Score normalised via tanh for hybrid blending compatibility

### Time Awareness (internal/agent/time_awareness.go)
- Current date/time, timezone, and session age injected into every
  system prompt via <time_context> XML block
- New get_current_time MCP tool for on-demand queries
- Configurable timezone (IANA) via agent.time_awareness.{enabled, timezone}
- Defaults to UTC; backward-compatible (enabled by default for new installs)

### Wiring & Config
- config.go: TimeAwarenessConfig and MemoryConfig structs with defaults
- app.go: initialisation + tool registration before agent variable creation
  (avoids package-name shadowing)
- builtin/constants.go: 5 new tool name constants
- config.yaml: documented new sections with inline comments
- Version bumped to v1.4.0

### Docs
- README.md: new Persistent Memory and Time Awareness sections,
  updated Highlights, Knowledge Base BM25 description, and
  Configuration Reference
- ROADMAP.md: marked shipped items, added Memory UI and BM25
  persistence as near-term items

https://claude.ai/code/session_01Uie29eQJ2a6E2M4a3rm5vj
@cybersecua cybersecua merged commit 48b4499 into main Mar 4, 2026
@cybersecua cybersecua deleted the claude/port-python-to-go-MRiU5 branch March 4, 2026 11:37
@gemini-code-assist
Copy link

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the agent's intelligence capabilities by integrating persistent memory, advanced knowledge retrieval with BM25 Okapi, and time awareness. These additions enable the agent to maintain context and learn across sessions, improve the accuracy of information retrieval, and make more time-sensitive decisions, ultimately leading to a more robust and capable AI security testing platform.

Highlights

  • Persistent Memory Implementation: Introduced a SQLite-backed key-value store for agents to retain information across conversations and server restarts. This includes five memory categories (credential, target, vulnerability, fact, note) and four new MCP agent tools: store_memory, retrieve_memory, list_memories, and delete_memory. Memory context is automatically injected into system prompts and is configurable.
  • Corpus-Level BM25 Okapi Integration: Implemented a full BM25 Okapi algorithm with real Inverse Document Frequency (IDF) for enhanced knowledge base hybrid retrieval. A BM25CorpusIndexer is built from all knowledge chunks on startup, replacing the previous per-document approximation, and configurable k1, b, and delta parameters are available. Scores are normalized using tanh for hybrid blending compatibility.
  • Time Awareness Feature: Added functionality to inject current date/time, timezone, and session age into every agent system prompt via an XML block. A new get_current_time MCP tool allows for on-demand time queries, and the feature is configurable with IANA timezones, defaulting to UTC.
  • Configuration and Wiring Updates: New TimeAwarenessConfig and MemoryConfig structs were added to config.go with default values. The app.go file was updated for proper initialization and tool registration of these new features, and five new tool name constants were added to builtin/constants.go. The config.yaml now includes documented sections for time awareness and persistent memory, and the application version was bumped to v1.4.0.
  • Documentation Enhancements: The README.md was updated to reflect the new Persistent Memory and Time Awareness features, an updated description of the Knowledge Base BM25, and an expanded Configuration Reference. The ROADMAP.md was also updated to mark shipped items and add new near-term items related to memory UI and BM25 persistence.
Changelog
  • README.md
    • Updated the main features list to include Persistent Memory and Time Awareness.
    • Revised the Knowledge Base section to highlight corpus-level BM25 Okapi hybrid retrieval and configurable hybrid weight.
    • Added new dedicated sections detailing Persistent Memory and Time Awareness, including their purpose, categories/tools, configuration, and injected context block examples.
  • ROADMAP.md
    • Marked 'Corpus-level BM25 Okapi', 'Persistent memory', and 'Time awareness' as completed items under 'Agent Intelligence'.
    • Updated 'Agent memory improvements' to reflect the new persistent memory store and BM25 corpus index.
    • Added 'Memory UI panel' and 'Memory expiry / TTL' as near-term items.
    • Added 'BM25 index persistence' as a near-term item under 'Knowledge Base'.
    • Updated the 'Last updated' date to 2026-03-04.
  • config.yaml
    • Bumped the application version from v1.3.16 to v1.4.0.
    • Added new time_awareness configuration section under agent, including enabled and timezone settings.
    • Added new memory configuration section under agent, including enabled and max_entries settings.
  • internal/agent/agent.go
    • Added persistentMemory and timeAwareness fields to the Agent struct.
    • Introduced SetPersistentMemory and SetTimeAwareness methods to attach these new components to the agent.
    • Updated the GetSystemPrompt method to inject persistent_memory and time_context blocks into the system prompt.
  • internal/agent/persistent_memory.go
    • Added a new file defining MemoryCategory and MemoryEntry types.
    • Implemented PersistentMemory struct with methods for NewPersistentMemory, migrate, Store, Retrieve, List, Delete, and BuildContextBlock for managing SQLite-backed agent memories.
  • internal/agent/time_awareness.go
    • Added a new file defining the TimeAwareness struct.
    • Implemented NewTimeAwareness, Now, SessionElapsed, BuildContextBlock, and FormatCurrentTime methods to provide and format temporal context for the agent.
  • internal/app/app.go
    • Imported the strings package.
    • Initialized TimeAwareness and PersistentMemory instances based on configuration, including backward compatibility for time awareness.
    • Registered get_current_time tool and the four persistent memory tools (store_memory, retrieve_memory, list_memories, delete_memory) with the MCP server.
    • Attached the initialized TimeAwareness and PersistentMemory components to the agent instance.
    • Updated references to the agent variable to agentInstance to avoid shadowing.
  • internal/config/config.go
    • Defined new TimeAwarenessConfig and MemoryConfig structs.
    • Embedded TimeAwarenessConfig and MemoryConfig into the AgentConfig struct.
    • Updated the Default() function to include default values for TimeAwarenessConfig and MemoryConfig.
  • internal/knowledge/bm25.go
    • Added a new file implementing the BM25 Okapi algorithm.
    • Defined BM25Params and BM25Index structs for corpus-level BM25 indexing and scoring.
    • Implemented Add, Build, Score, ScoreAll, and ScoreText methods for managing and querying the BM25 index.
    • Introduced BM25CorpusIndexer as a wrapper to keep the BM25 index synchronized with the knowledge embeddings database.
  • internal/knowledge/retriever.go
    • Removed the strings import.
    • Added a bm25Index field of type *BM25CorpusIndexer to the Retriever struct.
    • Modified NewRetriever to initialize and asynchronously rebuild the BM25CorpusIndexer from existing knowledge chunks.
    • Updated the bm25Score method to delegate scoring to the BM25CorpusIndexer and normalize the score using math.Tanh.
  • internal/mcp/builtin/constants.go
    • Added ToolGetCurrentTime constant for the new time awareness tool.
    • Added ToolStoreMemory, ToolRetrieveMemory, ToolListMemories, and ToolDeleteMemory constants for the new persistent memory tools.
    • Updated IsBuiltinTool and GetAllBuiltinTools functions to include the newly added tool constants.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant features like persistent memory, corpus-level BM25 Okapi, and time awareness, enhancing the agent's intelligence. However, the implementation of persistent memory has critical security flaws, including global memory sharing that could lead to data leakage of sensitive information, and a lack of escaping when injecting memories into the system prompt, enabling persistent indirect prompt injection. Additionally, a missing safety check in the store_memory tool handler can cause a remote denial-of-service via application panic. Beyond these security concerns, the review also identified a critical issue with unsafe type assertions in a tool handler that could cause a panic, and areas for improvement in database transaction handling, error logging, and configuration logic to enhance robustness.

Comment on lines +248 to +252
func (pm *PersistentMemory) BuildContextBlock() string {
entries, err := pm.List("", 100)
if err != nil || len(entries) == 0 {
return ""
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The persistent memory implementation lacks conversation isolation. The BuildContextBlock method retrieves memories globally from the database using pm.List("", 100) and injects them into the system prompt of every conversation. This leads to sensitive data leakage (e.g., credentials stored in one session being visible in another) and cross-conversation poisoning. In a multi-user or multi-target environment, this allows one session to access or influence the data of another.

Comment on lines +278 to +280
for _, item := range items {
sb.WriteString(fmt.Sprintf(" • %s: %s\n", item.Key, item.Value))
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Memory entries are injected into the system prompt without escaping. An attacker can craft a memory entry containing XML closing tags (e.g., </persistent_memory>) to break out of the context block and inject malicious instructions into the agent's system prompt. This is a form of Indirect Prompt Injection that persists across sessions.


// initialize PersistentMemory before creating the agent variable
var persistentMem *agent.PersistentMemory
memEnabled := cfg.Agent.Memory.Enabled || cfg.Agent.Memory.MaxEntries == 0

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The logic to determine if persistent memory is enabled, memEnabled := cfg.Agent.Memory.Enabled || cfg.Agent.Memory.MaxEntries == 0, can be confusing and lead to unexpected behavior. According to the configuration comments, max_entries: 0 means 'unlimited', which implies the feature is enabled. However, this logic means a user cannot disable the feature by setting enabled: false if they also set max_entries: 0. For example, memory: { enabled: false, max_entries: 0 } would result in memEnabled being true. This is counter-intuitive. A clearer approach would be to respect the enabled flag explicitly, similar to how taEnabled is handled for time awareness.

Comment on lines +1304 to +1307
cat := agent.MemoryCategory(args["category"].(string) + "")
if cat == "" {
cat = agent.MemoryCategoryFact
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The store_memory tool handler performs an unsafe type assertion on the category argument. Since this argument is optional in the tool's schema (not included in the required list on line 1298), an LLM call that omits it will cause args["category"] to be nil, leading to a runtime panic and application crash, potentially causing a remote denial-of-service. This issue also highlights a general problem with unsafe type assertions for parameters like key and value. It is crucial to use the two-value form of type assertion (val, ok := ...) to handle missing or incorrect types gracefully and prevent application crashes.

Suggested change
cat := agent.MemoryCategory(args["category"].(string) + "")
if cat == "" {
cat = agent.MemoryCategoryFact
}
catStr, _ := args["category"].(string)
cat := agent.MemoryCategory(catStr)
if cat == "" {
cat = agent.MemoryCategoryFact
}

Comment on lines +64 to +83
func (pm *PersistentMemory) migrate() error {
createTable := `
CREATE TABLE IF NOT EXISTS agent_memories (
id TEXT PRIMARY KEY,
key TEXT NOT NULL,
value TEXT NOT NULL,
category TEXT NOT NULL DEFAULT 'fact',
conversation_id TEXT,
created_at DATETIME NOT NULL,
updated_at DATETIME NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_agent_memories_category ON agent_memories(category);
CREATE INDEX IF NOT EXISTS idx_agent_memories_key ON agent_memories(key);
CREATE INDEX IF NOT EXISTS idx_agent_memories_conversation ON agent_memories(conversation_id);
`
if _, err := pm.db.Exec(createTable); err != nil {
return err
}
return nil
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The migrate function executes multiple SQL statements (CREATE TABLE and CREATE INDEX) in a single db.Exec call. This is not transactional. If the table creation succeeds but one of the index creations fails, the database will be left in an inconsistent state. It's better to wrap these operations in a transaction to ensure atomicity.

func (pm *PersistentMemory) migrate() error {
	tx, err := pm.db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // The rollback will be ignored if the tx has been committed.

	createTable := `
	CREATE TABLE IF NOT EXISTS agent_memories (
		id TEXT PRIMARY KEY,
		key TEXT NOT NULL,
		value TEXT NOT NULL,
		category TEXT NOT NULL DEFAULT 'fact',
		conversation_id TEXT,
		created_at DATETIME NOT NULL,
		updated_at DATETIME NOT NULL
	);`
	if _, err := tx.Exec(createTable); err != nil {
		return err
	}

	indexes := []string{
		"CREATE INDEX IF NOT EXISTS idx_agent_memories_category ON agent_memories(category);",
		"CREATE INDEX IF NOT EXISTS idx_agent_memories_key ON agent_memories(key);",
		"CREATE INDEX IF NOT EXISTS idx_agent_memories_conversation ON agent_memories(conversation_id);",
	}

	for _, index := range indexes {
		if _, err := tx.Exec(index); err != nil {
			return err
		}
	}

	return tx.Commit()
}

Comment on lines +300 to +305
if t, err := time.Parse(time.RFC3339Nano, createdAt); err == nil {
e.CreatedAt = t
}
if t, err := time.Parse(time.RFC3339Nano, updatedAt); err == nil {
e.UpdatedAt = t
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Errors from time.Parse are silently ignored. If a malformed date string is present in the database for any reason, the CreatedAt or UpdatedAt fields will be zero-valued without any indication of a problem. This could lead to subtle bugs related to data ordering or display. It would be better to log these parsing errors.

		if t, err := time.Parse(time.RFC3339Nano, createdAt); err == nil {
			e.CreatedAt = t
		} else if createdAt != "" {
			pm.logger.Warn("failed to parse CreatedAt timestamp for memory entry", zap.String("value", createdAt), zap.Error(err))
		}
		if t, err := time.Parse(time.RFC3339Nano, updatedAt); err == nil {
			e.UpdatedAt = t
		} else if updatedAt != "" {
			pm.logger.Warn("failed to parse UpdatedAt timestamp for memory entry", zap.String("value", updatedAt), zap.Error(err))
		}

Comment on lines +1440 to +1446
id, _ := args["id"].(string)
if id == "" {
return &mcp.ToolResult{
Content: []mcp.Content{{Type: "text", Text: "Error: id parameter is required"}},
IsError: true,
}, nil
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The type assertion id, _ := args["id"].(string) is not fully safe. While it won't panic if the key is missing (it will result in an empty string which is checked), it's not idiomatic Go. It's better to use the two-value form of type assertion (val, ok := ...) to explicitly check if the key exists and has the correct type. This makes the code clearer and more robust.

Suggested change
id, _ := args["id"].(string)
if id == "" {
return &mcp.ToolResult{
Content: []mcp.Content{{Type: "text", Text: "Error: id parameter is required"}},
IsError: true,
}, nil
}
id, ok := args["id"].(string)
if !ok || id == "" {
return &mcp.ToolResult{
Content: []mcp.Content{{Type: "text", Text: "Error: id parameter is required"}},
IsError: true,
}, nil
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants