
fix: strip injected tags from captured memories and dedup fallback turns #3

Merged
Dhravya merged 1 commit into main from fix/capture-cleanContent-dedup
Apr 28, 2026

Conversation

@sreedharsreeram
Member

No description provided.

@sreedharsreeram requested a review from Dhravya on April 28, 2026, 01:06

@vorflux vorflux Bot left a comment


Reviewed — found 2 issues.


Comment thread src/hooks/capture.ts
messages = [
  {
    role: "assistant",
    content: cleanContent(stripPrivateContent(payload.last_assistant_message)),

Bug (medium): fallback path can ingest empty [assistant] memories

The cleanContent(stripPrivateContent(...)) call can return an empty string when last_assistant_message consists entirely of injected tags (e.g. a <system-reminder> or <supermemory-context> block). In that case messages is still set to a one-element array, so messages.length === 0 is never true and capture proceeds with empty content. SupermemoryClient.formatConversationMessage() will then store a meaningless [assistant] placeholder memory.

Fix: guard the push with an empty-content check, the same way parseTranscript does:

const cleaned = cleanContent(stripPrivateContent(payload.last_assistant_message));
if (cleaned) {
  messages = [{ role: "assistant", content: cleaned }];
  newLamHash = currentHash;
}
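
The guard can be exercised in isolation. What follows is a minimal sketch under stated assumptions, not the code from this PR: the cleanContent here only mirrors the two tag patterns named in this review and adds a trim (whether the real function trims is an assumption), stripPrivateContent is a hypothetical pass-through, and buildFallbackMessages is a hypothetical helper name introduced only for illustration.

```typescript
type Message = { role: "user" | "assistant"; content: string };

// Simplified stand-in for cleanContent() in src/services/privacy.ts.
// Assumption: only the two tag patterns named in this review, plus a trim.
function cleanContent(content: string): string {
  return content
    .replace(/<system-reminder>[\s\S]*?<\/system-reminder>/gi, "")
    .replace(/<supermemory-context>[\s\S]*?<\/supermemory-context>/gi, "")
    .trim();
}

// Hypothetical pass-through; the real stripPrivateContent() is not shown in this review.
function stripPrivateContent(content: string): string {
  return content;
}

// Hypothetical helper wrapping the fallback path: it returns no messages
// when nothing is left after cleaning, so the caller's length check can fire.
function buildFallbackMessages(lastAssistantMessage: string): Message[] {
  const cleaned = cleanContent(stripPrivateContent(lastAssistantMessage));
  return cleaned ? [{ role: "assistant", content: cleaned }] : [];
}

const onlyTags = "<system-reminder>injected</system-reminder>";
const mixed = "Done.<system-reminder>injected</system-reminder>";

console.log(buildFallbackMessages(onlyTags).length); // 0
console.log(buildFallbackMessages(mixed).length); // 1, with content "Done."
```

Returning an empty array rather than a one-element array with empty content keeps the existing messages.length === 0 check meaningful, so no [assistant] placeholder would be formatted and stored.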

Comment thread src/services/privacy.ts

export function cleanContent(content: string): string {
  return content
    .replace(/<system-reminder>[\s\S]*?<\/system-reminder>/gi, "")

Bug (medium): cleanContent() silently truncates legitimate user/assistant content

The regexes strip every <system-reminder>…</system-reminder> and <supermemory-context>…</supermemory-context> block from any message, regardless of who wrote it or why. In a coding-assistant context, and especially in this very repository, users and assistants can legitimately discuss, paste, or generate those exact tag names as code examples or documentation. When that happens, the stored memory is silently truncated or emptied: data loss with no warning.

The stripping should be scoped to actual injected wrapper content only. One approach is to apply cleanContent() solely to the outermost message envelope (i.e. before splitting into individual turns) rather than to every individual turn's text, so that tags embedded inside a turn's prose are preserved.
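
One way to picture position-scoped stripping is sketched below. It is a different heuristic from the envelope-level approach suggested above: it removes a block only when it sits at the very start or very end of a message, and leaves blocks embedded in prose untouched. The helper name stripWrapperBlocks and the assumption that injected blocks always lead or trail a message are hypothetical and not taken from the PR.

```typescript
// Tag names taken from the review; treating them as the full set is an assumption.
const INJECTED_TAGS = ["system-reminder", "supermemory-context"];

// Hypothetical helper (not in the PR): removes injected blocks only when they
// sit at the very start or very end of a message, looping until none remain.
function stripWrapperBlocks(content: string): string {
  let result = content.trim();
  let changed = true;
  while (changed) {
    changed = false;
    for (const tag of INJECTED_TAGS) {
      // Leading block: anchored at the start of the message.
      const leading = new RegExp(`^<${tag}>[\\s\\S]*?</${tag}>`, "i");
      // Trailing block: anchored at the end. The body may not contain a closing
      // tag, so the match cannot begin at an earlier, embedded block.
      const trailing = new RegExp(`<${tag}>(?:(?!</${tag}>)[\\s\\S])*</${tag}>$`, "i");
      const next = result.replace(leading, "").replace(trailing, "").trim();
      if (next !== result) {
        result = next;
        changed = true;
      }
    }
  }
  return result;
}

const injected = "<supermemory-context>recalled facts</supermemory-context>\nHow do I strip tags?";
const quoted = "The hook wraps context in <system-reminder>...</system-reminder> before sending it.";

console.log(stripWrapperBlocks(injected)); // How do I strip tags?
console.log(stripWrapperBlocks(quoted) === quoted); // true
```

This heuristic still has a blind spot: a message whose legitimate content begins or ends with a quoted tag block would lose that block, so it narrows the data-loss window rather than closing it.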

@Dhravya Dhravya merged commit 8e18add into main Apr 28, 2026
2 checks passed
