
Update memories.py #4864

Closed
john-gitdev wants to merge 7 commits into BasedHardware:main from john-gitdev:delete-memories-fix

Conversation

@john-gitdev
Collaborator

@john-gitdev john-gitdev commented Feb 18, 2026

delete associated transcripts when a summary (conversation) is deleted

delete associated transcripts when a summary is deleted
Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces functionality to automatically delete associated conversation transcripts and their corresponding vector embeddings when a memory is deleted. This is a good step towards maintaining data consistency and preventing orphaned data in the system.

Comment thread on backend/routers/memories.py (outdated)
add try and throw error

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@john-gitdev john-gitdev marked this pull request as draft February 18, 2026 07:49
update to delete audio blob of the deleted transcript/summary
updated error log to printf to match conversations.py
update to delete audio for single transcript and summary
update to remove audio files related to deleted conversations and summaries
@john-gitdev john-gitdev marked this pull request as ready for review February 18, 2026 16:45
@john-gitdev
Collaborator Author

john-gitdev commented Feb 18, 2026

updated storage and conversations to delete the associated audio file as well

@john-gitdev
Collaborator Author

/gemini review

Contributor

@gemini-code-assist (bot) left a comment


Code Review

This pull request enhances the deletion logic for memories and conversations. When a memory is deleted, it now correctly triggers the deletion of the associated conversation, its vector embeddings, and all related audio files. Similarly, deleting a conversation now also cleans up its associated audio files. The implementation uses try-except blocks to ensure that failures in deleting associated files do not cause the main deletion operation to fail, which is a good resilient pattern. My feedback focuses on improving logging by using the standard logging module instead of print for better error tracking and consistency.

try:
    delete_conversation_audio_files(uid, conversation_id)
    delete_conversation_recording(uid, conversation_id)
except Exception as e:
    print(f"Failed to delete audio files for conversation {conversation_id}: {e}")
Contributor


high

For better observability and consistency with other parts of the application (like routers/memories.py), please use the logging module instead of print for error messages. This will ensure errors are properly captured by your logging infrastructure. You'll need to add import logging at the top of the file.

Suggested change
print(f"Failed to delete audio files for conversation {conversation_id}: {e}")
logging.error(f"Failed to delete audio files for conversation {conversation_id}: {e}")

try:
    delete_conversation_recording(uid, conversation_id)  # memories_recordings_bucket
except Exception as e:
    # Log the error, but don't block the memory deletion as it's already done
    print(f"Failed to delete conversation {conversation_id} or its vector for memory {memory_id}: {e}")
Contributor


high

This file already imports and uses the logging module. For consistency and to ensure errors are properly tracked, please use logger.error or logger.warning instead of print.

Suggested change
print(f"Failed to delete conversation {conversation_id} or its vector for memory {memory_id}: {e}")
logger.error(f"Failed to delete conversation {conversation_id} or its vector for memory {memory_id}: {e}")

@john-gitdev
Collaborator Author

@aaravgarg @mdmohsin7 thoughts? up to you on how to handle error logging

@beastoin
Collaborator

Hey 👋 — thanks for putting this together! Before we can review, could you share a quick live demo (screenshot, screen recording, or terminal output) showing this working on your local or dev environment?

In the AI era, writing code is the easy part — what really makes a PR stand out is proof that it works end-to-end. A short video or even a screenshot goes a long way in helping reviewers feel confident about merging.

Feel free to update this PR whenever you have something to show. Thanks! 🙏

@john-gitdev
Collaborator Author

john-gitdev commented Feb 19, 2026

https://www.dropbox.com/scl/fi/2hzcfwh0u8f83gur9bsax/screen-20260219-114550-1771530281340.mp4?rlkey=ah25vqy6zmh3ifzk14qpscjbb&dl=0

Here you can see a transcript with a mention of 'mega potion'.

I ask omi if I mentioned 'mega potion' and it says yes.

I deleted the summary, and then omi says no, I didn't mention mega potion.

I can't verify the backend audio was deleted. If you want me to remove that code snippet so that just the transcript gets removed, I can do that; you'd just have orphaned audio data on your backend.

@beastoin

@mdmohsin7
Member

@john-gitdev don't you think we should let the user decide whether to also delete the conversation (while deleting the memory) or not? Some might just want to delete the memory and not the conversation?

@john-gitdev
Collaborator Author

@john-gitdev don't you think we should let the user decide whether to also delete the conversation (while deleting the memory) or not? Some might just want to delete the memory and not the conversation?

For me personally, if I delete a conversation, I don't want 'ask omi' to reference it at all anymore. However, I can see your point. I could either ask Discord users for their preference, or add a toggle in the developer settings (under experimental) that controls whether the transcript is deleted when the conversation is deleted.

@github-actions
Contributor

Hey @john-gitdev 👋

Thank you so much for taking the time to contribute to Omi! We truly appreciate you putting in the effort to submit this pull request.

After careful review, we've decided not to merge this particular PR. Please don't take this personally — we genuinely try to merge as many contributions as possible, but sometimes we have to make tough calls based on:

  • Project standards — Ensuring consistency across the codebase
  • User needs — Making sure changes align with what our users need
  • Code best practices — Maintaining code quality and maintainability
  • Project direction — Keeping aligned with our roadmap and vision

Your contribution is still valuable to us, and we'd love to see you contribute again in the future! If you'd like feedback on how to improve this PR or want to discuss alternative approaches, please don't hesitate to reach out.

Thank you for being part of the Omi community! 💜

@john-gitdev john-gitdev deleted the delete-memories-fix branch February 23, 2026 08:22
@beastoin
Collaborator

We see you provided a video demo, engaged constructively with the UX feedback, and proposed multiple solutions. The discussion stalled on our side — that's on us. If you'd like to revisit this, we're happy to continue the conversation.

@beastoin
Collaborator

Hey @john-gitdev — thanks for your patience, and sorry again that the discussion stalled on our end while you were actively engaging.

We've done a deeper code review of your diff and wanted to share feedback so you have a clear path forward if you'd like to resubmit:

What's good:

  • The conversation-level audio cleanup (delete_conversation_audio_files) is solid — cleaning up orphaned audio blobs is a real need.
  • Your demo video clearly showed the feature working end-to-end. The "mega potion" test was a smart way to prove the delete cascaded to the LLM context.

Items that need attention:

1. Over-deletion risk (high)
A single conversation can generate multiple memories (the backend creates separate memory entries from one transcript). Your current implementation deletes the entire conversation when any one of its memories is deleted. This means deleting Memory A could also wipe the transcript context for Memory B, C, etc. — which the user didn't intend.

@mdmohsin7 raised this exact concern, and after reviewing the code we think he was right: the user should have a choice, or at minimum the cascade should only happen when the last memory linked to that conversation is deleted.

2. API consistency (medium)
The cascade behavior would exist in the v3 memories route, but other memory-delete paths (like the MCP route) still delete only the memory. This inconsistency could confuse API consumers.

3. Error handling (low)
Gemini's earlier suggestion about using logging instead of print for error output is worth adopting for production consistency.

Suggested path forward:

  • Scope the cascade: only delete the conversation when the user explicitly deletes the conversation itself, or when the last linked memory is removed
  • Keep the audio cleanup logic — it's good and addresses real storage orphaning
  • Consider the toggle approach you proposed to @mdmohsin7
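
The "last linked memory" scoping above can be sketched in a few lines. Plain dicts stand in for the Firestore collections in backend/database/memories.py, and all names here are illustrative, not the real backend API: the conversation is cascade-deleted only when no surviving memory still references it.

```python
# Dict stand-ins for the memory and conversation stores (illustrative only).
memories = {
    "mem_a": {"conversation_id": "conv_1"},
    "mem_b": {"conversation_id": "conv_1"},
}
conversations = {"conv_1": {"transcript": "..."}}

def delete_memory(memory_id):
    """Delete a memory; cascade to its conversation only if it was the last reference."""
    conversation_id = memories.pop(memory_id)["conversation_id"]
    # Check whether any surviving memory still references the conversation.
    still_referenced = any(
        m["conversation_id"] == conversation_id for m in memories.values()
    )
    if not still_referenced:
        conversations.pop(conversation_id, None)

delete_memory("mem_a")
assert "conv_1" in conversations      # Memory B still needs the transcript
delete_memory("mem_b")
assert "conv_1" not in conversations  # last reference gone, safe to cascade
```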

Your demo was honest — you even noted "I can't verify backend audio was deleted" which was transparent and appreciated. The core idea is sound; it just needs tighter scoping around user intent.

Happy to discuss further if you'd like to pick this back up.

@beastoin
Collaborator

@john-gitdev Some pointers on the items mentioned above:

Over-deletion: The core issue is that one conversation can have multiple memories. Before deleting the conversation, you'd want to check if any other memories still reference the same conversation_id. Look at how memories are stored in backend/database/memories.py — a query filtering by conversation_id excluding the memory being deleted will tell you if it's safe to cascade.

API consistency: Right now the cascade would only exist in the v3 memories route. The MCP route (backend/routers/mcp.py) also has a memory delete endpoint that doesn't cascade. Either add the same logic there, or consider keeping cascade behavior only on conversation delete (not memory delete) — which is cleaner semantically.

User choice (mdmohsin7's point): An optional query parameter on the delete endpoint (defaulting to no cascade) would let the caller decide. That way the default behavior is safe and explicit.
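
The opt-in flag could look like this. In FastAPI it would be a query parameter such as `cascade: bool = False` on the delete route; the sketch below uses dict stand-ins and hypothetical names, not the real backend:

```python
# Dict stand-ins for the stores (illustrative only).
memories = {"mem_1": {"conversation_id": "conv_1"}}
conversations = {"conv_1": {"transcript": "..."}}

def delete_memory(memory_id, cascade=False):
    """Delete a memory; delete its conversation only when the caller opts in."""
    conversation_id = memories.pop(memory_id)["conversation_id"]
    if cascade:
        conversations.pop(conversation_id, None)

delete_memory("mem_1")          # default path: the conversation survives
assert "conv_1" in conversations
```

Defaulting to `cascade=False` keeps existing API consumers unaffected; only callers that explicitly pass the flag get the new behavior.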

Regression test idea: Two memories sharing one conversation — delete one memory, verify conversation still exists. Delete the second, verify conversation is cleaned up.

Happy to discuss the approach if you'd like to pick this back up.

@john-gitdev
Collaborator Author

thanks @beastoin

Actually, I was wrong and my video was not valid. This was a bug I had been tracking since last December.

However, the behavior was already fixed by this commit: 6569495

The bug existed specifically in the version using search_conversations() from utils/conversations/search.py. That function queried Typesense directly and returned results straight to the LLM without a Firestore lookup.

When a conversation was deleted, it was removed from Firestore but never removed from Typesense, so it remained in the search index and the LLM could still reference it.

The current code reverted to Pinecone-based search, which does a Firestore fetch after the vector lookup. Since deleted conversations are gone from Firestore, they get silently dropped before reaching the LLM.
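
That filtering step can be sketched as follows. Dicts stand in for the stores and all names are illustrative: after the vector lookup, results are re-fetched from the document store, so ids whose documents were deleted fall out before reaching the LLM.

```python
# Stand-ins for the vector index and document store (illustrative only).
vector_index = ["conv_1", "conv_2"]              # stale: conv_2 was deleted
firestore = {"conv_1": {"transcript": "mega potion"}}

def search(query):
    candidate_ids = vector_index                  # pretend the vector search ran
    # The document-store fetch silently drops ids with no backing document.
    return [firestore[cid] for cid in candidate_ids if cid in firestore]

results = search("mega potion")
assert len(results) == 1                          # conv_2 never reaches the LLM
```

The Typesense path lacked this second fetch, which is why deleted conversations stayed visible to the LLM there.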

So my video evidence was flawed; the behavior is already fixed as of the current version.

However, the backend bug still exists: Typesense will continue to store orphaned, stale data until it is fixed. Sorry for the inconvenience. I can redo the branch and factor in Mohsin's and your comments if you want, but I'm not running my own backend, so my evidence is not valid. I'm still learning; thanks for being patient with me.

@beastoin @mdmohsin7
