Semantic Grove search #478

Merged
NeptuneHub merged 10 commits into main from devel
May 4, 2026

@NeptuneHub
Owner

This PR adds a new feature that searches by song, using both the lyrics embedding (75% weight) and the Musicnn embedding (25% weight).

The result should be a list of similar songs whose lyrics are very close to the query song's and that also tend to stay within the same genre.
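
To make the 75/25 weighting concrete, here is a minimal sketch of one way two embeddings can be merged into a single vector. It is an assumption for illustration, not necessarily what this PR does: each embedding is L2-normalized, scaled by the square root of its weight, and the two parts are concatenated, so the dot product of two merged vectors equals 0.75 times the lyrics cosine similarity plus 0.25 times the audio cosine similarity. The names `merge_embeddings` and `l2_normalize` are hypothetical.

```python
import numpy as np

LYRICS_WEIGHT = 0.75  # weight stated in the PR description
AUDIO_WEIGHT = 0.25   # weight stated in the PR description


def l2_normalize(vec):
    """Return vec scaled to unit length (zero vectors are returned unchanged)."""
    vec = np.asarray(vec, dtype=np.float32)
    norm = np.linalg.norm(vec)
    if norm == 0.0:
        return vec
    return vec / norm


def merge_embeddings(lyrics_vec, audio_vec,
                     lyrics_weight=LYRICS_WEIGHT, audio_weight=AUDIO_WEIGHT):
    """Hypothetical merge: weighted concatenation of two unit vectors.

    Scaling each part by sqrt(weight) makes the dot product of two merged
    vectors a weighted sum of the per-modality cosine similarities.
    """
    lyrics_part = np.sqrt(lyrics_weight) * l2_normalize(lyrics_vec)
    audio_part = np.sqrt(audio_weight) * l2_normalize(audio_vec)
    return np.concatenate([lyrics_part, audio_part]).astype(np.float32)


# Two songs with identical lyrics embeddings but orthogonal audio embeddings.
song_a = merge_embeddings([1.0, 0.0], [0.0, 1.0, 0.0])
song_b = merge_embeddings([1.0, 0.0], [1.0, 0.0, 0.0])
similarity = float(song_a @ song_b)  # 0.75 * 1.0 + 0.25 * 0.0
```

With weights that sum to 1 the merged vector has unit length, so cosine and inner-product search would rank neighbours the same way.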

Co-authored-by: Copilot <copilot@github.com>
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements 'SemGrove,' a merged lyrics and audio similarity search feature. It includes a new API blueprint, background index management tasks, and a 'By Song' search interface. Feedback was provided to optimize the loading of segmented indices by streaming data directly from the database to a temporary file, which avoids potential memory issues when handling large index files.
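
A minimal sketch of the streaming idea, assuming the index is stored as ordered binary segments and that an index library can then load it from a file path. The function name `write_segments_to_temp_file` and the `.voyager` suffix are hypothetical; in practice `segments` would be a database cursor yielding one row at a time, so only a single segment is held in memory.

```python
import os
import tempfile
from typing import Iterable


def write_segments_to_temp_file(segments: Iterable[bytes],
                                suffix: str = ".voyager") -> str:
    """Write ordered binary segments to a temp file and return its path.

    The caller is responsible for deleting the file once the index is loaded.
    """
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        with tmp:  # closes the file on exit; delete=False keeps it on disk
            for segment in segments:
                tmp.write(segment)
    except Exception:
        os.unlink(tmp.name)  # do not leave a partial file behind
        raise
    return tmp.name


# Usage with an in-memory stand-in for a database cursor.
path = write_segments_to_temp_file(iter([b"ab", b"cd", b"ef"]))
with open(path, "rb") as fh:
    data = fh.read()
os.unlink(path)
```

Compared with joining all segments into one `bytes` object first, peak memory stays near the size of the largest segment rather than the size of the whole index.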

Comment thread tasks/sem_grove_manager.py
Co-authored-by: Copilot <copilot@github.com>
@NeptuneHub
Owner Author

/gemini review

@gemini-code-assist
Contributor

Warning

Gemini is experiencing higher than usual traffic and was unable to create the review. Please try again in a few hours by commenting /gemini review.

@NeptuneHub
Owner Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces "SemGrove," a feature that enables song-by-song similarity search by merging lyrics and audio embeddings into a unified Voyager index. The changes include a new API blueprint, a "By Song" search tab in the UI with autocomplete support, and background tasks for index construction and hot-reloading. Review feedback suggests aligning default weight values with configuration files, enhancing the safety of vector normalization in distance calculations, and utilizing more precise epsilon values for numerical stability.

Comment thread tasks/sem_grove_manager.py Outdated
Comment thread app.py
Comment thread tasks/sem_grove_manager.py Outdated
Comment thread tasks/sem_grove_manager.py
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces "SemGrove," a hybrid search feature that merges lyrics and audio embeddings into a single Voyager index to provide similarity results based on both meaning and sound. The implementation includes a new API blueprint, a manager for index construction and persistence, and a "By Song" search tab in the frontend. Additionally, the track search endpoint was updated to support filtering by the SemGrove index. Feedback focuses on improving code style by moving local imports to the top of the files to adhere to PEP 8.

Comment thread app.py
Comment thread app_lyrics.py
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces 'SemGrove,' a feature that merges lyrics and audio embeddings into a single Voyager index for similarity searches. It adds a management module for index building and querying, a Flask blueprint for the search API, and a 'By Song' search tab in the UI. The feedback suggests moving hardcoded weights to constants, using np.isclose for floating-point comparisons, refactoring redundant normalization logic, and applying np.clip to cosine distance calculations to prevent precision-related errors.
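
The numerical points in this feedback can be sketched as follows. This is an illustration under assumptions, not the code in `tasks/sem_grove_manager.py`: the helper names (`safe_normalize`, `cosine_distance`, `weights_sum_to_one`) are hypothetical, and the float32 machine epsilon is just one reasonable choice of threshold.

```python
import numpy as np

# Assumed threshold: machine epsilon for float32 instead of a magic number.
EPS = float(np.finfo(np.float32).eps)


def safe_normalize(vec):
    """L2-normalize vec; near-zero vectors map to zeros to avoid dividing by ~0."""
    vec = np.asarray(vec, dtype=np.float32)
    norm = float(np.linalg.norm(vec))
    if norm < EPS:
        return np.zeros_like(vec)
    return vec / norm


def cosine_distance(a, b):
    """Cosine distance in [0, 2], clipped to absorb floating-point drift."""
    similarity = float(np.dot(safe_normalize(a), safe_normalize(b)))
    # Rounding can push the similarity slightly outside [-1, 1].
    similarity = float(np.clip(similarity, -1.0, 1.0))
    return 1.0 - similarity


def weights_sum_to_one(lyrics_weight, audio_weight):
    """Compare with a tolerance rather than with == on floats."""
    return bool(np.isclose(lyrics_weight + audio_weight, 1.0))
```

Without the clip, a similarity such as 1.0000001 would produce a small negative distance, which can break sorting assumptions or later calls such as `np.arccos`.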

Comment thread tasks/sem_grove_manager.py Outdated
Comment thread tasks/sem_grove_manager.py Outdated
Comment thread tasks/sem_grove_manager.py
Comment thread tasks/sem_grove_manager.py Outdated
Co-authored-by: Copilot <copilot@github.com>
@NeptuneHub
Owner Author

/gemini review

Co-authored-by: Copilot <copilot@github.com>
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements 'SemGrove' (Semantic + Groove), a search feature that combines lyrics and audio embeddings to provide song-by-song similarity results. The changes include a new API blueprint, integration into the background analysis and cleaning tasks, and a new 'By Song' search tab in the frontend. A review comment identified an inconsistency in the default weight values used for merging embeddings in the manager logic compared to the configuration defaults, which should be aligned for consistency.
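
One way to avoid this kind of drift is to define the defaults once and have both the configuration loader and the manager read them from the same place. The sketch below is an assumption for illustration: the environment variable names and `load_merge_weights` are hypothetical, and only the 0.75 / 0.25 values come from the PR description.

```python
import os
from typing import Mapping, Tuple

# Single source of truth for the defaults (values from the PR description).
DEFAULT_LYRICS_WEIGHT = 0.75
DEFAULT_AUDIO_WEIGHT = 0.25


def load_merge_weights(env: Mapping[str, str] = os.environ) -> Tuple[float, float]:
    """Read the merge weights, falling back to the shared defaults.

    The variable names below are hypothetical, not the project's real settings.
    """
    lyrics = float(env.get("SEM_GROVE_LYRICS_WEIGHT", DEFAULT_LYRICS_WEIGHT))
    audio = float(env.get("SEM_GROVE_AUDIO_WEIGHT", DEFAULT_AUDIO_WEIGHT))
    return lyrics, audio


defaults = load_merge_weights({})
overridden = load_merge_weights({
    "SEM_GROVE_LYRICS_WEIGHT": "0.6",
    "SEM_GROVE_AUDIO_WEIGHT": "0.4",
})
```

Any function that needs a fallback would then reference `DEFAULT_LYRICS_WEIGHT` and `DEFAULT_AUDIO_WEIGHT` instead of repeating literal numbers.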

Comment thread tasks/sem_grove_manager.py Outdated
NeptuneHub and others added 2 commits May 4, 2026 11:38
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Copilot <copilot@github.com>
@NeptuneHub
Owner Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces 'SemGrove,' a feature that combines lyrics and audio embeddings into a merged Voyager index for song-similarity searches. It includes a new manager for index maintenance, a dedicated API blueprint, and a 'By Song' search tab in the user interface. Feedback suggests improving code quality by removing aligned assignment operators, optimizing result validation logic, and eliminating redundant imports within function bodies.

Comment thread app_lyrics.py
Comment thread app_sem_grove.py
Comment thread tasks/sem_grove_manager.py
NeptuneHub and others added 4 commits May 4, 2026 12:22
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
@NeptuneHub NeptuneHub merged commit bf0d595 into main May 4, 2026
17 checks passed
@NeptuneHub NeptuneHub deleted the devel branch May 4, 2026 13:10