Skip to content

feat(similar-artists): backfill Deezer pictures when Last.fm wins#156

Merged
InstaZDLL merged 3 commits into
mainfrom
feat/similar-artists-deezer-pictures
May 25, 2026
Merged

feat(similar-artists): backfill Deezer pictures when Last.fm wins#156
InstaZDLL merged 3 commits into
mainfrom
feat/similar-artists-deezer-pictures

Conversation

@InstaZDLL
Copy link
Copy Markdown
Owner

@InstaZDLL InstaZDLL commented May 25, 2026

Summary

Last.fm's artist.getSimilar returns the same generic star placeholder URL for every entry β€” their artist-image endpoint was retired in 2019. With a Last.fm key configured, the "Similar artists" carousel on ArtistDetailView was showing a sea of grey stars for every result that wasn't already in the user's library. Users without Last.fm got a better outcome because the cascade fell through to Deezer's /artist/{id}/related, which returns real photos.

Fix the asymmetry by routing the raw cascade output through a new enrich_with_deezer_pictures helper before responding.

How it works

  1. Single SELECT scan against the cross-profile app.metadata_artist cache by LOWER(TRIM(name)) β€” pulls every entry we already have a Deezer picture for, no matter which profile fetched it originally.
  2. Parallel search_artist for cache misses via futures::future::join_all (≀ 12 in flight, bounded by RESULT_LIMIT).
  3. Persist + download β€” metadata_artwork::download_and_cache(url, dir) writes the JPEG once (blake3-keyed, dedup across artists), INSERT … ON CONFLICT(deezer_id) DO UPDATE refreshes the metadata row + the 30-day TTL.
  4. Merge β€” final DTO mapping prefers the Deezer URL over Last.fm's placeholder for picture_url, and resolves picture_path from the in-library hash first β†’ metadata_artist.picture_hash fallback β†’ None.

Caching

  • Metadata (app.metadata_artist) β€” 30-day TTL, shared between profiles.
  • Image bytes (metadata_artwork/<blake3>.jpg) β€” disk cache, no TTL, dedups identical pictures across artists.
  • Subsequent loads β€” both layers hit local disk, zero network round-trip. Works offline (the function short-circuits on offline::is_offline(), serving stale cache only).

What didn't change

  • Cascade priority (Last.fm primary when key configured, Deezer fallback).
  • app.lastfm_similar cache (still stores raw RawSimilar rows with the Last.fm placeholder URL β€” enrichment lives at the read path, not the write path, so the picture stays fresh independently of the affinity scores).
  • Frontend resolveRemoteImage(picture_path, picture_url) resolution β€” the priority logic already prefers local paths.

Test plan

  • cargo test --manifest-path src-tauri/Cargo.toml --lib β†’ 111 passed.
  • cargo check --all-targets clean.
  • Manual smoke (Windows with Last.fm key): open an artist page β†’ "Similar artists" should show real Deezer photos instead of grey stars within 1-2 seconds; re-open the page β†’ instant render from disk cache.
  • Offline mode toggle: open an unseen artist β†’ grey stars expected (no network), open a previously-enriched artist β†’ real photos from disk cache.

Summary by CodeRabbit

  • Nouvelles fonctionnalitΓ©s

    • Les artistes similaires (Last.fm) affichent dΓ©sormais des images enrichies depuis Deezer, avec mise en cache (~30 jours), logique de rΓ©cupΓ©ration en cas de manque et comportement dΓ©gradΓ© sans enrichissement.
    • Limitation de la charge rΓ©seau et enrichissement Γ©chelonnΓ© pour Γ©viter les surcharges.
  • Documentation

    • Doc mise Γ  jour dΓ©crivant le flux d’enrichissement, la rΓ©solution des images, le mode hors‑ligne (lecture du cache) et les tolΓ©rances en cas d’erreur.

Review Change Stack

… cascade

Last.fm's artist.getSimilar has been returning the same generic star
placeholder image URL for every result since they retired their artist-
image endpoint in 2019. With a Last.fm key configured, the similar-
artists carousel was therefore showing a sea of grey stars for every
entry that wasn't already in the user's library β€” only in-library hits
got a picture, courtesy of the existing profile-DB Deezer cache. Users
without a Last.fm key got a better visual outcome because the cascade
fell through to Deezer's /artist/{id}/related, which returns real photo
URLs.

Fix the asymmetry by routing the raw cascade output through a new
enrich_with_deezer_pictures helper before responding. It joins against
the cross-profile app.metadata_artist cache by canonical name, fans out
parallel Deezer search_artist calls for cache misses (≀ 12 in flight),
persists the new rows + downloads the image into the shared
metadata_artwork cache, and merges everything into a per-name
HashMap<canonical_name, (picture_url, picture_hash)> the final DTO
mapping consumes. Result: same UX whether Last.fm is configured or not,
plus a permanent disk cache so subsequent loads are network-free.

No-op when offline mode is on (stale metadata cache only) β€” matches the
existing offline behaviour of every other Deezer-enrichment path.
@InstaZDLL InstaZDLL added scope: backend Rust/Tauri backend (src-tauri/) scope: docs Docs, README, assets type: feat New feature size: m 50-200 lines labels May 25, 2026
@InstaZDLL InstaZDLL self-assigned this May 25, 2026
@InstaZDLL InstaZDLL enabled auto-merge (squash) May 25, 2026 11:59
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

No actionable comments were generated in the recent review. πŸŽ‰

ℹ️ Recent review info
βš™οΈ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 2428b216-67d6-4885-9e0f-92a78dade867

πŸ“₯ Commits

Reviewing files that changed from the base of the PR and between 49bc763 and b70c1c4.

πŸ“’ Files selected for processing (2)
  • docs/features/integrations.md
  • src-tauri/src/commands/similar.rs

πŸ“ Walkthrough

Walkthrough

Ajout d'un enrichissement d'images Deezer pour les artistes similaires Last.fm : la fonction enrich_with_deezer_pictures consulte le cache app.metadata_artist, lance des requΓͺtes parallΓ¨les Deezer pour les manques, tΓ©lΓ©charge l'artwork et persiste avec TTL 30 jours. get_similar_artists intΓ¨gre cet enrichissement et priorise les sources Deezer pour picture_url et picture_path. Documentation utilisateur et changelog mises Γ  jour.

Changes

Enrichissement d'images Deezer pour similar artists

Layer / File(s) RΓ©sumΓ©
DeezerClient clonable (prΓ©requis)
src-tauri/src/deezer.rs
DeezerClient dΓ©rive Clone pour permettre la duplication du client lors d'opΓ©rations concurrentes.
Enrichissement Deezer et intΓ©gration dans get_similar_artists
src-tauri/src/commands/similar.rs
Nouvelle fonction helper enrich_with_deezer_pictures : charge le cache app.metadata_artist, calcule les misses, interroge Deezer en parallèle, télécharge/cache l'artwork et persiste via upsert avec TTL 30 jours. get_similar_artists appelle cet enrichissement et priorise picture_url/picture_path issus du cache Deezer (sinon fallback fournisseur).
Documentation de la feature
docs/features/integrations.md, CLAUDE.md
Changelog et guide utilisateur dΓ©crivent le mΓ©canisme de picture enrichment, la rΓ©solution picture_url/picture_path Γ  partir du cache Deezer, les limites de concurrence, le comportement offline, et la dΓ©gradation silencieuse en cas d'erreur DB.

Sequence Diagram(s)

sequenceDiagram
  participant get_similar_artists
  participant enrich_with_deezer_pictures
  participant metadata_artist as app.metadata_artist
  participant DeezerAPI as Deezer search_artist
  participant artwork as artwork download

  get_similar_artists->>enrich_with_deezer_pictures: canonical names
  enrich_with_deezer_pictures->>metadata_artist: charge entrΓ©es non expirΓ©es
  enrich_with_deezer_pictures->>DeezerAPI: search_artist parallèle (misses)
  DeezerAPI->>enrich_with_deezer_pictures: picture_url + artist data
  enrich_with_deezer_pictures->>artwork: tΓ©lΓ©charge artwork si disponible
  artwork->>enrich_with_deezer_pictures: retourne picture_hash local
  enrich_with_deezer_pictures->>metadata_artist: upsert avec expires_at (TTL 30j)
  enrich_with_deezer_pictures-->>get_similar_artists: map canonical→(url, hash)
  get_similar_artists->>get_similar_artists: fusionne dans DTO retournΓ©
Loading

Estimated code review effort

🎯 3 (Modéré) | ⏱️ ~25 minutes

Suggested labels

size: l

Poem

🎡 Images trouvées, cache qui dort trente jours
Quand Last.fm chante, Deezer peint le contour
Offline, le registre chuchote sans bruit
Parallèle et persistant, le backfill s'enfuit
Similar artists sourient, leur visage a du jour ✨

πŸš₯ Pre-merge checks | βœ… 5
βœ… Passed checks (5 passed)
Check name Status Explanation
Title check βœ… Passed Le titre suit la convention Conventional Commits avec un scope kebab-case et dΓ©crit clairement le changement principal : l'enrichissement des images Β« similar artists Β» via Deezer quand Last.fm fournit le rΓ©sultat.
Description check βœ… Passed La description couvre les points essentiels (problΓ¨me, solution, implΓ©mentation, caching, tests). Elle remplit les sections Summary, How I tested et le contexte Checklist, rΓ©pondant aux exigences du template.
Docstring Coverage βœ… Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check βœ… Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check βœ… Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
πŸ“ Generate docstrings
  • Create stacked PR
  • Commit on current branch
πŸ§ͺ Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/similar-artists-deezer-pictures

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

πŸ€– Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src-tauri/src/commands/similar.rs`:
- Around line 243-247: La lecture cache et l’upsert avalent les erreurs
actuellement β€” remplace l’if let Ok(rows) = q.fetch_all(pool).await { ... }
silencieux par une gestion d’erreur explicite: utilisez match ou l’opΓ©rateur ?
sur q.fetch_all(pool).await dans la fonction contenant canonical_name et
map.insert pour propager une Err(Result) ou logger l’erreur (par ex. via
tracing::error!) avec le contexte ("fetch_all for cache" ou "upsert cache")
avant de retourner/propager l’erreur; applique le mΓͺme changement aux blocs
similaires autour des symboles q.fetch_all, map.insert et toute fonction
d’upsert dans les sections correspondantes (lines ~301-319).
- Around line 252-287: The current fan-out launches a request per item in
misses, ignoring the "≀ 12" concurrency cap and the later take(RESULT_LIMIT);
change the pipeline so you only process up to RESULT_LIMIT entries from misses
and run at most 12 concurrent Deezer requests: replace the join_all over
misses.iter() with a stream-based approach (e.g.,
futures::stream::iter(misses.into_iter().take(RESULT_LIMIT)).map(|r| async move
{ ... }).buffer_unordered(CONCURRENCY_LIMIT).collect()) using a constant
CONCURRENCY_LIMIT = 12 and the existing RESULT_LIMIT, keeping the same logic
inside the async block that calls DeezerClient::search_artist and canonical_name
to produce (canon, hit).
- Around line 224-241: The SQL comparison uses LOWER(TRIM(name)) but the bound
values in canonicals were produced by canonical_name(), so names with
punctuation (e.g., AC/DC, P!nk) mismatch; update the WHERE clause so the DB-side
expression matches canonical_name() normalization (for example apply the same
lowercase+strip-non-alnum transformation on name using a regexp replace) and
keep binding canonicals as-is; change the SQL string built in sql (used to
create q) to use that canonicalized expression against the ({placeholders}) list
so app.metadata_artist lookups align with canonical_name().
πŸͺ„ Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
βš™οΈ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 90592c25-3848-4e4c-9841-7fc1bc3262f5

πŸ“₯ Commits

Reviewing files that changed from the base of the PR and between e5fc3a0 and f7d7d50.

πŸ“’ Files selected for processing (3)
  • CLAUDE.md
  • docs/features/integrations.md
  • src-tauri/src/commands/similar.rs

Comment thread src-tauri/src/commands/similar.rs Outdated
Comment thread src-tauri/src/commands/similar.rs Outdated
Comment thread src-tauri/src/commands/similar.rs Outdated
Three issues raised in PR #156 review, all valid against the current
code:

1. **Silent failure swallowing.** `if let Ok(rows) = q.fetch_all(...)`
   and `let _ = sqlx::query(...).execute(...)` were dropping DB errors
   on the floor β€” direct violation of the project's no-silent-failures
   policy. Both now match the Err arm and log via `tracing::warn!` with
   the right context (cache lookup vs upsert).

2. **Concurrency cap absent + over-enrichment.** `join_all` over the
   raw miss list could fan out > RESULT_LIMIT requests when the Deezer
   fallback returned > 12 hits, and the surplus was thrown away by
   `.take(RESULT_LIMIT)` downstream. Switched to
   `futures::stream::iter(...).buffer_unordered(CONCURRENCY_LIMIT)` and
   sliced misses to `raw[..RESULT_LIMIT]` upfront so every fired
   request maps to an entry that'll actually display.

3. **Canonical-name SQL mismatch.** The cache lookup used
   `LOWER(TRIM(name))` while `canonicals` came from `canonical_name()`
   which strips non-alphanumerics. Artists like AC/DC and P!nk would
   never match, forcing a Deezer round-trip every request. SQLite's
   stock build has no REGEXP function so an equivalent SQL predicate
   isn't reachable without a stored column + migration. Switched to a
   `WHERE expires_at > ?` scan + Rust-side `canonical_name()` filter β€”
   table is bounded at ~thousands of rows so the scan is sub-ms, and
   the comparison is now byte-for-byte identical to the scanner's.

Side effects: 1) offline mode now serves stale cache instead of
returning an empty map (the cache read is local); 2) `DeezerClient`
derives `Clone` so the buffered stream can stamp it into each future
without HRTB lifetime grief β€” internal field is already Arc-backed so
clone is a refcount bump.
@InstaZDLL
Copy link
Copy Markdown
Owner Author

InstaZDLL commented May 25, 2026

@coderabbitai Addressed all three findings in 49bc763.

1. Silent failure swallowing β€” fixed
Both q.fetch_all(pool).await and the INSERT … ON CONFLICT upsert now match the Err arm and log via tracing::warn! with explicit context. Enrichment still degrades gracefully on DB failure (correct behaviour β€” the endpoint must keep returning the raw similar-artist list) but errors are no longer invisible.

2. Concurrency cap + RESULT_LIMIT trimming β€” fixed

  • Added const CONCURRENCY_LIMIT: usize = RESULT_LIMIT;
  • Misses sliced to raw[..raw.len().min(RESULT_LIMIT)] upfront so we never enrich entries the caller's .take(RESULT_LIMIT) will drop.
  • Swapped join_all for futures::stream::iter(...).buffer_unordered(CONCURRENCY_LIMIT).collect().
  • Side effect: had to #[derive(Clone)] on DeezerClient so each future of the buffered stream can stamp its own copy (internal reqwest::Client is already Arc-backed, clone is a refcount bump).

3. Canonical-name SQL mismatch β€” fixed (chose Rust-side filter)
Confirmed against scan::canonical_name β€” it does lowercase + strip non-alphanumeric + collapse whitespace. LOWER(TRIM("AC/DC")) = "ac/dc" β‰  canonical_name("AC/DC") = "ac dc", so every name with punctuation missed the cache.

SQLite's standard build has no REGEXP function and no user-defined alphanumeric filter, so an equivalent SQL predicate isn't reachable without a stored canonical column + migration. Took the simpler path: dropped the WHERE name IN (...) filter, kept WHERE expires_at > ?, and filter in Rust via canonical_name(&row.name) against a HashSet of targets. Scan cost is sub-ms (table bounded at ~thousands of rows in steady state) and the comparison is now byte-for-byte identical to the scanner's. Commented the tradeoff in the function doc and noted the migration path if profiling ever flags this.

Bonus side effect: offline mode now serves stale cache instead of returning an empty map (the cache read is local β€” no reason to skip it). Matches the existing offline behaviour of get_similar_artists itself.

CI: cargo test --lib β†’ 111 passed, cargo check --all-targets clean.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

Tip

For best results, initiate chat on the files or code changes.

@InstaZDLL All three fixes verified in the diff β€” implementation looks correct.

  • Fix 1 βœ… Both fetch_all and the upsert execute have Err(err) arms with tracing::warn!(?err, ...) carrying full context. Graceful degradation preserved.
  • Fix 2 βœ… CONCURRENCY_LIMIT = RESULT_LIMIT, miss slice capped at RESULT_LIMIT, buffer_unordered used. DeezerClient clone comment accurately explains it's a refcount bump. Good catch that map.insert(canon, ...) still fires even when the upsert errors β€” the in-flight enrichment still benefits the current response.
  • Fix 3 βœ… Full table scan + Rust-side canonical_name() / HashSet filter is the right call given SQLite's lack of a user-definable alphanumeric function. Comment documents the stored canonical_name column + index migration path for later. Offline bonus (cache read before the is_offline() guard) is a clean improvement.

Nothing blocking. 🐰

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

πŸ€– Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src-tauri/src/commands/similar.rs`:
- Around line 249-253: La requΓͺte actuelle utilise sqlx::query_as::<_, (String,
Option<String>, Option<String>)> avec "FROM app.metadata_artist WHERE expires_at
> ?" et la branche offline renvoie immΓ©diatement le map, ce qui fait perdre les
images locales quand le TTL est expirΓ©; changez la logique pour appeler
offline::is_offline() en amont et, si offline, exΓ©cuter une requΓͺte sans le
filtre "WHERE expires_at > ?" sur app.metadata_artist pour lire le cache local
(y compris picture_url/picture_hash) puis court-circuiter toute tentative de
refresh rΓ©seau; appliquez la mΓͺme modification aux autres occurrences
mentionnΓ©es (les blocs autour des lignes Γ©quivalentes 291-294).
πŸͺ„ Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
βš™οΈ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 63615fe8-f418-4dba-b1ca-edc08415d015

πŸ“₯ Commits

Reviewing files that changed from the base of the PR and between f7d7d50 and 49bc763.

πŸ“’ Files selected for processing (3)
  • docs/features/integrations.md
  • src-tauri/src/commands/similar.rs
  • src-tauri/src/deezer.rs

Comment thread src-tauri/src/commands/similar.rs Outdated
The cache lookup applied `WHERE expires_at > ?` unconditionally and
the offline check sat downstream. Net effect: when the user was
offline AND the metadata_artist row had expired, the enrichment
helper returned an empty map for that artist β€” even though the
`metadata_artwork/<hash>.jpg` blob was still on disk and a stale
picture is strictly better than a grey star (we have no way to refresh
the metadata until network returns anyway).

Split the cache query in two: the offline branch reads
`SELECT … FROM app.metadata_artist` (no TTL filter) and short-circuits
before the miss computation; the online branch keeps the
`WHERE expires_at > ?` predicate so expired rows correctly fall
through to a Deezer refresh.

The shared metadata_artwork blob cache has no TTL, so any hash we
return that still has a file on disk will render β€” `existing_path`
already guards the resolution. Hashes whose file was purged just
degrade to `picture_url` (remote) which won't load offline but
matches the pre-change behaviour for missing files.
@InstaZDLL InstaZDLL added size: l 200-500 lines and removed size: m 50-200 lines labels May 25, 2026
@InstaZDLL
Copy link
Copy Markdown
Owner Author

InstaZDLL commented May 25, 2026

@coderabbitai Addressed in b70c1c4.

Finding valid. Confirmed against current code: the WHERE expires_at > ? predicate was applied unconditionally, then the offline short-circuit sat downstream. Net effect for offline users with an expired metadata_artist row: empty map returned even though the metadata_artwork/<hash>.jpg blob was still on disk β€” exactly the regression CodeRabbit flagged.

Fix: split the cache query in two paths based on offline::is_offline():

  • Offline: SELECT … FROM app.metadata_artist (no TTL filter) + short-circuit before the miss computation.
  • Online: SELECT … WHERE expires_at > ? (unchanged β€” expired rows still fall through to a Deezer refresh).

The shared metadata_artwork/<hash>.jpg blob cache has no TTL of its own; metadata_artwork::existing_path already filters the resolution to files that actually exist on disk, so a returned hash whose file got purged just degrades to picture_url (which won't load offline but matches pre-change behaviour for missing files).

Both occurrences the reviewer pointed at (cache lookup ~249-253 + offline short-circuit ~291-294) handled in the same refactor. The miss-computation block now only runs when online, no more crate::offline::is_offline() in the if miss_names.is_empty() guard.

Tests: 111 passed.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

Tip

For best results, initiate chat on the files or code changes.

@InstaZDLL Fix is correct. Splitting the query at the is_offline() branch point rather than post-filtering is the right approach β€” offline users now get stale cache rows (and their on-disk blobs), while online users still get the TTL-filtered refresh path. Clean refactor.

πŸ‡ βœ“

@InstaZDLL InstaZDLL merged commit 32c2e6d into main May 25, 2026
14 checks passed
@InstaZDLL InstaZDLL deleted the feat/similar-artists-deezer-pictures branch May 25, 2026 12:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

scope: backend Rust/Tauri backend (src-tauri/) scope: docs Docs, README, assets size: l 200-500 lines type: feat New feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant