Fix: Spotify plugin unable to recognize Chinese and Japanese albums. #5705

dhruvravii · 2025-04-03T17:51:06Z

Fixes for Issue #5699 (close #5699)

Have made changes to spotify.py (in lines 399 - 407) to allow Chinese and Japanese characters to be read. Previously these were converted to their ASCII values which led to them not being matched. Made changes to test_spotify.py to check parsing of Chinese and Japanese characters. The changes were tested and passed.

A file called japanese_track_request.json was made to mimic the Spotify API response since I don't have the credentials. Entries in that will need to be modified with the actual entries.

Let me know if any changes need to be made.

Documentation. No need for change in documentation.
Changelog. No need for changelog entry until review is completed.
Tests. The changes were tested and passed.

github-actions · 2025-04-03T17:51:20Z

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

wisp3rwind · 2025-04-08T14:50:47Z

You have seen that there is already a PR for this? Also is there a reason to ascii decode the query? As far as I can see this is not required by spotify and should be handled closer to the input anyways.

Seems like both PRs were opened almost at the same time. (The other one is #5703.)

As it is done here, the unidecode indeed appears to be without effect, since it's only called if the query is ASCII-only anyway.

However, I'm wondering about the original reasoning for adding unidecode here. Is it still valid (for some languages)? Should there be a fallback to the unidecoded if the direct query yields no or only low-confidence results? Or maybe always query for both, and the autotagger sort it out?

Maybe you could check the git blame and pull requests related to this plugin to see whether there's any discussion of the need for the ASCII conversion?

semohr · 2025-04-08T15:12:58Z

Was added in a commit in part of the initial spotify plugin implementation. The reason for adding is explained in a comment, I don't think it is up to date anymore. We would need to test if spotify still has issues with ascii.

refs:

PR: Add Spotify candidates and item_candidates #3125
Commit: semohr@8c6cc65
Comment: Add Spotify candidates and item_candidates #3125 (comment)

Addition:
Interestingly, the example from the comment still is handled the same way. I.e. artist:deadmau5 album:4×4 doesn't return this album, but artist:deadmau5 album:4x4 does.
We need some decision here, I don't think there is an easy way to support both. I would argue, the user is responsible to normalize their input. E.g. do not use × if x was wanted... Just converting all input seems wrong to me, i.e. it is hard to understand for end users why things are converted. E.g. I'm not fluent in ascii and neither is spotify ;)

semohr · 2025-04-08T18:13:39Z

I think #5189 and maybe #4514 could be related.

semohr · 2025-05-19T13:14:06Z

As there wasn't any commit from @dhruvravii for some time on this. I just added a config option to enable the legacy behavior and adjusted the tests to check for ascii encoding.

Hope we can get this going again.

semohr · 2025-06-01T11:12:23Z

@beetbox/maintainers Could we get another look and some feedback here? I think this is an easy win.

…inese characters

a test to check if the query is ascii/nonascii.

JOJ0 · 2025-06-20T06:54:26Z

@semohr just had a high level look on this and trying to think about what's best from a user's point of view. I often have diffculties finding things on spotify and yes the @mgoltzsche issue you found might also be related to all this.

Basically I think your idea of keeping the original "decode to asciii" strategy with a config option is good and should stay. But also my first thoughts went to something similar to what @wisp3rwind suggested initially: The idea of searching with the original unicode string and then falling back on bad (zero) results to the ascii strategy is worth considering! Would that be difficult to implement?

And last but not least: Does the original idea of the contributor make any sense? Only searching in ascii if all is ascii in the first place? Just throwing this idea in the mix again...

semohr · 2025-06-20T09:00:08Z

The idea of searching with the original unicode string and then falling back on bad (zero) results to the ascii strategy is worth considering! Would that be difficult to implement?

I think it would be rather difficult to reliable check. Getting zero result is quite unusual even for wrong queries. For instance 裘盛戎 - 盗御马 as ascii returns completely unrelated albums.
I guess it should be possible to re-fetch candidates if the previous fetch only return albums or tracks under a given threshold (maybe 60%?). While possible I think this could be quite difficult to implement as we would have to patch some core metadataplugin logic here.

And last but not least: Does the original idea of the contributor make any sense? Only searching in ascii if all is ascii in the first place? Just throwing this idea in the mix again...

I don't think this makes any difference, we send a string-query to spotify (spotify expects utf-8 afaik) and will treat any string identical. Whether we convert or query to a single byte representation (ascii) should therefore make no difference here. I'm not even sure if python treats ascii and utf-16 different...

JOJ0

Since the essential code changes are only a few lines and depend on a config option this is a minimal change and the old behavior can be reverted / be experimented with if people want that. Tiny grammar correction within. Thanks for resurrecting and polishing this PR @semohr

Also I see the point now that it is difficult to decide if zero results is a bad result or if "random" results is worse. So let's keep it simple!

docs/plugins/spotify.rst

Co-authored-by: J0J0 Todos <2733783+JOJ0@users.noreply.github.com>

semohr · 2025-06-20T10:23:18Z

It seems like windows tests are broken at the moment. Will wait with the merge until they are fixed :)

…eetbox#5705) Fixes an issue where each spotify query was converted to ascii before sending. Adds a new config option to enable legacy behaviour. A file called japanese_track_request.json was made to mimic the Spotify API response since I don't have the credentials. Entries in that will need to be modified with the actual entries. Co-authored-by: Sebastian Mohr <sebastian@mohrenclan.de> Co-authored-by: Sebastian Mohr <39738318+semohr@users.noreply.github.com> Co-authored-by: J0J0 Todos <2733783+JOJ0@users.noreply.github.com>

dhruvravii changed the title ~~added modifications to spotify.py to allow reading of Japanese and Ch…~~ Fix: Spotify plugin unable to recognize Chinese and Japanese albums. Apr 3, 2025

This comment was marked as outdated.

Sign in to view

semohr added the spotify label May 12, 2025

semohr requested a review from snejus May 19, 2025 13:17

semohr mentioned this pull request May 19, 2025

Fix: Spotify plugin unable to find Chinese/Japanese albums #5703

Closed

1 task

semohr requested a review from wisp3rwind May 19, 2025 19:15

semohr force-pushed the chinese-japanese-chars-spotify-fix branch from d33337f to dde48a9 Compare June 18, 2025 12:29

dhruvravii and others added 4 commits June 18, 2025 15:21

added modifications to spotify.py to allow reading of Japanese and Ch…

a86a0af

…inese characters

added modifications to spotify.py to allow reading of Japanese and Ch…

e18b675

…inese characters

Added a config option to allow legacy ascii formatting behavior. Added

05aa47d

a test to check if the query is ascii/nonascii.

Moved changelog entry up to unreleased.

aea30a1

semohr force-pushed the chinese-japanese-chars-spotify-fix branch from dde48a9 to aea30a1 Compare June 18, 2025 13:22

Moved the config check into the _construct_search_query method.

86fae7a

JOJ0 approved these changes Jun 20, 2025

View reviewed changes

docs/plugins/spotify.rst Outdated Show resolved Hide resolved

Update docs/plugins/spotify.rst

b2f3860

Co-authored-by: J0J0 Todos <2733783+JOJ0@users.noreply.github.com>

Merge branch 'master' into chinese-japanese-chars-spotify-fix

aaa8ebb

semohr merged commit dd6cb53 into beetbox:master Jul 1, 2025
16 checks passed

semohr mentioned this pull request Jul 7, 2025

Avoid Unidecoding Query Strings in Deezer Metadata Plugin #5860

Open

semohr mentioned this pull request Jul 18, 2025

Avoid Unidecoding Query Strings in Deezer Metadata Plugin #5882

Open

2 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix: Spotify plugin unable to recognize Chinese and Japanese albums. #5705

Fix: Spotify plugin unable to recognize Chinese and Japanese albums. #5705

Uh oh!

dhruvravii commented Apr 3, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Apr 3, 2025

Uh oh!

This comment was marked as outdated.

wisp3rwind commented Apr 8, 2025

Uh oh!

semohr commented Apr 8, 2025 •

edited

Loading

Uh oh!

semohr commented Apr 8, 2025 •

edited

Loading

Uh oh!

semohr commented May 19, 2025 •

edited

Loading

Uh oh!

semohr commented Jun 1, 2025

Uh oh!

JOJ0 commented Jun 20, 2025 •

edited

Loading

Uh oh!

semohr commented Jun 20, 2025 •

edited

Loading

Uh oh!

JOJ0 left a comment •

edited

Loading

Uh oh!

Uh oh!

semohr commented Jun 20, 2025

Uh oh!

Uh oh!

Uh oh!

Fix: Spotify plugin unable to recognize Chinese and Japanese albums. #5705

Fix: Spotify plugin unable to recognize Chinese and Japanese albums. #5705

Uh oh!

Conversation

dhruvravii commented Apr 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Apr 3, 2025

Uh oh!

This comment was marked as outdated.

wisp3rwind commented Apr 8, 2025

Uh oh!

semohr commented Apr 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

semohr commented Apr 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

semohr commented May 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

semohr commented Jun 1, 2025

Uh oh!

JOJ0 commented Jun 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

semohr commented Jun 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JOJ0 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

semohr commented Jun 20, 2025

Uh oh!

Uh oh!

Uh oh!

dhruvravii commented Apr 3, 2025 •

edited

Loading

semohr commented Apr 8, 2025 •

edited

Loading

semohr commented Apr 8, 2025 •

edited

Loading

semohr commented May 19, 2025 •

edited

Loading

JOJ0 commented Jun 20, 2025 •

edited

Loading

semohr commented Jun 20, 2025 •

edited

Loading

JOJ0 left a comment •

edited

Loading