UTF-8 (and '...') handling #603

Open
braoult opened this issue Jul 17, 2024 · 9 comments

@braoult

braoult commented Jul 17, 2024

After starting Satunes and opening the Albums tab, the following is displayed:

[Screenshot "hor": two album names rendered incorrectly in the Albums tab]

The left faulty album name should show Ainsi Soit Je... (with 3 trailing dots, therefore only ASCII here).
The right faulty one should show いぶき (UTF-8).

@antoinepirlot
Owner

Hey,

Thanks for reporting.

Hmm, I thought it was already fixed 🤔.

I will check that, thanks

@antoinepirlot antoinepirlot self-assigned this Jul 17, 2024
@antoinepirlot antoinepirlot added the bug Something isn't working label Jul 17, 2024
@antoinepirlot antoinepirlot added this to the v2.0.0 milestone Jul 17, 2024
@braoult
Author

braoult commented Jul 17, 2024

Thanks a lot. I forgot: I use Android 14.

@antoinepirlot
Owner

Thanks a lot. I forgot: I use Android 14.

Thanks

@braoult
Author

braoult commented Jul 17, 2024

If you need, I can transfer you some faulty Music files, but this would not be possible here, due to © issues...

@antoinepirlot
Owner

If you need, I can transfer you some faulty Music files, but this would not be possible here, due to © issues...

You can send them to me by email if you prefer, at pirlot.antoine@outlook.com

@braoult braoult mentioned this issue Jul 17, 2024
@antoinepirlot
Owner

antoinepirlot commented Jul 17, 2024

Issue reproduced for the ellipsis, but not for いぶき.

I also noticed that it does not happen every time. Setting the album name to "Ainsi Soit Je..." with the ellipsis character shows wrong characters instead of "...",
but "いぶき ..." (still with the ellipsis character) does not show wrong characters.

It's due to how the characters in the file information are formatted, either by the OS, by the program used to edit the names, or by Jetpack Compose.

Later, I'll also look into making the app accept different formats to avoid this issue.

@antoinepirlot antoinepirlot removed this from the v2.0.0 milestone Jul 17, 2024
@antoinepirlot antoinepirlot added this to the v2.1.0 milestone Aug 4, 2024
@antoinepirlot antoinepirlot modified the milestones: v2.1.0, v2.2.0 Aug 11, 2024
@antoinepirlot
Owner

I don't have enough knowledge about that at this time.

@antoinepirlot antoinepirlot modified the milestones: v2.2.0, v2.4.0, v2.3.0 Aug 30, 2024
@braoult
Author

braoult commented Aug 31, 2024

Usually, to fix UTF-related issues, one can first dump the data to see exactly which characters it contains.
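As a rough illustration of what "dumping the data" can look like, here is a minimal Python sketch that lists the code point, UTF-8 bytes, and Unicode name of each character in a string. The helper name `dump_chars` is hypothetical, and the sample string is only an example; this is not the tool that was actually used on the sample files.

```python
import unicodedata


def dump_chars(text: str) -> list[str]:
    """Return one line per character: code point, UTF-8 bytes (hex), Unicode name."""
    lines = []
    for ch in text:
        utf8_hex = ch.encode("utf-8").hex(" ")
        # Some characters (e.g. control characters) have no name in the database.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"U+{ord(ch):04X}  {utf8_hex:<11}  {name}")
    return lines


# Example: the last two letters of the album title plus the ellipsis character.
for line in dump_chars("Je\u2026"):
    print(line)
```

Running the same helper on a tag value read from a file makes it easy to tell three ASCII dots (three U+002E entries) from a single U+2026.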

For the "Ainsi Soit Je…" case, I just double-checked and dumped the data (filenames and MP3 tags) from the sample I shared with you.

My mistake: some of the ellipses in this case are not three ASCII dots, but the Unicode HORIZONTAL ELLIPSIS character (U+2026):

  • The ellipsis in the directory name "Mylène Farmer/Ainsi Soit Je…/" is U+2026.
  • In the "L'Horloge" MP3, the ellipsis in the TALB (album) tag ("Ainsi Soit Je…") is also U+2026.

This album tag should be decoded (which may not even be necessary) and displayed as standard UTF-8. How do you process it in this case to end up with a wrong display?

Note: U+2026 is encoded in UTF-8 as the bytes e2 80 a6 (hex).
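To make the byte values concrete, the Python sketch below encodes U+2026 as UTF-8 and then shows one common way such bytes end up garbled: decoding them with a single-byte code page. Choosing Windows-1252 (`cp1252`) here is purely an assumption for illustration; nothing in this thread confirms that this is what happens in Satunes.

```python
ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS

# One character becomes three bytes in UTF-8.
utf8_bytes = ELLIPSIS.encode("utf-8")
print(utf8_bytes.hex(" "))

# Assumption for illustration: a reader wrongly treats the UTF-8 bytes as
# Windows-1252, so each byte turns into its own (wrong) character.
garbled = utf8_bytes.decode("cp1252")
print(len(garbled), [f"U+{ord(c):04X}" for c in garbled])

# If that is the cause, the damage is reversible: re-encode with the wrong
# code page, then decode the bytes as UTF-8.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired == ELLIPSIS)
```

If the wrong characters on screen match the `garbled` value, the bytes in the file are fine and only the decoding step is off.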

EDIT: It may be more complicated than I thought: a 2015 discussion in the mp3tag community seems to indicate that some choices have to be made, which may not work everywhere.
Maybe you should avoid spending time on this issue until you have more information on how the tags should be encoded. It may even differ depending on the ID3v2 version :-(
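For context on the version difference: in ID3v2 text frames such as TALB, the first byte of the frame body declares the text encoding. ID3v2.3 defines values 0x00 (ISO-8859-1) and 0x01 (UTF-16 with BOM); ID3v2.4 adds 0x02 (UTF-16BE without BOM) and 0x03 (UTF-8). The Python sketch below decodes a frame body on that basis. The name `decode_text_frame` is hypothetical, this is not how Satunes reads tags, and it assumes the file really uses the encoding it declares, which files in the wild do not always do.

```python
# Encoding byte -> Python codec, following the ID3v2 text-frame layout.
ID3V2_TEXT_CODECS = {
    0x00: "latin-1",    # ISO-8859-1 (ID3v2.3 and ID3v2.4)
    0x01: "utf-16",     # UTF-16 with BOM (ID3v2.3 and ID3v2.4)
    0x02: "utf-16-be",  # UTF-16BE without BOM (ID3v2.4 only)
    0x03: "utf-8",      # UTF-8 (ID3v2.4 only)
}


def decode_text_frame(body: bytes) -> str:
    """Decode the body of an ID3v2 text frame (encoding byte + text).

    Hypothetical helper: trusts the declared encoding and does not try to
    detect mislabeled frames.
    """
    if not body:
        return ""
    codec = ID3V2_TEXT_CODECS.get(body[0])
    if codec is None:
        raise ValueError(f"unknown ID3v2 text encoding byte: {body[0]:#04x}")
    # Drop the optional null terminator after decoding.
    return body[1:].decode(codec).rstrip("\x00")


# Example frame bodies built by hand for the two album names in this issue.
print(decode_text_frame(b"\x03" + "Ainsi Soit Je\u2026".encode("utf-8")))
print(decode_text_frame(b"\x01" + "いぶき".encode("utf-16")))
```

Reading one byte and picking a codec is cheap, so if the slowdown comes from somewhere, it is more likely from parsing the files themselves than from this decoding step; that is a guess, not a measurement.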

@antoinepirlot
Owner

antoinepirlot commented Aug 31, 2024

Yeah, I spent a lot of time trying to find a solution that doesn't make Satunes take longer to load.

I looked into ID3v2, but I didn't find a way to handle the encoding type without a huge performance impact.

Also, I'm starting a master's degree in computer science; I hope I will find a solution once I know more 🤭.

I'm still checking for a solution.

Thanks for the link

@antoinepirlot antoinepirlot modified the milestones: v2.3.0, v2.4.0 Sep 11, 2024
@antoinepirlot antoinepirlot modified the milestones: v2.4.0, v2.5.0 Oct 21, 2024