metadata encoding detection not working #8844

Closed
jonath92 opened this issue May 20, 2021 · 4 comments
Labels
core:option-param-conf config, parameters, properties, options, shortcuts os:linux

Comments

@jonath92
jonath92 commented May 20, 2021

When opening a file/stream with non-UTF-8 characters (e.g. this stream: 'http://ic7.101.ru:8000/c1_2', which almost always has non-UTF-8 characters in its metadata), those characters are not decoded correctly. The relevant line in the terminal output looks something like this:

[display-tags]  icy-title: ��������� ����� & ����� - � �� ������� - 0:00
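
The replacement characters above are what a decoder typically produces when bytes in a legacy single-byte encoding are interpreted as UTF-8. A minimal Python sketch of that effect, assuming the source sends Windows-1251 (the issue does not state the stream's actual encoding) and using a made-up title:

```python
# Hypothetical title: the real stream's bytes and encoding are not given in the issue.
TITLE = "Привет - Мир"


def as_sent(title: str, source_encoding: str = "cp1251") -> bytes:
    """Bytes a source using `source_encoding` would put on the wire (assumed: cp1251)."""
    return title.encode(source_encoding)


def decode_as_utf8(raw: bytes) -> str:
    """Decode as UTF-8, substituting U+FFFD for every byte that is not valid UTF-8."""
    return raw.decode("utf-8", errors="replace")


raw = as_sent(TITLE)
wrong = decode_as_utf8(raw)   # Cyrillic bytes become U+FFFD, ASCII survives
right = raw.decode("cp1251")  # decoding with the matching codepage restores the text

print(wrong)
print(right)
```

Only the ASCII parts (spaces, the hyphen) survive the wrong decoding, which matches the pattern in the log line above.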

However when I open the same stream with Rhythmbox, it is shown correctly:

[image: Rhythmbox displaying the stream title correctly]

It should also be mentioned that this occurs in other languages as well, e.g. for German streams the "Umlaute" (ä, ö, ü) are not shown correctly.

I first assumed that the problem occurs because mpv hasn't been built with uchardet as the docs mention:

uchardet will be used to guess the charset. (If mpv was not compiled with uchardet, then utf-8 is the effective default.)

but the behavior is still the same on a custom build with uchardet. AFAICS the problem occurs either because I haven't built it correctly or because of a bug. The build log included the following line, which is why I assume I haven't made a mistake:

Checking for uchardet support                                             : yes 

Important Information

  • mpv version: problem occurs on 0.32.0 installed from the Linux Mint apt repo and on 0.33.0-161-g83b4bc622a from a custom build
  • Linux Distribution: Linux Mint 20.1 (which is based on Ubuntu 20.04)

Reproduction steps

  • Build mpv with uchardet
  • Open a file/stream with non-UTF-8 characters in mpv (e.g. this stream: 'http://ic7.101.ru:8000/c1_2', which almost always has non-UTF-8 characters in its metadata).
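
For context on where the title bytes come from: in the ICY (SHOUTcast-style) convention, metadata arrives as blocks interleaved with the audio, each starting with a length byte (in units of 16 bytes) followed by NUL-padded text such as `StreamTitle='...';`. As far as I know, the block itself carries no charset declaration, so the player has to pick or guess an encoding. The following is a rough sketch of parsing one such block from memory; `parse_icy_block` and `make_icy_block` are hypothetical helper names, not mpv or FFmpeg functions, and the sample title is made up:

```python
import re


def make_icy_block(text: bytes) -> bytes:
    """Build a block for testing: length byte (units of 16) + NUL-padded payload."""
    padded_len = -(-len(text) // 16) * 16  # round up to a multiple of 16
    return bytes([padded_len // 16]) + text.ljust(padded_len, b"\x00")


def parse_icy_block(block: bytes) -> dict:
    """Return the key='value'; pairs of one metadata block, values left as raw bytes."""
    if not block:
        return {}
    length = block[0] * 16
    payload = block[1:1 + length].rstrip(b"\x00")
    fields = {}
    # Simplification: a value that itself contains "';" would be cut short.
    for match in re.finditer(rb"(\w+)='(.*?)';", payload):
        fields[match.group(1).decode("ascii")] = match.group(2)
    return fields


# Made-up title encoded as Windows-1251 (an assumption about the source).
sample = make_icy_block(b"StreamTitle='" + "Тест".encode("cp1251") + b"';")
fields = parse_icy_block(sample)
print(fields)
```

The values are deliberately kept as `bytes`: turning them into text is exactly the step that goes wrong in this issue.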

Expected behavior

  • mpv shows the metadata correctly

Actual behavior

  • mpv shows weird symbols in the metadata.

Log file

log.txt

Sample files

Unfortunately I don't have a sample file, but the problem is usually easy to reproduce with the stream mentioned above. It also occurs with this stream: http://wdr-wdr2-rheinruhr.icecast.wdr.de/wdr/wdr2/rheinruhr/mp3/128/stream.mp3. However, in this case it occurs much less often because many titles don't include non-UTF-8 characters.

Any help is much appreciated.

CounterPillow added a commit to CounterPillow/mpv that referenced this issue May 20, 2021
This adds an option to mpv to set the codepage that should be used
for decoding the icy-title metadata. By default, "auto" is chosen,
which uses uchardet for guessing if mpv was built with support for
it, and otherwise effectively uses utf-8.

Fixes mpv-player#8844.
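
The behavior described in this commit message (an explicit codepage, or "auto" with a guess and a UTF-8 default) can be sketched roughly as follows. This is not the code from the commit: `decode_metadata` is a hypothetical helper, and the `fallback` parameter is a crude stand-in for a real detector such as uchardet, which would inspect the bytes instead of using a fixed choice.

```python
def decode_metadata(raw: bytes, codepage: str = "auto", fallback: str = "cp1251") -> str:
    """Hypothetical sketch of codepage selection for a metadata value.

    codepage: an explicit encoding name, or "auto" to guess.
    fallback: assumed legacy encoding used when the bytes are not valid UTF-8
              (stand-in for what a detector like uchardet would return).
    """
    if codepage != "auto":
        # Explicit codepage: trust the user, replace anything undecodable.
        return raw.decode(codepage, errors="replace")
    try:
        # Valid UTF-8 is accepted as-is.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not UTF-8: fall back to the guessed legacy encoding.
        return raw.decode(fallback, errors="replace")


print(decode_metadata("Тест".encode("cp1251")))           # guessed via fallback
print(decode_metadata("Тест".encode("utf-8")))            # already UTF-8
print(decode_metadata(b"K\xf6ln", codepage="latin-1"))    # explicit codepage
```

A fixed fallback only works when the source encoding is known in advance; a statistical detector is what makes "auto" useful across languages.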
@Akemi
Member

Akemi commented May 20, 2021

this is remotely related to #8812(?)

@Akemi Akemi added the core:option-param-conf config, parameters, properties, options, shortcuts label May 20, 2021
@taras133

Is there still no solution to this issue?
mpv 0.33.0

File tags:
icy-title: Pianoboy - � ���� ���
A: 00:00:29 / 00:00:33 (88%) Cache: 3.8s/135KB

I tried setting --metadata-codepage=auto and tried different system encodings, but neither had any effect. This issue is so annoying, please help.

@CounterPillow
Contributor

I haven't gotten around to reworking that PR yet, so there still is no solution to this issue as of now.

@taras133

> I haven't gotten around to reworking that PR yet, so there still is no solution to this issue as of now.

Thanks for the update.

Dudemanguy added a commit to Dudemanguy/mpv that referenced this issue Oct 1, 2023
a343666 made demux options public, so
we can take advantage of that here as well. This lets users guess the
codepage if the stream doesn't use UTF-8 characters. Fixes mpv-player#8844.
Dudemanguy added a commit to Dudemanguy/mpv that referenced this issue Oct 2, 2023
a343666 made demux options public, so
we can take advantage of that here as well. This lets users guess the
codepage if the stream doesn't use UTF-8 characters. Fixes mpv-player#8844.