[BUG]: Broken character encoding when using the yt-music scraper #714

Closed
pitsi opened this issue Jan 8, 2024 · 11 comments
Labels
bug Something isn't working

Comments

@pitsi

pitsi commented Jan 8, 2024

Describe the bug

When using yt-music as the scraper, non-English characters are broken.

To Reproduce

Launch ytfzf -c yt-music and search for something that returns results in Greek, e.g. "vevilos", as seen below.

Expected behavior

The results shown in fzf should display Greek or any other characters correctly, as they do when YouTube is used as the scraper (ytfzf -c yt).

Screenshots

[Screenshot: 2024-01-08-125814_816x460_scrot]

[Screenshot: 2024-01-08-130027_720x456_scrot]

Information

  • OS: Debian testing/unstable x64
  • Terminal: Urxvt and alacritty
  • Ytfzf version: 2.6.1
  • Output of readlink $(which sh): dash

Additional context

The font on urxvt (black text on a light grey background) is Hack, and on alacritty (light grey text on a black background) it is Cascadia Mono. Both support non-English (UTF-8) characters. I use Hack in every app that needs a monospace font; I only changed the one in alacritty for the screenshot.
Also note that when searching for some other Greek artist, e.g. anser, the search breaks completely, possibly because of some weird result.

$ ytfzf -c yt-music
Search
> anser
jq: parse error: Invalid numeric literal at line 1, column 157145
jq: parse error: Invalid numeric literal at line 1, column 157145
[ERROR]: Nothing was scraped
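
One way to see what sits at the column jq reports is to re-parse the saved response with a parser that exposes the error offset. The sketch below uses Python's json module purely as a diagnostic aid. It is not part of ytfzf, the helper name locate_json_error is hypothetical, and Python's parser may stop at a different position than jq does.

```python
import json


def locate_json_error(payload: str, context: int = 30):
    """Return details about the first JSON syntax error, or None if the text parses.

    `payload` is assumed to be the raw response text saved from the scraper.
    """
    try:
        json.loads(payload)
    except json.JSONDecodeError as err:
        # err.pos is the character offset where parsing failed.
        start = max(err.pos - context, 0)
        return {
            "message": err.msg,
            "line": err.lineno,
            "column": err.colno,
            "excerpt": payload[start:err.pos + context],
        }
    return None


if __name__ == "__main__":
    # A deliberately broken sample; real input would be the saved response.
    print(locate_json_error('{"title": tru}'))
```

Printing the excerpt around the failing offset usually shows whether the parser tripped over a stray byte, a truncated value, or something that is not JSON at all.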
@Euro20179
Collaborator

Euro20179 commented Jan 8, 2024

I spent some time messing around with this, until I realized YouTube gives me broken UTF-8; even my hex viewer can't display it properly. I'm not sure if this is fixable.

example:
image

@pitsi
Author

pitsi commented Jan 8, 2024

Great :(
Does this apply only to the core issue I showed in the screenshot or does it cover the jq parse error as well?

p.s. I will be afk tomorrow, Jan 9th, and probably the day after. Please excuse any or no replies during that time.

@Euro20179
Collaborator

This only applies to the core issue.

Also, I don't get that jq error, but I do get a different jq error:

$ ytfzf -c yt-music
Search
> anser
jq: error (at <stdin>:66): string ("\"L2jSSE1p...) and null (null) cannot have their containment checked
jq: error (at <stdin>:131): string ("\"L2jSSE1p...) and null (null) cannot have their containment checked

This might have to do with the UTF-8 nonsense.

I'm not sure about the one you get.
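
For context, jq prints "cannot have their containment checked" when a containment test such as contains is handed a string on one side and null on the other, which typically means a field expected in the response was missing. The sketch below is a Python analogue of a null-tolerant containment check. It is not ytfzf's code, and the function name safe_contains is hypothetical.

```python
from typing import Optional


def safe_contains(haystack: Optional[str], needle: Optional[str]) -> bool:
    """Substring test that treats a missing (None) value as "no match".

    Mirrors the idea of substituting a default before a containment check
    instead of letting the check fail on a null value.
    """
    if haystack is None or needle is None:
        return False
    return needle in haystack


if __name__ == "__main__":
    print(safe_contains("abcdef", "cd"))  # both sides present
    print(safe_contains("abcdef", None))  # missing needle is tolerated
```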

> p.s. I will be afk tomorrow, Jan 9th, and probably the day after. Please excuse any or no replies during that time.

Ok

@pitsi
Author

pitsi commented Jan 9, 2024

Quick question while I am afk.

Can this be related to... whatever the /usr/share/ytfzf/addons/scrapers/yt-music-utils/convert-ascii-escape.pl perl file does?

@Euro20179
Collaborator

Euro20179 commented Jan 9, 2024

Actually it might. I ran it again, and the data from YouTube looks normal until it gets piped into the script. I thought it was broken from the start; I'll have another look.

Edit: I think I fixed it. Apparently setting binmode(STDOUT, "utf8") makes the output not UTF-8????? Clearly I'm missing something lol.

The fix is in this commit: 6fe8d0d,
I'd recommend just using the development branch to run the changes.
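
A plausible explanation, offered as an assumption rather than something confirmed in this thread: if the script reads its input as raw bytes and STDOUT then has a UTF-8 output layer, every input byte is treated as a single character and encoded again, so text that was already UTF-8 comes out double-encoded. The sketch below reproduces that effect in Python for illustration. It is not the Perl script, the function names are hypothetical, and the sample string is arbitrary.

```python
def double_encode(text: str) -> bytes:
    """Simulate writing already-encoded UTF-8 bytes through a second UTF-8 layer."""
    raw = text.encode("utf-8")          # correct UTF-8 bytes, as received
    as_chars = raw.decode("latin-1")    # each byte mistaken for one character
    return as_chars.encode("utf-8")     # encoded a second time on output


def repair(garbled: bytes) -> str:
    """Undo one round of double encoding."""
    return garbled.decode("utf-8").encode("latin-1").decode("utf-8")


if __name__ == "__main__":
    sample = "Ελληνικά"  # arbitrary Greek sample text
    garbled = double_encode(sample)
    print(garbled.decode("utf-8"))      # mojibake instead of Greek letters
    print(repair(garbled) == sample)
```

Under that assumption, removing the extra output layer lets the original bytes pass through untouched, which would match the behaviour described above.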

@pitsi
Author

pitsi commented Jan 9, 2024

I removed the line from the file, just like in your patch, and it now works as it should, as you can see here. Nothing plays after that though (on the few results that I tried) and I have no idea why, but I will investigate it.

[Screenshot: 2024-01-09-195122_816x460_scrot]

The jq parse error when searching is still present; only the column number at the end has changed:

$ ytfzf -c yt-music
Search
> anser
jq: parse error: Invalid numeric literal at line 1, column 157530
jq: parse error: Invalid numeric literal at line 1, column 157530
[ERROR]: Nothing was scraped

@pitsi
Author

pitsi commented Jan 31, 2024

Thank you for adding this to the stable version of the script, even though it will be its very last release.
I am now waiting for debian to package it and check if the jq error at the end persists.

@pitsi
Author

pitsi commented Feb 1, 2024

Can you please tag 2.6.2 as the latest release so that everyone can see it on the side of the main page?
I hope that debian's package tracker will also see it that way and flag the package as out of date.

Thank you :)

@Euro20179
Collaborator

> I am now waiting for debian to package it and check if the jq error at the end persists.

Tbh it probably will, but you never know.

> I am now waiting for debian to package it

You could probably just download the main ytfzf file and run it directly to test it.

@pitsi
Author

pitsi commented Feb 1, 2024

Debian packages ytfzf, but months after a new release is made. I think 2.6.2 will be packaged in April :D
As for the testing, I already patched the Perl file last time and now it works.

@pitsi
Author

pitsi commented Apr 30, 2024

April ends today and Debian's ytfzf is still on 2.4.1! :D
