Handling non-ascii charset for the response of api call #841

Closed
deskmonster opened this issue Feb 5, 2023 · 21 comments

@deskmonster

deskmonster commented Feb 5, 2023

Sat, 21 Jan 2023 16:26:00 +0000 3516424851832493469 https://nyaa.si/view/1627605 Anime 461688000 [GM-Team][������][������][The Three-Body Problem][2022][08][AVC][GB][1080P] https://localhost:443/nzbhydra/gettorrent/api/xxxxx?apikey=xxxxx Sat, 21 Jan 2023 16:26:00 +0000 xxxxxhttps://nyaa.si/view/1627605 Anime 461688000 [GM-Team][������][������][The Three-Body Problem][2022][07][AVC][GB][1080P]

It would be great if Hydra could handle non-ASCII charsets in API responses just as it does for internal searches.

  • If not clear, why do you want it?
    To identify release names more accurately.

  • Do you think it's something only you need or something that might be popular?
    Popular, since no one likes garbled characters.

@theotherp
Owner

Please provide a way for me to reproduce this.

@deskmonster
Author

Please provide a way for me to reproduce this.

  • Add nyaa.si to Jackett (I chose nyaa because it is an anime tracker, so its results contain many non-ASCII characters).
  • Add Jackett as an indexer in Hydra, filling in the host with nyaa's Torznab feed.
    picture
  • Call the Torznab API endpoint. For example, the request below searches for a Chinese anime whose original title is "三体", and the response contains corrupted characters.
http://localhost:443/nzbhydra/torznab/api?t=tvsearch&cat=5030,5040,5000&extended=1&apikey=(removed)&offset=0&limit=100&q=Three%20Body&season=1

@theotherp
Owner

Works fine for me:
image

@theotherp
Owner

Try making the request against the actual instance instead of the reverse proxy.

@deskmonster
Author

ss

I used the IP address directly, but the result is still garbled.
Tested in Firefox, Edge, Chrome and Sonarr.

@deskmonster
Author

deskmonster commented Feb 5, 2023

I'm using the Docker image built by linuxserver for the time being.
I set up a fresh install from https://github.com/theotherp/nzbhydra2/releases/tag/v5.1.1 , tested it again, and got corrupted characters again. Both versions of NZBHydra are v5.1.1.

My system is Ubuntu 20.04.5 LTS amd64; I also tested on Ubuntu 20.04.5 LTS arm64, with both Docker and the release.

@theotherp
Owner

Please post your debug infos zip.

@theotherp
Owner

Never mind, it's an issue with the Docker image.

@deskmonster
Author

nzbhydra.log
Here is the log, if it is still needed.

If it's an issue with the Docker image, it's strange that the output is still corrupted when I use the binary directly.

@theotherp
Owner

The problem is encoding related (obviously).
On my machine and in a self-built Docker image the reported encoding is UTF-8. In the lsio container the reported encoding is ANSI_X3.4-1968.

That's the log, not the debug infos. You can create them in the system section. After you created them you should find an entry in the log saying "File encoding".
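
For context, ANSI_X3.4-1968 is the formal name of US-ASCII. The following is a minimal Python sketch of how an ASCII default encoding could lead to this kind of output; it is not NZBHydra's actual code path (which is Java), and the helper name `garble_as_ascii` is made up for illustration. It assumes the corruption comes from UTF-8 bytes being decoded as ASCII, in which case every non-ASCII byte becomes U+FFFD and a CJK character (three bytes in UTF-8) turns into three replacement characters, which appears consistent with the garbled titles reported in this thread.

```python
import codecs

# Python also treats ANSI_X3.4-1968 as an alias for plain ASCII.
REPORTED_ENCODING = codecs.lookup("ANSI_X3.4-1968").name


def garble_as_ascii(text):
    """Encode text as UTF-8, then decode the bytes as if they were ASCII.

    Hypothetical helper for illustration only: it mimics a charset
    mismatch and is not taken from NZBHydra's implementation.
    """
    return text.encode("utf-8").decode(REPORTED_ENCODING, errors="replace")


if __name__ == "__main__":
    print(REPORTED_ENCODING)
    print(garble_as_ascii("[三体][The Three-Body Problem]"))
```

ASCII-only parts of a title pass through unchanged, which would explain why the English portion of the release names stays readable.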

@theotherp
Owner

theotherp commented Feb 5, 2023

See docker-nzbhydra2/issues/41

@deskmonster
Author

Yes, it's ANSI_X3.4-1968 in the debug infos for both Docker and the local install. I'm going to look for the language encoding setting.

Thank you for your kind help and for opening the issue on lsio.
Have a nice day!

@Fmajor

Fmajor commented Nov 18, 2023

Same error using the latest docker image

# docker compose file
---
version: "2.1"
services:
  nzbhydra2:
    image: lscr.io/linuxserver/nzbhydra2:latest
    container_name: nzbhydra2
    environment:
      - PUID=297
      - PGID=297
      - TZ=Asia/Shanghai
    volumes:
      - ./config:/config
      - /backupfs/private/workspaces/nzb:/downloads
    ports:
      - 5076:5076
    restart: unless-stopped
# docker info
 Server Version: 23.0.3
 Kernel Version: 6.1.21-gentoo-x86_64
# docker compose
docker-compose version 1.29.2
# image info
26ccec62fff7   lscr.io/linuxserver/nzbhydra2:latest

I made test requests using Python:

url_jackett = "http://172.17.0.1/jackett/api/v2.0/indexers/simpleanime/results/torznab/api?apikey={removed}&t=search&extended=1&q=Kusuriya%20no%20Hitorigoto%2001&password=1&cat=5000&limit=1000"
url_nzb = "http://127.0.0.1:5076/nzbhydra2/torznab/api?t=tvsearch&cat=5030%2C5040%2C5000&apikey={removed}&offset=0&limit=100&q=Kusuriya+no+Hitorigoto+01"

When I query Jackett directly, I get:

  <item>
   <title>
    [天月搬運組] 藥師少女的獨語  Kusuriya no Hitorigoto  01 (NetFlix 1920x1080 AVC AAC MKV)
   </title>

but the XML result from NZBHydra looks like this:

<item>
   <title>
    [���������������] ���������������������  Kusuriya no Hitorigoto  01 (NetFlix 1920x1080 AVC AAC MKV)
   </title>

If I repeat the search in the web frontend (by clicking "Repeat this search with all currently enabled indexers" in the search history list), I also get the correct title, [天月搬運組] 藥師少女的獨語 Kusuriya no Hitorigoto 01 (NetFlix 1920x1080 AVC AAC MKV). This bug only affects the API.

I tried to debug it by searching for "File encoding" in the NZBHydra log file config/logs/nzbhydra2.log, but found nothing.

How can I debug this encoding issue?
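
One way to narrow this down is to check the API responses for U+FFFD (the Unicode replacement character) programmatically. The sketch below assumes the response is a Torznab/RSS feed whose items carry a `<title>` element, as in the excerpts above. The helper names `fetch` and `garbled_titles` are made up for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

REPLACEMENT_CHAR = "\ufffd"


def fetch(url):
    """Return the raw response body as bytes, so no decoding happens here."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def garbled_titles(xml_data):
    """Return the text of every <title> element that contains U+FFFD.

    Accepts bytes (as returned by fetch) or str.
    """
    root = ET.fromstring(xml_data)
    titles = [(element.text or "").strip() for element in root.iter("title")]
    return [title for title in titles if REPLACEMENT_CHAR in title]


if __name__ == "__main__":
    # Inline sample so the sketch runs without a live Jackett or Hydra instance.
    sample = (
        "<rss><channel><item>"
        "<title>[\ufffd\ufffd\ufffd] Kusuriya no Hitorigoto 01</title>"
        "</item></channel></rss>"
    )
    print(garbled_titles(sample))
```

Calling `garbled_titles(fetch(url_jackett))` and `garbled_titles(fetch(url_nzb))` with the two URLs above would show on which side the corruption is introduced.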

@Fmajor

Fmajor commented Nov 18, 2023

Issue resolved by using a different Docker image.
From my tests, these Docker images have the file encoding problem:

    image: lscr.io/linuxserver/nzbhydra2:latest
    image: hotio/nzbhydra2:latest
    image: ghcr.io/hotio/nzbhydra2

All of them report File encoding: ANSI_X3.4-1968 (maybe they are built on the same base container),

and this image works fine:

    image: binhex/arch-nzbhydra2

I am not familiar with Java, but from what I found, the file.encoding property has to be specified when the JVM starts up.

So could we add an extra -Dfile.encoding=UTF-8 option to force Java to use this file encoding, no matter which base OS is used?

@theotherp
Owner

Thanks for the research. I'm not sure setting that property actually fixes anything. I added it to the wrapper in a container of lscr.io/linuxserver/nzbhydra2:latest and the API results are still mangled. The UI, though, does show the correct results. The reported encoding in the log is misleading, I think.

Can you verify that the results are shown properly in the hydra UI?

@theotherp
Owner

Never mind, found the issue.

theotherp added a commit that referenced this issue Nov 18, 2023
@theotherp
Owner

@Fmajor Please check the newest image.

@Fmajor

Fmajor commented Nov 20, 2023

@Fmajor Please check newest image.

The bug is still present in

917400ade716   lscr.io/linuxserver/nzbhydra2:latest

Am I using the right image?

@theotherp
Owner

Sorry, I had to pull that release. Please wait for the next one.

@theotherp
Owner

Should be fixed now.

@Fmajor

Fmajor commented Nov 21, 2023

The bug is fixed in container ee9bc2838785 lscr.io/linuxserver/nzbhydra2:latest
