Media analysis fails on ZIP file with exotic charset #41
Comments
I'll test on my end. Non-ASCII characters shouldn't be an issue, as Latin accented characters work well.
I can't reproduce the issue. It works fine on my MacBook with katakana characters in the directory name, the file name, or both. In order for me to dig deeper, could you please provide:
Thanks
I'm running Komga in a Docker container on Ubuntu 18.04.
The manga are bind-mounted from a folder on the hard disk.
Here is the log with the exception raised when parsing the manga with a non-ASCII title.
Thanks a lot for the information. That's an error while accessing the content of the zip file. If you are able to provide me with this particular file, I will investigate more with the debugger and try to find where it's coming from. I have seen a few errors on archives for various reasons that are usually fixed by fixing the archive (extract files, archive again with a proper archiver). But since you mention it's working when you remove the characters, it seems to be coming from something else.
Here are the files.
Thanks. I did a few tests, and I would say it's not coming from the file name alone, but from a combination of name and file. When I use the exact same name as your file, with Japanese characters, on another of my good files, it works. I tried repackaging your file (just extracting, then adding to a new zip), and the resulting file, with the same name as the original, parses properly.
To be honest, I have had a few issues with the native Java zip library, but on less than 1% of the files I tested. Those files would open fine using other archiving utilities (like The Unarchiver). So far I have dismissed the issue, as the remedy is usually as simple as extracting and archiving again. Could you try on your end to extract/archive, and see if you still have the problem? Also, do you have the issue with other files, or just this one? If the problem is more widespread and the workaround does not work, I will need to start looking at alternative zip libraries for Java to better handle the archives.
I think I have figured out the reason. Edit: I noticed that Linux's
Hi, I've just tested reading the archive with Apache Commons Compress and it worked correctly. So perhaps you might consider using it instead of java.util.zip? The interface is pretty similar to the native Java one, so it should be easy to port to this library.
I've done a bit of reading, and indeed the charset of a zip file is a bit confusing, mostly because you have to guess it: it's not stored in the archive. As I mentioned in my previous post, given that the error rate was small, I did not look for any other solution (and it was mostly impacting me!). I'll keep this issue open and have a look at other zip libraries (including the one you mentioned, thanks!) to see if I can replace it.
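To illustrate the charset guessing described above, here is a minimal Java sketch that sticks to java.util.zip and simply tries a list of candidate charsets until the entry names decode. It is not the fix that was adopted in this thread (that was a switch to Apache Commons Compress); the class name `ZipCharsetProbe`, the method `listEntries`, and the idea of passing a candidate list are all hypothetical and only here for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Komga: guesses the entry-name charset by trial.
public class ZipCharsetProbe {

    /**
     * Returns the entry names of the archive, decoded with the first candidate
     * charset that java.util.zip accepts for every entry.
     */
    public static List<String> listEntries(File archive, List<Charset> candidates) throws IOException {
        for (Charset charset : candidates) {
            try (ZipFile zip = new ZipFile(archive, charset)) {
                List<String> names = new ArrayList<>();
                // Depending on the JDK version, a name that cannot be decoded is
                // reported either when opening the file or while enumerating entries.
                for (ZipEntry entry : Collections.list(zip.entries())) {
                    names.add(entry.getName());
                }
                return names;
            } catch (ZipException | IllegalArgumentException e) {
                // This charset did not fit (or the archive is unreadable): try the next one.
            }
        }
        throw new IOException("Could not read entries of " + archive + " with any candidate charset");
    }
}
```

A caller might pass something like UTF-8 first, then a charset that matches where the archives were created. Note that single-byte charsets such as ISO-8859-1 accept any byte sequence, so they should come last, and a "successful" decode with the wrong charset can still yield garbled names.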
I just tried a drop-in replacement of the zip library. I will release a beta version and test it on my complete library; if it works, I'll release that to prod.
replacement of java.util.zip.ZipFile by org.apache.commons.compress.archivers.zip.ZipFile
## [0.10.1](v0.10.0...v0.10.1) (2020-01-01)

### Bug Fixes

* **webui:** remove CDN usage for icons and fonts ([c88a27c](c88a27c)), closes [#45](#45)
* **webui:** show all books when browsing series ([85ca99d](85ca99d))
* **zip extractor:** better handling of exotic charsets ([0254d7d](0254d7d)), closes [#41](#41)
🎉 This issue has been resolved in version 0.10.1 🎉
The release is available on GitHub release
Your semantic-release bot 📦🚀
Thanks a lot!
I have several manga with Japanese characters in the title.
Currently Komga shows those manga in the list view, but without a cover image, and when going to the detail pages it shows "no chapters found" in Tachiyomi and a blank page in the web view.
If I remove the non-ASCII characters from the title (by renaming the folder and the .zip file in it), everything works again.