Archives with undetected character sets #375

Closed
gingerbeardman opened this Issue Feb 28, 2019 · 7 comments

Contributor

gingerbeardman commented Feb 28, 2019

(I have to zip up the LZH archives to add them to GitHub)

Japanese

aonez self-assigned this Mar 1, 2019

Owner

aonez commented Mar 1, 2019

Thanks @gingerbeardman, the charset detection must be improved.

My results were different, though. In my tests:

  • bakaha10: TUA asks for the encoding, Archiver and Keka use the wrong encoding
  • hanaf131: TUA asks for the encoding, Archiver and Keka use the wrong encoding
  • hatiha10: TUA and Archiver are OK, Keka uses the wrong encoding
  • kikyo11: Archiver is OK, TUA guesses the correct encoding, Keka uses the wrong encoding
  • Sengoku-Hanafuda_ver.1.09: Archiver is OK, TUA guesses the correct encoding, Keka asks for the encoding
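
For readers following along, the three behaviours in the list (guessing correctly, using a wrong encoding, asking for the encoding) can be illustrated with a minimal sketch of filename charset guessing. This is not how Keka, Archiver or TUA actually do it; the candidate list and the function names `guess_filename_encoding` and `decode_filename` are assumptions made up for this example. It tries each candidate with strict decoding and reports nothing when none fits, which is the point where a tool would have to ask the user.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical candidate list, ordered from strictest to loosest.
# A real tool would likely use a statistical detector instead.
CANDIDATES: Tuple[str, ...] = ("ascii", "utf-8", "cp932", "euc_jp")


def guess_filename_encoding(
    raw: bytes, candidates: Sequence[str] = CANDIDATES
) -> Optional[str]:
    """Return the first candidate that decodes raw without errors."""
    for encoding in candidates:
        try:
            raw.decode(encoding)  # strict mode raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return encoding
    return None  # no candidate fits: the caller should ask the user


def decode_filename(
    raw: bytes, candidates: Sequence[str] = CANDIDATES
) -> Optional[str]:
    """Decode a stored filename, or return None if no candidate fits."""
    encoding = guess_filename_encoding(raw, candidates)
    return raw.decode(encoding) if encoding is not None else None
```

The order matters: many byte sequences decode without error under more than one legacy encoding, so a first-match approach like this can still pick a wrong charset, which is consistent with the mixed results above.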

aonez added the enhancement label Mar 1, 2019

aonez added this to the 1.2.0 milestone Mar 1, 2019

Contributor Author

gingerbeardman commented Mar 1, 2019

Let's go with your test results. I'm not sure why mine were different.

aonez modified the milestones: 1.2.0, 1.1.13 Mar 1, 2019

aonez added the core label Mar 1, 2019

Owner

aonez commented Mar 1, 2019

Just realized Keka has LHA extraction support but no format detection. I'm fixing that.
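
As a rough illustration of what format detection for LHA can look like (a sketch only, not Keka's implementation): LHA/LZH headers store a five-byte compression method ID such as `-lh5-` starting at byte offset 2, so a simple check can look for that pattern. The function name `looks_like_lha` and the exact character class accepted in the pattern are assumptions for this example.

```python
import re

# Method IDs look like -lh0-, -lh5-, -lzs-, -lhd-.
# The accepted character class is an assumption and may be too broad or too narrow.
_LHA_METHOD_ID = re.compile(rb"-l[hz][0-9a-z]-")


def looks_like_lha(header: bytes) -> bool:
    """Return True if bytes 2..6 of the header look like an LHA method ID."""
    return _LHA_METHOD_ID.fullmatch(header[2:7]) is not None
```

A check like this only covers plain archives. It would not find an archive embedded in a self-extracting EXE, where the LHA data starts somewhere after the executable stub.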

Contributor Author

gingerbeardman commented Mar 2, 2019

Just an addition to this, whilst Keka extracts these files from #374 correctly, the incorrect charset is used for some filenames:

Owner

aonez commented Mar 3, 2019

Thanks @gingerbeardman! Already noted that on #374 (comment).

It's incredible how many LHA variations there are. Do you think Japanese people are still using it for compression? Or are those old files (like golden game gems) that are still compressed in that format?

Contributor Author

gingerbeardman commented Mar 3, 2019

LHA/LZH really took off in Japan because it was created by a Japanese guy. It's still widely used; in fact, Windows 7 Japanese comes with LZH folder support! Not sure about Windows 10 Japanese, but I assume it will still have it.

https://en.wikipedia.org/wiki/LHA_(file_format)

Owner

aonez commented Mar 19, 2019

@gingerbeardman version 1.1.13 was just released with most LZH files properly extracted 😊 As for the EXE ones, I'll try to fix those in #374. It is considerably more complicated to detect the compression method used in some EXE files.

Thanks for the feedback as always!

aonez closed this Mar 19, 2019
