Invalid UTF-8 (Core Dump) #191
Comments
Hi and thanks for your report. It's most definitely a weird file name and for some reason nobody has encountered this before. Congrats! :) I reckon some of the files originate from Windows or from a system using a non-UTF-8 locale. Given that it's just 170 files, it should be relatively easy to spot. I can easily reproduce this here with a specially crafted file name:
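For anyone trying to spot the offending entry, here is a minimal Python sketch that lists names that are not valid UTF-8. It is not the reproduction mentioned above and not part of dwarfs; the function name `find_non_utf8_names` is made up for illustration. It assumes a POSIX-style file system where names are arbitrary byte strings.

```python
import os


def find_non_utf8_names(root):
    """Return paths (as bytes) under `root` whose own name is not valid UTF-8.

    Hypothetical helper for illustration only; not part of mkdwarfs.
    """
    bad = []
    # Walking with a bytes path makes os.walk yield raw bytes names,
    # so we see exactly what is stored on disk.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")
            except UnicodeDecodeError:
                bad.append(os.path.join(dirpath, name))
    return bad
```

Calling `find_non_utf8_names("/path/to/folder")` returns a list of bytes paths; printing each with `repr()` shows the raw bytes that fail to decode.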
I'm not entirely sure about the best way to fix this yet, but it's certainly not going to take long.
In the meantime, a workaround is to use the |
Fixed in v0.9.1.
FWIW, I did, but I thought this was my problem, so I ran |
I was trying to compress a very large folder with mkdwarfs (v0.8.0) on Arch Linux when I got the error "terminate called after throwing an instance of 'utf8::invalid_utf8' what(): Invalid UTF-8" midway through compression. The verbose output of the error is here: https://pastebin.com/LZaNWgeT. I'm very new to debugging, but at a glance it could be that one of the files has a strange name.
Or possibly the chunk size was too small:
"appending 512 bytes to block 726 @ 16,695,296 from chunkable offset 1,048,109"
The chunkable offset 1,048,109 is 467 away from 2^20, so appending 512 bytes may be what is causing the issue if the result is higher than the block size.
Again, I'm not familiar with this codebase and this is all purely speculation.
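The arithmetic in the speculation above can be checked with a few lines of Python. This only confirms the numbers quoted from the log line; treating 2^20 as a relevant boundary is the reporter's assumption and is not confirmed anywhere in this thread.

```python
# Values quoted from the log line in the report.
chunk_offset = 1_048_109
append_len = 512

# Assumed boundary (the reporter's guess, not a confirmed dwarfs block size).
boundary = 2 ** 20

# Distance from the offset to the assumed boundary.
gap = boundary - chunk_offset

# How far an append of `append_len` bytes would extend past the boundary.
overshoot = chunk_offset + append_len - boundary

print(gap, overshoot)
```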
I hope this can be fixed soon and easily.