Workaround for a problem of irreversible filename conversions #14
Annotation:
Sometimes converting UTF-8 to zip's native charset (CP866) is a
problem, because the conversion is not 1:1. My solution is to let
developers provide their own converted filename and pass it to mod_zip
along with the UTF-8 filename, which goes straight to the Unicode path
extra field (thanks to tony2001). A separator is a solution that doesn't
break the current format and allows passing the file name in both
formats as one string.
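To illustrate why the conversion is not reversible, here is a minimal
Python sketch. It is not part of mod_zip, and the sample name is made
up; it only round-trips one name through CP866:

```python
# Illustration only: a name that has no exact CP866 representation.
name = "caf\u00e9.txt"  # U+00E9 (e with acute accent) is not in CP866

# Strict encoding refuses the character outright.
try:
    name.encode("cp866")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# A lossy encoder substitutes "?", so the original cannot be recovered.
lossy = name.encode("cp866", errors="replace")
round_trip = lossy.decode("cp866")
```

Because the lossy result no longer matches the original, a backend that
knows a better native spelling may want to supply it explicitly.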
Normally we pass:
CRC32 [size] [path] [filename]\n
...
[filename] passed to archive as filename w/o conversion
UTF-8 flag for filename is set
tony2001's X-Archive-Charset: [charset] way:
CRC32 [size] [path] [filename]\n
...
[filename] is assumed to be a UTF-8 string
[filename] is converted to [charset] and passed to archive as filename
[filename] passed to Unicode path extra field
UTF-8 flag for filename is not set
My X-Archive-Name-Sep: [sep] solution:
CRC32 [size] [path] [native-filename][sep][utf8-filename]\n
...
[native-filename] passed to archive as filename w/o conversion
[utf8-filename] passed to Unicode path extra field
UTF-8 flag for filename is not set
You just need to provide a separator that won't interfere with file
names. I suggest using '/', as it is an ASCII character and is forbidden
as part of a filename on most (if not all) platforms.
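On the backend side, building such a line could look roughly like the
Python sketch below. The helper name manifest_line and the sample values
are hypothetical, and the sketch assumes the path contains no spaces
and that the native name is already encoded by the caller:

```python
def manifest_line(crc32, size, path, native_name, utf8_name, sep="/"):
    """Build one manifest line as bytes (hypothetical helper).

    native_name is bytes already in the archive's native charset;
    utf8_name is a str that will be encoded as UTF-8.
    """
    sep_bytes = sep.encode("ascii")
    utf8_bytes = utf8_name.encode("utf-8")
    # The separator must not occur inside either version of the name.
    if sep_bytes and (sep_bytes in native_name or sep_bytes in utf8_bytes):
        raise ValueError("separator occurs inside a file name")
    head = f"{crc32} {size} {path} ".encode("ascii")
    return head + native_name + sep_bytes + utf8_bytes + b"\n"


# Example: the same Cyrillic name in CP866 and in UTF-8.
name = "\u043e\u0442\u0447\u0451\u0442.txt"
line = manifest_line("1034ab38", 1234, "/files/report.txt",
                     name.encode("cp866"), name)
```

The line is assembled as bytes rather than text because the two halves
of the filename field use different encodings.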
An empty separator string means no UTF-8 version is provided. This is
useful when we need to pass only names encoded in the native charset.
It is equivalent to 'X-Archive-Charset: native;'.
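The splitting rule, including the empty-separator case, can be sketched
in Python as follows. This is not the mod_zip implementation; the
function name is hypothetical, and splitting at the first occurrence of
the separator (and treating a field without a separator as native-only)
are assumptions made for the example:

```python
def split_filename(field, sep):
    """Return (native_name, utf8_name_or_None) for a raw filename field.

    Both arguments are bytes. An empty separator means the whole field
    is the native name and no UTF-8 version was supplied.
    """
    if not sep:
        return field, None
    native, found, utf8 = field.partition(sep)  # first occurrence only
    if not found:
        # Assumption: no separator present means native-only.
        return field, None
    return native, utf8


both = split_filename(b"report.txt/report-utf8.txt", b"/")
native_only = split_filename(b"report.txt", b"")
```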
Note: passing native-only names is currently impossible after
'[PATCH] Support for UTF-8 file names.' (4f61592), because the UTF-8
flag (zip_utf8_flag) is set by default for templates.