Workaround for a problem of irreversible filename conversions #14

Merged
merged 2 commits into from Mar 22, 2013

devgs commented Apr 12, 2012

Annotation:

Converting a UTF-8 filename to the zip's native charset (CP866) can be
a problem, because the conversion is not 1:1. My solution is to let
developers provide their own converted filename and pass it to mod_zip
together with the UTF-8 filename, which goes straight into the Unicode
path extra field (thanks to tony2001). A separator does this without
breaking the current format, and it allows the file name to be passed
in both forms as one string.
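To illustrate why the conversion is not 1:1, here is a minimal sketch in Python. It uses Python's built-in `cp866` codec rather than mod_zip's own conversion, and the helper names `to_native` and `round_trips` are hypothetical, chosen only for this example.

```python
def to_native(name: str, charset: str = "cp866") -> bytes:
    """Encode a filename to the archive charset.

    Characters that have no mapping in the target charset are
    replaced with '?', which is where information gets lost.
    """
    return name.encode(charset, errors="replace")


def round_trips(name: str, charset: str = "cp866") -> bool:
    """Return True if the name survives UTF-8 -> charset -> UTF-8 intact."""
    return to_native(name, charset).decode(charset) == name


# Cyrillic letters exist in CP866, so this name converts cleanly.
print(round_trips("файл.txt"))
# 'é' has no CP866 equivalent, so it is replaced and cannot be recovered.
print(to_native("résumé.txt"))
print(round_trips("résumé.txt"))
```

A backend that knows its own data can often pick a better native name than a generic replacement character, which is the case the separator is meant to cover.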

Normally we pass:
CRC32 [size] [path] [filename]\n
...

  • [filename] is passed to the archive as the filename without conversion

  • The UTF-8 flag for the filename is set

tony2001's X-Archive-Charset: [charset] way:
CRC32 [size] [path] [filename]\n
...

  • [filename] is assumed to be a UTF-8 string

  • [filename] is converted to [charset] and passed to the archive as the filename

  • [filename] is passed to the Unicode path extra field

  • The UTF-8 flag for the filename is not set

My X-Archive-Name-Sep: [sep] solution:
CRC32 [size] [path] [native-filename][sep][utf8-filename]\n
...

  • [native-filename] is passed to the archive as the filename without conversion

  • [utf8-filename] is passed to the Unicode path extra field

  • The UTF-8 flag for the filename is not set
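The line format above could be produced by a backend roughly as follows. This is a sketch under assumptions, not mod_zip's code: `manifest_line` and `split_name` are hypothetical helpers, the CRC32 is written as eight hex digits, and the split is assumed to happen at the first occurrence of the separator. An empty separator is treated as "no UTF-8 name provided".

```python
import zlib
from typing import Optional, Tuple


def manifest_line(data: bytes, path: str, native_name: bytes,
                  utf8_name: str, sep: str = "/") -> bytes:
    """Build one line: CRC32 [size] [path] [native-filename][sep][utf8-filename]\\n

    The line is assembled as bytes because the two names use
    different encodings (native charset and UTF-8).
    """
    crc = format(zlib.crc32(data) & 0xFFFFFFFF, "08x")
    head = f"{crc} {len(data)} {path} ".encode("ascii")
    return head + native_name + sep.encode("ascii") + utf8_name.encode("utf-8") + b"\n"


def split_name(field: bytes, sep: str = "/") -> Tuple[bytes, Optional[str]]:
    """Split the filename field into (native name, UTF-8 name).

    Assumption: the field is split at the first separator. With an
    empty separator the whole field is the native name.
    """
    if not sep:
        return field, None
    native, _, utf8 = field.partition(sep.encode("ascii"))
    return native, utf8.decode("utf-8")


native = "файл.txt".encode("cp866")  # name supplied by the developer
line = manifest_line(b"hello", "/files/1", native, "файл.txt")
field = line.rstrip(b"\n").split(b" ", 3)[3]
print(split_name(field))
```

The example name contains no spaces, so splitting the line on the first three spaces is enough here; a real parser would need its own rule for names that contain them.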

You just need to provide a separator that won't interfere with file
names. I suggest '/', as it is an ASCII character and is forbidden in
file names on most (if not all) platforms.

An empty separator string means no UTF-8 version is provided. This is
useful when we need to pass only names encoded in the native charset.
It is equivalent to 'X-Archive-Charset: native;'.
Note: this is currently impossible after '[PATCH] Support for UTF-8 file
names.' (4f61592), because the UTF-8 flag (zip_utf8_flag) is set by
default for templates.

evanmiller merged commit b643a71 into evanmiller:master Mar 22, 2013