Files double unpacked due to different UTF-8 normalizations #1633

Open
@Safihre

For a few days the tests on Travis have been failing on macOS: for some reason, test_download_unicode_made_on_windows was unpacking the resulting file twice.
After much debugging I found out that the string frènch_german_demö printed in the logs is actually represented in 2 different ways. This caused the unpacker not to detect that a set was already unpacked, basically because:

>>> "frènch_german_demö" == "frènch_german_demö"
False

The reason is that they are obtained from 2 different sources:
Output of os.listdir:

b'fre\xcc\x80nch_german_demo\xcc\x88'

Output of the par2 files:

b'fr\xc3\xa8nch_german_dem\xc3\xb6'

https://stackoverflow.com/a/26733055
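
To make the mismatch concrete, here is a minimal sketch using the two byte strings above. The os.listdir output carries the accents as separate combining characters (decomposed, NFD-style), while the par2 name uses single precomposed characters (NFC-style). The helper same_name is hypothetical and not part of the existing code; it only illustrates that normalizing both sides to one form with the standard-library unicodedata.normalize makes the names compare equal. Choosing NFC as the common form is an assumption, any single form would work for comparison.

```python
import unicodedata

# Byte strings copied from the report above.
from_listdir = b'fre\xcc\x80nch_german_demo\xcc\x88'.decode('utf-8')  # decomposed form
from_par2 = b'fr\xc3\xa8nch_german_dem\xc3\xb6'.decode('utf-8')       # precomposed form


def same_name(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical helper: compare two names after normalizing both to one form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The raw strings look identical when printed but differ in their code points.
print(from_listdir == from_par2)
print(len(from_listdir), len(from_par2))

# After normalization the comparison succeeds.
print(same_name(from_listdir, from_par2))
```

A fix along these lines would probably need to normalize names wherever they enter the program (directory listings, par2 contents), so that every later comparison sees the same form.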

Now I just need to find a way to fix this.
