Distutils produces metadata in unknown encoding #197

Closed
bb-migration opened this Issue May 3, 2014 · 3 comments


bb-migration commented May 3, 2014

Originally reported by: jaraco (Bitbucket: jaraco, GitHub: jaraco)


In Pull Request 45, Jurko observed that on Python 3.1, Python generates the metadata files in an encoding that depends on the build user's environment. By contrast, on Python 2.6 and 2.7, and starting with Python 3.2, the content is encoded using UTF-8.

pkg_resources currently assumes the metadata is UTF-8, so if non-ASCII characters are present and egg_info is run on Python 3.1 or earlier, the resulting metadata will fail to load on Python 3.2+.
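The mismatch can be reproduced without distutils or pkg_resources. The following is only a minimal sketch: the helper names, the sample `PKG_INFO` text, and the choice of `latin-1` as a stand-in for an environment-dependent encoding are all assumptions made for illustration.

```python
import io
import os
import tempfile


def write_metadata(path, text, encoding):
    # Write the metadata text using the given encoding.
    with io.open(path, "w", encoding=encoding) as f:
        f.write(text)


def read_metadata_as_utf8(path):
    # Read the file back the way a UTF-8-assuming consumer would.
    with io.open(path, encoding="utf-8") as f:
        return f.read()


def roundtrips(text, write_encoding):
    # True if text written in write_encoding can be read back as UTF-8.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_metadata(path, text, write_encoding)
        try:
            return read_metadata_as_utf8(path) == text
        except UnicodeDecodeError:
            return False
    finally:
        os.remove(path)


# Hypothetical metadata with one non-ASCII character in the author name.
PKG_INFO = u"Metadata-Version: 1.0\nAuthor: Jos\u00e9\n"
```

With this sketch, `roundtrips(PKG_INFO, "utf-8")` succeeds while `roundtrips(PKG_INFO, "latin-1")` does not. Pure-ASCII metadata round-trips either way, which is why the problem only shows up once non-ASCII characters are present.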



bb-migration commented May 3, 2014

Original comment by jaraco (Bitbucket: jaraco, GitHub: jaraco):


Monkey-patch the write_pkg_info method on Python 3.1 DistributionMetadata. Fixes #197
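For readers unfamiliar with the approach, a monkey-patch of this kind might look roughly like the sketch below. It is not the actual commit. `FakeMetadata` is a hypothetical stand-in for `DistributionMetadata` so that the example runs without distutils, and it assumes `write_pkg_info(base_dir)` opens `PKG-INFO` in `base_dir` and delegates to `write_pkg_file(file)`.

```python
import io
import os
import shutil
import tempfile


class FakeMetadata(object):
    """Hypothetical stand-in for distutils' DistributionMetadata."""

    def __init__(self, author):
        self.author = author

    def write_pkg_file(self, file):
        # Write the metadata fields to an already opened text file.
        file.write(u"Metadata-Version: 1.0\n")
        file.write(u"Author: %s\n" % self.author)

    def write_pkg_info(self, base_dir):
        # Mimics the problem: latin-1 stands in for an encoding
        # picked up from the build user's environment.
        path = os.path.join(base_dir, "PKG-INFO")
        with io.open(path, "w", encoding="latin-1") as f:
            self.write_pkg_file(f)


def write_pkg_info_utf8(self, base_dir):
    # Replacement method: always write PKG-INFO as UTF-8.
    path = os.path.join(base_dir, "PKG-INFO")
    with io.open(path, "w", encoding="utf-8") as f:
        self.write_pkg_file(f)


# The monkey-patch: swap the method on the class.
FakeMetadata.write_pkg_info = write_pkg_info_utf8


def demo():
    # Write PKG-INFO through the patched method and return its raw bytes.
    base_dir = tempfile.mkdtemp()
    try:
        FakeMetadata(u"Jos\u00e9").write_pkg_info(base_dir)
        with io.open(os.path.join(base_dir, "PKG-INFO"), "rb") as f:
            return f.read()
    finally:
        shutil.rmtree(base_dir)
```

Because the patch replaces the method on the class, every instance picks it up, including ones created before the patch was applied.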


bb-migration commented May 4, 2014

Original comment by jurko (Bitbucket: jurko, GitHub: jurko):


Yup, just checked 2.6.6, 2.7.6 & 3.4.4 and they all do correct utf-8 encoding (although Python2 versions require that the data be given as unicode and not str in order for the encoding to be applied).


bb-migration commented May 12, 2014

Original comment by jurko (Bitbucket: jurko, GitHub: jurko):


I added pull request #52, generalizing the solution for this issue to more Python versions.

Some more background information on this issue:

  • Python 2.x supports writing package meta data given as utf-8 encoded byte strings, and since Python 2.6 it also supports writing package meta data given as a unicode string (CPython commit 4c683ec4415b3c4bfbc7fe7a836b949cb7beea03)
  • Python 3.x only supports writing package meta data given as a unicode string
  • Python [3.0 - 3.2.2> does not support writing package meta data containing non-ASCII characters due to a distutils bug
  • Python 3.2.2 fixes the distutils bug (CPython commit fb4d2e6d393e96baac13c4efc216e361bf12c293)

setuptools commit 1cd816bb7c933eecd9d8464e054b21c7d5daf2df works around the non-ASCII character issue for Python version 3.1.

Pull request #52 applies the same workaround for Python version range [3.0 - 3.2.2>.
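The half-open range [3.0 - 3.2.2> maps naturally onto a tuple comparison against `sys.version_info`. The function name below is hypothetical, and the sketch is not taken from the pull request itself.

```python
import sys


def needs_utf8_workaround(version_info=None):
    # True for Python versions in the half-open range [3.0, 3.2.2).
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor, micro); tuples compare element-wise.
    return (3,) <= tuple(version_info[:3]) < (3, 2, 2)
```

For example, `needs_utf8_workaround((3, 1, 0))` and `needs_utf8_workaround((3, 2, 1))` are true, while `(2, 7, 6)` and `(3, 2, 2)` fall outside the range.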

Hope this helps.

Best regards,
Jurko Gospodnetić
