This repository was archived by the owner on Apr 29, 2021. It is now read-only.

correctly parse metadata/record files, with respect to unicode characters #1

Merged (1 commit) on Aug 21, 2014

Conversation

cgtx
Contributor

@cgtx cgtx commented Aug 14, 2014

I'm working on building Python 3.4.1 and setuptools 5.5.1 for EL6, using the Fedora RPMs as a guide. I get multiple failures during the test suite, all similar to the one below.

======================================================================
ERROR: test_basic_bootstrapping (test.test_ensurepip.TestBootstrap)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/builddir/build/BUILD/Python-3.4.1/Lib/test/test_ensurepip.py", line 52, in test_basic_bootstrapping
    ensurepip.bootstrap()
  File "/builddir/build/BUILD/Python-3.4.1/Lib/ensurepip/__init__.py", line 105, in bootstrap
    new_whl = rewheel.rewheel_from_record(dr, rewheel_dir.name)
  File "/builddir/build/BUILD/Python-3.4.1/Lib/ensurepip/rewheel/__init__.py", line 63, in rewheel_from_record
    new_wheel_name = get_wheel_name(record_path)
  File "/builddir/build/BUILD/Python-3.4.1/Lib/ensurepip/rewheel/__init__.py", line 90, in get_wheel_name
    metadata = email.parser.Parser().parsestr(open(metadata_path).read())
  File "/builddir/build/BUILD/Python-3.4.1/Lib/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 9749: ordinal not in range(128)

In setuptools 5.5.1, there is an é character on line 234 of the METADATA file. Fedora rawhide has setuptools 2.0, which does not have this character. I could be interpreting this wrong, but I believe the failures I am seeing occur because rewheel opens these files with a plain open() instead of codecs.open(), which takes an explicit encoding and so can be made Unicode-safe.

http://stackoverflow.com/questions/147741/character-reading-from-file-in-python

Feel free to tell me I'm totally off-base on this one, but this patch should fix it.
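
For illustration, here is a minimal sketch of the kind of change described above. It is not the actual patch: read_metadata is a hypothetical helper name, and the sample METADATA content (including the author name) is made up for the example. It only assumes that the file on disk is UTF-8 encoded.

```python
import codecs
import email.parser
import os
import tempfile


def read_metadata(metadata_path):
    """Parse a METADATA file, decoding it as UTF-8 regardless of locale.

    Hypothetical helper for illustration; not the function used in rewheel.
    """
    # codecs.open() accepts an encoding on both Python 2 and Python 3.
    with codecs.open(metadata_path, encoding="utf-8") as f:
        return email.parser.Parser().parsestr(f.read())


# Made-up METADATA content containing a non-ASCII character.
sample = (
    u"Metadata-Version: 2.0\n"
    u"Name: setuptools\n"
    u"Version: 5.5.1\n"
    u"Author: Jos\u00e9\n"
)

tmp_dir = tempfile.mkdtemp()
metadata_path = os.path.join(tmp_dir, "METADATA")
with codecs.open(metadata_path, "w", encoding="utf-8") as f:
    f.write(sample)

metadata = read_metadata(metadata_path)
print(metadata["Name"], metadata["Version"])
```

Because the encoding is passed explicitly, the result should not depend on LANG or any other locale setting in the build environment.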

@bkabrda
Member

bkabrda commented Aug 21, 2014

Good catch!
The problem is that rpmbuild uses LANG=C, which makes Python think it's in an ASCII environment, so it tries to open files with the ASCII encoding. That fails in this case because of the non-ASCII character.
Now, we could fix this by passing encoding='utf-8' to open() (it accepts an encoding argument in Python 3), but that would make rewheel incompatible with Python 2. Since I've heard that upstream is considering backporting pyvenv and the bundled setuptools/pip to Python 2, I don't want to do that.
That makes your solution the best available, so I'm merging it. I'll build this in Fedora ASAP, since we'd hit the same problem soon, too. Thanks again!
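
A rough sketch of the failure mode, for anyone reading later. It simulates the LANG=C case by decoding as ASCII explicitly rather than by changing the environment, since newer Python versions may coerce the C locale to UTF-8 and hide the problem. The file content and variable names are made up for the example.

```python
import codecs
import locale
import os
import tempfile

# The encoding a bare open() falls back to in text mode on Python 3.
# Under LANG=C on Python 3.4 this is an ASCII codec.
print(locale.getpreferredencoding(False))

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "METADATA")
with codecs.open(path, "w", encoding="utf-8") as f:
    # U+00E9 is encoded in UTF-8 as the bytes 0xc3 0xa9.
    f.write(u"Author: Jos\u00e9\n")

# Simulated ASCII environment: decoding hits the 0xc3 byte and fails.
try:
    with codecs.open(path, encoding="ascii") as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Explicit UTF-8 through codecs.open() works on Python 2 and Python 3 alike.
with codecs.open(path, encoding="utf-8") as f:
    text = f.read()

print(ascii_failed, repr(text))
```

The 0xc3 byte that trips the ASCII decoder here is the same lead byte reported in the traceback above.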

bkabrda added a commit that referenced this pull request Aug 21, 2014
correctly parse metadata/record files, with respect to unicode characters
@bkabrda bkabrda merged commit 92fd4dd into fedora-python:master Aug 21, 2014
carlwgeorge pushed a commit to iuscommunity-pkg/python35u that referenced this pull request Nov 3, 2015