Regression 38.7.0: ascii decode error when installing package from source #1297

Closed
asavah opened this issue Mar 17, 2018 · 8 comments

@asavah

asavah commented Mar 17, 2018

Installing https://github.com/file/file/blob/master/python/ from source fails with Python 2.7.14 and setuptools 38.7.0:

Traceback (most recent call last):
  File "setup.py", line 21, in <module>
    'Topic :: Software Development :: Libraries :: Python Modules',
  File "/home/asavah/kross/host/lib/python2.7/site-packages/setuptools-38.7.0-py2.7.egg/setuptools/__init__.py", line 129, in setup
    return distutils.core.setup(**attrs)
  File "/home/asavah/kross/host/lib/python2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/home/asavah/kross/host/lib/python2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/home/asavah/kross/host/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/asavah/kross/host/lib/python2.7/site-packages/setuptools-38.7.0-py2.7.egg/setuptools/command/install.py", line 69, in run
    self.do_egg_install()
  File "/home/asavah/kross/host/lib/python2.7/site-packages/setuptools-38.7.0-py2.7.egg/setuptools/command/install.py", line 111, in do_egg_install
    self.run_command('bdist_egg')
  File "/home/asavah/kross/host/lib/python2.7/distutils/cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "/home/asavah/kross/host/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/asavah/kross/host/lib/python2.7/site-packages/setuptools-38.7.0-py2.7.egg/setuptools/command/bdist_egg.py", line 163, in run
    self.run_command("egg_info")
  File "/home/asavah/kross/host/lib/python2.7/distutils/cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "/home/asavah/kross/host/lib/python2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "/home/asavah/kross/host/lib/python2.7/site-packages/setuptools-38.7.0-py2.7.egg/setuptools/command/egg_info.py", line 271, in run
    writer(self, ep.name, os.path.join(self.egg_info, ep.name))
  File "/home/asavah/kross/host/lib/python2.7/site-packages/setuptools-38.7.0-py2.7.egg/setuptools/command/egg_info.py", line 604, in write_pkg_info
    metadata.write_pkg_info(cmd.egg_info)
  File "/home/asavah/kross/host/lib/python2.7/distutils/dist.py", line 1106, in write_pkg_info
    self.write_pkg_file(pkg_info)
  File "/home/asavah/kross/host/lib/python2.7/site-packages/setuptools-38.7.0-py2.7.egg/setuptools/dist.py", line 76, in write_pkg_file
    file.write('%s: %s\n' % (field, attr_val))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc1' in position 23: ordinal not in range(128)

Setuptools 38.6.0 works fine.

Edit: python 3.6.4 + setuptools 38.7.0 also works fine.
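
The failure in the traceback can be approximated without setuptools. The following is a minimal sketch, assuming Python 3 and a text stream opened with the ASCII codec as a stand-in for the Python 2.7 file object; the helper names `write_field` and `try_write` and the author value are hypothetical.

```python
import io


def write_field(stream, field, value):
    """Write one 'Field: value' line, mirroring the failing write call."""
    stream.write('%s: %s\n' % (field, value))


def try_write(encoding, value):
    """Return the name of the exception raised, or None on success."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    try:
        write_field(stream, 'Author', value)
        stream.flush()
    except UnicodeEncodeError as exc:
        return type(exc).__name__
    return None


author = u'\xc1lvaro Example'  # hypothetical name starting with U+00C1

ascii_result = try_write('ascii', author)  # the codec cannot represent U+00C1
utf8_result = try_write('utf-8', author)   # UTF-8 can represent it
print(ascii_result, utf8_result)
```

The same value is written in both cases; only the codec applied at write time differs.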

@jaraco
Member

jaraco commented Mar 17, 2018

Changelog indicates that change was to add the maintainer to the package metadata (#1288). My guess is these packages had non-ascii values in these fields and simply including them means we need to deal with the non-ascii values.

@pganssle Can you investigate?

@asavah
Author

asavah commented Mar 17, 2018

Indeed the package has A with acute in author's name, https://github.com/file/file/blob/master/python/setup.py#L10

@pganssle
Member

Very strange because I actually test for this, not sure why my test didn't catch this.

@asavah
Author

asavah commented Mar 17, 2018

> not sure why my test didn't catch this.

Probably because of the encoding declaration at https://github.com/pypa/setuptools/blob/master/setuptools/tests/test_dist.py#L1, which is not present in dist.py.

@pganssle
Member

pganssle commented Mar 17, 2018

This does actually look like the errors I was getting on Python 2.7 when I used io.StringIO, rather than StringIO.StringIO, as the file I was writing to. I switched to StringIO.StringIO because I had checked that "Author" was written out correctly on Python 2.7, so I assumed the errors were an artifact of my choice of stream emulator.

I'll change the test over to use io.StringIO again, as that seems to better emulate the real situation.
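
Why a stricter stream can make a better test double is sketched below. This is a hedged illustration on Python 3, where Python 2's StringIO.StringIO no longer exists, so it only shows the strict side: `io.StringIO` accepts text and refuses bytes instead of mixing the two silently. The author value is hypothetical.

```python
import io

line = u'Author: \xc1lvaro Example\n'  # hypothetical metadata line

text_stream = io.StringIO()
text_stream.write(line)  # text is accepted

try:
    # Encoded bytes are refused by a text-only stream.
    text_stream.write(line.encode('utf-8'))
    rejected_bytes = False
except TypeError:
    rejected_bytes = True

written = text_stream.getvalue()
print(rejected_bytes, repr(written))
```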

@asavah Good call on the encoding. I had assumed it did not matter because the strings to encode are not in that file, but I think the strings being formatted are being treated as ASCII, and that is what's throwing this off.

I'll test this out and send in a new PR. Sorry about the fuss.

As an aside, I actually met the author whose name is causing the problem (@turicas) after this talk at PyCon, about how dealing with text/unicode is difficult, coincidentally enough.

@pganssle
Member

Changing the encoding on the dist.py file didn't help. I'll dig into this a bit more. It's obviously something different about how I'm handling the encoding. The author field is not actually new; it's just being loaded in a slightly different way, and it used to work.
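
One likely reason the declaration makes no difference here: a PEP 263 coding line only controls how the bytes of that one source file are decoded into literals. It has no effect on strings that arrive from another file, such as the author name in a package's setup.py. A minimal sketch on Python 3, with hypothetical file names and in-memory sources:

```python
# Two in-memory "source files" that spell the same character, U+00C1,
# with different bytes. Each file's coding line tells the compiler how
# to decode that file's own literals.
source_latin1 = b"# -*- coding: latin-1 -*-\nname = '\xc1'\n"
source_utf8 = b"# -*- coding: utf-8 -*-\nname = '\xc3\x81'\n"

ns_latin1, ns_utf8 = {}, {}
exec(compile(source_latin1, 'latin1_example.py', 'exec'), ns_latin1)
exec(compile(source_utf8, 'utf8_example.py', 'exec'), ns_utf8)

# Both literals decode to the same text; the declaration is spent at
# compile time and says nothing about how the text is encoded on write.
same_text = ns_latin1['name'] == ns_utf8['name'] == u'\xc1'
print(same_text)
```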

@pganssle
Member

Found the problem. On Python 2.7 there is an _encode_field function that is called in get_contact(). I had been reading the current master branch's implementation, which is why I couldn't understand what was going on. PR incoming.
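
The kind of encoding step described here can be sketched as follows. This is an illustration only, written for Python 3 and assuming UTF-8 as the metadata encoding; `encode_field` and `write_field` are hypothetical stand-ins, not the distutils or setuptools functions.

```python
import io

PKG_INFO_ENCODING = 'utf-8'  # assumption: metadata is stored as UTF-8


def encode_field(value):
    """Return value as bytes, or None if the field is unset."""
    if value is None:
        return None
    if isinstance(value, bytes):
        return value  # already encoded, pass through unchanged
    return str(value).encode(PKG_INFO_ENCODING)


def write_field(stream, name, value):
    """Write 'Name: value' to a binary stream, encoding the value first."""
    stream.write(name.encode('ascii') + b': ' + encode_field(value) + b'\n')


buf = io.BytesIO()
write_field(buf, 'Author', u'\xc1lvaro Example')  # hypothetical author
pkg_info = buf.getvalue()
print(pkg_info)
```

Encoding explicitly before the write means the result no longer depends on whatever default codec the file object would apply.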

@jaraco
Member

jaraco commented Mar 18, 2018

Release going out as v39.0.1. Please test and report back if a backport to v38 is required.
