How to handle encoding exceptions #54

michaeta · 2012-05-05T06:24:06Z

Since anyone in the world can upload packages to a pip2-compatible index, there are bound to be issues displaying characters in various encodings. This is mostly a problem with the search command when it tries to display a package name and summary that use an unknown or unsupported encoding. These problems don't have to stem from pip2 either; they could also be caused by the user's shell. What encodings should pip2 support? How should we handle encoding exceptions when they are raised? Should we skip over the result and notify the user?

Currently pip2 is capable of catching some Unicode exceptions. If it encounters one when attempting to display a package's information, it will attempt to print just the package name and skip the summary (which is what usually causes the exception). If it is unable to print the package name, pip2 will skip that package and move on to the next.
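For reference, the fallback order described above could be sketched roughly like this. It is only an illustration, not pip2's actual code: the function name `display_result`, the `name - summary` output format, and the return values are all made up for the example.

```python
def display_result(name, summary, stream):
    """Write one search result to `stream`, degrading on encoding errors.

    Hypothetical helper, not pip2's real implementation.
    Returns "full", "name", or "skipped" depending on what was written.
    """
    try:
        # First choice: package name plus summary.
        stream.write("%s - %s\n" % (name, summary))
        return "full"
    except UnicodeError:
        pass
    try:
        # The summary is the usual culprit, so retry with the name alone.
        stream.write("%s\n" % name)
        return "name"
    except UnicodeError:
        # Even the name cannot be encoded: skip this package entirely.
        return "skipped"
```

Returning what was actually shown would let the caller count skipped results and tell the user about them at the end, which is one possible answer to the "notify the user" question.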

merwok · 2012-05-07T22:28:07Z

A few data points:

- METADATA files and old PKG-INFO files are ASCII or UTF-8.
- print will convert a str (Python 3) object to the right encoding for the user’s terminal.

Problems can however arise if people use the PyPI HTML forms or upload a manually written PKG-INFO file; it would be reasonable IMO to modify PyPI to reject non-UTF-8 input. I can float the idea on the catalog-sig mailing list on your behalf if you want.
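To make "reject non-UTF-8 input" concrete, the check itself can be as small as a strict decode. This is a sketch only; `is_valid_utf8` is a hypothetical name and nothing here reflects how PyPI actually validates uploads.

```python
def is_valid_utf8(data):
    """Return True if the bytes `data` decode cleanly as UTF-8.

    Hypothetical helper for illustration; plain ASCII input also passes,
    since ASCII is a subset of UTF-8.
    """
    try:
        data.decode("utf-8")  # strict decoding raises on any invalid sequence
    except UnicodeDecodeError:
        return False
    return True
```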

njwilson · 2012-05-08T16:31:10Z

@merwok, raising that on the catalog-sig mailing list sounds like a good idea.

That won't fix our issue though. We'll still have to figure out how to handle these exceptions gracefully as long as there are bad packages on PyPI.
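One way to handle this gracefully, instead of dropping the summary or the whole result, might be to substitute the characters the terminal cannot show. A minimal sketch, assuming Python 3 `str` input; `to_printable` is a hypothetical name and this is a suggestion, not something pip2 does today.

```python
import sys


def to_printable(text, encoding=None):
    """Return `text` with characters `encoding` cannot represent replaced by "?".

    Hypothetical helper. When no encoding is given, fall back to the
    terminal's reported encoding, then to ASCII.
    """
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # Round-trip through bytes; errors="replace" substitutes "?" on encoding.
    return text.encode(encoding, errors="replace").decode(encoding)
```

With this, `to_printable("café", "ascii")` gives `"caf?"`, so the result is still listed with a slightly lossy summary. The trade-off is that the user sees altered text without being told, so a short notice may still be worth printing.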
