Image.open raise UnicodeDecodeError if locale setting modified #272

Closed
rsaint opened this issue Jul 5, 2013 · 16 comments

@rsaint

commented Jul 5, 2013

import locale
from PIL import Image

locale.setlocale(locale.LC_ALL, "polish")
im = Image.open('test.jpg')

Traceback (most recent call last):
  File "E:\DOWNLOADS\jpegtran\test.py", line 6, in <module>
    im = Image.open('test.jpg')
  File "G:\Python27\lib\site-packages\PIL\Image.py", line 1982, in open
    preinit()
  File "G:\Python27\lib\site-packages\PIL\Image.py", line 300, in preinit
    from PIL import PpmImagePlugin
  File "G:\Python27\lib\site-packages\PIL\PpmImagePlugin.py", line 27, in <module>
    b_whitespace = string.whitespace.encode()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 6: ordinal not in range(128)
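
For readers puzzled that a call to `.encode()` raises a *decode* error: on Python 2, `string.whitespace` is a byte string (`str`), and calling `.encode()` on a byte string first decodes it implicitly with the default codec (normally ASCII). The sketch below mimics that implicit step in Python 3 syntax. The byte values are an assumption reconstructed from the traceback (a 0xa0 byte at position 6), not output captured from the reporter's machine.

```python
# Assumed value of string.whitespace under the modified locale:
# the six ASCII whitespace bytes plus 0xa0 at index 6.
whitespace_bytes = b"\t\n\x0b\x0c\r \xa0"

error = None
try:
    # Python 2's str.encode() performs this decode implicitly before encoding.
    whitespace_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    error = exc
    print(exc)
```

The failure happens before any encoding takes place, which is why the exception type is `UnicodeDecodeError`.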
@aclark4life

Member

commented Jul 5, 2013

Interesting, so what's the fix? A try/except that falls back to something else? It looks like the encode should perhaps check the locale for the proper encoding. Guessing here.

@d-schmidt

Contributor

commented Jul 5, 2013

Does the image work fine if you don't change the language? Does it crash with every image once you have changed the language? If it is the former, please upload the image!

@wiredfool

Member

commented Jul 5, 2013

What does locale.setlocale(locale.LC_ALL, "polish") do? Is "polish" a valid locale? I can't get it to work here. [edit: I see this is probably Windows, where locale names may vary.] I did generate the pl_PL.UTF-8 locale, and with that, this test passes:

from tester import *
from PIL import Image

import locale


path = "Images/lena.jpg"

def test_sanity():
    assert_no_exception(lambda: Image.open(path))
    locale.setlocale(locale.LC_ALL, "pl_PL.UTF-8")
    assert_no_exception(lambda: Image.open(path))
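
One side note on the test above: it changes the process-wide locale and never restores it, so later tests would run under the Polish locale too. A minimal sketch of a guard follows, assuming a context manager is acceptable in the test suite. `temporary_locale` is a hypothetical helper name, not part of `tester` or PIL.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(name, category=locale.LC_ALL):
    """Switch to locale `name` for the duration of the block, then restore."""
    saved = locale.setlocale(category)  # query only: returns the current setting
    try:
        yield locale.setlocale(category, name)
    finally:
        locale.setlocale(category, saved)


# "C" is used here only because it exists everywhere; a real test would pass
# "pl_PL.UTF-8" (or "polish" on Windows) and open an image inside the block.
with temporary_locale("C") as active:
    print(active)
```

`locale.setlocale()` raises `locale.Error` for an unknown locale name, so a test built on this would need to skip on machines where the Polish locale has not been generated.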
@rsaint

Author

commented Jul 6, 2013

Windows 7; "polish" is a valid locale there.

import locale
print locale.setlocale(locale.LC_ALL, 'polish')
import string
print len(string.whitespace)
print ord(string.whitespace[6])

Polish_Poland.1250
7
160

string.whitespace has one additional character, the non-breaking space 0xa0 (160),
if the string module is imported after the locale is changed to "polish". That
character is outside the ASCII range, which causes the problem.
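
To make the report above concrete, the extra byte can be identified by decoding it with the code page the locale reported (1250). This is a small sketch in Python 3 syntax. The `whitespace_bytes` literal is an assumption reconstructed from the printed length (7) and ordinal (160), not a value read from a Windows machine.

```python
import unicodedata

whitespace_bytes = b"\t\n\x0b\x0c\r \xa0"  # assumed value, see the note above

# Bytes >= 0x80 cannot be represented in ASCII.
non_ascii = [(i, b) for i, b in enumerate(whitespace_bytes) if b >= 0x80]
print(non_ascii)

# cp1250 is the Windows code page named in "Polish_Poland.1250".
char = bytes([non_ascii[0][1]]).decode("cp1250")
print(unicodedata.name(char))
```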

@wiredfool

Member

commented Jul 9, 2013

Ok, so clearly in some locales, there's whitespace that's not a valid ascii character. I'm not sure what the specs are on PPM images, but I'm going to take a wild guess that it's ascii, since it's a really old spec. And that the tokens are separated by ordinary whitespace. In fact, I'd further say that the client locale really has nothing to do with the ppm spec.

String.encode should probably specify the default ascii encoding, and specify some error parameter, or we should just get the ordinary whitespace and go with that.

This should do it:

b_whitespace = string.whitespace.encode('ascii', 'ignore')
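
The other option mentioned above, taking "the ordinary whitespace" directly, could look like the sketch below. It hard-codes the six ASCII whitespace bytes so the result no longer depends on the locale or on `string.whitespace`. `ASCII_WHITESPACE` and `read_token` are hypothetical names for illustration and are not taken from `PpmImagePlugin`. The sketch also assumes header tokens are separated by ASCII whitespace, as guessed above.

```python
import io

# Space, tab, LF, VT, FF, CR: fixed, independent of the process locale.
ASCII_WHITESPACE = b" \t\n\x0b\x0c\r"


def read_token(fp):
    """Read bytes from fp up to (not including) the next whitespace byte.

    Simplified: does not skip runs of whitespace or header comments.
    """
    token = b""
    while True:
        c = fp.read(1)
        if not c or c in ASCII_WHITESPACE:
            return token
        token += c


# Hypothetical PPM-style header used only to exercise the helper.
header = io.BytesIO(b"P6 640 480\n255\n")
tokens = [read_token(header) for _ in range(4)]
print(tokens)
```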
@aclark4life

Member

commented Jul 9, 2013

@rsaint Please confirm this is fixed

@rsaint

Author

commented Jul 9, 2013

Confirmed. Thank you.

@aclark4life

Member

commented Jul 9, 2013

Great! Thank you both

@sirws


commented Sep 27, 2013

I still get the error even with the fix.

C:\Users\scot5141>C:\Demos\thumbnailGP\ThumbnailGenerator\image.py
2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
Traceback (most recent call last):
  File "C:\Demos\thumbnailGP\ThumbnailGenerator\image.py", line 8, in <module>
    background = Image.open(itemType)
  File "C:\Python27\ArcGIS10.2\lib\site-packages\PIL\Image.py", line 1982, in open
    preinit()
  File "C:\Python27\ArcGIS10.2\lib\site-packages\PIL\Image.py", line 300, in preinit
    from PIL import PpmImagePlugin
  File "C:\Python27\ArcGIS10.2\lib\site-packages\PIL\PpmImagePlugin.py", line 28, in <module>
    b_whitespace = string.whitespace.encode('ascii', 'ignore')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 6: ordinal not in range(128)
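
A probable reason the traceback above still shows the error, based on how Python 2 handles byte strings rather than on anything confirmed in this thread: the `'ignore'` handler applies only to the encode step, while the implicit decode that precedes it still runs in strict mode. The sketch models that two-step behaviour in Python 3 syntax. `py2_style_encode` is a hypothetical helper, not a real API, and the byte string is an assumed value.

```python
def py2_style_encode(data, encoding="ascii", errors="strict"):
    """Model of Python 2 str.encode(): strict implicit decode, then encode."""
    text = data.decode("ascii")           # implicit step: `errors` is not used here
    return text.encode(encoding, errors)  # `errors` only applies to this step


whitespace_bytes = b"\t\n\x0b\x0c\r \xa0"  # assumed value, 0xa0 at index 6

try:
    py2_style_encode(whitespace_bytes, "ascii", "ignore")
    raised = False
except UnicodeDecodeError:
    raised = True
print(raised)

# Passing the handler to an explicit decode drops the stray byte instead.
cleaned = whitespace_bytes.decode("ascii", "ignore").encode("ascii")
print(cleaned)
```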
@wiredfool

Member

commented Sep 27, 2013

@sirws What platform and locale are you using?

@sirws


commented Sep 27, 2013

This is Windows 7, 32-bit, Python 2.7.3. The locale is ('English_United States', '1252').

I get the error only when I also import the arcpy site package: http://resources.arcgis.com/en/help/main/10.2/index.html#//000v00000001000000

I am no python expert. When I changed this:
b_whitespace = string.whitespace.encode('ascii', 'ignore')

to

b_whitespace = ''

it works for me, but I am pretty sure that is not the best thing to do. Any thoughts?

@sirws


commented Sep 27, 2013

The arcpy site package is what is setting the locale to ('English_United States', '1252').

aclark4life added a commit that referenced this issue Sep 30, 2013

Merge pull request #346 from mhogg/master
Bug fix for encoding of b_whitespace - Similar to closed issue #272
@sirws


commented Sep 30, 2013

Great! That seems to take care of it for me.

@markmiscavage

Contributor

commented Oct 25, 2017

I appear to still be getting this error. I'm only seeing this currently on SmartOS base-64 17.3.0.

PIL/PpmImagePlugin.py line 33 is failing with UnicodeDecodeError: 'utf8' codec can't decode byte 0xa0 in position 6: invalid start byte. The surrounding try/except swallows that error, and the code then fails with the same error seen above.

b_whitespace = string.whitespace.strip() seems to fix the problem for me.

I can submit a pull request if this seems like a good solution.
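
For context on the UTF-8 variant of the error: a lone 0xa0 byte is a UTF-8 continuation byte, so it cannot start a character, which matches the "invalid start byte" message. The sketch below reproduces that in Python 3 syntax and shows one codec-independent alternative, filtering to bytes below 0x80. The byte string is an assumed value and `ascii_only` is a hypothetical helper; neither is taken from the plugin or from the pull request.

```python
whitespace_bytes = b"\t\n\x0b\x0c\r \xa0"  # assumed value, 0xa0 at index 6

try:
    whitespace_bytes.decode("utf-8")
    reason = None
except UnicodeDecodeError as exc:
    reason = exc.reason
print(reason)


def ascii_only(data):
    """Keep only bytes in the ASCII range; no codec is involved."""
    return bytes(b for b in data if b < 0x80)


print(ascii_only(whitespace_bytes))
```

Because the filter never decodes anything, it behaves the same whether the locale's code page is 1250, 1252, or UTF-8.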

@hugovk

Member

commented Oct 30, 2017

@markmiscavage A PR would be a good starting point. Although we don't test on SmartOS, the CIs will check that it doesn't break the existing platforms.

@markmiscavage

Contributor

commented Oct 31, 2017

Pull request created.
#2820
