character mapping issue from vispy.gloo.gl._constants #2386

Closed
kephale opened this issue Sep 26, 2022 · 22 comments · Fixed by #2437

Comments

@kephale

kephale commented Sep 26, 2022

It looks like there is a character encoding issue in vispy that is happening in this bug report, please correct me if I am wrong.

https://forum.image.sc/t/try-to-load-napari-library-from-stardist-napari-dock-widget-import-surface-from-polys-and-failed-like-bellow-any-help-would-be-more-than-welcome/71524

@djhoese
Member

djhoese commented Sep 26, 2022

I am not an expert on text encoding and how Python chooses one or the other, but my understanding was that Python would default to UTF-8. It is also my understanding based on:

https://stackoverflow.com/questions/26324622/what-characters-do-not-directly-map-from-cp1252-to-utf-8

that CP1252 (the module doing the complaining about the encoding) should match UTF-8 for all ASCII characters. If I read the file on my machine, everything is telling me it is entirely ASCII. This leads me to believe the file somehow got corrupted on the user's system. I think this could be tested by doing:

# Decode the file as cp1252; the with block closes it afterwards.
with open("vispy/gloo/gl/_constants.py", encoding="cp1252", mode="r") as x:
    print(x.read())

I believe this reads the file as if it were stored on disk as cp1252. When I do this locally it loads fine. I would expect a UnicodeDecodeError (a ValueError subclass) if any part of the file were not CP1252-compatible. I don't get an error.
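The two claims in this comment can be checked with the standard library codecs alone. The following is only a sketch: it compares how the cp1252 and utf-8 codecs treat the ASCII range, and lists the byte values that Python's cp1252 codec refuses to decode.

```python
# Sketch: compare how Python's cp1252 and utf-8 codecs treat single bytes.

def undefined_cp1252_bytes():
    """Return the byte values that Python's cp1252 codec cannot decode."""
    undefined = []
    for value in range(256):
        try:
            bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            undefined.append(value)
    return undefined


# Every ASCII byte (0x00-0x7F) decodes to the same text under both codecs.
ascii_bytes = bytes(range(128))
ascii_matches = ascii_bytes.decode("cp1252") == ascii_bytes.decode("utf-8")

print(ascii_matches)
print([hex(v) for v in undefined_cp1252_bytes()])
```

A pure-ASCII file therefore cannot trigger this error by itself; only a few byte values above 0x7F are rejected by the codec.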

@djhoese
Member

djhoese commented Sep 26, 2022

Update: Looks like Python open uses https://docs.python.org/3/library/locale.html#locale.getpreferredencoding by default which is platform specific.
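A minimal sketch of what that platform-specific default means in practice, assuming a throwaway temporary file with made-up contents: the same bytes come back as different text depending on the encoding passed to open(), so passing it explicitly removes the dependence on the platform.

```python
import locale
import os
import tempfile

# What open() falls back to when no encoding argument is given
# (unless Python's UTF-8 mode is enabled).
default_encoding = locale.getpreferredencoding(False)
print("default text encoding:", default_encoding)

# Write a small UTF-8 file containing one non-ASCII character (made-up content).
text = "caf\u00e9\n"
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", encoding="utf-8", newline="") as fp:
    fp.write(text)

# Passing the encoding explicitly gives the same result on every platform.
with open(path, encoding="utf-8", newline="") as fp:
    explicit = fp.read()

# Decoding the same bytes as cp1252 does not fail here, but the text is wrong.
with open(path, encoding="cp1252", newline="") as fp:
    misread = fp.read()

os.remove(path)
print(repr(explicit), repr(misread))
```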

@kephale
Author

kephale commented Sep 26, 2022

Awesome, thank you kindly @djhoese.

@djhoese
Member

djhoese commented Sep 26, 2022

Keep us updated if the user lets you know whether their issue has been resolved.

@h-westmacott

Hi @djhoese, I am currently experiencing the same issue as OP. Specifically, UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2575: character maps to <undefined> when trying to use the napari viewer library.
Attempting to open the "_constants.py" file as described above works as expected.
Any advice would be appreciated!
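One plausible, unconfirmed source of a stray 0x9d byte is a UTF-8 encoded right curly quote (U+201D), whose last byte is 0x9d. The sketch below uses an invented sample string, not text taken from any file in this report, to show that such bytes produce the same error message when decoded as cp1252.

```python
# Sketch: a UTF-8 encoded curly quote fails to decode as cp1252.
# The sample text is invented for illustration.
sample = "the \u201cSoftware\u201d"
raw = sample.encode("utf-8")

try:
    raw.decode("cp1252")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Show the offending byte and the codec's explanation.
print(hex(raw[error.start]), error.reason)
```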

@djhoese
Member

djhoese commented Nov 14, 2022

@h-westmacott Great! (that we have someone who can test, not that you're having trouble)

What do you get when you run python -c "import locale; print(locale.getpreferredencoding())" on the command line? Here's what I get on my system:

$ python -c "import locale; print(locale.getpreferredencoding())"
UTF-8

Other questions that come to mind:

  1. What operating system are you on?
  2. Are you running/importing vispy/napari with python ... from the command line? Or some other way?
  3. What do you get when you do:
python -c "for ob in list(globals().values()): print(ob); print(repr(ob))"

@h-westmacott

h-westmacott commented Nov 14, 2022

Happy to be able to help! I'm working on a Windows machine, and this issue occurs when I try to run any napari commands from a Python script or the terminal. Results included below:

python -c "import locale; print(locale.getpreferredencoding())"  
cp1252
python -c "for ob in list(globals().values()): print(ob); print(repr(ob))"  
__main__
'__main__'
None
None
None
None
<class '_frozen_importlib.BuiltinImporter'>
<class '_frozen_importlib.BuiltinImporter'>
None
None
{}
{}
<module 'builtins' (built-in)>
<module 'builtins' (built-in)>

@djhoese
Member

djhoese commented Nov 14, 2022

How do you feel about hacking your vispy installation? If you can find the _constants.py module and add a print statement before this line:

if repr(ob).startswith('GL_'):

that says print(ob), that might help us figure some of this out. Although... that print might actually produce the same error. 🤔

I guess I assume doing python -c "import vispy._constants" raises the exception for you, right?

@h-westmacott

Adding in the print statement before line 330 adds the following to the terminal output before the traceback:

vispy.gloo.gl._constants
GL definitions converted to Python by codegen/createglapi.py.

THIS CODE IS AUTO-GENERATED. DO NOT EDIT.

Constants for OpenGL ES 2.0.


vispy.gloo.gl
<_frozen_importlib_external.SourceFileLoader object at 0x00000175161FE5E0>
ModuleSpec(name='vispy.gloo.gl._constants', loader=<_frozen_importlib_external.SourceFileLoader object at 0x00000175161FE5E0>, origin='C:\\Users\\User\\anaconda3\\envs\\lightsheet\\lib\\site-packages\\vispy\\gloo\\gl\\_constants.py')
C:\Users\User\anaconda3\envs\lightsheet\lib\site-packages\vispy\gloo\gl\_constants.py
C:\Users\User\anaconda3\envs\lightsheet\lib\site-packages\vispy\gloo\gl\__pycache__\_constants.cpython-39.pyc

python -c "import vispy.gloo.gl._constants" does indeed raise the same exception.
python -c "import vispy._constants" raises ModuleNotFoundError: No module named 'vispy._constants'

@djhoese
Member

djhoese commented Nov 14, 2022

python -c "import vispy.gloo.gl._constants" does indeed raise the same exception.
python -c "import vispy._constants" raises ModuleNotFoundError: No module named 'vispy._constants'

🤦‍♂️ Thanks.

For the printed output, does the traceback say the print is the cause or the repr line?

@djhoese
Member

djhoese commented Nov 14, 2022

One more hack:

ENUM_MAP = {}
for key, ob in list(globals().items()):
    print(f"#### {key}:")
    print(ob)
    if repr(ob).startswith('GL_'):
        ENUM_MAP[int(ob)] = ob
del ob

If it is the __builtins__ dict then we have a lot more work to do.
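Because the print itself may raise, a variation on this hack could catch the exception per entry and report every offending name instead of stopping at the first one. This is only a sketch: `find_unprintable` and the `Unprintable` class are hypothetical names invented for illustration, not part of vispy.

```python
def find_unprintable(namespace):
    """Return {name: exception} for every value whose repr() raises."""
    failures = {}
    for key, ob in list(namespace.items()):
        try:
            repr(ob)
        except Exception as exc:  # keep going so every offender is reported
            failures[key] = exc
    return failures


class Unprintable:
    """Hypothetical stand-in for an object whose repr decodes bad bytes."""

    def __repr__(self):
        return b"\x9d".decode("cp1252")  # raises UnicodeDecodeError


example = {"GL_ZERO": 0, "broken": Unprintable()}
failures = find_unprintable(example)
print(sorted(failures))
```

Inside a module, the same function could be called as `find_unprintable(globals())`.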

@h-westmacott

Yes, the traceback now points at the print rather than the repr line:

...
  File "C:\Users\User\anaconda3\envs\lightsheet\lib\site-packages\vispy\gloo\gl\_constants.py", line 330, in <module>
    print(ob)
  File "C:\Users\User\anaconda3\envs\lightsheet\lib\_sitebuiltins.py", line 61, in __repr__
    self.__setup()
  File "C:\Users\User\anaconda3\envs\lightsheet\lib\_sitebuiltins.py", line 51, in __setup
    data = fp.read()
  File "C:\Users\User\anaconda3\envs\lightsheet\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2575: character maps to <undefined>

Following the extra hack, the console outputs:

#### __name__:
vispy.gloo.gl._constants
#### __doc__:
GL definitions converted to Python by codegen/createglapi.py.

THIS CODE IS AUTO-GENERATED. DO NOT EDIT.

Constants for OpenGL ES 2.0.


#### __package__:
vispy.gloo.gl
#### __loader__:
<_frozen_importlib_external.SourceFileLoader object at 0x000001B8E40CE610>
#### __spec__:
ModuleSpec(name='vispy.gloo.gl._constants', loader=<_frozen_importlib_external.SourceFileLoader object at 0x000001B8E40CE610>, origin='C:\\Users\\User\\anaconda3\\envs\\lightsheet\\lib\\site-packages\\vispy\\gloo\\gl\\_constants.py')
#### __file__:
C:\Users\User\anaconda3\envs\lightsheet\lib\site-packages\vispy\gloo\gl\_constants.py
#### __cached__:
C:\Users\User\anaconda3\envs\lightsheet\lib\site-packages\vispy\gloo\gl\__pycache__\_constants.cpython-39.pyc
#### __builtins__:

@djhoese
Member

djhoese commented Nov 14, 2022

Ok so now, create a test module in your current directory named check_builtins.py with the below in it:

print(__builtins__)

And do python -c "import check_builtins".

@djhoese
Member

djhoese commented Nov 14, 2022

I should have said this earlier, but regardless of what the cause of this error is we can definitely simplify this for loop and avoid unimportant things in the globals dictionary (things prefixed with _ for one). But I'd really like to know what in Python itself is causing an error. Might be something that can be fixed upstream in CPython.

@djhoese
Member

djhoese commented Nov 14, 2022

Wild idea...does your username have non-ASCII characters in it?

@h-westmacott

The check_builtins.py file raises the same error:

python -c "import check_builtins"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\User\Documents\lightsheet-model\train-lightsheet-model\check_builtins.py", line 1, in <module>
    print(__builtins__)
  File "C:\Users\User\anaconda3\envs\lightsheet\lib\_sitebuiltins.py", line 61, in __repr__
    self.__setup()
  File "C:\Users\User\anaconda3\envs\lightsheet\lib\_sitebuiltins.py", line 51, in __setup
    data = fp.read()
  File "C:\Users\User\anaconda3\envs\lightsheet\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 2575: character maps to <undefined>

No non-ASCII in my username, no.

@djhoese
Member

djhoese commented Nov 15, 2022

Ok so we've removed vispy from the equation. So that part of _sitebuiltins.py is parsing license and copyright files. You can see that here:

https://github.com/python/cpython/blob/f13f466474ed53529acd3f209070431fbae14323/Lib/_sitebuiltins.py#L40-L42

And you can see where this is used here:

https://github.com/python/cpython/blob/f13f466474ed53529acd3f209070431fbae14323/Lib/site.py#L404-L427

We know from the error message that the problem is only with one of the _Printer usages where files are loaded/read so credits and copyright shouldn't be the problem. That means it is the license file. So updating the check_builtins.py script with print(__builtins__['license']) should still fail on your system.

If that fails, then could you find the LICENSE (or LICENSE.txt) file in your lib/pythonXX directory and copy it here? For example, mine on linux is here:

~/miniconda3/envs/satpy_py310/lib/python3.10/LICENSE.txt

@djhoese
Member

djhoese commented Nov 15, 2022

Oh or the license file could be in your current directory or your parent directory.

@djhoese
Member

djhoese commented Nov 15, 2022

Scratch the parent directory idea, but I do think it can be a license file in your current directory. And you don't need to guess: you could just edit the check script to do print(__builtins__['license']._Printer__filenames).

And I'm a little confused because at least in the current main branch of CPython the license file is read as UTF-8:

https://github.com/python/cpython/blob/f13f466474ed53529acd3f209070431fbae14323/Lib/_sitebuiltins.py#L50

So why is cp1252.py being used?
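A rough sketch of that check, reading the candidate files as raw bytes so the check itself cannot hit the decoding error. It relies on the private `_Printer__filenames` attribute mentioned above, which is an implementation detail that may differ between Python versions, so the lookup is guarded with getattr. The helper names are invented for illustration.

```python
import builtins
import os
import tempfile


def license_candidates():
    """List the file paths the license builtin would try to read, if exposed."""
    printer = getattr(builtins, "license", None)
    return list(getattr(printer, "_Printer__filenames", []))


def undecodable(paths, encoding="cp1252"):
    """Return (path, offset, byte) for each file that fails to decode."""
    bad = []
    for path in paths:
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as fp:
            data = fp.read()
        try:
            data.decode(encoding)
        except UnicodeDecodeError as exc:
            # exc.start is the index of the first byte that has no mapping.
            bad.append((path, exc.start, data[exc.start]))
    return bad


# Demo on a throwaway file (invented content) holding a UTF-8 curly quote.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as fp:
    fp.write("ok \u201d".encode("utf-8"))
demo_result = undecodable([demo_path])
os.remove(demo_path)

print(license_candidates())
print(undecodable(license_candidates()))
```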

@djhoese
Member

djhoese commented Nov 15, 2022

Ah I bet this is a bug that was fixed in Python 3.10. Here is the open line in Python 3.9:

https://github.com/python/cpython/blob/c09dba57cfbbf74273ce44b1f48f71b46806605c/Lib/_sitebuiltins.py#L50

And in Python 3.10:

https://github.com/python/cpython/blob/bc2cdfc81571dc759a90b94dd3f4858b98cad1eb/Lib/_sitebuiltins.py#L50

An encoding was added.

@Charles-Fieseler-Vienna

I had this issue on Windows, reproduced by:
python -c "import vispy.gloo.gl._constants"

Running with the "-X utf8" flag fixed the error here and in a broader napari/vispy call:
python -X utf8 -c "import vispy.gloo.gl._constants"
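To confirm what the flag changes, the sketch below reports the current interpreter's settings and asks a child interpreter started with -X utf8 for its preferred encoding. The `encoding_report` helper is a made-up name for illustration. Setting the PYTHONUTF8=1 environment variable is the documented equivalent of the flag.

```python
import locale
import subprocess
import sys


def encoding_report():
    """Summarise the settings that decide open()'s default encoding."""
    return {
        "utf8_mode": bool(sys.flags.utf8_mode),
        "preferred_encoding": locale.getpreferredencoding(False),
    }


# Ask a child interpreter started with -X utf8 for its preferred encoding.
child = subprocess.run(
    [sys.executable, "-X", "utf8", "-c",
     "import locale; print(locale.getpreferredencoding(False))"],
    capture_output=True, text=True, check=True,
)
child_encoding = child.stdout.strip()

print(encoding_report())
print(child_encoding)
```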

@djhoese
Member

djhoese commented Dec 16, 2022

Thanks @Charles-Fieseler-Vienna. I'll add my suggested fix from #1330. If someone wants to make a PR to fix it that would be greatly appreciated:

I think a good solution might be the following:

ENUM_MAP = {}
for var_name, ob in list(globals().items()):
    if var_name.startswith('GL_'):
        ENUM_MAP[int(ob)] = ob
del ob, var_name

This has the added benefit that we don't even try repr'ing the objects at all. If you look at all the constants defined in that module they all start with GL_ so this just avoids dealing with anything else. Note that the edit would have to occur here:

DEFINE_CONST_MAP = '''
ENUM_MAP = {}
for ob in list(globals().values()):
    if repr(ob).startswith('GL_'):
        ENUM_MAP[int(ob)] = ob
del ob
'''

since the _constants.py module is generated from this code. I think I will then need to run the code to regenerate the files.
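The difference between the two loops can be sketched outside vispy. In the example below, `Enum` and `Unprintable` are hypothetical stand-ins (an int subclass whose repr is its name, and an object whose repr raises), and the namespace dict is invented. Filtering by variable name never calls repr on the unrelated entries, so the failing object is never touched.

```python
class Enum(int):
    """Hypothetical stand-in for an int constant whose repr is its name."""

    def __new__(cls, name, value):
        self = super().__new__(cls, value)
        self.name = name
        return self

    def __repr__(self):
        return self.name


class Unprintable:
    """Hypothetical object whose repr fails like the license printer did."""

    def __repr__(self):
        raise UnicodeDecodeError(
            "charmap", b"\x9d", 0, 1, "character maps to <undefined>")


def build_enum_map(namespace):
    """Collect constants by variable name, without calling repr() on values."""
    enum_map = {}
    for var_name, ob in list(namespace.items()):
        if var_name.startswith("GL_"):
            enum_map[int(ob)] = ob
    return enum_map


# Invented namespace mimicking a module's globals.
namespace = {
    "GL_ZERO": Enum("GL_ZERO", 0),
    "GL_ONE": Enum("GL_ONE", 1),
    "__builtins__": {"license": Unprintable()},
}
enum_map = build_enum_map(namespace)
print(enum_map)
```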
