Fix for incorrect default encoding on Windows. #4163

Merged
merged 1 commit into from Sep 5, 2013


dhirschfeld

Whilst trying out rendering notebooks in a Flask app under Apache on Windows, I got the error below when simply trying to import SlidesExporter:

mod_wsgi (pid=6260): Exception occurred processing WSGI script 'flask_test.wsgi'.
Traceback (most recent call last):
  File "flask_test.py", line 81, in render_notebook
    from IPython.nbconvert.exporters import SlidesExporter
  File "c:\\dev\\code\\ipython\\IPython\\__init__.py", line 47, in <module>
    from .terminal.embed import embed
  File "c:\\dev\\code\\ipython\\IPython\\terminal\\embed.py", line 32, in <module>
    from IPython.terminal.interactiveshell import TerminalInteractiveShell
  File "c:\\dev\\code\\ipython\\IPython\\terminal\\interactiveshell.py", line 25, in <module>
    from IPython.core.interactiveshell import InteractiveShell, InteractiveShellABC
  File "c:\\dev\\code\\ipython\\IPython\\core\\interactiveshell.py", line 59, in <module>
    from IPython.core.prompts import PromptManager
  File "c:\\dev\\code\\ipython\\IPython\\core\\prompts.py", line 138, in <module>
    HOME = py3compat.str_to_unicode(os.environ.get("HOME","//////:::::ZZZZZ,,,~~~"))
  File "c:\\dev\\code\\ipython\\IPython\\utils\\py3compat.py", line 18, in decode
    return s.decode(encoding, "replace")
LookupError: unknown encoding: cp0

A little bit of googling (http://bugs.python.org/issue6501) suggests that Windows returns 'cp0' to indicate that there is no code page. This fix simply looks for this invalid value and replaces it with something valid. With this change it works for me.
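
The failure can be reproduced without mod_wsgi, because Python's codec registry simply has no entry named `cp0`. The following is a minimal sketch of that check; the helper name `is_known_encoding` is hypothetical and is not part of IPython.

```python
import codecs


def is_known_encoding(name):
    """Return True if Python's codec registry can resolve `name`."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


# 'cp0' is the value reported in the traceback above; it is not a real codec.
for candidate in ("cp0", "cp1252", "utf-8"):
    print(candidate, is_known_encoding(candidate))

# Decoding with an unknown codec raises LookupError, as in the traceback.
try:
    b"HOME".decode("cp0", "replace")
except LookupError as exc:
    error_message = str(exc)
    print(error_message)
```

Note that the `"replace"` error handler does not help here: it only governs undecodable bytes, and the lookup of the codec itself fails first.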

@takluyver
Member

Can the relevant line have an explanatory comment saying under what circumstances cp0 occurs?

Also, is utf-8 the best default for Windows? Windows doesn't normally use utf-8. In western locales, the default code page is cp1252, I think.

@dhirschfeld
Author

In my Western European locale cp1252 is the default, but I doubt it is in many other parts of the world, so in the absence of any information I thought it best to use something that has a chance of working everywhere. Choosing utf-8 when it should be cp1252 certainly doesn't seem to have any adverse effects for me. I'll defer to whatever you think is best, as I don't have any experience or expertise in this area!

@dhirschfeld
Author

See also http://stackoverflow.com/questions/17096631/lookuperror-unknown-encoding-cp0 where the user (arbitrarily) chose cp437

@takluyver
Member

The issue is that the default encoding is supposed to be our best guess for talking to various system things, like running subprocesses. If the system reports cp0, then we don't have much information to go on, but as I understand it, utf-8 is quite unlikely to be the right guess anywhere (on Windows). It's possible to set code page 65001 to use UTF-8, but I don't think that's the default anywhere, and not many users do it manually.

cp1252 certainly won't be correct for everyone, but if we have no other information, I think it's our best guess on Windows.

cp1252, utf-8 and ascii (and most other code pages) are equivalent so long as you stay within ASCII characters (bytes 0 to 127). So if you're an English speaker like me, it will probably work no matter what you pick. To test it, you need to throw in some characters like é or ø or €.
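
The point about ASCII can be checked directly. The sketch below (the variable names are illustrative only) encodes a pure-ASCII string and a string containing é, ø and € with each encoding, then decodes with the wrong guess in both directions.

```python
ENCODINGS = ("ascii", "cp1252", "utf-8")

# Pure ASCII text encodes to identical bytes in all three encodings.
ascii_text = "HOME"
ascii_bytes = {enc: ascii_text.encode(enc) for enc in ENCODINGS}
ascii_identical = len(set(ascii_bytes.values())) == 1

# Non-ASCII text exposes the difference between cp1252 and utf-8.
accented_text = "\u00e9\u00f8\u20ac"  # é ø €
cp1252_bytes = accented_text.encode("cp1252")
utf8_bytes = accented_text.encode("utf-8")

# Decoding utf-8 bytes as cp1252 succeeds silently but yields mojibake.
wrong_guess = utf8_bytes.decode("cp1252")

# Decoding cp1252 bytes as utf-8 (with "replace") yields U+FFFD characters.
lossy = cp1252_bytes.decode("utf-8", "replace")

print(ascii_identical)
print(cp1252_bytes != utf8_bytes)
print(wrong_guess != accented_text)
print("\ufffd" in lossy)
```

This is why a wrong default can go unnoticed in testing with English-only data: the mismatch only shows up once bytes above 127 are involved.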

@jstenar : You're the Windows & unicode expert - what's the best default to pick if we don't have any information on the current code page?

@jstenar
Member

jstenar commented Sep 5, 2013

I do not know what can cause cp0 but I wouldn't use utf-8 as a replacement. Perhaps we should generate a warning if cp0 is detected at start and have a way to let the user select a replacement value for cp0. It will be impossible to guess a value that works in all cases.

I would suggest using cp1252 or plain ascii as a default.

@dhirschfeld
Author

I got my local branch into a bit of a mess, so I rebased on top of master, squashed all the commits into one, and force-pushed.

I'm not sure how you'd let the user choose, since the error is triggered by simply importing from IPython rather than actually running it. Also, it seems this only crops up when a service account (such as one used by a web server) is used, so there may be no human user to ask what the default should be. I did take your advice and add a warning, so the user has a chance of noticing that a possibly invalid code page is in use.
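
The approach discussed in this thread (detect `cp0`, warn, substitute a valid encoding) can be sketched roughly as follows. This is an illustration under stated assumptions rather than the merged code: the function name `sanitize_encoding`, the constant `FALLBACK_ENCODING`, the warning text, and the choice of cp1252 as the fallback (following the reviewers' suggestion) are all assumptions.

```python
import warnings

# Assumed fallback, following the suggestion made in this thread.
FALLBACK_ENCODING = "cp1252"


def sanitize_encoding(reported):
    """Replace the invalid 'cp0' value with a usable fallback encoding."""
    if reported == "cp0":
        # Warn rather than prompt: there may be no interactive user.
        warnings.warn(
            "Invalid code page cp0 detected; using %s instead. "
            "If that is incorrect, make sure a valid code page is "
            "defined for the process." % FALLBACK_ENCODING,
            RuntimeWarning,
        )
        return FALLBACK_ENCODING
    return reported


# Capture warnings so the behaviour can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fixed = sanitize_encoding("cp0")
    unchanged = sanitize_encoding("utf-8")

print(fixed, unchanged, len(caught))
```

In practice such a function would be applied to whatever the platform reports, for example the result of `locale.getpreferredencoding()`. A warning suits the service-account case because it can end up in a server log without requiring anyone to answer a prompt.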

@takluyver
Member

Yep, I don't think prompting the user is the best path even if it is feasible.

I think this looks good now. We can always make more changes if problems crop up later, but I'm merging it.

takluyver added a commit that referenced this pull request Sep 5, 2013
Fix for incorrect default encoding on Windows.
@takluyver takluyver merged commit b6e7332 into ipython:master Sep 5, 2013
@dhirschfeld dhirschfeld deleted the encoding-fix branch September 5, 2013 16:29
minrk added a commit that referenced this pull request Sep 9, 2013
yarikoptic added a commit to yarikoptic/ipython that referenced this pull request May 2, 2014
* commit 'rel-1.1.0-3-gb8b89ca': (66 commits)
  Backport PR ipython#4209: Magic doc fixes
  Backport PR ipython#4204: remove some extraneous print statements from IPython.parallel
  back to dev
  release 1.1.0
  don't upload to GitHub in release script
  1.1 backport stats
  Backport PR ipython#4188: Allow user_ns trait to be None
  Backport PR ipython#4189: always fire LOCAL_IPS.extend(PUBLIC_IPS)
  Backport PR ipython#4174: various issues in markdown and rst templates
  Backport PR ipython#4181: nbconvert: Fix, sphinx template not removing new lines from headers
  Backport PR ipython#4043: don't 'restore_bytes' in from_JSON
  Backport PR ipython#4178: add missing data_javascript
  Backport PR ipython#4136: catch javascript errors in any output
  Backport PR ipython#4163: Fix for incorrect default encoding on Windows.
  Backport PR ipython#4171: add nbconvert config file when creating profiles
  Backport PR ipython#4159: don't split `.cell` and `div.cell` CSS
  Backport PR ipython#4158: generate choices for `--gui` configurable from real mapping
  Backport PR ipython#4143: update example custom.js
  Backport PR ipython#4144: help_end transformer shouldn't pick up ? in multiline string
  Backport PR ipython#4104: Add way to install MathJax to a particular profile
  ...
yarikoptic added a commit to yarikoptic/ipython that referenced this pull request May 2, 2014
* commit 'rel-1.1.0-7-gf5891e9': (70 commits)
  Backport PR ipython#4346: getpass() on Windows & Python 2 needs bytes prompt
  Backport PR ipython#4336: use simple replacement rather than string formatting in format_kernel_cmd
  Backport PR ipython#4316: underscore missing on notebook_p4
  Backport PR ipython#4257: fix unicode argv parsing
  Backport PR ipython#4209: Magic doc fixes
  Backport PR ipython#4204: remove some extraneous print statements from IPython.parallel
  back to dev
  release 1.1.0
  don't upload to GitHub in release script
  1.1 backport stats
  Backport PR ipython#4188: Allow user_ns trait to be None
  Backport PR ipython#4189: always fire LOCAL_IPS.extend(PUBLIC_IPS)
  Backport PR ipython#4174: various issues in markdown and rst templates
  Backport PR ipython#4181: nbconvert: Fix, sphinx template not removing new lines from headers
  Backport PR ipython#4043: don't 'restore_bytes' in from_JSON
  Backport PR ipython#4178: add missing data_javascript
  Backport PR ipython#4136: catch javascript errors in any output
  Backport PR ipython#4163: Fix for incorrect default encoding on Windows.
  Backport PR ipython#4171: add nbconvert config file when creating profiles
  Backport PR ipython#4159: don't split `.cell` and `div.cell` CSS
  ...
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014
Fix for incorrect default encoding on Windows.
3 participants