PR: Force reading of path stored in Spyder configuration folder as utf-8 #15702

Merged
merged 1 commit into from
May 26, 2021

Conversation

rhkarls
Contributor

@rhkarls rhkarls commented May 26, 2021

Description of Changes

  • Wrote at least one-line docstrings (for any new functions)
  • Added unit test(s) covering the changes (if testable)
  • Included a screenshot or animation (if affecting the UI, see Licecap)

See issue #15692 for description.
The path file in the Spyder configuration directory, which holds the user-provided paths added with the PYTHONPATH manager, is not read with the correct encoding, so paths containing Unicode characters (e.g. 'ö') do not work. Reading is currently handled by encoding.readlines(), which in turn calls encoding.read(), encoding.decode() and encoding.get_coding(), and finally chardet.UniversalDetector(). chardet returns the wrong encoding at least some of the time, as described in issue #15692.
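To illustrate the failure mode, here is a minimal sketch of what happens when UTF-8 bytes are decoded with the wrong codec. It does not call chardet or any Spyder code; the path is made up, and cp1252 is only an assumed stand-in for whatever incorrect encoding the detector might report.

```python
# Hypothetical path containing a non-ASCII character, similar to the issue.
path = "C:/Users/jörg/project"

# Bytes as they would be stored on disk when the file is written as UTF-8.
raw = path.encode("utf-8")

# Decoding with the codec that was used for writing round-trips cleanly.
good = raw.decode("utf-8")

# Decoding with a wrongly guessed codec (cp1252 is an assumption here)
# does not raise, but silently turns 'ö' into two unrelated characters.
bad = raw.decode("cp1252")

print(good == path)  # True
print(bad == path)   # False
```

Because the wrong decode does not raise an error, the mangled string would simply be treated as a path that does not exist.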

Other fixes were considered

  • Write a BOM to the path file to make sure chardet detects utf-8. Using a BOM with utf-8 appears, however, to be generally discouraged (https://stackoverflow.com/questions/2223882/whats-the-difference-between-utf-8-and-utf-8-without-bom/2223926#2223926).
  • Rewrite the functions in encoding.py so the encoding can be specified when reading a file (note that read/readlines have an encoding keyword, but it is ignored and not used). However, this would require more changes, and it was considered unnecessarily complicated and more likely to break other parts of the code, given the extensive use of the read() function in utils/encoding.py.
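As a rough illustration of one downside of the BOM approach, the sketch below uses only Python's standard codecs. The sample text is hypothetical, and this is not how Spyder writes the file; it just shows that a BOM written with "utf-8-sig" leaks into the text as U+FEFF for any reader that decodes with plain "utf-8".

```python
import codecs

# Hypothetical content of a path file.
text = "C:/Users/jörg/project\n"

with_bom = text.encode("utf-8-sig")   # UTF-8 with a leading BOM
without_bom = text.encode("utf-8")    # plain UTF-8

# The only difference on disk is the three-byte BOM prefix.
print(with_bom == codecs.BOM_UTF8 + without_bom)  # True

# A plain utf-8 reader keeps the BOM as an invisible first character,
# which would end up at the start of the first path.
leaked = with_bom.decode("utf-8")

# A utf-8-sig reader strips it and also accepts files without a BOM.
stripped = with_bom.decode("utf-8-sig")
```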

The solution proposed in this PR simply reads the file with the encoding set to utf-8. The path file is written by encoding.write(), and encoding.encode() should encode the text using utf-8.
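The idea can be sketched as follows. This is not the actual diff from this PR: read_path_file and write_path_file are hypothetical helpers, and the write side here uses the built-in open() rather than encoding.write(). The only point is that the reader names the encoding explicitly instead of relying on detection.

```python
import os
import tempfile


def write_path_file(filename, paths):
    """Write one path per line, encoded as UTF-8 (hypothetical helper)."""
    with open(filename, "w", encoding="utf-8") as f:
        f.write("\n".join(paths))


def read_path_file(filename):
    """Return the non-empty lines of filename, always decoded as UTF-8."""
    if not os.path.isfile(filename):
        return []
    with open(filename, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Usage: round-trip a path containing a non-ASCII character.
with tempfile.TemporaryDirectory() as tmp:
    path_file = os.path.join(tmp, "path")
    write_path_file(path_file, ["C:/Users/jörg/project", "/home/user/lib"])
    paths = read_path_file(path_file)
```

Fixing the encoding on the read side works because the same application also controls the write side, so no guessing is needed.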

Issue(s) Resolved

Fixes #15692

Affirmation

By submitting this Pull Request or typing my (user)name below,
I affirm the Developer Certificate of Origin
with respect to all commits and content included in this PR,
and understand I am releasing the same under Spyder's MIT (Expat) license.

I certify the above statement is true and correct: Reinert Huseby Karlsen

@rhkarls rhkarls changed the base branch from master to 5.x May 26, 2021 08:52
@ccordoba12 ccordoba12 added this to the v5.0.4 milestone May 26, 2021
Member

@ccordoba12 ccordoba12 left a comment

@rhkarls, thanks a lot for your contribution! Very clean and simple solution to the problem you reported.

@ccordoba12 ccordoba12 merged commit c864b1c into spyder-ide:5.x May 26, 2021
ccordoba12 added a commit that referenced this pull request May 26, 2021
@rhkarls rhkarls deleted the issue-15692-pathmanager-unicode branch August 25, 2021 14:49
Successfully merging this pull request may close these issues.

PYTHONPATH manager does not work with unicode characters in path
2 participants