PR: Force reading of path stored in Spyder configuration folder as utf-8 #15702
Description of Changes
See issue #15692 for description.
The `path` file in the Spyder configuration directory, which holds the user-provided paths added with the PYTHONPATH manager, is not read with the correct encoding, so Unicode characters in the paths (e.g. 'ö') are not handled. The file is currently read by `encoding.readlines()`, which in turn calls `encoding.read()`, `encoding.decode()` and `encoding.get_coding()`, which finally calls `chardet.UniversalDetector()`. `chardet` returns (at least sometimes) the wrong encoding, as per issue #15692.
Other fixes were considered
- Adding a BOM so that `chardet` detects utf-8. Using a BOM with utf-8 appears, however, to be generally discouraged (https://stackoverflow.com/questions/2223882/whats-the-difference-between-utf-8-and-utf-8-without-bom/2223926#2223926).
- Changing `encoding.py` to allow for defining the encoding when reading a file (note that `read`/`readlines` has a keyword argument `encoding`, but it is ignored and not used). However, this would require more changes, and was considered unnecessarily complicated and more likely to break other parts of the code due to the extensive use of the `read()` function in `utils/encoding.py`.
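To illustrate the failure mode, here is a minimal sketch (not the code in this PR) of what happens when utf-8 bytes are decoded with a wrongly guessed codec, compared with reading the file explicitly as utf-8. The helper `read_path_file`, the temporary file and the example path are hypothetical, and `latin-1` only stands in for whatever wrong encoding a detector might return.

```python
import os
import tempfile


def read_path_file(filename):
    """Hypothetical helper: read one path per line, always decoding as utf-8."""
    with open(filename, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f if line.strip()]


# Write an example path containing a non-ASCII character as utf-8 bytes.
tmp_dir = tempfile.mkdtemp()
path_file = os.path.join(tmp_dir, "path")
example_path = "C:\\Users\\björn\\lib"
with open(path_file, "wb") as f:
    f.write((example_path + "\n").encode("utf-8"))

# Decoding the same bytes with a wrongly guessed codec (latin-1 is an
# assumption here) silently garbles the non-ASCII character.
with open(path_file, "rb") as f:
    garbled = f.read().decode("latin-1").strip()

# Reading with the encoding fixed to utf-8 recovers the original path.
paths = read_path_file(path_file)
```

Fixing the encoding at read time avoids any dependence on detection heuristics, provided the file is always written as utf-8.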
The simple solution proposed in this PR reads the file with the encoding set to utf-8. The `path` file is written by `encoding.write()`, and `encoding.encode()` should encode the text using utf-8.
Issue(s) Resolved
Fixes #15692
Affirmation
By submitting this Pull Request or typing my (user)name below,
I affirm the Developer Certificate of Origin
with respect to all commits and content included in this PR,
and understand I am releasing the same under Spyder's MIT (Expat) license.
I certify the above statement is true and correct: Reinert Huseby Karlsen