Errors on non-breaking space #72

Closed
le717 opened this issue May 19, 2015 · 6 comments

Comments

@le717

le717 commented May 19, 2015

I've yet to track down the source file, but compiling this folder using the following command:

sass.compile(dirname=(".", "."), output_style="compressed")

Produces the following error:

Traceback (most recent call last):
  File "C:\Users\le717\Documents\Code\Sites\Own-Domain\build-css.py", line 8, in <module>
    sass.compile(dirname=(".", "."), output_style="compressed")
  File "C:\Python34\lib\site-packages\sass.py", line 509, in compile
    include_paths, precision, custom_functions,
  File "C:\Python34\lib\site-packages\sass.py", line 169, in compile_dirname
    output_file.write(v)
  File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufeff' in position 0: character maps to <undefined>

@asottile
Member

Hmmm, this is because the open function in Python 3.4 uses the preferred encoding of the system. Since you're on Windows, this is cp1252, which has no way to represent '\ufeff' (the zero-width no-break space, also used as the byte order mark), so the write fails.

Basically, libsass uses filesystem encoding and you have a file on your filesystem which doesn't match :)

compile would need to grow an encoding parameter to overcome this
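The behaviour described above can be checked in isolation. This is a minimal sketch, assuming Python 3; cp1252 is named explicitly so the result does not depend on the machine running it, and the CSS string is made up for illustration:

```python
import locale

# The encoding that open() falls back to in text mode when none is given.
# On the Windows machine from the traceback this would be 'cp1252'.
default_encoding = locale.getpreferredencoding(False)
print("open() default encoding:", default_encoding)

# Illustrative CSS text that starts with U+FEFF, as in the traceback.
text = "\ufeffbody{margin:0}"

try:
    text.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError as exc:
    cp1252_ok = False
    print("cp1252 failed:", exc)

# UTF-8 can encode any Unicode text, so this does not raise.
utf8_bytes = text.encode("utf-8")
```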

EDIT: you can probably track down the specific file by running:

python -m pdb path\to\script\which\compiles.py

and inspecting the locals in the frame
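As an alternative to stepping through with pdb, the source tree can be scanned for candidates. This is only a sketch, assuming the failure is triggered by some non-ASCII character in one of the source files, which the thread has not confirmed at this point. The function name `find_non_ascii_files` and the list of extensions are hypothetical.

```python
import os


def find_non_ascii_files(root, extensions=(".scss", ".sass", ".css")):
    """Return sorted paths under root whose contents include non-ASCII bytes."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # str.endswith accepts a tuple of suffixes
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                data = f.read()
            # Any byte >= 0x80 means the file is not pure ASCII.
            if any(byte >= 0x80 for byte in bytearray(data)):
                matches.append(path)
    return sorted(matches)
```

Calling `find_non_ascii_files(".")` from the folder being compiled would list the files worth opening first.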

@le717
Author

le717 commented May 19, 2015

I am still unable to track down the file, but I may have found the source of the error:

sass.py, lines 165-169

if s:
    v = v.decode('UTF-8')
    mkdirp(os.path.dirname(output_filename))
    with open(output_filename, 'w') as output_file:
        output_file.write(v)

The file contents are decoded from UTF-8 bytes to a text string, but are then written to the file using the OS's default encoding (in our case, cp1252). Editing the open() call to include an encoding parameter, like so:

with open(output_filename, 'w', encoding='UTF-8') as output_file:

Seems to fix the problem.
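The difference can be reproduced on any platform. This is a minimal sketch, assuming Python 3; cp1252 is passed explicitly to simulate the Windows default, and the CSS string is invented for the example:

```python
import os
import tempfile

# Illustrative compiler output: starts with U+FEFF and contains a curly quote.
css = '\ufeffblockquote:before{content:"\u201c"}'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.css")

    # Simulate the Windows default by naming cp1252 explicitly.
    try:
        with open(path, "w", encoding="cp1252") as output_file:
            output_file.write(css)
        cp1252_write_ok = True
    except UnicodeEncodeError:
        cp1252_write_ok = False

    # The proposed change: always write UTF-8.
    with open(path, "w", encoding="UTF-8") as output_file:
        output_file.write(css)

    # Read the raw bytes back to see what actually landed on disk.
    with open(path, "rb") as f:
        written = f.read()
```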

@asottile
Member

That'll only work in Python 3 (and will ignore the OS encoding). Feels like a good thing to respect the OS's encoding though?

If we want to fix this in the way described above, the 2+3 solution is to use the open function from the io module.
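A sketch of what the io-based variant could look like. `write_css` is a hypothetical helper, not the code in sass.py, and `os.makedirs` stands in for the `mkdirp` helper used there:

```python
import io
import os


def write_css(output_filename, css_bytes):
    """Decode compiler output and write it as UTF-8 on Python 2 and 3."""
    text = css_bytes.decode("UTF-8")
    directory = os.path.dirname(output_filename)
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
    # io.open accepts encoding= on both Python 2 and Python 3,
    # unlike the built-in open() on Python 2.
    with io.open(output_filename, "w", encoding="UTF-8") as output_file:
        output_file.write(text)
```

Because the encoding is fixed, the bytes on disk are the same regardless of the system's preferred encoding.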

@le717
Author

le717 commented May 19, 2015

> Feels like a good thing to respect the OS's encoding though?

I believe the CSS spec says output is always supposed to be UTF-8 encoded.

> If we want to fix this in the way described above, the 2+3 solution is to use the open function from the io module.

Interestingly, the duplicate code in sassutils.builder does just this, and I would think this would be the way to go.

@le717
Author

le717 commented Aug 4, 2015

OK, I just hit this bug again so I decided to do some digging and find the root of the error.

As it turns out, I have this file that uses \u201c, AKA a curly left double quotation mark (you can see it in use in the blockquote here). Python is choking on this character for the exact same reason @asottile gave a few months ago: cp1252 does not have a way to represent this character, so when it tries to write the UTF-8 encoded text to file using cp1252 (as I showed in #72 (comment)), it hits the error. For testing purposes, I also added a non-breaking space, which is what I originally thought the code was breaking on, and got the same outcome.

> Feels like a good thing to respect the OS's encoding though?

I checked the spec: output is supposed to respect the file's encoding, not the OS's. However, as of 3.4.0 even Sass only outputs UTF-8 encoded CSS because they too hit encoding issues.

Honestly, I still think the best fix here is to from io import open on Py2 and write the file in UTF-8 encoding. Right now the code always decodes the output as UTF-8 but then writes it to the file using the OS's encoding. That, to me, is an oversight: the file should always be written as UTF-8. I know I am not the only one who uses Unicode characters in CSS (I actually ripped the quotes from a well-known site that uses Sass).

If you would like, I can put up a PR to apply the fix.

@asottile
Member

asottile commented Aug 5, 2015

seems fine to me, +1 for encoding='UTF-8'
