UnicodeDecodeError: 'ascii' codec can't decode . . . #42
When I use python-pdfkit on HTML content that contains certain characters, it fails with one of these errors if the HTML content is loaded into memory:
However, python-pdfkit works fine when it is given just a filename, and so does wkhtmltopdf.
I think python-pdfkit is doing something unsafe with strings; perhaps it should assume that the input is just bytes.
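Since the filename route is reported to work, one possible workaround is to write the in-memory HTML to a temporary file as raw bytes and hand over the path instead. This is only a sketch: `write_html_to_temp` is a hypothetical helper, not part of python-pdfkit, and the sample HTML is made up for illustration.

```python
import os
import tempfile


def write_html_to_temp(html_bytes):
    """Write raw HTML bytes to a temporary .html file and return its path.

    Hypothetical helper for illustration; not part of python-pdfkit.
    """
    fd, path = tempfile.mkstemp(suffix=".html")
    # Binary mode: the bytes go to disk untouched, with no encode/decode step.
    with os.fdopen(fd, "wb") as handle:
        handle.write(html_bytes)
    return path


# Made-up sample content with accented characters, already encoded as UTF-8.
html_bytes = (
    "<html><head><meta charset='utf-8'></head>"
    "<body>Liège, Wallonië</body></html>"
).encode("utf-8")

html_path = write_html_to_temp(html_bytes)

# With pdfkit and wkhtmltopdf installed, the path could then be passed on:
# pdfkit.from_file(html_path, "out.pdf")
```

The caller is responsible for deleting the temporary file afterwards, for example with `os.remove(html_path)`.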
I also have problems when the source is already in UTF-8 (encoding UTF-8 to UTF-8 gives weird results).
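For illustration, here is a small Python 3 sketch of both symptoms, assuming the cause is an encode step applied to data that is already UTF-8 bytes. In Python 2, calling `.encode('utf-8')` on a byte string first decodes it implicitly as ASCII, which is where the "'ascii' codec can't decode" message comes from; Python 3 has no such implicit step, so the decode is spelled out here. The sample text is made up.

```python
text = "café"
utf8_bytes = text.encode("utf-8")  # the source is already UTF-8 bytes

# Symptom 1: treating the UTF-8 bytes as ASCII fails on the first non-ASCII byte.
try:
    utf8_bytes.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Symptom 2: if the bytes are misread with a single-byte codec (latin-1 here)
# and then encoded to UTF-8 again, every accented character is doubled up.
misread = utf8_bytes.decode("latin-1")
double_encoded = misread.encode("utf-8")

print(type(ascii_error).__name__)
print(utf8_bytes)
print(double_encoded)
```

The double-encoded bytes are longer than the original, and they decode to mojibake rather than the original text.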
Removing the encode works for me. My HTML source files are in UTF-8, as we have many accents in Belgium.
I assume it's the programmer's job to ensure correct encoding before calling the library, so that they are in complete control of what happens when unsupported characters occur.
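A minimal sketch of what that could look like on the caller's side, assuming Python 3 and assuming the library is handed text rather than bytes. `ensure_text` is a hypothetical helper, not part of python-pdfkit; the `errors` argument is what lets the caller decide how undecodable input is handled.

```python
def ensure_text(source, encoding="utf-8", errors="strict"):
    """Return source as text, decoding it only if it is still bytes.

    Hypothetical helper for illustration; not part of python-pdfkit.
    """
    if isinstance(source, bytes):
        # The caller picks the codec and the error policy explicitly.
        return source.decode(encoding, errors)
    # Already text: return unchanged so nothing is encoded or decoded twice.
    return source


html_from_disk = b"<p>caf\xc3\xa9</p>"      # UTF-8 bytes, e.g. read in binary mode
html_in_memory = "<p>caf\u00e9</p>"         # already decoded text

same_text = ensure_text(html_from_disk) == ensure_text(html_in_memory)

# With errors="replace", invalid bytes become U+FFFD instead of raising.
lenient = ensure_text(b"\xff", errors="replace")
```

With the default `errors="strict"`, invalid input raises `UnicodeDecodeError` at a point the caller controls, rather than somewhere inside the library.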