UnicodeDecodeError: 'cp949' codec can't decode bytes #107
I'm getting this error on some specific rtf files.
File "/Library/Python/2.7/site-packages/textract/parsers/__init__.py", line 57, in process
e.g. attached rtf-file (zipped)
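
For anyone who hits the same traceback: the error means the file's bytes were decoded as cp949 but are not valid in that encoding. Below is a minimal sketch of a fallback decoder, written for Python 3 and using only the standard library. The helper name `decode_with_fallback` and the candidate list are hypothetical and not part of textract; this illustrates the failure mode and is not a fix for the library itself.

```python
def decode_with_fallback(data, encodings=("utf-8", "cp949", "latin-1")):
    """Try each candidate encoding in order; return (text, encoding_used).

    Hypothetical helper for illustration only, not part of textract.
    """
    for name in encodings:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue
    # No candidate decoded cleanly: decode with the first one and
    # substitute U+FFFD for the bytes that could not be decoded.
    return data.decode(encodings[0], errors="replace"), encodings[0]


if __name__ == "__main__":
    raw = b"abc\xff"  # 0xff is not valid in UTF-8 or cp949
    text, used = decode_with_fallback(raw)
    print(repr(text), used)
```

Because latin-1 maps every byte to a character, keeping it last in the list means decoding never raises, though the resulting text may be wrong if the real encoding is something else.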
Thank you for providing the example! I am pretty sure this is a chardet version problem. I was able to successfully extract the text from your file when I