Test fails straight out of the box #126
Comments
Thanks for flagging @JGCoelho. Judging by the error messages, this seems to be an issue with character encoding, possibly tied to Windows and/or Anaconda, but it's hard to tell. If you run the tests with a standard Python installation, instead of Anaconda, do you get the same problem? And can anyone else replicate these errors?
Tried cloning it again and running the unit test with the default Python 3.8.2. Same errors:
Maybe a problem with codecs? Opening the files sherlock.txt and senate-bills.txt, I could see that they were in UTF-8 without BOM. I converted them to UTF-8 with BOM and got the same error. I also tried converting them to ANSI and UCS-2, to no avail.
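One possible explanation for why re-saving the files did not help: if the code opens them with `open()` and no `encoding` argument, Python on Windows typically falls back to the locale encoding (often cp1252) rather than UTF-8. The sketch below is only an illustration under that assumption. `read_text` and the temporary `sample.txt` file are hypothetical and are not taken from this project's code.

```python
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Hypothetical helper: read a text file with an explicit encoding."""
    with open(path, encoding=encoding) as handle:
        return handle.read()


# Sample text containing curly quotes (U+201C and U+201D).
sample = "He said \u201chello\u201d"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")

    # Write the sample as UTF-8 without BOM, like the files in the report.
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(sample)

    # Reading with an explicit UTF-8 encoding round-trips the text.
    round_trip = read_text(path)

    # Reading the same bytes as cp1252 fails on the closing quote.
    try:
        read_text(path, encoding="cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True

print(round_trip == sample, cp1252_failed)
```

If this is indeed the cause, passing `encoding="utf-8"` wherever the text files are opened would likely make the behaviour the same on Windows and other platforms.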
Also, the byte 0x9d is the last byte of the UTF-8 encoding (E2 80 9D) of the 'RIGHT DOUBLE QUOTATION MARK' (U+201D) ”.
0x9d is unmapped in Windows-1252, according to Wikipedia.
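The two observations above can be checked directly in Python, independently of this project. This is a minimal sketch; `decodes_as` is a hypothetical helper introduced only for this check.

```python
# U+201D (RIGHT DOUBLE QUOTATION MARK) is three bytes in UTF-8.
quote = "\u201d"
utf8_bytes = quote.encode("utf-8")
print(utf8_bytes)  # b'\xe2\x80\x9d'


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` can be decoded with `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# The bytes are valid UTF-8 ...
print(decodes_as(utf8_bytes, "utf-8"))   # True
# ... but cp1252 (Windows-1252) has no character at 0x9d, so decoding fails.
print(decodes_as(utf8_bytes, "cp1252"))  # False
```

This is consistent with a UTF-8 file being read as Windows-1252: the first two bytes decode to other characters, and the 0x9d byte triggers the error.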
I've cloned the repository and tried running the unit test test.test_itertext. This test doesn't require setting up the sherlock model. It reads the text files that come with the package and builds the models inside the test, so I didn't provide any input to it. The error I keep getting is this:
Running a conda 3.7.6 environment on Windows 10.