UnicodeDecodeError with classifier.baseline() #14

Open
jcneshi opened this issue Feb 16, 2018 · 2 comments

jcneshi commented Feb 16, 2018

This issue is similar to, but different from, another one posted here.

$ python jnlp-test-sentencePolarityScore.py
Traceback (most recent call last):
  File "jnlp-test-sentencePolarityScore.py", line 9, in <module>
    print classifier.baseline(text)
  File "build/bdist.macosx-10.13-intel/egg/jNlp/jSentiments.py", line 56, in baseline
  File "build/bdist.macosx-10.13-intel/egg/jNlp/jSentiments.py", line 49, in polarScores_text
  File "build/bdist.macosx-10.13-intel/egg/jNlp/jTokenize.py", line 30, in jTokenize
  File "build/bdist.macosx-10.13-intel/egg/jNlp/jCabocha.py", line 27, in cabocha
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb5 in position 105: invalid start byte

At first, I had this same error with classifier.train(), but it disappeared once I ran ./configure --with-charset=utf8 for both the mecab dictionary and cabocha.

However, with classifier.baseline() the error remains.
Is there another part of the toolchain that I need to configure for utf-8?
Am I missing something really basic?
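
One way to narrow this down might be to check which charset a piece of output actually decodes as. The sketch below is only an illustration: the helper name `decodable_as` and the candidate list are hypothetical, it is not part of jNlp, and the sample bytes are generated in the script rather than captured from cabocha. It relies only on the standard `bytes.decode` method.

```python
# -*- coding: utf-8 -*-
# Hypothetical diagnostic helper, not part of jNlp: report which of a few
# common Japanese charsets can decode a byte string without errors.

CANDIDATES = ("utf-8", "euc_jp", "shift_jis")  # assumed list of likely charsets


def decodable_as(raw, candidates=CANDIDATES):
    """Return the names of the candidate encodings that decode `raw` cleanly."""
    ok = []
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this charset cannot represent these bytes
        ok.append(name)
    return ok


# Stand-in sample text ("nihongo"), encoded two ways for comparison.
sample = u"\u65e5\u672c\u8a9e"
utf8_bytes = sample.encode("utf-8")
eucjp_bytes = sample.encode("euc_jp")

print(decodable_as(utf8_bytes))
print(decodable_as(eucjp_bytes))
```

If the bytes that reach jCabocha.py fail to decode as utf-8 but do decode as euc_jp, that would suggest some dictionary or binary in the chain is still built with a non-UTF-8 charset. A clean decode under more than one charset is possible, so the result is a hint rather than proof.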

Thanks!

jcneshi commented Feb 16, 2018

By the way, my jnlp-test-sentencePolarityScore.py file uses your code in section 1.4.2, seen here:
http://jprocessing.readthedocs.io/en/latest/#how-to-use

kevincobain2000 (Owner) commented

Hi, is this issue fixed?
