UnicodeDecodeError with classifier.baseline() #14

jcneshi · 2018-02-16T09:51:46Z

This is similar to, but different from, another issue posted here.

$ python jnlp-test-sentencePolarityScore.py
Traceback (most recent call last):
  File "jnlp-test-sentencePolarityScore.py", line 9, in <module>
    print classifier.baseline(text)
  File "build/bdist.macosx-10.13-intel/egg/jNlp/jSentiments.py", line 56, in baseline
  File "build/bdist.macosx-10.13-intel/egg/jNlp/jSentiments.py", line 49, in polarScores_text
  File "build/bdist.macosx-10.13-intel/egg/jNlp/jTokenize.py", line 30, in jTokenize
  File "build/bdist.macosx-10.13-intel/egg/jNlp/jCabocha.py", line 27, in cabocha
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb5 in position 105: invalid start byte

At first I got the same error with classifier.train(), but it disappeared once I rebuilt both the mecab dictionary and cabocha with ./configure --with-charset=utf8.

However, with classifier.baseline() the error remains.
Is there another part of the toolchain that I need to configure for utf-8?
Am I missing something really basic?

Thanks!


jcneshi · 2018-02-16T09:53:48Z

By the way, my jnlp-test-sentencePolarityScore.py file uses your code in section 1.4.2, seen here:
http://jprocessing.readthedocs.io/en/latest/#how-to-use

kevincobain2000 · 2018-08-29T07:25:05Z

Hi, is this issue fixed?
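
For anyone hitting the same traceback: "invalid start byte" on 0xb5 means the bytes being decoded are not UTF-8 at that position. One possibility, not confirmed in this thread, is that some part of the toolchain still emits EUC-JP. The sketch below is a rough way to check which encodings a chunk of raw output decodes under. The helper name `decodable_as` and the sample string are made up for illustration and are not part of jNlp. In practice `raw` would be the bytes captured from the external tool.

```python
# Hypothetical diagnostic helper, not part of jNlp.
# Reports which candidate encodings can decode a byte string without error.

CANDIDATES = ("utf-8", "euc_jp")


def decodable_as(raw, encodings=CANDIDATES):
    """Return the names in `encodings` under which `raw` decodes cleanly."""
    ok = []
    for name in encodings:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        ok.append(name)
    return ok


# Illustrative sample: a short Japanese string encoded as EUC-JP,
# standing in for output from a tool that was not built for UTF-8.
sample = u"\u6c17\u6301\u3061".encode("euc_jp")

# A lone 0xb5 byte is never a valid first byte of a UTF-8 sequence.
try:
    b"\xb5".decode("utf-8")
except UnicodeDecodeError as exc:
    print(exc.reason)

print(decodable_as(sample))
```

If the real output only decodes as something other than UTF-8, that points at a dictionary or binary that was not rebuilt with the utf8 charset. A clean decode under a given encoding is only a hint, since arbitrary bytes can sometimes decode under the wrong codec.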
