Command line vs. API #4

Closed
kaharjan opened this issue Apr 5, 2016 · 2 comments

kaharjan commented Apr 5, 2016

I wrote Python code to segment the given words. The main code is:

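import morfessor

io = morfessor.MorfessorIO()
model = io.read_any_model('model.bin')
with open('test.txt', 'r') as InputFile:
    for line in InputFile:
        words = line.strip().split()
        # viterbi_segment returns (morphemes, cost); keep only the morphemes
        morphemes = [(w, " ".join(model.viterbi_segment(w)[0])) for w in words]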

Only a few words are segmented. But when I used the same model on the command line to segment the same text, most of the words were segmented:
$ morfessor-segment -l model.bin test.txt

Any idea what is wrong in my Python code? Thank you!

psmit commented Apr 6, 2016

The most likely cause is that you are using the wrong encoding. Did you make sure that InputFile is opened with the right encoding? Internally, Morfessor always uses Unicode strings (unicode in Python 2, str in Python 3).
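
A minimal sketch of what that suggestion could look like, assuming the input file is UTF-8 (the thread does not say which encoding it actually uses). The helper names `read_words` and `segment_words` are hypothetical and not part of Morfessor; `segment` stands in for any callable that returns a `(morphemes, cost)` pair, as `model.viterbi_segment` does in the code above.

```python
# On Python 3 this is the built-in open; on Python 2 it adds the encoding argument.
from io import open


def read_words(path, encoding="utf-8"):
    """Yield each whitespace-separated token of the file as a Unicode string.

    The default encoding is an assumption; pass the file's real encoding.
    """
    with open(path, "r", encoding=encoding) as input_file:
        for line in input_file:
            for word in line.strip().split():
                yield word


def segment_words(words, segment):
    """Pair each word with its morphemes joined by spaces.

    `segment` is any callable returning (morphemes, cost).
    """
    return [(w, " ".join(segment(w)[0])) for w in words]
```

With the model loaded as in the question, the call would be `segment_words(read_words('test.txt'), model.viterbi_segment)`. Decoding at read time means the model receives Unicode strings rather than raw bytes, which is the mismatch the comment above points to.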

kaharjan commented

Thank you! That helped.
