Whether it works in Chinese Word Segmentation #17

Closed
hzylmf opened this issue Nov 2, 2017 · 1 comment

hzylmf commented Nov 2, 2017

Thanks for your code. I want to use it for Chinese Word Segmentation, so will it work if I apply the code to my word segmentation task?

@LiyuanLucasLiu (Owner)

Thanks :-)

The current code does not work for Chinese out of the box, but you can definitely modify it a little to make it work on Chinese.

Basically, I would recommend using our word-level model, treating each Chinese character as a word, and modifying the encoding for reading & writing. Also, pre-trained embeddings are crucial for performance, so I would also recommend getting character-level embeddings for Chinese (there are several papers about this).
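To illustrate the "treat each character as a word" idea, here is a minimal sketch (not part of this repo) that converts whitespace-segmented Chinese text into a character-per-line, CoNLL-style "token tag" format using the common BMES segmentation tags. The function name and the exact column layout are assumptions; adjust them to whatever input format the training script expects.

```python
def to_bmes(segmented_sentence):
    """Map a whitespace-segmented sentence to (character, tag) pairs.

    B = begin of a multi-character word, M = middle, E = end,
    S = single-character word.
    """
    pairs = []
    for word in segmented_sentence.split():
        if len(word) == 1:
            pairs.append((word, "S"))
        else:
            pairs.append((word[0], "B"))
            pairs.extend((ch, "M") for ch in word[1:-1])
            pairs.append((word[-1], "E"))
    return pairs


if __name__ == "__main__":
    # Example: "我 喜欢 自然语言处理" ("I like natural language processing")
    for ch, tag in to_bmes("我 喜欢 自然语言处理"):
        print(f"{ch} {tag}")  # one character per line, treated as a "word" token
    print()                   # blank line separates sentences
```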

Besides, you could also try representing Chinese characters by Wubi or Pinyin, and treating those as the character-level representation.
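For the Pinyin route, one option (an assumption, not something this repo depends on) is the third-party pypinyin package, which expands each character into its romanized spelling; the letters of that spelling can then play the role of the character-level view of each token. A minimal sketch:

```python
# Requires: pip install pypinyin  (third-party library, not bundled with this repo)
from pypinyin import lazy_pinyin

sentence = "自然语言处理"
# lazy_pinyin returns one toneless romanization per character,
# e.g. ['zi', 'ran', 'yu', 'yan', 'chu', 'li']
for ch, py in zip(sentence, lazy_pinyin(sentence)):
    print(ch, py)  # the letters of `py` can serve as the character-level input for `ch`
```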
