About Chinese dataset #20
Comments
Hi! Could you please post the error stack trace?
I think this is caused by not choosing a Chinese parser, but I don't know where to start.
Have you tried manually running
This is the point where I am confused. I changed it to edu.stanford.nlp.trees.GrammaticalStructure, but I still get the following error:
I am not sure how to do it with Chinese, but have a look here, it might help:
Thank you for your answer. It does indeed run, but similar errors like the following still occur:
Another problem is that the documentation does not describe what these parameters do. Where did you learn about them?
Just some sentences in the document |
Hm. If it's just a couple of sentences, why don't you ignore this error and see if everything else works? |
The parsing results for those sentences are wrong, so I discard them outright.
Admittedly, this approach has a drawback: it compromises the integrity of the data.
It really comes down to the percentage of such sentences. What is it? |
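One rough way to answer that is to count how many sentences fail versus how many go through. The sketch below is only an illustration: `parse` is a hypothetical stand-in for whatever conversion step raises the error, and neither helper is part of this project or of Stanford CoreNLP.

```python
from typing import Callable, Iterable, List, Tuple


def split_by_parse_success(
    sentences: Iterable[str],
    parse: Callable[[str], object],  # hypothetical: any callable that raises on failure
) -> Tuple[List[str], List[str]]:
    """Separate sentences into those that parse and those that raise an exception."""
    kept: List[str] = []
    failed: List[str] = []
    for sentence in sentences:
        try:
            parse(sentence)
        except Exception:
            # Keep the failing sentence so it can be inspected later, not just dropped.
            failed.append(sentence)
        else:
            kept.append(sentence)
    return kept, failed


def failure_rate(kept: List[str], failed: List[str]) -> float:
    """Fraction of all sentences that failed to parse (0.0 if there are none)."""
    total = len(kept) + len(failed)
    return len(failed) / total if total else 0.0
```

Keeping the failed sentences in their own list, rather than silently dropping them, makes it possible to report the percentage and to revisit them once the parser setup is fixed.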
Hello, thank you for your great open-source work. I want to process Chinese datasets following your pipeline, but the convert_to_jsonlines.py step reports an error. Do you know why?
Thanks.