Error using POS tagger #17
Comments
I think I've tracked that assertion down to here: https://github.com/honnibal/thinc/blob/master/thinc/learner.pyx#L99 but I'm unclear as to why my class label is negative.
Hi, how is the data in wsj.10.txt formatted? Are the tests passing for you? This test shows how to pass a single training example to the train function: https://github.com/syllog1sm/redshift/blob/develop/tests/test_tagger.py
wsj.10.txt is PTB-formatted: Why/WRB is/VBZ the/DT stock/NN market/NN suddenly/RB so/RB volatile/JJ ?/. This seems to be the format that the Input.from_pos constructor expects. I tried running the tests and two of them fail. As you can see from the snippet below, these failures result from the same AssertionError that I mentioned above:
Are you able to reproduce this? I'm running on OS X 10.10, using Python 2.7.6.
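For readers unfamiliar with the word/TAG format quoted above, here is a minimal sketch of how one such line breaks down into (word, tag) pairs. The helper name parse_tagged_sentence is hypothetical and is not part of redshift; this is not how Input.from_pos reads the file, only an illustration of the format and a quick way to spot malformed tokens in a training file.

```python
def parse_tagged_sentence(line):
    """Split a 'word/TAG word/TAG ...' line into (word, tag) pairs.

    Hypothetical helper for illustration; not redshift's own parser.
    """
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST slash, so a word that itself
        # contains '/' (e.g. '1/2/CD') keeps its internal slashes.
        word, sep, tag = token.rpartition("/")
        if not sep or not word or not tag:
            raise ValueError("malformed token: %r" % token)
        pairs.append((word, tag))
    return pairs


sentence = ("Why/WRB is/VBZ the/DT stock/NN market/NN "
            "suddenly/RB so/RB volatile/JJ ?/.")
pairs = parse_tagged_sentence(sentence)
```

A token with no slash, or with an empty word or tag, raises ValueError, which makes it easy to check a small sample file before training on it.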
Okay, I think I've fixed this. The underlying problem is that I've broken the perceptron code out into its own module, thinc, and I'd been building redshift against my local version of that library instead of the one on pip. Try pulling the new version and running "pip install -r requirements.txt" to get thinc1.50. Then run "fab clean make test".
Yay. Tests pass and I've trained a tagger. Thanks!
Great! Thanks for the bug reports. Let me know if you have any other problems.
Hi there. I'm trying to use your POS tagger and I'm getting the following error when I attempt to train on a very small sample (10 sentences) from the Penn Treebank WSJ dataset. Any thoughts as to what I'm doing wrong?