Transition-based dependency parsing algorithms #694
I strongly recommend implementing this algorithm or a shift-reduce parser, since these linear-time parsing algorithms can be very fast compared to bottom-up parsers. In some situations, say parsing a large data set, this would be very useful. And in academic work, these linear parsers can be almost as accurate as bottom-up parsers. I really want to work on this, but I don't have much after-work time. I wish I could help, if not as a coder, then as a reviewer or at least in discussion. Thanks!
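For readers unfamiliar with the transition systems under discussion, here is a minimal sketch of the arc-eager system's four transitions. The class and names below are illustrative only, not NLTK's actual API:

```python
# Minimal sketch of the arc-eager transition system (illustrative names,
# not NLTK's API). A configuration is a stack, a buffer of word indices
# (1-based; 0 is the artificial root), and a set of labelled arcs.

class ArcEagerConfig:
    def __init__(self, n_words):
        self.stack = [0]                      # start with the root on the stack
        self.buffer = list(range(1, n_words + 1))
        self.arcs = set()                     # (head, label, dependent) triples

    def shift(self):
        self.stack.append(self.buffer.pop(0))

    def left_arc(self, label):
        # top of stack becomes a dependent of the front of the buffer
        self.arcs.add((self.buffer[0], label, self.stack.pop()))

    def right_arc(self, label):
        # front of buffer becomes a dependent of the top of the stack,
        # then moves onto the stack
        self.arcs.add((self.stack[-1], label, self.buffer[0]))
        self.stack.append(self.buffer.pop(0))

    def reduce(self):
        # pop a word that has already received its head
        self.stack.pop()

# Parse "John saw Mary" with a hand-chosen transition sequence:
c = ArcEagerConfig(3)                         # words: 1=John, 2=saw, 3=Mary
c.shift()                                     # push John
c.left_arc('nsubj')                           # John <- saw
c.right_arc('root')                           # root -> saw
c.right_arc('obj')                            # saw -> Mary
print(sorted(c.arcs))                         # [(0, 'root', 2), (2, 'nsubj', 1), (2, 'obj', 3)]
```

Each transition is O(1), so a full parse is linear in sentence length, which is the speed advantage mentioned above.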
I would be interested in contributing. I think the biggest question is how to implement the classifier. We could either do the entire training process in Python or start by importing a pre-trained model, e.g. from MaltParser. @charnugagoo we could start a thread on the nltk-dev group and hear some feedback from the authors of nltk.
Our philosophy in NLTK is to provide both, i.e. pure Python implementation(s) that have educational value, and interfaces to optimised third-party implementations. If you can pick a pre-trained model that you would like included with the NLTK-data collection, I'm happy to take care of adding that.
@stnatic Personally, I recommend the perceptron, which worked well when I was writing my own shift-reduce parser. An averaged perceptron is very easy to implement. I found some papers discussing perceptron + arc-eager, so I think it would work in this case. I agree. Could you send out an email to nltk-dev, please?
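Since the averaged perceptron comes up repeatedly in this thread, here is a minimal sketch of one. The class and method names are our own illustration, not NLTK's `nltk.classify` API:

```python
# Minimal averaged perceptron for multiclass classification (illustrative
# sketch). Weights are averaged over all update steps, which stabilises
# the online, error-driven training recommended in this thread.
from collections import defaultdict

class AveragedPerceptron:
    def __init__(self, classes):
        self.classes = list(classes)
        self.weights = defaultdict(float)     # (feature, class) -> weight
        self._totals = defaultdict(float)     # running sums for averaging
        self._stamps = defaultdict(int)       # step at which a weight last changed
        self._step = 0

    def score(self, features, cls):
        return sum(self.weights[(f, cls)] for f in features)

    def predict(self, features):
        return max(self.classes, key=lambda c: self.score(features, c))

    def update(self, features, truth):
        self._step += 1
        guess = self.predict(features)
        if guess == truth:
            return
        for f in features:
            for cls, delta in ((truth, 1.0), (guess, -1.0)):
                key = (f, cls)
                # accumulate the old weight over the steps it was in effect
                self._totals[key] += (self._step - self._stamps[key]) * self.weights[key]
                self._stamps[key] = self._step
                self.weights[key] += delta

    def average(self):
        # replace each weight by its average over all training steps
        for key, w in self.weights.items():
            self._totals[key] += (self._step - self._stamps[key]) * w
            self.weights[key] = self._totals[key] / max(self._step, 1)
```

For a transition-based parser, `features` would be strings extracted from the parser configuration (e.g. the tags of the stack top and buffer front) and the classes would be the transitions.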
I had almost no time during the last 2 weeks, so I apologize for not starting a mailing group thread. I see that @longdt219 is now assigned to this issue, though it seems to be a rather big task, assuming that the classifier would be written in pure Python. Is there still any room for contribution? If so, please let me know.
Hi @stnatic, I'm a bit confused here: why should we re-implement the classifier, given that NLTK already provides many classifiers, e.g. SVM, maxent, etc.?
Hi @longdt219, I didn't mean implementing SVM from the ground up; I rather meant that converting CoNLL treebank data into training data for the learning black box might take a solid amount of programming effort. I think the standard feature model used for arc-eager is the one described in http://www.maltparser.org/userguide.html#featurespec; I've read some papers that also assume that model. I believe that converting treebank data into a set of learning examples requires solving the following problem: given a feature vector as above, find the operation (e.g. LEFT-ARC with label OBJ) that leads to a correct parse. If there is any way in which I can assist you on this, let me know. I am greatly interested in contributing to this particular issue.
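The conversion problem described above, finding for each configuration a transition that stays on the gold-standard path, is usually solved with a static oracle. A rough sketch for arc-eager, with hypothetical function names and unlabelled transitions for brevity:

```python
# Sketch of a static oracle for arc-eager (hypothetical names, unlabelled).
# gold_heads maps each word index to its gold head; has_head tracks words
# that have already been attached during the derivation.

def oracle_transition(stack, buffer, gold_heads, has_head):
    s = stack[-1] if stack else None
    b = buffer[0] if buffer else None
    if s is not None and b is not None and gold_heads.get(s) == b:
        return 'LEFT-ARC'
    if s is not None and b is not None and gold_heads.get(b) == s:
        return 'RIGHT-ARC'
    # REDUCE once the stack top has its head and no remaining dependents
    if s is not None and s in has_head and not any(gold_heads.get(w) == s for w in buffer):
        return 'REDUCE'
    return 'SHIFT'

def oracle_sequence(n_words, gold_heads):
    """Replay the oracle on a sentence, collecting (transition) labels."""
    stack, buffer, has_head = [0], list(range(1, n_words + 1)), set()
    seq = []
    while buffer:
        t = oracle_transition(stack, buffer, gold_heads, has_head)
        seq.append(t)
        if t == 'SHIFT':
            stack.append(buffer.pop(0))
        elif t == 'LEFT-ARC':
            has_head.add(stack.pop())
        elif t == 'RIGHT-ARC':
            has_head.add(buffer[0])
            stack.append(buffer.pop(0))
        else:  # REDUCE
            stack.pop()
    return seq
```

Pairing each returned transition with features extracted from the configuration at that moment would yield the training examples for the classifier.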
Hi @stevenbird |
It is fine to have a libsvm dependency. The only question is whether we need to redistribute the |
Another alternative is to rely on scikit-learn's libsvm wrappers.
Thanks @jnothman – that would be better, since we already have a dependency on scikit-learn. |
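For reference, the scikit-learn route mentioned here: `sklearn.svm.SVC` wraps libsvm, so a transition classifier could be trained without a direct libsvm dependency. The feature names and labels in this toy example are made up for illustration:

```python
# Toy example of a libsvm-backed transition classifier via scikit-learn.
# Feature names and labels are illustrative only.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC

train_feats = [{'stack0.tag': 'NN', 'buf0.tag': 'VB'},
               {'stack0.tag': 'VB', 'buf0.tag': 'NN'}]
train_labels = ['LEFT-ARC', 'RIGHT-ARC']

vec = DictVectorizer()                        # one-hot encode string features
X = vec.fit_transform(train_feats)
clf = SVC(kernel='linear').fit(X, train_labels)

pred = clf.predict(vec.transform([{'stack0.tag': 'NN', 'buf0.tag': 'VB'}]))
print(pred[0])
```

NLTK's `SklearnClassifier` wrapper in `nltk.classify` performs essentially this dict-to-vector plumbing, which may be the natural integration point.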
Hi @stevenbird and all,

Our implementation:
B. Arc-Eager (UAS, LAS)

Malt parser:
B. Arc-Eager

The discrepancies between our implementation and Malt parser are
Some points of implementation I would like to discuss:
Hi, you may be interested in my write-up on shift-reduce dependency parsing here: https://honnibal.wordpress.com/2013/12/18/a-simple-fast-algorithm-for-natural-language-dependency-parsing/ . Note that the post describes the arc-hybrid transition system, just because it was slightly easier to describe. The implementation, https://gist.github.com/syllog1sm/10343947 , is BSD licensed, so you can adapt it for use in NLTK directly if you like.

I think there are a few things worth emphasising, as far as performance goes (both efficiency and accuracy). First, it's really much better to use an averaged perceptron, or some other method that can be trained in an error-driven way. You don't want to do batch learning: it makes it difficult to train from negative examples effectively, and this makes a very big difference to accuracy.

Second, when you train the parser, you should really use the Goldberg and Nivre (2012) "dynamic oracle" strategy. This means that you follow the moves the parser predicts during training, rather than forcing the parser to follow only the gold-standard derivations. Again, this makes your training data match the run-time conditions much more closely, which makes the parser much more accurate.

Third, I've not taken any trouble to optimise the perceptron in that implementation, since I wanted to keep the code concise for the tutorial. If you map all your strings to integers and use numpy arrays, you should be able to make the parser much faster.
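The error-driven, dynamic-oracle training step described above can be sketched roughly as follows. The `model` interface and the names here are hypothetical, not Honnibal's actual code:

```python
# Rough sketch of one dynamic-oracle training step (hypothetical names).
# The parser follows its own prediction rather than the gold derivation,
# and updates toward the best zero-cost transition whenever the
# prediction would introduce an error.

def train_step(model, features, zero_cost_transitions):
    guess = model.predict(features)
    if guess not in zero_cost_transitions:
        best = max(zero_cost_transitions,
                   key=lambda t: model.score(features, t))
        model.update(features, best)          # error-driven update
    return guess                              # follow the predicted move
```

The key point is the return value: the training loop applies `guess` to the configuration even when it is wrong, so the classifier learns to recover from the states it will actually visit at run time.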
Thank you @syllog1sm. These are very detailed comments.
Thanks @syllog1sm – our goal here has just been to implement the published algorithms, for educational purposes. Further extensions are welcome, but beyond the scope of what @longdt219 is able to do. Feel free to submit a pull request any time.
I've updated our roadmap for dependency parser support, including Honnibal's shift-reduce parser, cf. |
Implement Nivre's arc-eager and arc-standard algorithms.
http://aclweb.org/anthology/J08-4003.pdf
(Note that there have been some minor improvements made to the algorithms since these publications; please contact the author for details.)