Transition-based dependency parsing algorithms #694

Closed
stevenbird opened this issue Jul 8, 2014 · 16 comments

@stevenbird
Member

Implement Nivre's arc-eager and arc-standard algorithms.
http://aclweb.org/anthology/J08-4003.pdf
(Note that there have been some minor improvements made to the algorithms since these publications; please contact the author for details.)
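For readers new to these transition systems, the following is a minimal, unlabeled sketch of the four arc-eager transitions over a (stack, buffer, arcs) configuration. It is illustrative only (preconditions are omitted) and is not a proposal for NLTK's eventual API:

```python
# Minimal sketch of the arc-eager transition system (Nivre 2008).
# A configuration is (stack, buffer, arcs); words are integer indices
# and arcs are (head, dependent) pairs. Preconditions are omitted.

def shift(stack, buffer, arcs):
    # Move the front of the buffer onto the stack.
    stack.append(buffer.pop(0))

def left_arc(stack, buffer, arcs):
    # The buffer front becomes the head of the stack top, which is
    # then popped (it can take no further dependents).
    arcs.append((buffer[0], stack.pop()))

def right_arc(stack, buffer, arcs):
    # The stack top becomes the head of the buffer front, which is
    # then shifted onto the stack (it may still take dependents).
    arcs.append((stack[-1], buffer[0]))
    stack.append(buffer.pop(0))

def reduce_(stack, buffer, arcs):  # underscore avoids the builtin
    # Pop the stack top once it has already received its head.
    stack.pop()
```

Arc-standard differs mainly in that both arc transitions operate on the top two stack items (popping the dependent), and there is no REDUCE.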

@charnugagoo

I strongly recommend implementing this algorithm, or a shift-reduce parser, since these linear-time parsing algorithms can be very fast compared to bottom-up parsers. In some situations, say parsing a large data set, this would be very useful. And in the academic literature, these linear parsers can be almost as good (in accuracy) as bottom-up parsers.

I really want to work on this, but I don't have much after-work time. I wish I could help, not as a coder, but as a reviewer or at least in discussion.

Thanks!

@stnatic

stnatic commented Nov 25, 2014

I would be interested in contributing. I think that the biggest question is how to implement the classifier. We could either do the entire training process in Python or start by importing a pre-trained model, e.g. from MaltParser.

@charnugagoo we could start a thread on the nltk-dev group and hear some feedback from the authors of NLTK

@stevenbird
Member Author

Our philosophy in NLTK is to provide both, i.e. pure Python implementation(s) that have educational value, and interfaces to optimised third-party implementations. If you can pick a pre-trained model that you would like included with the NLTK-data collection, I'm happy to take care of adding that.

@charnugagoo

@stnatic Personally, I recommend the perceptron, which worked well when I was writing my own shift-reduce parser. An averaged perceptron is very easy to implement. I found some papers discussing perceptron + arc-eager, so I think it would work in this case.
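For a sense of how little code that takes, here is a rough sketch of a multi-class averaged perceptron over string-keyed features; the class and method names are invented, and the averaging uses the standard running-sum trick:

```python
from collections import defaultdict

class AveragedPerceptron:
    """Sketch of a multi-class averaged perceptron; not an NLTK API."""

    def __init__(self, classes):
        self.classes = classes
        self.weights = defaultdict(lambda: defaultdict(float))  # feature -> class -> weight
        self.totals = defaultdict(lambda: defaultdict(float))   # running sums for averaging
        self.t = 0  # number of updates seen so far

    def predict(self, features):
        scores = dict.fromkeys(self.classes, 0.0)
        for f in features:
            for c, w in self.weights[f].items():
                scores[c] += w
        return max(scores, key=scores.get)

    def update(self, truth, guess, features):
        self.t += 1
        if truth == guess:
            return
        for f in features:
            for c, delta in ((truth, 1.0), (guess, -1.0)):
                self.weights[f][c] += delta
                self.totals[f][c] += self.t * delta

    def average(self):
        # Standard trick: averaged weight is w minus (sum of t * delta) / T.
        for f, cw in self.weights.items():
            for c in cw:
                cw[c] -= self.totals[f][c] / max(self.t, 1)
```

Each `update` call would correspond to one transition decision during training.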

I agree. Could you send out an email to nltk-dev, please?

@stevenbird stevenbird added this to the late-December milestone Dec 9, 2014
@stnatic

stnatic commented Dec 12, 2014

I had almost no time during the last 2 weeks so I apologize for not starting a mailing group thread.

I see that @longdt219 is now assigned to this issue. It seems to be a rather big task, though, assuming that the classifier would be written in pure Python. Is there still any room for contribution? If so, please let me know.

@longdt219
Contributor

Hi @stnatic, I'm a bit confused here: why should we re-implement the classifier, given that NLTK already provides many classifiers, e.g. SVM, maxent, etc.?

@stnatic

stnatic commented Dec 16, 2014

Hi @longdt219, I didn't mean implementing SVM from the ground up; rather, I meant that converting CoNLL treebank data into training data for the learning black box might take a solid amount of programming effort.

I think that the standard feature model used for arc-eager is the one described in http://www.maltparser.org/userguide.html#featurespec; I've read some papers that also assume that model.

I believe that converting treebank data into a set of learning examples requires solving the following problem: given a feature vector as above, find the operation (e.g. LEFT-ARC with label OBJ) that leads to a correct parse.
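As a concrete illustration of that problem, here is a simplified, unlabeled static oracle for arc-eager: given the gold head of every word, it names the transition to take from the current configuration (all names are invented for illustration):

```python
def static_oracle(stack, buffer, gold_heads, built_arcs):
    """Choose the arc-eager transition consistent with the gold tree.

    gold_heads[i] is the gold head of word i; built_arcs is the set of
    (head, dependent) pairs constructed so far. Unlabeled sketch only;
    a labelled oracle would also return the arc label.
    """
    if stack and buffer:
        s, b = stack[-1], buffer[0]
        if gold_heads[s] == b:
            return 'LEFT-ARC'
        if gold_heads[b] == s:
            return 'RIGHT-ARC'
        # REDUCE once the stack top has its head and no remaining
        # buffer word still needs it as a head.
        if (gold_heads[s], s) in built_arcs and all(gold_heads[w] != s for w in buffer):
            return 'REDUCE'
    return 'SHIFT'
```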

If there is any way in which I can assist you on this, then let me know. I am greatly interested in contributing to this particular issue.

@longdt219
Contributor

Hi @stevenbird
In the original implementation, they use libsvm. Could we include libsvm in NLTK, or should we substitute the maxent classifier that is already implemented?

@stevenbird
Member Author

It is fine to have a libsvm dependency. The only question is whether we need to redistribute the svm.py wrapper, or else use pip or some other package management tool to install it (cf. https://github.com/Salinger/libsvm-python).

@jnothman
Contributor

Another alternative is to rely on scikit-learn's libsvm wrappers.


@stevenbird
Member Author

Thanks @jnothman – that would be better, since we already have a dependency on scikit-learn.
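For the record, a hypothetical sketch of that route: per-configuration feature dicts vectorised with DictVectorizer and fed to sklearn.svm.SVC. The feature names and toy data here are invented:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC

# Invented transition-classification instances: one feature dict per
# parser configuration, one transition label each.
X_dicts = [
    {'stk0_word': 'saw', 'stk0_pos': 'VBD', 'buf0_word': 'dog', 'buf0_pos': 'NN'},
    {'stk0_word': 'dog', 'stk0_pos': 'NN', 'buf0_word': '.', 'buf0_pos': '.'},
]
y = ['RIGHT-ARC', 'REDUCE']

vec = DictVectorizer(sparse=True)
X = vec.fit_transform(X_dicts)       # sparse one-hot feature matrix

clf = SVC(kernel='poly', degree=2)   # MaltParser-style polynomial kernel
clf.fit(X, y)
print(clf.predict(vec.transform(X_dicts[:1])))
```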

@longdt219
Contributor

Hi @stevenbird and all,
Here are some results from our arc-eager and arc-standard implementations compared with MaltParser on part of the English PTB, with:

  • training data: 181 sentences (165 projective, 16 non-projective)
  • testing data: 214 sentences

Our implementation:
A. Arc-standard (UAS, LAS)

  • Training data: (0.961, 0.956)
  • Test data: (0.737, 0.708)

B. Arc-eager (UAS, LAS)

  • Training data: (0.954, 0.948)
  • Test data: (0.726, 0.696)

MaltParser:
A. Arc-standard (UAS, LAS)

  • Test data: (0.756, 0.725)

B. Arc-eager (UAS, LAS)

  • Test data: (0.759, 0.726)

The discrepancies between our implementation and MaltParser come down to:

  • Different feature sets: for simplicity we only use basic features.
  • Learning algorithm: our implementation is slower, mainly because the scikit-learn libsvm wrapper hasn't implemented SVC.decision_function for sparse matrices. I had to use SVC.predict_proba, which relies on cross-validation (see the sketch below).
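To make that second point concrete, here is a small sketch of the two prediction paths; probability=True makes libsvm fit Platt scaling with internal cross-validation at training time, which is the slow part. The toy data is invented:

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.svm import SVC

# Toy sparse data standing in for vectorised parser configurations.
rng = np.random.RandomState(0)
X = csr_matrix(rng.rand(40, 10))
y = rng.randint(0, 3, size=40)  # three pretend transitions

# Slow path: probability=True triggers Platt scaling, fitted via
# libsvm's internal cross-validation during training.
slow = SVC(kernel='poly', degree=2, probability=True).fit(X, y)
probs = slow.predict_proba(X)

# Raw margins skip the calibration entirely; at the time of this
# thread SVC.decision_function rejected sparse input, so one
# workaround was to densify (newer versions accept sparse X).
Xd = X.toarray()
fast = SVC(kernel='poly', degree=2).fit(Xd, y)
margins = fast.decision_function(Xd)
```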

Some implementation points I would like to discuss:
(1) Currently the code only accepts the CoNLL shared-task data format for both training and testing. What else should I implement?
(2) Should I add more complicated features for better performance?
(3) Should we focus on improving the training speed by:

  • using a different wrapper for libsvm, or
  • separating the training data into bins based on some features and training a separate classifier per bin? (See the sketch below.)
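The binning idea in (3) is essentially MaltParser's input-split strategy. A hypothetical sketch, splitting on a single feature (the name 'buf0_pos' is invented) and training one classifier per bin:

```python
from collections import defaultdict
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC

def train_binned(instances, split_feature='buf0_pos'):
    """Train one small SVC per value of the splitting feature.

    `instances` is a list of (feature_dict, transition_label) pairs.
    Sketch only; a real implementation would also need a fallback
    model for unseen or single-class bins.
    """
    bins = defaultdict(list)
    for feats, label in instances:
        bins[feats.get(split_feature)].append((feats, label))

    models = {}
    for key, subset in bins.items():
        labels = [lbl for _, lbl in subset]
        if len(set(labels)) < 2:
            continue  # SVC needs at least two classes per bin
        vec = DictVectorizer(sparse=True)
        X = vec.fit_transform([f for f, _ in subset])
        models[key] = (vec, SVC(kernel='poly', degree=2).fit(X, labels))
    return models
```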

@syllog1sm

Hi,

You may be interested in my write-up on shift-reduce dependency parsing here: https://honnibal.wordpress.com/2013/12/18/a-simple-fast-algorithm-for-natural-language-dependency-parsing/ . Note that the post describes the arc-hybrid transition system, just because it was slightly easier to describe.

The implementation, https://gist.github.com/syllog1sm/10343947 , is BSD licensed, so you can adapt it for use in NLTK directly if you like.

I think there are a few things worth emphasising, as far as performance goes (both efficiency and accuracy).

First, it's really much better to use Averaged Perceptron, or some other method which can be trained in an error-driven way. You don't want to do batch learning. Batch learning makes it difficult to train from negative examples effectively, and this makes a very big difference to accuracy.

Second, when you train the parser, you should really use the Goldberg and Nivre (2012) "dynamic oracle" strategy. This means that you follow the moves the parser predicts during training, rather than forcing the parser to follow only the gold-standard derivations. Again, this makes your training data match the run-time conditions much more closely, which makes it much more accurate.
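Schematically, error-driven training with a dynamic oracle looks like the loop below; every name here is a placeholder rather than a real API:

```python
def train_on_sentence(parser, words, gold_heads):
    """One error-exploration pass over a sentence (schematic sketch).

    The parser follows its own predictions, and the dynamic oracle of
    Goldberg and Nivre (2012) says which transitions are still optimal
    from the current, possibly already-incorrect, configuration.
    """
    config = parser.initial_config(words)
    while not config.is_terminal():
        feats = parser.extract_features(config)
        valid = parser.valid_transitions(config)
        guess = parser.model.best(feats, valid)            # parser's own choice
        zero_cost = parser.dynamic_oracle(config, gold_heads)
        if guess not in zero_cost:
            best = parser.model.best(feats, zero_cost)
            parser.model.update(truth=best, guess=guess, features=feats)
        config = parser.apply(guess, config)               # follow the prediction
```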

Third, I've not taken any trouble to optimise the perceptron in that implementation, since I wanted to keep the code concise for the tutorial. If you map all your strings to integers and use numpy arrays, you should be able to make the parser much faster.
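On that third point, a small sketch of the interning idea: feature strings become row indices into a numpy weight matrix, so scoring is array indexing rather than nested dict lookups (the sizes and names are invented):

```python
import numpy as np

class FeatureInterner:
    """Map feature strings to stable integer ids (illustrative only)."""

    def __init__(self):
        self.ids = {}

    def __call__(self, feature):
        return self.ids.setdefault(feature, len(self.ids))

intern = FeatureInterner()
N_FEATURES, N_CLASSES = 1000000, 4            # invented sizes
weights = np.zeros((N_FEATURES, N_CLASSES))   # dense weight matrix

def score(feature_strings):
    # Summing a handful of numpy rows is far cheaper than walking
    # per-feature Python dicts on every parser transition.
    rows = [intern(f) for f in feature_strings]
    return weights[rows].sum(axis=0)
```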

@longdt219
Contributor

Thank you @syllog1sm. Those are very detailed comments.

@stevenbird stevenbird modified the milestones: 3.0.2, late-December Jan 16, 2015
stevenbird added a commit that referenced this issue Jan 18, 2015
@stevenbird
Member Author

Thanks @syllog1sm – our goal here has just been to implement the published algorithms, for educational purposes. Further extensions are welcome, but beyond the scope of what @longdt219 is able to do. Feel free to submit a pull request any time.

@stevenbird
Member Author

I've updated our roadmap for dependency parser support, including Honnibal's shift-reduce parser, cf.
https://github.com/nltk/nltk/wiki/Dependency-Parsing
