Stabilized MaltParser API #944
This is in response to multiple questions:
- http://stackoverflow.com/questions/14009330/how-to-use-malt-parser-in-python-nltk
- http://stackoverflow.com/questions/21815891/dependency-parser-using-nltk-and-maltparser
- http://stackoverflow.com/questions/20091698/malt-parser-throwing-class-not-found-exception
- http://stackoverflow.com/questions/29513187/maltparser-not-working-in-python-nltk

Using `-cp` is more dynamic than calling the jar file and then using `os.environ` to set up the dependencies.
TODO: train model from scratch
However, problems remain: pre-trained models from http://www.maltparser.org/mco/mco.html output uncased chunk labels, e.g. nsubj, null, dobj, poss:
But DependencyChart is expecting nice chunk tags, e.g. ROOT, SUBJ, SPEC, OBJ. E.g.
The demo is fine when we parse using a model trained from NLTK. But there's still a problem when reading the parses from a pre-trained model in NLTK:
[out]:
Although there was an output file created by MaltParser if we add
@dhgarrette, @kmike, @heatherleaf, @stevenbird. Any idea why the pre-trained model output is unreadable by I'll leave this as it is now and let someone else deal with the dependency parses. I'll go back to the
Thanks Alvations!! I was wondering if you could give an example of how to use it in Python.
@Santosh-Gupta, the
@alvations, that error message was introduced in e0f0630#diff-31ba76604fcce0dbd82cdfd1dba4233d. @dimazest it looks like this change gets in the way of loading pre-trained models. Are you able to investigate please?
Just pinging you again @dimazest
Sorry, I somehow missed the first mention, I'll have a look at this right now...
…ion. This should resolve issues faced at nltk#944. However, there is code that depends on a fake root node, for example the tree visualisation code reads this and FStructure.to_depgraph() sets it.
@dimazest thanks for the PR. @alvations, are you able to load pre-trained models now?
…one() and pre-trained models
Sorry for the late reply. @dimazest thanks for the fix!! @stevenbird, now the malt API works with pre-trained models. I'm not sure why it only works with
But when I tried to do
With help from http://goo.gl/TpW1iY, I managed to get a tree from
[out]:
@dimazest @stevenbird: Fixed at last, now we can easily malt any sentence with the API. And I'll be able to use this for tree2string models in
Thanks @alvations and @dimazest. If either of you has time it would be nice to include a doctest with a little demonstration in the docstring for the MaltParser class, cf: https://github.com/nltk/nltk/blob/develop/nltk/tag/stanford.py#L120
Syncing with bleeding edge develop branch
    (shot I (elephant an) (in (pajamas my)) .)
    """
    def __init__(self, parser_dirname, model_filename=None, tagger=None,
                 additional_java_args=[]):
Please make `additional_java_args=None` and add this:

    if additional_java_args is None:
        additional_java_args = []

as having mutable default parameters might lead to obscure bugs.
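
To illustrate the kind of bug the review comment warns about, here is a minimal sketch. The function names `collect_shared` and `collect_fresh` are hypothetical and are not part of NLTK; they only contrast a list default with the `None` sentinel pattern suggested above.

```python
def collect_shared(arg, java_args=[]):
    # The default list is created once, when the function is defined,
    # so every call that omits java_args appends to the same object.
    java_args.append(arg)
    return java_args


def collect_fresh(arg, java_args=None):
    # None is used as a sentinel; a new list is built on each call.
    if java_args is None:
        java_args = []
    java_args.append(arg)
    return java_args


first = collect_shared("-Xmx512m")
second = collect_shared("-Xmx1024m")
# first and second are the same list, which now holds both arguments.

fresh_a = collect_fresh("-Xmx512m")
fresh_b = collect_fresh("-Xmx1024m")
# fresh_a and fresh_b are independent one-element lists.
```

With the list default, arguments from one call leak into later calls; with the `None` sentinel, each call starts from an empty list.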
@dimazest, @stevenbird It's all patched up.
Thanks @dimazest for the code review, and @alvations for all this work. It's looking good to me, so I'm going to merge.
From #943,
- MaltParser was requiring all sorts of weird `os.environ` to make it find the binary and then call the jar file with the environment java classpath.
- Uses `os.walk` and the full classpath with `org.maltparser.Malt` to call Maltparser instead of `-jar`.
- `generate_malt_command` makes updating the API to suit Maltparser easier.

I've tried with `Maltparser-1.7.2` and `Maltparser-1.8`.
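
The classpath approach described above could look roughly like the sketch below. This is not the code merged in this PR: the helper names `find_jars` and `build_malt_command` and their parameters are assumptions made for illustration. The `-c`, `-i`, `-o` and `-m` options follow the MaltParser command line as I understand it, so check them against the user guide for your version.

```python
import os


def find_jars(parser_dirname):
    """Walk the MaltParser directory and collect every .jar file path."""
    jars = []
    for root, _dirs, files in os.walk(parser_dirname):
        for name in sorted(files):
            if name.endswith(".jar"):
                jars.append(os.path.join(root, name))
    return jars


def build_malt_command(jars, model, input_path, output_path=None,
                       mode="parse", additional_java_args=None):
    """Build a java command that calls org.maltparser.Malt via -cp."""
    if additional_java_args is None:
        additional_java_args = []
    cmd = ["java"] + list(additional_java_args)
    # Put every jar on the classpath instead of relying on `java -jar`.
    cmd += ["-cp", os.pathsep.join(jars)]
    cmd += ["org.maltparser.Malt"]
    cmd += ["-c", model, "-i", input_path]
    if mode == "parse" and output_path is not None:
        cmd += ["-o", output_path]
    cmd += ["-m", mode]
    return cmd
```

The returned list can be passed to `subprocess.call`. Keeping command construction in one function means a change in MaltParser's options only needs to be handled in one place.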