vpekar/stanford-parser-in-jython

stanford-parser-in-jython

A Jython interface to the Stanford parser (v.3.5.0, Java 8, Jython 2.5.2).

Includes various utilities to manipulate parsed sentences:

  • parse text containing XML tags,
  • obtain probabilities for different analyses,
  • extract dependency relations,
  • extract subtrees,
  • find the shortest path between two nodes,
  • print the parse in various formats.

See the examples that follow the if __name__ == "__main__" hooks in the source files.

INSTALLATION:

1. Download the parser from http://nlp.stanford.edu/downloads/lex-parser.shtml
2. Unpack it into a local directory and add the path to stanford-parser.jar to Jython's classpath
3. Pass the path to englishPCFG.ser.gz as the argument to StanfordParser
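One possible way to handle step 2 from inside a script, rather than through the CLASSPATH environment variable, is to append the jar to `sys.path`, which Jython also searches for Java classes. The helper below is a sketch under that assumption; the name `add_jar` is hypothetical and not part of this repository.

```python
import os
import sys


def add_jar(jar_path):
    """Add a jar to sys.path once; return True if it was newly added."""
    jar_path = os.path.abspath(jar_path)
    # Fail early with a clear message instead of a later import error.
    if not os.path.isfile(jar_path):
        raise IOError("jar not found: %s" % jar_path)
    if jar_path in sys.path:
        return False
    sys.path.append(jar_path)
    return True
```

A call such as `add_jar('/path/to/stanford-parser.jar')` (a placeholder path) would go before the parser module is imported.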

USAGE:

Initialize a parser:

    parser = StanfordParser('englishPCFG.ser.gz')

To keep XML tags provided in the input text:

    sentence = parser.parse_xml('This is a <b>test</b>.')

To strip all XML before parsing:

    sentence = parser.parse('This is a <tag>test</tag>')
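As a rough picture of what stripping XML before parsing involves, the sketch below removes anything that looks like a tag with a regular expression. This is an assumption about the general approach, not the code that `parse` actually runs; `strip_tags` and `TAG_RE` are names invented for the example.

```python
import re

# Matches an opening, closing, or self-closing tag such as <b>, </b>, <br/>.
TAG_RE = re.compile(r"<[^>]+>")


def strip_tags(text):
    """Return text with XML-like tags removed and their contents kept."""
    return TAG_RE.sub("", text)


plain = strip_tags('This is a <tag>test</tag>')
```

A regular expression like this is adequate for simple inline markup; input containing comments, CDATA sections, or a literal `>` inside attribute values would need a real XML parser.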

To print the sentence as a table (one word per line):

    sentence.print_table()

To print the sentence as a parse tree:

    sentence.print_tree()

On input, the script accepts unicode strings, or byte strings encoded as UTF-8 or Latin-1.

On output, the script produces unicode.
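Accepting several input encodings usually comes down to trying them in order. The following sketch shows one way this could work and is not taken from the script itself; `to_unicode` is a hypothetical name. It is written for Python 3, where the text type is `str` and the byte type is `bytes`; under Jython 2.x the corresponding types would be `unicode` and `str`.

```python
def to_unicode(text):
    """Return text as a unicode string, guessing UTF-8 first, then Latin-1."""
    if isinstance(text, str):
        # Already decoded; nothing to do.
        return text
    try:
        return text.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this never fails.
        return text.decode("latin-1")
```

UTF-8 has to be tried first: because Latin-1 accepts any byte sequence, trying it first would silently mis-decode UTF-8 input.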
