Python wrapper for Stanford CoreNLP

py-corenlp

Python wrapper for Stanford CoreNLP. This simply wraps the API from the server included with CoreNLP 3.6.0. See the CoreNLP server API documentation for details.

Install

pip install pycorenlp

Usage

First, make sure the Stanford CoreNLP server is running. See the CoreNLP server documentation for instructions on starting it.
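Before creating the wrapper, it can help to confirm that something is actually listening at the server address. The following is a minimal sketch using only the standard library; `server_is_up` is a hypothetical helper, not part of pycorenlp, and the default URL assumes the server runs locally on port 9000 as in the examples in this README.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost:9000", timeout=2.0):
    """Return True if an HTTP server answers at `url`, False otherwise.

    Hypothetical helper for illustration; not part of pycorenlp.
    """
    # Ignore proxy environment variables: the server is assumed to be local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except OSError:
        # Connection refused, timeout, DNS failure and similar errors.
        return False


if __name__ == "__main__":
    print("CoreNLP server reachable:", server_is_up())
```

This only checks that an HTTP server responds; it does not verify that the responder is CoreNLP or that the annotators you need are loaded.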

Setup then only requires passing in the URL of the server:

>>> from pycorenlp import StanfordCoreNLP
>>> nlp = StanfordCoreNLP('http://localhost:9000')

The wrapper supports annotation:

>>> text = (
...   'Pusheen and Smitha walked along the beach. '
...   'Pusheen wanted to surf, but fell off the surfboard.')
>>> output = nlp.annotate(text, properties={
...   'annotators': 'tokenize,ssplit,pos,depparse,parse',
...   'outputFormat': 'json'
...   })
>>> print(output['sentences'][0]['parse'])
(ROOT
  (S
    (NP (NNP Pusheen)
      (CC and)
      (NNP Smitha))
    (VP (VBD walked)
      (PP (IN along)
        (NP (DT the) (NN beach))))
    (. .)))
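With `'outputFormat': 'json'`, the annotation result is a nested dictionary, so other annotations can be read the same way as the parse tree. The sketch below collects word and part-of-speech pairs per sentence. It assumes each sentence carries a `tokens` list whose entries have `word` and `pos` keys, which is how the CoreNLP JSON output is laid out when the `pos` annotator runs; `tagged_tokens` is a hypothetical helper, and `sample` is a hand-written stand-in for a server response so the snippet runs without a server.

```python
def tagged_tokens(output):
    """Return one list of (word, POS tag) pairs per sentence.

    Hypothetical helper; assumes the JSON layout described above.
    """
    return [
        [(tok["word"], tok["pos"]) for tok in sentence["tokens"]]
        for sentence in output["sentences"]
    ]


# Hand-written stand-in for the dictionary returned by nlp.annotate(...).
sample = {
    "sentences": [
        {
            "tokens": [
                {"index": 1, "word": "Pusheen", "pos": "NNP"},
                {"index": 2, "word": "and", "pos": "CC"},
                {"index": 3, "word": "Smitha", "pos": "NNP"},
            ]
        }
    ]
}

print(tagged_tokens(sample))
# [[('Pusheen', 'NNP'), ('and', 'CC'), ('Smitha', 'NNP')]]
```

In real use, pass the `output` returned by `nlp.annotate` instead of `sample`.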

It also supports TokensRegex and Semgrex queries:

>>> nlp.tokensregex(text, pattern='/Pusheen|Smitha/', filter=False)
{u'sentences': [
  {
    u'1': {u'text': u'Smitha', u'begin': 2, u'end': 3},
    u'0': {u'text': u'Pusheen', u'begin': 0, u'end': 1}, u'length': 2
  },
  {u'0': {u'text': u'Pusheen', u'begin': 0, u'end': 1}, u'length': 1}]}
>>> nlp.semgrex(text, pattern='{tag: VBD}', filter=False)
{u'sentences': [
  {u'0': {u'text': u'walked', u'begin': 3, u'end': 4}, u'length': 1},
  {
    u'1': {u'text': u'fell', u'begin': 6, u'end': 7},
    u'0': {u'text': u'wanted', u'begin': 1, u'end': 2}, u'length': 2
  }
]}

The code above is available in example.py.
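The TokensRegex and Semgrex results shown above store each match under a string key (`'0'`, `'1'`, ...) next to a `length` count, and the keys are not guaranteed to appear in order. A small sketch of flattening that structure into ordered tuples follows; `iter_matches` is a hypothetical helper, not part of pycorenlp, and `result` is copied from the TokensRegex output above so the snippet runs without a server.

```python
def iter_matches(result):
    """Yield (sentence index, text, begin, end) for every match, in order.

    Hypothetical helper; relies on the 'length' key and the numbered
    string keys seen in the example output.
    """
    for sent_idx, sentence in enumerate(result["sentences"]):
        for i in range(sentence["length"]):
            match = sentence[str(i)]
            yield sent_idx, match["text"], match["begin"], match["end"]


# Copied from the tokensregex example output above.
result = {
    "sentences": [
        {
            "1": {"text": "Smitha", "begin": 2, "end": 3},
            "0": {"text": "Pusheen", "begin": 0, "end": 1},
            "length": 2,
        },
        {"0": {"text": "Pusheen", "begin": 0, "end": 1}, "length": 1},
    ]
}

for sent_idx, text, begin, end in iter_matches(result):
    print(sent_idx, text, begin, end)
# 0 Pusheen 0 1
# 0 Smitha 2 3
# 1 Pusheen 0 1
```

The `begin` and `end` values are token offsets within each sentence, which is why the second `Pusheen` match also starts at 0.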