Commit 24f8ac4

Michael Young authored and committed
Add CoreNLP server as main parser
1 parent 772d0d1 commit 24f8ac4

10 files changed: +411 -39 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 .DS_Store
 stanford-parser*
+stanford-corenlp*
 build*
 dist*
 Lango.egg-info*

LICENSE.txt

Lines changed: 339 additions & 0 deletions
Large diffs are not rendered by default.

docs.md

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# Docs
+
+Pip Installs
+```
+sphinx-autobuild==0.6.0
+sphinx-rtd-theme==0.1.9
+sphinxcontrib-napoleon==0.5.0
+```
+
+Generate docs
+```
+sphinx-apidoc -f -e -o docs lango
+cd docs
+make html
+```
+

docs/installation.rst

Lines changed: 10 additions & 11 deletions
@@ -8,23 +8,22 @@ Install package with pip
 
     pip install lango
 
-Download Stanford Models and Parser
+Download Stanford CoreNLP
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Make sure you have Java installed for the Stanford parser to work.
+Make sure you have Java installed for the Stanford CoreNLP to work.
 
-`Download Stanford Parser`_
+`Download Stanford CoreNLP`_
 
-Set Environment Variables
+Extract to any folder
+
+Run Server
 ~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Set environment variables for STANFORD\_PARSER and STANFORD\_MODELS to
-where you downloaded the parser.
+In extracted folder, run the following command to start the server:
 
-.. code:: python
+::
 
-    import os
-    os.environ['STANFORD_PARSER'] = 'stanford-parser-full-2015-12-09'
-    os.environ['STANFORD_MODELS'] = 'stanford-parser-full-2015-12-09'
+    java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer
 
-.. _Download Stanford Parser: http://nlp.stanford.edu/software/stanford-parser-full-2015-12-09.zip
+.. _Download Stanford CoreNLP: http://stanfordnlp.github.io/CoreNLP/#download
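
The new install steps require a running server before any parser call succeeds. Below is a minimal sketch, not part of this commit, of how a caller might check that the server is reachable first. It assumes the server answers plain HTTP on `localhost:9000` (the defaults `StanfordServerParser` uses in this commit) and that a GET on the root URL returns status 200. The helper names `server_url` and `server_is_up` are hypothetical, and the sketch targets Python 3.

```python
from urllib.request import urlopen


def server_url(host='localhost', port=9000):
    """Build the base URL the same way StanfordServerParser does in this commit."""
    return 'http://{0}:{1}'.format(host, port)


def server_is_up(url, timeout=2.0):
    """Return True if an HTTP GET on `url` answers with status 200.

    Assumption: the CoreNLP server serves a page at its root URL.
    Connection errors, timeouts and HTTP errors all count as "down".
    """
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return False


if __name__ == '__main__':
    url = server_url()
    if server_is_up(url):
        print('CoreNLP server reachable at', url)
    else:
        print('No server at', url, '- start it with the java command above')
```

Checking up front gives a clearer message than the failure a parser call would otherwise produce when the server is not running.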

examples/matching.py

Lines changed: 2 additions & 5 deletions
@@ -1,13 +1,10 @@
 from collections import OrderedDict
 import os
-from lango.parser import StanfordLibParser
+from lango.parser import StanfordServerParser
 from lango.matcher import match_rules
 
 
-os.environ['STANFORD_PARSER'] = 'stanford-parser-full-2015-12-09'
-os.environ['STANFORD_MODELS'] = 'stanford-parser-full-2015-12-09'
-
-parser = StanfordLibParser()
+parser = StanfordServerParser()
 
 sents = [
     'Call me an Uber.',

examples/parser_input.py

Lines changed: 2 additions & 5 deletions
@@ -1,12 +1,9 @@
 import os
-from lango.parser import StanfordLibParser
+from lango.parser import StanfordServerParser
 from lango.matcher import match_rules
 
-os.environ['STANFORD_PARSER'] = 'stanford-parser-full-2015-12-09'
-os.environ['STANFORD_MODELS'] = 'stanford-parser-full-2015-12-09'
-
 def main():
-    parser = StanfordLibParser()
+    parser = StanfordServerParser()
     while True:
         try:
             line = raw_input("Enter line: ")

lango/parser.py

Lines changed: 29 additions & 2 deletions
@@ -1,5 +1,7 @@
-from nltk.parse.stanford import StanfordParser
+from nltk.parse.stanford import StanfordParser, GenericStanfordParser
 from nltk.internals import find_jars_within_path
+from nltk.tree import Tree
+from pycorenlp import StanfordCoreNLP
 
 
 class Parser:
@@ -25,6 +27,7 @@ def parse(self, line):
 
         Returns:
             Tree object representing parsed sentence
+            None if parse fails
         """
         tree = list(self.parser.raw_parse(line))[0]
         tree = tree[0]
@@ -37,4 +40,28 @@ def __init__(self):
         self.parser = StanfordParser(
             model_path='edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz')
         stanford_dir = self.parser._classpath[0].rpartition('/')[0]
-        self.parser._classpath = tuple(find_jars_within_path(stanford_dir))
+        self.parser._classpath = tuple(find_jars_within_path(stanford_dir))
+
+
+class StanfordServerParser(Parser, GenericStanfordParser):
+    """Follow the readme to setup the Stanford CoreNLP server"""
+    def __init__(self, host='localhost', port=9000):
+        url = 'http://{0}:{1}'.format(host, port)
+        self.nlp = StanfordCoreNLP(url)
+
+    def _make_tree(self, result):
+        return Tree.fromstring(result)
+
+    def parse(self, sent):
+        output = self.nlp.annotate(sent, properties={
+            'annotators': 'parse',
+            'outputFormat': 'json'
+        })
+
+        # Got random html, return empty tree
+        if isinstance(output, unicode):
+            return Tree('', [])
+
+        parse_output = output['sentences'][0]['parse'] + '\n\n'
+        tree = next(next(self._parse_trees_output(parse_output)))[0]
+        return tree
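
The new `parse` method does two things: it falls back to an empty tree when the server response is not a decoded JSON object, and otherwise it reads the bracketed parse string of the first sentence. The sketch below illustrates that flow without a server, NLTK, or pycorenlp. It is not the commit's implementation: `extract_parse`, `read_tree`, and the canned `sample_output` are hypothetical, and the tuple-based tree is a stand-in for `nltk.tree.Tree`. It targets Python 3, whereas the committed code checks for the Python 2 `unicode` type.

```python
def extract_parse(output):
    """Return the bracketed parse of the first sentence, or None.

    Mirrors the guard in the commit: anything that is not a decoded
    JSON object (for example an HTML error page returned as text) is
    treated as a failed parse.
    """
    if not isinstance(output, dict):
        return None
    sentences = output.get('sentences') or []
    if not sentences:
        return None
    return sentences[0].get('parse')


def read_tree(text):
    """Parse a bracketed string into nested (label, children) tuples."""
    tokens = text.replace('(', ' ( ').replace(')', ' ) ').split()
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == '(', 'expected an opening bracket'
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ')':
            if tokens[pos] == '(':
                children.append(node())      # nested constituent
            else:
                children.append(tokens[pos])  # leaf word
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    return node()


# Canned response shaped like the JSON the commit indexes into
# (output['sentences'][0]['parse']); the sentence itself is made up.
sample_output = {
    'sentences': [
        {'parse': '(ROOT\n  (S\n    (NP (PRP I))\n    (VP (VBP run))))'}
    ]
}

if __name__ == '__main__':
    print(read_tree(extract_parse(sample_output)))
    print(extract_parse('<html>error</html>'))
```

Returning `None` here is a simplification; the committed method returns `Tree('', [])` in the failure case.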

readme.md

Lines changed: 10 additions & 12 deletions
@@ -13,21 +13,19 @@ Lango is a natural language processing library for working with the building blo
 pip install lango
 ```
 
-### Download Stanford Models and Parser
+### Download Stanford CoreNLP
 
-Make sure you have Java installed for the Stanford parser to work.
+Make sure you have Java installed for the Stanford CoreNLP to work.
 
-[Download Stanford Parser](http://nlp.stanford.edu/software/stanford-parser-full-2015-12-09.zip)
+[Download Stanford CoreNLP](http://stanfordnlp.github.io/CoreNLP/#download)
 
-### Set Environment Variables
+Extract to any folder
 
-Set environment variables for STANFORD_PARSER and STANFORD_MODELS to where you
-downloaded the parser.
+### Run the Stanford CoreNLP server
 
-```python
-import os
-os.environ['STANFORD_PARSER'] = 'stanford-parser-full-2015-12-09'
-os.environ['STANFORD_MODELS'] = 'stanford-parser-full-2015-12-09'
+Run the following command in the folder where you extracted Stanford CoreNLP
+```
+java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer
 ```
 
 ## Docs
@@ -218,10 +216,10 @@ Returned context:
 Full code:
 
 ```python
-from lango.parser import StanfordLibParser
+from lango.parser import StanfordServerParser
 from lango.matcher import match_rules
 
-parser = StanfordLibParser()
+parser = StanfordServerParser()
 
 rules = {
     '( S ( NP:np ) ( VP ( VBD:action-o ) ( PP:pp ) ) )': {

requirements.txt

Lines changed: 1 addition & 3 deletions
@@ -1,4 +1,2 @@
 nltk==3.1
-sphinx-autobuild==0.6.0
-sphinx-rtd-theme==0.1.9
-sphinxcontrib-napoleon==0.5.0
+pycorenlp==0.3.0

setup.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 
 setup(
     name='Lango',
-    version='0.11',
+    version='0.12',
     description='Natural Language Framework for Matching Parse Trees and Modeling Conversation',
     packages=find_packages(),
     author='Michael Young',
