Extract only simple NPs from Penn Treebank
evelinacs committed Jul 27, 2018
1 parent 15e9b83 commit c498a88
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions exp/alto/tools/get_nps_from_treebanks.py
@@ -4,5 +4,6 @@
     tree_file = "wsj_{}.mrg".format(str(n).zfill(4))
     sentences = treebank.parsed_sents(tree_file)
     for s in sentences:
-        for subtree in s.subtrees(lambda t: t.label().startswith("NP")):
-            print(subtree)
+        for subtree in s.subtrees(lambda t: t.label() == "NP"):
+            print(subtree.pformat(100000))
+            #print(subtree)
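
The effect of switching the filter from a prefix match to an exact match can be sketched on a toy tree. This is only an illustration, not part of the commit: the sentence, the `np_labels` helper, and the variable names are made up here, and it assumes NLTK is installed. Penn Treebank labels such as `NP-SBJ` carry function tags, so `startswith("NP")` selects them while `== "NP"` does not, which is presumably what the commit title means by "simple NPs". The large `pformat` margin keeps each subtree on a single line.

```python
from nltk.tree import Tree

# Toy bracketed parse in Penn Treebank style (invented for illustration).
toy = Tree.fromstring(
    "(S (NP-SBJ (DT The) (NN cat)) (VP (VBD saw) (NP (DT a) (NN dog))))"
)


def np_labels(tree, keep):
    """Hypothetical helper: labels of all subtrees whose label satisfies `keep`."""
    return [t.label() for t in tree.subtrees(lambda t: keep(t.label()))]


# Old filter: any label beginning with "NP", including function-tagged ones.
prefix_match = np_labels(toy, lambda label: label.startswith("NP"))

# New filter: only subtrees labelled exactly "NP".
exact_match = np_labels(toy, lambda label: label == "NP")

# A very large margin stops pformat from wrapping the tree across lines.
one_line = [
    t.pformat(margin=100000)
    for t in toy.subtrees(lambda t: t.label() == "NP")
]

print(prefix_match)
print(exact_match)
print(one_line)
```

With one subtree per output line, the result is easier to feed into line-oriented tools than the default multi-line pretty print.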
