Natural language processing in Clojure/ClojureScript based on the Stanford-CoreNLP parser.
Warning: Under heavy rewrite. Please refrain from trying to use this until it is complete!
To tokenize a sentence:
(use 'corenlp) (def text "This is a simple sentence.") (tokenize text)
To tag a tokenized sentence with parts of speech:
(use 'corenlp) (pos-tag (tokenize "Colorless green ideas sleep furiously.")) ;; => [#<TaggedWord Colorless/JJ> #<TaggedWord green/JJ> ...]
This returns a list of TaggedWord objects. Call .tag() on a TaggedWord to get its tag. For more information, see the relevant Javadoc.
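As a minimal sketch of reading tags through Java interop, the snippet below pairs each word with its tag. It assumes pos-tag returns Stanford TaggedWord objects as described above; the names tagged and word-tag-pairs are only illustrative, and the library's API may change during the rewrite.

```clojure
(use 'corenlp)

;; Tag a sentence; each element is assumed to be a Stanford TaggedWord.
(def tagged
  (pos-tag (tokenize "Colorless green ideas sleep furiously.")))

;; .word and .tag are Java methods on TaggedWord, called via interop.
;; The result is a vector of [word tag] pairs.
(def word-tag-pairs
  (mapv (fn [tw] [(.word tw) (.tag tw)]) tagged))
```

Plain vectors of strings are easier to filter or group with ordinary Clojure sequence functions than the Java objects themselves.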
To parse a sentence:
(use 'corenlp) (parse (tokenize text))
You will get back a LabeledScoredTreeNode, which you can pass to other Stanford CoreNLP functions or convert to a standard Treebank string with:
(str (parse (tokenize text)))
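Because the result is a Stanford tree object, it can also be inspected directly through Java interop. The sketch below assumes the returned node extends edu.stanford.nlp.trees.Tree and uses that class's getLeaves, depth, and label methods; the var names are illustrative only.

```clojure
(use 'corenlp)

(def tree (parse (tokenize "This is a simple sentence.")))

;; Leaf nodes of the parse tree, rendered as strings.
(def leaves (mapv str (.getLeaves tree)))

;; Depth of the tree, as reported by Tree.depth().
(def tree-depth (.depth tree))

;; Label of the root node, rendered as a string.
(def root-label (str (.label tree)))
```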
To build a dependency graph:
(dependency-graph "I like cheese.")
This parses the sentence and returns the dependency graph as a loom graph, which you can then traverse with standard graph algorithms such as shortest path. You can also view it:
(def graph (dependency-graph "I like cheese.")) (use 'loom.io) (view graph)
This requires GraphViz to be installed.
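The graph traversal mentioned above might look like the following sketch, which uses loom's nodes, edges, and bf-path functions. How this library represents nodes in the dependency graph is not specified here, so the example simply picks the first and last nodes that loom reports; that choice is an assumption made for illustration.

```clojure
(use 'corenlp)
(require '[loom.graph :as lg]
         '[loom.alg :as alg])

(def graph (dependency-graph "I like cheese."))

;; List the nodes and edges of the dependency graph.
(def nodes (lg/nodes graph))
(def edges (lg/edges graph))

;; Breadth-first path between two arbitrarily chosen nodes.
;; bf-path returns nil when no path exists, which can happen
;; if the graph is directed and the nodes are picked in the
;; wrong order.
(def path (alg/bf-path graph (first nodes) (last nodes)))
```

Unlike view, none of these calls need GraphViz.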
Copyright (C) 2011-2016 Contributors (Clojure code only)
Distributed under the Eclipse Public License, the same as Clojure.
- Cory Giles
- Hans Engel
- Damien Stanton