Example Project for Natural Language Processing and Machine Learning Libraries

plandes/clj-example-nlp-ml

Note: Another repository has a more up-to-date and better example of a real (academic) project that shows how to use these libraries.

This is a simple and small example of how to use the following libraries:

This project extends Carin Meier's speech act classifier. None of her code is used; only her data, found in the resources directory, is used for training and testing.

This project also illustrates how to use the action command line interface library, which lets you build a CLI version of the classifier.

Documentation

API (incomplete) documentation.

Usage

This project provides a real, working example of a statistical natural language processing program. Its code is used as the example in the documentation of the libraries it uses (see top of this README). To use it, clone the repository and build with lein (see Command line below).

REPL

user> (System/setProperty "zensols.model" "path-to-model")
user> (require '[zensols.example.sa-model :as sa])
user> (sa/classify-utterance "when will we get there")
INFO  2016-07-15 18:19:00.957: stanford: parsing: <when will we get there>
INFO  2016-07-15 18:19:00.979: stanford: creating tagger model at .../stanford/pos/english-left3words-distsim.tagger
INFO  2016-07-15 18:19:01.565: stanford: creating ner annotators: ["edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz"]
=> "question"

Command line

  1. Install Leiningen (this is just a script)
  2. Install GNU make
  3. Install Git
  4. Follow the directions in the Building section
  5. Create the distribution on the desktop: make dist
  6. Start the Elasticsearch server using the ML Dataset project
  7. Load the corpus into Elasticsearch: cd ~/Desktop/nlp-ml-example/bin ; ./saclassify load-corpus
  8. Run: ./saclassify classify -u 'when will we get there'
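
Steps 5, 7 and 8 above can be strung together in a small script. The following is a minimal sketch, not part of the project: it assumes steps 1 to 4 are done, that the Elasticsearch server from step 6 is already running, and that it is called from the project root. The names `run`, `classify_demo`, `DRY_RUN` and `DIST_BIN` are hypothetical helpers introduced here for illustration.

```shell
#!/bin/sh
# Sketch of steps 5, 7 and 8 as one function (assumption: not shipped with the project).

# DIST_BIN mirrors the path used in step 7; override it if your distribution lands elsewhere.
DIST_BIN="${DIST_BIN:-$HOME/Desktop/nlp-ml-example/bin}"

# run: print a command, then execute it unless DRY_RUN=1 (hypothetical helper).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# classify_demo [UTTERANCE]: build the distribution, load the corpus, classify one utterance.
classify_demo() {
    utterance="${1:-when will we get there}"
    run make dist || return 1
    (
        # The subshell keeps the caller's working directory unchanged.
        run cd "$DIST_BIN" || exit 1
        run ./saclassify load-corpus || exit 1
        run ./saclassify classify -u "$utterance"
    )
}
```

After sourcing the file, `classify_demo 'when will we get there'` runs the whole sequence. Setting `DRY_RUN=1` first only prints the commands, which is a cheap way to check the paths before loading the corpus.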

Building

All Leiningen tasks work in this project. For additional build functionality (such as git tag convenience utilities), clone the Clojure build repo into this project's parent directory so that it sits alongside this project:

   cd ..
   git clone https://github.com/plandes/clj-zenbuild
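
If the build repo might already be checked out, the clone can be guarded so that it is safe to repeat. This is a hedged sketch under that assumption; `fetch_zenbuild` and `ZENBUILD_URL` are hypothetical names, and only the repository URL comes from the instructions above.

```shell
#!/bin/sh
# Sketch: clone the build repo next to this project unless it is already there.

ZENBUILD_URL="https://github.com/plandes/clj-zenbuild"

# fetch_zenbuild [PARENT_DIR]: PARENT_DIR defaults to the parent of the current directory.
fetch_zenbuild() {
    parent="${1:-..}"
    target="$parent/clj-zenbuild"
    if [ -d "$target" ]; then
        # Nothing to do; leave the existing checkout untouched.
        echo "already present: $target"
    else
        git clone "$ZENBUILD_URL" "$target"
    fi
}
```

Calling `fetch_zenbuild` from the project root has the same effect as the two commands above on a fresh machine and does nothing on later runs.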

License

Copyright (c) 2016, 2017, 2018 Paul Landes

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.