Skip to content
Go to file

NLP Pure

Gem Version Code Climate Build Status Coverage Status

Natural language processing algorithms implemented in pure Ruby with minimal dependencies.

NOTE: this is not affiliated with, endorsed by, or in any way connected with Pure NLP, a trademark of John La Valle.

This project aims to provide functionality similar to Treat, open-nlp, and stanford-core-nlp but with fewer dependencies. The code is tested against English language but the algorithm implementations aim to be flexible for other languages.

Table of Contents


Add this line to your application’s Gemfile:

gem 'nlp-pure'

And then execute:

$ bundle

Or install it yourself as:

$ gem install nlp-pure


Simply require a library file and start using its interfaces! To preserve modularity and a small installation footprint, classes and modules are not recursively loaded up front.

Word Segmentation

$ bundle exec irb
irb(main):001:0> require 'nlp_pure/segmenting/default_word'
=> true
irb(main):002:0> NlpPure::Segmenting::DefaultWord.parse 'The quick brown fox jumps over the lazy dog.'
=> ["The", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog."]
irb(main):003:0> NlpPure::Segmenting::DefaultWord.parse 'The New York-based company hired new staff.'
=> ["The", "New", "York", "based", "company", "hired", "new", "staff."]
irb(main):004:0> NlpPure::Segmenting::DefaultWord.parse 'The U.S.A. is a member of NATO.'
=> ["The", "U.S.A.", "is", "a", "member", "of", "NATO."]
irb(main):005:0> NlpPure::Segmenting::DefaultWord.parse "Mary had a little lamb,\nHis fleece was white as snow,\nAnd everywhere that Mary went,\nThe lamb was sure to go."
=> ["Mary", "had", "a", "little", "lamb,", "His", "fleece", "was", "white", "as", "snow,", "And", "everywhere", "that", "Mary", "went,", "The", "lamb", "was", "sure", "to", "go."]

Sentence Segmentation

M017-PDX:nlp-pure rp0616$ bundle exec irb
irb(main):001:0> require 'nlp_pure/segmenting/default_sentence'
=> true
irb(main):002:0> NlpPure::Segmenting::DefaultSentence.parse 'The U.S.A. is a member of NATO.'
=> ["The U.S.A. is a member of NATO."]
irb(main):003:0> NlpPure::Segmenting::DefaultSentence.parse 'Mary had a little lamb. The lamb\U+FFE2s fleece was white as snow. Everywhere that Mary went, the lamb was sure to go.'
=> ["Mary had a little lamb.", "The lambs fleece was white as snow.", "Everywhere that Mary went, the lamb was sure to go."]
irb(main):004:0> NlpPure::Segmenting::DefaultSentence.parse 'I am excited! Today is Friday.'
=> ["I am excited!", "Today is Friday."]

Supported Ruby Versions

This library aims to support and is tested against the following Ruby implementations:

If something doesn't work on one of these interpreters, it's a bug.

This library may inadvertently work (or seem to work) on other Ruby implementations, however support will only be provided for the versions listed above.


This library aims to adhere to Semantic Versioning 2.0.0. Violations of this scheme should be reported as bugs. Specifically, if a minor or patch version is released that breaks backward compatibility, that version should be immediately yanked and/or a new version should be immediately released that restores compatibility. Breaking changes to the public API will only be introduced with new major versions. As a result of this policy, you can (and should) specify a dependency on this gem using the Pessimistic Version Constraint with two digits of precision. For example:

spec.add_dependency 'nlp-pure', '~> 0.1'

See Also

Search “nlp” at


Natural language processing algorithms implemented in pure Ruby with minimal dependencies





No packages published


You can’t perform that action at this time.