Just a list of stop words with a handy Module

STOPWORDS

REALLY JUST A LIST OF STOPWORDS WITH SOME HELPERS

Extracted from a larger project, but worth breaking out for reuse.

USAGE


	
require 'stopwords'

# List all stop words
Stopwords::STOP_WORDS

# Test whether a token is a stop word
Stopwords.is?('and')
# => true

# Check that a token is both a 'word' and not a stop word
Stopwords.valid?('vector')
# => true
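
A common use of the predicate above is dropping stop words from a list of tokens. The sketch below is only an illustration: so that it runs on its own, it uses a stand-in module, DemoStopwords, with a three-word list. Both the module name and the helper remove_stop_words are assumptions, not part of this library. With the library loaded you would presumably call Stopwords.is? in the same place.

```ruby
# Stand-in for the real library so this sketch is self-contained.
# DemoStopwords and its tiny word list are assumptions for illustration only.
module DemoStopwords
  STOP_WORDS = %w[and the of].freeze

  # Mirrors the shape of Stopwords.is? shown in the usage section.
  def self.is?(token)
    STOP_WORDS.include?(token)
  end
end

# Hypothetical helper: keep only the tokens that are not stop words.
def remove_stop_words(tokens)
  tokens.reject { |token| DemoStopwords.is?(token) }
end

remove_stop_words(%w[the vector and the matrix])
# => ["vector", "matrix"]
```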

SPECS


$ rake specs

SANITIZE

Not part of the library, but you should probably sanitize tokens before using them (if your tokenizer doesn’t already do so).


SANITIZE_REGEXP = /('|\"|‘|’|\/|\\)/
text.downcase.gsub(SANITIZE_REGEXP, '')
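
To show the sanitizing step in context, here is a minimal sketch that wraps the two lines above in a method and splits the result on whitespace. The method name sanitize_tokens and the whitespace split are assumptions for illustration. The library itself does not provide them.

```ruby
# Same pattern as above: strips straight and curly quotes, slashes, and backslashes.
SANITIZE_REGEXP = /('|\"|‘|’|\/|\\)/

# Hypothetical helper (not part of the library): lowercase the text,
# remove the characters matched by SANITIZE_REGEXP, then split on whitespace.
def sanitize_tokens(text)
  text.downcase.gsub(SANITIZE_REGEXP, '').split
end

sanitize_tokens("Don't stop the 'Vector'")
# => ["dont", "stop", "the", "vector"]
```

The resulting tokens can then be checked one by one with Stopwords.is? or Stopwords.valid?.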

ENDAX

Software Services shop (primarily Ruby) in Brooklyn, NY.
