
removing stopwords and squeezing whitespaces

1 parent eafbec2 commit ea1ef88f24e1b491adbf1426d0d41b04d43c43dc @ikh831 committed Jun 7, 2012
Showing with 5 additions and 1 deletion.
  1. +5 −1 lib/stopwords.rb
@@ -39,5 +39,9 @@ def self.is?(token)
def self.valid?(token)
(((token =~ TOKEN_REGEXP) == 0)) and !(STOP_WORDS.member?(token))
end
+
+ def self.remove(string)
+ string.downcase.gsub(/\b(#{STOP_WORDS.join('|')})\b/mi, '').squeeze(" ").strip
+ end
-end
+end
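
A note on the new `remove` method: it interpolates `STOP_WORDS.join('|')` directly into the pattern, so any stop word containing a regex metacharacter would change the pattern's meaning, and the regex is rebuilt on every call. The `i` and `m` flags also appear to have no effect here, since the string is already downcased and the pattern contains no `.`. Below is a minimal standalone sketch of one possible alternative using `Regexp.union`. The module name `StopwordsSketch`, the constant `STOP_WORDS_REGEXP`, and the short word list are assumptions made for illustration; the real list lives in `lib/stopwords.rb` and is not shown in this diff.

```ruby
# Hypothetical stand-in for the module patched above. The word list here is
# illustrative only; the actual STOP_WORDS is defined in lib/stopwords.rb.
module StopwordsSketch
  STOP_WORDS = %w[a an and of the to].freeze

  # Regexp.union escapes each word, so entries containing regex
  # metacharacters cannot alter the pattern. Built once instead of per call.
  STOP_WORDS_REGEXP = /\b#{Regexp.union(STOP_WORDS)}\b/

  # Downcase first (as in the commit), drop stop words, collapse runs of
  # spaces, and trim the ends.
  def self.remove(string)
    string.downcase.gsub(STOP_WORDS_REGEXP, '').squeeze(' ').strip
  end
end

puts StopwordsSketch.remove('The  quick fox and the hound')
```

As in the committed version, the result is lowercased as a side effect, and `squeeze(' ')` collapses only repeated spaces, not tabs or newlines.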
