This is a quick project to demonstrate/prototype keyword searches for suspicious texts, with features to be added over time.
Briefly, Keywords extracts the words in a text file that are longer than a specified length, filters out common words, and groups the remaining words by stem/root word.
Please don't use this for any sort of production work.
I mean it, seriously. For the sake of expedience, I'm using a search engine as a search engine and (to use the term a bit liberally) spidering another site. I'm also assuming that neither site's layout will ever change. I even take a very naive view of HTML structure, just for the sake of it.
The number of things that can go wrong and the number of people you might offend are astronomical. So, just don't actually use the thing for anything more than a quick test or a learning experience.
Rather than try to reinvent the wheel with my own stemming algorithm, I happily use the stemmify gem to collect words by (likely) common root.
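As a rough illustration of the extract/filter/group steps described above, here is a minimal sketch. It is not the project's actual code: the method name `keywords_by_stem`, the `longer_than:` and `stemmer:` parameters, and the tiny `STOP_WORDS` list are all hypothetical. So that it runs without any gems installed, it uses a crude suffix-stripping lambda as a stand-in stemmer; assuming the stemmify gem adds a `stem` method to `String` (as its documentation describes), you could pass `stemmer: ->(w) { w.stem }` after `require 'stemmify'` instead.

```ruby
require 'set'

# Hypothetical stop-word list; the real project's list is not shown here.
STOP_WORDS = Set.new(%w[the and that this with from have were their there about would which])

# Stand-in stemmer (an assumption, NOT the stemmify algorithm): strips a few
# common English suffixes so the sketch stays dependency-free.
NAIVE_STEMMER = lambda do |word|
  word.sub(/(ing|ed|es|s)\z/, '')
end

# Returns a hash mapping each stem to the distinct words that share it.
def keywords_by_stem(text, longer_than: 3, stemmer: NAIVE_STEMMER)
  words = text.downcase.scan(/[a-z]+/)                  # split into lowercase words
  words = words.select { |w| w.length > longer_than }   # keep only the longer words
  words = words.reject { |w| STOP_WORDS.include?(w) }   # drop common words
  words.uniq.group_by { |w| stemmer.call(w) }           # collect by (likely) root
end
```

Calling `keywords_by_stem(File.read("some_file.txt"))` (file name hypothetical) would return a hash such as `"delay" => ["delayed", "delaying", "delays"]`. The naive stemmer will miss or mangle many words that a real stemming algorithm handles, which is one more reason to lean on the gem.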