if it makes you say "ugh why would you write a script to do that with words"
then it belongs here
welcome to shittynlp
- simple lexical metrics for spotting obviously autogenerated strings (frequency of case changes & alpha<->numeric changes between adjacent characters? let's say every word in the unix dictionary should be marked non-generated.)
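
a rough sketch of what that metric could look like, not a finished tool. everything named here (`count_changes`, `change_rate`, `looks_generated`, `load_words`) is made up for illustration, the `0.3` threshold is a guess that has not been tuned on anything, and `/usr/share/dict/words` is only the usual location of the unix dictionary, so `load_words` returns an empty set if it isn't there.

```python
from pathlib import Path


def load_words(path="/usr/share/dict/words"):
    """Load the unix dictionary as a lowercase set (assumed path; empty set if missing)."""
    p = Path(path)
    if not p.is_file():
        return set()
    lines = p.read_text(errors="ignore").splitlines()
    return {line.strip().lower() for line in lines if line.strip()}


def count_changes(s):
    """Return (case_changes, alnum_changes) counted over adjacent character pairs."""
    case = 0
    alnum = 0
    for a, b in zip(s, s[1:]):
        if a.isalpha() and b.isalpha():
            # letter -> letter: count a flip between upper and lower case
            if a.isupper() != b.isupper():
                case += 1
        elif (a.isalpha() and b.isdigit()) or (a.isdigit() and b.isalpha()):
            # letter <-> digit boundary
            alnum += 1
    return case, alnum


def change_rate(s):
    """Fraction of adjacent pairs that are a case change or an alpha<->numeric change."""
    if len(s) < 2:
        return 0.0
    case, alnum = count_changes(s)
    return (case + alnum) / (len(s) - 1)


def looks_generated(s, words=frozenset(), threshold=0.3):
    """Dictionary words are never generated; otherwise compare the rate to a guessed threshold."""
    if s.lower() in words:
        return False
    return change_rate(s) > threshold
```

usage would be something like `looks_generated("xK9fQ2", load_words())`. note that a plain threshold will also flag things like `iPhone` (two case changes in six characters), so the dictionary check is doing real work and the cutoff would need tuning against actual data.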