Scan your text for possible style problems such as passive tense, duplicated words and clichés.
Ruby
Switch branches/tags
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
bin
lib
spec
.gitignore
.rspec
.rvmrc
Gemfile
Rakefile
generate_stemmed_cliches.rb
readme.textile
style-scanner.gemspec

readme.textile

Style-Scanner

Style-Scanner, written by Jack Kinsella (blog), helps you improve your English language writing. Based on stylistic advice given by authors such as Skrunk and White, Bill Bryson and George Orwell, Style-Scanner scans your text and markup files then lists issues found. Currently it spots the following problems:

  • Use of 600 different clichés
    He pushed the envelope
  • Use of 11 different forms of passive tense.
    The letter was sent
  • Broken links
    http://www.thiswebsite404s.com
  • Ugly words
    The government utilized the resources.
  • Speaking in generalities
    Many people think that
  • Useless words
    It was a very fast app.
  • Repeating the same non structural word within a sentence
    Trask makes an excellent point, forcefully and with taste, in his excellent work ‘Mind The Gaffe’
  • Using latin abbreviations within your text
    E.g.
  • Spelling mistakes
    addd
  • Repeating the same word consecutively
    We went to the the shopping centre.
  • Excess white space
    We went to the shopping centre.
  • Capitalization Problems
    On monday we will have an exam.

Linguistic analysis is a hard problem, so you cannot expect perfect results. Think of Style-Scanner as guidelines to alert you of
possible faults. *Remember at best prescriptive stylistic guides can bring you up to a B minus standard of writing. * Anything more is beyond the scope of this tool.

Installation

Install Style-Scanner using Ruby gems:

gem install style-scanner

Next install Hunspell, an open source spell-checker that Style-Scanner depends upon.

brew install hunspell

Usage

Style-Scanner is command line only. Cd into the directory with the files you’d like to scan, then type

style-scanner filename

After a few seconds Style-Scanner will print a list problems it finds.

Style-Scanner is pipe friendly.

cat filename | style-scanner

See all command line options

style-scanner -h

By default some scans, such as the Adverb scan, are turned off. Use command line options to enable them.

style-scanner -a filename #scan for adverbs

Scan HTML files
style-scanner filename.html --help

Scan Textile files
style-scanner filename.html -t

Dependencies

Gems

punkt-segmenter
Splits text into sentences.

colorize
Formats terminal output in red,green and other colours.

entagger
Assigns parts of speech tags to English text based on a lookup dictionary and a set of probability values.

RedCloth
Converts Textile to HTML

Sanitize
Converts HTML to plain text

Other

Hunspell
An open source spell checker used by Open Office, Mozilla Firefox and Google Chrome.

Contributing

Contribution is welcomed and encouraged, although no modifications will be accepted if they are not accompanied by passing tests.

Parts of Speech

Tag Part Of Speech Examples
CC Conjunction, coordinating and, or
CD Adjective, cardinal number 3, fifteen
DET Determiner this, each, some
EX Pronoun, existential there there
FW Foreign words
IN Preposition / Conjunction for, of, although, that
JJ Adjective happy, bad
JJR Adjective, comparative happier, worse
JJS Adjective, superlative happiest, worst
LS Symbol, list item A, A.
MD Verb, modal can, could, ’ll
NN Noun aircraft, data
NNP Noun, proper London, Michael
NNPS Noun, proper, plural Australians, Methodists
NNS Noun, plural women, books
PDT Determiner, prequalifier quite, all, half
POS Possessive ’s, ’
PRP Determiner, possessive second mine, yours
PRPS Determiner, possessive their, your
RB Adverb often, not, very, here
RBR Adverb, comparative faster
RBS Adverb, superlative fastest
RP Adverb, particle up, off, out
SYM Symbol *
TO Preposition to
UH Interjection oh, yes, mmm
VB Verb, infinitive take, live
VBD Verb, past tense took, lived
VBG Verb, gerund taking, living
VBN Verb, past/passive participle taken, lived
VBP Verb, base present form take, live
VBZ Verb, present 3SG -s form takes, lives
WDT Determiner, question which, whatever
WP Pronoun, question who, whoever
WPS Determiner, possessive & question whose
WRB Adverb, question when, how, however
PP Punctuation, sentence ender , !, ?
PPC Punctuation, comma ,
PPD Punctuation, dollar sign $
PPL Punctuation, quotation mark left ``
PPR Punctuation, quotation mark right ’’
PPS Punctuation, colon, semicolon, elipsis :, …, –
LRB Punctuation, left bracket (, {, [
RRB Punctuation, right bracket ), }, ]

Roadmap

1 Add more acronyms
2 Refactor the collection methods.
4 More useless words.
5 Troublesome words from Stunk.
8 Tense switches
9 Improve tests by:
a Creating a sentence helper
b Making a better Rspec matcher (more informative erros)
c Automatically detecting the correct problem
10 Desktop app
11 Globalize spell check
12 Most overused words
13 Suggestions for synonyms
14 Code overview screencast

Contributors

Jack Kinsella
Robert Dooley, developer in Galway Western, Ireland

Licence

Style-Scanner is released under the MIT license
Copyright Jack Kinsella