a Unix CLI for normalizing whitespace in text

nws — whitespace normalization

nws is a Unix CLI that normalizes whitespace in text, offering several modes, grouped into two categories:

  • Whitespace transliteration modes:

Line endings can be changed to be Windows- or Unix-specific, and select
Unicode whitespace and punctuation can be replaced with their closest ASCII
equivalents.

  • Whitespace condensing modes:

Leading and trailing runs of any mix of tabs and spaces are trimmed, and each
line-internal run is replaced with a single space. The individual modes in this category differ only in how multi-line input is treated.
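
The per-line part of the condensing modes can be approximated with standard tools. The following is a minimal sketch using POSIX sed, not nws's actual implementation; the function name condense_line is hypothetical and chosen only for illustration.

```shell
# condense_line: hypothetical helper (not part of nws) that approximates
# the per-line condensing behavior on stdin.
condense_line() {
  # 1. Collapse every run of spaces/tabs into one space.
  # 2. Drop the single leading space that may remain.
  # 3. Drop the single trailing space that may remain.
  sed -e 's/[[:blank:]]\{1,\}/ /g' \
      -e 's/^ //' \
      -e 's/ $//'
}
```

For example, piping the output of printf '  a \t b  \n' through condense_line yields a b.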

Input can be provided either via filename arguments or via stdin. Option -i offers in-place updating.

See the examples below, get concise usage information further below, or read the manual.

Examples


Transliteration Examples

# Converts a CRLF line-endings file (Windows) to a LF-only file (Unix).
# No output is produced, because the file is updated in-place; a backup
# of the original file is created with suffix '.bak'. 
$ nws --mode lf --in-place=.bak from-windows.txt

# Converts a LF-only file (Unix) to a CRLF line-endings file (Windows).
# No output is produced, because the file is updated in-place; since no
# backup suffix is specified, no backup file is created.
$ nws --crlf -i from-unix.txt

# Converts select Unicode whitespace and punctuation chars. to their 
# closest ASCII equivalents and sends the output to a different file.
# Note that any other non-ASCII characters are left untouched.
# Helpful for converting code samples that were formatted for display back to
# valid source code. 
$ nws --ascii unicode-punct.txt > ascii-punct.txt 
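
For readers curious what the two line-ending modes amount to, here is a rough sketch using POSIX awk. It is an illustration only, not how nws itself is implemented; to_lf and to_crlf are hypothetical names. Stripping an existing trailing CR before appending CRLF keeps the conversion idempotent, which mirrors the behavior the changelog describes for --crlf as of v0.3.2.

```shell
# Hypothetical helpers, not part of nws; both read stdin and write stdout.

# Convert CRLF line endings to LF-only by removing a trailing CR per line.
to_lf() {
  awk '{ sub(/\r$/, ""); print }'
}

# Convert line endings to CRLF. A trailing CR is removed first, so that
# input that is already CRLF-terminated passes through unchanged.
to_crlf() {
  awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }'
}
```

Unlike nws with -i, these helpers never modify a file in place, and awk always terminates the last output line with a line ending even if the input lacked one.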

Condensing Examples

  • Output from the example commands is piped to cat -et to better illustrate the output; cat -et shows line endings as $ and control characters as ^<char>; e.g., a tab shows as ^I.

# -- Single-input-line normalization (mode option doesn't apply).

$ nws <<<'    I   will   be normalized.   ' | cat -et 
I will be normalized.$
  # Ditto, but with a mix of spaces and tabs, provided via a pipe.
$ printf ' I \t\t will   be normalized.\t\t\n' | nws | cat -et 
I will be normalized.$

# -- Multi-input-line normalizations, using different modes.

  # Create demo file.
$ cat <<EOF > /tmp/nws-demo

    $(printf '\t')

one
  two  

   $(printf '\t')

three


EOF

  # Multi-paragraph mode - by default, or with `--mp` or `-m mp` or 
  # `--mode multi-para`.
  # In addition to line-internal normalization, 
  # folds runs of blank/empty lines into 1 empty line each.
$ nws < /tmp/nws-demo | cat -et
$
one$
two$
$
three$
$

  # Single-paragraph mode: `--sp` or `-m sp` or `--mode single-para`
  # In addition to line-internal normalization, 
  # removes all blank/empty lines.
$ nws --sp < /tmp/nws-demo | cat -et
one$
two$
three$

  # Flattened-multi-paragraph mode: `--fp` or `-m fp` or `--mode flat-para`
  # In addition to line-internal normalization, 
  # joins paragraph-internal lines with a space each.
$ nws --fp < /tmp/nws-demo | cat -et
$
one two$
$
three$
$

  # Single-output-line mode: `--sl` or `-m sl` or `--mode single-line`.
  # In addition to line-internal normalization, 
  # joins all non-empty/non-blank lines with a space each
  # to form a single, long output line.
$ nws --sl < /tmp/nws-demo | cat -et
one two three$
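
The four condensing modes differ only in how blank lines and line joins are handled, which a short POSIX awk script can illustrate. This is a sketch for understanding the modes, not nws's implementation; the function name condense and its mode argument (mp, sp, fp, or sl) are hypothetical.

```shell
# condense: hypothetical helper (not part of nws) that emulates the four
# condensing modes on stdin. Usage: condense [mp|sp|fp|sl] < file
condense() {
  awk -v mode="${1:-mp}" '
    function flush() {                 # emit the pending joined line, if any
      if (buf != "") { print buf; buf = "" }
    }
    {
      gsub(/[ \t]+/, " ")              # collapse runs of tabs and spaces
      sub(/^ /, ""); sub(/ $/, "")     # trim both ends
      if ($0 == "") {                  # blank or empty line
        if (mode == "mp" || mode == "fp") {
          if (mode == "fp") flush()    # a blank line ends the paragraph
          if (!blank) print ""         # fold a run of blank lines into one
        }
        blank = 1                      # sp and sl drop blank lines entirely
        next
      }
      blank = 0
      if (mode == "mp" || mode == "sp") print
      else buf = (buf == "" ? $0 : buf " " $0)   # fp, sl: join with a space
    }
    END { flush() }'
}
```

Run against the demo file created above, condense fp < /tmp/nws-demo | cat -et should reproduce the --fp output shown earlier, and likewise for the other three modes.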

Installation

Supported platforms

  • When installing from the npm registry: Linux and OSX
  • When installing manually: any Unix-like platform with Bash and POSIX-compatible utilities.

Installation from the npm registry

Note: Even if you don't use Node.js, its package manager, npm, works across platforms and is easy to install; try curl -L http://git.io/n-install | bash

With Node.js or io.js installed, install the package as follows:

[sudo] npm install nws-cli -g

Note:

  • Whether you need sudo depends on how you installed Node.js / io.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
  • The -g ensures global installation and is needed to put nws in your system's $PATH.

Manual installation

  • Download the CLI as nws.
  • Make it executable with chmod +x nws.
  • Move it or symlink it to a folder in your $PATH, such as /usr/local/bin (OSX) or /usr/bin (Linux).

Usage

Find concise usage information below; for complete documentation, read the manual online or, once installed, run man nws (nws --man if installed manually).

$ nws --help


Normalizes whitespace in one of several modes.

    nws [-m <mode>] [[-i[<ext>]] file...]

    Condensing <mode>s:

    All these modes normalize runs of tabs and spaces to a single space  
    each and trim leading and trailing runs; they only differ with respect to
    how multi-line input is processed.

    mp   (default) multi-paragraph: folds multiple blank lines into one
    fp   flattened multi-paragraph: normalizes each paragraph to single line
    sp   single-paragraph: removes all blank lines.
    sl   single-line: normalizes to single output line

    Transliteration <mode>s:

    lf     translates line endings to LF-only (\n)
    crlf   translates line endings to CRLF (\r\n)
    ascii  translates Unicode whitespace and punctuation to ASCII

    Alternatively, specify mode values directly as options; e.g., --sp in lieu
    of -m sp

    Standard options: --help, --man, --version, --home

License

Copyright (c) 2015-2016 Michael Klement <mklement0@gmail.com> (http://same2u.net), released under the MIT license.

Acknowledgements

This project gratefully depends on the following open-source components, according to the terms of their respective licenses.

npm dependencies below have optional suffixes denoting the type of dependency; the absence of a suffix denotes a required run-time dependency: (D) denotes a development-time-only dependency, (O) an optional dependency, and (P) a peer dependency.

npm dependencies

Changelog

Versioning complies with semantic versioning (semver).

  • v0.3.2 (2016-12-11):

    • [fix] Mode --crlf is now idempotent with input that is already CRLF-terminated (previously, an extra CR was mistakenly added).
  • v0.3.1 (2016-12-10):

    • [doc] Copy-editing in read-me file.
  • v0.3.0 (2016-11-13):

    • [BREAKING CHANGE] nws is now file-based: operands are interpreted as filenames, and option -i allows in-place updating. Use stdin to provide strings as input, such as via echo ... | nws ....
    • [enhancement] New transliteration modes added for changing line-ending styles and for translating non-ASCII Unicode whitespace/punctuation to their closest ASCII equivalents.
  • v0.2.0 (2015-09-18):

    • [usability improvement] New, mnemonic mode names supersede the old numeric normalization modes (option-arguments for -m); mode names come in both short and long forms; similarly, --mode is now supported as a verbose alternative to -m.
    • [deprecation] The numeric modes (0..3) still work, but should no longer be used and are no longer documented.
    • [doc] nws now has a man page (if manually installed, use nws --man); nws -h now just prints concise usage information.
  • v0.1.4 (2015-09-15):

    • [dev] Makefile improvements; various other behind-the-scenes tweaks.
  • v0.1.3 (2015-06-13):

    • [doc] Read-me improvements.
  • v0.1.2 (2015-06-13):

    • [doc] Read-me improvements.
  • v0.1.1 (2015-06-13):

    • [doc] Read-me improvements.
  • v0.1.0 (2015-06-13):

    • Initial release.