mklement0/nws-cli

nws — whitespace normalization

nws is a Unix CLI that normalizes whitespace in text, offering several modes, grouped into two categories:

  • Whitespace transliteration modes:

Line endings can be changed to be Windows- or Unix-specific, and select
Unicode whitespace and punctuation can be replaced with their closest ASCII
equivalents.

  • Whitespace condensing modes:

Trims leading and trailing whitespace from each line and replaces line-internal
runs of any mix of tabs and spaces with a single space each. The individual modes in this category differ only with respect to how multi-line input is treated.
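
To make the per-line step concrete, here is a minimal sketch that approximates it with sed. It is an illustrative stand-in only: the function name `condense_line` is hypothetical, and nothing here is taken from how nws itself is implemented.

```shell
# Hypothetical helper, not part of nws: approximates line-internal condensing.
condense_line() {
  # 1. Replace every run of tabs and/or spaces with a single space.
  # 2. Remove the single leading space that step 1 may leave behind.
  # 3. Remove the single trailing space that step 1 may leave behind.
  sed -E 's/[[:blank:]]+/ /g; s/^ //; s/ $//'
}

printf ' I \t\t will   be normalized.\t\t\n' | condense_line
```

The sketch handles one line at a time and leaves blank lines in place as empty lines; how runs of blank lines are treated is what distinguishes the individual condensing modes.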

Input can be provided either via filename arguments or via stdin. Option -i offers in-place updating.

See the examples below, get concise usage information further below, or read the manual.

Examples

Transliteration Examples

# Converts a CRLF line-endings file (Windows) to a LF-only file (Unix).
# No output is produced, because the file is updated in-place; a backup
# of the original file is created with suffix '.bak'. 
$ nws --mode lf --in-place=.bak from-windows.txt

# Converts a LF-only file (Unix) to a CRLF line-endings file (Windows).
# No output is produced, because the file is updated in-place; since no
# backup suffix is specified, no backup file is created.
$ nws --crlf -i from-unix.txt

# Converts select Unicode whitespace and punctuation chars. to their 
# closest ASCII equivalents and sends the output to a different file.
# Note that any other non-ASCII characters are left untouched.
# Helpful for converting code samples that were formatted for display back to
# valid source code. 
# IMPORTANT: This only works with properly encoded UTF-8 files.
$ nws --ascii unicode-punct.txt > ascii-punct.txt 
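
To see what the two line-ending modes do at the byte level, the following sketch reproduces their described effect with awk. The function names `to_lf`, `to_crlf` and `show_eol` are hypothetical stand-ins used for illustration; they are not part of nws and do not show its implementation.

```shell
# Illustrative stand-ins only; not nws's own code.

# Strip one trailing CR per line, yielding LF-only line endings.
to_lf() { awk '{ sub(/\r$/, ""); print }' "$@"; }

# Strip any trailing CR first, then terminate each line with CRLF.
# Removing the CR before re-adding it keeps already-CRLF input unchanged.
to_crlf() { awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$@"; }

# Make line endings visible: CR prints as R, LF as N.
show_eol() { tr '\r\n' 'RN'; echo; }

printf 'one\r\ntwo\r\n' | to_lf | show_eol            # oneNtwoN
printf 'one\ntwo\n' | to_crlf | show_eol              # oneRNtwoRN
printf 'one\ntwo\n' | to_crlf | to_crlf | show_eol    # oneRNtwoRN (unchanged)
```

The last command illustrates idempotence: converting input that is already CRLF-terminated must not add a second CR. Note that awk always terminates the final line, even if the input lacks a trailing newline.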

Condensing Examples

  • Output from the example commands is piped to cat -et to make whitespace visible: cat -et shows line endings as $ and control chars. as ^<char>; e.g., a tab shows as ^I.
# -- Single-input-line normalization (mode option doesn't apply).

$ nws <<<'    I   will   be normalized.   ' | cat -et 
I will be normalized.$
  # Ditto, but with a mix of spaces and tabs (passed via stdin, because
  # operands are interpreted as filenames).
$ nws <<<"$(printf ' I \t\t will   be normalized.\t\t')" | cat -et 
I will be normalized.$

# -- Multi-input-line normalizations, using different modes.

  # Create demo file.
$ cat <<EOF > /tmp/nws-demo

    $(printf '\t')
  
one
  two  

   $(printf '\t')

three


EOF

  # Multi-paragraph mode - by default, or with `--mp` or `-m mp` or 
  # `--mode multi-para`.
  # In addition to line-internal normalization, 
  # folds runs of blank/empty lines into 1 empty line each.
$ nws < /tmp/nws-demo | cat -et
$
one$
two$
$
three$
$

  # Single-paragraph mode: `--sp` or `-m sp` or `--mode single-para`
  # In addition to line-internal normalization, 
  # removes all blank/empty lines.
$ nws --sp < /tmp/nws-demo | cat -et
one$
two$
three$

  # Flattened-multi-paragraph mode: `--fp` or `-m fp` or `--mode flat-para`
  # In addition to line-internal normalization, 
  # joins paragraph-internal lines with a space each.
$ nws --fp < /tmp/nws-demo | cat -et
$
one two$
$
three$
$

  # Single-output-line mode: `--sl` or `-m sl` or `--mode single-line`.
  # In addition to line-internal normalization, 
  # joins all non-empty/non-blank lines with a space each
  # to form a single, long output line.
$ nws --sl < /tmp/nws-demo | cat -et
one two three$
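
The four condensing modes differ only in how they treat blank lines and paragraph boundaries. The following sketch mimics each mode's described behavior with awk, under the assumption that a "blank" line is one containing only tabs and spaces. The function names `mode_mp`, `mode_sp`, `mode_fp` and `mode_sl` are hypothetical and are not how nws is implemented.

```shell
# Illustrative stand-ins only; not nws's own code.
# In all functions, `$1 = $1` makes awk rebuild the line from its fields,
# which trims it and collapses runs of tabs/spaces to single spaces.

# mp: fold each run of blank lines into one empty line.
mode_mp() { awk 'NF { $1 = $1; print; blank = 0; next } !blank++ { print "" }' "$@"; }

# sp: drop all blank lines.
mode_sp() { awk 'NF { $1 = $1; print }' "$@"; }

# fp: join the lines of each paragraph with a space; fold blank runs as in mp.
mode_fp() {
  awk '
    NF { $1 = $1; para = (para == "" ? $0 : para " " $0); blank = 0; next }
    {
      if (para != "") { print para; para = "" }
      if (!blank++) print ""
    }
    END { if (para != "") print para }
  ' "$@"
}

# sl: join all non-blank lines with a space into one output line.
mode_sl() { awk 'NF { $1 = $1; printf "%s%s", sep, $0; sep = " " } END { print "" }' "$@"; }

# Same content as the demo file above, written to a temporary file.
demo=$(mktemp)
printf '\n    \t\n  \none\n  two  \n\n   \t\n\nthree\n\n\n' > "$demo"

mode_mp "$demo" | cat -et
mode_sp "$demo" | cat -et
mode_fp "$demo" | cat -et
mode_sl "$demo" | cat -et

rm -f "$demo"
```

With the demo input, each function produces the same output as the corresponding nws example above, which makes the sketch a convenient way to compare the modes side by side.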

Installation

Supported platforms

  • When installing from the npm registry: Linux and OSX
  • When installing manually: any Unix-like platform with Bash and POSIX-compatible utilities.

Installation from the npm registry

Note: Even if you don't use Node.js, its package manager, npm, works across platforms and is easy to install; try curl -L http://git.io/n-install | bash

With Node.js or io.js installed, install the package as follows:

[sudo] npm install nws-cli -g

Note:

  • Whether you need sudo depends on how you installed Node.js / io.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
  • The -g ensures global installation and is needed to put nws in your system's $PATH.

Manual installation

  • Download the CLI as nws.
  • Make it executable with chmod +x nws.
  • Move it or symlink it to a folder in your $PATH, such as /usr/local/bin (OSX) or /usr/bin (Linux).
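
The last two steps can be sketched as a small shell function. The name `install_cli` and its parameters are hypothetical; the sketch assumes the script has already been downloaded and that the target folder exists and is in your $PATH.

```shell
# Hypothetical helper, not part of nws: makes a downloaded script executable
# and symlinks it into a folder that is assumed to be in $PATH.
install_cli() {
  src=$1                          # path to the downloaded file, e.g. ./nws
  bin_dir=${2:-/usr/local/bin}    # target folder; defaults to /usr/local/bin

  [ -f "$src" ] || { echo "not a file: $src" >&2; return 1; }
  chmod +x "$src" || return 1

  # Resolve to an absolute path so that the symlink works from any directory.
  abs_src=$(cd "$(dirname "$src")" && pwd)/$(basename "$src")
  ln -sf "$abs_src" "$bin_dir/$(basename "$src")"
}

# Example (commented out, because it modifies the system):
# install_cli ./nws /usr/local/bin
```

Writing to a system folder such as /usr/local/bin may require elevated privileges. Symlinking rather than moving keeps the downloaded file where it is, so replacing that file later also updates the installed command.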

Usage

Find concise usage information below; for complete documentation, read the manual online or, once installed, run man nws (nws --man if installed manually).

$ nws --help


Normalizes whitespace in one of several modes.

    nws [-m <mode>] [[-i[<ext>]] file...]

    Condensing <mode>s:

    All these modes normalize runs of tabs and spaces to a single space  
    each and trim leading and trailing runs; they only differ with respect to
    how multi-line input is processed.

    mp   (default) multi-paragraph: folds multiple blank lines into one
    fp   flattened multi-paragraph: normalizes each paragraph to single line
    sp   single-paragraph: removes all blank lines.
    sl   single-line: normalizes to single output line

    Transliteration <mode>s:

    lf     translates line endings to LF-only (\n)
    crlf   translates line endings to CRLF (\r\n)
    ascii  translates Unicode whitespace and punctuation to ASCII

Alternatively, specify mode values directly as options; e.g., --sp in lieu  
of -m sp

Standard options: --help, --man, --version, --home

License

Copyright (c) 2015-2017 Michael Klement <mklement0@gmail.com> (http://same2u.net), released under the MIT license.

Acknowledgements

This project gratefully depends on the following open-source components, according to the terms of their respective licenses.

npm dependencies below have optional suffixes denoting the type of dependency; the absence of a suffix denotes a required run-time dependency: (D) denotes a development-time-only dependency, (O) an optional dependency, and (P) a peer dependency.

npm dependencies

Changelog

Versioning complies with semantic versioning (semver).

  • v0.3.4 (2017-09-06):

    • [doc] Clarified that --mode ascii (--ascii) only works with properly encoded UTF-8 files.
  • v0.3.3 (2017-09-05):

    • [enhancement] The error message for -i mode now reports the count of input files if the pre-updating check fails; with invocations that are potentially batched via xargs, this at least provides a hint that only a given batch failed.
    • [doc] Fixed typo in man page.
  • v0.3.2 (2016-12-11):

    • [fix] Mode --crlf is now idempotent with input that is already CRLF-terminated (previously, an extra CR was mistakenly added).
  • v0.3.1 (2016-12-10):

    • [doc] Copy-editing in read-me file.
  • v0.3.0 (2016-11-13):

    • [BREAKING CHANGE] nws is now file-based: operands are interpreted as filenames, and option -i allows in-place updating. Use stdin to provide strings as input, such as via echo ... | nws ....
    • [enhancement] New transliteration modes added for changing line-ending styles and for translating non-ASCII Unicode whitespace/punctuation to their closest ASCII equivalents.
  • v0.2.0 (2015-09-18):

    • [usability improvement] New, mnemonic mode names supersede the old numeric normalization modes (option-arguments for -m); mode names come in both short and long forms; similarly, --mode is now supported as a verbose alternative to -m.
    • [deprecation] The numeric modes (0..3) still work, but should no longer be used and are no longer documented.
    • [doc] nws now has a man page (if manually installed, use nws --man); nws -h now just prints concise usage information.
  • v0.1.4 (2015-09-15):

    • [dev] Makefile improvements; various other behind-the-scenes tweaks.
  • v0.1.3 (2015-06-13):

    • [doc] Read-me improvements.
  • v0.1.2 (2015-06-13):

    • [doc] Read-me improvements.
  • v0.1.1 (2015-06-13):

    • [doc] Read-me improvements.
  • v0.1.0 (2015-06-13):

    • Initial release.