a command line tool for scrubbing html/xml artifacts from csv data

About

csvcleaner is a command line application that converts csv data containing markup language elements or entities, URL-encoded strings, and escaped strings into human-readable text.

csvcleaner has poor test coverage and is not intended for use in other projects at this time. It is meant for quickly fixing a few very specific problems in a csv file from the command line.

Examples

Default Fixes

Encoded/Escaped Strings/Entities

Before: Sam%20and%20Bob

After: Sam and Bob
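
For illustration, the decoding in this example matches what JavaScript's built-in decodeURIComponent does. The sketch below is not taken from csvcleaner's source: the helper name decodeField is hypothetical, and returning the original text when the input is malformed is an assumption.

```javascript
// Hypothetical helper (not csvcleaner's actual code): decode a URL-encoded field.
function decodeField(value) {
  try {
    return decodeURIComponent(value);
  } catch (err) {
    // decodeURIComponent throws a URIError on malformed sequences such as "100%".
    // This sketch assumes that leaving the field unchanged is acceptable.
    return value;
  }
}

console.log(decodeField("Sam%20and%20Bob")); // Sam and Bob
```

The try/catch matters for real csv data, where a literal percent sign in a field would otherwise abort the whole run.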

HTML Elements

Before: <font face=""Verdana, Arial, Helvetica, sans-serif"" size=""2"">Yes.

After: Yes.
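
A minimal sketch of this kind of tag removal, assuming a simple regular expression is good enough for the data at hand. The helper name stripTags is hypothetical and this is not necessarily how csvcleaner does it.

```javascript
// Hypothetical helper (not csvcleaner's actual code): drop anything that looks like a tag.
function stripTags(value) {
  // Matches "<", then any run of characters other than ">", then ">".
  return value.replace(/<[^>]*>/g, "");
}

// The doubled quotes are csv escaping for a quote inside a quoted field.
const before = '<font face=""Verdana, Arial, Helvetica, sans-serif"" size=""2"">Yes.';
console.log(stripTags(before)); // Yes.
```

A pattern this simple also removes non-markup text that sits between angle brackets, and it breaks on a ">" inside an attribute value, so it suits only quick cleanup of known data.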

XML Elements

Before:

After: Failure to...

Optional Fixes

Mis-Encoded Characters

Before: Apostrophe’s, “Double quotesâ€, ‘single quotes’, en—dash, em–dash, hyphen (•)…

After: Apostrophe's, "Double quotes", 'single quotes', en-dash, em-dash, hyphen (-)…

Note: Only use the option to find/replace mis-encoded characters if you actually have mis-encoded characters. Do not examine your csv data in Excel to determine if you have mis-encoded characters.
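
The Before/After pair above can be reproduced with a small find/replace table. This is a sketch under assumptions, not csvcleaner's actual replacement list: the helper name fixMisencoded and the table contents are hypothetical, and they cover only the sequences produced when UTF-8 punctuation is read as Windows-1252.

```javascript
// Hypothetical replacement table (not csvcleaner's actual list).
// Each key is the three-character sequence that appears when the UTF-8 bytes
// of a punctuation mark are decoded as Windows-1252.
const REPLACEMENTS = [
  ["\u00e2\u20ac\u2122", "'"], // right single quote (U+2019)
  ["\u00e2\u20ac\u02dc", "'"], // left single quote (U+2018)
  ["\u00e2\u20ac\u0153", '"'], // left double quote (U+201C)
  ["\u00e2\u20ac\u201d", "-"], // em dash (U+2014)
  ["\u00e2\u20ac\u201c", "-"], // en dash (U+2013)
  ["\u00e2\u20ac\u00a2", "-"], // bullet (U+2022)
  ["\u00e2\u20ac\u009d", '"'], // right double quote (U+201D), third byte kept
  ["\u00e2\u20ac", '"'],       // right double quote, third byte lost; must come last
];

function fixMisencoded(text) {
  // Apply the rules in order so the two-character prefix is only replaced
  // after every longer sequence that starts with it.
  return REPLACEMENTS.reduce(
    (result, [bad, good]) => result.split(bad).join(good),
    text
  );
}

console.log(fixMisencoded("Apostrophe\u00e2\u20ac\u2122s")); // Apostrophe's
```

Running a table like this on correctly encoded data can corrupt it, which is why the option should stay off unless the file really contains these sequences.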

Installation

csvcleaner requires node.js. To install:

npm install csvcleaner -g

It is also possible to incorporate csvcleaner in other projects. However, csvcleaner is intended for use as a command line tool. csvcleaner always reads from a file and writes to a new file. To add to a project:

npm install csvcleaner --save

Usage

Basic Usage

<infile> <outfile>

Applies default processing to all columns and rows.

csvcleaner ~/bad-csv-data.csv ./fixed-csv-data.csv

Note: csvcleaner will not overwrite an existing file when run from the command line. You must specify a new file to receive the fixed csv data.
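
One way a node.js tool can enforce this kind of rule is the "wx" file flag, which makes a write fail if the path already exists. The sketch below is an assumption about how such a guard could look, not csvcleaner's actual implementation. The helper name writeNewFile is hypothetical.

```javascript
const fs = require("fs");

// Hypothetical helper: write data only if outfile does not exist yet.
// Returns true when the file was written, false when it already existed.
function writeNewFile(outfile, data) {
  try {
    // The "wx" flag behaves like "w" but fails if the path already exists.
    fs.writeFileSync(outfile, data, { flag: "wx" });
    return true;
  } catch (err) {
    if (err.code === "EEXIST") return false; // refuse to overwrite
    throw err; // any other error (permissions, missing directory) is unexpected
  }
}
```

Letting the file system reject the write avoids the race between a separate existence check and the write itself.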

Advanced Usage

<infile> <outfile> [columns...]

Applies default processing to specified columns.

csvcleaner ~/bad-csv-data.csv ./fixed-csv-data.csv ColumnA ColumnB
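
Conceptually, restricting processing to named columns means applying a fix to some fields of each row and leaving the rest alone. The sketch below assumes the rows are already parsed into objects keyed by column header. The helper name applyToColumns and that row shape are hypothetical and do not describe csvcleaner's internals.

```javascript
// Hypothetical helper: apply `fix` only to the named columns of parsed rows.
// `rows` is assumed to be an array of objects keyed by column header.
function applyToColumns(rows, columns, fix) {
  return rows.map((row) => {
    const out = { ...row }; // copy so the input rows are not mutated
    for (const name of columns) {
      if (typeof out[name] === "string") out[name] = fix(out[name]);
    }
    return out;
  });
}

const rows = [{ ColumnA: "Sam%20and%20Bob", ColumnB: "x%2Cy", ColumnC: "50%25" }];
console.log(applyToColumns(rows, ["ColumnA", "ColumnB"], decodeURIComponent));
// ColumnC keeps its original value because it was not listed.
```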

Options

<infile> <outfile> [options...]

Applies optional processing to all columns.

csvcleaner ~/bad-csv-data.csv ./fixed-csv-data.csv -c

Note: Option flags can also be combined with specific columns to process.