I need to script a lot of CSV stream cleanup.


KILGORE TROUT

	A tabular data manipulation tool for the command line

ABOUT

	This is just an outgrowth of a cleanup project I'm working on. It grew
	into a pretty useful tool, but it's bumping up against the functionality
	of csvkit (csvkit.readthedocs.io), which is better, so use that.

PROS

	One cool feature of kilgore is that you can write procedures -- Python
	scripts that do something "domain-specific" -- add them to the
	package, and then call them with the --load argument. This is
	how I kept the project-specific functionality separate from the
	general-purpose CSV manipulation.
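
	To make the idea of a loadable procedure concrete, here is a minimal
	sketch of a name-to-function registry. It is not kilgore's actual
	code: the names PROCEDURES, register, load and strip_whitespace are
	all hypothetical, and rows are modeled as plain dicts rather than a
	Pandas table.

```python
# Hypothetical sketch of a procedure registry; not kilgore's real implementation.
from typing import Callable, Dict, List

Row = Dict[str, str]
Procedure = Callable[[List[Row]], List[Row]]

# Maps a procedure name to the function that implements it.
PROCEDURES: Dict[str, Procedure] = {}


def register(name: str) -> Callable[[Procedure], Procedure]:
    """Decorator that stores a procedure under a name for later lookup."""
    def wrap(func: Procedure) -> Procedure:
        PROCEDURES[name] = func
        return func
    return wrap


@register("strip_whitespace")
def strip_whitespace(rows: List[Row]) -> List[Row]:
    """Example domain-specific step: trim whitespace from keys and cells."""
    return [{k.strip(): v.strip() for k, v in row.items()} for row in rows]


def load(name: str) -> Procedure:
    """Look up a registered procedure, as a --load style option might."""
    try:
        return PROCEDURES[name]
    except KeyError:
        raise SystemExit(f"unknown procedure: {name}")


# Made-up sample row, only to show the call pattern.
rows = [{" School ": " Lincoln "}]
cleaned = load("strip_whitespace")(rows)
```

	Keeping project-specific steps behind a name lookup like this means
	the general-purpose code never has to import them directly.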

CONS

	This was not built for efficiency. This is mostly just a Pandas wrapper,
	and each transformation you do adds a (usually) O(n) operation.

	Again, csvkit is better and more robust, so use that.
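
	The cost described above, one full pass over the table per
	transformation, can be sketched roughly as follows. This uses the
	standard library csv module instead of Pandas to stay dependency-free,
	and the step names (drop_column, prettify_names) and the prettifying
	rule are assumptions for illustration, not kilgore's own.

```python
# Hypothetical sketch: each step walks every row once, so k steps cost k passes.
import csv
from typing import Callable, Dict, List

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]


def drop_column(name: str) -> Step:
    """Return a step that makes one full pass to remove a column."""
    return lambda rows: [
        {k: v for k, v in row.items() if k != name} for row in rows
    ]


def prettify_names() -> Step:
    """Return a step that makes one full pass to tidy column names.

    The rule used here (lowercase, underscores) is an assumption.
    """
    return lambda rows: [
        {k.strip().lower().replace(" ", "_"): v for k, v in row.items()}
        for row in rows
    ]


def run(text: str, skiprows: int, steps: List[Step]) -> List[Row]:
    """Skip leading lines, parse the rest as CSV, then apply each step in order."""
    # Assumes no quoted fields with embedded newlines.
    lines = text.splitlines()[skiprows:]
    rows = list(csv.DictReader(lines))
    for step in steps:
        rows = step(rows)  # one O(n) pass per step
    return rows


# Made-up sample input with one junk line before the header.
raw = "Report header line\nSchool Name,Notes,Rate\nLincoln,n/a,0.12\n"
table = run(raw, skiprows=1, steps=[drop_column("Notes"), prettify_names()])
```

	Fusing the steps into a single pass would be faster, but keeping
	them separate is simpler to reason about, which is the trade-off
	noted above.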

INSTALLATION

	pip install git+https://github.com/jakekara/kilgoretrout

EXAMPLE


DEMO 

	The demo folder contains some raw data, some cleaned up data in both CSV
	and JSON formats, and a scripts folder containing all the code to get
	the data from raw to clean.

	The scripts are meant to be run from the demo directory, since their
	paths are relative. To run them all at once, use:

	      $ scripts/all.sh

	scripts:
		scripts/forceutf8.sh           - use iconv to make sure the stream
		                                 is valid UTF-8 only, dropping
		                                 bad characters
		scripts/absenteeism.sh         - create tables related to chronic
		                                 absenteeism
		scripts/master_school_list.sh  - create master list of schools
		scripts/all_to_json.sh         - convert all the CSVs created
		                                 with the above scripts to JSON
		scripts/all.sh                 - run all of the scripts above
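
	For readers who want the gist of the UTF-8 cleanup and JSON
	conversion steps listed above without reading the shell scripts,
	here is a rough Python equivalent. It is a sketch only: the demo
	scripts themselves use iconv and kilgore, and the function names
	force_utf8 and csv_to_json are hypothetical.

```python
# Hypothetical Python stand-ins for the forceutf8 and all_to_json steps.
import csv
import io
import json


def force_utf8(data: bytes) -> str:
    """Decode bytes as UTF-8, silently dropping invalid byte sequences."""
    return data.decode("utf-8", errors="ignore")


def csv_to_json(text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return json.dumps(rows, ensure_ascii=False)


# Made-up sample: the \xff byte is not valid UTF-8 and gets dropped.
raw = b"school,rate\nLinc\xffoln,0.12\n"
clean = force_utf8(raw)
as_json = csv_to_json(clean)
```

	Dropping bad bytes loses data silently, so it is worth checking how
	many characters were removed before trusting the cleaned output.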

			

USAGE

$ kilgore --help
usage: kilgore [-h] [-s SKIPROWS] [-t] [-d DROP] [-a ADD] [-p] [-j] [-f] [-i INPUT] [-c COLNAMES] [-r ROWS] [-l LOAD]

Kilgore - tabular data stream manipulation

optional arguments:
  -h, --help            show this help message and exit
  -s SKIPROWS, --skiprows SKIPROWS
                        Number of rows to skip
  -t, --transpose       Transpose table
  -d DROP, --drop DROP  Drop (exclude) a column
  -a ADD, --add ADD     Add (include) a column (supercedes drop)
  -p, --pretty          Prettify column names
  -j, --json
  -f, --force           Read all input as strings
  -i INPUT, --input INPUT
                        File to process
  -c COLNAMES, --colnames COLNAMES
                        set columns: --colnames=[COL1,COL2,COL3,...]
  -r ROWS, --rows ROWS  print N rows
  -l LOAD, --load LOAD  load registered procedure
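
The --colnames format shown in the help text above can be parsed with
an argparse type function. The sketch below covers only three of the
options and is not kilgore's actual parser; the column_list helper and
the choice to accept the list with or without brackets are assumptions.

```python
# Hypothetical sketch of parsing a few kilgore-style options with argparse.
import argparse
from typing import List


def column_list(value: str) -> List[str]:
    """Parse 'COL1,COL2' or '[COL1,COL2]' into a list of column names."""
    inner = value.strip()
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    return [name.strip() for name in inner.split(",") if name.strip()]


parser = argparse.ArgumentParser(prog="kilgore-sketch")
parser.add_argument("-s", "--skiprows", type=int, default=0,
                    help="Number of rows to skip")
parser.add_argument("-c", "--colnames", type=column_list,
                    help="set columns: --colnames=[COL1,COL2,COL3,...]")
parser.add_argument("-j", "--json", action="store_true")

# Example invocation with made-up column names.
args = parser.parse_args(
    ["--skiprows", "2", "--colnames=[School,District,Rate]", "--json"]
)
```

In a real shell, the bracketed value may need quoting so the shell
does not treat the brackets as a glob pattern.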
