shuffle

Shuffle is a command line tool for working with and transforming delimiter-separated values (DSV) files, such as CSV (RFC 4180), tab-delimited TXT, and so on.

[Screenshot of shuffle in action]

Shuffle can correctly guess most common delimiters. The file in the screenshot is pipe-delimited.
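To illustrate what delimiter guessing involves, here is a minimal Python sketch built on the standard library's `csv.Sniffer`. It is not Shuffle's own algorithm, which this README does not describe; the `guess_delimiter` helper and the sample data are hypothetical.

```python
import csv


def guess_delimiter(sample: str, candidates: str = ",\t|;") -> str:
    """Guess the delimiter of a DSV sample (hypothetical helper, not Shuffle's code)."""
    # csv.Sniffer picks the candidate character that appears a consistent
    # number of times on every line of the sample.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


# Made-up pipe-delimited sample, similar in spirit to the screenshot.
sample = "id|name|city\n1|Ada|London\n2|Linus|Helsinki\n"
print(repr(guess_delimiter(sample)))
```

Restricting the candidates to a short list of common delimiters keeps letters and digits that happen to repeat evenly from being mistaken for a separator.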

Features

Typing shuffle in the terminal prints the full list of features, which includes:

  • Pretty printing
  • Joining, merging, and reordering/subsetting
  • Calculating statistics and frequency counts for each column
  • Converting files to JSON and SQLite databases (with type-casting!)
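As a rough illustration of what converting a delimited file to SQLite "with type-casting" can mean, the Python sketch below infers a column type (INTEGER, REAL, or TEXT) from the data before inserting it. This is an assumption about the general technique, not Shuffle's implementation; `infer_type`, `dsv_to_sqlite`, and the sample data are hypothetical names, and the sketch loads everything into memory, which Shuffle is designed to avoid.

```python
import csv
import io
import sqlite3


def infer_type(values):
    """Return the narrowest SQLite type that fits every non-empty value."""
    def fits(cast):
        for v in values:
            if v == "":
                continue  # empty fields are treated as NULL, not as a type clue
            try:
                cast(v)
            except ValueError:
                return False
        return True

    if fits(int):
        return "INTEGER"
    if fits(float):
        return "REAL"
    return "TEXT"


def dsv_to_sqlite(text, conn, table, delimiter=","):
    """Load delimited text into a new table, casting each column by inferred type."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, body = rows[0], rows[1:]
    types = [infer_type([r[i] for r in body]) for i in range(len(header))]
    casts = {"INTEGER": int, "REAL": float, "TEXT": str}

    columns = ", ".join(f'"{name}" {t}' for name, t in zip(header, types))
    conn.execute(f'CREATE TABLE "{table}" ({columns})')

    placeholders = ", ".join("?" for _ in header)
    converted = [
        [None if v == "" else casts[t](v) for v, t in zip(r, types)]
        for r in body
    ]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', converted)
    conn.commit()
    return types


# Made-up sample data: the price column mixes integers, decimals, and a blank.
conn = sqlite3.connect(":memory:")
data = "id,price,label\n1,9.5,apple\n2,,pear\n3,12,plum\n"
print(dsv_to_sqlite(data, conn, "fruit"))
print(conn.execute("SELECT * FROM fruit").fetchall())
```

Here `id` is stored as INTEGER, `price` as REAL (with the blank stored as NULL), and `label` as TEXT, so numeric comparisons and aggregates in SQL behave as expected instead of operating on strings.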

Why Shuffle?

Because life is short and RAM isn't cheap. Unlike many other tools, Shuffle is designed for speed and memory efficiency and can take advantage of multi-core processors.

For example, on my machine, Python's pandas takes about 32 seconds to convert a 150 MB comma-separated TXT file to a SQLite3 database, whereas Shuffle takes 12 seconds. Shuffle also does this with at most 50 MB of memory, whereas pandas eats your RAM for breakfast, lunch, and dinner.

Similarly, Shuffle takes just under 3 seconds to generate summary statistics and frequency counts for an 80 MB CSV file. CSVKit, a popular Python package, still hasn't finished running on the same file after 4 minutes.
