Merge pull request #536 from onyxfish/docs
Improve documentation
James McKinney committed Jan 30, 2016
2 parents 2839446 + 66ac98f commit 1710670
Showing 23 changed files with 113 additions and 127 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG
@@ -16,7 +16,7 @@ Improvements:

* "import csvkit as csv" will now defer to agate readers/writers.
* in2csv "csv itself" conversions now use agate.Table.
* Documentation: Update utility usage, remove shell prompts, document connection string, correct typos.
* Documentation: Update tool usage, remove shell prompts, document connection string, correct typos.

Fixes:

4 changes: 2 additions & 2 deletions README.rst
@@ -26,9 +26,9 @@
:target: https://pypi.python.org/pypi/csvkit
:alt: Support Python versions

csvkit is a suite of utilities for converting to and working with CSV, the king of tabular file formats.
csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.

It is inspired by pdftk, gdal and the original csvcut utility by Joe Germuska and Aaron Bycoffe.
It is inspired by pdftk, gdal and the original csvcut tool by Joe Germuska and Aaron Bycoffe.

Important links:

2 changes: 1 addition & 1 deletion csvkit/exceptions.py
@@ -3,7 +3,7 @@

class CustomException(Exception):
"""
A base exception that handles pretty-printing errors for command-line utilities.
A base exception that handles pretty-printing errors for command-line tools.
"""

def __init__(self, msg):
2 changes: 1 addition & 1 deletion csvkit/utilities/csvcut.py
@@ -18,7 +18,7 @@


class CSVCut(CSVKitUtility):
description = 'Filter and truncate CSV files. Like unix "cut" command, but for tabular data.'
description = 'Filter and truncate CSV files. Like the Unix "cut" command, but for tabular data.'

def add_arguments(self):
self.argparser.add_argument('-n', '--names', dest='names_only', action='store_true',
2 changes: 1 addition & 1 deletion csvkit/utilities/csvgrep.py
@@ -12,7 +12,7 @@


class CSVGrep(CSVKitUtility):
description = 'Search CSV files. Like the unix "grep" command, but for tabular data.'
description = 'Search CSV files. Like the Unix "grep" command, but for tabular data.'

def add_arguments(self):
self.argparser.add_argument('-n', '--names', dest='names_only', action='store_true',
2 changes: 1 addition & 1 deletion csvkit/utilities/csvsort.py
@@ -9,7 +9,7 @@


class CSVSort(CSVKitUtility):
description = 'Sort CSV files. Like unix "sort" command, but for tabular data.'
description = 'Sort CSV files. Like the Unix "sort" command, but for tabular data.'

def add_arguments(self):
self.argparser.add_argument('-y', '--snifflimit', dest='snifflimit', type=int,
2 changes: 1 addition & 1 deletion csvkit/utilities/in2csv.py
@@ -6,7 +6,7 @@

class In2CSV(CSVKitUtility):
description = 'Convert common, but less awesome, tabular data formats to CSV.'
epilog = 'Some command line flags only pertain to specific input formats.'
epilog = 'Some command-line flags only pertain to specific input formats.'
override_flags = ['f']

def add_arguments(self):
17 changes: 8 additions & 9 deletions docs/cli.rst
@@ -1,8 +1,8 @@
==================
Command-Line Usage
==================
=====
Usage
=====

csvkit is comprised of a number of individual command line utilities that can be loosely divided into a few major categories: Input, Processing, and Output. Documentation and examples for each utility are described on the following pages.
csvkit is composed of command-line tools that can be divided into three major categories: Input, Processing, and Output. Documentation and examples for each tool are described on the following pages.

Input
=====
@@ -26,8 +26,8 @@ Processing
scripts/csvsort
scripts/csvstack

Output (and Analysis)
=====================
Output and Analysis
===================

.. toctree::
:maxdepth: 1
@@ -39,12 +39,11 @@ Output (and Analysis)
scripts/csvsql
scripts/csvstat

Appendices
==========
Common arguments
================

.. toctree::
:maxdepth: 2

common_arguments
tricks

10 changes: 5 additions & 5 deletions docs/common_arguments.rst
@@ -1,8 +1,8 @@
=================================
Arguments common to all utilities
=================================
=============================
Arguments common to all tools
=============================

All utilities which accept CSV as input share a set of common command-line arguments::
All tools which accept CSV as input share a set of common command-line arguments::

-d DELIMITER, --delimiter DELIMITER
Delimiting character of the input CSV file.
@@ -39,5 +39,5 @@ All utilities which accept CSV as input share a set of common command-line argum

These arguments may be used to override csvkit's default "smart" parsing of CSV files. This is frequently necessary if the input file uses a particularly unusual style of quoting or is an encoding that is not compatible with utf-8. Not every command is supported by every tool, but the majority of them are.

Note that the output of csvkit's utilities is always formatted with "default" formatting options. This means that when executing multiple csvkit commands (either with a pipe or via intermediary files) it is only ever necessary to specify formatting arguments the first time. (And doing so for subsequent commands will likely cause them to fail.)
Note that the output of csvkit's tools is always formatted with "default" formatting options. This means that when executing multiple csvkit commands (either with a pipe or via intermediary files) it is only ever necessary to specify formatting arguments the first time. (And doing so for subsequent commands will likely cause them to fail.)
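
As an illustration of the note above (not part of this commit), the following minimal sketch uses Python's standard ``csv`` module rather than csvkit itself. The helper name ``normalize`` and the sample data are hypothetical. It shows why a formatting argument such as the delimiter is only needed at the first stage of a pipeline, and how repeating it downstream mis-parses the already-normalized output.

```python
import csv
import io


def normalize(text, delimiter=","):
    """Parse `text` with the given delimiter and re-emit it with default formatting.

    Hypothetical stand-in for one stage of a pipeline: whatever the input
    looks like, the output is comma-delimited with "\n" line endings.
    """
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


raw = "name;city\nAda;London\n"

# First stage: the input is semicolon-delimited, so the argument is required.
first = normalize(raw, delimiter=";")

# Second stage: the input is already default-formatted, so no argument is needed.
second = normalize(first)

# Repeating the argument downstream treats each line as a single field.
wrong = normalize(first, delimiter=";")
```

Here ``second`` is identical to ``first``, while ``wrong`` collapses each row into one quoted field, which is the kind of failure the note warns about.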

54 changes: 35 additions & 19 deletions docs/contributing.rst
@@ -2,10 +2,28 @@
Contributing to csvkit
======================

Getting Started
===============

Set up your environment for development::

git clone git://github.com/onyxfish/csvkit.git
cd csvkit
mkvirtualenv csvkit

# If running Python 2:
pip install -r requirements-py2.txt

# If running Python 3:
pip install -r requirements-py3.txt

python setup.py develop
tox

Principles
==========

csvkit is to tabular data what the standard Unix text processing suite (grep, sed, cut, sort) is to text. As such, csvkit adheres to `the Unix philosophy <http://en.wikipedia.org/wiki/Unix_philosophy>`_.
csvkit is to tables as Unix text processing commands (cut, grep, cat, sort) are to text. As such, csvkit adheres to `the Unix philosophy <http://en.wikipedia.org/wiki/Unix_philosophy>`_.

#. Small is beautiful.
#. Make each program do one thing well.
@@ -17,32 +35,30 @@ csvkit is to tabular data what the standard Unix text processing suite (grep, se
#. Avoid captive user interfaces.
#. Make every program a filter.

As there is no formally defined CSV format, csvkit encourages well-known formatting standards:

* Output favors compatibility with the widest range of applications. This means that quoting is done with double-quotes and only when necessary, columns are separated with commas, and lines are terminated with unix style line endings ("\\n").
As there is no single, standard CSV format, csvkit encourages popular formatting options:

* Data that is modified or generated will prefer consistency over brevity. Floats always include at least one decimal place, even if they are round. Dates and times are written in ISO8601 format.
* Output targets broad compatibility. Quoting is done with double-quotes and only when required, fields are delimited with commas, and rows are terminated with Unix line endings ("\\n").

Process for contributing code
=============================
* Output favors consistency over brevity. Floats always include at least one decimal place, even if they are round. Dates and times are output in ISO 8601 format.

Contributors should use the following roadmap to guide them through the process of submitting a contribution:
How to contribute
=================

#. Fork the project on `Github <https://github.com/onyxfish/csvkit>`_.
#. Check out the `issue tracker <https://github.com/onyxfish/csvkit/issues>`_ and find a task that needs to be done and is of a scope you can realistically expect to complete in a few days. Don't worry about the priority of the issues at first, but try to choose something you'll enjoy. You're much more likely to finish something to the point it can be merged if it's something you really enjoy hacking on.
#. Comment on the ticket letting everyone know you're going to be hacking on it so that nobody duplicates your effort. It's also good practice to provide some general idea of how you plan on resolving the issue so that other developers can make suggestions.
#. Write tests for the feature you're building. Follow the format of the existing tests in the test directory to see how this works. You can run all the tests with the command ``tox``.
#. Write the code. Try to stay consistent with the style and organization of the existing codebase. A good patch won't be refused for stylistic reasons, but large parts of it may be rewritten and nobody wants that.
#. As you're coding, periodically merge in work from the master branch and verify you haven't broken anything by running the test suite.
#. Write documentation for user-facing features.
#. Once it works, is tested, and has documentation, submit a pull request on Github.
#. Wait for it to either be merged or to receive a comment about what needs to be fixed.
#. Fork the project on `GitHub <https://github.com/onyxfish/csvkit>`_.
#. Look through the `open issues <https://github.com/onyxfish/csvkit/issues>`_ for a task that you can realistically expect to complete in a few days. Don't worry about the issue's priority; instead, choose something you'll enjoy. You're more likely to finish something if you enjoy hacking on it.
#. Comment on the issue to let people know you're going to work on it so that no one duplicates your effort. It's good practice to provide a general idea of how you plan to resolve the issue so that others can make suggestions.
#. Write tests for any changes to the code's behavior. Follow the format of the tests in the ``tests/`` directory to see how this works. You can run all the tests with the command ``tox``.
#. Write the code. Try to be consistent with the style and organization of the existing code. A good contribution won't be refused for stylistic reasons, but large parts of it may be rewritten and nobody wants that.
#. As you're working, periodically merge in changes from the upstream master branch to avoid having to resolve large merge conflicts. Check that you haven't broken anything by running the tests.
#. Write documentation for any user-facing features.
#. Once it works, is tested, and is documented, submit a pull request on GitHub.
#. Wait for it to be merged or for a comment about what needs to be changed.
#. Rejoice.

Legalese
========

To the extent that they care, contributors should keep the following legal mumbo-jumbo in mind:
To the extent that contributors care, they should keep the following legal mumbo-jumbo in mind:

The source of csvkit and therefore of any contributions are licensed under the permissive `MIT license <http://www.opensource.org/licenses/mit-license.php>`_. By submitting a patch or pull request you are agreeing to release your code under this license. You will be acknowledged in the AUTHORS file. As the owner of your specific contributions you retain the right to privately relicense your specific code contributions (and no others), however, the released version of the code can never be retracted or relicensed.
The source of csvkit and therefore of any contributions are licensed under the permissive `MIT license <http://www.opensource.org/licenses/mit-license.php>`_. By submitting a patch or pull request you are agreeing to release your contribution under this license. You will be acknowledged in the AUTHORS file. As the owner of your specific contributions you retain the right to privately relicense your specific contributions (and no others), however, the released version of the code can never be retracted or relicensed.
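
The output conventions listed under "Principles" in this file can be sketched with Python's standard ``csv`` module. This is an illustration under stated assumptions, not csvkit's actual implementation: the helper names ``format_value`` and ``write_rows`` are hypothetical, and the float handling relies on ``repr``, which keeps the trailing ``.0`` on round floats of ordinary magnitude.

```python
import csv
import datetime
import io


def format_value(value):
    """Format one cell, favoring consistency over brevity (hypothetical helper)."""
    if isinstance(value, float):
        # repr(2.0) is '2.0', so round floats keep one decimal place.
        return repr(value)
    if isinstance(value, (datetime.date, datetime.datetime)):
        # Dates and times are written in ISO 8601 format.
        return value.isoformat()
    return value


def write_rows(rows):
    """Write rows with commas, minimal double-quoting, and Unix line endings."""
    out = io.StringIO()
    writer = csv.writer(
        out,
        delimiter=",",
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote only when a field requires it
        lineterminator="\n",
    )
    for row in rows:
        writer.writerow([format_value(v) for v in row])
    return out.getvalue()


result = write_rows([
    ["id", "note", "amount", "day"],
    [1, "a, b", 2.0, datetime.date(2016, 1, 30)],
])
```

Only the field containing a comma is quoted; the round float is written as ``2.0`` and the date as ``2016-01-30``.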

2 changes: 1 addition & 1 deletion docs/index.rst
@@ -64,9 +64,9 @@ Table of contents
.. toctree::
:maxdepth: 3

install
tutorial
cli
tricks
contributing
release

62 changes: 0 additions & 62 deletions docs/install.rst

This file was deleted.

4 changes: 2 additions & 2 deletions docs/scripts/csvcut.rst
@@ -5,14 +5,14 @@ csvcut
Description
===========

Filters and truncates CSV files. Like unix "cut" command, but for tabular data::
Filters and truncates CSV files. Like the Unix "cut" command, but for tabular data::

usage: csvcut [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
[-p ESCAPECHAR] [-z MAXFIELDSIZE] [-e ENCODING] [-S] [-H] [-v]
[-l] [--zero] [-n] [-c COLUMNS] [-C NOT_COLUMNS] [-x]
[FILE]

Filter and truncate CSV files. Like unix "cut" command, but for tabular data.
Filter and truncate CSV files. Like the Unix "cut" command, but for tabular data.

positional arguments:
FILE The CSV file to operate on. If omitted, will accept
2 changes: 1 addition & 1 deletion docs/scripts/csvgrep.rst
@@ -13,7 +13,7 @@ Filter tabular data to only those rows where certain columns contain a given val
[-f MATCHFILE] [-i]
[FILE]

Search CSV files. Like the unix "grep" command, but for tabular data.
Search CSV files. Like the Unix "grep" command, but for tabular data.

positional arguments:
FILE The CSV file to operate on. If omitted, will accept
2 changes: 1 addition & 1 deletion docs/scripts/csvlook.rst
@@ -34,6 +34,6 @@ Basic use::

csvlook examples/testfixed_converted.csv

This utility is especially useful as a final operation when piping through other utilities::
This tool is especially useful as a final operation when piping through other tools::

csvcut -c 9,1 examples/realdata/FY09_EDU_Recipients_by_State.csv | csvlook
2 changes: 1 addition & 1 deletion docs/scripts/csvpy.rst
@@ -22,7 +22,7 @@ Loads a CSV file into a :class:`agate.Reader` object and then drops into a Pytho
-h, --help show this help message and exit
--dict Use a CSV DictReader instead of a normal reader.

This utility will automatically use the IPython shell if it is installed, otherwise it will use the running Python shell.
This tool will automatically use the IPython shell if it is installed, otherwise it will use the running Python shell.

.. note::

4 changes: 2 additions & 2 deletions docs/scripts/csvsort.rst
@@ -5,15 +5,15 @@ csvsort
Description
===========

Sort CSV files. Like unix "sort" command, but for tabular data::
Sort CSV files. Like the Unix "sort" command, but for tabular data::

usage: csvsort [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
[-p ESCAPECHAR] [-z MAXFIELDSIZE] [-e ENCODING] [-S] [-H] [-v]
[-l] [--zero] [-y SNIFFLIMIT] [-n] [-c COLUMNS] [-r]
[--no-inference]
[FILE]

Sort CSV files. Like unix "sort" command, but for tabular data.
Sort CSV files. Like the Unix "sort" command, but for tabular data.

positional arguments:
FILE The CSV file to operate on. If omitted, will accept
4 changes: 2 additions & 2 deletions docs/scripts/in2csv.rst
@@ -76,14 +76,14 @@ Standardize the formatting of a CSV file (quoting, line endings, etc.)::

in2csv examples/realdata/FY09_EDU_Recipients_by_State.csv

Fetch csvkit's open issues from the Github API, convert the JSON response into a CSV and write it to a file::
Fetch csvkit's open issues from the GitHub API, convert the JSON response into a CSV and write it to a file::

curl https://api.github.com/repos/onyxfish/csvkit/issues?state=open | in2csv -f json -v > issues.csv

Convert a DBase DBF file to an equivalent CSV::

in2csv examples/testdbf.dbf > testdbf_converted.csv

Fetch the ten most recent robberies in Oakland, convert the GeoJSON response into a CSV and write it to a file::

curl "http://oakland.crimespotting.org/crime-data?format=json&type=robbery&count=10" | in2csv -f geojson > robberies.csv
