Merge pull request #536 from onyxfish/docs

Improve documentation
wireservice · Jan 30, 2016 · 1710670
2 parents 2839446 + 66ac98f

Showing 23 changed files with 113 additions and 127 deletions.
diff --git a/CHANGELOG b/CHANGELOG
@@ -16,7 +16,7 @@ Improvements:
 
 * "import csvkit as csv" will now defer to agate readers/writers.
 * in2csv "csv itself" conversions now use agate.Table.
-* Documentation: Update utility usage, remove shell prompts, document connection string, correct typos.
+* Documentation: Update tool usage, remove shell prompts, document connection string, correct typos.
 
 Fixes:
 

diff --git a/README.rst b/README.rst
@@ -26,9 +26,9 @@
     :target: https://pypi.python.org/pypi/csvkit
     :alt: Support Python versions
 
-csvkit is a suite of utilities for converting to and working with CSV, the king of tabular file formats.
+csvkit is a suite of command-line tools for converting to and working with CSV, the king of tabular file formats.
 
-It is inspired by pdftk, gdal and the original csvcut utility by Joe Germuska and Aaron Bycoffe.
+It is inspired by pdftk, gdal and the original csvcut tool by Joe Germuska and Aaron Bycoffe.
 
 Important links:
 

diff --git a/csvkit/exceptions.py b/csvkit/exceptions.py
@@ -3,7 +3,7 @@
 
 class CustomException(Exception):
     """
-    A base exception that handles pretty-printing errors for command-line utilities.
+    A base exception that handles pretty-printing errors for command-line tools.
     """
 
     def __init__(self, msg):

diff --git a/csvkit/utilities/csvcut.py b/csvkit/utilities/csvcut.py
@@ -18,7 +18,7 @@
 
 
 class CSVCut(CSVKitUtility):
-    description = 'Filter and truncate CSV files. Like unix "cut" command, but for tabular data.'
+    description = 'Filter and truncate CSV files. Like the Unix "cut" command, but for tabular data.'
 
     def add_arguments(self):
         self.argparser.add_argument('-n', '--names', dest='names_only', action='store_true',

diff --git a/csvkit/utilities/csvgrep.py b/csvkit/utilities/csvgrep.py
@@ -12,7 +12,7 @@
 
 
 class CSVGrep(CSVKitUtility):
-    description = 'Search CSV files. Like the unix "grep" command, but for tabular data.'
+    description = 'Search CSV files. Like the Unix "grep" command, but for tabular data.'
 
     def add_arguments(self):
         self.argparser.add_argument('-n', '--names', dest='names_only', action='store_true',

diff --git a/csvkit/utilities/csvsort.py b/csvkit/utilities/csvsort.py
@@ -9,7 +9,7 @@
 
 
 class CSVSort(CSVKitUtility):
-    description = 'Sort CSV files. Like unix "sort" command, but for tabular data.'
+    description = 'Sort CSV files. Like the Unix "sort" command, but for tabular data.'
 
     def add_arguments(self):
         self.argparser.add_argument('-y', '--snifflimit', dest='snifflimit', type=int,

diff --git a/csvkit/utilities/in2csv.py b/csvkit/utilities/in2csv.py
@@ -6,7 +6,7 @@
 
 class In2CSV(CSVKitUtility):
     description = 'Convert common, but less awesome, tabular data formats to CSV.'
-    epilog = 'Some command line flags only pertain to specific input formats.'
+    epilog = 'Some command-line flags only pertain to specific input formats.'
     override_flags = ['f']
 
     def add_arguments(self):

diff --git a/docs/cli.rst b/docs/cli.rst
@@ -1,8 +1,8 @@
-==================
-Command-Line Usage
-==================
+=====
+Usage
+=====
 
-csvkit is comprised of a number of individual command line utilities that can be loosely divided into a few major categories: Input, Processing, and Output. Documentation and examples for each utility are described on the following pages.
+csvkit is composed of command-line tools that can be divided into three major categories: Input, Processing, and Output. Documentation and examples for each tool are described on the following pages.
 
 Input
 =====
@@ -26,8 +26,8 @@ Processing
     scripts/csvsort
     scripts/csvstack
 
-Output (and Analysis)
-=====================
+Output and Analysis
+===================
 
 .. toctree::
     :maxdepth: 1 
@@ -39,12 +39,11 @@ Output (and Analysis)
     scripts/csvsql
     scripts/csvstat
 
-Appendices
-==========
+Common arguments
+================
 
 .. toctree::
     :maxdepth: 2 
 
     common_arguments
-    tricks
 
diff --git a/docs/common_arguments.rst b/docs/common_arguments.rst
@@ -1,8 +1,8 @@
-=================================
-Arguments common to all utilities
-=================================
+=============================
+Arguments common to all tools
+=============================
 
-All utilities which accept CSV as input share a set of common command-line arguments::
+All tools which accept CSV as input share a set of common command-line arguments::
 
   -d DELIMITER, --delimiter DELIMITER
                         Delimiting character of the input CSV file.
@@ -39,5 +39,5 @@ All utilities which accept CSV as input share a set of common command-line argum
 
 These arguments may be used to override csvkit's default "smart" parsing of CSV files. This is frequently necessary if the input file uses a particularly unusual style of quoting or is an encoding that is not compatible with utf-8. Not every command is supported by every tool, but the majority of them are.
 
-Note that the output of csvkit's utilities is always formatted with "default" formatting options. This means that when executing multiple csvkit commands (either with a pipe or via intermediary files) it is only ever necessary to specify formatting arguments the first time. (And doing so for subsequent commands will likely cause them to fail.)
+Note that the output of csvkit's tools is always formatted with "default" formatting options. This means that when executing multiple csvkit commands (either with a pipe or via intermediary files) it is only ever necessary to specify formatting arguments the first time. (And doing so for subsequent commands will likely cause them to fail.)
 
diff --git a/docs/contributing.rst b/docs/contributing.rst
@@ -2,10 +2,28 @@
 Contributing to csvkit
 ======================
 
+Getting Started
+===============
+
+Set up your environment for development::
+
+    git clone git://github.com/onyxfish/csvkit.git
+    cd csvkit
+    mkvirtualenv csvkit
+
+    # If running Python 2:
+    pip install -r requirements-py2.txt
+
+    # If running Python 3:
+    pip install -r requirements-py3.txt
+
+    python setup.py develop
+    tox
+
 Principles
 ==========
 
-csvkit is to tabular data what the standard Unix text processing suite (grep, sed, cut, sort) is to text. As such, csvkit adheres to `the Unix philosophy <http://en.wikipedia.org/wiki/Unix_philosophy>`_.
+csvkit is to tables as Unix text processing commands (cut, grep, cat, sort) are to text. As such, csvkit adheres to `the Unix philosophy <http://en.wikipedia.org/wiki/Unix_philosophy>`_.
 
 #. Small is beautiful.
 #. Make each program do one thing well.
@@ -17,32 +35,30 @@ csvkit is to tabular data what the standard Unix text processing suite (grep, se
 #. Avoid captive user interfaces.
 #. Make every program a filter.
 
-As there is no formally defined CSV format, csvkit encourages well-known formatting standards:
-
-* Output favors compatibility with the widest range of applications. This means that quoting is done with double-quotes and only when necessary, columns are separated with commas, and lines are terminated with unix style line endings ("\\n").
+As there is no single, standard CSV format, csvkit encourages popular formatting options:
 
-* Data that is modified or generated will prefer consistency over brevity. Floats always include at least one decimal place, even if they are round. Dates and times are written in ISO8601 format.
+* Output targets broad compatibility. Quoting is done with double-quotes and only when required, fields are delimited with commas, and rows are terminated with Unix line endings ("\\n").
 
-Process for contributing code
-=============================
+* Output favors consistency over brevity. Floats always include at least one decimal place, even if they are round. Dates and times are output in ISO 8601 format.
 
-Contributors should use the following roadmap to guide them through the process of submitting a contribution:
+How to contribute
+=================
 
-#. Fork the project on `Github <https://github.com/onyxfish/csvkit>`_.
-#. Check out the `issue tracker <https://github.com/onyxfish/csvkit/issues>`_ and find a task that needs to be done and is of a scope you can realistically expect to complete in a few days. Don't worry about the priority of the issues at first, but try to choose something you'll enjoy. You're much more likely to finish something to the point it can be merged if it's something you really enjoy hacking on.
-#. Comment on the ticket letting everyone know you're going to be hacking on it so that nobody duplicates your effort. It's also good practice to provide some general idea of how you plan on resolving the issue so that other developers can make suggestions.
-#. Write tests for the feature you're building. Follow the format of the existing tests in the test directory to see how this works. You can run all the tests with the command ``tox``.
-#. Write the code. Try to stay consistent with the style and organization of the existing codebase. A good patch won't be refused for stylistic reasons, but large parts of it may be rewritten and nobody wants that. 
-#. As you're coding, periodically merge in work from the master branch and verify you haven't broken anything by running the test suite.
-#. Write documentation for user-facing features.
-#. Once it works, is tested, and has documentation, submit a pull request on Github.
-#. Wait for it to either be merged or to receive a comment about what needs to be fixed.
+#. Fork the project on `GitHub <https://github.com/onyxfish/csvkit>`_.
+#. Look through the `open issues <https://github.com/onyxfish/csvkit/issues>`_ for a task that you can realistically expect to complete in a few days. Don't worry about the issue's priority; instead, choose something you'll enjoy. You're more likely to finish something if you enjoy hacking on it.
+#. Comment on the issue to let people know you're going to work on it so that no one duplicates your effort. It's good practice to provide a general idea of how you plan to resolve the issue so that others can make suggestions.
+#. Write tests for any changes to the code's behavior. Follow the format of the tests in the ``tests/`` directory to see how this works. You can run all the tests with the command ``tox``.
+#. Write the code. Try to be consistent with the style and organization of the existing code. A good contribution won't be refused for stylistic reasons, but large parts of it may be rewritten and nobody wants that.
+#. As you're working, periodically merge in changes from the upstream master branch to avoid having to resolve large merge conflicts. Check that you haven't broken anything by running the tests.
+#. Write documentation for any user-facing features.
+#. Once it works, is tested, and is documented, submit a pull request on GitHub.
+#. Wait for it to be merged or for a comment about what needs to be changed.
 #. Rejoice.
 
 Legalese
 ========
 
-To the extent that they care, contributors should keep the following legal mumbo-jumbo in mind:
+To the extent that contributors care, they should keep the following legal mumbo-jumbo in mind:
 
-The source of csvkit and therefore of any contributions are licensed under the permissive `MIT license <http://www.opensource.org/licenses/mit-license.php>`_. By submitting a patch or pull request you are agreeing to release your code under this license. You will be acknowledged in the AUTHORS file. As the owner of your specific contributions you retain the right to privately relicense your specific code contributions (and no others), however, the released version of the code can never be retracted or relicensed.
+The source of csvkit and therefore of any contributions are licensed under the permissive `MIT license <http://www.opensource.org/licenses/mit-license.php>`_. By submitting a patch or pull request you are agreeing to release your contribution under this license. You will be acknowledged in the AUTHORS file. As the owner of your specific contributions you retain the right to privately relicense your specific contributions (and no others), however, the released version of the code can never be retracted or relicensed.
 
diff --git a/docs/index.rst b/docs/index.rst
@@ -64,9 +64,9 @@ Table of contents
 .. toctree::
     :maxdepth: 3 
 
-    install
     tutorial
     cli 
+    tricks
     contributing
     release 
 

diff --git a/docs/install.rst b/docs/install.rst
diff --git a/docs/scripts/csvcut.rst b/docs/scripts/csvcut.rst
@@ -5,14 +5,14 @@ csvcut
 Description
 ===========
 
-Filters and truncates CSV files. Like unix "cut" command, but for tabular data::
+Filters and truncates CSV files. Like the Unix "cut" command, but for tabular data::
 
     usage: csvcut [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
                   [-p ESCAPECHAR] [-z MAXFIELDSIZE] [-e ENCODING] [-S] [-H] [-v]
                   [-l] [--zero] [-n] [-c COLUMNS] [-C NOT_COLUMNS] [-x]
                   [FILE]
 
-    Filter and truncate CSV files. Like unix "cut" command, but for tabular data.
+    Filter and truncate CSV files. Like the Unix "cut" command, but for tabular data.
 
     positional arguments:
       FILE                  The CSV file to operate on. If omitted, will accept

diff --git a/docs/scripts/csvgrep.rst b/docs/scripts/csvgrep.rst
@@ -13,7 +13,7 @@ Filter tabular data to only those rows where certain columns contain a given val
                    [-f MATCHFILE] [-i]
                    [FILE]
 
-    Search CSV files. Like the unix "grep" command, but for tabular data.
+    Search CSV files. Like the Unix "grep" command, but for tabular data.
 
     positional arguments:
       FILE                  The CSV file to operate on. If omitted, will accept

diff --git a/docs/scripts/csvlook.rst b/docs/scripts/csvlook.rst
@@ -34,6 +34,6 @@ Basic use::
 
     csvlook examples/testfixed_converted.csv
 
-This utility is especially useful as a final operation when piping through other utilities::
+This tool is especially useful as a final operation when piping through other tools::
 
     csvcut -c 9,1 examples/realdata/FY09_EDU_Recipients_by_State.csv | csvlook
diff --git a/docs/scripts/csvpy.rst b/docs/scripts/csvpy.rst
@@ -22,7 +22,7 @@ Loads a CSV file into a :class:`agate.Reader` object and then drops into a Pytho
       -h, --help            show this help message and exit
       --dict                Use a CSV DictReader instead of a normal reader.
 
-This utility will automatically use the IPython shell if it is installed, otherwise it will use the running Python shell.
+This tool will automatically use the IPython shell if it is installed, otherwise it will use the running Python shell.
 
 .. note::
 

diff --git a/docs/scripts/csvsort.rst b/docs/scripts/csvsort.rst
@@ -5,15 +5,15 @@ csvsort
 Description
 ===========
 
-Sort CSV files. Like unix "sort" command, but for tabular data::
+Sort CSV files. Like the Unix "sort" command, but for tabular data::
 
     usage: csvsort [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
                    [-p ESCAPECHAR] [-z MAXFIELDSIZE] [-e ENCODING] [-S] [-H] [-v]
                    [-l] [--zero] [-y SNIFFLIMIT] [-n] [-c COLUMNS] [-r]
                    [--no-inference]
                    [FILE]
 
-    Sort CSV files. Like unix "sort" command, but for tabular data.
+    Sort CSV files. Like the Unix "sort" command, but for tabular data.
 
     positional arguments:
       FILE                  The CSV file to operate on. If omitted, will accept

diff --git a/docs/scripts/in2csv.rst b/docs/scripts/in2csv.rst
@@ -76,14 +76,14 @@ Standardize the formatting of a CSV file (quoting, line endings, etc.)::
 
     in2csv examples/realdata/FY09_EDU_Recipients_by_State.csv
 
-Fetch csvkit's open issues from the Github API, convert the JSON response into a CSV and write it to a file::
+Fetch csvkit's open issues from the GitHub API, convert the JSON response into a CSV and write it to a file::
 
     curl https://api.github.com/repos/onyxfish/csvkit/issues?state=open | in2csv -f json -v > issues.csv 
+
 Convert a DBase DBF file to an equivalent CSV::
 
     in2csv examples/testdbf.dbf > testdbf_converted.csv
 
 Fetch the ten most recent robberies in Oakland, convert the GeoJSON response into a CSV and write it to a file::
 
     curl "http://oakland.crimespotting.org/crime-data?format=json&type=robbery&count=10" | in2csv -f geojson > robberies.csv
-