A converter that extracts nested JSON data into CSV files. Useful for Mongo data.

Created specifically to convert multi-line Mongo query results to a single CSV (since data nerds like CSV).


git clone
cd json2csv
pip install -r requirements.txt


Basic (convert from a JSON file to a CSV file in same path):

python /path/to/json_file.json /path/to/outline_file.json

Specify the CSV file:

python /path/to/json_file.json /path/to/outline_file.json -o /some/other/file.csv

For MongoDB (multiple JSON objects per file, which is non-standard JSON):

python --each-line /path/to/json_file.json /path/to/outline_file.json
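
The idea behind --each-line is that every line of the input is parsed as its own JSON document. A minimal sketch of that approach, not the project's actual implementation; the function name `iter_json_lines` is hypothetical:

```python
import io
import json


def iter_json_lines(handle):
    """Yield one parsed object per non-empty line of a file-like object."""
    for line in handle:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Two Mongo-style records, one JSON object per line.
sample = io.StringIO('{"_id": 1, "name": "a"}\n{"_id": 2, "name": "b"}\n')
records = list(iter_json_lines(sample))
print(len(records))
```

Parsing line by line avoids loading the whole file as a single JSON value, which would fail because the file as a whole is not valid JSON.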

Outline Format

For this JSON file:

{
  "nodes": [
    {"source": {"author": "Someone"}, "message": {"original": "Hey!", "Revised": "Hey yo!"}},
    {"source": {"author": "Another"}, "message": {"original": "Howdy!", "Revised": "Howdy partner!"}},
    {"source": {"author": "Me too"}, "message": {"original": "Yo!", "Revised": "Yo, 'sup?"}}
  ]
}

Use this outline file:

{
  "map": [
    ["author", ""],
    ["message", "message.original"]
  ],
  "collection": "nodes"
}
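
Each entry in "map" pairs a CSV column name with a dotted path into one item of the collection. The following is a rough sketch of how such an outline could be applied, not the converter's real code. The helpers `lookup` and `to_rows` are hypothetical, and the sketch assumes the "author" column is meant to read from "source.author".

```python
import csv
import io

data = {
    "nodes": [
        {"source": {"author": "Someone"}, "message": {"original": "Hey!", "Revised": "Hey yo!"}},
        {"source": {"author": "Another"}, "message": {"original": "Howdy!", "Revised": "Howdy partner!"}},
    ]
}

# Assumed outline: the "author" column is taken to read from "source.author".
outline = {
    "map": [["author", "source.author"], ["message", "message.original"]],
    "collection": "nodes",
}


def lookup(item, dotted_path):
    """Follow a dotted path through nested dicts; return None if a key is missing."""
    for key in dotted_path.split("."):
        if not isinstance(item, dict) or key not in item:
            return None
        item = item[key]
    return item


def to_rows(data, outline):
    """Build one flat row per item in the outlined collection."""
    return [
        {column: lookup(item, path) for column, path in outline["map"]}
        for item in data[outline["collection"]]
    ]


rows = to_rows(data, outline)

# Write the rows to an in-memory CSV with the outline's column order.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=[column for column, _ in outline["map"]])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```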

Generating outline files

To automatically generate an outline file from a JSON file:

python --collection nodes /path/to/the.json

This will generate an outline file, containing the union of all keys in the JSON collection, at /path/to/the.outline.json. You can specify the output file with the -o option, as above.
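
"Union of all keys" means that a key appearing in any item of the collection ends up in the outline, even if other items lack it. A minimal sketch of that idea follows. It is an illustration only: `key_paths` and `build_outline` are hypothetical names, and reusing the dotted path as the column name is an assumption.

```python
import json


def key_paths(item, prefix=""):
    """Yield dotted paths to every non-dict leaf value in a nested dict."""
    for key, value in item.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from key_paths(value, path)
        else:
            yield path


def build_outline(data, collection):
    """Collect the union of leaf paths across all items, in first-seen order."""
    seen = {}
    for item in data[collection]:
        for path in key_paths(item):
            seen.setdefault(path, None)  # dict keeps insertion order
    # Assumption: reuse the dotted path as the column name.
    return {"map": [[path, path] for path in seen], "collection": collection}


data = {
    "nodes": [
        {"source": {"author": "Someone"}, "message": {"original": "Hey!"}},
        {"source": {"author": "Another"}, "message": {"Revised": "Howdy partner!"}},
    ]
}
outline = build_outline(data, "nodes")
print(json.dumps(outline, indent=2))
```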

Unquoting strings

To remove quotation marks from strings in nested data types:

python /path/to/json_file.json /path/to/outline_file.json --strings

This will modify field contents such that:

{
  "sandwiches": ["ham", "turkey", "egg salad"],
  "toppings": {
    "cheese": ["cheddar", "swiss"],
    "spread": ["mustard", "mayonnaise", "tapenade"]
  }
}

is parsed into:

sandwiches              toppings
ham, turkey, egg salad  cheese: cheddar, swiss
                        spread: mustard, mayonnaise, tapenade

The class variables SEP_CHAR, KEY_VAL_CHAR, DICT_SEP_CHAR, DICT_OPEN, and DICT_CLOSE can be changed to modify the output formatting. For nested dictionaries, the code includes commented-out settings that work well.
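
To show how these five settings might interact, here is a hedged sketch. The class name `StringFormatter`, its `render` method, and the values assigned to each variable are assumptions chosen to reproduce the example output above; the project's real class and defaults may differ.

```python
class StringFormatter:
    # Assumed values chosen to reproduce the example output above;
    # the project's real defaults may differ.
    SEP_CHAR = ", "
    KEY_VAL_CHAR = ": "
    DICT_SEP_CHAR = "\n"
    DICT_OPEN = ""
    DICT_CLOSE = ""

    def render(self, value):
        """Render lists and dicts as plain text without quotation marks."""
        if isinstance(value, dict):
            parts = [
                f"{key}{self.KEY_VAL_CHAR}{self.render(item)}"
                for key, item in value.items()
            ]
            return self.DICT_OPEN + self.DICT_SEP_CHAR.join(parts) + self.DICT_CLOSE
        if isinstance(value, (list, tuple)):
            return self.SEP_CHAR.join(self.render(item) for item in value)
        return str(value)


formatter = StringFormatter()
sandwiches = formatter.render(["ham", "turkey", "egg salad"])
toppings = formatter.render({"cheese": ["cheddar", "swiss"], "spread": ["mustard", "tapenade"]})
print(sandwiches)
print(toppings)
```

In this sketch, setting DICT_OPEN and DICT_CLOSE to "{" and "}" would keep nested dictionaries visually delimited when they appear inside other dictionaries.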

