
Space/Time ETL tool

Extract/Transform/Load tool for Space/Time data: loads separate data modules which perform ETL tasks, such as downloading and transforming data to the Space/Time data model.


The data tool is configured in the Space/Time configuration file, under the data key:

Parameter      Description
baseDir        Path (absolute, or relative to the data tool) where the data tool looks for data modules
modulePrefix   Directory prefix used to identify data modules (e.g. etl-mapwarper)
outputDir      Directory to which data modules write their data
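
To make the role of baseDir and modulePrefix more concrete, here is a minimal sketch of how module discovery could work. It is not the tool's actual code: the function name findModules, the config shape in exampleConfig, and the prefix value 'etl-' are assumptions made for illustration, and the real configuration file format may differ.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical shape of the `data` section; all values are illustrative only.
const exampleConfig = {
  data: {
    baseDir: '../',        // resolved relative to the data tool's directory
    modulePrefix: 'etl-',  // assumed prefix, inferred from names like etl-mapwarper
    outputDir: './output'
  }
};

// List candidate dataset IDs by scanning baseDir for directories whose
// names start with modulePrefix. The prefix is stripped from each name.
function findModules(dataConfig, toolDir) {
  // path.resolve keeps an absolute baseDir as-is and resolves a relative
  // one against toolDir.
  const baseDir = path.resolve(toolDir, dataConfig.baseDir);
  return fs.readdirSync(baseDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory() &&
      entry.name.startsWith(dataConfig.modulePrefix))
    .map((entry) => entry.name.slice(dataConfig.modulePrefix.length))
    .sort();
}
```

Called as findModules(exampleConfig.data, __dirname) from the data tool's directory, this would return names such as mapwarper and oldnyc if the directories etl-mapwarper and etl-oldnyc sit next to the tool.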

Each data module can also be configured in the configuration file.


First, clone this repository and the repository of the data modules you need:

git clone

# Data module repositories:
git clone
git clone
git clone

Then, install dependencies:

cd spacetime-etl
npm install
cd ..

# Data module repositories:    
cd etl-wards
npm install
cd ..
cd etl-mapwarper
npm install
cd ..
cd etl-oldnyc
npm install

Download and convert data

Run the data tool without command line arguments to get a list of the available data modules:

node index.js

To execute one or more modules, provide their dataset IDs as command-line arguments:

node index.js mapwarper oldnyc ...

Alternatively, you can select the processing steps you want to run:

node index.js --steps=convert mapwarper

By default, all steps are run consecutively.
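
The invocations above combine an optional --steps flag with a list of dataset IDs. As a rough illustration only, the arguments could be split as in the sketch below. The function parseArgs is hypothetical, and accepting several comma-separated steps is an assumption; the tool's real argument handling may differ.

```javascript
// Split command-line arguments into selected steps and dataset IDs.
// steps stays null when no --steps flag is given, which the caller can
// interpret as "run all steps".
function parseArgs(argv) {
  const result = { steps: null, datasets: [] };
  for (const arg of argv) {
    if (arg.startsWith('--steps=')) {
      // Assumption: multiple steps may be given as a comma-separated list.
      result.steps = arg.slice('--steps='.length).split(',').filter(Boolean);
    } else {
      // Anything that is not the --steps flag is treated as a dataset ID.
      result.datasets.push(arg);
    }
  }
  return result;
}
```

For example, parseArgs(['--steps=convert', 'mapwarper']) returns { steps: ['convert'], datasets: ['mapwarper'] }. In a script, the input would typically be process.argv.slice(2).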

Copyright (C) 2015 Waag Society, 2016 The New York Public Library