data management

This repo provides code and metadata for releasing MEDSL datasets.

We use the medsl module to:

  • Validate precinct-level returns from each state
  • Combine and transform these returns into release-ready datasets
  • Generate accompanying documentation
  • Update the elections R package


To generate docs and datasets:

$ python

We assume the following directory structure:

├── 2016-precinct-data
├── data-management
├── elections
└── precinct-returns


  • 2016-precinct-data contains input data, the returns for each state (not yet available online);
  • data-management is this repository;
  • elections contains our R package for election data, and is an output target;
  • precinct-returns is the repo for released datasets, and is an output target.
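Before generating anything, it can help to confirm that the sibling directories are where the tools expect them. The following is a minimal sketch, not part of the medsl module: the helper name `missing_siblings` is hypothetical, and it assumes you run it from inside data-management so that the parent of the working directory holds all four folders.

```python
from pathlib import Path

# Directory names taken from the layout described above.
EXPECTED_DIRS = (
    "2016-precinct-data",
    "data-management",
    "elections",
    "precinct-returns",
)


def missing_siblings(parent):
    """Return the expected directory names that are not present under `parent`.

    Hypothetical helper for illustration; it is not part of the medsl module.
    """
    parent = Path(parent)
    return [name for name in EXPECTED_DIRS if not (parent / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the current working directory is data-management itself.
    missing = missing_siblings(Path.cwd().parent)
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Directory layout looks complete.")
```

An empty result means all four directories were found; anything listed needs to be cloned or created before running the release code.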


Requires Python 3. So far tested only on Fedora Linux with Python 3.6, but it should also work on macOS.

virtualenv installation:

$ git clone data-management
$ cd data-management
$ python3 -m virtualenv env -p python3
$ source env/bin/activate
$ pip install -r medsl/requirements.txt

The feather-format library depends on pyarrow, whose availability on pip varies by platform (instructions).
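Because pyarrow may not install cleanly from pip on every platform, a quick check after `pip install` can show whether both packages are importable in the active virtualenv. This is a rough sketch using only the standard library; the helper name `missing_modules` is hypothetical, and it assumes the usual import names `pyarrow` and `feather` (the module that feather-format installs).

```python
import importlib.util


def missing_modules(names=("pyarrow", "feather")):
    """Return the top-level module names that cannot be found in this environment.

    Hypothetical helper for illustration; it only locates modules and does not
    import them, so it will not catch a broken binary build.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("pyarrow and feather are both available.")
```

If pyarrow is reported as missing, follow the platform-specific instructions for installing it before retrying the requirements install.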
