data management

This repo provides code and metadata for releasing MEDSL datasets.

We use the medsl module to:

  • Validate precinct-level returns from each state
  • Combine and transform these returns into release-ready datasets
  • Generate accompanying documentation
  • Update the elections R package

use

To generate docs and datasets:

$ python release.py

We assume the following directory structure:

├── 2016-precinct-data
├── data-management
├── elections
└── precinct-returns

Where:

  • 2016-precinct-data contains the input data (the returns for each state), which is not yet available online;
  • data-management is this repository;
  • elections contains our R package for election data, and is an output target;
  • precinct-returns is the repo for released datasets, and is an output target.
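
Before running a release, it can help to confirm that the sibling directories are in place. The following is a minimal sketch, not part of the medsl module: the helper name `missing_dirs` is hypothetical, and it assumes the script is run from inside data-management, so that the parent of the working directory holds all four directories.

```python
from pathlib import Path

# Directory names taken from the layout described above.
EXPECTED_DIRS = (
    "2016-precinct-data",
    "data-management",
    "elections",
    "precinct-returns",
)


def missing_dirs(parent):
    """Return the expected directory names that are absent under `parent`.

    Hypothetical helper for illustration; not provided by medsl.
    """
    parent = Path(parent)
    return [name for name in EXPECTED_DIRS if not (parent / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the current directory is data-management, so its parent
    # should contain the four sibling directories.
    missing = missing_dirs(Path.cwd().parent)
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Directory layout looks complete.")
```

An empty result means all four directories exist; any names returned are the ones still to be cloned or created.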

installation

Requires Python 3. So far tested only on Fedora Linux with Python 3.6, but it should also work on macOS.

virtualenv installation:

$ git clone git@github.com:MEDSL/data-management.git data-management
$ cd data-management
$ python3 -m virtualenv env -p python3
$ source env/bin/activate
$ pip install -r medsl/requirements.txt

The feather-format library depends on pyarrow, whose availability on pip varies by platform (see the pyarrow installation instructions).
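
To check whether both libraries ended up in the active environment, something like the sketch below may be useful. It assumes that the feather-format package is imported as `feather` and pyarrow as `pyarrow`; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumed import names: "pyarrow" and "feather" (from feather-format).
    for name in ("pyarrow", "feather"):
        status = "found" if is_installed(name) else "NOT found"
        print(f"{name}: {status}")
```

If pyarrow is reported as not found, the pip wheel may be unavailable for the platform and a different installation route may be needed.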
