evz/civic-json-worker

 
 

civic-json-worker

Flask API for tracking civic tech projects across the world. Project data is stored / output using the civic.json data standard. (Soon!)

A project by:

Open City

Beta NYC

Code for America

For the story behind this API, read this. For our design philosophy, read this.

How It Works

Other civic tech listing projects like this one have gone stale, and the real sticking point is keeping the list of projects, and their details, up to date. The less work people have to do, the more likely the archive is to stay current and useful.

The goal of this project is to make humans responsible for one thing: deciding what gets tracked. They submit github repo urls to this API, which curates a simple projects list:

[
    "https://github.com/dssg/census-communities-usa",
    "https://github.com/open-city/open-gov-hack-night",
    ...
]

The rest is up to computers. Every 10 minutes the run_update.py script runs over the project urls in the list, and pings the Github API to gather the following fields for each project:

[
    {
        "contributors": [
            {
                "avatar_url": "https://0.gravatar.com/avatar/5e5eb188a0e4d3a7c8f38ee0fc3a6cbd?d=https%3A%2F%2Fidenticons.github.com%2Fd8c3ef3ed05a213a7225bf5e6e46101a.png", 
                "contributions": 51, 
                "html_url": "https://github.com/derekeder", 
                "login": "derekeder"
            }, 
            {
                "avatar_url": "https://2.gravatar.com/avatar/813d23c289052af417387a9270d0da31?d=https%3A%2F%2Fidenticons.github.com%2Ffa9357bb22fd993fc9795619c7e1d4f7.png", 
                "contributions": 46, 
                "html_url": "https://github.com/fgregg", 
                "login": "fgregg"
            }, 
            {
                "avatar_url": "https://2.gravatar.com/avatar/1d0c5faee140af87d7d6967bc946ecc6?d=https%3A%2F%2Fidenticons.github.com%2F44e80db9ed8527f429c969e804432b0f.png", 
                "contributions": 9, 
                "html_url": "https://github.com/evz", 
                "login": "evz"
            }
        ], 
        "contributors_url": "https://api.github.com/repos/datamade/csvdedupe/contributors", 
        "created_at": "2013-07-11T14:23:33Z", 
        "description": "Command line tool for deduplicating CSV files", 
        "forks_count": 2, 
        "homepage": null, 
        "html_url": "https://github.com/datamade/csvdedupe", 
        "id": 11343900, 
        "language": "Python", 
        "name": "csvdedupe", 
        "open_issues": 8, 
        "owner": {
            "avatar_url": "https://2.gravatar.com/avatar/0a89207d38feff1dcd938bdc1e4a9b5e?d=https%3A%2F%2Fidenticons.github.com%2F3424042f8cb2b04950903794ad9c8daf.png", 
            "html_url": "https://github.com/datamade", 
            "login": "datamade"
        }, 
        "updated_at": "2013-09-20T06:32:39Z", 
        "watchers_count": 26
    },
    ...
]

NOTE: these fields will eventually reflect the proposed civic.json standard (see below.)
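As a rough illustration of the step above, here is a minimal sketch of how a project URL could be mapped to its Github API endpoint and how the raw API payloads could be trimmed to the fields in the sample output. The names `api_url`, `summarize`, `REPO_FIELDS` and `USER_FIELDS` are hypothetical and are not taken from run_update.py; the field list simply mirrors the sample, and the HTTP requests themselves are left out.

```python
from urllib.parse import urlparse

# Hypothetical field lists, copied from the sample output above.
# They are an assumption, not the list run_update.py actually uses.
REPO_FIELDS = [
    "contributors_url", "created_at", "description", "forks_count",
    "homepage", "html_url", "id", "language", "name", "open_issues",
    "updated_at", "watchers_count",
]
USER_FIELDS = ["avatar_url", "html_url", "login"]


def api_url(project_url):
    """Map a github.com project URL to its api.github.com repos URL."""
    parts = urlparse(project_url).path.strip("/").split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError("expected https://github.com/<owner>/<repo>: %r" % project_url)
    owner, repo = parts
    return "https://api.github.com/repos/%s/%s" % (owner, repo)


def summarize(repo, contributors):
    """Reduce raw API payloads to the fields shown in the sample output.

    `repo` is the decoded repository response and `contributors` is the
    decoded list returned by the repository's contributors_url.
    """
    project = {field: repo.get(field) for field in REPO_FIELDS}
    project["owner"] = {f: repo["owner"].get(f) for f in USER_FIELDS}
    project["contributors"] = [
        {f: c.get(f) for f in USER_FIELDS + ["contributions"]}
        for c in contributors
    ]
    return project
```

Keeping the field selection in plain lists means that switching to the civic.json fields later would only require changing those lists, at least in this sketch.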

This data is hosted publicly on S3 as JSON, with a CORS configuration that allows it to be loaded via an Ajax call, for use on any projects list site.

Bonus: anyone can use this JSON file for their own purposes.
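For example, a consumer outside the browser might list the most recently updated projects. The sketch below assumes only that the file is a JSON array of objects with the `name` and `updated_at` fields shown earlier; the function name `recently_updated` is hypothetical, and the bucket URL in the usage comment is a placeholder, not the real location.

```python
import json


def recently_updated(projects_json, limit=5):
    """Return (name, updated_at) pairs, most recently updated first."""
    projects = json.loads(projects_json)
    # Timestamps such as "2013-09-20T06:32:39Z" share one fixed UTC format,
    # so sorting them as strings also sorts them chronologically.
    ranked = sorted(projects, key=lambda p: p["updated_at"], reverse=True)
    return [(p["name"], p["updated_at"]) for p in ranked[:limit]]


# Hypothetical usage; substitute the real bucket and file name:
# from urllib.request import urlopen
# with urlopen("https://<bucket>.s3.amazonaws.com/<file>.json") as resp:
#     print(recently_updated(resp.read().decode("utf-8")))
```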

Civic.json data standard

Civic.json is a proposed metadata standard for describing civic tech projects. The goal is for this standard to be simple, and for the data fields that describe projects to be largely assembled programmatically.

The standard is still very much in the planning phase, and we welcome discussion. Once we settle on v1, civic-json-worker will output - and potentially store - project data in this format.

Benefits

By pushing everything on to Github, we will have very little to maintain, content-wise, as administrators. Simultaneously, we will encourage people to:

  • sign up for Github if they haven't already
  • keep their projects open source (we can't crawl private repos)
  • make sure their description and website urls are up to date
  • use the issue tracker

Installation

NOTE: If you're a Code for America Brigade interested in setting up your own civic-json-worker API, hold it! Our goal is to make life easy for you: you shouldn't have to adapt, deploy, or maintain your own API, just read and write data from a single source. (This way, all the data is centralized, too!)

If you want to help out with development, or you don't want to play nice with the other kids in the schoolyard, read on...

Propping this sucker up for oneself is pretty simple. However, there are some basic requirements which can be gotten in the standard Python fashion (assuming you are working in a virtualenv):

$ pip install -r requirements.txt

Besides that, there are a few environment variables that you'll need to set:

$ export FLASK_KEY=[whatever you want] # This is a string that you'll check to make sure that only trusted people are deleting things
$ export GITHUB_TOKEN=[Github API token] # Read about setting that up here: http://developer.github.com/v3/oauth/
$ export S3_BUCKET=[Name of the bucket] # This is the bucket where you'll store the JSON files 
$ export AWS_ACCESS_KEY=[Amazon Web Services Key] # This will need access to the bucket above
$ export AWS_SECRET_KEY=[Amazon Web Services Secret] # This will need access to the bucket above

These are probably easiest to put in the .bashrc (or the like) of the user that the app runs as, rather than setting them manually each time, but you get the idea...
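If a variable is missing, it helps to find out at startup rather than halfway through an update. Here is a minimal sketch of such a check, assuming the five variable names listed above; `load_settings` and `REQUIRED` are hypothetical names and not part of this codebase.

```python
import os

# The variable names come from the export lines above.
REQUIRED = (
    "FLASK_KEY",
    "GITHUB_TOKEN",
    "S3_BUCKET",
    "AWS_ACCESS_KEY",
    "AWS_SECRET_KEY",
)


def load_settings(environ=os.environ):
    """Return the required settings as a dict, or raise if any are unset or empty."""
    missing = [name for name in REQUIRED if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED}
```

Passing the environment in as an argument keeps the check easy to exercise with a plain dict in tests.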

Running the updater

To get this going the first time, you’ll need to create a projects.json file in the root directory of the S3 Bucket where you will be storing your civic JSON files. The structure is pretty simple, just an array with a list of github URLs like so:

[
    "https://github.com/open-city/dedupe",
    "https://github.com/censusreporter/censusreporter"
]
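The initial file can be written by hand, but a small helper can catch malformed or duplicate URLs before they reach the bucket. This is a sketch under the assumption that every entry has the form https://github.com/<owner>/<repo>; `build_projects_json` is a hypothetical name, and uploading the result to S3 is left out.

```python
import json
from urllib.parse import urlparse


def build_projects_json(urls):
    """Validate, normalize and de-duplicate project URLs, then return JSON text."""
    projects = []
    for url in urls:
        parsed = urlparse(url.strip())
        parts = parsed.path.strip("/").split("/")
        if parsed.netloc != "github.com" or len(parts) != 2 or not all(parts):
            raise ValueError("not a github.com/<owner>/<repo> URL: %r" % url)
        normalized = "https://github.com/%s/%s" % (parts[0], parts[1])
        # Keep the first occurrence so the original ordering is preserved.
        if normalized not in projects:
            projects.append(normalized)
    return json.dumps(projects, indent=4)
```

The returned string can be saved as projects.json and placed in the root of the bucket as described above.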

Once that is set up and you have your Python virtualenv activated, you should be able to run the run_update.py script like so:

$ python run_update.py

That should go through and create all the other files in your S3 Bucket as needed.

Contribute

Get in touch with Andrew Hyder (andrewh@codeforamerica.org) from Code for America or Eric Van Zanten (eric.vanzanten@gmail.com) from Open City.

The issue tracker is actively watched and pull requests are welcome!

About

Worker script that pulls metadata about opengov projects

Languages

  • Python 96.0%
  • Shell 4.0%