A Singer tap for extracting data from the GitHub API
Python

tap-github

This is a Singer tap that produces JSON-formatted data from the GitHub API following the Singer spec.

This tap:

  • Pulls raw data from the GitHub REST API
  • Extracts resources such as commits and issues from GitHub for a single repository
  • Outputs the schema for each resource
  • Incrementally pulls data based on the input state
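
As a rough illustration of what "following the Singer spec" means for a consumer, the sketch below parses a few newline-delimited JSON messages of the kind a Singer tap writes to standard output (SCHEMA, RECORD, and STATE). The sample lines, the `sha` field, and the `summarize` helper are made up for illustration; real records from this tap carry many more fields, and the exact shape of the state value is an assumption based on the state file described in the quick start.

```python
import json

# Hypothetical sample of tap output, one JSON message per line.
# Real output from tap-github contains far more fields per record.
SAMPLE_OUTPUT = """\
{"type": "SCHEMA", "stream": "commits", "schema": {"type": "object", "properties": {"sha": {"type": "string"}}}, "key_properties": ["sha"]}
{"type": "RECORD", "stream": "commits", "record": {"sha": "abc123"}}
{"type": "RECORD", "stream": "commits", "record": {"sha": "def456"}}
{"type": "STATE", "value": {"commits": "2017-01-17T20:32:05Z"}}
"""


def summarize(lines):
    """Count RECORD messages per stream and keep the last STATE value seen."""
    counts = {}
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between messages
        message = json.loads(line)
        if message["type"] == "RECORD":
            stream = message["stream"]
            counts[stream] = counts.get(stream, 0) + 1
        elif message["type"] == "STATE":
            state = message["value"]
    return counts, state


counts, state = summarize(SAMPLE_OUTPUT.splitlines())
print(counts)
print(state)
```

The last STATE value is the piece worth saving: passing it back on the next run is what lets the tap pull only newer data.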

Quick start

  1. Install

    We recommend using a virtualenv:

    > virtualenv -p python3 venv
    > source venv/bin/activate
    > pip install tap-github
  2. Create a GitHub access token

    Log in to your GitHub account, go to the Personal Access Tokens settings page, and generate a new token with at least the repo scope. Save this access token; you'll need it for the next step.

  3. Create the config file

    Create a JSON file containing the access token you just created and the path to the repository. The repo path is relative to https://github.com/. For example, the path for this repository is singer-io/tap-github.

    {"access_token": "your-access-token",
     "repository": "singer-io/tap-github"}
  4. [Optional] Create the initial state file

    You can provide a JSON file that contains a date for the "commits" and "issues" endpoints to force the application to fetch only commits and issues newer than those dates. If you omit the file, it will fetch all commits and issues.

    {"commits": "2017-01-17T20:32:05Z",
     "issues":  "2017-01-17T20:32:05Z"}
  5. Run the application

    tap-github can be run with:

    tap-github --config config.json [--state state.json]
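
Steps 3 to 5 can also be scripted. The following is a minimal sketch, not part of the tap itself: it writes `config.json` and `state.json` in the formats shown above and prints the matching command line. The helper names `write_json` and `build_command` and the seven-day lookback window are assumptions chosen for illustration, and the token is a placeholder you would replace with your own.

```python
import json
import shlex
from datetime import datetime, timedelta, timezone


def write_json(path, payload):
    """Write a dictionary to a JSON file."""
    with open(path, "w") as handle:
        json.dump(payload, handle, indent=2)


def build_command(config_path, state_path=None):
    """Build the tap-github argument list; the state file is optional."""
    command = ["tap-github", "--config", config_path]
    if state_path:
        command += ["--state", state_path]
    return command


# Placeholder token: replace with the personal access token from step 2.
config = {
    "access_token": "your-access-token",
    "repository": "singer-io/tap-github",
}

# Assumed lookback of seven days, formatted like the dates in step 4.
since = (datetime.now(timezone.utc) - timedelta(days=7)).strftime("%Y-%m-%dT%H:%M:%SZ")
state = {"commits": since, "issues": since}

write_json("config.json", config)
write_json("state.json", state)

# Print the command rather than running it, so the sketch works
# even when tap-github is not installed.
print(" ".join(shlex.quote(part) for part in build_command("config.json", "state.json")))
```

To actually run the tap from Python, the list returned by `build_command` could be handed to `subprocess.run`.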

Copyright © 2017 Stitch