Automattic/tap-github
tap-github

This is a Singer tap that produces JSON-formatted data from the GitHub API following the Singer spec.

This tap:

Quick start

  1. Install

    We recommend using a virtualenv:

    > virtualenv -p python3 venv
    > source venv/bin/activate
    > pip install tap-github
  2. Create a GitHub access token

    Log in to your GitHub account, go to the Personal Access Tokens settings page, and generate a new token with at least the repo scope. Save this access token; you'll need it for the next step.

  3. Create the config file

    Create a JSON file containing the start date, the access token you just created, and the path to one or more repositories that you want to extract data from. Separate multiple repo paths with spaces. Each repo path is relative to "base_url" (Default: https://github.com/). For example, the path for this repository is singer-io/tap-github. You can also set request_timeout, an optional parameter that controls the timeout for requests and defaults to 300 seconds.

    {
      "access_token": "your-access-token",
      "repository": "singer-io/tap-github singer-io/getting-started",
      "start_date": "2021-01-01T00:00:00Z",
      "request_timeout": 300,
      "base_url": "https://api.github.com"
    }
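
If you prefer to generate the config file rather than write it by hand, the following is a minimal sketch. The helper name build_config is hypothetical and not part of tap-github; it only assembles the keys shown in the example above and joins several repo paths into the space-delimited string the tap expects.

```python
import json


def build_config(access_token, repositories, start_date,
                 request_timeout=300, base_url=None):
    """Return a config dict with the same keys as the example above.

    build_config is a hypothetical helper for illustration only.
    """
    if not repositories:
        raise ValueError("at least one repository path is required")
    config = {
        "access_token": access_token,
        # Multiple repo paths go into a single space-delimited string.
        "repository": " ".join(repositories),
        "start_date": start_date,
        "request_timeout": request_timeout,
    }
    # base_url is only written when the caller supplies one.
    if base_url is not None:
        config["base_url"] = base_url
    return config


if __name__ == "__main__":
    config = build_config(
        "your-access-token",
        ["singer-io/tap-github", "singer-io/getting-started"],
        "2021-01-01T00:00:00Z",
    )
    print(json.dumps(config, indent=2))
```

Redirect the printed JSON to config.json to use it in the steps that follow.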

Note: The maximum number of results per page is configurable with the max_per_page parameter. By default it is 100, which is the maximum for most endpoints.
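
To get a feel for what a smaller max_per_page costs, the arithmetic can be sketched as below. This is only an illustration of page counting, not code from the tap; the function name pages_needed is an assumption.

```python
import math


def pages_needed(total_items, per_page=100):
    """Number of paginated requests needed to fetch total_items results."""
    if per_page < 1:
        raise ValueError("per_page must be at least 1")
    return math.ceil(total_items / per_page)


if __name__ == "__main__":
    # Compare the default page size with a smaller one for 250 results.
    print(pages_needed(250), pages_needed(250, per_page=30))
```

Lowering the page size increases the number of requests for the same amount of data, which matters when you are close to an API rate limit.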

  1. Run the tap in discovery mode to get the properties.json file

    tap-github --config config.json --discover > properties.json
  2. In the properties.json file, select the streams to sync

    Each stream in the properties.json file has a "schema" entry. To select a stream to sync, add "selected": true to that stream's "schema" entry. For example, to sync the pull_requests stream:

    ...
    "tap_stream_id": "pull_requests",
    "schema": {
      "selected": true,
      "properties": {
        "updated_at": {
          "format": "date-time",
          "type": [
            "null",
            "string"
          ]
        }
    ...
    
  3. Run the application

    tap-github can be run with:

    tap-github --config config.json --properties properties.json
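
The stream selection and a quick check of the tap's output can also be scripted. The sketch below rests on two assumptions that the excerpt above does not show: that properties.json keeps its streams in a top-level "streams" list, and that the tap prints one JSON message per line with RECORD messages carrying a "stream" field, as the Singer spec describes. Both function names are hypothetical.

```python
import json
from collections import Counter


def select_streams(catalog, stream_ids):
    """Add "selected": true to the "schema" entry of each named stream."""
    wanted = set(stream_ids)
    # Assumption: streams live in a top-level "streams" list.
    for stream in catalog.get("streams", []):
        stream_id = stream.get("tap_stream_id")
        if stream_id in wanted:
            stream.setdefault("schema", {})["selected"] = True
            wanted.discard(stream_id)
    if wanted:
        raise KeyError(f"streams not found in catalog: {sorted(wanted)}")
    return catalog


def count_records(lines):
    """Count RECORD messages per stream in line-delimited JSON output."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        message = json.loads(line)
        # SCHEMA and STATE messages are skipped; only RECORDs are counted.
        if message.get("type") == "RECORD":
            counts[message["stream"]] += 1
    return counts


if __name__ == "__main__":
    # In-memory demo; a real run would load and rewrite properties.json.
    demo = {"streams": [{"tap_stream_id": "pull_requests",
                         "schema": {"properties": {}}}]}
    print(json.dumps(select_streams(demo, ["pull_requests"]), indent=2))
```

In practice you would load properties.json with json.load, pass it through select_streams, write it back, and feed the tap's standard output to count_records to confirm that the selected streams produced data.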

Copyright © 2018 Stitch
