meltano add extractor --custom tap-circleci
In the interactive prompts, use these values:
name: tap-circleci
pypi: git+https://github.com/JChouCode/tap-circleci.git
executable: tap-circleci
capabilities: discover,catalog,state
config: project_slugs,token:password
This fork improves the tap to handle edge cases that cause errors.
- Edge case: a job is cancelled before a build number is created, which causes a 404 error when the tap requests the unknown build number.
- Improved Bookmarking
- Added tooling/ for various scripts which help wrangle some of the sharp corners of CircleCI
This is a Singer tap that produces JSON-formatted data following the Singer spec.
This tap:
- Pulls raw data from Circle CI
- Extracts the following resources:
- Outputs the schema for each resource
- Incrementally pulls data based on the input state
- Install
git clone git@github.com:apollographql/tap-circleci.git && cd tap-circleci && pip install -e .
- Create a Circle CI access token
Log in to your Circle CI account, go to the Personal API Tokens page, and generate a new token. Copy the token and save it somewhere safe.
- Create the config file (see below)
Create a JSON file containing the token you just created and the project slug of the project you want to extract data from. Retrieve the project slug from the URL of a workflow: it is the VCS your project uses (gh for GitHub or bb for Bitbucket), followed by the owner or organization, followed by the repository name, e.g. gh/singer-io/singer-python. You can enter multiple project slugs separated by spaces to pull data from multiple projects.
{ "token": "your-access-token", "project_slugs": "gh/singer-io/singer-python gh/singer-io/getting-started" }
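As a rough illustration of the config format described above, the following Python sketch builds the config from a list of slugs. The helper name `build_config` and the slug check are hypothetical and not part of the tap; the check assumes slugs only ever start with `gh` or `bb`, as in the description above.

```python
import json
import re

# Assumed slug shape: <vcs>/<owner>/<repo>, where vcs is "gh" or "bb".
# This check is illustrative only; the tap itself may accept other forms.
SLUG_PATTERN = re.compile(r"(gh|bb)/[^/\s]+/[^/\s]+")


def build_config(token, slugs):
    """Return a config dict with the slugs joined into one space-delimited string."""
    bad = [slug for slug in slugs if not SLUG_PATTERN.fullmatch(slug)]
    if bad:
        raise ValueError(f"unexpected project slug(s): {bad}")
    return {"token": token, "project_slugs": " ".join(slugs)}


if __name__ == "__main__":
    config = build_config(
        "your-access-token",
        ["gh/singer-io/singer-python", "gh/singer-io/getting-started"],
    )
    # Print the JSON; redirect it to config.json to use it with the tap.
    print(json.dumps(config, indent=2))
```

Joining the slugs in code keeps the space-delimited string consistent when the project list grows.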
- Run the tap in discovery mode to generate the catalog.json file
tap-circleci --config config.json --discover > catalog.json
- In the catalog.json file, select the streams to sync
Each stream in the catalog.json file has a "metadata" entry. To select a stream to sync, add {"breadcrumb": [], "metadata": {"selected": true}} to that stream's "metadata" entry.
For example, to sync the pipelines stream:
... "type": [ "null", "object" ], "additionalProperties": false }, "stream": "pipelines", "metadata": [{"breadcrumb": [], "metadata": {"selected": true}}] }, ...
Another way to select a stream to sync is to add "selected": true into that stream's schema:
... "tap_stream_id": "workflows", "key_properties": [], "schema": { "selected": true, "properties": { "_pipeline_id": { "type": [ "null", "string" ] ...
Either way is acceptable, but the first way is preferred.
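The preferred, metadata-based selection can also be applied with a short script instead of editing the file by hand. This is a minimal sketch: the function name `select_stream` is hypothetical, and it assumes the catalog keeps its streams in a top-level "streams" list, as Singer catalogs conventionally do.

```python
import json


def select_stream(catalog, stream_name):
    """Mark one stream as selected through its top-level metadata entry.

    Returns True if the stream was found, False otherwise.
    """
    # Assumption: streams live under a top-level "streams" key.
    for stream in catalog.get("streams", []):
        if stream.get("stream") != stream_name:
            continue
        metadata = stream.setdefault("metadata", [])
        for entry in metadata:
            # An empty breadcrumb refers to the stream itself.
            if entry.get("breadcrumb") == []:
                entry.setdefault("metadata", {})["selected"] = True
                break
        else:
            # No stream-level entry yet, so add one.
            metadata.append({"breadcrumb": [], "metadata": {"selected": True}})
        return True
    return False


if __name__ == "__main__":
    demo = {"streams": [{"stream": "pipelines", "metadata": []}]}
    select_stream(demo, "pipelines")
    print(json.dumps(demo, indent=2))
```

To apply it to a real catalog, load catalog.json with `json.load`, call `select_stream` for each stream you want, and write the result back with `json.dump`.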
- Run the application (will print records and other messages to the console)
tap-circleci can be run with:
tap-circleci --config config.json --catalog catalog.json
To save output to a file:
tap-circleci --config config.json --catalog catalog.json > output.txt
This tap is intended to be used with a Singer target, which loads the output into a database. More information on Singer targets is available in the Singer documentation.
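Before wiring up a target, it can help to sanity-check the saved output. The sketch below counts RECORD messages per stream; it relies on the Singer message format (one JSON object per line with a "type" field, and a "stream" field on RECORD messages). The function name `count_records` is hypothetical.

```python
import json
from collections import Counter


def count_records(lines):
    """Count Singer RECORD messages per stream in an iterable of output lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message
        if isinstance(message, dict) and message.get("type") == "RECORD":
            counts[message.get("stream")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '{"type": "RECORD", "stream": "pipelines", "record": {"id": "a"}}',
        '{"type": "RECORD", "stream": "pipelines", "record": {"id": "b"}}',
        '{"type": "STATE", "value": {}}',
    ]
    print(dict(count_records(sample)))
```

For a real run, pass an open file handle for output.txt instead of the sample list.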
- To rerun using the last output STATE record:
In your output records, you will see something like:
{ "type": "STATE", "value": { "bookmarks": { "gh/apollographql/tap-circleci": { "pipelines": { "since": "2023-11-15T00:00:00.000000Z" } } } } }
Take the contents of the value key, save them to a JSON file (state.json in this example), and run:
tap-circleci --config config.json --catalog catalog.json --state state.json
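Pulling the state out by hand gets tedious, so the step above could be scripted. The following is a sketch under the assumption that the tap's output was saved one message per line (as in output.txt above); the helper names `last_state_value` and `save_state` are hypothetical.

```python
import json


def last_state_value(lines):
    """Return the "value" of the last STATE message, or None if there is none."""
    state = None
    for line in lines:
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip blank or non-JSON lines
        if isinstance(message, dict) and message.get("type") == "STATE":
            state = message.get("value")
    return state


def save_state(output_path, state_path):
    """Read saved tap output and write the latest state to state_path."""
    with open(output_path) as src:
        state = last_state_value(src)
    if state is None:
        raise ValueError("no STATE message found in " + output_path)
    with open(state_path, "w") as dst:
        json.dump(state, dst, indent=2)
```

Calling `save_state("output.txt", "state.json")` would produce a file suitable for the `--state` option shown above. Only the last STATE message is kept, since later messages supersede earlier ones.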
Detailed configuration information for the --config key.
key | type | default | description |
---|---|---|---|
token | string | N/A | Personal API Token |
project_slugs | string | N/A | Space delimited string of CCI project slugs |
Copyright © 2020 Sisu Data