Singer is an open source standard for moving data between databases,
web APIs, files, queues, and just about anything else you can think
of. The Singer spec describes how data extraction scripts — called
“Taps” — and data loading scripts — called “Targets” — should
communicate using a standard JSON-based data format over stdout. By
conforming to this spec, Taps and Targets can be used in any
combination to move data from any source to any destination.
Topics
The Google Sheets Target can be combined with any Singer Tap to populate a Google Sheet with data. This example will use currency exchange rate data from the Fixer.io Tap. Fixer.io is a free API for current and historical foreign exchange rates published by the European Central Bank.
The steps are:
(originally found in the Google API docs)
-
Use this wizard to create or select a project in the Google Developers Console and activate the Sheets API. Click Continue, then Go to credentials.
-
On the Add credentials to your project page, click the Cancel button.
-
At the top of the page, select the OAuth consent screen tab. Select an Email address, enter a Product name if not already set, and click the Save button.
-
Select the Credentials tab, click the Create credentials button and select OAuth client ID.
-
Select the application type Other, enter the name "Singer Sheets Target", and click the Create button.
-
Click OK to dismiss the resulting dialog.
-
Click the Download button to the right of the client ID.
-
Move this file to your working directory and rename it
client_secret.json.
Created a file called config.json in your working directory,
following config.sample.json. The required
spreadsheet_id parameter is the value between the "/d/" and the
"/edit" in the URL of your spreadsheet. For example, consider the
following URL that references a Google Sheets spreadsheet:
https://docs.google.com/spreadsheets/d/1qpyC0XzvTcKT6EISywvqESX3A0MwQoFDE8p-Bll4hps/edit#gid=0
The ID of this spreadsheet is
1qpyC0XzvTcKT6EISywvqESX3A0MwQoFDE8p-Bll4hps.
First, make sure Python 3 is installed on your system or follow these installation instructions for Mac or Ubuntu.
target-gsheet can be run with any Singer Tap to move data from
sources like Braintree, Freshdesk and Hubspot to Google
Sheets. We'll use the Fixer.io Tap - which pulls currency exchange
rate data from a public data set - as an example.
We recommend installing each Tap and Target in a separate Python virtual environment. This will insure that you won't have conflicting dependencies between any Taps and Targets.
These commands will install tap-fixerio and target-gsheet with pip in
their own virtual environments:
# Install tap-fixerio in its own virtualenv
virtualenv -p python3 tap-fixerio
tap-fixerio/bin/pip install tap-fixerio
# Install target-gsheet in its own virtualenv
virtualenv -p python3 target-gsheet
target-gsheet/bin/pip install target-gsheetThis command will pipe the output of tap-fixerio to target-gsheet,
using the configuration file created in Step 2:
› tap-fixerio/bin/tap-fixerio | target-gsheet/bin/target-gsheet -c config.json
INFO Replicating the latest exchange rate data from fixer.io
INFO Tap exiting normallytarget-gsheet will attempt to open a new window or tab in your
default browser to perform authentication. If this fails, copy the URL
from the console and manually open it in your browser.
If you are not already logged into your Google account, you will be
prompted to log in. If you are logged into multiple Google accounts,
you will be asked to select one account to use for the
authorization. Click the Accept button to allow target-gsheet to
access your Google Sheet. You can close the tab after the signup flow
is complete.
Each stream generated by the Tap will be written to a different sheet
in your Google Sheet. For the Fixer.io Tap you'll see a single sheet
named exchange_rate.
When target-gsheet is run as above it writes log lines to stderr,
but stdout is reserved for outputting State messages. A State
message is a JSON-formatted line with data that the Tap wants
persisted between runs - often "high water mark" information that the
Tap can use to pick up where it left off on the next run. Read more
about State messages in the Singer spec.
Targets write State messages to stdout once all data that appeared
in the stream before the State message has been processed by the
Target. Note that although the State message is sent into the target,
in most cases the target's process won't actually store it anywhere or
do anything with it other than repeat it back to stdout.
Taps like the Fixer.io Tap can also accept a --state argument
that, if present, points to a file containing the last persisted State
value. This enables Taps to work incrementally - the State
checkpoints the last value that was handled by the Target, and the
next time the Tap is run it should pick up from that point.
To run the Fixer.io Tap incrementally, point it to a State file and
capture the persister's stdout like this:
› tap-fixerio --state state.json | target-gsheet -c config.json >> state.json
› tail -1 state.json > state.json.tmp && mv state.json.tmp state.json
(rinse and repeat)If you can't find an existing Tap for your data source, then it's time to build your own.
Topics:
A Tap is just a program, written in any language, that outputs data to
stdout according to the Singer spec. In fact, your first Tap can
be written from the command line, without any programming at all:
› printf '{"type":"SCHEMA", "stream":"hello","key_properties":[],"schema":{"type":"object", "properties":{"value":{"type":"string"}}}}\n{"type":"RECORD","stream":"hello","schema":"hello","record":{"value":"world"}}\n'This writes the datapoint {"value":"world"} to the hello
stream along with a schema indicating that value is a string.
That data can be piped into any Target, like the Google Sheets
Target, over stdin:
› printf '{"type":"SCHEMA", "stream":"hello","key_properties":[],"schema":{"type":"object", "properties":{"value":{"type":"string"}}}}\n{"type":"RECORD","stream":"hello","schema":"hello","record":{"value":"world"}}\n' | target-gsheet -c config.jsonTo move beyond Hello, world you'll need a real programming language. Although any language will do, we have built a Python library to help you get up and running quickly.
Let's write a Tap called tap_ip.py that retrieves the current
IP using icanhazip.com, and writes that data with a timestamp.
First, install the Singer helper library with pip:
› pip install singer-pythonThen, open up a new file called tap_ip.py in your favorite editor.
import singer
import urllib.request
from datetime import datetime, timezoneWe'll use the datetime module to get the current timestamp, the
singer module to write data to stdout in the correct format, and
the urllib.request module to make a request to icanhazip.com.
now = datetime.now(timezone.utc).isoformat()
schema = {
'properties': {
'ip': {'type': 'string'},
'timestamp': {'type': 'string', 'format': 'date-time'},
},
}This sets up some of the data we'll need - the current time, and the schema of the data we'll be writing to the stream formatted as a JSON Schema.
with urllib.request.urlopen('http://icanhazip.com') as response:
ip = response.read().decode('utf-8').strip()
singer.write_schema('my_ip', schema, 'timestamp')
singer.write_records('my_ip', [{'timestamp': now, 'ip': ip}])Finally, we make the HTTP request, parse the response, and then make
two calls to the singer library:
singer.write_schemawhich writes the schema of themy_ipstream and defines its primary keysinger.write_recordsto write a record to that stream
We can send this data to Google Sheets by running our new Tap with the Google Sheets Target:
› python tap_ip.py | target-gsheet -c config.jsonJoin the Singer Slack channel to get help from members of the Singer community.
Copyright © 2017 Stitch