Skip to content

A Singer.io tap for Google Sheets data

License

Notifications You must be signed in to change notification settings

miroapp/tap-gsheets

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

23 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

tap-gsheets

This is a Singer tap that extracts data from Google Spreadshits. It produces JSON-formatted data following the Singer spec.

Configuration

The tap uses the Google API underneath for accessing sheets data, so you need to create some credentials accordingly.

Once you have your credentials, you need to put them in the gsheets_api key in the configuration file so they are picked up by the tap. With another key sheet_name one selects the spreadsheet to be extracted.

The tap supports configuration files in JSON as much as in HOCON format. Use a .json extension for JSON format and a .conf extension for HOCON format.

Example

Tapping onto a Google sheet named Investor Loans can be achieved with a HOCO config file such as

{ # config.conf
  sheet_name = "Investor Loans",
  gsheets_api {
    type = "service_account",
    project_id = "pmt-spv",
    private_key_id = ${PRIVATE_KEY_ID},
    private_key = ${PRIVATE_KEY},
    client_email = ${CLIENT_EMAIL},
    client_id = ${CLIENT_ID},
    auth_uri =  "https://accounts.google.com/o/oauth2/auth",
    token_uri = "https://oauth2.googleapis.com/token",
    auth_provider_x509_cert_url = "https://www.googleapis.com/oauth2/v1/certs",
    client_x509_cert_url = ${CERT_URL}
  }
}

Which will output something like

tap-gsheets -c config.conf
# {"type": "RECORD", "stream": "Investor Loans", "record": {"id": 11, "start_date": "2018-08-22", "end_date": "2020-04-17", "investor": "Banking Corp", "amount": 20000000, "interest_rate": 0.20, "add_to_capital": "FALSE", "user_id": "some_user", "created_at": "2018-08-22 11:00:40", "updated_at": "2018-08-22 11:00:40"}}

Additional config

Import your service account config from the file Google provides:

{ # config.conf
  gsheets_api {
    include "gcloud-team-name-001122334455.json"
  }
}

Change the table name format to singular form and convert column names to the valid database format:

{ # config.conf
  underscore_columns = True,
  singular_table_name = True,
}

Process several sheets and worksheets in that sheets at a time:

{ # config.conf
  sheets = [
    {
      name = "Investor Loans",
      worksheets = [
        PageA,
        PageB,
        PageC
      ]
    },
    {
      # ...
    }
  ]
}

By default date field define as date with "%Y-%m-%d %H:%M:%S" format. If there is necessary to be changed need to set specific_date_format:

{ # config.conf
  sheets = [
    {
      name = "Investor Loans",
      worksheets = [
        PageA,
        PageB,
        PageC
      ],
    specific_date_format = "%Y-%m-%d" 
    },
    {
      # ...
    }
  ]
}

By default date field define as date. If there is necessary to be changed need to set date_processing to False:

{ # config.conf
  sheets = [
    {
      name = "Investor Loans",
      worksheets = [
        PageA,
        PageB,
        PageC
      ],
    date_processing = False
    },
    {
      # ...
    }
  ]
}

Specify the row number (1-based) to start processing from, in case you want to skip some unnecessary rows. The default number is 1.

{ # config.conf
  sheets = [
    {
      name = "Investor Loans",
      start_from_row = 5,
      worksheets = [
        PageA,
        PageB,
        PageC
      ]
    },
    {
      # ...
    }
  ]
}

Overriding configuration

The configurations in the file can be overriden with the command line parameter --overrides, which takes configuration overrides in a JSON string and applies them over the passed config file.

tap-gsheets -c config.conf --overrides '{"sheet_name":"2019 Baseball Games"}'

Install

To install the tap-gsheets utility as a system command, create and activate a Python3 virtual environment

$ cd tap-gsheets
$ python3 -m venv ~/.virtualenvs/tap-gsheets
$ source ~/.virtualenvs/tap-gsheets/bin/activate

and install the package

$ pip install -e .

About

A Singer.io tap for Google Sheets data

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 100.0%