
jsontableschema-bigquery-py


Generate and load BigQuery tables based on JSON Table Schema descriptors.

Version v0.3 contains breaking changes:

  • renamed Storage.tables to Storage.buckets
  • changed Storage.read to read into memory
  • added Storage.iter to yield row by row

Getting Started

Installation

pip install jsontableschema-bigquery

Storage

The package implements the Tabular Storage interface.

To start using the Google BigQuery service:

  • Create a new project - link
  • Create a service key - link
  • Download the JSON credentials and set the GOOGLE_APPLICATION_CREDENTIALS environment variable

We can get a storage instance this way:

import io
import os
import json
from apiclient.discovery import build
from oauth2client.client import GoogleCredentials
from jsontableschema_bigquery import Storage

# Point the Google client libraries at the downloaded credentials file
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '.credentials.json'
credentials = GoogleCredentials.get_application_default()
service = build('bigquery', 'v2', credentials=credentials)

# The project id is read from the same credentials file
project = json.load(io.open('.credentials.json', encoding='utf-8'))['project_id']
storage = Storage(service, project, 'dataset', prefix='prefix')

Then we can interact with the storage:

storage.buckets
storage.create('bucket', descriptor)
storage.delete('bucket')
storage.describe('bucket') # returns the descriptor
storage.iter('bucket') # yields rows one by one
storage.read('bucket') # returns all rows in memory
storage.write('bucket', rows)
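For instance, a descriptor and matching rows might look like the following (the bucket and field names are hypothetical; the storage calls themselves require live BigQuery credentials, so they are shown as comments):

```python
# A minimal JSON Table Schema descriptor (hypothetical field names)
descriptor = {
    'fields': [
        {'name': 'id', 'type': 'integer'},
        {'name': 'name', 'type': 'string'},
    ]
}

# Rows matching the descriptor, ordered by field
rows = [
    [1, 'london'],
    [2, 'paris'],
]

# With a configured storage instance (see above), a round trip would be:
# storage.create('cities', descriptor)
# storage.write('cities', rows)
# storage.read('cities')
```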

Mappings

schema.json -> BigQuery table schema
data.csv -> BigQuery table data
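The gist of the schema mapping can be sketched as follows. This is an illustrative conversion, not the package's exact implementation; the type table and the convert_descriptor helper are assumptions for the sake of the example:

```python
# Illustrative mapping from JSON Table Schema field types
# to BigQuery column types (the package's actual table may differ)
JTS_TO_BIGQUERY = {
    'string': 'STRING',
    'integer': 'INTEGER',
    'number': 'FLOAT',
    'boolean': 'BOOLEAN',
    'datetime': 'TIMESTAMP',
}

def convert_descriptor(descriptor):
    """Convert a JSON Table Schema descriptor to a BigQuery schema dict."""
    fields = []
    for field in descriptor['fields']:
        fields.append({
            'name': field['name'],
            # Fall back to STRING for types not in the table
            'type': JTS_TO_BIGQUERY.get(field['type'], 'STRING'),
        })
    return {'fields': fields}
```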

Drivers

The default Google BigQuery client is used - docs.

API Reference

Snapshot

https://github.com/frictionlessdata/jsontableschema-py#snapshot

Detailed

Contributing

Please read the contribution guidelines:

How to Contribute

Thanks!