Datastore Connectivity for BigQuery (bgc)

Datastore Connectivity library for BigQuery in Go.

This library is compatible with Go 1.5+

Please refer to CHANGELOG.md if you encounter breaking changes.

By default, this library uses standard SQL mode and the streaming API to insert data. To use legacy SQL, add the /* USE LEGACY SQL */ hint to the query; in that case you will not be able to fetch repeated and nested fields.
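
As a minimal sketch, the hint can be added to a query string before it is passed to the manager. The helper name withLegacySQL is hypothetical, and placing the hint at the start of the statement is an assumption; this README only states that the hint has to be used.

```go
package main

import (
	"fmt"
	"strings"
)

// legacySQLHint is the hint text quoted in this README.
const legacySQLHint = "/* USE LEGACY SQL */"

// withLegacySQL is a hypothetical helper: it prefixes the hint unless the
// query already contains it. Putting the hint first is an assumption.
func withLegacySQL(query string) string {
	if strings.Contains(query, legacySQLHint) {
		return query
	}
	return legacySQLHint + " " + strings.TrimSpace(query)
}

func main() {
	fmt.Println(withLegacySQL("SELECT id, name FROM travelers"))
}
```

The resulting string would then be used as the SQL argument of a read call such as manager.ReadAll.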

Configuration parameters

insertMethod

To control the insert method, set the following value in config.parameters:

_table_name_.insertMethod = "load"

Note that if streaming is used, currently UPDATE and DELETE statements are not supported.

insertIdColumn

For streaming, you can specify which column to use as the insertId with the following config.parameters values:

_table_name_.insertMethod = "stream"
_table_name_.insertIdColumn = "sessionId"
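
The per-table keys follow a `<table>.<parameter>` pattern, so they can be built programmatically. This is a minimal sketch, assuming the parameters are supplied as a map[string]string as in the usage example in this README; the helper name streamParams is hypothetical.

```go
package main

import "fmt"

// streamParams is a hypothetical helper that builds the per-table streaming
// keys shown in this README: <table>.insertMethod and <table>.insertIdColumn.
func streamParams(table, insertIDColumn string) map[string]string {
	return map[string]string{
		table + ".insertMethod":   "stream",
		table + ".insertIdColumn": insertIDColumn,
	}
}

func main() {
	params := streamParams("travelers", "sessionId")
	params["datasetId"] = "MyDataset" // dataset-level parameter from this README
	fmt.Println(params)
}
```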

streamBatchCount

streamBatchCount controls the number of rows per batch (default 9999).

insertWaitTimeoutInMs

When inserting data, this library checks for up to 60 seconds whether the data has been added. To control this behaviour, set insertWaitTimeoutInMs (default 60 sec).

To disable this mechanism set: insertWaitTimeoutInMs: -1

insertMaxRetires

Maximum number of times an insert is retried after a 503 backend error.

datasetId

Default dataset

pageSize

The maximum number of rows of data to return per page of results (default 500). In addition to this limit, responses are also limited to 10 MB.

Credentials

  1. Google secrets for service account

a) set the GOOGLE_APPLICATION_CREDENTIALS environment variable

b) credentials can be the name of the JSON secret file placed in the ~/.secret/ folder

config.yaml

driverName: bigquery
credentials: bq # place your BigQuery secret JSON in ~/.secret/bq.json
parameters:
  datasetId: myDataset

c) full URL to the secret file

config.yaml

driverName: bigquery
credentials: file:///tmp/secret/mySecret.json
parameters:
  datasetId: myDataset

The secret file has to specify the following attributes:

type Config struct {
	//google cloud credential
	ClientEmail  string `json:"client_email,omitempty"`
	TokenURL     string `json:"token_uri,omitempty"`
	PrivateKey   string `json:"private_key,omitempty"`
	PrivateKeyID string `json:"private_key_id,omitempty"`
	ProjectID    string `json:"project_id,omitempty"`
}
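
To check a secret file before wiring it into a config, the attributes can be decoded with encoding/json. This is a minimal sketch: parseSecret is a hypothetical helper, the JSON values are placeholders, and treating every listed attribute as required is an assumption based on the statement that the secret file has to specify them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Config mirrors the secret-file attributes listed in this README.
type Config struct {
	ClientEmail  string `json:"client_email,omitempty"`
	TokenURL     string `json:"token_uri,omitempty"`
	PrivateKey   string `json:"private_key,omitempty"`
	PrivateKeyID string `json:"private_key_id,omitempty"`
	ProjectID    string `json:"project_id,omitempty"`
}

// parseSecret decodes a JSON secret and reports the first listed attribute that is empty.
func parseSecret(data []byte) (*Config, error) {
	cfg := &Config{}
	if err := json.Unmarshal(data, cfg); err != nil {
		return nil, err
	}
	required := []struct{ name, value string }{
		{"client_email", cfg.ClientEmail},
		{"token_uri", cfg.TokenURL},
		{"private_key", cfg.PrivateKey},
		{"private_key_id", cfg.PrivateKeyID},
		{"project_id", cfg.ProjectID},
	}
	for _, field := range required {
		if field.value == "" {
			return nil, fmt.Errorf("secret is missing %q", field.name)
		}
	}
	return cfg, nil
}

func main() {
	// Placeholder values only; a real secret is downloaded for the service account.
	secret := []byte(`{
		"client_email": "svc@example.iam.gserviceaccount.com",
		"token_uri": "https://oauth2.googleapis.com/token",
		"private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
		"private_key_id": "abc123",
		"project_id": "my-project"
	}`)
	cfg, err := parseSecret(secret)
	if err != nil {
		fmt.Println("invalid secret:", err)
		return
	}
	fmt.Println(cfg.ProjectID, cfg.ClientEmail)
}
```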

  2. Private key (pem)

config.yaml

driverName: bigquery
credentials: bq # place your BigQuery secret JSON in ~/.secret/bq.json
parameters:
  serviceAccountId: "***@developer.gserviceaccount.com"
  datasetId: MyDataset
  projectId: spheric-arcadia-98015
  privateKeyPath: /tmp/secret/bq.pem

Usage:

The following is a simple example of reading and inserting data:

package main

import (
    "github.com/viant/bgc"
    "github.com/viant/dsc"
    "time"
    "fmt"
    "log"
)


type MostLikedCity struct {
	City      string
	Visits    int
	Souvenirs []string
}

type Traveler struct {
	Id            int
	Name          string
	LastVisitTime time.Time
	Achievements  []string
	MostLikedCity MostLikedCity
	VisitedCities []struct {
		City   string
		Visits int
	}
}


func main() {

    config, err := dsc.NewConfigWithParameters("bigquery", "",
        "bq", // Google Cloud secret placed in ~/.secret/bq.json
        map[string]string{
            "datasetId": "MyDataset",
        })

    if err != nil {
        log.Fatal(err)
    }

		
    factory := dsc.NewManagerFactory()
    manager, err := factory.Create(config)
    if err != nil {
        log.Fatalf("Failed to create manager %v", err)
    }
   

    // Read a single row into a struct; success reports whether a row was found.
    traveler := Traveler{}
    success, err := manager.ReadSingle(&traveler, "SELECT id, name, lastVisitTime, visitedCities, achievements, mostLikedCity FROM travelers WHERE id = ?", []interface{}{4}, nil)
    if err != nil {
        panic(err.Error())
    }
    fmt.Printf("found: %v, traveler: %v\n", success, traveler)

    // Read all rows into a slice.
    travelers := make([]Traveler, 0)
    err = manager.ReadAll(&travelers, "SELECT id, name, lastVisitTime, visitedCities, achievements, mostLikedCity FROM travelers", nil, nil)
    if err != nil {
        panic(err.Error())
    }

   // ...

    inserted, updated, err := manager.PersistAll(&travelers, "travelers", nil)
    if err != nil {
        panic(err.Error())
    }
    fmt.Printf("inserted: %v, updated: %v\n", inserted, updated)
    // ...

    // Custom reading handler with a query info type (passed as a query parameter) to get CacheHit, TotalRows, TotalBytesProcessed

    var resultInfo = &bgc.QueryResultInfo{}
    var perf = make(map[string]int)
    err = manager.ReadAllWithHandler(`SELECT DATE(date), COUNT(*) FROM performance_agg WHERE DATE(date) = ? GROUP BY 1`, []interface{}{
        "2018-05-03",
        resultInfo,
    }, func(scanner dsc.Scanner) (toContinue bool, err error) {
        var date string
        var count int
        err = scanner.Scan(&date, &count)
        if err != nil {
            return false, err
        }
        perf[date] = count
        return true, nil
    })
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("cache: %v,  rows: %v, bytes: %v", resultInfo.CacheHit, resultInfo.TotalRows, resultInfo.TotalBytesProcessed)

   
    dialect := dsc.GetDatastoreDialect(config.DriverName)
    DDL, err := dialect.ShowCreateTable(manager, "performance_agg")
    fmt.Printf("%v %v\n", DDL, err)
   
}

License

The source code is made available under the terms of the Apache License, Version 2, as stated in the file LICENSE.

Individual files may be made available under their own specific license, all compatible with Apache License, Version 2. Please see individual files for details.

Credits and Acknowledgements

Library Author: Adrian Witas

Contributors: Mikhail Berlyant
