
Add a source BigQuery #1876

Closed
awoehrl opened this issue Jan 28, 2021 · 5 comments · Fixed by #4457

Comments

@awoehrl

awoehrl commented Jan 28, 2021

Requires #4401

Tell us about the new integration you’d like to have

Which source and which destination? Which frequency?
I'm looking into a way to get data out of BigQuery and into a postgres database.

Describe the context around this new integration

This is important for our internal APIs, which are fed by a postgres database. The raw data lives in BigQuery, and the aggregates need to go to postgres.

Describe the alternative you are considering or using

I will have to set up some messy cloud functions to get the data out, or set up a Meltano instance.
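For illustration, a minimal sketch of one small piece of the glue code a native connector would replace: a pure-Python mapping from common BigQuery column types to Postgres column types. The type names and fallback are assumptions for this sketch, not Airbyte's actual normalization rules.

```python
# Hypothetical BigQuery -> Postgres type mapping; illustrative only,
# not Airbyte's actual behavior.
BQ_TO_PG = {
    "INT64": "BIGINT",
    "FLOAT64": "DOUBLE PRECISION",
    "NUMERIC": "NUMERIC",
    "BOOL": "BOOLEAN",
    "STRING": "TEXT",
    "BYTES": "BYTEA",
    "DATE": "DATE",
    "DATETIME": "TIMESTAMP",
    "TIMESTAMP": "TIMESTAMPTZ",
}

def pg_column(name: str, bq_type: str) -> str:
    """Render one column definition for a Postgres CREATE TABLE.

    Unknown/nested types (STRUCT, ARRAY, GEOGRAPHY) fall back to JSONB
    in this sketch.
    """
    return f'"{name}" {BQ_TO_PG.get(bq_type.upper(), "JSONB")}'
```

For example, `pg_column("id", "INT64")` yields `"id" BIGINT`, while a `STRUCT` column falls through to `JSONB`.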


@awoehrl added the area/connectors (Connector related issues) and new-connector labels on Jan 28, 2021
@sherifnada
Contributor

sherifnada commented Feb 5, 2021

@awoehrl Thanks for creating the issue! What scale of data would you want to sync with Airbyte?

@awoehrl
Author

awoehrl commented Feb 8, 2021 via email

@sherifnada
Contributor

A user (@erosen3) has expressed interest in being able to sync views from BigQuery as well. It would be nice to have this feature.

@edbizarro
Contributor

👍

Our use case is to sync some tables/datasets between GCP projects, such as the tables exported to BigQuery by Google Analytics 360 and Firebase, at roughly 2-5 TB/day.

The native BigQuery functionality for this (the BigQuery Data Transfer Service) doesn't fully meet our needs because it has a minimum frequency of 12 hours, and we need a sync interval of at most 6 hours.

@DoNotPanicUA
Contributor

The latest state of the comprehensive tests after #4981:

| Data type | Insert values | Expected values | Comment | Common test result |
| --- | --- | --- | --- | --- |
| int64 | null, -128, 127, 9223372036854775807, -9223372036854775808 | null, -128, 127, 9223372036854775807, -9223372036854775808 | | Ok |
| int | null, -128, 127 | null, -128, 127 | | Ok |
| smallint | null, -128, 127 | null, -128, 127 | | Ok |
| integer | null, -128, 127 | null, -128, 127 | | Ok |
| bigint | null, -128, 127 | null, -128, 127 | | Ok |
| tinyint | null, -128, 127 | null, -128, 127 | | Ok |
| byteint | null, -128, 127 | null, -128, 127 | | Ok |
| numeric | null, -128, 127, 999999999999999999, -999999999999999999, 0.123456789, -0.123456789 | null, -128, 127, 999999999999999999, -999999999999999999, 0.123456789, -0.123456789 | | Ok |
| bignumeric | null, -128, 127, 999999999999999999, -999999999999999999, 0.123456789, -0.123456789 | null, -128, 127, 999999999999999999, -999999999999999999, 0.123456789, -0.123456789 | | Ok |
| decimal | null, -128, 127, 999999999999999999, -999999999999999999, 0.123456789, -0.123456789 | null, -128, 127, 999999999999999999, -999999999999999999, 0.123456789, -0.123456789 | | Ok |
| bigdecimal | null, -128, 127, 999999999999999999, -999999999999999999, 0.123456789, -0.123456789 | null, -128, 127, 999999999999999999, -999999999999999999, 0.123456789, -0.123456789 | | Ok |
| float64 | null, -128, 127, 0.123456789, -0.123456789 | null, -128.0, 127.0, 0.123456789, -0.123456789 | | Ok |
| bool | true, false, null | true, false, null | | Ok |
| bytes | FROM_BASE64("test"), null | test, null | | Ok |
| date | date('2021-10-20'), date('9999-12-31'), date('0001-01-01'), null | 2021-10-20T00:00:00Z, 9999-12-31T00:00:00Z, 0001-01-01T00:00:00Z, null | | Ok |
| datetime | datetime('2021-10-20 11:22:33'), datetime('9999-12-31 11:22:33'), datetime('0001-01-01 11:22:33'), null | 2021-10-20T11:22:33Z, 9999-12-31T11:22:33Z, 0001-01-01T11:22:33Z, null | | Ok |
| timestamp | timestamp('2021-10-20 11:22:33'), null | 2021-10-20T11:22:33Z, null | | Ok |
| geography | ST_GEOGFROMTEXT('POINT(1 2)'), null | POINT(1 2), null | | Ok |
| string | 'qwe', 'йцу', null | qwe, йцу, null | | Ok |
| struct | STRUCT("B.A",12), null | | | Ok |
| time | TIME(15, 30, 00), null | 15:30:00, null | | Ok |
| array | ['a', 'b'] | [{"test_column":"a"},{"test_column":"b"}] | | Ok |
| struct | STRUCT('s' as frst, 1 as sec, STRUCT(555 as id_col, STRUCT(TIME(15, 30, 00) as time) as mega_obbj) as obbj) | {"frst":"s","sec":1,"obbj":{"id_col":555,"mega_obbj":{"last_col":"15:30:00"}}} | | Ok |
| array | [STRUCT('qqq' as fff, 1 as ggg), STRUCT('kkk' as fff, 2 as ggg)] | [{"fff":"qqq","ggg":1},{"fff":"kkk","ggg":2}] | | Ok |
| array | [STRUCT('qqq' as fff, [STRUCT('fff' as ooo, 1 as kkk), STRUCT('hhh' as ooo, 2 as kkk)] as ggg)] | [{"fff":"qqq","ggg":[{"ooo":"fff","kkk":1},{"ooo":"hhh","kkk":2}]}] | | Ok |
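As a side note, the expected values for `date` suggest DATE columns come out as midnight-UTC timestamp strings. A minimal sketch of that normalization, inferred from the table above rather than taken from the connector's actual code:

```python
from datetime import date

def normalize_bq_date(d):
    # Inferred behavior: BigQuery DATE values appear in the output as
    # midnight-UTC timestamp strings; None passes through as null.
    if d is None:
        return None
    return f"{d.year:04d}-{d.month:02d}-{d.day:02d}T00:00:00Z"
```

For example, `normalize_bq_date(date(2021, 10, 20))` produces `2021-10-20T00:00:00Z`, matching the expected value in the table, and the zero-padded formatting also covers the edge dates `0001-01-01` and `9999-12-31`.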

10 participants