Add BigQueryReader.to_recap #325

Merged
merged 1 commit into from
Jul 19, 2023

Conversation

criccomini
Contributor

Recap can now convert BigQuery table schemas to Recap StructTypes.

Some notes:

  1. Nested types are supported (RECORD, STRUCT)
  2. Repeated/ARRAYs are supported
  3. JSON types are treated as STRING
  4. STRING and BYTES have a max length of 65KiB by default
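As a rough illustration of the four notes above, here is a minimal sketch of the mapping. It does not use Recap's or BigQuery's real classes: `SketchType`, `convert_field`, `convert_schema`, and `DEFAULT_MAX_BYTES` are hypothetical names, the input is assumed to be plain dicts with `name`/`type`/`mode`/`fields` keys, and the 65,536-byte default is only one reading of "65KiB".

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed default cap in bytes; one reading of the "65KiB" figure in note 4.
DEFAULT_MAX_BYTES = 65_536


@dataclass
class SketchType:
    """Hypothetical stand-in for a Recap type; not Recap's real class."""
    kind: str  # "struct", "list", "string", or "bytes"
    name: Optional[str] = None
    max_bytes: Optional[int] = None
    fields: list = field(default_factory=list)  # children of a struct
    values: Optional["SketchType"] = None  # element type of a list


def convert_field(bq_field: dict) -> SketchType:
    """Convert one field dict (name/type/mode/fields keys) to a SketchType."""
    name = bq_field["name"]
    bq_type = bq_field["type"].upper()
    if bq_type in ("RECORD", "STRUCT"):  # note 1: nested types recurse
        children = [convert_field(f) for f in bq_field.get("fields", [])]
        converted = SketchType("struct", name, fields=children)
    elif bq_type in ("STRING", "JSON"):  # note 3: JSON is treated as STRING
        converted = SketchType("string", name, max_bytes=DEFAULT_MAX_BYTES)
    elif bq_type == "BYTES":  # note 4: same default cap as STRING
        converted = SketchType("bytes", name, max_bytes=DEFAULT_MAX_BYTES)
    else:
        raise ValueError(f"type not covered by this sketch: {bq_type}")
    if bq_field.get("mode", "NULLABLE").upper() == "REPEATED":  # note 2
        converted.name = None  # the list carries the field name
        return SketchType("list", name, values=converted)
    return converted


def convert_schema(bq_fields: list) -> SketchType:
    """Wrap the converted top-level fields in a single struct."""
    return SketchType("struct", fields=[convert_field(f) for f in bq_fields])


# Made-up schema touching all four notes.
example_schema = [
    {"name": "payload", "type": "JSON", "mode": "NULLABLE"},
    {"name": "tags", "type": "STRING", "mode": "REPEATED"},
    {
        "name": "owner",
        "type": "RECORD",
        "mode": "NULLABLE",
        "fields": [{"name": "key", "type": "BYTES", "mode": "REQUIRED"}],
    },
]
table_struct = convert_schema(example_schema)
```

In this sketch a REPEATED field becomes a list whose element is the converted scalar or struct, so a repeated RECORD ends up as a list of structs. Whether the actual reader is structured this way is not stated in this thread.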

For (4), I had a really hard time nailing down the exact max string size in BQ. [This page](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types) says they have a 2-byte header on strings/bytes, which I assume is to store the byte length. Thus I went with 65KiB.
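For reference, the arithmetic behind that guess can be checked quickly. This assumes, as the comment does, that the 2-byte header is an unsigned length field, which the BigQuery page does not confirm. `HEADER_BYTES` and `max_length` are names made up for this sketch.

```python
HEADER_BYTES = 2  # header size quoted from the BigQuery data types page

# Largest unsigned integer that fits in the header, assuming it is a length.
max_length = 2 ** (8 * HEADER_BYTES) - 1

print(max_length)         # 65535
print(max_length / 1024)  # just under 64
```

Under that assumption the limit is 65,535 bytes, one byte short of 64 KiB (65,536 bytes), so "65KiB" here is best read as a rounded figure of about 65,000 bytes.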

I'm also using [bigquery-emulator](https://github.com/goccy/bigquery-emulator) to test. I've only included integration tests.

Closes #285

@criccomini criccomini force-pushed the add-bq-reader branch 4 times, most recently from 5cce4ad to 16365c7 Compare July 18, 2023 18:37
@criccomini
Contributor Author

Dang. The BQ emulator isn't able to start in GH workflows right now because of goccy/bigquery-emulator#208.

@criccomini criccomini force-pushed the add-bq-reader branch 20 times, most recently from 801c00c to 4cb7cc0 Compare July 19, 2023 19:56
@criccomini criccomini merged commit 85f9a8e into main Jul 19, 2023
@criccomini criccomini deleted the add-bq-reader branch July 19, 2023 20:08
Development

Successfully merging this pull request may close these issues.

Implement BigQueryReader
1 participant