Update Bigquery docs page
sherifnada committed Dec 21, 2021
1 parent 5dad5de commit 0210071
Showing 1 changed file with 5 additions and 15 deletions.
20 changes: 5 additions & 15 deletions docs/integrations/destinations/bigquery.md
@@ -100,24 +100,13 @@ Additional options can also be customized:

Once you've configured BigQuery as a destination, delete the Service Account Key from your computer.

#### Uploading Options
## Uploading Options

There are 2 available options to upload data to BigQuery: `Standard` and `GCS Staging`.

* `Standard` uploads data directly from your source to BigQuery storage. This approach is faster and requires fewer resources than the GCS one.

Please be aware that you may see some failures for big datasets and slow sources, e.g. if reading from the source takes more than 10-12 hours.

This is caused by the Google BigQuery SDK client limitations. For more details please check [https://github.com/airbytehq/airbyte/issues/3549](https://github.com/airbytehq/airbyte/issues/3549)

* `GCS Uploading (CSV format)`: This approach has been implemented in order to avoid the issue for big datasets mentioned above.

First, all data is uploaded to a GCS bucket; it is then moved to BigQuery in one shot, stream by stream.

The [destination-gcs connector](gcs.md) is partially used under the hood here, so you may check its documentation for more details.

For the GCS Staging upload type additional params must be configured:
### `GCS Staging`

This is the recommended configuration for uploading data to BigQuery. It works by first uploading all the data to a [GCS](https://cloud.google.com/storage) bucket, then ingesting the data to BigQuery. To configure GCS Staging, you'll need the following parameters:
* **GCS Bucket Name**
* **GCS Bucket Path**
* **GCS Bucket Keep files after migration**
@@ -131,7 +120,8 @@ For the GCS Staging upload type additional params must be configured:
* This depends on your networking setup.
* The easiest way to verify if Airbyte is able to connect to your GCS bucket is via the check connection tool in the UI.
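
As a rough illustration of the parameters named above, the following Python sketch collects the three visible GCS Staging fields in one place and rejects a blank bucket name. The class name `GcsStagingSettings`, its field names, the bucket values, and the blank-name check are all assumptions made for this example; they are not part of Airbyte's API, and the parameters hidden in the collapsed part of this diff are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GcsStagingSettings:
    """Hypothetical holder for the three GCS Staging fields named above."""

    bucket_name: str                  # "GCS Bucket Name"
    bucket_path: str                  # "GCS Bucket Path"
    keep_files_after_migration: bool  # "GCS Bucket Keep files after migration"

    def __post_init__(self) -> None:
        # Assumed sanity check: a blank bucket name can never be valid.
        if not self.bucket_name.strip():
            raise ValueError("GCS Bucket Name must not be empty")

    def as_form_values(self) -> dict:
        """Return the values keyed by the labels used in this document."""
        return {
            "GCS Bucket Name": self.bucket_name,
            "GCS Bucket Path": self.bucket_path,
            "GCS Bucket Keep files after migration": self.keep_files_after_migration,
        }


# Placeholder values; substitute your own bucket and path.
settings = GcsStagingSettings(
    bucket_name="my-staging-bucket",
    bucket_path="airbyte/bigquery",
    keep_files_after_migration=False,
)
print(settings.as_form_values())
```

A structure like this only checks that the values are present; whether Airbyte can actually reach the bucket is still best verified with the check connection tool in the UI.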

Note: This option partially reuses the destination-gcs connector under the hood, so you may also refer to its guide for additional clarification. The **GCS Region** is the same as the one set for BigQuery. The **Format** for GCS is set to CSV.
### `Standard` uploads
This uploads data directly from your source to BigQuery. While this is faster to set up initially, **we strongly recommend that you do not use this option for anything other than a quick demo**. It is more than 10x slower than the GCS uploading option and will fail for many datasets. Please be aware that you may see some failures for big datasets and slow sources, e.g. if reading from the source takes more than 10-12 hours. This is caused by limitations in the Google BigQuery SDK client. For more details, please check [https://github.com/airbytehq/airbyte/issues/3549](https://github.com/airbytehq/airbyte/issues/3549)

## Naming Conventions
