diff --git a/airbyte-integrations/connectors/source-google-ads/Dockerfile b/airbyte-integrations/connectors/source-google-ads/Dockerfile
index 789dabfa2fad..bc4b1bc1cc0d 100644
--- a/airbyte-integrations/connectors/source-google-ads/Dockerfile
+++ b/airbyte-integrations/connectors/source-google-ads/Dockerfile
@@ -13,5 +13,5 @@ COPY main.py ./
 ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]
-LABEL io.airbyte.version=0.2.24
+LABEL io.airbyte.version=0.3.0
 LABEL io.airbyte.name=airbyte/source-google-ads
diff --git a/airbyte-integrations/connectors/source-google-ads/metadata.yaml b/airbyte-integrations/connectors/source-google-ads/metadata.yaml
index 995dcaee3900..2cb29b637844 100644
--- a/airbyte-integrations/connectors/source-google-ads/metadata.yaml
+++ b/airbyte-integrations/connectors/source-google-ads/metadata.yaml
@@ -6,11 +6,11 @@ data:
   connectorSubtype: api
   connectorType: source
   definitionId: 253487c0-2246-43ba-a21f-5116b20a2c50
-  dockerImageTag: 0.2.24
+  dockerImageTag: 0.3.0
   dockerRepository: airbyte/source-google-ads
   githubIssueLabel: source-google-ads
   icon: google-adwords.svg
-  license: MIT
+  license: Elv2
   name: Google Ads
   registries:
     cloud:
diff --git a/airbyte-integrations/connectors/source-google-analytics-data-api/Dockerfile b/airbyte-integrations/connectors/source-google-analytics-data-api/Dockerfile
index c047cb7fd9b6..c11cf61b88e7 100644
--- a/airbyte-integrations/connectors/source-google-analytics-data-api/Dockerfile
+++ b/airbyte-integrations/connectors/source-google-analytics-data-api/Dockerfile
@@ -28,5 +28,5 @@ COPY source_google_analytics_data_api ./source_google_analytics_data_api
 ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
 ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]
-LABEL io.airbyte.version=1.0.0
+LABEL io.airbyte.version=1.1.0
 LABEL io.airbyte.name=airbyte/source-google-analytics-data-api
diff --git a/airbyte-integrations/connectors/source-google-analytics-data-api/metadata.yaml b/airbyte-integrations/connectors/source-google-analytics-data-api/metadata.yaml
index a28e15354bf9..b0f61890315a 100644
--- a/airbyte-integrations/connectors/source-google-analytics-data-api/metadata.yaml
+++ b/airbyte-integrations/connectors/source-google-analytics-data-api/metadata.yaml
@@ -7,11 +7,11 @@ data:
   connectorSubtype: api
   connectorType: source
   definitionId: 3cc2eafd-84aa-4dca-93af-322d9dfeec1a
-  dockerImageTag: 1.0.0
+  dockerImageTag: 1.1.0
   dockerRepository: airbyte/source-google-analytics-data-api
   githubIssueLabel: source-google-analytics-data-api
   icon: google-analytics.svg
-  license: MIT
+  license: Elv2
   name: Google Analytics 4 (GA4)
   registries:
     cloud:
diff --git a/airbyte-integrations/connectors/source-google-analytics-v4/Dockerfile b/airbyte-integrations/connectors/source-google-analytics-v4/Dockerfile
index 7afcfc7ca01c..6fc46f759232 100644
--- a/airbyte-integrations/connectors/source-google-analytics-v4/Dockerfile
+++ b/airbyte-integrations/connectors/source-google-analytics-v4/Dockerfile
@@ -12,5 +12,5 @@ RUN pip install .
 ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
 ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]
-LABEL io.airbyte.version=0.1.36
+LABEL io.airbyte.version=0.2.0
 LABEL io.airbyte.name=airbyte/source-google-analytics-v4
diff --git a/airbyte-integrations/connectors/source-google-analytics-v4/metadata.yaml b/airbyte-integrations/connectors/source-google-analytics-v4/metadata.yaml
index 34d4827bc5a6..0247fd8290ab 100644
--- a/airbyte-integrations/connectors/source-google-analytics-v4/metadata.yaml
+++ b/airbyte-integrations/connectors/source-google-analytics-v4/metadata.yaml
@@ -8,11 +8,11 @@ data:
   connectorSubtype: api
   connectorType: source
   definitionId: eff3616a-f9c3-11eb-9a03-0242ac130003
-  dockerImageTag: 0.1.36
+  dockerImageTag: 0.2.0
   dockerRepository: airbyte/source-google-analytics-v4
   githubIssueLabel: source-google-analytics-v4
   icon: google-analytics.svg
-  license: MIT
+  license: Elv2
   name: Google Analytics (Universal Analytics)
   registries:
     cloud:
diff --git a/airbyte-integrations/connectors/source-google-search-console/Dockerfile b/airbyte-integrations/connectors/source-google-search-console/Dockerfile
index 25a87bb38eff..7f6deb142914 100755
--- a/airbyte-integrations/connectors/source-google-search-console/Dockerfile
+++ b/airbyte-integrations/connectors/source-google-search-console/Dockerfile
@@ -12,5 +12,5 @@ RUN pip install .
 ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
 ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]
-LABEL io.airbyte.version=1.0.2
+LABEL io.airbyte.version=1.1.0
 LABEL io.airbyte.name=airbyte/source-google-search-console
diff --git a/airbyte-integrations/connectors/source-google-search-console/metadata.yaml b/airbyte-integrations/connectors/source-google-search-console/metadata.yaml
index 3ecafbd8860e..5b32eb9eb459 100644
--- a/airbyte-integrations/connectors/source-google-search-console/metadata.yaml
+++ b/airbyte-integrations/connectors/source-google-search-console/metadata.yaml
@@ -5,11 +5,11 @@ data:
   connectorSubtype: api
   connectorType: source
   definitionId: eb4c9e00-db83-4d63-a386-39cfa91012a8
-  dockerImageTag: 1.0.2
+  dockerImageTag: 1.1.0
   dockerRepository: airbyte/source-google-search-console
   githubIssueLabel: source-google-search-console
   icon: googlesearchconsole.svg
-  license: MIT
+  license: Elv2
   name: Google Search Console
   registries:
     cloud:
diff --git a/airbyte-integrations/connectors/source-google-sheets/Dockerfile b/airbyte-integrations/connectors/source-google-sheets/Dockerfile
index 722701c713b5..c08fa909cdfc 100644
--- a/airbyte-integrations/connectors/source-google-sheets/Dockerfile
+++ b/airbyte-integrations/connectors/source-google-sheets/Dockerfile
@@ -34,5 +34,5 @@ COPY source_google_sheets ./source_google_sheets
 ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
 ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]
-LABEL io.airbyte.version=0.2.39
+LABEL io.airbyte.version=0.3.0
 LABEL io.airbyte.name=airbyte/source-google-sheets
diff --git a/airbyte-integrations/connectors/source-google-sheets/metadata.yaml b/airbyte-integrations/connectors/source-google-sheets/metadata.yaml
index 6996a910ab7c..209fa3c35047 100644
--- a/airbyte-integrations/connectors/source-google-sheets/metadata.yaml
+++ b/airbyte-integrations/connectors/source-google-sheets/metadata.yaml
@@ -5,11 +5,11 @@ data:
   connectorSubtype: file
   connectorType: source
   definitionId: 71607ba1-c0ac-4799-8049-7f4b90dd50f7
-  dockerImageTag: 0.2.39
+  dockerImageTag: 0.3.0
   dockerRepository: airbyte/source-google-sheets
   githubIssueLabel: source-google-sheets
   icon: google-sheets.svg
-  license: MIT
+  license: Elv2
   name: Google Sheets
   registries:
     cloud:
diff --git a/docs/integrations/sources/google-ads.md b/docs/integrations/sources/google-ads.md
index 304f78635740..bbd0e53998b0 100644
--- a/docs/integrations/sources/google-ads.md
+++ b/docs/integrations/sources/google-ads.md
@@ -10,7 +10,9 @@ This page contains the setup guide and reference information for the Google Ads
 ## Setup guide
+
+
 ### Step 1: (For Airbyte Open Source) Apply for a developer token
 :::note
@@ -29,6 +31,7 @@ When you apply for a token, make sure to mention:
 - That you have full access to the server running the code (because you're self-hosting Airbyte)
 ### Step 2: Set up the Google Ads connector in Airbyte
+
@@ -51,6 +54,7 @@ To set up Google Ads as a source in Airbyte Cloud:
+
 **For Airbyte Open Source:**
 To set up Google Ads as a source in Airbyte Open Source:
@@ -116,21 +120,25 @@ Due to Google Ads API constraints, the `click_view` stream retrieves data one da
 :::
 For incremental streams, data is synced up to the previous day using your Google Ads account time zone since Google Ads can filter data only by [date](https://developers.google.com/google-ads/api/fields/v11/ad_group_ad#segments.date) without time. Also, some reports cannot load data real-time due to Google Ads [limitations](https://support.google.com/google-ads/answer/2544985?hl=en).
+
 ## Custom Query: Understanding Google Ads Query Language
-Additional streams for Google Ads can be dynamically created using custom queries.
+
+Additional streams for Google Ads can be dynamically created using custom queries.
 The Google Ads Query Language queries the Google Ads API. Review the [Google Ads Query Language](https://developers.google.com/google-ads/api/docs/query/overview) and the [query builder](https://developers.google.com/google-ads/api/fields/v13/query_validator) to validate your query. You can then add these as custom queries when configuring the Google Ads source.
 Example GAQL Custom Query:
+
 ```
-SELECT
- campaign.name,
- metrics.conversions,
- metrics.conversions_by_conversion_date
+SELECT
+ campaign.name,
+ metrics.conversions,
+ metrics.conversions_by_conversion_date
 FROM ad_group
 ```
+
 Note the segments.date is automatically added to the output, and does not need to be specified in the custom query. All custom reports will by synced by day.
 Each custom query in the input configuration must work for all the customer account IDs. Otherwise, the customer ID will be skipped for every query that fails the validation test. For example, if your query contains metrics fields in the select clause, it will not be executed against manager accounts.
@@ -142,6 +150,7 @@ For an existing Google Ads source, when you are updating or removing Custom GAQL
 :::
+
 ## Performance considerations
 This source is constrained by the [Google Ads API limits](https://developers.google.com/google-ads/api/docs/best-practices/quotas)
@@ -151,7 +160,8 @@ Due to a limitation in the Google Ads API which does not allow getting performan
 ## Changelog
 | Version | Date | Pull Request | Subject |
-|:---------|:-----------|:---------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------|
+| :------- | :--------- | :------------------------------------------------------- | :----------------------------------------------------------------------------------------------------------------------------------- |
+| `0.3.0` | 2023-06-26 | [27738](https://github.com/airbytehq/airbyte/pull/27738) | License Update: Elv2 |
 | `0.2.24` | 2023-06-06 | [27608](https://github.com/airbytehq/airbyte/pull/27608) | Improve Google Ads exception handling |
 | `0.2.23` | 2023-06-06 | [26905](https://github.com/airbytehq/airbyte/pull/26905) | Replace deprecated `authSpecification` in the connector specification with `advancedAuth` |
 | `0.2.22` | 2023-06-02 | [26948](https://github.com/airbytehq/airbyte/pull/26948) | Refactor error messages; add `pattern_descriptor` for fields in spec |
diff --git a/docs/integrations/sources/google-analytics-data-api.md b/docs/integrations/sources/google-analytics-data-api.md
index ede1cb87e0bf..c6d1c30d330f 100644
--- a/docs/integrations/sources/google-analytics-data-api.md
+++ b/docs/integrations/sources/google-analytics-data-api.md
@@ -4,7 +4,7 @@ This page contains the setup guide and reference information for the Google Anal
 :::note
-[Google Analytics Universal Analytics (UA) connector](https://docs.airbyte.com/integrations/sources/google-analytics-v4), uses the older version of Google Analytics, which has been the standard for tracking website and app user behavior since 2012.
+[Google Analytics Universal Analytics (UA) connector](https://docs.airbyte.com/integrations/sources/google-analytics-v4), uses the older version of Google Analytics, which has been the standard for tracking website and app user behavior since 2012.
 Google Analytics 4 (GA4) connector is the latest version of Google Analytics, which was introduced in 2020. It offers a new data model that emphasizes events and user properties, rather than pageviews and sessions. This new model allows for more flexible and customizable reporting, as well as more accurate measurement of user behavior across devices and platforms.
@@ -12,9 +12,9 @@ Google Analytics 4 (GA4) connector is the latest version of Google Analytics, wh
 ## Prerequisites
-* JSON credentials for the service account that has access to Google Analytics. For more details check [instructions](https://support.google.com/analytics/answer/1009702)
-* OAuth 2.0 credentials for the service account that has access to Google Analytics
-* Property ID
+- JSON credentials for the service account that has access to Google Analytics. For more details check [instructions](https://support.google.com/analytics/answer/1009702)
+- OAuth 2.0 credentials for the service account that has access to Google Analytics
+- Property ID
 ## Step 1: Set up Source
@@ -61,7 +61,6 @@ Use the service account email address to [add a user](https://support.google.com
 7. Enter the **Custom Reports (Optional)** a JSON array describing the custom reports you want to sync from Google Analytics.
 8. Enter the **Data request time increment in days (Optional)**. The bigger this value is, the faster the sync will be, but the more likely that sampling will be applied to your data, potentially causing inaccuracies in the returned results. We recommend setting this to 1 unless you have a hard requirement to make the sync faster at the expense of accuracy. The minimum allowed value for this field is 1, and the maximum is 364. (Not applied to custom Cohort reports).
-
 ## Supported sync modes
 The Google Analytics source connector supports the following [sync modes](https://docs.airbyte.com/cloud/core-concepts#connection-sync-modes):
@@ -75,25 +74,25 @@ The Google Analytics source connector supports the following [sync modes](https:
 This connector outputs the following incremental streams:
-* Preconfigured streams:
- * [daily_active_users](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
- * [devices](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
- * [four_weekly_active_users](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
- * [locations](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
- * [pages](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
- * [traffic_sources](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
- * [website_overview](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
- * [weekly_active_users](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
-* [Custom stream\(s\)](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
+- Preconfigured streams:
+ - [daily_active_users](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
+ - [devices](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
+ - [four_weekly_active_users](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
+ - [locations](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
+ - [pages](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
+ - [traffic_sources](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
+ - [website_overview](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
+ - [weekly_active_users](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
+- [Custom stream\(s\)](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/properties/runReport)
 ## Connector-specific features
 :::note
- * Custom reports should be provided in format `[{"name": "", "dimensions": ["", ...], "metrics": ["", ...], "cohortSpec": "", "pivots": ""}]`
- * Both `pivots` and `cohortSpec` are optional. Detailed description of the `cohortSpec` and the `pivots` objects you can find [here](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/CohortSpec) and [here](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/Pivot).
- * To enable Incremental sync for Custom reports, you need to include the `date` dimension (except for custom Cohort reports).
-:::
+- Custom reports should be provided in format `[{"name": "", "dimensions": ["", ...], "metrics": ["", ...], "cohortSpec": "", "pivots": ""}]`
+- Both `pivots` and `cohortSpec` are optional. Detailed description of the `cohortSpec` and the `pivots` objects you can find [here](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/CohortSpec) and [here](https://developers.google.com/analytics/devguides/reporting/data/v1/rest/v1beta/Pivot).
+- To enable Incremental sync for Custom reports, you need to include the `date` dimension (except for custom Cohort reports).
+ :::
 ## Performance Considerations
@@ -102,7 +101,7 @@ This connector outputs the following incremental streams:
 ## Data type map
 | Integration Type | Airbyte Type | Notes |
-|:-----------------|:-------------|:------|
+| :--------------- | :----------- | :---- |
 | `string` | `string` | |
 | `number` | `number` | |
 | `array` | `array` | |
@@ -111,7 +110,8 @@ This connector outputs the following incremental streams:
 ## Changelog
 | Version | Date | Pull Request | Subject |
-|:--------|:-----------|:---------------------------------------------------------|:------------------------------------------------------------------------------|
+| :------ | :--------- | :------------------------------------------------------- | :---------------------------------------------------------------------------- |
+| 1.1.0 | 2023-06-26 | [27738](https://github.com/airbytehq/airbyte/pull/27738) | License Update: Elv2 |
 | 1.0.0 | 2023-06-22 | [26283](https://github.com/airbytehq/airbyte/pull/26283) | Added primary_key and lookback window |
 | 0.2.7 | 2023-06-21 | [27531](https://github.com/airbytehq/airbyte/pull/27531) | Fix formatting |
 | 0.2.6 | 2023-06-09 | [27207](https://github.com/airbytehq/airbyte/pull/27207) | Improve api rate limit messages |
diff --git a/docs/integrations/sources/google-analytics-v4.md b/docs/integrations/sources/google-analytics-v4.md
index 28d11c259d5b..540748321709 100644
--- a/docs/integrations/sources/google-analytics-v4.md
+++ b/docs/integrations/sources/google-analytics-v4.md
@@ -18,7 +18,7 @@ For more information, see ["Universal Analytics is going away"](https://support.
 :::note
-Google Analytics Universal Analytics (UA) connector, uses the older version of Google Analytics, which has been the standard for tracking website and app user behavior since 2012.
+Google Analytics Universal Analytics (UA) connector, uses the older version of Google Analytics, which has been the standard for tracking website and app user behavior since 2012.
 [Google Analytics 4 (GA4) connector](https://docs.airbyte.com/integrations/sources/google-analytics-data-api) is the latest version of Google Analytics, which was introduced in 2020. It offers a new data model that emphasizes events and user properties, rather than pageviews and sessions. This new model allows for more flexible and customizable reporting, as well as more accurate measurement of user behavior across devices and platforms.
@@ -31,6 +31,7 @@ A Google Cloud account with [Viewer permissions](https://support.google.com/anal
 ## Setup guide
+
 **For Airbyte Cloud:**
 To set up Google Analytics as a source in Airbyte Cloud:
@@ -40,14 +41,15 @@ To set up Google Analytics as a source in Airbyte Cloud:
 3. On the Set up the source page, select **Google Analytics** from the **Source type** dropdown.
 4. For Name, enter a name for the Google Analytics connector.
 5. Authenticate your Google account via OAuth or Service Account Key Authentication.
- - To authenticate your Google account via OAuth, click **Sign in with Google** and complete the authentication workflow.
- - To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Make sure the Service Account has the Project Viewer permission.
+ - To authenticate your Google account via OAuth, click **Sign in with Google** and complete the authentication workflow.
+ - To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Make sure the Service Account has the Project Viewer permission.
 6. Enter the **Replication Start Date** in YYYY-MM-DD format. The data added on and after this date will be replicated. If this field is blank, Airbyte will replicate all data.
 7. Enter the [**View ID**](https://ga-dev-tools.appspot.com/account-explorer/) for the Google Analytics View you want to fetch data from.
 8. Leave **Data request time increment in days (Optional)** blank or set to 1. For faster syncs, set this value to more than 1 but that might result in the Google Analytics API returning [sampled data](#sampled-data-in-reports), potentially causing inaccuracies in the returned results. The maximum allowed value is 364.
+
 **For Airbyte Open Source:**
 To set up Google Analytics as a source in Airbyte Open Source:
@@ -56,8 +58,8 @@ To set up Google Analytics as a source in Airbyte Open Source:
 2. On the Set up the source page, select **Google Analytics** from the **Source type** dropdown.
 3. Enter a name for the Google Analytics connector.
 4. Authenticate your Google account via OAuth or Service Account Key Authentication:
- - To authenticate your Google account via OAuth, enter your Google application's [client ID, client secret, and refresh token](https://developers.google.com/identity/protocols/oauth2).
- - To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Use the service account email address to [add a user](https://support.google.com/analytics/answer/1009702) to the Google analytics view you want to access via the API and grant [Read and Analyze permissions](https://support.google.com/analytics/answer/2884495).
+ - To authenticate your Google account via OAuth, enter your Google application's [client ID, client secret, and refresh token](https://developers.google.com/identity/protocols/oauth2).
+ - To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Use the service account email address to [add a user](https://support.google.com/analytics/answer/1009702) to the Google analytics view you want to access via the API and grant [Read and Analyze permissions](https://support.google.com/analytics/answer/2884495).
 5. Enter the **Replication Start Date** in YYYY-MM-DD format. The data added on and after this date will be replicated. If this field is blank, Airbyte will replicate all data.
 6. Enter the [**View ID**](https://ga-dev-tools.appspot.com/account-explorer/) for the Google Analytics View you want to fetch data from.
 7. Optionally, enter a JSON object as a string in the **Custom Reports** field. For details, refer to [Requesting custom reports](#requesting-custom-reports)
@@ -79,7 +81,6 @@ You need to add the service account email address on the account level, not the
 :::
-
 ## Supported streams
 The Google Analytics (Universal Analytics) source connector can sync the following tables:
@@ -104,16 +105,16 @@ Reach out to us on Slack or [create an issue](https://github.com/airbytehq/airby
 [Analytics Reporting API v4](https://developers.google.com/analytics/devguides/reporting/core/v4/limits-quotas)
-* Number of requests per day per project: 50,000
-* Number of requests per view (profile) per day: 10,000 (cannot be increased)
-* Number of requests per 100 seconds per project: 2,000
-* Number of requests per 100 seconds per user per project: 100 (can be increased in Google API Console to 1,000).
+- Number of requests per day per project: 50,000
+- Number of requests per view (profile) per day: 10,000 (cannot be increased)
+- Number of requests per 100 seconds per project: 2,000
+- Number of requests per 100 seconds per user per project: 100 (can be increased in Google API Console to 1,000).
 The Google Analytics connector should not run into the "requests per 100 seconds" limitation under normal usage. [Create an issue](https://github.com/airbytehq/airbyte/issues) if you see any rate limit issues that are not automatically retried successfully and try increasing the `window_in_days` value.
 ## Sampled data in reports
-If you are not on the Google Analytics 360 tier, the Google Analytics API may return sampled data if the amount of data in your Google Analytics account exceeds Google's [pre-determined compute thresholds](https://support.google.com/analytics/answer/2637192?hl=en&ref_topic=2601030&visit_id=637868645346124317-2833523666&rd=1#thresholds&zippy=%2Cin-this-article). This means the data returned in the report is an estimate which may have some inaccuracy. This [Google page](https://support.google.com/analytics/answer/2637192) provides a comprehensive overview of how Google applies sampling to your data.
+If you are not on the Google Analytics 360 tier, the Google Analytics API may return sampled data if the amount of data in your Google Analytics account exceeds Google's [pre-determined compute thresholds](https://support.google.com/analytics/answer/2637192?hl=en&ref_topic=2601030&visit_id=637868645346124317-2833523666&rd=1#thresholds&zippy=%2Cin-this-article). This means the data returned in the report is an estimate which may have some inaccuracy. This [Google page](https://support.google.com/analytics/answer/2637192) provides a comprehensive overview of how Google applies sampling to your data.
 In order to minimize the chances of sampling being applied to your data, Airbyte makes data requests to Google in one day increments (the smallest allowed date increment). This reduces the amount of data the Google API processes per request, thus minimizing the chances of sampling being applied. The downside of requesting data in one day increments is that it increases the time it takes to export your Google Analytics data. If sampling is not a concern, you can override this behavior by setting the optional `window_in_day` parameter to specify the number of days to look back and avoid sampling. When sampling occurs, a warning is logged to the sync log.
@@ -144,56 +145,57 @@ Here is an example input "Custom Reports" field:
 To create a list of dimensions, you can use default Google Analytics dimensions (listed below) or custom dimensions if you have some defined. Each report can contain no more than 7 dimensions, and they must all be unique. The default Google Analytics dimensions are:
-* `ga:browser`
-* `ga:city`
-* `ga:continent`
-* `ga:country`
-* `ga:date`
-* `ga:deviceCategory`
-* `ga:hostname`
-* `ga:medium`
-* `ga:metro`
-* `ga:operatingSystem`
-* `ga:pagePath`
-* `ga:region`
-* `ga:socialNetwork`
-* `ga:source`
-* `ga:subContinent`
-
-To create a list of metrics, use a default Google Analytics metric (values from the list below) or custom metrics if you have defined them.
+- `ga:browser`
+- `ga:city`
+- `ga:continent`
+- `ga:country`
+- `ga:date`
+- `ga:deviceCategory`
+- `ga:hostname`
+- `ga:medium`
+- `ga:metro`
+- `ga:operatingSystem`
+- `ga:pagePath`
+- `ga:region`
+- `ga:socialNetwork`
+- `ga:source`
+- `ga:subContinent`
+
+To create a list of metrics, use a default Google Analytics metric (values from the list below) or custom metrics if you have defined them.
 A custom report can contain no more than 10 unique metrics. The default available Google Analytics metrics are:
-* `ga:14dayUsers`
-* `ga:1dayUsers`
-* `ga:28dayUsers`
-* `ga:30dayUsers`
-* `ga:7dayUsers`
-* `ga:avgSessionDuration`
-* `ga:avgTimeOnPage`
-* `ga:bounceRate`
-* `ga:entranceRate`
-* `ga:entrances`
-* `ga:exitRate`
-* `ga:exits`
-* `ga:newUsers`
-* `ga:pageviews`
-* `ga:pageviewsPerSession`
-* `ga:sessions`
-* `ga:sessionsPerUser`
-* `ga:uniquePageviews`
-* `ga:users`
+- `ga:14dayUsers`
+- `ga:1dayUsers`
+- `ga:28dayUsers`
+- `ga:30dayUsers`
+- `ga:7dayUsers`
+- `ga:avgSessionDuration`
+- `ga:avgTimeOnPage`
+- `ga:bounceRate`
+- `ga:entranceRate`
+- `ga:entrances`
+- `ga:exitRate`
+- `ga:exits`
+- `ga:newUsers`
+- `ga:pageviews`
+- `ga:pageviewsPerSession`
+- `ga:sessions`
+- `ga:sessionsPerUser`
+- `ga:uniquePageviews`
+- `ga:users`
 Incremental sync is supported only if you add `ga:date` dimension to your custom report.
 ## Changelog
 | Version | Date | Pull Request | Subject |
-|:--------|:-----------|:---------------------------------------------------------|:---------------------------------------------------------------------------------------------|
-| 0.1.36 | 2023-04-13 | [22223](https://github.com/airbytehq/airbyte/pull/22223) | Fix custom report with Segments dimensions |
+| :------ | :--------- | :------------------------------------------------------- | :------------------------------------------------------------------------------------------- |
+| 0.2.0 | 2023-06-26 | [27738](https://github.com/airbytehq/airbyte/pull/27738) | License Update: Elv2 |
+| 0.1.36 | 2023-04-13 | [22223](https://github.com/airbytehq/airbyte/pull/22223) | Fix custom report with Segments dimensions |
 | 0.1.35 | 2023-05-31 | [26885](https://github.com/airbytehq/airbyte/pull/26885) | Remove `authSpecification` from spec in favour of `advancedAuth` |
 | 0.1.34 | 2023-01-27 | [22006](https://github.com/airbytehq/airbyte/pull/22006) | Set `AvailabilityStrategy` for streams explicitly to `None` |
-| 0.1.33 | 2022-12-23 | [20858](https://github.com/airbytehq/airbyte/pull/20858) | Fix check connection |
-| 0.1.32 | 2022-11-04 | [18965](https://github.com/airbytehq/airbyte/pull/18965) | Fix for `discovery` stage, when `custom_reports` are provided with single stream as `dict` |
+| 0.1.33 | 2022-12-23 | [20858](https://github.com/airbytehq/airbyte/pull/20858) | Fix check connection |
+| 0.1.32 | 2022-11-04 | [18965](https://github.com/airbytehq/airbyte/pull/18965) | Fix for `discovery` stage, when `custom_reports` are provided with single stream as `dict` |
 | 0.1.31 | 2022-10-30 | [18670](https://github.com/airbytehq/airbyte/pull/18670) | Add `Custom Reports` schema validation on `check connection` |
 | 0.1.30 | 2022-10-13 | [17943](https://github.com/airbytehq/airbyte/pull/17943) | Fix pagination |
 | 0.1.29 | 2022-10-12 | [17905](https://github.com/airbytehq/airbyte/pull/17905) | Handle exceeded daily quota gracefully |
diff --git a/docs/integrations/sources/google-search-console.md b/docs/integrations/sources/google-search-console.md
index 930d9702913e..d72068270337 100644
--- a/docs/integrations/sources/google-search-console.md
+++ b/docs/integrations/sources/google-search-console.md
@@ -2,26 +2,25 @@
 This page contains the setup guide and reference information for the google search console source connector.
-
 ## Prerequisites
-* A verified property in Google Search Console
-* Enable Google Search Console API for GCP project at [GCP console](https://console.cloud.google.com/apis/library/searchconsole.googleapis.com)
-* Credentials to a Google Service Account \(or Google Service Account with delegated Domain Wide Authority\) or Google User Account
-* Enable Google Search Console API
-
+- A verified property in Google Search Console
+- Enable Google Search Console API for GCP project at [GCP console](https://console.cloud.google.com/apis/library/searchconsole.googleapis.com)
+- Credentials to a Google Service Account \(or Google Service Account with delegated Domain Wide Authority\) or Google User Account
+- Enable Google Search Console API
 ## Setup guide
+
 ### Step 1: Set up google search console
 #### How to create the client credentials for Google Search Console, to use with Airbyte?
 You can either:
-* Use the existing `Service Account` for your Google Project with granted Admin Permissions
-* Use your personal Google User Account with oauth. If you choose this option, your account must have permissions to view the Google Search Console project you choose.
-* Create the new `Service Account` credentials for your Google Project, and grant Admin Permissions to it
-* Follow the `Delegating domain-wide authority` process to obtain the necessary permissions to your google account from the administrator of Workspace
+- Use the existing `Service Account` for your Google Project with granted Admin Permissions
+- Use your personal Google User Account with oauth. If you choose this option, your account must have permissions to view the Google Search Console project you choose.
+- Create the new `Service Account` credentials for your Google Project, and grant Admin Permissions to it +- Follow the `Delegating domain-wide authority` process to obtain the necessary permissions to your google account from the administrator of Workspace ### Creating a Google service account @@ -31,8 +30,8 @@ A service account's credentials include a generated email address that is unique 2. If prompted, select an existing project, or create a new project. 3. Click `+ Create service account`. 4. Under Service account details, type a `name`, `ID`, and `description` for the service account, then click `Create`. - * Optional: Under `Service account permissions`, select the `IAM roles` to grant to the service account, then click `Continue`. - * Optional: Under `Grant users access to this service account`, add the `users` or `groups` that are allowed to use and manage the service account. + - Optional: Under `Service account permissions`, select the `IAM roles` to grant to the service account, then click `Continue`. + - Optional: Under `Grant users access to this service account`, add the `users` or `groups` that are allowed to use and manage the service account. 5. Go to [API Console/Credentials](https://console.cloud.google.com/apis/credentials), check the `Service Accounts` section, click on the Email address of service account you just created. 6. Open `Details` tab and find `Show domain-wide delegation`, checkmark the `Enable Google Workspace Domain-wide Delegation`. 7. On `Keys` tab click `+ Add key`, then click `Create new key`. @@ -55,13 +54,14 @@ You can return to the [API Console/Credentials](https://console.cloud.google.com Follow the Google Documentation for performing [Delegating domain-wide authority](https://developers.google.com/identity/protocols/oauth2/service-account#delegatingauthority) to create a Service account with delegated domain-wide authority. This account must be created by an administrator of your Google Workspace. 
Please make sure to grant the following `OAuth scopes` to the service user: -* `https://www.googleapis.com/auth/webmasters.readonly` +- `https://www.googleapis.com/auth/webmasters.readonly` At the end of this process, you should have JSON credentials to this Google Service Account. ## Step 2: Set up the google search console connector in Airbyte + **For Airbyte Cloud:** 1. [Log into your Airbyte Cloud](https://cloud.airbyte.com/workspaces) account. @@ -72,72 +72,69 @@ At the end of this process, you should have JSON credentials to this Google Serv 6. Fill in the `start date` field. 7. Fill in the `custom reports` (optionally) in format `{"name": "", "dimensions": ["", ...]}` 8. Fill in the `data_state` (optionally) in case you want to sync fresher data use `all' value, otherwise use 'final'. -8. You should be ready to sync data. +9. You should be ready to sync data. + **For Airbyte Open Source:** 1. Fill in the `service_account_info` and `email` fields for authentication. 2. Fill in the `site_urls` field. 3. Fill in the `start date` field. 4. Fill in the `custom reports` (optionally) in format `{"name": "", "dimensions": ["", ...]}` -5. Fill in the `data_state` (optionally) in case you want to sync fresher data use `all' value, otherwise use 'final'. +5. Fill in the `data_state` (optionally) in case you want to sync fresher data use `all' value, otherwise use 'final'. 6. You should be ready to sync data. 
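The Open Source fields listed above can be pictured as one configuration object. This is a minimal sketch, not the connector's actual spec: the key names `service_account_info`, `email`, `site_urls`, `custom_reports`, and `data_state` come from the steps above, while `start_date` as a key, the flat dictionary layout, and the helper `build_config` are assumptions made for illustration.

```python
ALLOWED_DATA_STATES = {"final", "all"}


def build_config(service_account_info, email, site_urls, start_date,
                 custom_reports=None, data_state="final"):
    """Assemble a hypothetical config dict from the documented fields."""
    # The docs name exactly two values for data_state: 'final' and 'all'.
    if data_state not in ALLOWED_DATA_STATES:
        raise ValueError(f"data_state must be one of {sorted(ALLOWED_DATA_STATES)}")

    config = {
        "service_account_info": service_account_info,
        "email": email,
        "site_urls": list(site_urls),
        "start_date": start_date,  # assumed key for the `start date` field
        "data_state": data_state,
    }

    # Custom reports are optional; each follows {"name": "", "dimensions": ["", ...]}.
    if custom_reports is not None:
        for report in custom_reports:
            if not report.get("name") or not isinstance(report.get("dimensions"), list):
                raise ValueError("each custom report needs a name and a list of dimensions")
        config["custom_reports"] = list(custom_reports)

    return config


example = build_config(
    service_account_info="<service account JSON>",
    email="admin@example.com",
    site_urls=["https://example.com/"],
    start_date="2023-01-01",
    custom_reports=[{"name": "by_date", "dimensions": ["date"]}],
)
```

Defaulting `data_state` to `final` is a choice made for this sketch; the steps above describe `all` as the option for fresher data.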
- ## Supported sync modes The Google Search Console Source connector supports the following [ sync modes](https://docs.airbyte.com/cloud/core-concepts#connection-sync-modes): - -* [Full Refresh - Overwrite](https://docs.airbyte.com/understanding-airbyte/connections/full-refresh-overwrite/) -* [Full Refresh - Append](https://docs.airbyte.com/understanding-airbyte/connections/full-refresh-append) -* [Incremental - Append](https://docs.airbyte.com/understanding-airbyte/connections/incremental-append) -* [Incremental - Deduped History](https://docs.airbyte.com/understanding-airbyte/connections/incremental-deduped-history) +- [Full Refresh - Overwrite](https://docs.airbyte.com/understanding-airbyte/connections/full-refresh-overwrite/) +- [Full Refresh - Append](https://docs.airbyte.com/understanding-airbyte/connections/full-refresh-append) +- [Incremental - Append](https://docs.airbyte.com/understanding-airbyte/connections/incremental-append) +- [Incremental - Deduped History](https://docs.airbyte.com/understanding-airbyte/connections/incremental-deduped-history) :::note - The granularity for the cursor is 1 day, so Incremental Sync in Append mode may result in duplicating the data. +The granularity for the cursor is 1 day, so Incremental Sync in Append mode may result in duplicating the data. ::: :::note - Parameter `data_state='all'` should not be used with Incremental Sync mode as it may cause data loss. +Parameter `data_state='all'` should not be used with Incremental Sync mode as it may cause data loss. 
::: ## Supported Streams -* [Sites](https://developers.google.com/webmaster-tools/search-console-api-original/v3/sites/get) -* [Sitemaps](https://developers.google.com/webmaster-tools/search-console-api-original/v3/sitemaps/list) -* [Full Analytics report](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) \(this stream has a long sync time because it is very detailed, use with care\) -* [Analytics report by country](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) -* [Analytics report by date](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) -* [Analytics report by device](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) -* [Analytics report by page](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) -* [Analytics report by query](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) -* Analytics report by custom dimensions - +- [Sites](https://developers.google.com/webmaster-tools/search-console-api-original/v3/sites/get) +- [Sitemaps](https://developers.google.com/webmaster-tools/search-console-api-original/v3/sitemaps/list) +- [Full Analytics report](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) \(this stream has a long sync time because it is very detailed, use with care\) +- [Analytics report by country](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) +- [Analytics report by date](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) +- [Analytics report by device](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) +- [Analytics report by 
page](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) +- [Analytics report by query](https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query) +- Analytics report by custom dimensions ## Performance considerations This connector attempts to back off gracefully when it hits Reports API's rate limits. To find more information about limits, see [Usage Limits](https://developers.google.com/webmaster-tools/limits) documentation. - ## Data type map | Integration Type | Airbyte Type | Notes | -|:-----------------|:-------------|:------| +| :--------------- | :----------- | :---- | | `string` | `string` | | | `number` | `number` | | | `array` | `array` | | | `object` | `object` | | - ## Changelog | Version | Date | Pull Request | Subject | -|:---------|:-----------|:--------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------| -| `1.0.2` | 2023-06-13 | [27307](https://github.com/airbytehq/airbyte/pull/27307) | Fix `data_state` config typo | +| :------- | :--------- | :------------------------------------------------------------------------------------------------------------ | :----------------------------------------------------------------------- | +| `1.1.0` | 2023-06-26 | [27738](https://github.com/airbytehq/airbyte/pull/27738) | License Update: Elv2 | +| `1.0.2` | 2023-06-13 | [27307](https://github.com/airbytehq/airbyte/pull/27307) | Fix `data_state` config typo | | `1.0.1` | 2023-05-30 | [26746](https://github.com/airbytehq/airbyte/pull/26746) | Remove `authSpecification` from connector spec in favour of advancedAuth | | `1.0.0` | 2023-05-24 | [26452](https://github.com/airbytehq/airbyte/pull/26452) | Add data_state parameter to specification | | `0.1.22` | 2023-03-20 | [22295](https://github.com/airbytehq/airbyte/pull/22295) | Update specification 
examples | diff --git a/docs/integrations/sources/google-sheets.md b/docs/integrations/sources/google-sheets.md index 838b55203f37..79c5c05f9435 100644 --- a/docs/integrations/sources/google-sheets.md +++ b/docs/integrations/sources/google-sheets.md @@ -9,6 +9,7 @@ The Google Sheets source connector pulls data from a single Google Sheets spread ## Setup guide + **For Airbyte Cloud:** To set up Google Sheets as a source in Airbyte Cloud: @@ -18,29 +19,30 @@ To set up Google Sheets as a source in Airbyte Cloud: 3. On the Set up the source page, select **Google Sheets** from the **Source type** dropdown. 4. Enter a name for the Google Sheets connector. 5. Authenticate your Google account via OAuth or Service Account Key Authentication. - - **(Recommended)** To authenticate your Google account via OAuth, click **Sign in with Google** and complete the authentication workflow. - - To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Make sure the Service Account has the Project Viewer permission. If your spreadsheet is viewable by anyone with its link, no further action is needed. If not, [give your Service account access to your spreadsheet](https://youtu.be/GyomEw5a2NQ%22). + - **(Recommended)** To authenticate your Google account via OAuth, click **Sign in with Google** and complete the authentication workflow. + - To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Make sure the Service Account has the Project Viewer permission. If your spreadsheet is viewable by anyone with its link, no further action is needed. 
If not, [give your Service account access to your spreadsheet](https://youtu.be/GyomEw5a2NQ%22). 6. For **Spreadsheet Link**, enter the link to the Google spreadsheet. To get the link, go to the Google spreadsheet you want to sync, click **Share** in the top right corner, and click **Copy Link**. 7. For **Row Batch Size**, define the number of records you want the Google API to fetch at a time. The default value is 200. + **For Airbyte Open Source:** To set up Google Sheets as a source in Airbyte Open Source: 1. [Enable the Google Cloud Platform APIs for your personal or organization account](https://support.google.com/googleapi/answer/6158841?hl=en). - :::info - The connector only finds the spreadsheet you want to replicate; it does not access any of your other files in Google Drive. - ::: + :::info + The connector only finds the spreadsheet you want to replicate; it does not access any of your other files in Google Drive. + ::: 2. Go to the Airbyte UI and in the left navigation bar, click **Sources**. In the top-right corner, click **+ New source**. 3. On the Set up the source page, select **Google Sheets** from the Source type dropdown. 4. Enter a name for the Google Sheets connector. 5. Authenticate your Google account via OAuth or Service Account Key Authentication: - - To authenticate your Google account via OAuth, enter your Google application's [client ID, client secret, and refresh token](https://developers.google.com/identity/protocols/oauth2). - - To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Make sure the Service Account has the Project Viewer permission. If your spreadsheet is viewable by anyone with its link, no further action is needed. If not, [give your Service account access to your spreadsheet](https://youtu.be/GyomEw5a2NQ%22). 
+ - To authenticate your Google account via OAuth, enter your Google application's [client ID, client secret, and refresh token](https://developers.google.com/identity/protocols/oauth2). + - To authenticate your Google account via Service Account Key Authentication, enter your [Google Cloud service account key](https://cloud.google.com/iam/docs/creating-managing-service-account-keys#creating_service_account_keys) in JSON format. Make sure the Service Account has the Project Viewer permission. If your spreadsheet is viewable by anyone with its link, no further action is needed. If not, [give your Service account access to your spreadsheet](https://youtu.be/GyomEw5a2NQ%22). 6. For **Spreadsheet Link**, enter the link to the Google spreadsheet. To get the link, go to the Google spreadsheet you want to sync, click **Share** in the top right corner, and click **Copy Link**. ### Output schema @@ -52,12 +54,13 @@ Each sheet in the selected spreadsheet is synced as a separate stream. Each sele Airbyte only supports replicating [Grid](https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/sheets#SheetType) sheets. 
+ ## Supported sync modes The Google Sheets source connector supports the following sync modes: -* [Full Refresh - Overwrite](https://docs.airbyte.com/understanding-airbyte/connections/full-refresh-overwrite/) -* [Full Refresh - Append](https://docs.airbyte.com/understanding-airbyte/connections/full-refresh-append) +- [Full Refresh - Overwrite](https://docs.airbyte.com/understanding-airbyte/connections/full-refresh-overwrite/) +- [Full Refresh - Append](https://docs.airbyte.com/understanding-airbyte/connections/full-refresh-append) ## Data type mapping @@ -65,16 +68,15 @@ The Google Sheets source connector supports the following sync modes: | :--------------- | :----------- | :---- | | any type | `string` | | - ## Performance consideration The [Google API rate limit](https://developers.google.com/sheets/api/limits) is 100 requests per 100 seconds per user and 500 requests per 100 seconds per project. Airbyte batches requests to the API in order to efficiently pull data and respects these rate limits. We recommend not using the same service user for more than 3 instances of the Google Sheets source connector to ensure high transfer speeds.
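To make the per-user limit concrete, here is a minimal sliding-window sketch of staying under 100 requests per 100 seconds. It is an illustration only and not how the connector itself throttles or batches requests; the class name `SlidingWindowLimiter` and its methods are hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Track request timestamps and report how long to wait before the next one."""

    def __init__(self, max_requests=100, window_seconds=100.0, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._stamps = deque()

    def wait_time(self):
        """Return seconds to wait before another request fits in the window."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) < self._max:
            return 0.0
        # The window is full: wait until the oldest request expires.
        return self._window - (now - self._stamps[0])

    def record(self):
        """Register that a request was just sent."""
        self._stamps.append(self._clock())
```

A caller would sleep for `wait_time()` seconds, send the request, then call `record()`. Sharing one service user across several connector instances means they all draw from the same per-user quota, which is the reason for the recommendation above.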
- ## Changelog | Version | Date | Pull Request | Subject | -|---------|------------|----------------------------------------------------------|-------------------------------------------------------------------------------| +| ------- | ---------- | -------------------------------------------------------- | ----------------------------------------------------------------------------- | +| 0.3.0 | 2023-06-26 | [27738](https://github.com/airbytehq/airbyte/pull/27738) | License Update: Elv2 | | 0.2.39 | 2023-05-31 | [26833](https://github.com/airbytehq/airbyte/pull/26833) | Remove authSpecification in favour of advancedAuth in specification | | 0.2.38 | 2023-05-16 | [26097](https://github.com/airbytehq/airbyte/pull/26097) | Refactor config error | | 0.2.37 | 2023-02-21 | [23292](https://github.com/airbytehq/airbyte/pull/23292) | Skip non grid sheets. | @@ -111,4 +113,3 @@ The [Google API rate limit](https://developers.google.com/sheets/api/limits) is | 0.1.4 | 2020-11-30 | [1046](https://github.com/airbytehq/airbyte/pull/1046) | Add connectors using an index YAML file | -