Merge branch 'master' into ddavydov/#1772-fix-datetime-field-typing-for-mixpanel
tolik0 committed Oct 2, 2023
2 parents 5068013 + f34959e commit 5c43999
Showing 18 changed files with 120 additions and 60 deletions.
2 changes: 1 addition & 1 deletion airbyte-cdk/python/.bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
- current_version = 0.51.24
+ current_version = 0.51.25
commit = False

[bumpversion:file:setup.py]
3 changes: 3 additions & 0 deletions airbyte-cdk/python/CHANGELOG.md
@@ -1,5 +1,8 @@
# Changelog

+ ## 0.51.25
+ Add configurable OpenAI embedder to CDK and add cloud environment helper

## 0.51.24
Fix previous version of request_cache clearing

4 changes: 2 additions & 2 deletions airbyte-cdk/python/Dockerfile
@@ -10,7 +10,7 @@ RUN apk --no-cache upgrade \
&& apk --no-cache add tzdata build-base

# install airbyte-cdk
- RUN pip install --prefix=/install airbyte-cdk==0.51.24
+ RUN pip install --prefix=/install airbyte-cdk==0.51.25

# build a clean environment
FROM base
@@ -32,5 +32,5 @@ ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

# needs to be the same as CDK
- LABEL io.airbyte.version=0.51.24
+ LABEL io.airbyte.version=0.51.25
LABEL io.airbyte.name=airbyte/source-declarative-manifest
7 changes: 3 additions & 4 deletions airbyte-cdk/python/airbyte_cdk/entrypoint.py
@@ -22,6 +22,7 @@
from airbyte_cdk.models.airbyte_protocol import ConnectorSpecification # type: ignore [attr-defined]
from airbyte_cdk.sources import Source
from airbyte_cdk.sources.utils.schema_helpers import check_config_against_spec_or_exit, split_config
+ from airbyte_cdk.utils import is_cloud_environment
from airbyte_cdk.utils.airbyte_secrets_utils import get_secrets, update_secrets
from airbyte_cdk.utils.constants import ENV_REQUEST_CACHE_PATH
from airbyte_cdk.utils.traced_exception import AirbyteTracedException
@@ -37,10 +38,8 @@ class AirbyteEntrypoint(object):
def __init__(self, source: Source):
init_uncaught_exception_handler(logger)

- # DEPLOYMENT_MODE is read when instantiating the entrypoint because it is the common path shared by syncs and connector
- # builder test requests
- deployment_mode = os.environ.get("DEPLOYMENT_MODE", "")
- if deployment_mode.casefold() == CLOUD_DEPLOYMENT_MODE:
+ # deployment mode is read when instantiating the entrypoint because it is the common path shared by syncs and connector builder test requests
+ if is_cloud_environment():
_init_internal_request_filter()

self.source = source
3 changes: 2 additions & 1 deletion airbyte-cdk/python/airbyte_cdk/utils/__init__.py
@@ -4,5 +4,6 @@

from .schema_inferrer import SchemaInferrer
from .traced_exception import AirbyteTracedException
+ from .is_cloud_environment import is_cloud_environment

__all__ = ["AirbyteTracedException", "SchemaInferrer"]
__all__ = ["AirbyteTracedException", "SchemaInferrer", "is_cloud_environment"]
18 changes: 18 additions & 0 deletions airbyte-cdk/python/airbyte_cdk/utils/is_cloud_environment.py
@@ -0,0 +1,18 @@
#
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.
#

import os

CLOUD_DEPLOYMENT_MODE = "cloud"


def is_cloud_environment():
    """
    Returns True if the connector is running in a cloud environment, False otherwise.
    The function checks the value of the DEPLOYMENT_MODE environment variable which is set by the platform.
    This function can be used to determine whether stricter security measures should be applied.
    """
    deployment_mode = os.environ.get("DEPLOYMENT_MODE", "")
    return deployment_mode.casefold() == CLOUD_DEPLOYMENT_MODE
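
For reference, a minimal usage sketch of the new helper (not part of this commit; it assumes a CDK version that includes the re-export added to `airbyte_cdk/utils/__init__.py` above, and the "stricter security" branch is a hypothetical stand-in for calls like `_init_internal_request_filter()` in the entrypoint):

```python
import os

from airbyte_cdk.utils import is_cloud_environment

# The platform sets DEPLOYMENT_MODE; the comparison uses casefold(), so the
# check is case-insensitive ("CLOUD", "Cloud", and "cloud" all match).
os.environ["DEPLOYMENT_MODE"] = "CLOUD"

if is_cloud_environment():
    # Hypothetical: apply stricter security measures here, mirroring how the
    # entrypoint gates _init_internal_request_filter() on this check.
    print("Cloud deployment detected: enabling stricter request filtering")
else:
    print("Non-cloud deployment: no extra restrictions applied")
```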
2 changes: 1 addition & 1 deletion airbyte-cdk/python/setup.py
@@ -26,7 +26,7 @@
name="airbyte-cdk",
# The version of the airbyte-cdk package is used at runtime to validate manifests. That validation must be
# updated if our semver format changes such as using release candidate versions.
version="0.51.24",
version="0.51.25",
description="A framework for writing Airbyte Connectors.",
long_description=README,
long_description_content_type="text/markdown",
@@ -29,5 +29,5 @@ RUN tar xf ${APPLICATION}.tar --strip-components=1

ENV ENABLE_SENTRY true

- LABEL io.airbyte.version=3.1.16
+ LABEL io.airbyte.version=3.1.17
LABEL io.airbyte.name=airbyte/destination-snowflake
@@ -37,13 +37,7 @@ integrationTestJava {
dependencies {
implementation 'com.google.cloud:google-cloud-storage:1.113.16'
implementation 'com.google.auth:google-auth-library-oauth2-http:0.25.5'
- // Updating to any newer version (e.g. 3.13.22) is causing a regression with normalization.
- // See: https://github.com/airbytehq/airbyte/actions/runs/3078146312
- // implementation 'net.snowflake:snowflake-jdbc:3.13.19'
- // Temporarily switch to a forked version of snowflake-jdbc to prevent infinitely-retried http requests
- // the diff is to replace this while(true) with while(retryCount < 100) https://github.com/snowflakedb/snowflake-jdbc/blob/v3.13.19/src/main/java/net/snowflake/client/jdbc/RestRequest.java#L121
- // TODO (edgao) explain how you built this jar
- implementation files('lib/snowflake-jdbc.jar')
+ implementation 'net.snowflake:snowflake-jdbc:3.14.1'
implementation 'org.apache.commons:commons-csv:1.4'
implementation 'org.apache.commons:commons-text:1.10.0'
implementation 'com.github.alexmojaki:s3-stream-upload:2.2.2'
Binary file not shown.
@@ -2,7 +2,7 @@ data:
connectorSubtype: database
connectorType: destination
definitionId: 424892c4-daac-4491-b35d-c6688ba547ba
- dockerImageTag: 3.1.16
+ dockerImageTag: 3.1.17
dockerRepository: airbyte/destination-snowflake
githubIssueLabel: destination-snowflake
icon: snowflake.svg
@@ -28,5 +28,5 @@ COPY source_google_analytics_data_api ./source_google_analytics_data_api
ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]

- LABEL io.airbyte.version=1.6.0
+ LABEL io.airbyte.version=2.0.0
LABEL io.airbyte.name=airbyte/source-google-analytics-data-api
@@ -7,7 +7,7 @@ data:
connectorSubtype: api
connectorType: source
definitionId: 3cc2eafd-84aa-4dca-93af-322d9dfeec1a
- dockerImageTag: 1.6.0
+ dockerImageTag: 2.0.0
dockerRepository: airbyte/source-google-analytics-data-api
githubIssueLabel: source-google-analytics-data-api
icon: google-analytics.svg
@@ -18,6 +18,11 @@
enabled: true
oss:
enabled: true
+ releases:
+   breakingChanges:
+     2.0.0:
+       message: "Version 2.0.0 introduces changes to stream naming. This fixes a defect when the connector config has multiple properties and produces duplicated streams. In order to have the connector working, please re-set up the connector."
+       upgradeDeadline: "2023-10-16"
releaseStage: generally_available
suggestedStreams:
streams:
@@ -516,7 +516,7 @@ def check_connection(self, logger: logging.Logger, config: Mapping[str, Any]) ->
invalid_metrics = ", ".join(invalid_metrics)
return False, WRONG_METRICS.format(fields=invalid_metrics, report_name=report["name"])

- report_stream = self.instantiate_report_class(report, _config, page_size=100)
+ report_stream = self.instantiate_report_class(report, False, _config, page_size=100)
# check if custom_report dimensions + metrics can be combined and report generated
stream_slice = next(report_stream.stream_slices(sync_mode=SyncMode.full_refresh))
next(report_stream.read_records(sync_mode=SyncMode.full_refresh, stream_slice=stream_slice), None)
@@ -530,10 +530,20 @@ def streams(self, config: Mapping[str, Any]) -> List[Stream]:
return [stream for report in reports + config["custom_reports_array"] for stream in self.instantiate_report_streams(report, config)]

def instantiate_report_streams(self, report: dict, config: Mapping[str, Any], **extra_kwargs) -> GoogleAnalyticsDataApiBaseStream:
+ add_name_suffix = False
for property_id in config["property_ids"]:
- yield self.instantiate_report_class(report=report, config={**config, "property_id": property_id})
+ yield self.instantiate_report_class(
+     report=report, add_name_suffix=add_name_suffix, config={**config, "property_id": property_id}
+ )
+ # Append property ID to stream name only for the second and subsequent properties.
+ # This will make a release non-breaking for users with a single property.
+ # This is a temporary solution until https://github.com/airbytehq/airbyte/issues/30926 is implemented.
+ add_name_suffix = True

- def instantiate_report_class(self, report: dict, config: Mapping[str, Any], **extra_kwargs) -> GoogleAnalyticsDataApiBaseStream:
+ @staticmethod
+ def instantiate_report_class(
+     report: dict, add_name_suffix: bool, config: Mapping[str, Any], **extra_kwargs
+ ) -> GoogleAnalyticsDataApiBaseStream:
cohort_spec = report.get("cohortSpec")
pivots = report.get("pivots")
stream_config = {
@@ -550,4 +560,7 @@ def instantiate_report_class(self, report: dict, config: Mapping[str, Any], **ex
if cohort_spec:
stream_config["cohort_spec"] = cohort_spec
report_class_tuple = (CohortReportMixin, *report_class_tuple)
return type(report["name"], report_class_tuple, {})(config=stream_config, authenticator=config["authenticator"], **extra_kwargs)
name = report["name"]
if add_name_suffix:
name = f"{name}Property{config['property_id']}"
return type(name, report_class_tuple, {})(config=stream_config, authenticator=config["authenticator"], **extra_kwargs)
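
To make the naming rule above concrete, here is a small standalone sketch (an assumed helper, not the connector's actual code) of how per-property class names are generated. The CDK's default `Stream.name` then snake-cases the generated class name, which is how user-facing names like those in the migration guide below arise:

```python
from typing import List


def report_class_names(report_name: str, property_ids: List[str]) -> List[str]:
    """Sketch of the rule: the first property keeps the original report name;
    each subsequent property gets a "Property<id>" suffix."""
    names = []
    add_name_suffix = False  # False only for the first property, as above
    for property_id in property_ids:
        name = f"{report_name}Property{property_id}" if add_name_suffix else report_name
        names.append(name)
        add_name_suffix = True
    return names


print(report_class_names("daily_active_users", ["0001", "0002", "0003"]))
# ['daily_active_users', 'daily_active_usersProperty0002', 'daily_active_usersProperty0003']
```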
@@ -97,12 +97,10 @@ def test_check(requests_mock, config_gen, config_values, is_successful, message)
assert e.value.failure_type == FailureType.config_error


- def test_streams(mocker, patch_base_class):
+ def test_streams(patch_base_class, config_gen):
+     config = config_gen(property_ids=["Prop1", "PropN"])
source = SourceGoogleAnalyticsDataApi()

- config_mock = MagicMock()
- config_mock.__getitem__.side_effect = patch_base_class["config"].__getitem__

- streams = source.streams(patch_base_class["config"])
- expected_streams_number = 57
+ streams = source.streams(config)
+ expected_streams_number = 57 * 2
+ assert len([stream for stream in streams if "_property_" in stream.name]) == 57
assert len(set(streams)) == expected_streams_number
1 change: 1 addition & 0 deletions docs/integrations/destinations/snowflake.md
@@ -271,6 +271,7 @@ Otherwise, make sure to grant the role the required permissions in the desired n

| Version | Date | Pull Request | Subject |
|:----------------|:-----------|:-----------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 3.1.17 | 2023-09-29 | [\#30938](https://github.com/airbytehq/airbyte/pull/30938) | Upgrade snowflake-jdbc driver |
| 3.1.16 | 2023-09-28 | [\#30835](https://github.com/airbytehq/airbyte/pull/30835) | Fix regression from 3.1.15 in supporting concurrent syncs with identical stream name but different namespace |
| 3.1.15 | 2023-09-26 | [\#30775](https://github.com/airbytehq/airbyte/pull/30775) | Increase async block size |
| 3.1.14 | 2023-09-27 | [\#30739](https://github.com/airbytehq/airbyte/pull/30739) | Fix column name collision detection |
27 changes: 27 additions & 0 deletions docs/integrations/sources/google-analytics-data-api-migrations.md
@@ -0,0 +1,27 @@
# Google Analytics 4 (GA4) Migration Guide

## Upgrading to 2.0.0

This major release renames most streams to avoid duplicate stream names. It affects connections configured with more than one property ID.
You must reset the connector if your config contains more than one property ID.

Suppose you have three property IDs: `0001`, `0002`, and `0003`. The second and third IDs will be included in the stream names:
- "daily_active_users",
- "daily_active_users_property_0002",
- "daily_active_users_property_0003",
- "weekly_active_users",
- "weekly_active_users_property_0002",
- "weekly_active_users_property_0003",
- ...

If your config contains no more than one property ID, your stream names will not change and no reset is required:
- "daily_active_users",
- "weekly_active_users"

Once you add a second property ID, the new streams will include that property ID in their names. Existing streams will not be affected:
- "daily_active_users",
- "daily_active_users_property_0002",
- "weekly_active_users",
- "weekly_active_users_property_0002"

If you add a second (or later) property only after running the upgrade, no reset is required.
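
If you want to preview the stream names your config will produce before deciding on a reset, one possible approach (a sketch, not official tooling; it assumes the connector package is installed locally, that `config.json` is your existing connector config, and that `streams()` may call the API to resolve metadata) mirrors the updated unit test in this commit:

```python
# Sketch: print the stream names the connector will produce for your config.
import json

from source_google_analytics_data_api import SourceGoogleAnalyticsDataApi

with open("config.json") as f:  # hypothetical path to your connector config
    config = json.load(f)

source = SourceGoogleAnalyticsDataApi()
for stream in source.streams(config):
    print(stream.name)
```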
