
Migrate connectors to use our python base image (Round 1) (#31543)
alafanechere committed Oct 18, 2023
1 parent c31cf83 commit c544183
Showing 16 changed files with 286 additions and 161 deletions.

airbyte-integrations/connectors/source-azure-blob-storage/Dockerfile

This file was deleted.

airbyte-integrations/connectors/source-azure-blob-storage/README.md
@@ -54,19 +54,70 @@ python main.py read --config secrets/config.json --catalog integration_tests/con

### Locally running the connector docker image

*Removed by this diff: the old `#### Build` section (`docker build . -t airbyte/source-azure-blob-storage:dev`), the alternative Gradle build command (`./gradlew :airbyte-integrations:connectors:source-azure-blob-storage:airbyteDocker`), and the note that, when building via Gradle, the image name and tag come from the `io.airbyte.name` and `io.airbyte.version` `LABEL`s in the Dockerfile.*

#### Use `airbyte-ci` to build your connector
The Airbyte way of building this connector is to use our `airbyte-ci` tool.
You can follow the install instructions [here](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md#L1).
Then run the following command to build your connector:

```bash
airbyte-ci connectors --name source-azure-blob-storage build
```
Once the command completes, you will find your connector image in your local Docker image store: `airbyte/source-azure-blob-storage:dev`.
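
To confirm the image landed in your local Docker image store, you can list it (a quick sanity check, not an official step of the workflow):

```bash
# List the connector image that airbyte-ci just built.
docker images airbyte/source-azure-blob-storage:dev
```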

##### Customizing our build process
When contributing to our connector, you might need to customize the build process, for example to add a system dependency or set an environment variable.
You can customize our build process by adding a `build_customization.py` module to your connector.
This module may define `pre_connector_install` and `post_connector_install` async functions, which mutate the base image and the connector container, respectively.
It will be imported at runtime by our build process, and the functions will be called if they exist.

Here is an example of a `build_customization.py` module:
```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Feel free to check the dagger documentation for more information on the Container object and its methods.
    # https://dagger-io.readthedocs.io/en/sdk-python-v0.6.4/
    from dagger import Container


async def pre_connector_install(base_image_container: Container) -> Container:
    return await base_image_container.with_env_variable("MY_PRE_BUILD_ENV_VAR", "my_pre_build_env_var_value")


async def post_connector_install(connector_container: Container) -> Container:
    return await connector_container.with_env_variable("MY_POST_BUILD_ENV_VAR", "my_post_build_env_var_value")
```
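
The example above only sets environment variables. As a sketch of the system-dependency case mentioned earlier, a `pre_connector_install` hook could also run a package install inside the base image. This assumes the base image is Debian-based, and `libpq-dev` is purely illustrative:

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from dagger import Container


async def pre_connector_install(base_image_container: Container) -> Container:
    # Hypothetical example: install a system library the connector might need.
    # Assumes a Debian-based base image; libpq-dev is illustrative only.
    return await base_image_container.with_exec(
        ["sh", "-c", "apt-get update && apt-get install -y libpq-dev"]
    )
```
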
#### Build your own connector image
This connector is built using our dynamic build process in `airbyte-ci`.
The base image used to build it is defined within the `metadata.yaml` file, under the `connectorBuildOptions` key.
The build logic is defined using [Dagger](https://dagger.io/) [here](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/pipelines/builds/python_connectors.py).
It does not rely on a Dockerfile.
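
For reference, the `connectorBuildOptions` section this commit adds to the connector's `metadata.yaml` looks like this (it pins the base image by digest, as shown in the metadata diff below):

```yaml
connectorBuildOptions:
  baseImage: docker.io/airbyte/python-connector-base:1.1.0@sha256:bd98f6505c6764b1b5f99d3aedc23dfc9e9af631a62533f60eb32b1d3dbab20c
```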

If you would like to patch our connector and build your own, a simple approach would be to:

1. Create your own Dockerfile based on the latest version of the connector image.
```Dockerfile
FROM airbyte/source-azure-blob-storage:latest

COPY . ./airbyte/integration_code
RUN pip install ./airbyte/integration_code

# The entrypoint and default env vars are already set in the base image
# ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
# ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]
```
Please treat this as an example; it is not optimized.

2. Build your image:
```bash
docker build -t airbyte/source-azure-blob-storage:dev .
# Running the spec command against your patched connector
docker run airbyte/source-azure-blob-storage:dev spec
```
#### Run
Then run any of the connector commands as follows:
[…]

@@ -126,4 +177,4 @@ You've checked out the repo, implemented a million dollar feature, and you're re
1. Bump the connector version in `Dockerfile` -- just increment the value of the `LABEL io.airbyte.version` appropriately (we use [SemVer](https://semver.org/)).
1. Create a Pull Request.
1. Pat yourself on the back for being an awesome contributor.
1. Someone from Airbyte will take a look at your PR and iterate with you to merge it into master.
airbyte-integrations/connectors/source-azure-blob-storage/metadata.yaml
```diff
@@ -1,9 +1,15 @@
 data:
+  ab_internal:
+    ql: 100
+    sl: 100
+  connectorBuildOptions:
+    baseImage: docker.io/airbyte/python-connector-base:1.1.0@sha256:bd98f6505c6764b1b5f99d3aedc23dfc9e9af631a62533f60eb32b1d3dbab20c
   connectorSubtype: file
   connectorType: source
   definitionId: fdaaba68-4875-4ed9-8fcd-4ae1e0a25093
-  dockerImageTag: 0.2.0
+  dockerImageTag: 0.2.1
   dockerRepository: airbyte/source-azure-blob-storage
+  documentationUrl: https://docs.airbyte.com/integrations/sources/azure-blob-storage
   githubIssueLabel: source-azure-blob-storage
   icon: azureblobstorage.svg
   license: MIT
@@ -14,11 +20,7 @@ data:
   oss:
     enabled: true
   releaseStage: alpha
-  documentationUrl: https://docs.airbyte.com/integrations/sources/azure-blob-storage
-  supportLevel: community
   tags:
     - language:python
-  ab_internal:
-    sl: 100
-    ql: 100
+  supportLevel: community
   metadataSpecVersion: "1.0"
```

airbyte-integrations/connectors/source-google-analytics-data-api/Dockerfile

This file was deleted.

airbyte-integrations/connectors/source-google-analytics-data-api/README.md
@@ -54,19 +54,70 @@ python main.py read --config secrets/config.json --catalog integration_tests/con

### Locally running the connector docker image

*Removed by this diff: the old `#### Build` section (`docker build . -t airbyte/source-google-analytics-data-api:dev`), the alternative Gradle build command (`./gradlew :airbyte-integrations:connectors:source-google-analytics-data-api:airbyteDocker`), and the note that, when building via Gradle, the image name and tag come from the `io.airbyte.name` and `io.airbyte.version` `LABEL`s in the Dockerfile.*

#### Use `airbyte-ci` to build your connector
The Airbyte way of building this connector is to use our `airbyte-ci` tool.
You can follow the install instructions [here](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/README.md#L1).
Then run the following command to build your connector:

```bash
airbyte-ci connectors --name source-google-analytics-data-api build
```
Once the command completes, you will find your connector image in your local Docker image store: `airbyte/source-google-analytics-data-api:dev`.
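
To confirm the image landed in your local Docker image store, you can list it (a quick sanity check, not an official step of the workflow):

```bash
# List the connector image that airbyte-ci just built.
docker images airbyte/source-google-analytics-data-api:dev
```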

##### Customizing our build process
When contributing to our connector, you might need to customize the build process, for example to add a system dependency or set an environment variable.
You can customize our build process by adding a `build_customization.py` module to your connector.
This module may define `pre_connector_install` and `post_connector_install` async functions, which mutate the base image and the connector container, respectively.
It will be imported at runtime by our build process, and the functions will be called if they exist.

Here is an example of a `build_customization.py` module:
```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Feel free to check the dagger documentation for more information on the Container object and its methods.
    # https://dagger-io.readthedocs.io/en/sdk-python-v0.6.4/
    from dagger import Container


async def pre_connector_install(base_image_container: Container) -> Container:
    return await base_image_container.with_env_variable("MY_PRE_BUILD_ENV_VAR", "my_pre_build_env_var_value")


async def post_connector_install(connector_container: Container) -> Container:
    return await connector_container.with_env_variable("MY_POST_BUILD_ENV_VAR", "my_post_build_env_var_value")
```
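
The example above only sets environment variables. As a sketch of the system-dependency case mentioned earlier, a `pre_connector_install` hook could also run a package install inside the base image. This assumes the base image is Debian-based, and `libpq-dev` is purely illustrative:

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from dagger import Container


async def pre_connector_install(base_image_container: Container) -> Container:
    # Hypothetical example: install a system library the connector might need.
    # Assumes a Debian-based base image; libpq-dev is illustrative only.
    return await base_image_container.with_exec(
        ["sh", "-c", "apt-get update && apt-get install -y libpq-dev"]
    )
```
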
#### Build your own connector image
This connector is built using our dynamic build process in `airbyte-ci`.
The base image used to build it is defined within the `metadata.yaml` file, under the `connectorBuildOptions` key.
The build logic is defined using [Dagger](https://dagger.io/) [here](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/pipelines/pipelines/builds/python_connectors.py).
It does not rely on a Dockerfile.
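
For reference, the `connectorBuildOptions` section this commit adds to the connector's `metadata.yaml` looks like this (it pins the base image by digest, as shown in the metadata diff below):

```yaml
connectorBuildOptions:
  baseImage: docker.io/airbyte/python-connector-base:1.1.0@sha256:bd98f6505c6764b1b5f99d3aedc23dfc9e9af631a62533f60eb32b1d3dbab20c
```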

If you would like to patch our connector and build your own, a simple approach would be to:

1. Create your own Dockerfile based on the latest version of the connector image.
```Dockerfile
FROM airbyte/source-google-analytics-data-api:latest

COPY . ./airbyte/integration_code
RUN pip install ./airbyte/integration_code

# The entrypoint and default env vars are already set in the base image
# ENV AIRBYTE_ENTRYPOINT "python /airbyte/integration_code/main.py"
# ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]
```
Please treat this as an example; it is not optimized.

2. Build your image:
```bash
docker build -t airbyte/source-google-analytics-data-api:dev .
# Running the spec command against your patched connector
docker run airbyte/source-google-analytics-data-api:dev spec
```
#### Run
Then run any of the connector commands as follows:
[…]

@@ -127,4 +178,4 @@ You've checked out the repo, implemented a million dollar feature, and you're re
1. Bump the connector version in `Dockerfile` -- just increment the value of the `LABEL io.airbyte.version` appropriately (we use [SemVer](https://semver.org/)).
1. Create a Pull Request.
1. Pat yourself on the back for being an awesome contributor.
1. Someone from Airbyte will take a look at your PR and iterate with you to merge it into master.
airbyte-integrations/connectors/source-google-analytics-data-api/metadata.yaml
```diff
@@ -1,14 +1,20 @@
 data:
+  ab_internal:
+    ql: 400
+    sl: 300
   allowedHosts:
     hosts:
       - oauth2.googleapis.com
       - www.googleapis.com
       - analyticsdata.googleapis.com
+  connectorBuildOptions:
+    baseImage: docker.io/airbyte/python-connector-base:1.1.0@sha256:bd98f6505c6764b1b5f99d3aedc23dfc9e9af631a62533f60eb32b1d3dbab20c
   connectorSubtype: api
   connectorType: source
   definitionId: 3cc2eafd-84aa-4dca-93af-322d9dfeec1a
-  dockerImageTag: 2.0.0
+  dockerImageTag: 2.0.1
   dockerRepository: airbyte/source-google-analytics-data-api
+  documentationUrl: https://docs.airbyte.com/integrations/sources/google-analytics-data-api
   githubIssueLabel: source-google-analytics-data-api
   icon: google-analytics.svg
   license: Elv2
@@ -18,12 +24,16 @@ data:
     enabled: true
   oss:
     enabled: true
-  releaseStage: generally_available
   releases:
     breakingChanges:
       2.0.0:
-        message: "Version 2.0.0 introduces changes to stream names for those syncing more than one Google Analytics 4 property. It allows streams from all properties to sync successfully. Please upgrade the connector to enable this additional functionality."
+        message:
+          Version 2.0.0 introduces changes to stream names for those syncing
+          more than one Google Analytics 4 property. It allows streams from all properties
+          to sync successfully. Please upgrade the connector to enable this additional
+          functionality.
         upgradeDeadline: "2023-10-16"
+  releaseStage: generally_available
   suggestedStreams:
     streams:
       - website_overview
@@ -35,11 +45,7 @@ data:
       - locations
       - four_weekly_active_users
       - sessions
-  documentationUrl: https://docs.airbyte.com/integrations/sources/google-analytics-data-api
-  supportLevel: certified
   tags:
     - language:python
-  ab_internal:
-    sl: 300
-    ql: 400
+  supportLevel: certified
   metadataSpecVersion: "1.0"
```

17 changes: 0 additions & 17 deletions airbyte-integrations/connectors/source-salesforce/Dockerfile

This file was deleted.

