-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
connectors-ci: add finalize_build
logic to handle custom Dockerfiles
#26489
connectors-ci: add finalize_build
logic to handle custom Dockerfiles
#26489
Conversation
Affected Connector ReportNOTE
|
Connector | Version | Changelog | Publish |
---|---|---|---|
source-alloydb |
2.0.28 |
✅ | ✅ |
source-alloydb-strict-encrypt |
2.0.28 |
🔵 (ignored) |
🔵 (ignored) |
source-azure-blob-storage |
0.1.0 |
✅ | ✅ |
source-bigquery |
0.2.3 |
✅ | ✅ |
source-clickhouse |
0.1.17 |
✅ | ✅ |
source-clickhouse-strict-encrypt |
0.1.17 |
🔵 (ignored) |
🔵 (ignored) |
source-cockroachdb |
0.1.22 |
✅ | ✅ |
source-cockroachdb-strict-encrypt |
0.1.22 |
🔵 (ignored) |
🔵 (ignored) |
source-db2 |
0.1.19 |
✅ | ✅ |
source-db2-strict-encrypt |
0.1.19 |
🔵 (ignored) |
🔵 (ignored) |
source-dynamodb |
0.1.2 |
✅ | ✅ |
source-e2e-test |
2.1.4 |
✅ | ✅ |
source-e2e-test-cloud |
2.1.4 |
🔵 (ignored) |
🔵 (ignored) |
source-elasticsearch |
0.1.1 |
✅ | ✅ |
source-jdbc |
0.3.5 |
🔵 (ignored) |
🔵 (ignored) |
source-kafka |
0.2.3 |
✅ | ✅ |
source-mongodb |
0.3.3 |
🔵 (ignored) |
🔵 (ignored) |
source-mongodb-strict-encrypt |
0.1.19 |
🔵 (ignored) |
🔵 (ignored) |
source-mongodb-v2 |
0.1.19 |
✅ | ✅ |
source-mssql |
1.0.16 |
✅ | ✅ |
source-mssql-strict-encrypt |
1.0.16 |
🔵 (ignored) |
🔵 (ignored) |
source-mysql |
2.0.23 |
✅ | ✅ |
source-mysql-strict-encrypt |
2.0.23 |
🔵 (ignored) |
🔵 (ignored) |
source-oracle |
0.3.24 |
✅ | ✅ |
source-oracle-strict-encrypt |
0.3.24 |
🔵 (ignored) |
🔵 (ignored) |
source-postgres |
2.0.29 |
✅ | ✅ |
source-postgres-strict-encrypt |
2.0.29 |
🔵 (ignored) |
🔵 (ignored) |
source-python-http-tutorial |
0.1.4 |
🔵 (ignored) |
🔵 (ignored) |
source-redshift |
0.3.16 |
✅ | ✅ |
source-relational-db |
0.3.1 |
🔵 (ignored) |
🔵 (ignored) |
source-scaffold-java-jdbc |
0.1.0 |
🔵 (ignored) |
🔵 (ignored) |
source-sftp |
0.1.2 |
✅ | ✅ |
source-snowflake |
0.1.34 |
✅ | ✅ |
source-stock-ticker-api-tutorial |
0.2.1 |
🔵 (ignored) |
🔵 (ignored) |
source-teradata |
0.1.0 |
✅ | ✅ |
source-tidb |
0.2.4 |
✅ | ✅ |
- See "Actionable Items" below for how to resolve warnings and errors.
❌ Destinations (49)
Connector | Version | Changelog | Publish |
---|---|---|---|
destination-azure-blob-storage |
0.2.0 |
✅ | ✅ |
destination-bigquery |
1.4.3 |
✅ | ✅ |
destination-bigquery-denormalized |
1.4.1 |
✅ | ✅ |
destination-cassandra |
0.1.4 |
✅ | ✅ |
destination-clickhouse |
0.2.3 |
✅ | ✅ |
destination-clickhouse-strict-encrypt |
0.2.3 |
🔵 (ignored) |
🔵 (ignored) |
destination-csv |
1.0.0 |
✅ | ✅ |
destination-databricks |
1.0.2 |
✅ | ✅ |
destination-dev-null |
0.3.0 |
🔵 (ignored) |
🔵 (ignored) |
destination-doris |
0.1.0 |
✅ | ✅ |
destination-dynamodb |
0.1.7 |
✅ | ✅ |
destination-e2e-test |
0.3.0 |
✅ | ✅ |
destination-elasticsearch |
0.1.6 |
✅ | ✅ |
destination-elasticsearch-strict-encrypt |
0.1.6 |
🔵 (ignored) |
🔵 (ignored) |
destination-exasol |
0.1.1 |
✅ | ✅ |
destination-gcs |
0.3.0 |
✅ | ✅ |
destination-iceberg |
0.1.0 |
✅ | ✅ |
destination-kafka |
0.1.10 |
✅ | ✅ |
destination-keen |
0.2.4 |
✅ | ✅ |
destination-kinesis |
0.1.5 |
✅ | ✅ |
destination-local-json |
0.2.11 |
✅ | ✅ |
destination-mariadb-columnstore |
0.1.7 |
✅ | ✅ |
destination-mongodb |
0.1.9 |
✅ | ✅ |
destination-mongodb-strict-encrypt |
0.1.9 |
🔵 (ignored) |
🔵 (ignored) |
destination-mqtt |
0.1.3 |
✅ | ✅ |
destination-mssql |
0.1.23 |
✅ | ✅ |
destination-mssql-strict-encrypt |
0.1.23 |
🔵 (ignored) |
🔵 (ignored) |
destination-mysql |
0.1.20 |
✅ | ✅ |
destination-mysql-strict-encrypt |
❌ 0.1.21 (mismatch: 0.1.20 ) |
🔵 (ignored) |
🔵 (ignored) |
destination-oracle |
0.1.19 |
✅ | ✅ |
destination-oracle-strict-encrypt |
0.1.19 |
🔵 (ignored) |
🔵 (ignored) |
destination-postgres |
0.3.27 |
✅ | ✅ |
destination-postgres-strict-encrypt |
0.3.27 |
🔵 (ignored) |
🔵 (ignored) |
destination-pubsub |
0.2.0 |
✅ | ✅ |
destination-pulsar |
0.1.3 |
✅ | ✅ |
destination-r2 |
0.1.0 |
✅ | ✅ |
destination-redis |
0.1.4 |
✅ | ✅ |
destination-redpanda |
0.1.0 |
✅ | ✅ |
destination-redshift |
0.4.7 |
✅ | ✅ |
destination-rockset |
0.1.4 |
✅ | ✅ |
destination-s3 |
0.4.1 |
✅ | ✅ |
destination-s3-glue |
0.1.7 |
✅ | ✅ |
destination-scylla |
0.1.3 |
✅ | ✅ |
destination-selectdb |
0.1.0 |
✅ | ✅ |
destination-snowflake |
1.0.4 |
✅ | ✅ |
destination-starburst-galaxy |
0.0.1 |
✅ | ✅ |
destination-teradata |
0.1.1 |
✅ | ✅ |
destination-tidb |
0.1.1 |
✅ | ❌ (diff seed version) |
destination-yugabytedb |
0.1.1 |
✅ | ✅ |
- See "Actionable Items" below for how to resolve warnings and errors.
👀 Other Modules (1)
- base-normalization
Actionable Items
(click to expand)
Category | Status | Actionable Item |
---|---|---|
Version | ❌ mismatch |
The version of the connector is different from its normal variant. Please bump the version of the connector. |
⚠ doc not found |
The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug. |
|
Changelog | ⚠ doc not found |
The connector does not seem to have a documentation file. This can be normal (e.g. basic connector like source-jdbc is not published or documented). Please double-check to make sure that it is not a bug. |
❌ changelog missing |
There is no chnagelog for the current version of the connector. If you are the author of the current version, please add a changelog. | |
Publish | ⚠ not in seed |
The connector is not in the cloud or oss registry, so its publication status cannot be checked. This can be normal (e.g. some connectors are cloud-specific, and only listed in the cloud seed file). Please double-check to make sure that you have added a metadata.yaml file and the expected registries are enabled. |
echo "$ARCH" | ||
yum install lzop lzo lzo-dev -y | ||
|
||
# alanechere: I'm not sure we need this custom install of lzo anymore. Using the yum install above works in the build context. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edgao I was not able to run this script successfully. The sdk install java 8.0.342-librca
command exits with a 1 status code. Considering the current code freeze on this connector I think it's fine to merge this PR with a "broken" script. I'll let your team figure the changes required to support LZO on all architectures. I'm even not sure the custom install for ARM64 is required. I tried a simple yum install lzop lzo lzo-dev -y
on the ARM64 variant and it worked.
You can reproduce my failure on this branch by running:
airbyte-ci connectors --name=destination-s3 build
I'll also let you decide if you prefer to use a bash script or the finalize_build.py
module with a finalize_build
(please check the destination-iceberg/finalize_build.py
for an example). The 🐍 way is better for improved caching and access to the Dagger API. But I also understand that running this script as it is is convenient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amazing. yeah I'll just submit an issue to figure this out, not a blocker for this pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, and the comments in the Java dockerfiles are great too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -1,3 +1,11 @@ | |||
### WARNING ### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🥇
has_finalize_bash_script = "finalize_build.sh" in finalize_scripts | ||
has_finalize_python_script = "finalize_build.py" in finalize_scripts | ||
if has_finalize_python_script and has_finalize_bash_script: | ||
raise Exception("Connector has both finalize_build.sh and finalize_build.py, please remove one of them") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch 👏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.with_entrypoint("sh") | ||
.with_exec(["/tmp/finalize_build.sh"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.with_entrypoint("sh") | |
.with_exec(["/tmp/finalize_build.sh"]) | |
.with_exec(["sh", "/tmp/finalize_build.sh"]) |
return await finalize_build(context, connector_container) | ||
|
||
|
||
async def finalize_build(context: ConnectorContext, connector_container: Container) -> Container: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Lets put this in the docs!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, submitted #26521 to get dest-s3 into good state.
echo "$ARCH" | ||
yum install lzop lzo lzo-dev -y | ||
|
||
# alanechere: I'm not sure we need this custom install of lzo anymore. Using the yum install above works in the build context. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amazing. yeah I'll just submit an issue to figure this out, not a blocker for this pr.
if not finalize_scripts: | ||
return connector_container | ||
|
||
# We don't want finalize scripts to override the entrypoint so we keep it in memory to reset it after finalization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just curious, why don't we want the scripts to override the entrypoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might reconsider it for specific upcoming cases, but I've not seen connector specifically overriding the classic airbyte entrypoint (except your "normalization in container" usecase, which is handled directly in the main code path you contributed to)
What
Closes #26426
Java connectors are not built with Dockerfiles anymore now that we switched to Dagger.
But some connectors have a custom Dockerfile that install system dependencies are add custom env variable.
This PR is a suggested approach to handle these exceptional cases.
How
finalize_build.sh
or afinalize_build.py
file can be created in the connector folder. Developers are free to pick the format they prefer. Thefinalize_build
python function allows direct interaction with the Dagger container being built, which can be convenient for docker specific instructions and improved caching.with_airbyte_python_connector
orwith_airbyte_java_connector
function will run the bash script or thefinalize_build
Python function on the container that was built with the common procedure.I added a deprecation warning in order to make our developers aware that Dockerfile changes currently have no effect on images we currently publish with the automated pipelines.
We can't delete the Dockerfiles at the moment because they're still use for
/test
- but once the tests powered by dagger are live we can proceed to their deletion.For a "real-life" example I've added:
finalize_build.sh
script to destination-s3 to install LZO compression.finalize_build.py
script to destination-iceberg to set the custom JAVA_OPTS env var this connector requiresfinalize_build.sh
script fordestination-s3
exits with 1 status code on ARM64 architecutredestination-iceberg
connector build has been broken for a long time (probably since monorepo split) and relies on Java library that are not on the airbyte repo anymore...Recommended reading order:
airbyte/tools/ci_connector_ops/ci_connector_ops/pipelines/actions/environments.py
Line 686 in d17ec25
airbyte/airbyte-integrations/connectors/destination-s3/finalize_build.sh
Line 1 in d17ec25
airbyte/airbyte-integrations/connectors/destination-iceberg/finalize_build.py
Line 6 in d17ec25
🚨 User Impact 🚨
No impact as the build logic for connector with "normal" Dockerfiles was not altered. I'll publish a java and a python connector pre-release for sanity.
Update: confirmed that the "normal" connector publish is working. The
destination-s3
failure is related to an error in thefinalize_build.sh
script I pinged @edgao about here.