This page will walk through the process of developing with the Java CDK.
- Developing with the Java CDK
The java CDK is comprised of separate modules:
core
- Shared classes for building connectors of all types.db-sources
- Shared classes for building DB sources.db-destinations
- Shared classes for building DB destinations.
Each CDK submodule may contain these elements:
src/main
- (Required.) The classes that will ship with the connector, providing capabilities to the connectors.src/test
- (Required.) These are unit tests that run as part of every build of the CDK. They help ensure that CDKmain
code is in a healthy state.src/test-integration
- (Optional.) Integration tests which provide a more extensive test of the code insrc/main
. These are not by thebuild
command but are executed as part of theintegrationTest
orintegrationTestJava
Gradle tasks.src/testFixtures
- (Optional.) These shared classes are exported for connectors for use in the connectors' own test implementations. Connectors will have access to these classes within their unit and integration tests, but the classes will not be shipped with connectors when they are published.
The CDK is published as a set of jar files sharing a version number. Every submodule generates one runtime jar for the main classes. If the submodule contains test fixtures, a second jar will be published with the test fixtures classes.
Note: Connectors do not have to manage which jars they should depend on, as this is handled automatically by the airbyte-java-connector
plugin. See example below.
To build and test the Java CDK, execute the following:
./gradlew :airbyte-cdk:java:airbyte-cdk:build
You will need to bump this version manually whenever you are making changes to code inside the CDK.
While under development, the next version number for the CDK is tracked in the file: airbyte-cdk/java/airbyte-cdk/core/src/main/resources/version.properties
.
If the CDK is not being modified, this file will contain the most recently published version number.
The CDK can be published with a GitHub Workflow and a slash command which can be run by Airbyte personnel.
To invoke via slash command (recommended), use the following syntax in a comment on the PR that contains your changes:
/publish-java-cdk # Run with the defaults (dry-run=false, force=false)
/publish-java-cdk dry-run=true # Run in dry-run mode (no-op)
/publish-java-cdk force=true # Force-publish if needing to replace an already published version
Note:
- Remember to document your changes in the Changelog section below.
- After you publish the CDK, remember to toggle
useLocalCdk
back tofalse
in all connectors. - Unless you specify
force=true
, the pipeline should fail if the version you are trying to publish already exists. - By running the publish with
dry-run=true
, you can confirm the process is working as expected, without actually publishing the changes. - In dry-run mode, you can also view and download the jars that are generated. To do so, navigate to the job status in GitHub Actions and navigate to the 'artifacts' section.
- You can also invoke manually in the GitHub Web UI. To do so: go to
Actions
tab, select thePublish Java CDK
workflow, and clickRun workflow
. - You can view and administer published CDK versions here: https://admin.cloudrepo.io/repository/airbyte-public-jars/io/airbyte/airbyte-cdk
- The public endpoint for published CDK versions is here: https://airbyte.mycloudrepo.io/public/repositories/airbyte-public-jars/io/airbyte/airbyte-cdk/
You can reference the CDK in your connector's build.gradle
file:
plugins {
id 'application'
id 'airbyte-java-connector'
}
airbyteJavaConnector {
cdkVersionRequired = '0.1.0' // The CDK version to pin to.
features = ['db-destinations'] // An array of CDK features to depend on.
useLocalCdk = false // Use 'true' to use a live reference to the
// local cdk project.
}
airbyteJavaConnector.addCdkDependencies()
Replace 0.1.0
with the CDK version you are working with. If you're actively developing the CDK and want to use the latest version locally, use the useLocalCdk
flag to use the live CDK code during builds and tests.
You can iterate on changes in the CDK local and test them in the connector without needing to publish the CDK changes publicly.
When modifying the CDK and a connector in the same PR or branch, please use the following steps:
- Set the version of the CDK in
version.properties
to the next appropriate version number and add a description in theChangelog
at the bottom of this readme file. - Modify your connector's build.gradle file as follows:
- Set
useLocalCdk
totrue
in the connector you are working on. This will ensure the connector always uses the local CDK definitions instead of the published version. - Set
cdkVersionRequired
to use the new to-be-published CDK version.
- Set
After the above, you can build and test your connector as usual. Gradle will automatically use the local CDK code files while you are working on the connector.
Once you are done developing and testing your CDK changes:
- Publish the CDK using the instructions here in this readme.
- After publishing the CDK, update the
useLocalCdk
setting by running./gradlew :airbyte-integrations:connectors:<connector-name>:disableLocalCdkRefs
. to automatically revertuseLocalCdk
tofalse
. - You can optionally run
./gradlew :airbyte-integrations:connectors:<connector-name>:assertNotUsingLocalCdk
to ensure that the project is not using a local CDK reference.
Note: You can also use ./gradlew assertNotUsingLocalCdk
or ./gradlew disableLocalCdkRefs
to run these tasks on all connectors simultaneously.
Note: after switching between a local and a pinned CDK reference, you may need to refresh dependency caches in Gradle and/or your IDE.
In Gradle, you can use the CLI arg --refresh-dependencies
the next time you build or test your connector, which will ensure that the correct version of the CDK is used after toggling the useLocalCdk
value.
You can always pin your connector to a prior stable version of the CDK, which may not match what is the latest version in the airbyte
repo. For instance, your connector can be pinned to 0.1.1
while the latest version may be 0.2.0
.
Maven and Gradle will automatically reference the correct (pinned) version of the CDK for your connector, and you can use your local IDE to browse the prior version of the codebase that corresponds to that version.
MavenLocal debugging steps:
- Confirm local publish status by running:
ls -la ~/.m2/repository/io/airbyte/airbyte-cdk/*
- Confirm jar contents by running:
jar tf ~/.m2/repository/io/airbyte/airbyte-cdk/0.0.2-SNAPSHOT/airbyte-cdk-0.0.2-SNAPSHOT.jar
- Remove CDK artifacts from MavenLocal by running:
rm -rf ~/.m2/repository/io/airbyte/airbyte-cdk/*
- Rebuid CDK artifacts by running:
./gradlew :airbyte-cdk:java:airbyte-cdk:build
or./gradlew :airbyte-cdk:java:airbyte-cdk:publishToMavenLocal
Version | Date | Pull Request | Subject |
---|---|---|---|
0.23.2 | 2024-02-22 | #35385 | Bugfix: inverted logic of disableTypeDedupe flag |
0.23.1 | 2024-02-22 | #35527 | reduce shutdow timeouts |
0.23.0 | 2024-02-22 | #35342 | Consolidate and perform upfront gathering of DB metadata state |
0.21.4 | 2024-02-21 | #35511 | Reduce CDC state compression limit to 1MB |
0.21.3 | 2024-02-20 | #35394 | Add Junit progress information to the test logs |
0.21.2 | 2024-02-20 | #34978 | Reduce log noise in NormalizationLogParser. |
0.21.1 | 2024-02-20 | #35199 | Add thread names to the logs. |
0.21.0 | 2024-02-16 | #35314 | Delete S3StreamCopier classes. These have been superseded by the async destinations framework. |
0.20.9 | 2024-02-15 | #35240 | Make state emission to platform inside state manager itself. |
0.20.8 | 2024-02-15 | #35285 | Improve blobstore module structure. |
0.20.7 | 2024-02-13 | #35236 | output logs to files in addition to stdout when running tests |
0.20.6 | 2024-02-12 | #35036 | Add trace utility to emit analytics messages. |
0.20.5 | 2024-02-13 | #34869 | Don't emit final state in SourceStateIterator there is an underlying stream failure. |
0.20.4 | 2024-02-12 | #35042 | Use delegate's isDestinationV2 invocation in SshWrappedDestination. |
0.20.3 | 2024-02-09 | #34580 | Support special chars in mysql/mssql database name. |
0.20.2 | 2024-02-12 | #35111 | Make state emission from async framework synchronized. |
0.20.1 | 2024-02-11 | #35111 | Fix GlobalAsyncStateManager stats counting logic. |
0.20.0 | 2024-02-09 | #34562 | Add new test cases to BaseTypingDedupingTest to exercise special characters. |
0.19.0 | 2024-02-01 | #34745 | Reorganize CDK module structure. |
0.18.0 | 2024-02-08 | #33606 | Add updated Initial and Incremental Stream State definitions for DB Sources. |
0.17.1 | 2024-02-08 | #35027 | Make state handling thread safe in async destination framework. |
0.17.0 | 2024-02-08 | #34502 | Enable configuring async destination batch size. |
0.16.6 | 2024-02-07 | #34892 | Improved testcontainers logging and support for unshared containers. |
0.16.5 | 2024-02-07 | #34948 | Fix source state stats counting logic |
0.16.4 | 2024-02-01 | #34727 | Add future based stdout consumer in BaseTypingDedupingTest |
0.16.3 | 2024-01-30 | #34669 | Fix org.apache.logging.log4j:log4j-slf4j-impl version conflicts. |
0.16.2 | 2024-01-29 | #34630 | expose NamingTransformer to sub-classes in destinations JdbcSqlGenerator. |
0.16.1 | 2024-01-29 | #34533 | Add a safe method to execute DatabaseMetadata's Resultset returning queries. |
0.16.0 | 2024-01-26 | #34573 | Untangle Debezium harness dependencies. |
0.15.2 | 2024-01-25 | #34441 | Improve airbyte-api build performance. |
0.15.1 | 2024-01-25 | #34451 | Async destinations: Better logging when we fail to parse an AirbyteMessage |
0.15.0 | 2024-01-23 | #34441 | Removed connector registry and micronaut dependencies. |
0.14.2 | 2024-01-24 | #34458 | Handle case-sensitivity in sentry error grouping |
0.14.1 | 2024-01-24 | #34468 | Add wait for process to be done before ending sync in destination BaseTDTest |
0.14.0 | 2024-01-23 | #34461 | Revert non backward compatible signature changes from 0.13.1 |
0.13.3 | 2024-01-23 | #34077 | Denote if destinations fully support Destinations V2 |
0.13.2 | 2024-01-18 | #34364 | Better logging in mongo db source connector |
0.13.1 | 2024-01-18 | #34236 | Add postCreateTable hook in destination JdbcSqlGenerator |
0.13.0 | 2024-01-16 | #34177 | Add useExpensiveSafeCasting param in JdbcSqlGenerator methods; add JdbcTypingDedupingTest fixture; other DV2-related changes |
0.12.1 | 2024-01-11 | #34186 | Add hook for additional destination specific checks to JDBC destination check method |
0.12.0 | 2024-01-10 | #33875 | Upgrade sshd-mina to 2.11.1 |
0.11.5 | 2024-01-10 | #34119 | Remove wal2json support for postgres+debezium. |
0.11.4 | 2024-01-09 | #33305 | Source stats in incremental syncs |
0.11.3 | 2023-01-09 | #33658 | Always fail when debezium fails, even if it happened during the setup phase. |
0.11.2 | 2024-01-09 | #33969 | Destination state stats implementation |
0.11.1 | 2024-01-04 | #33727 | SSH bastion heartbeats for Destinations |
0.11.0 | 2024-01-04 | #33730 | DV2 T+D uses Sql struct to represent transactions; other T+D-related changes |
0.10.4 | 2023-12-20 | #33071 | Add the ability to parse JDBC parameters with another delimiter than '&' |
0.10.3 | 2024-01-03 | #33312 | Send out count in AirbyteStateMessage |
0.10.1 | 2023-12-21 | #33723 | Make memory-manager log message less scary |
0.10.0 | 2023-12-20 | #33704 | JdbcDestinationHandler now properly implements getInitialRawTableState ; reenable SqlGenerator test |
0.9.0 | 2023-12-18 | #33124 | Make Schema Creation Separate from Table Creation, exclude the T&D module from the CDK |
0.8.0 | 2023-12-18 | #33506 | Improve async destination shutdown logic; more JDBC async migration work; improve DAT test schema handling |
0.7.9 | 2023-12-18 | #33549 | Improve MongoDB logging. |
0.7.8 | 2023-12-18 | #33365 | Emit stream statuses more consistently |
0.7.7 | 2023-12-18 | #33434 | Remove LEGACY state |
0.7.6 | 2023-12-14 | #32328 | Add schema less mode for mongodb CDC. Fixes for non standard mongodb id type. |
0.7.4 | 2023-12-13 | #33232 | Track stream record count during sync; only run T+D if a stream had nonzero records or the previous sync left unprocessed records. |
0.7.3 | 2023-12-13 | #33369 | Extract shared JDBC T+D code. |
0.7.2 | 2023-12-11 | #33307 | Fix DV2 JDBC type mappings (code changes in #33307). |
0.7.1 | 2023-12-01 | #33027 | Add the abstract DB source debugger. |
0.7.0 | 2023-12-07 | #32326 | Destinations V2 changes for JDBC destinations |
0.6.4 | 2023-12-06 | #33082 | Improvements to schema snapshot error handling + schema snapshot history scope (scoped to configured DB). |
0.6.2 | 2023-11-30 | #32573 | Update MSSQLConverter to enforce 6-digit microsecond precision for timestamp fields |
0.6.1 | 2023-11-30 | #32610 | Support DB initial sync using binary as primary key. |
0.6.0 | 2023-11-30 | #32888 | JDBC destinations now use the async framework |
0.5.3 | 2023-11-28 | #32686 | Better attribution of debezium engine shutdown due to heartbeat. |
0.5.1 | 2023-11-27 | #32662 | Debezium initialization wait time will now read from initial setup time. |
0.5.0 | 2023-11-22 | #32656 | Introduce TestDatabase test fixture, refactor database source test base classes. |
0.4.11 | 2023-11-14 | #32526 | Clean up memory manager logs. |
0.4.10 | 2023-11-13 | #32285 | Fix UUID codec ordering for MongoDB connector |
0.4.9 | 2023-11-13 | #32468 | Further error grouping improvements for DV2 connectors |
0.4.8 | 2023-11-09 | #32377 | source-postgres tests: skip dropping database |
0.4.7 | 2023-11-08 | #31856 | source-postgres: support for infinity date and timestamps |
0.4.5 | 2023-11-07 | #32112 | Async destinations framework: Allow configuring the queue flush threshold |
0.4.4 | 2023-11-06 | #32119 | Add STANDARD UUID codec to MongoDB debezium handler |
0.4.2 | 2023-11-06 | #32190 | Improve error deinterpolation |
0.4.1 | 2023-11-02 | #32192 | Add 's3-destinations' CDK module. |
0.4.0 | 2023-11-02 | #32050 | Fix compiler warnings. |
0.3.0 | 2023-11-02 | #31983 | Add deinterpolation feature to AirbyteExceptionHandler. |
0.2.4 | 2023-10-31 | #31807 | Handle case of debezium update and delete of records in mongodb. |
0.2.3 | 2023-10-31 | #32022 | Update Debezium version from 2.20 -> 2.4.0. |
0.2.2 | 2023-10-31 | #31976 | Debezium tweaks to make tests run faster. |
0.2.0 | 2023-10-30 | #31960 | Hoist top-level gradle subprojects into CDK. |
0.1.12 | 2023-10-24 | #31674 | Fail sync when Debezium does not shut down properly. |
0.1.11 | 2023-10-18 | #31486 | Update constants in AdaptiveSourceRunner. |
0.1.9 | 2023-10-12 | #31309 | Use toPlainString() when handling BigDecimals in PostgresConverter |
0.1.8 | 2023-10-11 | #31322 | Cap log line length to 32KB to prevent loss of records |
0.1.7 | 2023-10-10 | #31194 | Deallocate unused per stream buffer memory when empty |
0.1.6 | 2023-10-10 | #31083 | Fix precision of numeric values in async destinations |
0.1.5 | 2023-10-09 | #31196 | Update typo in CDK (CDN_LSN -> CDC_LSN) |
0.1.4 | 2023-10-06 | #31139 | Reduce async buffer |
0.1.1 | 2023-09-28 | #30835 | JDBC destinations now avoid staging area name collisions by using the raw table name as the stage name. (previously we used the stream name as the stage name) |
0.1.0 | 2023-09-27 | #30445 | First launch, including shared classes for all connectors. |
0.0.2 | 2023-08-21 | #28687 | Version bump only (no other changes). |
0.0.1 | 2023-08-08 | #28687 | Initial release for testing. |