Add and persist job failures for Normalization #14790
# Conflicts: # airbyte-integrations/bases/base-normalization/Dockerfile
.withType(AirbyteTraceMessage.Type.ERROR)
.withEmittedAt((double) System.currentTimeMillis())
.withError(new AirbyteErrorTraceMessage()
.withFailureType(FailureType.SYSTEM_ERROR) // TODO: decide on best FailureType for this
Maybe we should use `DATA_ERROR` or, even more specifically, `DBT_ERROR` here?
The reason for an error from dbt is not necessarily clear, e.g. it could be a problem with the source data or it could be a system error from a bug we've introduced or it could be an issue with the destination (and other ors)...
private final MdcScope.Builder containerLogMdcBuilder;
private final Logger logger;
private final List<String> dbtErrors = new ArrayList<>();
I'm not so sure about storing dbtErrors as data within an instance of this object, does that seem fine or is there a better approach?
Is the getter used anywhere? Creating the list in the `create` method and passing it as a parameter would remove any potential side effects.
the getter is used here
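For reference, the parameter-passing alternative suggested above could look roughly like the sketch below. `DbtLogCollector`, `create` and `accept` are hypothetical stand-ins, not the class under review, and the "error" level string is an assumption about the dbt JSON log format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the log-consuming class under review; not the actual Airbyte class.
public class DbtLogCollector {

  // The list is supplied by the caller, which keeps its own reference and needs no getter.
  private final List<String> dbtErrors;

  private DbtLogCollector(final List<String> dbtErrors) {
    this.dbtErrors = dbtErrors;
  }

  public static DbtLogCollector create(final List<String> dbtErrors) {
    return new DbtLogCollector(dbtErrors);
  }

  // Records a message only when the line was logged at error level (assumed level string).
  public void accept(final String logLevel, final String logMsg) {
    if ("error".equals(logLevel)) {
      dbtErrors.add(logMsg);
    }
  }

  public static void main(final String[] args) {
    final List<String> errors = new ArrayList<>();
    final DbtLogCollector collector = DbtLogCollector.create(errors);
    collector.accept("info", "Running with dbt");
    collector.accept("error", "Database Error in model users");
    System.out.println(errors);
  }
}
```

The trade-off is that the caller has to create the list and keep it alive, whereas the instance-field version keeps that detail inside the class and exposes it through the getter.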
Nit comments
final String logLevel = jsonLine.get("level").isNull() ? "" : jsonLine.get("level").asText();
final String logMsg = jsonLine.get("msg").isNull() ? "" : jsonLine.get("msg").asText();
try (final var mdcScope = containerLogMdcBuilder.build()) {
switch (logLevel) {
What should we do with the log lines without level? Ignore them?
This seems like a required field from dbt, and any other JSON line should be an AirbyteMessage. However, I've added a case that sends the line to the info log so we capture the unexpected.
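A minimal sketch of that fallback, assuming dbt's JSON logs use the level strings "debug", "info", "warn" and "error" (an assumption, not confirmed in this thread). `DbtLogLevelRouter` and `route` are hypothetical names; the real code would call the logger rather than return a string.

```java
// Hypothetical helper: maps a dbt log level to the logger method a line would be sent to.
public class DbtLogLevelRouter {

  static String route(final String logLevel) {
    switch (logLevel) {
      case "debug":
        return "debug";
      case "warn":
        return "warn";
      case "error":
        return "error";
      case "info":
      default:
        // Missing or unrecognised levels fall through to info so the line is still captured.
        return "info";
    }
  }

  public static void main(final String[] args) {
    System.out.println(route("error"));
    System.out.println(route(""));
  }
}
```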
return bufferedReader
.lines()
.flatMap(line -> {
final Optional<JsonNode> jsonLine = Jsons.tryDeserialize(line);
Can we extract the flatMap functions and use descriptive names for them? Like
.flatMap(printAndFilterNonJsonLines)
.flatMap(printAndFilterNonAirbyteMessageLines)
👍
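As a rough illustration of the extraction, the sketch below uses only the JDK. The two function names come from the suggestion above. The JSON check and the AirbyteMessage check are deliberately crude placeholders (hypothetical, not the real `Jsons.tryDeserialize` logic) so the example stays self-contained.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class NamedLineFilters {

  // Placeholder check: a real implementation would attempt to deserialize the line as JSON.
  static boolean looksLikeJson(final String line) {
    final String trimmed = line.trim();
    return trimmed.startsWith("{") && trimmed.endsWith("}");
  }

  // Prints lines that are not JSON and drops them from the stream.
  static final Function<String, Stream<String>> printAndFilterNonJsonLines = line -> {
    if (looksLikeJson(line)) {
      return Stream.of(line);
    }
    System.out.println("non-JSON line: " + line);
    return Stream.empty();
  };

  // Prints JSON lines that do not look like Airbyte messages (placeholder: no "type" key) and drops them.
  static final Function<String, Stream<String>> printAndFilterNonAirbyteMessageLines = line -> {
    if (line.contains("\"type\"")) {
      return Stream.of(line);
    }
    System.out.println("dbt log line: " + line);
    return Stream.empty();
  };

  static List<String> airbyteMessageLines(final Stream<String> lines) {
    return lines
        .flatMap(printAndFilterNonJsonLines)
        .flatMap(printAndFilterNonAirbyteMessageLines)
        .collect(Collectors.toList());
  }

  public static void main(final String[] args) {
    System.out.println(airbyteMessageLines(Stream.of(
        "plain text",
        "{\"level\": \"info\", \"msg\": \"hello\"}",
        "{\"type\": \"TRACE\"}")));
  }
}
```

With named functions the pipeline reads as a list of steps, and each step can be unit tested on its own.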
if (m.isEmpty()) {
// valid JSON but not an AirbyteMessage, so we assume this is a dbt json log
try {
final String logLevel = jsonLine.get("level").isNull() ? "" : jsonLine.get("level").asText();
Can we double-check whether it returns a NullNode or a null value? The doc is confusing about that (I think it differs between a value explicitly set to null and no value at all):
Method for accessing value of the specified element of an array node. For other nodes, null is always returned.
For array nodes, index specifies exact location within array and allows for efficient iteration over child elements (underlying storage is guaranteed to be efficiently indexable, i.e. has random-access to elements). If index is less than 0, or equal-or-greater than node.size(), null is returned; no exception is thrown for any index.
NOTE: if the element value has been explicitly set as null (which is different from removal!), a com.fasterxml.jackson.databind.node.NullNode will be returned, not null.
Returns:
Node that represent value of the specified element, if this node is an array and has specified element. Null otherwise.
added check for JsonNodeType.NULL
check_dbt_event_buffer_size
if [ "$ret" -eq 0 ]; then
echo -e "\nDBT >=1.0.0 detected; using 10K event buffer size\n"
- dbt_additional_args="--event-buffer-size=10000"
+ dbt_additional_args="--event-buffer-size=10000 --log-format json"
That's way better than trying to parse line breaks!
msg = "Something went wrong while transforming the catalog in Normalization. See the logs for more details."
raise AirbyteTracedException(str(e), msg, exception=e)
Will this message be shown to users (AirbyteTracedException)? Perhaps:
Something went wrong while normalizing the data moved in this sync. See the logs for more details.
how much detail do we want in those user-facing messages? I think it's good to state that the error happened in catalog transformation, i.e. we didn't even try to touch the "big data" at all:
Something went wrong while normalizing the data moved in this sync (failed to transform catalog into dbt project). See the logs for more details.
(admittedly: this is a bit bikesheddy; ideally this never happens because we have really solid test coverage on the two `transform_*.py` files, and they don't interact with the destination warehouse/db at all)
I'm biased towards differentiating when we can, so I used @edgao's suggestion.
normalizationRunner.getTraceMessages()
.forEach(traceMessage -> traceFailureReasons.add(FailureHelper.normalizationFailure(traceMessage, Long.valueOf(jobId), attempt)));
failed = true;
👍 there can be more than one trace message
Not that we have any yet, but there can be trace messages that are not failureReasons... do you need to filter this array?
good call, added filtering for ERROR traces.
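A minimal sketch of what such filtering could look like. `TraceType`, `TraceMessage` and `failureReasons` are hypothetical stand-ins for the Airbyte protocol classes and `FailureHelper`, kept to plain Java so the example is self-contained.

```java
import java.util.List;
import java.util.stream.Collectors;

public class TraceFilterSketch {

  // Hypothetical trace types; the real protocol enum is not reproduced here.
  enum TraceType { ERROR, OTHER }

  // Hypothetical minimal trace message holding a type and a user-facing message.
  static final class TraceMessage {
    final TraceType type;
    final String message;

    TraceMessage(final TraceType type, final String message) {
      this.type = type;
      this.message = message;
    }
  }

  // Only ERROR traces become failure reasons; any other trace type is skipped.
  static List<String> failureReasons(final List<TraceMessage> traces) {
    return traces.stream()
        .filter(trace -> trace.type == TraceType.ERROR)
        .map(trace -> trace.message)
        .collect(Collectors.toList());
  }

  public static void main(final String[] args) {
    System.out.println(failureReasons(List.of(
        new TraceMessage(TraceType.ERROR, "Something went wrong in normalization."),
        new TraceMessage(TraceType.OTHER, "not a failure"))));
  }
}
```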
String errorTraceString = """
{"type": "TRACE", "trace": {
"type": "ERROR", "emitted_at": 123.0, "error": {
"message": "Something went wrong in normalization.", "internal_message": "internal msg",
"stack_trace": "abc.xyz", "failure_type": "system_error"}}}
""".replace("\n", "");
👍 testing the parsing of the dbt error message itself
@edgao I've noticed recent PRs that publish normalization are running tests for the separate destinations... do I need to do that here before I publish it?
/test connector=connectors/destination-snowflake
Build Passed
/test connector=connectors/destination-postgres
Build Failed
/test connector=connectors/destination-bigquery
Build Passed
good call, that slipped my mind.
# Conflicts: # docs/understanding-airbyte/basic-normalization.md
^ postgres failure above we think was caused by this PR. Going to publish normalization as this PR doesn't touch much of base-normalization and passes normalization + non-postgres dest tests.
/test connector=bases/base-normalization
Build Passed
/publish connector=bases/base-normalization
if you have connectors that successfully published but failed definition generation, follow step 4 here
/publish connector=bases/base-normalization
if you have connectors that successfully published but failed definition generation, follow step 4 here
* added TracedException and uncaught exception handler
* added trace message capturing
* added tests for TRACE messages
* pre-json logging
* propagating normalization failures
* log format json & fix hang
* parsing dbt json logs
* bump normalization version
* tests
* Benoit comments
* update trace exception user message
* review comments
* bump version
* bump version
* review comments
* nit comments
* add normalization trace failure test
* version bump
* pmd
* formatto
* bump version
* 🎉 Source Recharge: increase `unit_test` cov, fix schemas (#14902) * Fix formatting (#14968) * Validate only on incremental (#14966) * Validate only on incremental * Add test * Format and pmd * Update test * PR comments * Update airbyte-protocol/protocol-models/src/test/java/io/airbyte/protocol/models/CatalogHelpersTest.java Co-authored-by: Lake Mossman <lake@airbyte.io> * Fix test Co-authored-by: Lake Mossman <lake@airbyte.io> * fix build: update mysqlsource with new constructor (#14974) * chore: add elasticsearch to documentation (#14948) * Upgrade platform to openjdk:19-slim-bullseye (#14971) The openjdk 17 image has multiple critical vulnerabilities. This is no longer updated frequently since 17 is now in LTS and not actively developed on. The JDK 19 images do not have any critical or high or medium vulnerabilities. Further, JVM has bytecode backwards compatibility guarantees. Here we update all of Cloud to use openjdk 19 slim to fix these holes. Leave gradle code compilation at 17 for now. Co-authored-by: Davin Chia <davinchia@gmail.com> * Update chart.yaml readme. (#14975) The release action is failing because https://github.com/airbytehq/airbyte/runs/7477005935?check_suite_focus=true some values are not up to date. 
* 📝 improve error message when using unsupported JDBC type as cursor (#14714) * improve error message * also in JdbcSourceOperations * bump versions + changelog * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Normalization for Snowflake destination: added support for key pair authentication (#14792) * Normalization for Snowflake destination: added support for key pair authentication * Normalization for Snowflake destination: updated changelogs * Normalization for Snowflake destination: renamed property passphrase to password * Normalization for Snowflake destination: bump normalization version in NormalizationRunnerFactory * Normalization for Snowflake destination: added unit tests and change file creating process * Normalization for Snowflake destination: added unit tests and change file creating process * Normalization for Snowflake destination: added unit tests and change file creating process * Bump Airbyte version from 0.39.37-alpha to 0.39.38-alpha (#14976) Co-authored-by: davinchia <davinchia@users.noreply.github.com> * 🐛 Source Recharge: fix `additionaProperties` in spec.json (#14978) * 🎉Source Snowflake: Source/Destination doesn't respect DATE data type (#14828) airbyte-5577: Respect date/datetime types for snowflake source. * Source TiDb Removed additionalProperties: false from spec (#14996) * Source TiDb Removed additionalProperties: false from spec * updated changelog * bump version * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Bugfix: sweep-pod.sh never deleting certain Failed pods (#14925) (#14931) * Make sure Airbyte release process uses JDK 19. 
(#14993) Follow up to #14971 . Make sure to update this for the OSS publishing process as well. Also update all the dockerfiles. Connectors are not touched. * Add callout tag for on-going known issue (#14893) * Add callout tag for on-going known issue Airbyte is currently working on resolving this issue with the Amazon Seller Partner connector. The issue is being tracked here: https://github.com/airbytehq/airbyte/issues/14734 * Update callout type * Update to use caution * Edited wording -- remove the word "Note" * Fixed some minor typos and formatting (#13577) * Fixed some minor typos and formatting * Update based on comments * Update with changes (#14359) * Update with changes - Update from Slack to Discourse forum - Remove mentions of Singer including broken examples * Update based on comments * Add source type to SourceDefinitionRead (#14967) * fix: fix airbyte-worker (#14977) * fix: fix airbyte-worker * bump temporalio version to 1.13.0 * fix: revert removal of deploymentMode variable * Remove old roadmap page in docs (#14814) * Bump Airbyte version from 0.39.38-alpha to 0.39.39-alpha (#15005) Co-authored-by: davinchia <davinchia@users.noreply.github.com> * fix: airbyte-webapp/Dockerfile to reduce vulnerabilities (#15023) The following vulnerabilities are fixed with an upgrade: - https://snyk.io/vuln/SNYK-ALPINE313-BUSYBOX-2953337 - https://snyk.io/vuln/SNYK-ALPINE313-BUSYBOX-2953337 - https://snyk.io/vuln/SNYK-ALPINE313-OPENSSL-1569448 - https://snyk.io/vuln/SNYK-ALPINE313-OPENSSL-2941811 - https://snyk.io/vuln/SNYK-ALPINE313-OPENSSL-2941811 * don't set workdir to /data when running SAT (#15024) * don't set workdir to /data * remove debug log * bump sat version (#15026) * Alex/lowcode referencedocs (#14973) * Add docstrings for auth package * docstrings for the check package * docstrings for the datetime package * docstrings for the decoder package * docstrings for extractors package and fix tests * interpolation docstrings * ref -> and parser docstrings * 
docstrings for parsers package * error handler docstrings * requester docstrings * more docstrings * docstrings * docstrings * docstrings * Use defined type annotations * update * update docstrings * Update docstrings * update docstrings * update docstrings * update template * Revert "update template" This reverts commit eb4a11858b2ffcd86cda430a78fc4215590f84f0. * update template * update * move to interpolated_string * update docstring * update * fix tests * format * return type can also be an array * Update airbyte-cdk/python/airbyte_cdk/sources/declarative/interpolation/interpolated_boolean.py Co-authored-by: Sherif A. Nada <snadalive@gmail.com> * Update airbyte-cdk/python/airbyte_cdk/sources/declarative/interpolation/interpolation.py Co-authored-by: Sherif A. Nada <snadalive@gmail.com> * Update airbyte-cdk/python/airbyte_cdk/sources/declarative/interpolation/jinja.py Co-authored-by: Sherif A. Nada <snadalive@gmail.com> * Update airbyte-cdk/python/airbyte_cdk/sources/declarative/interpolation/interpolated_boolean.py Co-authored-by: Sherif A. Nada <snadalive@gmail.com> * Update airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/error_handlers/backoff_strategy.py * Update as per comments Co-authored-by: Sherif A. Nada <snadalive@gmail.com> * Use latest Nginx Alpine image for Webapp. (#15029) Follow up to #15023. Use the latest to get rid of all the security vulnerabilities. * Remove insecure curl from worker image. (#15028) Curl was the last remaining security vulnerability in the image. Instead of using curl, use wget to avoid this issue. This also has the side effect of decreasing the image size by 150 MB. 
* [low-code CDK] Enable runtime string interpolation in authenticators (#14914) * interpolatedauth * fix tests * fix import * no need for default * Bump version * Missing docstrings * example * missing example * more docstrings * interpolated types * 13539 Fix integration tests source-clickhouse Mac OS (#14701) * 13539 Fix integration tests source-clickhouse Mac OS * 13539 Updated clickhouse jdbc driver * 13539 Updated destination-clickhouse-strict-encrypt * 13539 Updated SSL configuration and tests for clickhouse-destination * 13539 Updated SSL for source-clickhouse-strict-encrypt * 13539 Resolved host by ip * 13539 Fixed code formatting * 13539 Bump up source-clickhouse-strict-encrypt version * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Source Klaviyo: added new keys to schema (#14947) * feat: added new keys to schema * Rename Campaigns.json to campaigns.json * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Source Hubspot: revert v0.1.75 changes (#14999) * Revert "Source Hubspot: do not override _read_incremental (#14744)" This reverts commit ae0cf4cb3431776a4b28081b244d30d7bfa1e8f1. 
* #14034 source hubspot: revert previous version changes * #14034 upd changelog * #14034 do not revert source definitions * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Make Temporal workflows use new schema for scheduling to support cronstrings (#14873) * #14048 source klaviyo: align the docs with the standard template (#15034) * Source Hubspot: do not limit reading data to 30 days for property History Stream (#15035) * #352 source hubspot: do not limit reading data to 30 days for property history stream * #352 oncall: source hubspot - upd changelog * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * :tada: Base Normalization: handle airbyte_type from stream schema in normalization (#13591) * add datatypes * up * up * add MySQL * add MSSQL * fix * add macros * add macros * upd * upd * upd for clickhouse * Return datetime2 for MS SQL * Upd time type for mysql * Upd datetime for MySQL * update * upd date type for clickhouse * up * auto-generate * bump version * bump version * Stringify experiments object for segment (#14960) * Source Facebook Marketing: update sdk version to 14.0.0 (#15007) * update sdk version and docs * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * feat: Update docs with Helm related stuff (#15037) * feat: Update docs with Helm related stuff * Update docs/deploying-airbyte/on-kubernetes-via-helm.md Co-authored-by: Topher Lubaway <asimplechris@gmail.com> * Update docs/deploying-airbyte/on-kubernetes-via-helm.md Co-authored-by: Topher Lubaway <asimplechris@gmail.com> * Update docs/deploying-airbyte/on-kubernetes-via-helm.md Co-authored-by: Topher Lubaway <asimplechris@gmail.com> * Update docs/deploying-airbyte/on-kubernetes-via-helm.md Co-authored-by: Topher Lubaway <asimplechris@gmail.com> * Update 
docs/deploying-airbyte/on-kubernetes-via-helm.md Co-authored-by: Topher Lubaway <asimplechris@gmail.com> Co-authored-by: Topher Lubaway <asimplechris@gmail.com> * Log stream_instance's metadata (#15025) * Log stream_instance's metadata * syncmode is only set on ConfiguredStream * log both configured stream and stream instance * non-jdbc source connectors: Update additional properties from beta/GA specs and schemas (#15042) * Update `additionalProperties` field to true from schemas * Updated PR number * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * 🐛 Source Facebook Marketing: fix `DATA_RETENTION_PERIOD` validation and schema data type `failed_delivery_checks` issues (#15012) * Fix date validation and schema issues * Updated PR number * Updated to review * Updated to review * Update airbyte-integrations/connectors/source-facebook-marketing/unit_tests/test_utils.py Co-authored-by: Pedro S. Lopez <pedroslopez@me.com> * Update airbyte-integrations/connectors/source-facebook-marketing/unit_tests/test_utils.py Co-authored-by: Pedro S. Lopez <pedroslopez@me.com> * Fix to linter * Fix typo * Updated Docker version * auto-bump connector version [ci skip] Co-authored-by: Pedro S. Lopez <pedroslopez@me.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Source-posthog: manually bumps up version. (#15045) * Manually bumps up PostHog source version. * Updates Persons schema and spec.json * Updates version in the correct file. 
* auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * reconcile junit version (#15047) * Handle ints and longs in normalization (#14362) * generate airbyte_type:integer * normalization accepts `airbyte_type: integer` * handles ints+longs * update avro for consistency * delete long type for now, treat all ints as longs * update avro type mappings {type:number, airbyte_type:integer} -> long {type:number, airbyte_type:big_integer} -> string (i.e. "unbounded integer") * fix test * remove long handling * Revert "remove long handling" This reverts commit 33ade8d2831e675c3545ac6019d200ec312e54d9. * Revert "update avro type mappings" This reverts commit 5b0349badad7545efe8e1191291a628445fe1c84. * Revert "delete long type for now, treat all ints as longs" This reverts commit 018efd4a5d0c59f392fd8e3b0d0967c666b72947. * Revert "update avro for consistency" This reverts commit bcf47c6799b5906deb4f219d7f6e64ea73b41b74. * newline@eof * update test * slightly better local tests * fix test * missed a few cases * postgres tests use correct hostnames * fix normalization * fix int macro * add test case * normalization test output * handle int/long correctly * fix types for other DBs * uint32 -> bigint; tests * add type value assertions * more test updates * regenerate output * reconcile big_integer to match docs * update comment * fix type * fix mysql constructor call * bigint only has 38 digits * fix s3 ints, fix DAT test case * big_integer should be string * reduce to 28 digit big_ints * fix test setup, mysql * kill big_integer tests * regenerate output * version bumps * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Bump Airbyte version from 0.39.39-alpha to 0.39.40-alpha (#15046) Co-authored-by: 
davinchia <davinchia@users.noreply.github.com> Co-authored-by: Davin Chia <davinchia@gmail.com> * 🎉 New source connector: Glassfrog (#13868) * Add Glassfrog native connector * fix: tests and formatting * chore: added connector to definitions * Add Glassfrog native connector * fix: tests and formatting * chore: added connector to definitions * fix: tests and formatting * auto-bump connector version [ci skip] Co-authored-by: Harshith Mullapudi <harshithmullapudi@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * source klaviyo - fix typo in docs (#15060) * 🐛Destination-clickhouse: enabled and fixed tests for normalization (#14783) [12582] destination-clickhouse: enabled normalization tests * [10719] Destination Oracle: custom JDBC parameters (#13841) * [10719] Destination Oracle: custom JDBC parameters * [10719] Destination Oracle: custom JDBC parameters fixed tests * [10719] Destination Oracle: custom JDBC parameters fixed tests * [10719] Destination Oracle: custom JDBC parameters fix for SSH oracle tests * [10719] Destination Oracle: custom JDBC parameters fixed test * [10719] Destination Oracle: custom JDBC parameters updated image tag * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * fixed (#15065) * update link to the roadmap in doc (#15062) * 🐛 🎉 Source PayPal Transaction: added `OAuth2.0`, fixed bug with normalization (#15000) * [low-code connectors] Handle 200 responses with error (#15055) * Handle 200 responses with error * missing file * fix: upgrade prism-react-renderer from 1.3.3 to 1.3.5 (#14980) Snyk has created this PR to upgrade prism-react-renderer from 1.3.3 to 1.3.5. 
See this package in npm: See this project in Snyk: https://app.snyk.io/org/davinchia/project/50ac983e-6d39-4eda-a744-e51fe8873952?utm_source=github&utm_medium=referral&page=upgrade-pr Co-authored-by: Topher Lubaway <asimplechris@gmail.com> * fix build (#15068) * Display new per-stream and global state to users when viewing connection settings (#15020) * Display new per-stream and global state to users * simplify method * remove a line break * Add addtional useMemo triggers * lint fixes? * nits picked * 🐛Destination-clickhouse-strict-encrypt: enabled normalization tests (#15069) * [14858] Destination-clickhouse-strict-encrypt: enabled normalization tests * fix acceptance tests with updated int type (#15078) * Source Zendesk Support: Convert `ticket_audits.previous_value` values to string (#15036) Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com> * Error silence (#15073) * Error silence * PR comments * update object as well * Release per stream to the OSS project (#15008) * Destination AWS Datalake: documentation update to match Airbyte template (#13716) * Documentation update to match Airbyte template * Update AWS Datalake doc * Bump Airbyte version from 0.39.40-alpha to 0.39.41-alpha (#15085) Co-authored-by: girarda <girarda@users.noreply.github.com> * Correct location of AWS configuration for S3 logs (#14630) The location of the access key id and secret key in the kubernetes config files is `.secrets`, not `.env` * Destination MongoDB: use SHA256 instead of MD5 (#14561) * fix: use sha256 instead of md5 * bump connector version * match strict-encrypt version with oss * auto-bump connector version [ci skip] Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * 🎉 Base Normalization: quote schema name to allow reserved keywords (#14683) Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com> * [low-code connectors] Add request options and state to stream slicers (#14552) * comment 
* comment * comments * fix * test for instantiating chain retrier * fix parsing * cleanup * fix * reset * never raise on http error * remove print * comment * comment * comment * comment * remove prints * add declarative stream to registry * start working on limit paginator * support for offset pagination * tests * move limit value * extract request option * boilerplate * page increment * delete offset paginator * update conditional paginator * refactor and fix test * fix test * small fix * Delete dead code * Add docstrings * quick fix * exponential backoff * fix test * fix * delete unused properties * fix * missing unit tests * uppercase * docstrings * rename to success * compare full request instead of just url * renmae module * rename test file * rename interface * rename default retrier * rename to compositeerrorhandler * fix missing renames * move action to filter * str -> minmaxdatetime * small fixes * plural * add example * handle header variations * also fix wait time from * allow using a regex to extract the value * group() * docstring * add docs * update comment * docstrings * fix tests * rename param * cleanup stop_condition * cleanup * Add examples * interpolated pagination strategy * dont need duplicate class * docstrings * more docstrings * docstrings * fix tests * first pass at substream * seems to work for a single stream * can also be defined in requester with stream_state * tmp update * update comment * Update airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/http_requester.py Co-authored-by: Sherif A. Nada <snadalive@gmail.com> * version: Update Parquet library to latest release (#14502) The upstream Parquet library that is currently pinned for use in the S3 destination plugin is over a year old. 
The current version is generating invalid schemas for date-time with time-zone fields which appears to be addressed in the `1.12.3` release of the library in commit https://github.com/apache/parquet-mr/commit/c72862b61399ff516e968fbd02885e573d4be81c * merge * 🎉 Source Github: improve schema for stream `pull_request_commits` added "null" (#14613) Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com> * Docs: Fixed broken links (#14622) * fixing broken links * more broken links * source-hubspot: change mentioning of Mailchimp into HubSpot doc (#14620) * Helm Chart: Add external temporal option (#14597) * conflict env configmap and chart lock * reverting lock * add eof lines and documentation on values yaml * conflict json file * rollback json * solve conflict * correct minio with new version Co-authored-by: Guy Feldman <gfeldman@86labs.com> * 🎉 Add YAML format to source-file reader (#14588) * Add yaml reader * Update docs * Bumpversion of connector * bump docs * Update pyarrow dependency * Upgrade pandas dependency * auto-bump connector version Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * :tada: Source Okta: add GroupMembers stream (#14380) * add Group_Members stream to okta source - Group_Members return a list of users, the same schema of Users stream. - Create a shared schema users, and both group_members and users sechema use it as a reference. 
- Add Group_Members stream to source connector * add tests and fix logs schema - fix the test error: None is not one of enums though the enum type includes both string and null, it comes from json schema validator https://github.com/python-jsonschema/jsonschema/blob/ddb87afad8f5d5c40600b5ede0ab96e4d4bdf7d3/jsonschema/_validators.py#L279-L285 - change grouop_members to use id as the cursor field since `filter` is not supported in the query string - fix the abnormal state test on logs stream, when since is abnormally large, until has to defined, an equal or a larger value - remove logs stream from full sync test, because 2 full sync always has a gap -- at least a new log about users or groups api. * last polish before submit the PR - bump docker version - update changelog - add the right abnormal value for logs stream - correct the sample catalog * address comments:: - improve comments for until parameter under the logs stream - add use_cache on groupMembers * add use_cache to Group_Members * change configured_catalog to test * auto-bump connector version Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * split test files * renames * missing unit test * add missing unit tests * rename * assert isinstance * start extracting to their own files * use final instead of classmethod * assert we retry 429 errors * Add log * replace asserts with valueexceptions * delete superfluous print statement * only accept minmaxdatetime * fix factory so we don't need to union everything with strings * get class_name from type * remove from class types registry * process error handlers one at a time * sort * delete print statement * comment * comment * format * delete unused file * comment * interpolatedboolean * comment * not optional * not optional * unit tests * fix request body data * add test * move file to right module * update * reset to master * format * rename to pass_by * rename to 
page size * fix * fix some tests * reset * fix * fix some of the tests * fix test * fix more tests * all tests pass * path is not optional * reset * reset * reset * delete print * remove prints * delete duplicate method * add test * fix body data * delete extra newlines * move to subpackage * fix imports * handle str body data * simplify * Update tests * filter dates before stream state * Revert "Update tests" This reverts commit c0808c8009c850848e18c4f0190ae5f26b9c086a. * update * fix test * state management * add test * delete dead code * update cursor * update cursor cartesian * delete unused state class * fix * missing test * update cursor substreams * missing test * fix typing * fix typing * delete unused field * delete unused method * update datetime stream slice * cleanup * assert * request options * request option cartesian * assert when passing by path * request options for substreams * always return a map * pass stream_state * refactor and almost done fixing tests * fix tests * rename to inject_into * only accept enum * delete conditional paginator * only return body data * missing test * update docstrings * update docstrings * update comment * rename * tests * class_name -> type * improve interface * fix some of the tests * fix more of the tests * fix tests * reset * reset * Revert "reset" This reverts commit eb9a918a095a22c6849d50f8881589a1b58a9309. * remove extra argument * docstring * update * delete unused file * reset * reset * rename * fix timewindow * create InterpolatedString * helper method * assert on request option * better asserts * format * docstrings * docstrings * remove optional from type hint * Update airbyte-cdk/python/airbyte_cdk/sources/declarative/stream_slicers/cartesian_product_stream_slicer.py Co-authored-by: Sherif A. Nada <snadalive@gmail.com> * inherit from request options provider * inherit from request options provider * remove optional from type hint * remove extra parameter * none check Co-authored-by: Sherif A. 
Nada <snadalive@gmail.com> Co-authored-by: Tobias Macey <tmacey@boundlessnotions.com> Co-authored-by: Serhii Chvaliuk <grubberr@gmail.com> Co-authored-by: Amruta Ranade <11484018+Amruta-Ranade@users.noreply.github.com> Co-authored-by: Bas Beelen <bjgbeelen@gmail.com> Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com> Co-authored-by: Guy Feldman <gfeldman@86labs.com> Co-authored-by: Christophe Duong <christophe.duong@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> Co-authored-by: Yiyang Li <yiyangli2010@gmail.com> Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> * All objects in the Airbyte Protocol have `additionalProperties: true` (#15081) * All objects in the Airbyte Protocol have `additionalProperties: true` * order of keys * rebuild airbyte protocol for python CDK * 📝 fix google analytics documentation urls (#15087) * update documentationUrl to point to universal analytics docs * fix google-analytics-v4 * fix google-analytics-data-api * fix source definitions * auto-bump connector version [ci skip] * remove additionalProperties from spec * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * #14048 Source Klaviyo: update release stage (#15083) * 🎉 Source Okta: return deprovisioned users (#15001) * add statuses to user filtering * bump version * upd * add comment * auto-bump connector version [ci skip] * format Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * 🪟 🎨 Trial end banner (#14913) * alert banner * cleanup * cleanup * review cleanup * Minor cleanup Co-authored-by: Tim Roes <tim@airbyte.io> * 🪟 🎨 Update layout of create connection and replication settings pages to match design (#14946) * Update ConnectionForm to match design * Split sections into cards * Update connection name label to heading * Move create/edit buttons outside of card * Fix spacing to match design
* Remove noTitles prop from CreateConnectionContent component * Update OperationsSection title to match the rest of the sections * Update connection name text to show that it's required * Divide streams into sections, center don't worry text * Wrap SyncCatalogField in a div to prevent style impact from parent * Remove don't worry message from create connection page * Update resizing of left fields and right fields to match design. Now fields should stop at certain width of page * Move replication and transformation section in connection form to its own card * Move OperationSection within same render condition as CreateControls * Apply design changes to main page with scroll, connection page title, and status main info * Update main page with scroll to only scroll x on the content, not the whole page * Update connection page title to all SCSS, add xl horizontal padding to match content * Update status main info to be more flexible on window resize * Remove extra div from status view that was adding extra horizontal margin to the card * Fix minor issues Co-authored-by: Tim Roes <tim@airbyte.io> * 🐛 Source Google Ads: Fix wrong schema for `ad_group_criterion.topic.path` and shifted `Campaigns` stream to non-managers stream list (#15084) * Fix wrong schema for ad_group_criterion.topic.path * Shifted campaigns stream to non-manager streams * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * 🎉Source PayPal Transactions: Updated docs (#15105) * Updated docs * Updated docs * Snowflake destination: support key pair authentication (#14388) * Snowflake destination: support key pair authentication.
* Snowflake destination: update docs * Snowflake destination: support key pair authentication for normalization, added tests * Snowflake destination: update normalization * Snowflake destination: update way of reading secrets for test * Snowflake destination: moved test to another class for test purpose * Snowflake destination: update secrets for test purpose * Snowflake destination: revert changes added for test purpose * Snowflake destination: changes added for test purpose * Snowflake destination: updated required fields in specs * Snowflake destination: clean up * Snowflake destination: support encrypted key pair authentication (#14589) * Snowflake destination: add support for encrypted private key * Snowflake destination: temp reverting for test purpose * Revert "Snowflake destination: temp reverting for test purpose" This reverts commit 260ecce6da5c52a82257862e6eb06db59bfac9bc. * Snowflake destination: add passphrase and remove auth_type from required fields * Snowflake destination: format code * Snowflake destination: clean up * Snowflake destination: update docs * Normalization for Snowflake destination: added unit tests and change file creating process * Normalization for Snowflake destination: renamed property passphrase to password * Snowflake destination: apply changes from normalization * Snowflake destination: clean code * Snowflake destination: clean up * auto-bump connector version [ci skip] Co-authored-by: Edward Gao <edward.gao@airbyte.io> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * [low-code-connectors] Disable parse-time interpolation in favor of runtime-only (#14923) * abstract auth token * basichttp * remove prints * docstrings * get rid of parse-time interpolation * always pass options through * delete print * delete misleading comment * delete note * reset * pass down options * delete duplicate file * missing test * refactor test * rename to '$options' * rename to '' * interpolatedauth * fix tests
* fix * docstrings * update docstring * docstring * update docstring * remove extra field * undo * rename to runtime_parameters * docstring * update * / -> * * update template * rename to options * Add examples * update docstring * Update test * newlines * rename kwargs to options * options init param * delete duplicate line * type hints * update docstring * Revert "delete duplicate line" This reverts commit 4255d5b3469fbd426be103a215e9ba172abdfbef. * delete duplicate code from bad merge * rename file * bump cdk version * fix: airbyte-integrations/connector-templates/source-python-http-api/Dockerfile to reduce vulnerabilities (#15111) The following vulnerabilities are fixed with an upgrade: - https://snyk.io/vuln/SNYK-ALPINE315-BUSYBOX-2440607 - https://snyk.io/vuln/SNYK-ALPINE315-BUSYBOX-2440607 - https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-2941810 - https://snyk.io/vuln/SNYK-ALPINE315-OPENSSL-2941810 - https://snyk.io/vuln/SNYK-ALPINE315-ZLIB-2434420 Co-authored-by: snyk-bot <snyk-bot@snyk.io> * Source Hubspot: implement new stream to read associations in incremental mode (#15099) * #359 oncall - source hubspot: implement new stream to read associations in incremental mode * #359 source hubspot: upd changelog * #359 source hubspot: do not pass identifiers * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Source PayPal Transactions: increase unit tests (#15098) * add unit tests * add unit tests * up * bump version * upd * upd * revert bump version * Source Hubspot: revert v0.1.78 (#15144) * Revert "Source Hubspot: implement new stream to read associations in incremental mode (#15099)" This reverts commit dd109debec80156c977a66b30c3cf2dc9b808844. 
* #359 rollback * hubspot: upd changelog * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * 🎉 Source Github: bugfix schemas for streams `deployments`, `workflow_runs`, `teams` (#15049) Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com> * Fix yaml formatting error (#15148) * pin flake8==4.0.1 (#15155) Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com> * Docs: Fix link to roadmap (#15103) Replace the broken markdown link and add an external link to the roadmap. * fix: Fix extraEnv, Update OSS charts (#15159) * fix: Add new labels to all charts + fix indent issue for extraEnv * fix: Update docs, add extraContainers parameter into dependent charts * GitHub Actions - workflow `Connector Integration Tests` - add retry (#14452) Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com> * source-S3: Support JSON format (#14213) * json format support added * json format support added * code formatted * format convertion changed * format naming convertion changed * test cased issue fixed * test case issued resolved * sample file and config added for integration tests * Json doc added Json doc added * update * sample file and config added for integration tests * sample file and config added for integration tests * update jsonl files * review 1 * review 1 * review 1 * pyarrow version upgrade * clean integration test folder architecture * add timestamp record to simple_test.jsonl * fixed integration test and parser review change * simplify table read * doc update * fix specs * user sample files * fix sample files * add newlines at end of files * rename json parser * rename jsonfile to jsonlfile * schema inference added * patch review fix * Update docs/integrations/sources/s3.md doc update Co-authored-by: George Claireaux <george@airbyte.io> * changing the version * changing the title to sync with other type * fix expected csv records * fix expected records for avro and parquet * review fix * fixed 
master schema handling * remove sample configs * fix expected records * json doc update added more details on json parser * fixed api name * bump version * auto-bump connector version [ci skip] Co-authored-by: alafanechere <augustin.lafanechere@gmail.com> Co-authored-by: George Claireaux <george@airbyte.io> Co-authored-by: George Claireaux <george@claireaux.co.uk> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Fix terminate workflow by using JQ Args. (#15160) This is failing today since we aren't injecting the bash variables correctly into JQ. This PR makes it so. We also increase the time limit to 4 hours to accommodate long running test jobs. * AnalyticsService event cleanup (#15142) * AnalyticsService event cleanup * Remove dead generics * Fix broken parameters * fix tests (#15133) * fix tests * fix redshift test * fix HikariPool error on postgres destination integration tests (#15165) * add ed's hikari fix * fix maybe? * remove debug lines * revert bandaid * Degenderize sign-up quote wording (#15164) * Add metrics for temporal workflow resets (#15016) * Add metrics for temporal workflow resets * Properly record metric attributes * PR feedback * Formatting * Record general workflow attempts/failures * Refactor metric methods * PR feedback * Formatting * Updated Postgres doc (#15166) * Fix callout formatting (#15162) * Fix callout formatting * Updating copy * Fixed typo Co-authored-by: Yowan Ramchoreeter <26179814+YowanR@users.noreply.github.com> * Deleted careers and open positions from Docs (#15171) * Docs cleanup (#15174) * Deleted extra pages + removed changelogs from the navigation bar * fixed broken link * publish source postgres+mysql strict-encrypt (#15176) * :bug: Source shortio: Changing links primary key (#15066) * Changing primary key in links stream and some small cleanups throughout the source * Adding unit tests and applied gradle formatting * update doc * auto-bump connector version [ci skip] 
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Revert "Release per stream to the OSS project (#15008)" (#15177) This reverts commit 29fc124ee75a9057525020500e7b171c5bb38501. * fix build: re-generate scaffold connectors (#15175) * Export temporal metrics to datadog (#14842) * add error code to ManualOperationResult * fix a bug * support temporal metrics * metrics in temporal * use statsd * wrap otel config to temporal metric export * use http port 4318 for otlp exporter * simplify to support dd only * use /v1/metrics for endpoint * use statsd * fix * remove unused func * wrap up implementation to export temporal metrics to datadog * use deps.toml to wrap up the dependency * move to metric client factory * fix pmd error * pmd, comment fix * pr comment fix * Delete state object from template (#15178) * SAT: retrieve previous connector spec and create test to run checks against it (#14954) * Source Hubspot: fix 401 for associations (#15156) * Revert "Source Hubspot: revert v0.1.78 (#15144)" This reverts commit cbdb897aa12af8272d4cacc5370fd96ade6a98e2.
* #379 source hubspot: fix 401 when reading associations * #379 source hubspot: fix 401 when reading associations * #379 source hubspot: upd changelog * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * 🎉 Source Github: PullRequestCommentReactions - re-implemented using GraphQL (#14795) Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com> * 🐛 Source Okta: fix for failed stream on `Json Validation` NPE (#15179) * report synchronous check/spec/discover failures to JobErrorReporter (#14818) * report failures for synchronous check/discover, refactor common logic * allow null workspace, send spec errors * add failure origin, format * rm connector_type, fix failing tests * add tests for other job types * log instead of throw * move swallow to common spot * connector jobs use context instead of passing full config * sync jobs use context instead of passing raw config * fix failing test * fix failing scheduler client test * Add and persist job failures for Normalization (#14790) * added TracedException and uncaught exception handler * added trace message capturing * added tests for TRACE messages * pre-json logging * propagating normalization failures * log format json & fix hang * parsing dbt json logs * bump normalization version * tests * Benoit comments * update trace exception user message * review comments * bump version * bump version * review comments * nit comments * add normalization trace failure test * version bump * pmd * formatto * bump version * Fix schema file path (#15209) * Generate reference docs source (#15183) * Fix bq standard mode (#15180) * Fix * Add version and changelog * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * fix millisecond error (#15203) * Source Stripe: external account streams (#14357) * added external account streams * fix metadata 
type and formatting * consolidate types to one line * fix api docs link * auto-bump connector version [ci skip] * format files Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> * 🪟 🐛 Fix out of credits banner (#15216) * fix out of credits banner * change intl string id * show trial expiration over credit issues * Removed outdated content (#15222) * Fix debug output for banner (#15223) * docs: add a note about how our build reports are persisted * Fix Validate JdbcUrls with additional test (#15190) * Fixed uncalled jdbcUrl validation and added test for exception * Removed unused constants * Converted assertCustomParamtersDontOverwriteDefaultParameters to protected static for testing and host/port retrieval * source-acceptance-test added (#15237) Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com> * Updated connector status page (#15240) * Update README.md * 🎉 Source Oracle: Use Service Name to connect to database (#14953) * add service_name as second connection option * fix test * merge with master * bump version * bump strict-encrypt * fixed test * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Update bigquery.md * Fix multiply log bindings (#14801) * Fix multiply log bindings * Exclude slf4j-reload4j * Exclude slf4j-log4j12 for debezium * Increase version for debezium related sources and json converter related destinations * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] * rebump bigquery versions * mark destinations s3, gcs as unpublishd * auto-bump connector version [ci skip] * auto-bump connector version [ci skip] Co-authored-by: subodh <subodh1810@gmail.com> Co-authored-by: Greg Solovyev <grishick@users.noreply.github.com> Co-authored-by: 
Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> Co-authored-by: Edward Gao <edward.gao@airbyte.io> * Dual write old and new schedule schemas (#15039) * dual write old and new schedule schemas * validate that the old and new schedule types match * Docs: Add "External resources" section to TiDB source connector doc (#15238) * Sync Log Summary Doc (#15181) * edited table * added the sync log doc section * edited formatting * Edited table * Edited formatting * Edited line spacing * Editing spacing * Edited line spacing * :tada: Source SurveyMonkey - to beta (#14998) * fixed incremental syncs for response stream, added unittest, fixed specs, fixed incremental SAT * removed comments * updated docs * updated docs * bumped connector version * bumped release stage * auto-bump connector version [ci skip] * updated source_specs.yaml Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Typo fix (#15244) * Helm Chart: make loadBalancerIP configurable for webapp (#14992) * Fix connector form cancellation logic to ensure all fields are reset (#14857) * Add resetUiWidgetsInfo function to buildUiWidgetsContext to reset the uiWidgetsInfo to its original state * Add resetServiceForm function to service form context that both resets service form values and formik values * Update Cancel button in EditControls to avoid disabling it when form values invalid * Cleanup typing * Source Google Sheets: exposes row batch size config (#15107) * exposes row batch size config to the connector * review changes * bump connector version * auto-bump connector version [ci skip] Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Destination BigQuery: Enabling Application Default Credentials (#14784) * Enabling different bigquery authentications Current implementation only accepts Service Accounts.
For developers it is very desirable that we can login with Application Default Credentials (ADC) https://cloud.google.com/sdk/gcloud/reference/auth/application-default * bump version and update doc * auto-bump connector version [ci skip] Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Update css-modules rules from warnings to errors (#15239) * fix: Update helm charts (#15199) * fix: Fix postgres secret defenition, add new extraSecrets feature, update gcs-creds volume templating * Update helm docs * Revert db overrides back. Update docs * fix: fix extraContainers typo in worker and server charts * 🎉Source Postgres: 13608, 12026, 14590 - Align regular and CDC integration tests and data mappers; improve BCE date handling (#14534) * 13608 & 12026 - align regular and CDC integration tests and data mappers * format code * update int handling * fix build * fix PR remarks * revert changes for money type that are broken by #7338 * bump version * 🐛 Source Postgres: Improve BCE date handling (#15187) * 13608 & 12026 - align regular and CDC integration tests and data mappers * format code * update int handling * borked merge - re-delete deleted methods * enable catalog tests for postgres * fix build * fix PR remarks * revert changes for money type that are broken by #7338 * update BCE handling in JDBC * reuse existing method * handle bce dates * inline methods * fix JDBC BCE year inconsistency * use correct data type in test * format * Update airbyte-integrations/connectors/source-postgres/src/test-integration/java/io/airbyte/integrations/io/airbyte/integration_tests/sources/AbstractPostgresSourceDatatypeTest.java Co-authored-by: Edward Gao <edward.gao@airbyte.io> * pmd fix * use class.getname() * fix pmd * format * bump version * handle incremental mode * clean up diff * more comments * unused imports * format * versions+changelog Co-authored-by: Yurii Bidiuk 
<yura.bidyuk@gmail.com> Co-authored-by: Yurii Bidiuk <35812734+yurii-bidiuk@users.noreply.github.com> * auto-bump connector version [ci skip] Co-authored-by: Edward Gao <edward.gao@airbyte.io> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * 🎉 Postgres source: emit state messages more frequently for incremental sync (#14903) * Add order by clause in incremental query * Support emitting intermediate states * Add comment * Log state warning only for final state emission * Format code * Add unit tests * Define message iterator in each test case * Fix compilation error * Kubernetes: Datadog Constant Tags (#15213) * adding option for constant tags to be sent to datadog for split environments * removing unnecessary space * Adding tests on envconfig * clean test * adding empty string test for constant tags * format files * remove public class from test class Co-authored-by: Guy Feldman <gfeldman@86labs.com> * Align MS SQL regular and strict encrypt versions (#15260) * Align MS SQL regular and strict encrypt versions * Update changelog with the PR number * Align strict encrypt and regular connector versions for destination-mysql (#15258) * Align strict endrypt and regular connector versions * Retain changelog for strict encrypt * 🪟 Add catalog changes modal on schema refresh (#14074) * WIP - types, props, components * logic tweaks * moving around * begin styling and content * modal formatting, section header * client update, add/removed streams works * theme tweaks * WIP -- adding accordion * hook for modal display logic * display logic, row/accordion progress * fix atrocities of table rendering, move header to own component * headers cleanup * headers cleanup * imageblock more flexible * progress on table todo: consolidate, complete * styling good, animation TODO * self review pt. 
1 * cleanup * note * note * accessibility and i18n improvements * fix typo in scss * missing i18n things * move icon to /icons * Update airbyte-webapp/src/views/Connection/CatalogDiffModal/CatalogDiffModal.tsx Co-authored-by: Edmundo Ruiz Ghanem <168664+edmundito@users.noreply.github.com> * Update airbyte-webapp/src/views/Connection/CatalogDiffModal/components/DiffAccordion.tsx Co-authored-by: Edmundo Ruiz Ghanem <168664+edmundito@users.noreply.github.com> * spacing, use ModalFooter * Update airbyte-webapp/src/views/Connection/CatalogDiffModal/components/StreamRow.tsx Co-authored-by: Edmundo Ruiz Ghanem <168664+edmundito@users.noreply.github.com> * begin moving to memoized reducer function * memoize diff sorter and remove extra divs * cleanup * modal body padding * update to use modal service * move calculated string mode out of component * respond to review * add accordionheader component * catalog can be undefined * cleanup cell rendering * cleanup and make storybook work again * move table styles within a parent class * subheading alignment consistency * more padding/spacing adjustments * cleanup from review * mixup from rebase * set width on modal level not content level * Update airbyte-webapp/src/views/Connection/CatalogDiffModal/utils.tsx Co-authored-by: Edmundo Ruiz Ghanem <168664+edmundito@users.noreply.github.com> * Update airbyte-webapp/src/views/Connection/CatalogDiffModal/utils.tsx Co-authored-by: Edmundo Ruiz Ghanem <168664+edmundito@users.noreply.github.com> * linting and unused class cleanup Co-authored-by: Edmundo Ruiz Ghanem <168664+edmundito@users.noreply.github.com> * Align Postgres Destination regular and strict encrypt versions (#15261) * Source Okta: add permission stream under a custom role (#14739) * Source Okta: add permission stream under a custom role - it supports full refresh only - add unit tests * bump connector version * bump connector version in Dockerfile * auto-bump connector version [ci skip] Co-authored-by: sajarin
<sajarindider@gmail.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Update README.md * 🎉 Postgres source: sync data from beginning if lsn is no longer valid in cdc (#15077) * work in progress * cleanup * add test * introduce tests for state parsing util class * enable test via feature flag * review comments * Bump versions * auto-bump connector version [ci skip] Co-authored-by: Liren Tu <tuliren.git@outlook.com> Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Add generic test to test the per stream state behavior (#15267) * Add generic test to test the per stream state behavior * Add missing dependency * Add license * Add missing newline * Bmoric/test bq standard (#15270) * Add generic test to test the per stream state behavior * Add missing dependency * Add test for the big query record consumer * Add license * Add missing newline * Fix Mongo dest (#15211) * Fix Mongo dest * Update changelog * auto-bump connector version [ci skip] * bump strict encrypt version Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * increase report interval to 120s to minimize temporal metric reporting (#15280) * Source Bing Ads adding missing columns (#14862) * only adding `Conversions` for now to satisfy basic use case, will compelte all fields addition once I get a time to add them https://github.com/airbytehq/airbyte/issues/14637 * Adding cost per conversion rate * adding CampaignId as part of the primary key * update primary key for ad performance report * adding more columns to primary key * completing campaign_performance_report * completing account+keyword performance report * complete ad_group_performance_report * adding ad_performance_report * revert to previous logger * remove logging import * chore: updated docs * auto-bump connector version [ci skip] Co-authored-by: Harshith Mullapudi <harshithmullapudi@gmail.com> Co-authored-by: Octavia 
Squidington III <octavia-squidington-iii@users.noreply.github.com> * chore: fixed elasticsearch documentation (#15193) * 🐞 Postgres source: fix first record wait time parsing bug (#15273) * Fix first record wait time parsing bug * Bump version * Add connection check for first record waiting time * Update spec * Override initial waiting time when it is too short * Set is_test to true for cdc integration tests * Fix integration test * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * 🐛Destination-postgres: fixed normalization java tests after changes in Python part (#15289) * [15236] destination-Postgres: fixed normalization java tests after changes in Python part. Now datetime is stored as a datetime type in db after normalization, previously it was a String * 🐛 Source Zendesk Support: add `Subscription Plan` check for available streams (#15233) * fix disabled button (#15276) * Update Kafka destination to use outputRecordCollector to properly store state (#15287) * Added tracking state to destination kafka * Bump version * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * Update Keen destination to use outputRecordCollector to properly store state (#15291) * Update Keen destination to use outputRecordCollector to properly store state * Bump version * auto-bump connector version [ci skip] Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com> * 🪟 🐛 Correctly revalidate form when validationSchema changes (#15109) * Correctly revalidate form when validationSchema changes * Remove old validation logic * Docs: Update google-analytics-v4.md (#15288) * Update Cassandra destination to use outputRecordCollector to properly store state (#15294) * Update Cassandra destination to use outputRecordCollector to properly store state * Bump version * auto-bump connector version [ci skip] 
What
First part of #14310: instruments Airbyte workers to fail a sync when normalization reports a failure through a TRACE message, and to persist the failure reasons to the DB.
Next PR (TODO: link here) will implement reporting these errors up to Sentry.
How
Modelled the changes in normalization to roughly follow how replication failures are handled.
FailureHelper
DefaultNormalizationRunner.java now parses stdout to grab AirbyteMessages and/or dbt errors.
DefaultNormalizationWorker.java no longer raises an Exception on any failure. If we have failureReasons, we set these in the summary and return it.
ConnectionManagerWorkflowImpl.java now checks whether the Normalization summary contains failureReasons; if it does, it marks the attempt as failed and stores the reasons in the DB.
Recommended reading order
main_dev_transform_catalog.py & main_dev_transform_config.py
NormalizationSummary.yaml
FailureHelper.java
NormalizationAirbyteStreamFactory.java
DefaultNormalizationRunner.java
DefaultNormalizationWorker.java
ConnectionManagerWorkflowImpl.java
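For reviewers who want the overall flow from the How section in one place, here is a rough, self-contained sketch. It is not the code in this PR: `NormalizationFailureSketch`, `FailureReason`, `Summary`, `summarize` and `attemptStatus` are hypothetical stand-ins, the regex is a naive substitute for real AirbyteMessage deserialization, and the rule that any non-JSON line containing "Error" is a dbt error is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified sketch; none of these types are the real Airbyte classes. */
public class NormalizationFailureSketch {

  /** Stand-in for a failure reason that would be stored with the attempt. */
  public static final class FailureReason {
    public final String origin;
    public final String message;

    public FailureReason(String origin, String message) {
      this.origin = origin;
      this.message = message;
    }
  }

  /** Stand-in for the normalization summary returned by the worker. */
  public static final class Summary {
    public final List<FailureReason> failureReasons;

    public Summary(List<FailureReason> failureReasons) {
      this.failureReasons = List.copyOf(failureReasons);
    }

    public boolean hasFailures() {
      return !failureReasons.isEmpty();
    }
  }

  // Naive match for the error text of a TRACE line; real code would deserialize the JSON.
  private static final Pattern TRACE_ERROR =
      Pattern.compile("\"type\"\\s*:\\s*\"TRACE\".*\"message\"\\s*:\\s*\"([^\"]*)\"");

  /** Runner + worker steps: scan stdout, collect failures, return a summary instead of throwing. */
  public static Summary summarize(List<String> stdoutLines) {
    List<FailureReason> reasons = new ArrayList<>();
    for (String line : stdoutLines) {
      Matcher m = TRACE_ERROR.matcher(line);
      if (m.find()) {
        reasons.add(new FailureReason("normalization", m.group(1)));
      } else if (!line.startsWith("{") && line.contains("Error")) {
        // Assumption: a non-JSON line mentioning "Error" is treated as a dbt error.
        reasons.add(new FailureReason("dbt", line.trim()));
      }
    }
    return new Summary(reasons);
  }

  /** Workflow step: the attempt fails if the summary carries any failure reason. */
  public static String attemptStatus(Summary summary) {
    return summary.hasFailures() ? "FAILED" : "SUCCEEDED";
  }
}
```

The point of the sketch is the control flow: failures travel as data in the summary, so the workflow decides whether the attempt failed and what to persist, rather than reacting to an exception thrown by the worker.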