
Add timeout to connector pod init container command #10592

Merged
pmossman merged 4 commits into master from parker/add-init-container-timeout on Feb 24, 2022

Conversation

pmossman (Contributor)

Addresses #10587

@github-actions github-actions bot added area/platform issues related to the platform area/worker Related to worker labels Feb 23, 2022
@pmossman pmossman force-pushed the parker/add-init-container-timeout branch from 1ace545 to 691a54b Compare February 23, 2022 18:24
@@ -159,7 +164,16 @@ private static Container getInit(final boolean usesStdin,
initEntrypointStr = String.format("mkfifo %s && ", STDIN_PIPE_FILE) + initEntrypointStr;
}

initEntrypointStr = initEntrypointStr + String.format(" && until [ -f %s ]; do sleep 0.1; done;", SUCCESS_FILE_NAME);
initEntrypointStr = initEntrypointStr +
cgardens (Contributor) — Feb 23, 2022:
Is there any sane way to heartbeat to the orchestrator?

Timeouts could be such a headache here. If we do need a timeout, are we confident that in 99% of cases, if it's taking longer than 5 min, something is actually broken rather than it just being a big config?

Also, if we ever do exit due to timeout, I think it's worth explicitly logging that the error was due to a timeout, to make it easier to debug.

pmossman (Contributor, Author) replied:
I'm looking through how the heartbeat server works now; I just wanted to put this PR up as a version of the timeout because it was pretty low effort to throw together. 5 minutes is arbitrary, but yeah, you make a good point: no matter how high we set it, there'd always be some concern that we just have a massive config to copy over and end up killing a working container.

Contributor:
A kube-cluster-internal kubectl cp should be moving data at >1 MB/s. Are we expecting 300 MB+ configs?

Contributor:
Another way we could approach this is to check whether disk usage in the folder is increasing over time. If it's increasing, we keep resetting the timeout; if it's stalled and not increasing for 1 min, timing out seems safe.

If we do want to allow this to use heartbeating, I think the approach would be to get rid of the init container entirely and instead move this waiting logic to the main container. Then we'd also need to change the KubePodProcess logic to copy files into the main container instead of the init container, and change the watches for pod status to look at the other containers.

For the orchestrator, we don't want a consistent connection between the two so we either need the heartbeat as it exists today or some additional check that the transfer is actually occurring.

Contributor:
Also, FWIW, I'm fine with going with the timeout now (as long as it logs when the timeout is the thing that terminates it) and creating a follow-up issue to be smarter later.

pmossman (Contributor, Author):

@jrhizor @cgardens updated this PR with disk usage checking and a timeout of 1 minute if disk utilization hasn't changed.

A few notes:

  1. the init container uses a version of busybox that doesn't support du -b, so we can't be as granular as I would have liked. I don't think this will be an issue since there should be plenty of copied data before the timeout to cause du to report a new value.
  2. I wasn't able to come up with an elegant way to write an integration test for this; it would need to somehow interrupt the file copy and wait for it to time out. Open to suggestions there.
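For note 1, the block granularity that busybox's du -s gives is still enough to register copy progress, since any meaningful write adds at least one new block. A quick sketch (the directory name and file sizes here are illustrative, not from the PR):

```shell
# Illustrative only: busybox `du -s` reports usage in 1K blocks, so progress
# becomes visible once a copy has written at least one new block.
# GNU `du -sb` would give bytes, but busybox du lacks -b.
DEMO_DIR=$(mktemp -d)
dd if=/dev/zero of="$DEMO_DIR/config.json" bs=1024 count=8 2>/dev/null

# Block-granular usage, the same measurement the init script polls.
BLOCKS=$(du -s "$DEMO_DIR" | awk '{print $1;}')
echo "du reports ${BLOCKS} blocks for an 8 KB file"

rm -rf "$DEMO_DIR"
```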

I ran this locally to make sure the script was working as expected, and added some verbose logging that I didn't end up committing:

❯ kl logs busybox-sync-some-id-0-wawht -c init -f
iteration: 1
iteration: 2
iteration: 3
iteration: 4
iteration: 5
iteration: 6
iteration: 7
iteration: 8
disk usage was 988, setting iteration back to 0
iteration: 1
iteration: 2
disk usage was 1976, setting iteration back to 0
iteration: 1
iteration: 2
iteration: 3
disk usage was 2964, setting iteration back to 0
[... same pattern repeats every 2-3 iterations while disk usage climbs to 98800 ...]
iteration: 1
iteration: 2
disk usage was 98800, setting iteration back to 0
iteration: 1
All files copied successfully, exiting with code 0...

And here's what the pod manifest looks like when described:

Command:
      sh
      -c
      USES_STDIN=false

      mkfifo /pipes/stdout
      mkfifo /pipes/stderr

      if [ "$USES_STDIN" = true ]; then
        mkfifo /pipes/stdin
      fi

      ITERATION=0
      MAX_ITERATION=600
      DISK_USAGE=$(du -s /config | awk '{print $1;}')

      until [ -f FINISHED_UPLOADING -o $ITERATION -ge $MAX_ITERATION ]; do
        ((ITERATION=ITERATION+1))
        echo "iteration: ${ITERATION}"
        LAST_DISK_USAGE=$DISK_USAGE
        DISK_USAGE=$(du -s /config | awk '{print $1;}')
        if [ $DISK_USAGE -gt $LAST_DISK_USAGE ]; then
          echo "disk usage was ${DISK_USAGE}, setting iteration back to 0"
          ITERATION=0
        fi
        sleep 0.1
      done

      if [ -f FINISHED_UPLOADING ]; then
        echo "All files copied successfully, exiting with code 0..."
        exit 0
      else
        echo "Timeout while attempting to copy to init container, exiting with code 1..."
        exit 1
      fi

cgardens (Contributor) left a review:
cool! looks good to me.

@pmossman pmossman merged commit 34be57c into master Feb 24, 2022
@pmossman pmossman deleted the parker/add-init-container-timeout branch February 24, 2022 00:28
etsybaev pushed a commit that referenced this pull request Mar 5, 2022
* add timeout to init container command

* add disk usage check into init command

* fix up disk usage checking and logs from init entrypoint

* run format
etsybaev added a commit that referenced this pull request Mar 25, 2022
…1093)

* [10033] Destination-Snowflake: added basic part for support oauth login mode

* added basic logic for token refresh

* Fixed code to support pooled connections

* Hide DBT transformations in cloud (#10583)

* Bump Airbyte version from 0.35.35-alpha to 0.35.36-alpha (#10584)

Co-authored-by: timroes <timroes@users.noreply.github.com>

* 🐛 Source Shopify: fix wrong field type for tax_exemptions (#10419)

* fix(shopify): wrong type for tax_exemptions

abandoned_checkouts customer tax_exemptions had the wrong field type

* fix(shopify): wrong type for tax_exemptions

abandoned_checkouts customer tax_exemptions had the wrong field type

* bump connector version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Remove storybook-addon-styled-component-theme (#10574)

* Helm Chart: Secure chart for best practices (#10000)

* 🐛 Source FB Marketing: fix `execute_in_batch` when batch is bigger than 50 (#10588)

* fix execute_in_batch

* add tests

* fix pre-commit config

Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Eugene Kulak <kulak.eugene@gmail.com>
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>

* Bmoric/move flag check to handler (#10469)

Move the feature flag checks to the handler instead of the configuration API. This could have avoid some bug related to the missing flag check in the cloud project.

* Documented product release stages (#10596)

* Set resource limits for connector definitions: api layer (#10482)

* Updated link to product release stages doc (#10599)

* Change the block logic and block after the job creation (#10597)

This is changing the check to see if a connection exist in order to make it more performant and more accurate. It makes sure that the workflow is reachable by trying to query it.

* Add timeout to connector pod init container command (#10592)

* add timeout to init container command

* add disk usage check into init command

* fix up disk usage checking and logs from init entrypoint

* run format

* fix orchestrator restart problem for cloud (#10565)

* test time ranges for cancellations

* try with wait

* fix cancellation on worker restart

* revert for CI testing that the test fails without the retry policy

* revert testing change

* matrix test the different possible cases

* re-enable new retry policy

* switch to no_retry

* switch back to new retry

* paramaterize correctly

* revert to no-retry

* re-enable new retry policy

* speed up test + fixees

* significantly speed up test

* fix ordering

* use multiple task queues in connection manager test

* use versioning for task queue change

* remove sync workflow registration for the connection manager queue

* use more specific example

* respond to parker's comments

* Fix the toggle design (#10612)

* Source Hubspot: cast timestamp to date/datetime (#10576)

* cast timestamp to date

* change test name

* fix corner cases

* fix corner cases 2

* format code

* changed method name

* add return typing

* bump version

* updated spec and def yaml

Co-authored-by: auganbay <auganenu@gmail.com>

* Update _helpers.tpl (#10617)

as helm templates integers as float64, when using %d, it renders the value of external airbyte.minio.endpoint to "S3_MINIO_ENDPOINT: "http://minio-service:%!d(float64=9000)", therefore needed to be changed to %g

* 🎉 Source Survey Monkey: add option to filter survey IDs (#8768)

* Add custom survey_ids

* bump version

* Update survey_question schema

* Add changelog

* Allow null objects

* merge master and format

* Make all types safe with NULL and add survey_ids to all streams

* Make additional types safe with NULL

* Make additional types safe with NULL

* One last safe NULL type

* small fixes

* solve conflic

* small fixes

* revert fb wrong commit

* small fb correction

* bump connector version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Fix doc links/loading (#10621)

* Allow frontmatter in rendered markdown (#10624)

* Adjust to new normalization name (#10626)

* sweep pods from end time not start time (#10614)

* Source Pinterest: fix typo in schema fields (#10223)

* 🎉 add associations companies to deals, ticket and contacts stream (from PR 9027) (#10631)

* Added associations to some CRM Object streams in Hubspot connector

* Added associations in the relevant schemas

* fix eof

* bump connector version

Co-authored-by: ksoenandar <kevin.soenandar@gmail.com>

* Source Chargebee: add transaction stream (#10312)

* added transactions model

* changes

* fix

* few changes

* fix

* added new stream in configured_catalog*.json

* changes

* removed new stream in configured_catalog*.json

* solve small schema issues

* add eof

* bump connector version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>

* Add missing continue as new (#10636)

* Bump Airbyte version from 0.35.36-alpha to 0.35.37-alpha (#10640)

Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>

* exclude workers test from connectors builds on CI (#10615)

* 🎉 Source Google Workspace Admin Reports: add support for Google Meet Audit Activity Events (#10244)

* source(google-workspace-admin-reports): add support for Google Meet Audit activity events

Signed-off-by: Michele Zuccala <michele@zuccala.com>

* remove required fields

* bump connector version

* run format

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* stabilize connection manager tests (#10606)

* stabilize connection manager tests

* just call shutdown once

* another run just so we can see if it's passing

* another run just so we can see if it's passing

* re-disable test

* run another test

* run another test

* run another test

* run another test

* Log pod state if init pod wait condition times out (for debugging transient test issue) (#10639)

* log pod state if init pod search times out

* increase test timeout from 5 to 6 minutes to give kube pod process timeout time to trigger

* format

* upgrade gradle from 7.3.3 -> 7.4 (#10645)

* upgrade temporal sdk to 1.8.1 (#10648)

* upgrade temporal from mostly 1.6.0 to 1.8.1

* try bumping GSM to get newer grpc dep

* Revert "try bumping GSM to get newer grpc dep"

This reverts commit d837650.

* upgrade temporal-testing as well

* don't change version for temporal-testing-junit5

* 🎉 Source Google Ads: add network fields to click view stream

* Google Ads #8331 - add network fields to click_view stream schema

* Google Ads #8331 - add segments.ad_network_type to click_view pk according to PR review

* Google Ads #8331 - bump version

* Google Ads #8331 - update definition

* Cloud Dashboard 1 (#10628)

Publish metrics for:
- created jobs tagged by release stage
- failed jobs tagged by release stage
- cancelled jobs tagged by release stage
- succeed jobs tagged by release stage

* Correct cancelled job metric name. (#10658)

* Add attempt status by release stage metrics. (#10659)

Add,

- attempt_created_by_release_stage
- attempt_failed_by_release_stage
- attempt_succeeded_by_release_stage

* 🐛 Source CockroachDB: fix connector replication failure due to multiple open portals error (#10235)

* fix cockroachdb connector replication failure due to multiple open portals error

* bump connector version

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* 🐙 octavia-cli: implement `generate` command (#10132)

* Add try catch to make sure all handlers are closed (#10627)

* Add try catch to make sure all handlers are closed

* Handle exceptions while initializing writers

* Bumpversion of connectors

* bumpversion in seed

* Fix bigquery denormalized tests

* bumpversion seed of destination bigquery denormalized

* Fix links in onboarding page (#10656)

* Fix missing key inside map

* Fix onboarding progress links

* Add use-case links to onboarding (#10657)

* Add use-case links to onboarding

* Add new onboarding links

* Set resource limits for connector definitions: expose in worker (#10483)

* pipe through to worker

* wip

* pass source and dest def resource reqs to job client

* fix test

* use resource requirements utils to get resource reqs for legacy and new impls

* undo changes to pass sync input to container launcher worker factory

* remove import

* fix hierarchy order of resource requirements

* add nullable annotations

* undo change to test

* format

* use destination resource reqs for normalization and make resource req utils more flexible

* format

* refactor resource requirements utils and add tests

* switch to storing source/dest resource requirements directly on job sync config

* fix tests and javadocs

* use sync input resource requirements for container orchestrator pod

* do not set connection resource reqs to worker reqs

* add overrident requirement utils method + test + comment

Co-authored-by: lmossman <lake@airbyte.io>

* add mocks to tests

* Bump Airbyte version from 0.35.37-alpha to 0.35.38-alpha (#10668)

Co-authored-by: lmossman <lmossman@users.noreply.github.com>

* 🎉 Source Salesforce: speed up discovery >20x by leveraging parallel API calls (#10516)

* 📖  improve salesforce docs & reorder properties in the spec (#10679)

* Bump Airbyte version from 0.35.38-alpha to 0.35.39-alpha (#10680)

Co-authored-by: sherifnada <sherifnada@users.noreply.github.com>

* Improve note in salesforce docs about creating a RO user

* Upgrade plop in connector generators (#10578)

* Upgrade plop

* Remove scaffolded code

* Build fixes

* Remove scaffolded code

* Revert "Remove scaffolded code"

This reverts commit 3911f52.

* Revert "Remove scaffolded code"

This reverts commit 549f790.

* Remove .gitignore changes

* Remove .gitignore changes

* Update scaffold generated code

* Replace titleCase with capitalCase (#10654)

* Add capitalCase helper

* Replace titleCase with capitalCase

* Update generated scaffold files

Co-authored-by: LiRen Tu <tuliren.git@outlook.com>

* 🐛 Fix toggle styling (#10684)

* Fix error NPE in metrics emission. (#10675)

* Fix missing type=button (#10683)

* close ssh in case of exception during check in Postgres connector (#10620)

* close ssh in case of exception

* remove unwanted change

* remove comment

* format

* do not close scanner

* fix semi-colon

* format

* Refactor to enable support for optional JDBC parameters for all JDBC destinations (#10421)

* refactoring to allow testing

* MySQLDestination uses connection property map instead of url arguments

* Update jdbc destinations

* A little more generic

* reset to master

* reset to master

* move to jdbcutils

* Align when multiline

* Align when multiline

* Update postgres to use property map

* Move tests to AbstractJdbcDestinationTest

* clean

* Align when multiline

* return property map

* Add postgres tests

* update clickhouse

* reformat

* reset

* reformat

* fix test

* reformat

* fix bug

* Add mssql tests

* refactor test

* fix oracle destination test

* oracle tests

* fix redshift acceptance test

* Pass string

* Revert "Pass string"

This reverts commit 6978217.

* Double deserialization

* Revert "Double deserialization"

This reverts commit ee8d752.

* try updating json_operations

* Revert "try updating json_operations"

This reverts commit c8022c2.

* json parse

* Revert "json parse"

This reverts commit 11a6725.

* Revert "Revert "Double deserialization""

This reverts commit 213f47a.

* Revert "Revert "Revert "Double deserialization"""

This reverts commit 6682245.

* move to constant

* Add comment

* map can be constant

* Add comment

* move map

* hide in method

* no need to create new map

* no need to create new map

* no need to create new map

* enably mysql test

* Update changelogs

* Update changelog

* update changelog

* Bump versions

* bump version

* disable dbt support

* update spec

* update other oracle tests

* update doc

* bump seed

* fix source test

* update seed spec file

* fix expected spec

* Fix trial period time frame (#10714)

* Bmoric/restore update with temporal (#10713)

Restore the missing update call to temporal.

It was making the update of a schedule to not be effective immediately.

* Bump Airbyte version from 0.35.39-alpha to 0.35.40-alpha (#10716)

Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>

* Fix CockroachDbSource compilation error (#10731)

* Fix CockroachDbSource compilation error

* fix test too

* 🎉 Source Zendesk: sync rate improvement (#9456)

* Update Source Zendesk request execution with future requests.

* Revert "Update Source Zendesk request execution with future requests."

This reverts commit 2a3c1f8.

* Add futures stream logics.

* Fix stream

* Fix full refresh streams.

* Update streams.py.
Fix all streams.
Updated schema.

* Add future request unit tests

* Post review fixes.

* Fix broken incremental streams.
Fix SAT.
Remove odd unit tests.

* Comment few unit tests

* Bump docker version

* CDK: Ensure AirbyteLogger is thread-safe using Lock (#9943)

* Ensure AirbyteLogger is thread-safe

- Introduce a global lock to ensure `AirbyteLogger` is thread-safe.
- The `logging` module is thread-safe, however `print` is not, and is currently used. This means that messages sent to stdout can clash if connectors use threading. This is obviously a huge problem when the IPC between the source/destination is stdout!
- A `multiprocessing.Lock` could have been introduced however given that `logging` module is not multiprocess-safe I thought that thread-safety should be first goal.
- IMO the `AirbyteLogger` should be a subclass of the `logging.Logger` so you have thread-safety automatically, however I didn't want to make a huge wholesale change here.

* Revert lock and add deprecation warning instead

* remove --cpu-shares flag (#10738)

* Bump Airbyte version from 0.35.40-alpha to 0.35.41-alpha (#10740)

Co-authored-by: jrhizor <jrhizor@users.noreply.github.com>

* Add Scylla destination to index (#10741)

* Add scylla to destination_definitions

* Add woocommerce source

* Update definition id

* Add icon

* update docker repository

* reset to master

* fix version

* generate spec

* Update builds.md

* run gradle format (#10746)

* Bump Airbyte version from 0.35.41-alpha to 0.35.42-alpha (#10747)

Co-authored-by: girarda <girarda@users.noreply.github.com>

* Change offer amount

* Fix back link on signup page (#10732)

* Fix back link on signup page

* Add and correct uiConfig links

* 🎉 Source redshift: implement privileges check (#9744)

* update postgres source version (#10696)

* update postgres source version

* update spec

* fix[api]: nullable connection schedule (#10107)

* fix[api] inconsistent casing on OperationID for Operations API  (#10464)

* #10307 Fixes inconsistent casing on OperationID for Operations API

* update generated doc

Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>

* Display numbers in usage per connection table (#10757)

* Add connector stage to dropdown value (#10677)

* Add connector stage to dropdown value

* Remove line break from i18n message

* Update snowflake destination docs for correct host (#10673)

* Update snowflake destination docs for correct host

* Update snowflake.md

* Update README.md

* Update spec.json

* Update README.md

* Update spec.json

* Update README.md

* Update snowflake.md

* Update spec.json

* Update spec.json

* 📕 source salesforce: fix broken page anchor in spec.json & add guide for adding read only user (#10751)

* 🎉  Source Facebook Marketing: add activities stream (#10655)

* add facebook marketing activities stream

* update incremental test

* add overrides for activities specific logic

* formatting

* update readme docs

* remove test limitation

* update dockerfile airbyte version

* correct tests

* bump connector version in config module

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Add a note about running only in dev mode on M1 (#10772)

Macs with M1 chip can run Airbyte only in dev mode right now, so to make it clear, I added a note about it and moved the hint about M1 chips to the top of the section.

* push failures to segment (#10715)

* test: new failures metadata for segment tracking

* new failures metadata for segment tracking

failure_reasons: array of all failures (as json objects) for a job
- for general analytics on failures
main_failure_reason: main failure reason (as json object) for this job
- for operational usage (for Intercom)
- currently this is just the first failure reason chronologically
    - we'll probably to change this when we have more data on how to
determine failure reasons more intelligently

- added an attempt_id to failures so we can group failures by attempt
- removed stacktrace from failures since it's not clear how we'd use
these in an analytics use case (and because segment has a 32kb size
limit for events)

* remove attempt_id

attempt info is already in failure metadata

* explicitly sort failures array chronologically

* replace "unknown" enums with null

note: ImmutableMaps don't allow nulls

* move sorting to the correct place
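The failure-metadata commit above can be sketched as follows. This is an illustrative Python sketch only (Airbyte's actual implementation is Java, and the function and field names here are hypothetical): failures are sorted chronologically, the full array goes to `failure_reasons` for analytics, and the first failure becomes `main_failure_reason` for operational use.

```python
import json

def build_failure_metadata(failure_reasons):
    """Assemble tracking metadata from failure reasons.

    failure_reasons: list of dicts, each with at least a 'timestamp'
    key (epoch ms). Field names are illustrative, not Airbyte's.
    """
    # explicitly sort failures chronologically
    ordered = sorted(failure_reasons, key=lambda f: f["timestamp"])
    return {
        # full array of all failures, for general analytics
        "failure_reasons": json.dumps(ordered),
        # first failure chronologically, for operational usage;
        # null (None) rather than an "unknown" placeholder when empty
        "main_failure_reason": json.dumps(ordered[0]) if ordered else None,
    }

failures = [
    {"timestamp": 200, "failureOrigin": "destination"},
    {"timestamp": 100, "failureOrigin": "source"},
]
meta = build_failure_metadata(failures)
```

Note the commit also removed stack traces from these payloads; keeping the serialized failures small matters because Segment enforces a 32 KB size limit per event.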

* Update temporal retention TTL from 7 to 30 days (#10635)

Increase the Temporal retention to 30 days instead of 7. It will help with on-call investigations.

* Add count connection functions (#10568)

* Add count connection functions

* Fix new configRepository queries

- Remove unnecessary joins
- Fix countConnection

* Use existing mock data for tests

* Adds default sidecar cpu request and limit and add resources to the init container (#10759)

* close ssh tunnel in case of exception in destination consumer (#10686)

* close ssh tunnel in case of exception

* format

* fix salesforce docs markdown formatting

* Fix typo in salesforce docs

* Extract event from the temporal worker run factory (#10739)

Extracts the different events that can happen during a sync into an interface that is not Temporal-related.

* Bump Airbyte version from 0.35.42-alpha to 0.35.43-alpha (#10778)

Co-authored-by: sherifnada <sherifnada@users.noreply.github.com>

* Added a note about running in dev mode on M1 macs (#10776)

Currently, Macs with M1 chips can run Airbyte only in dev mode. I added a note about that.

* Destination Snowflake: add missing version in changelog (#10779)

* Hide shopify in Cloud (#10783)

* Metrics Reporter Queries Part 1 (#10663)

Add all the simpler queries from https://docs.google.com/document/d/11pEUsHyKUhh4CtV3aReau3SUG-ncEvy6ROJRVln6YB4/edit?usp=sharing.

- Num Pending Jobs
- Num Concurrent Jobs
- Oldest Pending Job
- Oldest Running Job

* Bump Airbyte version from 0.35.43-alpha to 0.35.44-alpha (#10789)

* Bump Airbyte version from 0.35.43-alpha to 0.35.44-alpha

* Commit.

* Add exception block.

* Why would having try catch work?

* Add logging to figure out.

* Undo all debugging changes.

* Better comments.

Co-authored-by: davinchia <davinchia@users.noreply.github.com>
Co-authored-by: Davin Chia <davinchia@gmail.com>

* Update api-documentation.md

* jdbc build fixes (#10799)

* Update api-documentation.md

* Exclude package.json from codeowners (#10805)

* 🎉 Source Chargebee: add credit note model (#10795)

* feat(chargebee) add credit note model

* fix(airbyte): update version Dockerfile

* fix(airbyte): update version Dockerfile v2

* Source Chargebee: run format and correct unit test (#10811)

* feat(chargebee) add credit note model

* fix(airbyte): update version Dockerfile

* fix(airbyte): update version Dockerfile v2

* correct unit test

Co-authored-by: Koen Sengers <k.sengers@gynzy.com>

* 🎉 Source Chartmogul: Add CustomerCount stream (#10756)

* 🎉 Source Chartmogul: Add CustomerCount stream

* Update description

* address comments

* update changelog

* format source file

* run seed file

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* default to no resource limits for OSS (#10800)

* Add autoformat (#10808)

* Bump Airbyte version from 0.35.44-alpha to 0.35.45-alpha (#10818)

Co-authored-by: lmossman <lmossman@users.noreply.github.com>

* Set default values as current values in editMode (#10486)

* Set default values as current values in editMode

* Fix unit tests

* Save signup fields (#10768)

* Temporary save signup fields into firebase_user.displayName

* Use default values if no displayName was stored before

* Move regsiter to localStorage

* Address PR comments

* Source Woocommerce: fixes (#10529)

* fixed issues

* Fix: multiple issues

* modify configured catalog

* Fix: remove unused variables

* Fix: orders request with parameters

* Fix: add new line in configured catalogs

* Fix: remove unused imports

* Fix: catalog changes

* Source woocommerce: publishing connector (#10791)

* fixed issues

* Fix: multiple issues

* modify configured catalog

* Fix: remove unused variables

* Fix: orders request with parameters

* Fix: add new line in configured catalogs

* Fix: remove unused imports

* Fix: catalog changes

* fix: change schema for meta_data

Co-authored-by: Manoj <saimanoj58@gmail.com>

* Surface any active child thread of dying connectors  (#10660)

* Interrupt child thread of dying connectors to avoid getting stuck

* Catch and print stacktrace

* Add test on interrupt/kill time outs

* Send message to sentry too

* Add another token to alleviate API limit pressure. (#10826)

We are running into Github API rate limits.

This PR:
- introduces another token as a temp solution.
- reorganises the workflow file.

* Add caching to all jobs in the main build. (#10801)

Add build dependency caching to all jobs in the main build.

This speeds things up by 5 mins over the previously uncached time.

* 🐛 Handle try/catch in BigQuery destination consumers (#10755)

* Handle try/catch in BigQuery destination consumers

* Remove parallelStream

* Bumpversion of connector

* update changelogs

* update seeds

* Format code (#10837)

* Regenerate MySQL outputs from normalization tests

* format

* Use cypress dashboard and stabilize e2e tests (#10807)

* Record e2e tests to cypress dashboard

* Make env variable accessible in script

* Improve e2e_test script

* Properly wait for server to be ready

* Isolate test suites better

* More test isolation

* Revert baseUrl for development

* 🐛 Source Github: add new streams `Deployments`, `ProjectColumns`, `PullRequestCommits` (#10385)

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Remove the use of ConfigPersistence for ActorCatalog operation (#10387)

* Skip ConfigPersistence for ActorCatalog operations

* Fix catalog insertion logic

- ActorCatalog and ActorCatalogFetchEvent are stored within the same
  transaction.
- The function writing catalog now automatically handles deduplication.
- Fixed function visibility: helper function to handle ActorCatalog
  insertion are now private.

* Fix fetch catalog query

Take the catalog associated with the latest fetch event in cases where
multiple events are present for the same config, actorId, and actor version.

* Fix name of columns used for insert

* Add testing on deduplication of catalogs

* Add javadoc for actor catalog functions

* Rename sourceId to actorId

* Fix formatting
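The fetch-catalog fix above boils down to "latest fetch event wins" when several events exist for the same (actorId, config hash, actor version). A minimal Python sketch of that selection logic, assuming hypothetical field names (Airbyte's real implementation is Java/jOOQ against the config database):

```python
def latest_catalog(fetch_events):
    """Pick the catalog from the most recent fetch event.

    fetch_events: list of dicts with 'created_at' (epoch seconds) and
    'catalog_id'. Returns None when no events exist for the key.
    """
    if not fetch_events:
        return None
    # when multiple events share the same actor/config/version key,
    # the newest fetch event determines which catalog is returned
    newest = max(fetch_events, key=lambda e: e["created_at"])
    return newest["catalog_id"]

events = [
    {"created_at": 1, "catalog_id": "cat-a"},
    {"created_at": 5, "catalog_id": "cat-b"},
]
```

In SQL this would correspond to ordering fetch events by creation time descending and taking the first row's catalog.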

* Update integrations README.md (#10851)

Updated verbiage from grades to stages
Updated connector stages to match cloud stage tags
Added connectors missing from README.md that appear in the cloud drop-down

* [10033] Destination-Snowflake: added basic part for support oauth login mode

* added basic logic for token refresh

* Updated spec to support DBT normalization and OAuth

* snowflake oauth

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* test_transform_snowflake_oauth added

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* [4654] Added backward compatibility

* Added test to check a backward compatibility

* fixed oauth connection

* Updated doc, fixed code as per comments in PR

* to be more explicit

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Added executor service

* Fixed merge conflict

* Updated doc and bumped version

* Bumped version

* bump 0.1.71 -> 0.1.72

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Updated doc

* fix version in basic-normalization.md

Signed-off-by: Sergey Chvalyuk <grubberr@gmail.com>

* Added explicit re-set property, but even now it already works

* dummy bumping version

* updated spec

Co-authored-by: ievgeniit <etsybaev@gmail.com>
Co-authored-by: Tim Roes <tim@airbyte.io>
Co-authored-by: Octavia Squidington III <90398440+octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: timroes <timroes@users.noreply.github.com>
Co-authored-by: Philippe Boyd <philippeboyd@users.noreply.github.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Álvaro Torres Cogollo <atorrescogollo@gmail.com>
Co-authored-by: Eugene Kulak <widowmakerreborn@gmail.com>
Co-authored-by: Sherif A. Nada <snadalive@gmail.com>
Co-authored-by: Eugene Kulak <kulak.eugene@gmail.com>
Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
Co-authored-by: Amruta Ranade <11484018+Amruta-Ranade@users.noreply.github.com>
Co-authored-by: Charles <charles@airbyte.io>
Co-authored-by: Parker Mossman <parker@airbyte.io>
Co-authored-by: Jared Rhizor <me@jaredrhizor.com>
Co-authored-by: augan-rymkhan <93112548+augan-rymkhan@users.noreply.github.com>
Co-authored-by: auganbay <auganenu@gmail.com>
Co-authored-by: keterslayter <32784192+keterslayter@users.noreply.github.com>
Co-authored-by: Daniel Diamond <33811744+danieldiamond@users.noreply.github.com>
Co-authored-by: Ronald Fortmann <72810611+rfortmann-ewolff@users.noreply.github.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: ksoenandar <kevin.soenandar@gmail.com>
Co-authored-by: Aaditya Sinha <75474786+aadityasinha-dotcom@users.noreply.github.com>
Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>
Co-authored-by: Michele Zuccala <michele@zuccala.com>
Co-authored-by: vitaliizazmic <75620293+vitaliizazmic@users.noreply.github.com>
Co-authored-by: Davin Chia <davinchia@gmail.com>
Co-authored-by: Lakshmikant Shrinivas <lakshmikant@gmail.com>
Co-authored-by: Augustin <augustin.lafanechere@gmail.com>
Co-authored-by: Christophe Duong <christophe.duong@gmail.com>
Co-authored-by: lmossman <lake@airbyte.io>
Co-authored-by: lmossman <lmossman@users.noreply.github.com>
Co-authored-by: Maksym Pavlenok <antixar@gmail.com>
Co-authored-by: sherifnada <sherifnada@users.noreply.github.com>
Co-authored-by: LiRen Tu <tuliren.git@outlook.com>
Co-authored-by: Subodh Kant Chaturvedi <subodh1810@gmail.com>
Co-authored-by: girarda <alexandre@airbyte.io>
Co-authored-by: Vadym Hevlich <vege1wgw@gmail.com>
Co-authored-by: jdclarke5 <jdclarke5@gmail.com>
Co-authored-by: jrhizor <jrhizor@users.noreply.github.com>
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: Azhar Dewji <azhardewji@gmail.com>
Co-authored-by: Alasdair Brown <sdairs@users.noreply.github.com>
Co-authored-by: Julia <julia.chvyrova@gmail.com>
Co-authored-by: Lucas Wiley <lucas@tremendous.com>
Co-authored-by: Philip Corr <PhilipCorr@users.noreply.github.com>
Co-authored-by: Greg Solovyev <grishick@users.noreply.github.com>
Co-authored-by: Peter Hu <peter@airbyte.io>
Co-authored-by: Malik Diarra <malik@airbyte.io>
Co-authored-by: Thibaud Chardonnens <thibaud.ch@gmail.com>
Co-authored-by: davinchia <davinchia@users.noreply.github.com>
Co-authored-by: Erica Struthers <93952107+erica-airbyte@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Tim Roes <mail@timroes.de>
Co-authored-by: ksengers <30521298+Koen03@users.noreply.github.com>
Co-authored-by: Koen Sengers <k.sengers@gynzy.com>
Co-authored-by: Titas Skrebe <titas@omnisend.com>
Co-authored-by: Artem Astapenko <3767150+Jamakase@users.noreply.github.com>
Co-authored-by: Manoj Reddy KS <saimanoj58@gmail.com>
Co-authored-by: Harshith Mullapudi <harshithmullapudi@gmail.com>
Co-authored-by: Juan <80164312+jnr0790@users.noreply.github.com>