Expose cron scheduling in the Connections APIs #15253

Merged: 16 commits merged into master from msiega/cronstrings-api on Aug 11, 2022

Conversation

mfsiega-airbyte
Contributor

What

Expose cron scheduling in the connection APIs.

How

  • Introduce the scheduleType and scheduleData fields alongside the now-deprecated schedule.
  • Consume these if available.
  • Populate both the old and new schemas for the existing types, so the frontend can switch without disruption.
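
As a rough illustration of the last point, the sketch below shows how a response could carry both shapes for a legacy interval schedule. The helper name `to_dual_schedule`, the `basic` type value, the nested keys (`basicSchedule`, `units`, `timeUnit`), and the rule that a missing legacy schedule maps to `manual` are assumptions made for this example. They are not taken from the PR; `config.yaml` is the authoritative definition.

```python
from typing import Any, Dict, Optional


def to_dual_schedule(legacy: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return both the deprecated and the new schedule fields (hypothetical helper)."""
    if legacy is None:
        # Assumption: a connection without a legacy schedule is triggered manually.
        return {"schedule": None, "scheduleType": "manual", "scheduleData": None}
    return {
        # Deprecated field, still populated so older clients keep working.
        "schedule": dict(legacy),
        # New fields; the key names below are assumed, not confirmed by the PR text.
        "scheduleType": "basic",
        "scheduleData": {"basicSchedule": dict(legacy)},
    }


if __name__ == "__main__":
    print(to_dual_schedule({"units": 24, "timeUnit": "hours"}))
    print(to_dual_schedule(None))
```

A client that only understands `schedule` keeps working, while a client that has switched to `scheduleType` and `scheduleData` reads the same information from the new fields.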

Recommended reading order

  1. airbyte-api/src/main/openapi/config.yaml
  2. airbyte-server/src/main/java/io/airbyte/server/handlers/ConnectionsHandler.java
  3. airbyte-server/src/main/java/io/airbyte/server/handlers/helpers/ConnectionScheduleHelper.java
  4. airbyte-server/src/test/java/io/airbyte/server/handlers/ConnectionSchedulerHelperTest.java

🚨 User Impact 🚨

None

Pre-merge Checklist

Expand the relevant checklist and delete the others.

Tests

Unit
  • Updated unit tests for connections handler
  • Added unit tests for cron parsing from Create requests
Integration

Todo in a subsequent PR.

Acceptance

Todo in a subsequent PR.

@github-actions github-actions bot added area/api Related to the api area/documentation Improvements or additions to documentation area/platform issues related to the platform area/server area/worker Related to worker labels Aug 3, 2022
@mfsiega-airbyte mfsiega-airbyte requested review from timroes, a team and terencecho and removed request for a team August 3, 2022 17:32
Contributor

@terencecho terencecho left a comment

overall lgtm. Added a few comments to the PR.

if "cron" in configuration["schedule_data"]:
    cron = ConnectionScheduleDataCron(**configuration["schedule_data"]["cron"])
    configuration["schedule_data"]["cron"] = cron
configuration["schedule_data"] = ConnectionScheduleData(**configuration["schedule_data"])
Contributor

is there a reason why the manual type isn't here?

Contributor Author

When the type is manual, then schedule_data is null, so we only need to handle the schedule_type (in line 606).
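
A minimal sketch of that behavior, for readers following along. The dataclasses `Cron` and `ScheduleData` are stand-ins for the generated client classes `ConnectionScheduleDataCron` and `ConnectionScheduleData`, and the field names `cron_expression` and `cron_time_zone` as well as the helper `deserialize_schedule` are assumptions for illustration. Only the cron case is covered here.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Cron:
    """Stand-in for ConnectionScheduleDataCron (field names are assumed)."""
    cron_expression: str
    cron_time_zone: str


@dataclass
class ScheduleData:
    """Stand-in for ConnectionScheduleData; only the cron variant is modeled."""
    cron: Optional[Cron] = None


def deserialize_schedule(configuration: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: wrap raw schedule fields into typed objects."""
    result = dict(configuration)
    if result.get("schedule_type") == "manual":
        # Manual connections carry no schedule_data, so there is nothing to wrap.
        result["schedule_data"] = None
        return result
    raw = dict(result.get("schedule_data") or {})
    if "cron" in raw:
        raw["cron"] = Cron(**raw["cron"])
    result["schedule_data"] = ScheduleData(**raw)
    return result


if __name__ == "__main__":
    print(deserialize_schedule({"schedule_type": "manual"}))
    print(deserialize_schedule({
        "schedule_type": "cron",
        "schedule_data": {"cron": {"cron_expression": "0 0 12 * * ?", "cron_time_zone": "UTC"}},
    }))
```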

Contributor Author

mfsiega-airbyte commented Aug 9, 2022

Believe any remaining test failures are unrelated to this PR.

Adding @alafanechere to take a look at the octavia-cli changes. Also a question for you - it took a bit of time to figure out what code change needed to happen, though I think this will probably happen anytime anybody adds a new object field to an API (maybe?). Any thoughts on a possible doc update to make it easier? Perhaps just a note in the README? (Though I'm not 100% sure what it should say.)

And @timroes if you want to take a quick look from the API/FE perspective?

Contributor

@alafanechere alafanechere left a comment

@mfsiega-airbyte I think I made the appropriate changes to the CLI to manage the two new fields schedule_type and schedule_data in the generate, apply and import commands. For simplicity, I directly deprecated the schedule field to encourage CLI users to update their configuration to use these two new fields.

I'm not entirely sure about the logic of the deserialize_raw_configuration function, which is why I "request changes". Could you please let me know if the implementation looks fine?

I also updated the tests so the CLI build should be green.


if "schedule_type" in configuration:
    # If schedule type is manual we do not expect a schedule_data field to be set
    # TODO: sending a WebConnectionCreate payload without schedule_data (for manual) fails.
Contributor

@mfsiega-airbyte I tried to send a payload with only schedule_type set to manual but received an error response saying schedule_data must be set. Could you please double-check the behavior of this function and if it matches the expected logic on the API side? I'd be thankful if you could also update the corresponding unit test (test__deserialize_raw_configuration) to ensure all cases are properly handled.

Contributor Author

@mfsiega-airbyte mfsiega-airbyte Aug 11, 2022

Turns out this was happening when updating from manual -> non-manual; fixed this in the API.

Added a bit of test coverage as well.

Otherwise, LGTM!

Collaborator

timroes commented Aug 10, 2022

Looked at the API (not the implementation). Given that oneOf still doesn't seem to work in our generator (joelittlejohn/jsonschema2pojo#392), this seems like the best solution we can get as an API. If it did work, it would be better to express in the type that when scheduleType is cron, the cron expression is set as well. But this would require oneOf to work properly :( So given this limitation, I feel this is the best API we can create for now.
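
Since the schema itself cannot tie scheduleData to scheduleType, a consumer would have to check the pairing at runtime. The sketch below is one way to do that, not the server's actual validation. The function name `validate_schedule` and the nested key `cronExpression` are assumptions for illustration, and types other than manual and cron are left unchecked.

```python
from typing import Any, Dict, Optional


def validate_schedule(schedule_type: str, schedule_data: Optional[Dict[str, Any]]) -> None:
    """Raise ValueError when scheduleData does not match scheduleType (hypothetical check)."""
    if schedule_type == "manual":
        # Manual schedules need no data, so there is nothing to verify.
        return
    if schedule_type == "cron":
        cron = (schedule_data or {}).get("cron")
        if not cron or not cron.get("cronExpression"):
            raise ValueError("cron schedules require scheduleData.cron.cronExpression")
        return
    # Other schedule types are out of scope for this sketch and pass through.


if __name__ == "__main__":
    validate_schedule("manual", None)
    validate_schedule("cron", {"cron": {"cronExpression": "0 0 12 * * ?"}})
    try:
        validate_schedule("cron", None)
    except ValueError as err:
        print(f"rejected: {err}")
```

With a working oneOf, this constraint could live in the generated types instead of in hand-written checks like this one.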

Contributor

@alafanechere alafanechere left a comment

LGTM on the octavia-cli side, thanks!

octavia-cli/unit_tests/test_apply/test_resources.py (outdated, resolved)
Co-authored-by: Augustin <augustin.lafanechere@gmail.com>
@mfsiega-airbyte mfsiega-airbyte merged commit 294ee8f into master Aug 11, 2022
@mfsiega-airbyte mfsiega-airbyte deleted the msiega/cronstrings-api branch August 11, 2022 17:27
@@ -4564,6 +4634,10 @@ components:
          $ref: "#/components/schemas/AirbyteCatalog"
        schedule:
          $ref: "#/components/schemas/ConnectionSchedule"
        scheduleType:
Contributor

Why is it not encapsulated in one object?

Schedule:
  type:
  data:

Contributor Author

In my opinion the deeper nesting would make it a bit more unwieldy without too much benefit - encapsulating it under an object doesn't really reduce the cognitive load since every user has to figure out what to do with schedule data anyway whether it's top-level or nested.

I did consider the approach you propose, and I'm not really opposed to it. It comes with the wrinkle that during the migration we have this schedule object that contains both the old and new schemas. (Or - we have some transition period where we have the new schema with some new name, remove the old schema, and then migrate again the new schema to the existing Schedule name.) All in all this felt simpler; but open to feedback here.

Contributor

No strong opinion on my end. However, anything related to migration complexity is a non-issue to me since we are v0. I'd rather have something clean that we actually want than make design tradeoffs based on backward compatibility or migrations.
