Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore HttpAvailabilityStrategy as default for HttpStreams #21924

Merged
merged 2 commits into from
Feb 1, 2023

Conversation

erohmensing
Copy link
Contributor

@erohmensing erohmensing commented Jan 26, 2023

Set HttpAvailabilityStrategy as default for HttpStreams
Reverts #21488 and updates documentation accordingly
Closes #20401

Originally was going to be part of #21888, but want to avoid race conditions where if a connector is not published correctly, it could accidentally get published with the default by someone else before it could be published with the override.

Don't merge/release until connectors in #21888 have been published!

Pre-Merge

Post-Merge follow-up

  • Publish CDK

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@sherifnada sherifnada requested review from maxi297 and removed request for a team January 31, 2023 06:52
@sherifnada
Copy link
Contributor

asking for a review from one of @brianjlai or @maxi297 to represent ConEx

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

You can use this `HttpAvailabilityStrategy` in your `HttpStream` by adding the following property to your stream class:
### Customizing stream availability

You can subclass `HttpAvailabilityStrategy` to override the `reasons_for_unavailable_status_codes` to except more HTTP error codes and inform the user how to resolve errors specific to your connector or stream.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be outside of the scope of this PR but I think it's still important to have a discussion around this: I fear sources being able to subclass any of the CDK classes. Like, I almost don't sleep at night thinking about all the things sources could do when re-implementing classes. This brings a lot of maintenance in the long run. For example, let's say we refactor the concept of HttpAvailabilityStrategy, we now have to migrate all the custom implementations. It's even harder with source custom implementations because we often don't know the full context of the source and all its usage/use cases and we don't have an exhaustive safety net for most of the sources (or at least, I'm not very confident when I modify a source and want to know if I broke anything).

Is there a reason why we wouldn't add the functionality to support new HTTP code. Something like:

class HttpAvailabilityStrategy(AvailabilityStrategy):
  _REASONS_PER_HTTP_STATUS = {
    403: <a message that might support templating if needed>,
    <...>
 } 
    
  def __init__(self, unavailable_http_statuses=[403]):
    <...>

If eventually we want to allow custom message, this is still possible with the solution above ; we can add a override_reason parameter in the constructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's say we refactor the concept of HttpAvailabilityStrategy, we now have to migrate all the custom implementations

This is a good point, and something that I think Alex was trying to get across in a previous review. However, I find a lot of the same problems in the current implementation of having users subclass AbstractSource and HttpStream. I think a lot of the implementations for handling availability will be specific to different sources - being able to provide specific error messages to the users is key to making this an actually useful feature. However, passing that dictionary of statuses and messages as an init parameter does seem like it'd also work and allow for easier refactoring down the line.

If you're feeling strongly about it, do you want to pair for a short amount of time tomorrow to talk problems and potential solutions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a lot of the implementations for handling availability will be specific to different sources

For HTTP, I think we could probably get fancy there but I'm wondering if the value it provides is worth the maintenance incurred by subclassing. And yes, other technologies (RPC, SOAP, DB, etc...) will probably need to have different strategies but the less we have, the better it is.

If you're feeling strongly about it, do you want to pair for a short amount of time tomorrow to talk problems and potential solutions?

Feel free to book anything that fits in my calendar! I don't have the full context on this and therefore my point might be void once I know more about the situation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, we'll move forward with this as planned and take a look at what the implementations look like as GL starts migrating connectors which have custom error handling to the HttpAvailabilityStrategy, with the hopes of getting something with a good default case, and configurable via the low-code parameters, into the low-code CDK 1.0. I created an issue in my team's board to track it, but see it as something we might collaborate on. Feel free to add anything to the issue, especially timelines re: low code release!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDK Connector Development Kit
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Roll out AvailabilityStrategy - slowly for GA connectors, as default for others
6 participants