Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for managing offsets in Connect #121

Merged
merged 11 commits into from
Jul 16, 2024

Conversation

katheris
Copy link
Contributor

No description provided.

@katheris katheris requested a review from mimaison May 21, 2024 12:57
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the proposal. I left some comments ... but apart from them, I have a bit mixed feelings about it.

I think it is well designed in terms of the API and I'm not sure it could be done in some significantly better way. But it seems very expensive for something what seems like a pretty niche feature:

  • The initial implementation will be non-trivial
  • The maintenance effort will be very significant because you have a new custom resource watches, CRDs, you will need to have some metrics => historically, all of these are very expensive to maintain.

So I wonder if it is really worth the effort. And I guess for me it falls to whether we still expect that we can disable the Connect REST API access completely to users using the KafkaConnector resources. If we still think we can cut off the REST API access, why haven't we done it yet? Or if we think it is not something we can really do and that users will always want the REST API access, isn't it better to leave this on the REST API and focus on securing the REST API?

Or did I somehow missed the importance of this feature and it is now (despite being missing for many years) expected that users will be reseting the offsets twice per day?

074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
@katheris
Copy link
Contributor Author

Thanks for the proposal. I left some comments ... but apart from them, I have a bit mixed feelings about it.

I think it is well designed in terms of the API and I'm not sure it could be done in some significantly better way. But it seems very expensive for something what seems like a pretty niche feature:

  • The initial implementation will be non-trivial
  • The maintenance effort will be very significant because you have a new custom resource watches, CRDs, you will need to have some metrics => historically, all of these are very expensive to maintain.

So I wonder if it is really worth the effort. And I guess for me it falls to whether we still expect that we can disable the Connect REST API access completely to users using the KafkaConnector resources. If we still think we can cut off the REST API access, why haven't we done it yet? Or if we think it is not something we can really do and that users will always want the REST API access, isn't it better to leave this on the REST API and focus on securing the REST API?

Or did I somehow missed the importance of this feature and it is now (despite being missing for many years) expected that users will be reseting the offsets twice per day?

I wouldn't say it is a niche feature, but I do think it is something that users won't do especially frequently. I think for me the most valuable feature is being able to delete offsets before deleting a connector. So at a minimum I think we should support the delete offset path, which also happens to be the simplest and least additional maintenance.

As a bonus we could also look at providing an option that allows users to request we clean up the offsets before deleting a connector, although I don't think we'd be able to do that in a way that guarantees it was successful (e.g. if the operator goes down half way through stopping, deleting offsets and deleting the connector).

In terms of supporting securing the Connect REST API that's still stuck behind the other certificate work which is obviously progressing, but isn't a small task.

In terms of listing and altering offsets, I do think those are less likely to be used often. If we switched to using a configmap as you suggested for the altering of offsets then perhaps we could do it without a separate CR to reduce the additional maintenance? Then for listing offsets we could again reduce the support, so e.g allow a user to request the offsets using an annotation or something, but make it a one time thing, so to get the updated offsets you have to add the annotation again.

@mimaison what are your thoughts on how frequently these endpoints would be used?

@mimaison
Copy link
Contributor

These update/delete endpoints are not expected to be used a lot apart during development. You typically only need to update offsets when setting up a new pipeline or when handling issues like an unprocessable record.

The list offsets endpoint is useful to monitor a connector and validate it is making progress. This is especially the case with MirrorMaker at least until KIP-971 gets implemented.

Still being able to fully operate connectors via CRDs would be nice.

Securing the REST API is in my opinion more fundamental but it's an orthogonal issue and seems to be "in progress".

@katheris
Copy link
Contributor Author

katheris commented Jun 3, 2024

@mimaison do you think it will be a common usecase to want to continually fetch offsets, or would having the user annotate the CR every time they want the offsets to be listed be sufficient for most usecases?

@mimaison
Copy link
Contributor

mimaison commented Jun 3, 2024

I'm not sure continually fetching offsets is very important. You can track consumer offsets for sink connectors and for source connectors the offsets are specific for the connector and may not may a lot of users. In my opinion the main values of the API are 1) retrieving offsets to update them (or apply them to another Connect cluster), 2) see they are changing to validate the connector is making progress.

Today the main reason for continuous fetching would be MirrorMaker as in this case source offsets are Kafka offsets and understandable by users. But as I said, this should be addressed by KIP-971.

@katheris
Copy link
Contributor Author

katheris commented Jun 4, 2024

@scholzj @mimaison @ppatierno thanks for your reviews. I've updated the proposal to hopefully simplify things from a maintenance perspective based on our discussions. In summary the changes are:

  • Update to only list offsets once, not continually
  • Update list and alter offsets to use a ConfigMap
  • Add option to reset offsets when deleting a connector

Let me know if you think this works better

074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SOme nits. But LGTM otherwise.

074-connector-offsets-support.md Outdated Show resolved Hide resolved
074-connector-offsets-support.md Outdated Show resolved Hide resolved
Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

You do not mention anything about MM2 -> Is it because the MM2 source connectors store the offsets as regular commits in the source cluster? Or will we possibly need a separate proposal to cver MM2 in the future? (In both cases, it might be worth explaining it in the affected projects section)?

@katheris
Copy link
Contributor Author

katheris commented Jul 2, 2024

You do not mention anything about MM2 -> Is it because the MM2 source connectors store the offsets as regular commits in the source cluster? Or will we possibly need a separate proposal to cver MM2 in the future? (In both cases, it might be worth explaining it in the affected projects section)?

That's a good point, if the users are using the KafkaConnector CR they can manage the offsets for the MM2 source connectors like any other source connector. However, if they are using the KafkaMirrorMaker2 CR I think we should also provide support. Since the format of the offsets is known for these I think we can give a more guided experience, rather than allowing a free-form ConfigMap, so I've added a note in the Affected/not affected projects that this should be covered in a separate proposal.

@katheris
Copy link
Contributor Author

katheris commented Jul 3, 2024

@mimaison @ppatierno when you have some time please take another look at the proposal with the recent updates

Copy link
Contributor

@mimaison mimaison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think this simpler approach works well and it should still enable users to perform all the offsets operations on their connectors.

Copy link
Member

@ppatierno ppatierno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I added two more comments/nits.

075-connector-offsets-support.md Outdated Show resolved Hide resolved
075-connector-offsets-support.md Outdated Show resolved Hide resolved
Copy link
Member

@tombentley tombentley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I think this will make a nice improvement. I left a few minor comments, but over all this LGTM.

075-connector-offsets-support.md Outdated Show resolved Hide resolved
075-connector-offsets-support.md Outdated Show resolved Hide resolved
075-connector-offsets-support.md Outdated Show resolved Hide resolved
Copy link
Contributor

@fvaleri fvaleri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
* Update to only list offsets once, not continually
* Update list and alter offsets to use a ConfigMap
* Add option to reset offsets when deleting a connector

Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
* Use more complex API for ConfigMap
names to make it extensible.
* Align behaviour when Connector is not stopped.
* Remove reference to "file" in ConfigMap.

Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
* Remove section on resetting offsets when
removing connectors.
* Make listOffsets and alterOffsets properties
required.
* Leave the annotation on the resource when an
error occurs.

Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
* Change annotation to connector-offsets, not camel case.
* Clarifying overwriting of data filed in ConfigMap.
* Add ownerReferences to ConfigMap examples.
* Clarify operator will validatate for syntatically valid json.
* Explicitly state that offsets.json must be present in alter ConfigMap.
* Explicitly state that operator will ignore other fields in alter ConfigMap.

Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
@scholzj
Copy link
Member

scholzj commented Jul 10, 2024

This has now 3 binding and 2 non-binding +1 votes. If you have any comments or plan to review it, please do so until Friday. Otherwise, we will close it as approved.

Copy link
Contributor

@PaulRMellor PaulRMellor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clear proposal. I left some suggestions.
The proposed solution seems easy to set up and use: user-friendly

076-connector-offsets-support.md Outdated Show resolved Hide resolved
076-connector-offsets-support.md Outdated Show resolved Hide resolved
076-connector-offsets-support.md Outdated Show resolved Hide resolved
076-connector-offsets-support.md Show resolved Hide resolved
076-connector-offsets-support.md Outdated Show resolved Hide resolved
076-connector-offsets-support.md Outdated Show resolved Hide resolved
076-connector-offsets-support.md Outdated Show resolved Hide resolved
076-connector-offsets-support.md Outdated Show resolved Hide resolved
076-connector-offsets-support.md Outdated Show resolved Hide resolved
katheris and others added 2 commits July 15, 2024 10:49
Co-authored-by: PaulRMellor <47596553+PaulRMellor@users.noreply.github.com>
Signed-off-by: Kate Stanley <11195226+katheris@users.noreply.github.com>
Signed-off-by: Katherine Stanley <11195226+katheris@users.noreply.github.com>
@scholzj
Copy link
Member

scholzj commented Jul 16, 2024

Approved with 5 binding and 2 non-binding +1 votes.

@scholzj scholzj merged commit 0b29747 into strimzi:main Jul 16, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants