
Distributed KafkaChannel Sarama Client Refactor #297

Conversation

travis-minke-sap
Contributor

This PR is a comprehensive refactoring of the Sarama clients (admin / producer / consumer) used in the distributed KafkaChannel implementation. The reason for the refactor is to move closer towards an implementation that could be used by both Channel implementations. The work of further refining the implementation, moving it to pkg/common, and using it from the consolidated channel is left for the future, as it is not a trivial exercise.

In general, the creation of Sarama clients is not a complicated process, so you might ask why we would need/want a common set of code for doing so. There are two main reasons...

  • The distributed implementation provides support for Azure EventHubs as well as a "custom" sidecar proxy for the Sarama AdminClient's ability to Create/Delete Kafka Topics. If this logic were common then the consolidated implementation would instantly gain this capability as well.
  • The Sarama library does not provide fakes or mocks for testing and is not coded against interfaces. Therefore the distributed implementation provides an Interface for the AdminClient usage as well as testing utilities (mocks/stubs) for all the Sarama clients.
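The second point above is the classic wrap-it-in-an-interface pattern: since sarama is not coded against interfaces and ships no fakes, the distributed implementation defines its own small interface and test doubles. The sketch below illustrates that pattern with hypothetical local types (`AdminClientInterface`, `MockAdminClient` — the real code wraps sarama's `ClusterAdmin`, with different method signatures):

```go
package main

import "fmt"

// AdminClientInterface is a hypothetical illustration of the pattern: the
// concrete Sarama admin client is hidden behind a small interface so that
// unit tests can substitute a mock (sarama itself provides no fakes).
type AdminClientInterface interface {
	CreateTopic(name string, partitions int32) error
	DeleteTopic(name string) error
}

// MockAdminClient is an in-memory test double recording created topics.
type MockAdminClient struct {
	Topics map[string]int32
}

func NewMockAdminClient() *MockAdminClient {
	return &MockAdminClient{Topics: map[string]int32{}}
}

func (m *MockAdminClient) CreateTopic(name string, partitions int32) error {
	if _, exists := m.Topics[name]; exists {
		return fmt.Errorf("topic %q already exists", name)
	}
	m.Topics[name] = partitions
	return nil
}

func (m *MockAdminClient) DeleteTopic(name string) error {
	if _, exists := m.Topics[name]; !exists {
		return fmt.Errorf("topic %q not found", name)
	}
	delete(m.Topics, name)
	return nil
}

func main() {
	// Production code only sees the interface; tests inject the mock.
	var admin AdminClientInterface = NewMockAdminClient()
	fmt.Println(admin.CreateTopic("my-channel", 4)) // <nil>
	fmt.Println(admin.DeleteTopic("my-channel"))    // <nil>
}
```

Because callers depend only on the interface, swapping in the EventHub or custom-sidecar backed implementations (or a mock in tests) requires no changes to the calling code.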

OK, fine, but that doesn't explain why we need to refactor the current distributed implementation. The answer is that, for historical reasons, the implementation was dynamically loading Secrets containing Kafka authentication credentials and adding them to the provided Sarama.Config. This was done to support the "pooling" of Azure EventHub authentication credentials in order to work around the Azure limitation of 10 EventHubs (Kafka Topics) per EventHub Namespace. This refactor removes that pooling capability in favor of a simple approach whereby the Sarama.Config struct is expected to be complete and no Kubernetes resources are loaded. This also allows the distributed implementation to align with recent changes around common configuration/authentication handling.
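The "complete config" contract can be sketched as follows. This is a simplified, hypothetical stand-in — the real code passes a `sarama.Config` (whose SASL settings live under `Net.SASL`), not this local struct — but it shows the shape of the new expectation: authentication is populated by the caller up front, and client constructors never reach out to Kubernetes Secrets:

```go
package main

import (
	"errors"
	"fmt"
)

// KafkaConfig is a hypothetical stand-in for sarama.Config, reduced to the
// authentication fields relevant here. Under the refactor, callers must
// populate these before constructing any admin/producer/consumer client;
// nothing is loaded from Kubernetes Secrets at client-creation time.
type KafkaConfig struct {
	SASLEnabled bool
	SASLUser    string
	SASLPass    string
}

// ValidateComplete enforces the contract: a config handed to the client
// constructors must already carry any credentials it claims to need.
func ValidateComplete(c KafkaConfig) error {
	if c.SASLEnabled && (c.SASLUser == "" || c.SASLPass == "") {
		return errors.New("sarama config incomplete: SASL enabled but credentials missing")
	}
	return nil
}

func main() {
	complete := KafkaConfig{SASLEnabled: true, SASLUser: "user", SASLPass: "secret"}
	fmt.Println(ValidateComplete(complete)) // <nil>
}
```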

Great, but why doesn't this PR go further and actually make the new implementation a reusable component in pkg/common? That was the original intent of this work, but recent changes to the consolidated implementation have made that complicated. New code was added to list the ConsumerGroups and cache them in order to maintain KafkaChannel Status. The distributed channel has a different architecture and is able to do this tracking inline. This means that the consolidated implementation is now using ListConsumerGroups() from the Sarama.AdminClient, which cannot be supported by the Azure EventHubs (and likely the custom-sidecar) use cases. Therefore, it makes sense to tackle that problem (if desired) as a separate effort that might consider one of the following approaches...

  • Refactor the listing/caching of ConsumerGroups in the consolidated implementation to eliminate this incompatibility.
  • Add the ListConsumerGroups() functionality to the distributed implementation's AdminClientInterface and find a way to support it for the EventHubs/custom-sidecar usage.
  • etc.

Proposed Changes

  • Remove the dynamic lookup of K8S Secrets and the modification of the provided Sarama.Config from the AdminClient, SyncProducer, and Consumer clients and instead expect to be provided a complete Sarama.Config with all necessary Authentication already included.
  • Remove the EventHub Namespace pooling, thus limiting usage to 10 EventHubs (Kafka Topics).
  • Enhance the ability to "stub" the creation of these clients to make unit testing easier.
  • Enhance the unit tests a bit.
  • Miscellaneous cleanup/enhancements.
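The "stubbing" of client creation mentioned above typically means exposing the constructor as a package-level function variable that tests can swap out and later restore (the review thread below references a `producertesting.RestoreNewSyncProducerFn()` helper in this spirit). The following is a simplified, hypothetical sketch of that pattern, not the PR's actual code:

```go
package main

import "fmt"

// Producer is a minimal stand-in for sarama.SyncProducer.
type Producer interface {
	Close() error
}

type realProducer struct{}

func (realProducer) Close() error { return nil }

// NewSyncProducerFn is a package-level constructor variable: production code
// calls it, and unit tests swap in a stub so no real Kafka connection is made.
var NewSyncProducerFn = func(brokers []string) (Producer, error) {
	return realProducer{}, nil
}

// StubNewSyncProducerFn replaces the constructor and returns a restore func,
// mirroring the stub/restore helpers used by the PR's testing utilities.
func StubNewSyncProducerFn(stub func([]string) (Producer, error)) (restore func()) {
	original := NewSyncProducerFn
	NewSyncProducerFn = stub
	return func() { NewSyncProducerFn = original }
}

type stubProducer struct{ closed bool }

func (s *stubProducer) Close() error { s.closed = true; return nil }

func main() {
	stub := &stubProducer{}
	restore := StubNewSyncProducerFn(func([]string) (Producer, error) { return stub, nil })
	defer restore()

	p, _ := NewSyncProducerFn([]string{"broker:9092"})
	p.Close()
	fmt.Println(stub.closed) // true
}
```

The restore closure makes it easy for tests to `defer` putting the real constructor back, which keeps stubbed state from leaking between test cases.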

Release Note

Removed support for pooling Azure EventHub Namespaces; only a single Namespace/Authentication is now supported, which limits Azure EventHub usage to the constrained number of EventHubs (Kafka Topics) per Namespace.

@knative-prow-robot knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 7, 2021
@google-cla google-cla bot added the cla: yes Indicates the PR's author has signed the CLA. label Jan 7, 2021
@knative-prow-robot knative-prow-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jan 7, 2021
@codecov

codecov bot commented Jan 7, 2021

Codecov Report

Merging #297 (795f44d) into master (0c43d38) will increase coverage by 0.32%.
The diff coverage is 81.32%.


@@            Coverage Diff             @@
##           master     #297      +/-   ##
==========================================
+ Coverage   75.76%   76.08%   +0.32%     
==========================================
  Files         116      114       -2     
  Lines        4555     4391     -164     
==========================================
- Hits         3451     3341     -110     
+ Misses        900      851      -49     
+ Partials      204      199       -5     
Impacted Files Coverage Δ
...tributed/common/kafka/admin/eventhub/hubmanager.go 100.00% <ø> (ø)
...hannel/distributed/common/kafka/admin/util/util.go 100.00% <ø> (ø)
...stributed/common/kafka/admin/custom/adminclient.go 61.42% <55.55%> (ø)
...channel/distributed/common/config/configwatcher.go 40.00% <66.66%> (ø)
.../distributed/controller/kafkachannel/controller.go 77.27% <68.75%> (-2.10%) ⬇️
.../channel/distributed/receiver/producer/producer.go 62.88% <72.72%> (-5.81%) ⬇️
.../channel/distributed/common/kafka/sarama/sarama.go 86.20% <80.00%> (ø)
pkg/channel/distributed/controller/util/secret.go 90.00% <82.35%> (-10.00%) ⬇️
...istributed/common/kafka/admin/kafka/adminclient.go 82.75% <82.75%> (ø)
...kg/channel/distributed/common/kafka/admin/admin.go 100.00% <100.00%> (+47.05%) ⬆️
... and 12 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0c43d38...9b1bad4. Read the comment docs.

Contributor

@eric-sap eric-sap left a comment


Overall refactor looks great. I have a few relatively minor comments.

pkg/channel/distributed/common/config/testing/util.go (outdated; resolved)
pkg/channel/distributed/common/config/testing/util.go (outdated; resolved)
defer producertesting.RestoreNewSyncProducerFn()

// Create Producer To Test
producer := createTestProducer(t, brokers, config, mockSyncProducer)
Contributor


This test is doing exactly the same thing that TestNewProducer() is doing (except for the assert.False). Do we need TestNewProducer() as a separate test instead of just verifying "!mockSyncProducer.Closed()" here? On the other hand, it's not hurting anything.

Contributor Author


Yeah, I've had them combined before (removed the test) and it can sometimes get messy when you want to verify different things... You are right about similarity but I'd like to leave it this way for now.

Contributor

@eric-sap eric-sap left a comment


These look great; thanks!

@aliok
Member

aliok commented Jan 11, 2021

@travis-minke-sap I couldn't check every change but overall the changes look good. I would like to get this merged before I start working on #298

Could you fix the linting issue please?

@travis-minke-sap
Contributor Author

Thanks @aliok - appreciate you taking a look. @eric-sap did a really thorough review, so it should be good to merge once all the builds pass. I just pushed the fix for the last linter issue - not sure how I missed that (blaming it on Friday Afternoon Syndrome ; )

@matzew
Contributor

matzew commented Jan 11, 2021

/assign

Contributor

@matzew matzew left a comment


/lgtm
/approve

@knative-prow-robot knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 12, 2021
@knative-prow-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: eric-sap, matzew, travis-minke-sap

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [matzew,travis-minke-sap]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow-robot knative-prow-robot merged commit facef00 into knative-extensions:master Jan 12, 2021
@travis-minke-sap travis-minke-sap deleted the sarama-client-refactor branch January 12, 2021 14:32
devguyio pushed a commit to devguyio/eventing-kafka that referenced this pull request Aug 11, 2021
…#297)

Signed-off-by: Matthias Wessendorf <mwessend@redhat.com>
matzew added a commit to matzew/eventing-kafka that referenced this pull request Aug 12, 2021
…#297)

Signed-off-by: Matthias Wessendorf <mwessend@redhat.com>
matzew added a commit to matzew/eventing-kafka that referenced this pull request Aug 20, 2021