Kafka: Add broker-level metrics-collecting filter #8188

Merged
merged 59 commits into from Jan 6, 2020

@adamkotwasinski
Contributor

adamkotwasinski commented Sep 9, 2019

Description: Simple Kafka filter for broker. It captures and decodes messages and updates metrics.
If a message cannot be recognised (the protocol version used in client<->broker communication is higher than what Envoy supports), we just pass the payloads through (IMHO it would be bad to have Envoy impact communication in this particular filter - this is not going to be possible in the "mesh filter", as we'd need).
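
A minimal sketch of the pass-through behaviour described above, written in Python for illustration only (the filter itself is C++). The version table, metric names, and the `observe_request` helper are assumptions made for this example, not taken from the PR; the only protocol fact relied on is the Kafka request layout (int32 length, int16 api_key, int16 api_version, int32 correlation_id).

```python
import struct
from collections import Counter

# Hypothetical table: highest request version this sketch "understands" per api_key.
# (0 = Produce, 1 = Fetch in the Kafka protocol; the version limits are made up.)
MAX_KNOWN_VERSION = {0: 7, 1: 10}


def observe_request(frame: bytes, metrics: Counter, prefix: str = "kafka") -> bytes:
    """Update metrics for one length-prefixed Kafka request; never alter the bytes."""
    # Too short to hold the fixed part of a request header: count it and pass it on.
    if len(frame) < 12:
        metrics[f"{prefix}.request.unknown"] += 1
        return frame
    # int32 length, int16 api_key, int16 api_version, int32 correlation_id (big-endian).
    _length, api_key, api_version, _correlation_id = struct.unpack_from(">ihhi", frame, 0)
    max_version = MAX_KNOWN_VERSION.get(api_key)
    if max_version is None or api_version > max_version:
        # Unrecognised message: only count it, the payload still goes through untouched.
        metrics[f"{prefix}.request.unknown"] += 1
    else:
        metrics[f"{prefix}.request.api_{api_key}"] += 1
    return frame
```

The function returns the input unchanged in every branch, which is the point: in this sketch, decoding only feeds the counters and never affects what is forwarded.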

In detail the change includes:

  • kafka broker filter object & related config object,
  • configurability: metrics prefix (if someone ever decides to use single envoy instance for multiple brokers),
  • generated metrics lists (because we have metrics per message type),
  • minor refactoring in response codec (expected response list kept in a separate object to reduce coupling); also a stack has been replaced with a map to make envoy less-intrusive wrt protocol ordering issues,
  • messaging_utilities test code library that's capable of making example payloads of various kinds (is going to be used for further tests).
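
The stack-to-map change in the response codec can be illustrated with a small Python sketch. This is an assumption about the general idea (keying in-flight requests by correlation id so that lookup does not depend on response order), not the PR's actual C++ code; the class and method names are hypothetical.

```python
from typing import Dict, Optional, Tuple


class ExpectedResponses:
    """Tracks in-flight requests by correlation id (hypothetical helper, not Envoy's class)."""

    def __init__(self) -> None:
        # correlation_id -> (api_key, api_version) of the request that was seen.
        self._pending: Dict[int, Tuple[int, int]] = {}

    def on_request(self, correlation_id: int, api_key: int, api_version: int) -> None:
        self._pending[correlation_id] = (api_key, api_version)

    def on_response(self, correlation_id: int) -> Optional[Tuple[int, int]]:
        # Lookup by id works even if responses arrive in a different order than requests.
        # None means the response cannot be matched and should simply be passed through.
        return self._pending.pop(correlation_id, None)
```

With a stack, a response arriving out of order would be matched against the wrong request; with a map, each response is matched only by its own correlation id.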

Relates to #2852
Risk Level: Low (Kafka code is unused right now)
Testing: automated tests, manual testing with real Kafka broker and client
Docs Changes: Kafka broker filter added
Release Notes: n/a

@repokitteh repokitteh bot added waiting and removed waiting labels Sep 9, 2019
@repokitteh

repokitteh bot commented Sep 10, 2019

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/.

🐱

Caused by: #8188 was synchronize by adamkotwasinski.

@repokitteh repokitteh bot added the waiting label Sep 10, 2019
@adamkotwasinski adamkotwasinski force-pushed the adamkotwasinski:kafka branch from ce10704 to 8796af1 Sep 12, 2019
@repokitteh repokitteh bot added waiting and removed waiting labels Sep 12, 2019
Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@adamkotwasinski adamkotwasinski force-pushed the adamkotwasinski:kafka branch from 8796af1 to 8056d1e Sep 13, 2019
@repokitteh repokitteh bot added waiting and removed waiting labels Sep 13, 2019
Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@adamkotwasinski adamkotwasinski force-pushed the adamkotwasinski:kafka branch from af862dd to 83d695d Sep 20, 2019
@repokitteh repokitteh bot added waiting and removed waiting labels Sep 20, 2019
…eption handling in broker-level filter

Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@repokitteh repokitteh bot added waiting and removed waiting labels Sep 23, 2019
… tests

Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@repokitteh repokitteh bot added waiting and removed waiting labels Sep 23, 2019
…sts and use these utils instead

Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@repokitteh repokitteh bot added waiting and removed waiting labels Sep 27, 2019
Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@repokitteh repokitteh bot removed the waiting:any label Dec 13, 2019
@adamkotwasinski

Contributor Author

adamkotwasinski commented Dec 13, 2019

@mattklein123 I have changed the test to be manual + flaky for now.
The readme file in test directory encourages any future committer to re-run the test (just to catch something weird).

Given it's an integration test and it might run differently depending on machine speed, virtualization, and other factors I can't predict today => WDYT about making it manual for now, and me working on making it part of the normal (automated) build in the future?
A way to achieve it might be me sending a few emails to the mailing list asking for volunteers to run the manual test and report any issues seen (hopefully zero), and if the scenario succeeds after a few tries => making the proper PR that removes the manual part.

Now waiting for the build to go green.

/wait

@repokitteh repokitteh bot added the waiting label Dec 13, 2019
@mattklein123

Member

mattklein123 commented Dec 13, 2019

WDYT about making it manual for now, and me working on making it become part of normal (automated) build in future.

Sure sounds good. Very excited for this to land and to figure out what we can build on top.

Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@repokitteh repokitteh bot removed the waiting label Dec 13, 2019
@adamkotwasinski

Contributor Author

adamkotwasinski commented Dec 13, 2019

/wait (will re-request review when I finally get it to green)

@repokitteh repokitteh bot added the waiting label Dec 13, 2019
…me during int tests

Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@repokitteh repokitteh bot added waiting and removed waiting labels Dec 20, 2019
Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@repokitteh repokitteh bot added waiting and removed waiting labels Jan 6, 2020
Kick CI
Signed-off-by: Adam Kotwasinski <adam.kotwasinski@gmail.com>
@repokitteh repokitteh bot removed the waiting label Jan 6, 2020
@adamkotwasinski

Contributor Author

adamkotwasinski commented Jan 6, 2020

/wait-any

@repokitteh repokitteh bot added the waiting:any label Jan 6, 2020
@adamkotwasinski adamkotwasinski requested a review from mattklein123 Jan 6, 2020
@adamkotwasinski

Contributor Author

adamkotwasinski commented Jan 6, 2020

Ready for re-review.
The integration test is now fully parallelizable (basically 4 random ports are picked at startup, and retried if something complains) => there will be further steps to make it automated in the future.
Some v2->v3 proto migration happened in master, but I think I got everything working (basically copied the structure from zookeeper filter).
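
A rough Python sketch of the "pick random ports and retry" idea mentioned in this comment. The function name, port range, and retry limit are assumptions for illustration; this is not the test's actual code.

```python
import random
import socket
from typing import List


def pick_free_ports(count: int = 4, attempts: int = 50) -> List[int]:
    """Pick `count` distinct local ports that can currently be bound, retrying on failure."""
    ports: List[int] = []
    for _ in range(attempts):
        if len(ports) == count:
            break
        candidate = random.randint(20000, 60000)  # arbitrary range chosen for this sketch
        if candidate in ports:
            continue
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind(("127.0.0.1", candidate))
            except OSError:
                continue  # port is taken: try another random candidate
        ports.append(candidate)
    if len(ports) != count:
        raise RuntimeError("could not find enough free ports")
    return ports
```

Note that a port probed this way can still be taken by another process before the real service binds it, so whatever starts the services would likely need its own retry as well.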

Member

mattklein123 left a comment

Absolutely epic work. Let's ship and iterate!

@repokitteh repokitteh bot removed the api label Jan 6, 2020
@mattklein123 mattklein123 merged commit a60f685 into envoyproxy:master Jan 6, 2020
18 checks passed
DCO
ci/circleci: api Your tests passed on CircleCI!
ci/circleci: coverage Your tests passed on CircleCI!
ci/circleci: coverage_publish Your tests passed on CircleCI!
ci/circleci: docs Your tests passed on CircleCI!
ci/circleci: filter_example_mirror Your tests passed on CircleCI!
ci/circleci: go_control_plane_mirror Your tests passed on CircleCI!
envoy-linux Build #20200106.14 succeeded
envoy-linux (bazel asan) bazel asan succeeded
envoy-linux (bazel clang_tidy) bazel clang_tidy succeeded
envoy-linux (bazel compile_time_options) bazel compile_time_options succeeded
envoy-linux (bazel gcc) bazel gcc succeeded
envoy-linux (bazel release) bazel release succeeded
envoy-linux (bazel tsan) bazel tsan succeeded
envoy-linux (format) format succeeded
envoy-macos Build #20200106.16 succeeded
envoy-windows Build #20200106.15 succeeded
envoyproxy/api-shepherds must approve changes to api/
sha256 = "ae7a1696c0a0302b43c5b21e515c37e6ecd365941f68a510a7e442eebddf39a1", # 2.2.0-rc2
strip_prefix = "kafka-2.2.0-rc2/clients/src/main/resources/common/message",
urls = ["https://github.com/apache/kafka/archive/2.2.0-rc2.zip"],
Comment on lines +307 to +309

@moderation

moderation Jan 6, 2020

Contributor

Congrats on shipping this @adamkotwasinski. I suspect this version lines up with when you started the PR. Kafka is now at 2.4.0 - https://github.com/apache/kafka/releases/tag/2.4.0. Are there any blockers on moving from the old RC release to the latest stable release?

@adamkotwasinski

adamkotwasinski Jan 6, 2020

Author Contributor

@moderation there should be none - I'll take a look as soon as I can

@adamkotwasinski

adamkotwasinski Jan 7, 2020

Author Contributor

@moderation work will be in #9582 ; I will need some more time to get the 2.4 running (the descriptor files used to generate the C++ code changed a little)

strip_prefix = "kafka_2.12-2.2.0",
urls = ["http://us.mirrors.quenda.co/apache/kafka/2.2.0/kafka_2.12-2.2.0.tgz"],
Comment on lines +313 to +314

@moderation

moderation Jan 6, 2020

Contributor
@adamkotwasinski adamkotwasinski deleted the adamkotwasinski:kafka branch Jan 7, 2020