Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-14760: Move ThroughputThrottler from tools to clients, remove tools dependency from connect-runtime #13313

Merged

Conversation

gharris1727
Copy link
Contributor

Currently connect-runtime pulls in tools as a dependency in order to use the ThroughputThrottler.
In the future it will be desirable to have tools pull in connect-runtime as a (test) dependency, so we must remove this dependency first.

The last time this was tried (https://issues.apache.org/jira/browse/KAFKA-2752) it broke system tests and had to be reverted (https://issues.apache.org/jira/browse/KAFKA-2807).

This was because the dev version of tools was being used with 0.8.2.x clients jar in the system tests, because the test needs the VerifiableProducer which did not exist in 0.8.2.x. In order to work around this restriction, this PR changes the system test to use the 0.9.x version of tools with 0.8.2.x instead of the dev version. This will relax the current compatibility requirement between tools and clients, and should allow this refactor to land without the upgrade test failing.

This conflicts with #13302 which move some of the classes which depend on tools.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Signed-off-by: Greg Harris <greg.harris@aiven.io>
This will allow the tools implementation to depend on modern features and refactors available after 0.8.

Signed-off-by: Greg Harris <greg.harris@aiven.io>
Signed-off-by: Greg Harris <greg.harris@aiven.io>
Signed-off-by: Greg Harris <greg.harris@aiven.io>
@gharris1727
Copy link
Contributor Author

I removed the checkstyle statements which allow connect-runtime to import tools. Since this is the only dependency on tools, we also have an opportunity to disallow any importing of the tools package, similar to the kafka server.

Is this a good idea? Does it make sense for the tools package to disallow everyone from importing it?

@gharris1727
Copy link
Contributor Author

This refactor first landed in #432 and then the current dependency graph was set by #512

@gharris1727
Copy link
Contributor Author

@ijuma Can you review this project layering change? You reviewed the last time it was attempted.

Thanks!

@ijuma
Copy link
Contributor

ijuma commented Jun 6, 2023

@gharris1727 do the system tests pass with this change?

Signed-off-by: Greg Harris <greg.harris@aiven.io>
Signed-off-by: Greg Harris <greg.harris@aiven.io>
@gharris1727
Copy link
Contributor Author

@ijuma Here are the results of my local test runs of test suites which use LATEST_0_8_2 and VerifiableProducer:

  • test_verifiable_producer.py: 25/25 PASS

  • compatibility_test_new_broker_test.py 44/52
    producer_version=0.8.2.2.consumer_version=0.8.2.2.compression_types=.none.new_consumer=False.timestamp_type=None PASS

  • upgrade_test.py 51/54
    from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.none PASS from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.snappy FAIL

Looking into the snappy failure, it appears that the broker can't find the native snappy library, and the client is receiving an opaque InvocationTargetException. I think this is an environment specific problem, and it also appears on trunk.

Are you able to start a branch build system test for this?

@gharris1727
Copy link
Contributor Author

It may be the case that the Snappy native library failures are due to running the tests on a Mac M1, with aarch64. If that is true, I would expect the upgrade test to pass on an i86_64 architecture test runner.

@ijuma
Copy link
Contributor

ijuma commented Jun 10, 2023

I started a system tests run for this branch.

@gharris1727
Copy link
Contributor Author

Hey @ijuma Do you have the system test results for this branch?

@gharris1727
Copy link
Contributor Author

I ran a full system test run:

================================================================================
SESSION REPORT (ALL TESTS)
ducktape version: 0.11.3
session_id:       2023-07-18--002
run time:         1602 minutes 29.170 seconds
tests run:        1096
passed:           786
flaky:            0
failed:           20
ignored:          290
================================================================================

With the following failed tests:

'tests/kafkatest/tests/core/throttling_test.py::ThrottlingTest.test_throttled_reassignment@{"bounce_brokers":true}' 
'tests/kafkatest/tests/core/mirror_maker_test.py::TestMirrorMakerService.test_bounce@{"clean_shutdown":false,"security_protocol":"SASL_PLAINTEXT"}' 
'tests/kafkatest/tests/core/mirror_maker_test.py::TestMirrorMakerService.test_bounce@{"clean_shutdown":false,"security_protocol":"SASL_SSL"}' 
'tests/kafkatest/tests/core/reassign_partitions_test.py::ReassignPartitionsTest.test_reassign_partitions@{"bounce_brokers":false,"reassign_from_offset_zero":false,"metadata_quorum":"REMOTE_KRAFT"}' 
'tests/kafkatest/tests/core/reassign_partitions_test.py::ReassignPartitionsTest.test_reassign_partitions@{"bounce_brokers":true,"reassign_from_offset_zero":false,"metadata_quorum":"REMOTE_KRAFT"}' 
'tests/kafkatest/tests/core/reassign_partitions_test.py::ReassignPartitionsTest.test_reassign_partitions@{"bounce_brokers":true,"reassign_from_offset_zero":false,"metadata_quorum":"ZK"}' 
'tests/kafkatest/tests/streams/streams_smoke_test.py::StreamsSmokeTest.test_streams@{"processing_guarantee":"at_least_once","crash":false,"metadata_quorum":"REMOTE_KRAFT"}' 
'tests/kafkatest/tests/core/mirror_maker_test.py::TestMirrorMakerService.test_bounce@{"clean_shutdown":false,"security_protocol":"PLAINTEXT"}' 
'tests/kafkatest/tests/core/mirror_maker_test.py::TestMirrorMakerService.test_bounce@{"clean_shutdown":false,"security_protocol":"SSL"}' 
'tests/kafkatest/tests/tools/replica_verification_test.py::ReplicaVerificationToolTest.test_replica_lags@{"metadata_quorum":"REMOTE_KRAFT"}' 
'tests/kafkatest/tests/client/quota_test.py::QuotaTest.test_quota@{"quota_type":"(user, client-id)","override_quota":false}' 
'tests/kafkatest/tests/client/quota_test.py::QuotaTest.test_quota@{"quota_type":"(user, client-id)","override_quota":true}' 
'tests/kafkatest/tests/client/quota_test.py::QuotaTest.test_quota@{"quota_type":"client-id","consumer_num":2}' 
'tests/kafkatest/tests/client/quota_test.py::QuotaTest.test_quota@{"quota_type":"client-id","old_broker_throttling_behavior":true}' 
'tests/kafkatest/tests/client/quota_test.py::QuotaTest.test_quota@{"quota_type":"client-id","old_client_throttling_behavior":true}' 
'tests/kafkatest/tests/client/quota_test.py::QuotaTest.test_quota@{"quota_type":"client-id","override_quota":false}' 
'tests/kafkatest/tests/client/quota_test.py::QuotaTest.test_quota@{"quota_type":"client-id","override_quota":true}' 
'tests/kafkatest/tests/client/quota_test.py::QuotaTest.test_quota@{"quota_type":"user","override_quota":false}' 
'tests/kafkatest/tests/client/quota_test.py::QuotaTest.test_quota@{"quota_type":"user","override_quota":true}' 
'tests/kafkatest/tests/core/network_degrade_test.py::NetworkDegradeTest.test_rate@{"task_name":"rate-1000-latency-50","device_name":"eth0","latency_ms":50,"rate_limit_kbit":1000000}'

None of which make use of the 0.8.2.x artifacts version which is being affected here. In particular, the test which I was concerned about (upgrade_test.py from_kafka_version=0.8.2.2.to_message_format_version=None.compression_types=.snappy FAIL) does pass on this i86_64 machine when it failed on my arm64 machine, indicating that the failure was due to native library dependencies missing.

@gharris1727
Copy link
Contributor Author

@ijuma Could you take another look at this? This is blocking KIP-898 that I'm trying to get landed in time for 3.6.0. Thanks!

Copy link
Contributor

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@gharris1727 gharris1727 merged commit 125dbb9 into apache:trunk Jul 20, 2023
1 check failed
@gharris1727
Copy link
Contributor Author

Thanks for your help Ismael!

jeqo pushed a commit to jeqo/kafka that referenced this pull request Jul 21, 2023
…ools dependency from connect-runtime (apache#13313)

Reviewers: Ismael Juma <ismael@juma.me.uk>
@fvaleri
Copy link
Collaborator

fvaleri commented Jul 25, 2023

Hi @gharris1727, it seems that this is causing some system test failures: https://issues.apache.org/jira/browse/KAFKA-15239. I'm working on a fix.

@gharris1727
Copy link
Contributor Author

@fvaleri Thanks for pointing that out! I mistakenly assumed that I wasn't going to be affecting intermediate versions, or that the location I changed was the only place where we mixed JARs. If you believe it's necessary to revert this, please let me know.

@fvaleri
Copy link
Collaborator

fvaleri commented Jul 25, 2023

I opened a PR with an attempt to fix it: #14092.

It works with mentioned STs, but I'm running all non-core STs to actually verify it's all good.

Cerchie pushed a commit to Cerchie/kafka that referenced this pull request Jul 25, 2023
…ools dependency from connect-runtime (apache#13313)

Reviewers: Ismael Juma <ismael@juma.me.uk>
jeqo pushed a commit to aiven/kafka that referenced this pull request Aug 15, 2023
…ools dependency from connect-runtime (apache#13313)

Reviewers: Ismael Juma <ismael@juma.me.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants