-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Fix Kafka TLS configuration with plaintext authentication #6764
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix Kafka TLS configuration with plaintext authentication #6764
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6764 +/- ##
==========================================
- Coverage 96.10% 96.09% -0.02%
==========================================
Files 358 358
Lines 21585 21597 +12
==========================================
+ Hits 20745 20754 +9
- Misses 633 635 +2
- Partials 207 208 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
what are you trying to achieve by merging main? It erases the CI checks which clearly show that your PR does not pass the linter. |
c2267d2
to
2b355d0
Compare
I was updating the branch to latest, just that. I have committed for Lint checks , now. Can you check again ? |
I have made the corrections for Unit Tests, can you update the PR label please ? @yurishkuro and run it again, I dont have necessary permissions to add the label I think. |
tlsClientConfig := tlscfg.ClientFlagsConfig{ | ||
Prefix: configPrefix, | ||
} | ||
tlsCfg, err := tlsClientConfig.InitFromViper(v) | ||
if err != nil { | ||
return fmt.Errorf("failed to process Kafka TLS options: %w", err) | ||
} | ||
tlsCfg.IncludeSystemCACertsPool = (config.Authentication == tls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed to maintain the security model difference between TLS authentication and TLS encryption:
- When using TLS authentication (auth="tls"), we need system CA certs to validate client certificates
- When using TLS encryption with SASL PLAIN auth, we don't need system CA certs
The unit tests specifically verify this distinction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using TLS encryption with SASL PLAIN auth, we don't need system CA certs
why? The only time you don't need system certs is if you are providing your own. Am I wrong about that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. Both modes need CA certs for TLS validation, but they use different sources:
-
TLS auth (authentication="tls"):
- IncludeSystemCACertsPool=true to validate client/server certificates using system CA pool
- Uses mutual TLS where both authenticate each other
-
SASL PLAIN with TLS:
- IncludeSystemCACertsPool=false because it validates the server certificate using explicitly provided CA cert
- Client authentication happens via username/password
The tests expect this specific behavior to enforce these different trust models, without this distinction tests fails. Both approaches provide certificate validation, just from different trust sources.
what is the testing procedure for this change? How do we know it does what's needed? |
To verify this fix works, I've set up a test environment with:
Docker Image configuration used -
Verified with standard Kafka clients (producer/consumer) that the connection works with the same settings Without the fix in PR #6764, the ingester command would fail because TLS settings weren't properly applied when using SASL PLAIN authentication. With the fix, the connection succeeds. Do you need any more configuration related information used for testing here ? |
Is this something we can add to |
Yes, we can add an integration test in
This would formalize the manual test case I've been using to verify the fix. Would you like me to implement this as part of the PR? |
Yes, I prefer the tests to be part of the PR. However, is it possible to configure a single instance of Kafka to work with different auth-n methods, or do we need to spin different Kafka container for each auth flavor? The latter is much more expensive to run in the CI. |
Yes, this is possible. We can open different listeners in the same Kafka instance to work with different types of Authentication/security protocols. Should I go ahead with implementation ? |
yes, sounds good. One broker/container, multiple listeners, multiple tests using different ports. I would recommend not running a full test suite against each listener, only some basic write/read tests. |
Sure, I will keep in mind |
Hi @yurishkuro, As we discussed, I have added -
To test these changes -
Alternatively you can run specific integration test as well - go test -v github.com/jaegertracing/jaeger/internal/storage/integration -run TestKafkaStorageWithSASLSSLPlaintext (need to setup the docker container first though and required certs as well) |
@yurishkuro Can you review this , please. |
services: | ||
zookeeper: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need a separate zookeeper, especially for a single-node cluster? it just makes the CI longer to start
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right, I have removed zookeeper in my latest commit and used Kraft with Kafka instead
- "9092:9092" | ||
- "9092:9092" #Internal PLAINTEXT | ||
- "9094:9094" #External PLAINTEXT | ||
- "29093:29093" #SASL_PLAINTEXT + SSL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not 9093?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to 9093
scripts/e2e/kafka.sh
Outdated
CERTS_DIR="${SCRIPT_DIR}/../../docker-compose/kafka/certs" | ||
TEST_CERTS_DIR="${SCRIPT_DIR}/../../certs" | ||
|
||
mkdir -p "$CERTS_DIR" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how critical is it to generate cert for the test? We already have working certs used in other parts of unit tests
./pkg/config/tlscfg/testdata/example-server-cert.pem
./pkg/config/tlscfg/testdata/example-CA-cert.pem
./pkg/config/tlscfg/testdata/example-server-key.pem
./pkg/config/tlscfg/testdata/example-client-key.pem
./pkg/config/tlscfg/testdata/wrong-CA-cert.pem
./pkg/config/tlscfg/testdata/example-client-cert.pem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did try with these certs , but SASL + SSL would require certs generated with IP SANS information specified, these certs dont have it. But you are right, generating certs again and again is unnecessary... So I have generated the certs with IP Sans info and kept all the certs for this integration test at path -
pkg/config/tlscfg/testdata/kafka-certs , and removed the code for generating/cleaning up certs, in kafka.sh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
certs generated with IP SANS information specified
Why do we need this, isn't there skipVerify
option that ignores the host address? And how can you pre-generate IP SANS if you don't know which IP the server will be running in the CI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you right, I missed that ... added the flag now. Now it will just use the previous testdata. We need JKS format , so I have generated those through the testdata provided at -
internal/config/tlscfg/testdata, and used the same in kafka's docker compose file.
@yurishkuro Please review the changes made |
your change has merge conflicts that need to be resolved |
## Which problem is this PR solving? - The test in jaegertracing#6753 was still failing, even with the submodule update ## Description of the changes - Try to fetch tags ## How was this change tested? - CI --------- Signed-off-by: Yuri Shkuro <github@ysh.us> Signed-off-by: amol-verma-allen <amol.verma@allen.in>
This change fixes the Kafka TLS configuration to work correctly when tls.enabled flag is not provided but authentication=tls is set. Previously, TLS would not be enabled in this case. Changes: - TLS is now properly configured when authentication=tls, regardless of tls.enabled - Maintains backward compatibility with existing tls.enabled flag - Sets explicit insecure mode only when TLS is intentionally disabled Testing: - Added unit tests for TLS configuration scenarios - Verified with local Kafka cluster using TLS authentication - Tested with HotROD example application Resolves jaegertracing#6744 Signed-off-by: Amol Verma <amolverma@LT-BEN-90852.local> Signed-off-by: amol-verma-allen <amol.verma@allen.in>
This change fixes the Kafka TLS configuration to work correctly when using plaintext authentication with TLS enabled. Previously, TLS would only be configured when authentication=tls, breaking SASL-SSL with PLAIN authentication. Changes: - Modified TLS configuration logic to support TLS with other authentication methods - Fixed SASL-SSL with PLAIN authentication scenario - Maintained backward compatibility with existing authentication methods - Restored pre-PR-6270 behavior for TLS configuration Resolves jaegertracing#6744 Signed-off-by: Amol Verma <amolverma@LT-BEN-90852.local> Signed-off-by: amol-verma-allen <amol.verma@allen.in>
## Which problem is this PR solving? - This is solution to issue jaegertracing#6752 ## Description of the changes - Done minor code change of deprecated symbols after bot provided upgrade ## How was this change tested? - ## Checklist - [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md - [x] I have signed all commits - [ ] I have added unit tests for the new functionality - [ ] I have run lint and test steps successfully - for `jaeger`: `make lint test` - for `jaeger-ui`: `npm run lint` and `npm run test` --------- Signed-off-by: AnmolxSingh <anmol7344@gmail.com> Signed-off-by: Yuri Shkuro <github@ysh.us> Co-authored-by: Yuri Shkuro <github@ysh.us> Signed-off-by: amol-verma-allen <amol.verma@allen.in>
scripts/e2e/kafka.sh
Outdated
@@ -129,4 +129,4 @@ main() { | |||
success="true" | |||
} | |||
|
|||
main "$@" | |||
main "$@" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these last two files have no changes. please git checkout main $file
to restore them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
v.Set(configPrefix+".tls.enabled", "true") | ||
} | ||
// Initialize TLS config with default values | ||
var tlsCfg configtls.ClientConfig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not necessary, you can assign config.TLS = tlsCfg
inside if()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -75,20 +84,26 @@ func (config *AuthenticationConfig) InitFromViper(configPrefix string, v *viper. | |||
config.Kerberos.KeyTabPath = v.GetString(configPrefix + kerberosPrefix + suffixKerberosKeyTab) | |||
config.Kerberos.DisablePAFXFast = v.GetBool(configPrefix + kerberosPrefix + suffixKerberosDisablePAFXFAST) | |||
|
|||
if config.Authentication == tls { | |||
if !v.IsSet(configPrefix + ".tls.enabled") { | |||
v.Set(configPrefix+".tls.enabled", "true") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't you need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we don't need this. The tests are already passing without this, indicating the current implementation handles this case correctly.
Signed-off-by: Amil Bcahat <amilbcahat@gmail.com> Signed-off-by: amol-verma-allen <amol.verma@allen.in>
6d62d38
to
4a7a5c2
Compare
Signed-off-by: Amil Bcahat <amilbcahat@gmail.com> Signed-off-by: Amil Bcahat <amilbcahat@gmail.com> Signed-off-by: amol-verma-allen <amol.verma@allen.in>
4a7a5c2
to
b6d4a64
Compare
@yurishkuro Have removed the binary files... kafka.sh now has the code to generate it on each run. Please review the PR, and sorry for delay
|
8b2f91b
to
e0cd314
Compare
environment: | ||
- KAFKA_CFG_NODE_ID=0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the need for these changes. Some of them just saying the same thing in reverse
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about the PROCESS_ROLES reordering from controller,broker to broker,controller is purely cosmetic. I have reverted that.
However, the other changes are functionally required because we're adding SSL and SASL authentication support to our Kafka integration tests. Here's why:
- New Test Requirements:
Our integration test suite now includes TestKafkaStorageWithSASLSSLPlaintext and TestKafkaStorageWithSASLPlaintext
These tests require different connection types: basic plaintext, SASL with SSL, and SASL without SSL
Each needs its own listener and port - Port Changes Were Mandatory:
Port 9093 was needed for SASL_SSL connections (test requirement)
Had to move controller to port 9096 to avoid conflict
Port 9095 added for SASL_PLAINTEXT connections - The 'Redundant' Security Protocol Map:
SASL_SSL:SASL_SSL,SASL_PLAINTEXT:SASL_PLAINTEXT
This isn't redundant - Kafka requires explicit mapping of each listener to its security protocol
Without these mappings, Kafka doesn't know how to handle the new connection types
4. SSL/SASL Configuration:
All the SSL keystore and SASL authentication settings support the kafka.sh script that generates certificates and runs the secure connection tests
The NODE_ID change from 0→1 was needed because NODE_ID=0 failed to start with our specific multi-listener configuration, while NODE_ID=1 works.
Signed-off-by: amol-verma-allen <amol.verma@allen.in>
Signed-off-by: amol-verma-allen <amol.verma@allen.in>
b738b03
to
75309e8
Compare
s.CleanUp = func(_ *testing.T) {} | ||
} | ||
|
||
func (s *KafkaIntegrationTestSuite) initializeWithSASLSSLPlaintext(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a lot of redundancy here, please DRY
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
scripts/e2e/kafka.sh
Outdated
if [[ ! -f "${cert_dir}/example-CA-cert.pem" || ! -f "${cert_dir}/example-server-cert.pem" || ! -f "${cert_dir}/example-server-key.pem" ]]; then | ||
echo "PEM certificate files not found. Generating certificates first..." | ||
cd "${cert_dir}" | ||
./gen-certs.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was at internal/config/tlscfg/testdata/gen-certs.sh, but have removed that if check now, as the .pem files were already checked in.
scripts/e2e/kafka.sh
Outdated
echo "Generating Kafka JKS files..." | ||
|
||
# Check if PEM files exist | ||
if [[ ! -f "${cert_dir}/example-CA-cert.pem" || ! -f "${cert_dir}/example-server-cert.pem" || ! -f "${cert_dir}/example-server-key.pem" ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these files are checked in, why wouldn't they exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you are right. I have removed this if check now.
Signed-off-by: amol-verma-allen <amol.verma@allen.in>
Signed-off-by: amol-verma-allen <amol.verma@allen.in>
@mahadzaryab1 can you review? |
Which problem is this PR solving?
Description of the changes
How was this change tested?
Checklist
jaeger
:make lint test