Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable authentication, authorization, and SSL for Cruise Control API #5073

Merged
merged 29 commits into from Oct 4, 2021

Conversation

kyguy
Copy link
Member

@kyguy kyguy commented Jun 2, 2021

Type of change

  • Enhancement / new feature

Description

Locks down the Cruise Control API by enforcing API authentication, authorization, and SSL by default. But gives users the flexibility to disable these security configurations [1] if needed through the spec.cruiseControl.config section the the Kafka resource. e.g.

apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  ...
spec:
  cruiseControl:
    config:
      webserver.security.enable: false  # Disables CC HTTP Basic authentication
      webserver.ssl.enable: false       # Disables CC API SSL

Addresses #3770

[1] https://github.com/linkedin/cruise-control/wiki/Security

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

@kyguy kyguy marked this pull request as draft June 2, 2021 15:21
@kyguy kyguy changed the title Prototype for configurable authentication and authorization for Cruis… Configurable authentication and authorization for Cruis… Jun 2, 2021
@kyguy kyguy force-pushed the lock-down-cc branch 3 times, most recently from f2a5b7f to f3905d2 Compare June 15, 2021 20:48
@kyguy kyguy force-pushed the lock-down-cc branch 3 times, most recently from d1527a3 to 32a60ff Compare June 29, 2021 19:39
@kyguy kyguy marked this pull request as ready for review June 29, 2021 19:40
@kyguy kyguy added this to the 0.25.0 milestone Jun 29, 2021
Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The file api/src/test/resources/io/strimzi/api/kafka/model/040-Crd-kafka.yaml should not exist anymore. You might need to rebase the PR.

@@ -129,8 +129,10 @@ static EnvVar tlsSidecarLogEnvVar(TlsSidecar tlsSidecar) {
}

public static Secret buildSecret(Reconciliation reconciliation, ClusterCa clusterCa, Secret secret, String namespace, String secretName,
String commonName, String keyCertName, Labels labels, OwnerReference ownerReference, boolean isMaintenanceTimeWindowsSatisfied) {
Map<String, String> data = new HashMap<>(4);
String commonName, String keyCertName, Labels labels, Map<String, String> data, OwnerReference ownerReference, boolean isMaintenanceTimeWindowsSatisfied) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method main focus is to deal with the certificate secrets. I do not think you should mix any other data into it. You should add it in some other way or maybe even better, create additional secret instead of using the certificate secrets.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following your suggestion, I have just separated the Cruise Control API credentials its own secret in the latest commit

Comment on lines 161 to 164
public static String clientsCaCertStorePassword() {
return Base64.getEncoder().encodeToString(CLIENTS_CERT_STORE_PASSWORD.getBytes(Charset.defaultCharset()));
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is the secret mocked in other tests without this method? Maybe you can use the same instead of adding this?

Copy link
Member Author

@kyguy kyguy Aug 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is hard-coded in the other tests e.g. [1] [2] I figured we could refactor the other occurances in a separate PR since there are a lot of other references [3]

[1]


[2]
AbstractModel.clusterCaCertSecretName(name), MockCertManager.clusterCaCert(), MockCertManager.clusterCaCertStore(), "123456"));

[3] https://github.com/strimzi/strimzi-kafka-operator/search?q=123456

@@ -43,12 +43,16 @@ cluster.configs.file=$CC_CLUSTER_CONFIG_FILE
webserver.accesslog.path=$CC_ACCESS_LOG
webserver.http.address=0.0.0.0
webserver.http.cors.allowmethods=OPTIONS,GET
webserver.ssl.keystore.location=/tmp/cruise-control/client-server.keystore.p12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this deal with the TLS hostname verification when you use the same secret for both connecting to Kafka and as a server certificate for the webserver?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out. I have now added Cruise Control's public certificate to the truststore of the KafkaRebalance client so that TLS hostname verification is preformed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some better name than client-server would be better? Is it a client certificate? Server certificate? What client and what server? Maybe it should be called cruise-control.keystore.p12?

@scholzj scholzj modified the milestones: 0.25.0, 0.26.0 Aug 2, 2021
@kyguy
Copy link
Member Author

kyguy commented Aug 6, 2021

Currently working on the Cruise Control mock-server tests to handle TLS requests, I will ask for another round of reviews once that is fixed!

@kyguy kyguy force-pushed the lock-down-cc branch 3 times, most recently from 5f057a3 to 2b8bf5b Compare August 20, 2021 16:30
@kyguy kyguy requested a review from d-laing August 20, 2021 19:26
@kyguy
Copy link
Member Author

kyguy commented Aug 20, 2021

This PR is now ready for another round of reviews @scholzj @tomncooper @ppatierno !

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the api/src/test/resources/io/strimzi/api/kafka/model/040-Crd-kafka.yaml file should not be there anymore.

return API_AUTH_CONFIG_VOLUME_MOUNT + CC_AUTH_CREDENTIALS_FILENAME;
}

private static boolean isEnabled(CruiseControlConfiguration config, String s1, String s2) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like isEnabledInConfiguration would be better name. isEnabled might be confusing as it suggests to tell whether CC is enabled or disabled.

Comment on lines 274 to 280
CruiseControlConfiguration c = (CruiseControlConfiguration) cruiseControl.getConfiguration();
Boolean sslEnabled = isApiAuthenticationEnabled(c);
if (sslEnabled) {
cruiseControl.uriScheme = HTTPS_SCHEME;
} else {
cruiseControl.uriScheme = HTTP_SCHEME;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused. From some parts it seems like the authentication is based on some username / password. But here you seem to use it only to enable the TLS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this part only to sets the scheme used by the probes without setting up the certs/keys since the kubelet skips certificate verification when the scheme is set to HTTPS [1]

The username/password bits in the other parts is setting up HTTP Basic authentication primarily used here for role based(authorization) access to the API

[1] https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#http-probes

Comment on lines 496 to 497
.withLivenessProbe(ProbeGenerator.httpProbe(livenessProbeOptions, livenessPath, REST_API_PORT_NAME, uriScheme, getHttpHeaders()))
.withReadinessProbe(ProbeGenerator.httpProbe(readinessProbeOptions, readinessPath, REST_API_PORT_NAME, uriScheme, getHttpHeaders()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will expose the username and password in the Pod YAML. I do not think this is secure. Doesn't Cruise Control have some health check endpoint outside of authentication?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately Cruise Control doesn't have a health check outside of the authentication but the credentials of the role exposed is meant to be public anyway. The role exposed has USER privileges which only has access to the GET endpoints except bootstrap and train. This role is exposed so users can get connect third-party tools for querying Cruise Control for information about their cluster

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this makes sense. Why do we build the authentication if we say that the secret credentials are not secret?

Copy link
Member Author

@kyguy kyguy Aug 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two roles:

  • ADMIN role which has access to all of the endpoints, these credential for this role are kept secret and used exclusively by Strimzi for Cruise Control admin operations.
  • USER role which has access to the GET endpoints, these credentials are exposed for Strimzi users to use as they please
    The probes only need/expose the USER role to check the health of Cruise Control

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I don't think this is right:

  • You cannot say you implement security and share the credentials publicly like this. You need to find some other way. This would if nothing else trigger a lot of questions and issues about exposed credentials etc.
  • You say this is intended to be used by the users. But this is not the impression I got from the docs changes or PR description ... it does not explain how to create the credentials for the users or how to get them etc. Or did I missed it somewhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. To address these concerns I have done the following:

  • Updated the probes to use healthcheck script to protect the username and password from being exposed in the Pod Yaml
  • Removed the static password for the user role. If people want direct access to the REST API whether it be for reading or writing, they will have to disable the security configurations. We can always make the API roles configurable in a later PR.

@@ -43,12 +43,16 @@ cluster.configs.file=$CC_CLUSTER_CONFIG_FILE
webserver.accesslog.path=$CC_ACCESS_LOG
webserver.http.address=0.0.0.0
webserver.http.cors.allowmethods=OPTIONS,GET
webserver.ssl.keystore.location=/tmp/cruise-control/client-server.keystore.p12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some better name than client-server would be better? Is it a client certificate? Server certificate? What client and what server? Maybe it should be called cruise-control.keystore.p12?

ssl.keystore.password=$CERTS_STORE_PASSWORD
ssl.truststore.type=PKCS12
ssl.truststore.location=/tmp/cruise-control/replication.truststore.p12
ssl.truststore.location=/tmp/cruise-control/client.truststore.p12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, the naming seems to be a bit unintuitice ... should this be called something like kafka.truststore or something? Or keep the old name whicih was much more clear than this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your suggestion for the keystore: cruise-control.keystore.p12 and I was thinking we could use the same pattern for the truststore: cruise-control.trustore.p12 since it is cruise control's truststore. Although I do understand the reasoning behind the original naming scheme that the truststore contains the public Kafka certs. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cruise-control.trustore.p12 reads as This is the truststore used for mTLS authentication of users connecting to the Cruise Control APIs.. But it is not that, or? It is the Truststore used to trust the Kafka brokers when connecting to them. So if I got it right, I think something what relates it to the Kafka cluster instead of Cruise Control makes more sense.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was reading cruise-control.truststore.p12 as This is the truststore holding the certs used for identifying apps which Cruise Control trusts. From what I understood a truststore can hold multiple certificate entries used for identifying different apps ( although in this specific case, Cruise Control only needs the certs for identifying/trusting the Kafka brokers).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I guess this This is the truststore holding the certs used for identifying apps which Cruise Control trusts applies for both cases ... but at least me personally, I would by the name link it to the Cruise Control API which for me is the main CC interface. Not to the Kafka client which is inside.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, in that case I will switch it back to what is was originally, replication.truststore.p12, I want it to be as clear as possible and there is no sense in me changing the name if it is going to mislead people!

@@ -14,10 +14,10 @@ mkdir -p /tmp/cruise-control
# Import certificates into keystore and truststore
"$CRUISE_CONTROL_HOME"/cruise_control_tls_prepare_certificates.sh

export STRIMZI_TRUSTSTORE_LOCATION=/tmp/cruise-control/replication.truststore.p12
export STRIMZI_TRUSTSTORE_LOCATION=/tmp/cruise-control/client-server.truststore.p12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the file name correct? In other places you use client.truststore.p12 (which as I commented above might not be the best choice, but here I wonder if the name is correct in the first place).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch, this was a typo

Comment on lines 177 to 208
.An example of disabling Cruise Control API authorization.
[source,yaml,subs="attributes+"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
name: my-cluster
spec:
# ...
cruiseControl:
config:
webserver.security.enable: false
# ...
----

.An example of disabling Cruise Control API authentication.
[source,yaml,subs="attributes+"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
name: my-cluster
spec:
# ...
cruiseControl:
config:
webserver.ssl.enable: false
# ...
----
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is weird. Does Cruise Control really enable HTTP Basic authentication with .ssl.enable option? And Authorization with security.enable? If it is like than then I guess we have no choice, but this seems highly misleading.

Copy link
Member Author

@kyguy kyguy Aug 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I mixed these up in the original description of the PR, it should be:

  • webserver.security.enable -> HTTP Basic authentication (which I am using for authorization)
  • webserver.ssl.enable -> HTTPS authentication

Copy link
Member

@d-laing d-laing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I made a few more comments.

documentation/modules/appendix_crds.adoc Show resolved Hide resolved

[[api-security-configuration]]
[discrete]
== API security
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
== API security
== Disabling API security

[discrete]
== API security

The Cruise Control API is secured by default to protect the cluster against any potentially destructive Cruise Control operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The Cruise Control API is secured by default to protect the cluster against any potentially destructive Cruise Control operations.
The Cruise Control REST API is secured by default to protect the cluster against potentially destructive Cruise Control operations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please give an example of such an operation.

"The Cruise Control REST API is secured by default to protect the cluster against potentially destructive Cruise Control operations, such as decommissioning Kafka brokers."


The Cruise Control API is secured by default to protect the cluster against any potentially destructive Cruise Control operations.

However, these security settings can be disabled.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
However, these security settings can be disabled.
However, you can disable these security settings if needed.


However, these security settings can be disabled.

.An example of disabling Cruise Control API authorization.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.An example of disabling Cruise Control API authorization.
.Example Cruise Control configuration to disable API authorization.

# ...
----

.An example of disabling Cruise Control API authentication.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.An example of disabling Cruise Control API authentication.
.Example Cruise Control configuration to disable API authentication.

The Cruise Control API is secured by default to protect the cluster against any potentially destructive Cruise Control operations.

However, these security settings can be disabled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a bulleted list:

  • To disable the built-in authorization, set webserver.security.enable to false.
  • To disable the built-in authentication, set webserver.ssl.enable false.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved.

Copy link
Member

@d-laing d-laing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few nits


However, you can disable these security settings if needed.

* To disable the built-in authorization, set webserver.security.enable to false.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* To disable the built-in authorization, set webserver.security.enable to false.
* To disable the built-in authorization, set `webserver.security.enable` to `false`.

However, you can disable these security settings if needed.

* To disable the built-in authorization, set webserver.security.enable to false.
* To disable the built-in authentication, set webserver.ssl.enable false.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* To disable the built-in authentication, set webserver.ssl.enable false.
* To disable the built-in authentication, set `webserver.ssl.enable` to `false`.

* To disable the built-in authorization, set webserver.security.enable to false.
* To disable the built-in authentication, set webserver.ssl.enable false.

.Example Cruise Control configuration to disable API authorization.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.Example Cruise Control configuration to disable API authorization.
.Example Cruise Control configuration to disable API authorization

# ...
----

.Example Cruise Control configuration to disable API authentication.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.Example Cruise Control configuration to disable API authentication.
.Example Cruise Control configuration to disable API authentication

@kyguy kyguy force-pushed the lock-down-cc branch 3 times, most recently from ea0c603 to e9b8a4b Compare September 2, 2021 18:27
Signed-off-by: Kyle Liberti <kliberti@redhat.com>
Signed-off-by: Kyle Liberti <kliberti@redhat.com>
Signed-off-by: Kyle Liberti <kliberti@redhat.com>
Copy link
Member

@d-laing d-laing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@d-laing
Copy link
Member

d-laing commented Sep 21, 2021

Should we change the title of the procedure to "Cruise Control REST API security"?

Signed-off-by: Kyle Liberti <kliberti@redhat.com>
Signed-off-by: Kyle Liberti <kliberti@redhat.com>
@kyguy
Copy link
Member Author

kyguy commented Sep 21, 2021

@scholzj I updated the Cruise Control systemtest client logic as well as added some tests to confirm the Cruise Control REST API is reachable when the security is enabled and disabled. Let me know what you think!

This is ready for another automated system tests run!

@scholzj
Copy link
Member

scholzj commented Sep 21, 2021

/azp run regression

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@@ -105,6 +128,8 @@
private TlsSidecar tlsSidecar;
private String tlsSidecarImage;
private String minInsyncReplicas = "1";
private boolean authenticationEnabled;
private boolean authorizationEnabled;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about the naming of these ... I mean they should just reflect what they are in Cruise Control so "security" and "ssl". AFAIU from CC doc, the webserver.security.enable enables the HTTP basic authentication so why using the isApiAuthorizationEnabled method to fill the authorizationEnabled field? it's about enabling the HTTP basic authentication; for webserver.ssl.enable, it's enabling the SSL/TLS layer on HTTP, so HTTPS but it is not related to authentication, so why using isApiAuthenticationEnabled to fill the corresponding authenticationEnabled? I would say that fields should just be securityEnabled and sslEnabled and so the corresponding methods getting the config from CC. Am I misunderstading anything here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe you could use authenticationEnabled for the webserver.security.enable and something like encryptionEnabled for webserver.ssl.enable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you are right! Let's go with your suggestion below:

  • authEnabled for the webserver.security.enable
  • sslEnabled for webserver.ssl.enable

This will be more clear and reduce the characters in the variable names!

@@ -105,6 +128,8 @@
private TlsSidecar tlsSidecar;
private String tlsSidecarImage;
private String minInsyncReplicas = "1";
private boolean authenticationEnabled;
private boolean authorizationEnabled;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe you could use authenticationEnabled for the webserver.security.enable and something like encryptionEnabled for webserver.ssl.enable

Signed-off-by: Kyle Liberti <kliberti@redhat.com>
Signed-off-by: Kyle Liberti <kliberti@redhat.com>
@kyguy kyguy changed the title Configurable authentication and authorization for Cruis… Configurable auth and SSL for Cruise Control API Sep 30, 2021
CHANGELOG.md Outdated
@@ -8,6 +8,7 @@
* Enable Cruise Control anomaly.detection configurations
* Add support for building connector images from the Maven coordinates
* Allow Kafka Connect Build artifacts to be downloaded from insecure servers (#5542)
* Configurable auth and SSL for Cruise Control API
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for users we should be more explicit, there is no hurt to say "authentication/authorization" here.

@@ -216,6 +261,9 @@ public static CruiseControl fromCrd(Reconciliation reconciliation, Kafka kafkaAs
cruiseControl.setTlsSidecar(tlsSidecar);

cruiseControl = cruiseControl.updateConfiguration(spec);
CruiseControlConfiguration c = (CruiseControlConfiguration) cruiseControl.getConfiguration();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ccConfiguration variable maybe better than just c ?

webserver.ssl.keystore.location=/tmp/cruise-control/cruise-control.keystore.p12
webserver.ssl.keystore.password=$CERTS_STORE_PASSWORD
webserver.ssl.keystore.type=PKCS12
webserver.ssl.key.password=$CERTS_STORE_PASSWORD
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be the passphrase to "decrypt" the key stored in the key store I guess. Is it mandatory in cruise control SSL feature? We don't have this kind of configuration for Kafka related certificates in the corresponding keystores, right? I mean, we already have the password for the keystore to get certificate and key, do we really need the "passphrase" for the key as well? You are setting to the same as the certs store password so kind of not adding more security. Just wondering if it's not mandatory and we can avoid setting it.

Copy link
Member Author

@kyguy kyguy Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same understanding and although we shouldn't need it because we don't encrypt keystore keys(from what I understand), Cruise Control with throw a NPE if we have don't include the webserver.ssl.key.password when webserver.ssl.enable is enabled [1]

[1] https://github.com/linkedin/cruise-control/blob/0ba9ddd44aecf8f278d9fca3d38df6d54b1ac55b/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/KafkaCruiseControlApp.java#L107 (edited to fix link)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the CC code I see that it just sets the password on the Jetty SslContextFactory, and without specifying it we are passing null. From the Jetty code [1] I see that passing null should not hurt, it should mean just "no password on the key". Have you tried to avoid using the password? Maybe it's worth giving it a try?

[1] https://github.com/eclipse/jetty.project/blob/b56edf511ab4399122ea2c6162a4a5988870f479/jetty-util/src/main/java/org/eclipse/jetty/util/ssl/SslContextFactory.java#L816

Copy link
Member Author

@kyguy kyguy Oct 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So when I tested this yesterday, I removed the whole line webserver.ssl.key.password=$CERTS_STORE_PASSWORD, that's probably why CC threw the NPE from [1], Cruise Control goes to look for the webserver.ssl.key.password key and throws a NPE when it's not found. Let me try testing with just webserver.ssl.key.password= or webserver.ssl.key.password="" in the config to see if we get a different result!

[1] https://github.com/linkedin/cruise-control/blob/0ba9ddd44aecf8f278d9fca3d38df6d54b1ac55b/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/KafkaCruiseControlApp.java#L107

Copy link
Member Author

@kyguy kyguy Oct 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in the CC config, I just tested with:

  • webserver.ssl.key.password=
  • webserver.ssl.key.password=""
  • webserver.ssl.key.password=null

All of which cause CC to throw the exception below. From what I understand from SO, this can happen when you use the wrong password leading me to think that the $CERT_STORE_PASSWORD is used/needed afterall

 java.security.UnrecoverableKeyException: Get Key failed: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
     at sun.security.pkcs12.PKCS12KeyStore.engineGetKey(PKCS12KeyStore.java:446) ~[?:?]
     at sun.security.util.KeyStoreDelegator.engineGetKey(KeyStoreDelegator.java:90) ~[?:?]
     at java.security.KeyStore.getKey(KeyStore.java:1057) ~[?:?]
     at sun.security.ssl.SunX509KeyManagerImpl.<init>(SunX509KeyManagerImpl.java:145) ~[?:?]
     at sun.security.ssl.KeyManagerFactoryImpl$SunX509.engineInit(KeyManagerFactoryImpl.java:70) ~[?:?]
     at javax.net.ssl.KeyManagerFactory.init(KeyManagerFactory.java:271) ~[?:?]    
     at org.eclipse.jetty.util.ssl.SslContextFactory.getKeyManagers(SslContextFactory.java:1249) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.ssl.SslContextFactory$Server.getKeyManagers(SslContextFactory.java:2363) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.ssl.SslContextFactory.load(SslContextFactory.java:373) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.ssl.SslContextFactory.doStart(SslContextFactory.java:244) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.server.SslConnectionFactory.doStart(SslConnectionFactory.java:97) ~[jetty-server-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]   
     at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.server.AbstractConnector.doStart(AbstractConnector.java:321) ~[jetty-server-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:81) ~[jetty-server-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:234) ~[jetty-server-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.server.Server.doStart(Server.java:401) ~[jetty-server-9.4.42.v20210604.jar:9.4.42.v20210604]
     at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73) ~[jetty-util-9.4.42.v20210604.jar:9.4.42.v20210604]
     at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlApp.start(KafkaCruiseControlApp.java:66) ~[cruise-control-2.5.57.jar:?]
     at com.linkedin.kafka.cruisecontrol.KafkaCruiseControlMain.main(KafkaCruiseControlMain.java:40) ~[cruise-control-2.5.57.jar:?]
 Caused by: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
     at com.sun.crypto.provider.CipherCore.unpad(CipherCore.java:975) ~[?:?]
     at com.sun.crypto.provider.CipherCore.fillOutputBuffer(CipherCore.java:1056) ~[?:?]
     at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:853) ~[?:?]
     at com.sun.crypto.provider.PKCS12PBECipherCore.implDoFinal(PKCS12PBECipherCore.java:408) ~[?:?]    
     at com.sun.crypto.provider.PKCS12PBECipherCore$PBEWithSHA1AndDESede.engineDoFinal(PKCS12PBECipherCore.java:440) ~[?:?]
     at javax.crypto.Cipher.doFinal(Cipher.java:2202) ~[?:?]
     at sun.security.pkcs12.PKCS12KeyStore.lambda$engineGetKey$0(PKCS12KeyStore.java:387) ~[?:?]
     at sun.security.pkcs12.PKCS12KeyStore$RetryWithZero.run(PKCS12KeyStore.java:283) ~[?:?]     
     at sun.security.pkcs12.PKCS12KeyStore.engineGetKey(PKCS12KeyStore.java:381) ~[?:?]
     ... 24 more

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems to be Jetty that sucks! We have to keep the password for now but I would open an issue and PR on CC repo to avoiding running setKeyManagerPassword if it's not provided in the configuration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened an issue here [1], I'll follow up from there

[1] linkedin/cruise-control#1707

@@ -17,7 +17,7 @@ mkdir -p /tmp/cruise-control
export STRIMZI_TRUSTSTORE_LOCATION=/tmp/cruise-control/replication.truststore.p12
export STRIMZI_TRUSTSTORE_PASSWORD="$CERTS_STORE_PASSWORD"

export STRIMZI_KEYSTORE_LOCATION=/tmp/cruise-control/replication.keystore.p12
export STRIMZI_KEYSTORE_LOCATION=/tmp/cruise-control/cruise-control.keystore.p12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change in naming? Let me see if I got it correctly ... before this PR, this keystore was used for containing CC certificate and private key in order to have "mutual" authentication (so TLS client authentication) of CC against Kafka cluster (when it connects to the cluster for doing its stuff).
With this PR you are adding in the same keystore, the cert and key that are used by CC for TLS encryption versus clients connecting to its API (so the cluster operator).
Would it make sense to change the name of replication.truststore.p12 in the same way? In the end it's not about replication here. Wdyt @kyguy @tomncooper @scholzj ?

Copy link
Member Author

@kyguy kyguy Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly right Paolo, this was the reasoning behind the name change for the keystore. As for the truststore, Jakub and I had the same discussion here [1] and decided that it was best to leave the truststore as replication. Take a look over the discussion and let us know what you think!

[1] #5073 (comment)

* To disable the built-in HTTP Basic authentication, set `webserver.security.enable` to `false`.
* To disable the built-in SSL, set `webserver.ssl.enable` to `false`.

.Example Cruise Control configuration to disable API authorization and authentication
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"authorization and authentication" now it's more authentication/authorization and SSL

Signed-off-by: Kyle Liberti <kliberti@redhat.com>
@kyguy kyguy changed the title Configurable auth and SSL for Cruise Control API Configurable authentication, authorization, and SSL for Cruise Control API Sep 30, 2021
Copy link
Contributor

@tomncooper tomncooper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great work Kyle. My only outstanding question is why you have included the USER role as well as the ADMIN one? You don't seem (but I may have missed something) to be using it?

@kyguy
Copy link
Member Author

kyguy commented Oct 1, 2021

@tomncooper Nice catch! The USER role is only used by the probes, but the probes could have easily been set up to use the ADMIN role instead. I originally included the USER role for Strimzi users to have read access to the Cruise Control API but stripped out the public password to simplify the PR(with the intention of adding it and documentation in another PR!). Since the probes only need read access to the CC API I left them using the USER role

@kyguy
Copy link
Member Author

kyguy commented Oct 1, 2021

@scholzj when you get a chance, could you run the automated system tests again?

@scholzj
Copy link
Member

scholzj commented Oct 1, 2021

/azp run regression

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@scholzj scholzj merged commit 45eecae into strimzi:main Oct 4, 2021
@kyguy kyguy deleted the lock-down-cc branch November 16, 2021 15:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants