Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kine #66

Merged
merged 4 commits into from
Jul 7, 2022
Merged

kine #66

merged 4 commits into from
Jul 7, 2022

Conversation

mendrugory
Copy link
Contributor

@mendrugory mendrugory commented Jun 23, 2022

This PR includes a couple of commits whose goal is related to the issue #22

The first one is a refactor to accommodate the code in order to include new storage systems for the tenants.
The second one totally related to #22. The approach has been to deploy a kine instance per tenant, as side car container into the control manager pod, which will be pointed to the right storage backend. Currently, there is only mysql implementation.

Some new parameters has been added to work with new storage back-ends (as flags or env variables):

--etcd-storage-type                  ETCD Storage type (i.e. "etcd", "kine-mysql"). (default: "etcd")
--kine-mysql-host                    Host where MySQL is running (default: "localhost")
--kine-mysql-port int                Port where MySQL is running (default: 3306)
--kine-mysql-secret-name             Name of the secret where the necessary configuration and certificates are. (default: "mysql-config")
--kine-mysql-secret-name             Name of the namespace of the secret where the necessary configuration and certificates are. (default: "kamaji-system")

It is necessary to have a running compatible-mysql database.

The secret with the configuration and certificates for mysql should look like:

apiVersion: v1
data:
  MYSQL_ROOT_PASSWORD: ...
  ca.crt: ...
  ca.key: ...
  mysql-ssl.cnf: ...
  server.crt:  ...
  server.key:  ...
kind: Secret
metadata:
  name: mysql-config
  namespace: kamaji-system
type: Opaque

and mysql-ssl.cnf:

[mysqld]
ssl-ca=/etc/mysql/conf.d/ca.crt
ssl-cert=/etc/mysql/conf.d/server.crt
ssl-key=/etc/mysql/conf.d/server.key
require_secure_transport=ON

A new issue has been opened in order to perform a future refactor #67

@mendrugory mendrugory force-pushed the gonzalo/issues/22 branch 21 times, most recently from 56f31d2 to c969422 Compare June 29, 2022 15:22
@mendrugory mendrugory changed the title WIP: kine kine Jun 30, 2022
@prometherion
Copy link
Member

@bsctl some changes are requested to address also pgsql, as well some refactoring: would you like to get it merged in order to continue the development?

@bsctl
Copy link
Member

bsctl commented Jul 1, 2022

What about to create a tag before to merge? It seems a huge refactoring ..

Copy link
Member

@prometherion prometherion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm testing this feature and I'm encountering this bug.

NAME                   READY   STATUS             RESTARTS     AGE
test-69d7664cf-9tcbb   3/4     CrashLoopBackOff   3 (5s ago)   111s

Looking at API Server logs, it seems it's still pointing to the etcd instances where the 3.5 feature of multi-tenancy is enabled.

I0704 07:02:23.914632       1 server.go:565] external host was not specified, using 172.18.0.2
I0704 07:02:23.914981       1 server.go:172] Version: v1.23.1
I0704 07:02:24.090733       1 shared_informer.go:240] Waiting for caches to sync for node_authorizer
I0704 07:02:24.091549       1 plugins.go:158] Loaded 11 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,TaintNodesByCondition,Priority,DefaultT
olerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I0704 07:02:24.091584       1 plugins.go:161] Loaded 11 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,PodSecurity,Priority,PersistentVolumeClaimResize,Runtime
Class,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
I0704 07:02:24.092539       1 plugins.go:158] Loaded 11 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,TaintNodesByCondition,Priority,DefaultT
olerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I0704 07:02:24.092556       1 plugins.go:161] Loaded 11 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,PodSecurity,Priority,PersistentVolumeClaimResize,Runtime
Class,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
W0704 07:02:24.100296       1 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {etcd-2.etcd.kamaji-system.svc.cluster.local:2379 etcd-2.etcd.kamaji-system.svc.cluster.local <nil> 0 
<nil>}. Err: connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority". Reconnecting...

Once the Kamaji operator has been disabled and patched manually the resources pointing to the right ETCD endpoint (http://127.0.0.1:2379, the kine sidecar container one), I'm able to get the Pod up and running, although with the missing reconciliation of the following add-ons since these handlers are the following ones.

Trying to dig up the code, I noticed the root cause of this misbehavior could be possibly the following line: although using kine-mysql, the values are the etcd 3.5 ones, that are then inherited to the underling Deployment handler.

I think we got two options here:

  1. add a function that points to 127.0.0.1:2379 to the ETCDEndpoints struct member when using kine-mysql
  2. totally ignore the passed values and move this logic to the underlying handler

Last option is not likely correct, in my opinion.

Let me know if you can address these changes on your own, please, otherwise I'll contribute to your upstream in order to get this appropriately merged without introducing a bug.

docs/getting-started-with-kamaji.md Outdated Show resolved Hide resolved
@@ -21,8 +21,13 @@ Available flags are the following:
--etcd-client-secret-name Name of the secret which contains ETCD client certificates. (default: "root-client-certs")
--etcd-client-secret-namespace Name of the namespace where the secret which contains ETCD client certificates is. (default: "kamaji")
--etcd-compaction-interval ETCD Compaction interval (i.e. "5m0s"). (default: "0" (disabled))
--etcd-endpoints Comma-separated list with ETCD endpoints (i.e. etcd-0.etcd.kamaji.svc.cluster.local,etcd-1.etcd.kamaji.svc.cluster.local,etcd-2.etcd.kamaji.svc.cluster.local)
--etcd-endpoints Comma-separated list with ETCD endpoints (i.e. https://etcd-0.etcd.kamaji.svc.cluster.local,https://etcd-1.etcd.kamaji.svc.cluster.local,https://etcd-2.etcd.kamaji.svc.cluster.local)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a non-backward compatible change? Previously, it was just the name of the etcd instance, with no requirement for the protocol.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say no since we're forcing the TLS in the etcd client configuration.

TLS: &tls.Config{ // nolint:gosec
Certificates: []tls.Certificate{cert},
RootCAs: pool,
},

@prometherion prometherion merged commit a487908 into clastix:master Jul 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants