
Setting admin_nopass to false prevents operator from configuring instances #133

Open
diffuse opened this issue Nov 17, 2023 · 7 comments

@diffuse

diffuse commented Nov 17, 2023

Configuring authentication on the non-admin port works perfectly with:

authentication:
  passwordFromSecret:
    name: foo
    key: password

However, inspecting kubectl describe statefulset dragonfly shows the containers are configured with the following arguments:

Args:
      --alsologtostderr
      --primary_port_http_enabled=false
      --admin_port=9999
      --admin_nopass

I confirmed that when authentication is configured, connecting to the non-admin port requires a password, and connecting to the admin port does not. I assumed that overriding admin_nopass would cause the operator to authenticate with the secret above, but it still attempts an unauthenticated connection to the admin port.

2023-11-17T02:39:39Z    ERROR   Failed to mark pod as master    {"controller": "pod", "controllerGroup": "", "controllerKind": "Pod", "Pod": {"name":"dragonfly-0","namespace":"default"}, "namespace": "default", "name": "dragonfly-0", "reconcileID": "<id>", "podName": "dragonfly-0", "error": "error running SLAVE OF NO ONE command: NOAUTH Authentication required."}
github.com/dragonflydb/dragonfly-operator/internal/controller.(*DragonflyInstance).configureReplication
        /workspace/internal/controller/dragonfly_instance.go:108
github.com/dragonflydb/dragonfly-operator/internal/controller.(*DfPodLifeCycleReconciler).Reconcile
        /workspace/internal/controller/dragonfly_pod_lifecycle_controller.go:92
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:122
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:323
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
        /go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.4/pkg/internal/controller/controller.go:235
2023-11-17T02:39:39Z    INFO    could not initialize replication. will retry    {"controller": "pod", "controllerGroup": "", "controllerKind": "Pod", "Pod": {"name":"dragonfly-0","namespace":"default"}, "namespace": "default", "name": "dragonfly-0", "reconcileID": "ac977a2c-ea01-49f8-8812-574a2d9acc31", "error": "error running SLAVE OF NO ONE command: NOAUTH Authentication required."}

I assume this is just because the username/password pair is not passed on construction of the redis client at internal/controller/dragonfly_instance.go:369. Are there plans to support this/would a pull request supporting this be welcome?
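For illustration, a minimal sketch of the kind of change I have in mind, assuming the go-redis v9 client; the helper name and plumbing are hypothetical:

    package controller

    import (
        "net"
        "strconv"

        "github.com/redis/go-redis/v9"
    )

    // newAdminClient is a hypothetical helper: it builds the admin-port client
    // used for SLAVE OF / REPLICAOF calls, passing the password resolved from
    // spec.authentication.passwordFromSecret. An empty password preserves
    // today's unauthenticated behavior.
    func newAdminClient(podIP string, adminPort int, password string) *redis.Client {
        return redis.NewClient(&redis.Options{
            Addr:     net.JoinHostPort(podIP, strconv.Itoa(adminPort)),
            Password: password, // currently never set, hence the NOAUTH error
        })
    }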

@ashotland
Contributor

Hi @diffuse - thanks for reaching out.

IIUC, the operator uses the admin port with admin_nopass so that no auth will be required for the commands invoked by the operator.

Can you please provide the log output from this line?

dfi.log.Info("Trying to invoke SLAVE OF command", "pod", pod.Name, "master", masterIp, "addr", redisClient.Options().Addr)

@diffuse
Author

diffuse commented Nov 18, 2023

Hello @ashotland, thanks for responding! That line is never logged because configuring the master in replicaOfNoOne fails before it can get to the replicaOf calls below.

I could be missing something, but leaving the admin connection unprotected for operator access means that any compromised container in the cluster could connect with full db access.

I can take a crack at connecting with the credentials provided in the resources if this is something that would be a welcome addition :)
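Roughly, I'd resolve the password from the referenced secret before building the client, along these lines (a sketch against the controller-runtime client; names are illustrative):

    package controller

    import (
        "context"
        "fmt"

        corev1 "k8s.io/api/core/v1"
        "k8s.io/apimachinery/pkg/types"
        "sigs.k8s.io/controller-runtime/pkg/client"
    )

    // passwordFromSecret reads the key referenced by
    // spec.authentication.passwordFromSecret out of the named secret in the
    // instance's namespace.
    func passwordFromSecret(ctx context.Context, c client.Client, namespace, name, key string) (string, error) {
        var secret corev1.Secret
        if err := c.Get(ctx, types.NamespacedName{Namespace: namespace, Name: name}, &secret); err != nil {
            return "", err
        }
        value, ok := secret.Data[key]
        if !ok {
            return "", fmt.Errorf("key %q not found in secret %s/%s", key, namespace, name)
        }
        return string(value), nil
    }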

@ashotland
Contributor

Hi @diffuse, you are correct that --admin_nopass assumes that intra-cluster access is trusted.

If this is not the case, #136 makes sense.

Thanks for your contribution; we'll try to get it reviewed soon.

@SirCorneliusCob

+1

@Tristan971

Tristan971 commented Mar 28, 2024

Replying to @Pothulapati's comment on the MR here:

I'm sure some users would not like giving the Operator access to all the secrets in the cluster

It can be limited to secrets in the operator's namespace specifically (which is a very common pattern).

It doesn't even have to be configurable per-cluster in the first round, I think.
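Concretely, the namespace-scoped variant is just a Role of roughly this shape (names illustrative):

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: dragonfly-operator-secrets
      namespace: dragonfly-operator-system  # the operator's own namespace
    rules:
      - apiGroups: [""]
        resources: ["secrets"]
        verbs: ["get", "list", "watch"]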

We can add a toggle, but we aren't yet sure how many users actually want this, as most users trust applications in their cluster

It is very much the same as not setting any authentication on a database, in particular for the superuser of said database.

I don't believe anyone seriously does that anymore these days.

In particular, imagine someone cluelessly setting .serviceSpec.type: LoadBalancer, thinking they're fine because of the password auth...

and many others who have issues like this probably already have network RBAC around which services can access what.

Requiring a full service mesh (or similar deep k8s networking RBAC) comes at an exceptionally high cost in both resources and complexity.
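(The lighter-weight form of that network RBAC is a plain NetworkPolicy along these lines; all selectors here are illustrative:)

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: dragonfly-admin-port
    spec:
      podSelector:
        matchLabels:
          app: dragonfly
      ingress:
        # only the operator's namespace may reach the admin port
        - from:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: dragonfly-operator-system
          ports:
            - port: 9999
        # ordinary clients keep access to the data port
        - ports:
            - port: 6379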

Until we have more requests from other users, we are not sure if it's the right thing to go ahead on this. :/

Hopefully here is the right place to register such a request, because at the moment this makes the operator impossible for me to seriously consider.

@Pothulapati
Collaborator

@Tristan971 Thanks for chiming in and adding some valid points.

It can be limited to secrets in the operator's namespace specifically (which is a very common pattern).

But this means the password is now in a different namespace from the DragonflyDB instance, which seems confusing. Users who want a DragonflyDB instance would expect everything to live in a single namespace. wdyt? 🤔

In particular, imagine someone cluelessly setting .serviceSpec.type: LoadBalancer, thinking they're fine because of the password auth...

Valid point!

@Tristan971

As users when they want a DragonflyDB instance would want it all to be in a single namespace. wdyt?

It is preferable that the secret can be where the relevant dragonfly is, yes.

If we look at what everyone else does, generally, it boils down to:

  • have a cluster-wide ON/OFF flag for the operator
  • if the flag is OFF, use a role and monitor only the operator's namespace
  • if the flag is ON, use a clusterrole and monitor all namespaces

This leaves administrators with the option of per-namespace operator deployments, or cluster-wide ones with the associated elevated access to secrets if they trust the operator enough.

And I think in general people would (and would be right to) trust a handful of operators more than they do hundreds of pods in their cluster handling user traffic etc.
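(For the cluster-wide mode from the list above, the counterpart is the same rule expressed as a ClusterRole; again, the name is illustrative:)

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: dragonfly-operator-secrets
    rules:
      - apiGroups: [""]
        resources: ["secrets"]
        verbs: ["get", "list", "watch"]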
