
[RFC] Define Flux tenancy models #2086

Draft
wants to merge 4 commits into
base: main

Member

@stefanprodan commented Nov 15, 2021

The main goal of this RFC is to define the Kubernetes tenancy models supported by Flux.

This PR attempts to document the status quo and to clarify which multi-tenancy capabilities Flux has. It also serves as a base for rewriting the loose proposal in #582 into well-scoped RFCs.

@stefanprodan added the area/rfc (Feature request proposals in the RFC format) label Nov 15, 2021

The platform admins have unrestricted, cluster-scoped access to Kubernetes API.
They are responsible for installing Flux and granting Flux
access to the sources (Git, Helm, OCI repositories) that make up the cluster(s) control plane desired state.
Member

Does this mean that tenants should not configure their own sources? In the tenants section it does however state "Register their sources with Flux". I might just be misinterpreting the meaning.

Member Author

This is about the cluster control plane desired state as in cluster-wide resources, controllers, etc.

Member Author

I've added a new section "Tenants Onboarding". Hopefully this clarifies that tenants can add their app repos to their main repo which is registered by admins.

Member

I was initially confused by this as well. It might pay to spend some paras up front explaining which git repositories are assumed to exist, and how they are used (i.e., what they contain).

Member

I wonder if it would make sense to introduce a definition of the "control plane" in a separate paragraph at the beginning of the RFC somewhere. I'm thinking of everything that is either shared among tenants or created as part of the on-boarding of a tenant; the Flux instance itself, components such as Gatekeeper and resources such as ServiceAccounts.

It might also be helpful to explain the repo hierarchy: Each tenant has a root repo that's created by cluster admins and as many subsequent repos maintained by themselves.


Example of operations performed by tenants:

- Register their sources with Flux (`GitRepositories`, `HelmRepositories` and `Buckets`).
Member

It is stated above that cluster admins onboard tenants by registering their Git repositories with Flux. This might need clarification on the separation of concerns.

Member Author

I've added a new section "Tenants Onboarding". Hopefully this clarifies that tenants can add their app repos to their main repo which is registered by admins.

Contributor

@yebyen left a comment

This is a great overview of the current state-of-the-art with lots of good references for follow-up education. 👍 LGTM with changes, a few typos corrected.

Member

@squaremo commented Nov 29, 2021

It's not uncommon to have a "memorandum" RFC which describes the status quo, rather than proposing a new design. It seems like a needless indirection to use an RFC to propose new documentation, giving the content verbatim, though. I would expect either

  • an RFC which will stand by itself as the description of the status quo at some point (and, e.g., documentation can point at it, or paraphrase it while still relevant); or,
  • a PR with the new documentation.

Given the goal of building up to new designs, I think the first is the appropriate form here (and would require only a little adaptation).

Contributor

@jonathan-innis left a comment

LGTM 🚀

@nikkomiu left a comment

Looks good! (I'm with the MS/Azure group)

Member

@squaremo left a comment

Overall comment: 💯 👏 for the effort to definitively set out tenancy models for Flux. I think the content could be more pointed in how it does that, by

  • being clear about which bits are definitions or assumptions;
  • describing the models more directly -- in some places, the text lapses into being "how to" rather than being definitive.

Comment on lines 17 to 18
- List the tenancy models supported by Flux.
- Explain the differences between tenancy models.
Member

These aren't the only ways to set up a multi-tenant Flux system though, are they? This feels like it's partly a guide to good practice, rather than a reference. In which case, the language could be more like

  • Define two models for multi-tenancy, "soft multi-tenancy" and "hard multi-tenancy"
  • Explain when each is appropriate
  • Describe a reference implementation of each model with Flux

(this distinguishes between definitions, which are normative; and implementations, of which there may be variations).

### Hard Multi-Tenancy

With hard multi-tenancy, the platform admins use Kubernetes Cluster API to create dedicated clusters for each tenant.
Member

Is this a strict requirement of the model? Or could the kubeConfig secrets come from some other mechanism, e.g., if clusters are created with Terraform or by clicking buttons?

Member

I'm not clear on whether applying things remotely is required for the hard multi-tenancy, or kind of a mixed-in concern (if you're giving each tenant a cluster, you probably have a management cluster, so let's base that model on that assumption ...). Could you provide some justification in the text for this approach? Or explicitly give it as an assumption.

Member

D2iQ actually currently implements hard multi-tenancy without kubeConfig; instead, we have controllers that install Flux and create the sync resources on each tenant cluster. So I suppose kubeConfig is one of several ways to enforce hard multi-tenancy.
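
For reference, the `kubeConfig` approach discussed in this thread can be sketched as a Flux `Kustomization` on a management cluster that applies a tenant's manifests to a remote cluster. This is a minimal sketch and not text from the RFC: the names `tenant-alpha` and `tenant-alpha-kubeconfig` are hypothetical, and the API version shown is the one in use around Flux v0.23, so it may differ in later releases.

```yaml
# Hypothetical example: every name below is a placeholder.
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: tenant-alpha
  namespace: tenant-alpha
spec:
  interval: 10m
  prune: true
  path: ./
  sourceRef:
    kind: GitRepository
    name: tenant-alpha
  # The manifests are applied to the cluster described by this secret
  # rather than to the cluster where the Kustomization object lives.
  kubeConfig:
    secretRef:
      name: tenant-alpha-kubeconfig
```

How the secret gets created (Cluster API, Terraform, or by hand) is independent of this object, which is the question raised in the comments above.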

When onboarding tenants, platform admins have the option to assign namespaces, set
permissions and register the tenants' main repositories onto clusters in a declarative manner.

The Flux CLI offers an easy way of generating all the Kubernetes manifests needed to onboard tenants:
Member

This and the examples following are "how to set up multi-tenancy" rather than a description of the model or its implementation. Demonstrating how to set it up is not a goal in the text as it stands. Neither is describing the model or its implementation but, according to the PR title, perhaps it should be.

I suggest reworking this section to describe what the soft-tenancy model requires of RBAC (things like "each tenant namespace has a service account, with these bindings"). Telling people how to make it so conveniently, as you have here, is useful as extra information, but informative rather than definitive.
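
As a rough illustration of "each tenant namespace has a service account, with these bindings", the RBAC side of soft multi-tenancy could look like the following. This is a sketch under stated assumptions, not the RFC's definition: the tenant name `dev-team` and the namespace `apps` are hypothetical, and binding the built-in `cluster-admin` ClusterRole through a namespaced RoleBinding is only one possible choice of role.

```yaml
# Hypothetical example: tenant "dev-team" confined to namespace "apps".
apiVersion: v1
kind: Namespace
metadata:
  name: apps
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dev-team
  namespace: apps
---
# A RoleBinding (not a ClusterRoleBinding) limits the granted
# permissions to the "apps" namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: dev-team-reconciler
  namespace: apps
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: dev-team
    namespace: apps
---
# Flux impersonates the tenant's service account when applying,
# so the reconciliation is bound by the RoleBinding above.
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: dev-team
  namespace: apps
spec:
  interval: 10m
  prune: true
  path: ./
  serviceAccountName: dev-team
  sourceRef:
    kind: GitRepository
    name: dev-team
```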

Member

Expanding the RBAC recommendations here would be really useful.
It would be good to ensure we cater for protecting the tenant's service account from being misused.
Here are some ideas:

A) Vanilla K8s

The Platform Admin would pre-create all namespaces the tenant will use ahead of time, setting access via rolebindings for all the tenant's namespaces.
All Flux objects are created in the tenant's Flux namespace.

flux-tenant-alpha
├── flux-tenant-alpha-serviceaccount
├── flux-tenant-alpha-rolebinding
├── podinfo-helmrepository
└── podinfo-helmrelease
apps
├── flux-tenant-alpha-rolebinding
└── podinfo

B) HNC

The Platform Admin pre-creates the tenant's top-level namespace, with its service account and rolebindings.
All Flux objects are created in the tenant's top-level namespace.
Tenants can create subnamespaces and deploy apps to them.

flux-tenant-alpha
├── flux-tenant-alpha-serviceaccount
├── flux-tenant-alpha-rolebinding
├── podinfo-helmrepository
├── podinfo-helmrelease
└── [ns] apps
         ├── flux-tenant-alpha-rolebinding
         └── podinfo

In both cases, the "deployment" service account is never placed in a namespace that is shared with other applications.

If a tenant's Flux namespace needs to have mixed use (shared between applications and Flux components), it would require admission controllers to block the misuse of the tenant's service account.

C) Vanilla K8s + Admission Controllers

flux-tenant-alpha
├── flux-tenant-alpha-serviceaccount (e.g. Kyverno policy to block misuse of this service account)
├── flux-tenant-alpha-rolebinding
├── podinfo-helmrepository
├── podinfo-helmrelease
└── podinfo
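
Option A above could be expressed roughly as follows. This is a hedged sketch rather than a recommended configuration: the service account name `flux-tenant-alpha`, the binding to the built-in `admin` ClusterRole, and the `HelmRepository` named `podinfo` are assumptions made for illustration, and the API version is the one in use at the time of this discussion.

```yaml
# Hypothetical example for option A: Flux objects and the service account
# live in "flux-tenant-alpha", while the workload is deployed into "apps".
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flux-tenant-alpha
  namespace: apps
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
  # The subject lives in the tenant's Flux namespace, not in "apps",
  # so workloads in "apps" cannot mount this service account.
  - kind: ServiceAccount
    name: flux-tenant-alpha
    namespace: flux-tenant-alpha
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: podinfo
  namespace: flux-tenant-alpha
spec:
  interval: 10m
  serviceAccountName: flux-tenant-alpha
  targetNamespace: apps
  chart:
    spec:
      chart: podinfo
      sourceRef:
        kind: HelmRepository
        name: podinfo
```

A matching RoleBinding would also be needed in `flux-tenant-alpha` itself, as shown in the tree for option A.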

- [EKS multi-tenancy best practices](https://aws.github.io/aws-eks-best-practices/security/docs/multitenancy/)

### Soft Multi-Tenancy

Member

The paras here are a nice, and concise, explanation 💟

Member

+1

make use of it without any manual actions. For clusters created by other means than Cluster API, the
platform team has to create the `kubeConfig` secrets to allow Flux access to the remote clusters.

As of Flux v0.23.0, we don't provide any guidance for cluster admins on how to generate the `kubeConfig` secrets.
Member

The text above says they come from Cluster API.


## Motivation

The documentation [here](https://fluxcd.io/docs/) describes the security model of Flux.
Member

Suggested change
- The documentation [here](https://fluxcd.io/docs/) describes the security model of Flux.
+ The documentation [here](https://fluxcd.io/docs/security/) describes the security model of Flux.

Isn't this the more concrete page? The main one doesn't mention security.


## Introduction

Flux allows different organizations and/or teams to share the same Kubernetes control plane.
Member

I recall someone (maybe it was @stefanprodan) telling me there shouldn't be multiple instances of Flux running on a single cluster (which could help in isolating tenants). Maybe that notion should be part of this doc as some kind of "official guidance"?

Member

There are configurations in which this is theoretically still a solution, but they need to adhere to a set of rules that do not apply to most.


## User Roles

The tenancy models assume two types of user: platform admins and tenants.
Member

Suggested change
- The tenancy models assume two types of user: platform admins and tenants.
+ The existing Flux tenancy models assume two types of user: platform admins and tenants.

Not sure if that's the intention here but I figure a bit of clarification of which tenancy model we're talking about here might be helpful.


Comment on lines 227 to 228
Note that with hard multi-tenancy, tenants have full access to cluster-wide resources, so they have the option
to manage Flux independently of platform admins, by deploying a Flux instance on each cluster.
Member

We should mention here that hard multi-tenancy can be combined with soft multi-tenancy to get around this limitation.


The Kubernetes tenancy models supported by Flux are: soft multi-tenancy and hard multi-tenancy.

For an overview of the Kubernetes multi-tenant architecture please consult the following documentation:

## Tenancy Models

The Kubernetes tenancy models supported by Flux are: soft multi-tenancy and hard multi-tenancy.
Member

Of the four sources below (Kubernetes, GCP, Azure and AWS), only AWS uses the terms soft and hard for multi-tenancy. It would be useful to expand slightly here to clarify what we mean by them, which may speak to the RFC's goal of "Explain when each model is appropriate".

Some ideas:

| | Soft Multi-tenancy | Hard Multi-tenancy |
|---|---|---|
| Tenants may share cluster with other tenants | Yes | No |
| Tenants may share cluster with the flux management instance | Yes | No |
| Tenants access to cluster-wide resources | Limited | Unrestricted |

Comment on lines 108 to 110
Note that with soft multi-tenancy, true tenant isolation requires security measures beyond Kubernetes RBAC.
Please refer to the Kubernetes [security considerations documentation](https://kubernetes.io/blog/2021/04/15/three-tenancy-models-for-kubernetes/#security-considerations)
for more details on how to harden shared clusters.
Member

I wonder whether we need a small multi-tenancy security section on its own, as similar points may be valid for hard multi-tenancy - although at a lower level of the stack.

The key point is that Flux supports several multi-tenancy use cases, but the Platform Admin is ultimately responsible for ensuring the correct level of isolation is enforced between the tenants, based on their own security requirements.

squaremo added a commit that referenced this pull request Dec 16, 2021
These were adapted from the multi-tenancy RFC:

    #2086

Signed-off-by: Michael Bridgen <michael@weave.works>
@stefanprodan changed the title from "[RFC-0001] Define Flux tenancy models" to "[RFC-0004] Define Flux tenancy models" Dec 17, 2021
stefanprodan and others added 4 commits December 17, 2021 11:58
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
The multi-tenancy implementations described rely on impersonation and
remote apply; to make this RFC stand by itself, those need to be
explained, along with the authorisation model (how Flux "decides" what
it's allowed to do).

This commit adds a summary of the authorisation model, impersonation,
and remote apply, and rejigs the headings a little to make space.

Signed-off-by: Michael Bridgen <michael@weave.works>
Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>
squaremo added a commit that referenced this pull request Dec 17, 2021
This gives a baseline for future changes, e.g., expanding where
namespace ACLs are used, switching access control to
untrusted-by-default.

The "Security considerations" section  was adapted from

    #2086

Signed-off-by: Michael Bridgen <michael@weave.works>
@stefanprodan changed the title from "[RFC-0004] Define Flux tenancy models" to "[RFC] Define Flux tenancy models" Apr 12, 2022
@stefanprodan marked this pull request as draft April 12, 2022 12:00
@pjbgf mentioned this pull request Apr 20, 2022
4 tasks
souleb pushed a commit to souleb/flux2 that referenced this pull request Jul 10, 2023
This gives a baseline for future changes, e.g., expanding where
namespace ACLs are used, switching access control to
untrusted-by-default.

The "Security considerations" section  was adapted from

    fluxcd#2086

Signed-off-by: Michael Bridgen <michael@weave.works>