docs: add multi-tenancy support discussions #1175
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@ Coverage Diff @@
##             main    #1175   +/- ##
=======================================
  Coverage   61.43%   61.43%
=======================================
  Files          97       97
  Lines        6179     6179
=======================================
  Hits         3796     3796
  Misses       2051     2051
  Partials      332      332

☔ View full report in Codecov by Sentry.
5. As Ratify pods are isolated by namespace, each team can only access its own logs.

#### Cons
Another consideration is how we handle which ED provider is invoked in the constraint template. Will there be a new ED provider per namespace? Will there be a namespace-based naming convention for the providers that must be implemented in the constraint template to call the correct ED provider?
Or maybe the assumption is that each namespace will have its own dedicated ConstraintTemplate + Constraint.
For each Ratify deployment, we should have a Provider CR applied. I checked the GK code base, and it looks like for each admission review request it iterates through all providers to gather feedback. Also, the constraint template is a cluster-scoped resource, so we cannot have a namespaced one.
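For reference, a per-namespace external data Provider under this model might look roughly like the sketch below. The naming convention, service URL, and port are illustrative assumptions, not a finalized design:

```yaml
apiVersion: externaldata.gatekeeper.sh/v1beta1
kind: Provider
metadata:
  # hypothetical naming convention: ratify-provider-<namespace>
  name: ratify-provider-team-a
spec:
  # assumes a Ratify service running in the team-a namespace; the path and
  # port follow Ratify's default verification endpoint and may differ
  url: https://ratify.team-a:6001/ratify/gatekeeper/v1/verify
  timeout: 7
  caBundle: <base64-encoded CA bundle>  # TLS between GK and the provider is required
```

A namespaced Constraint (or a naming convention baked into the ConstraintTemplate) would then have to reference the matching provider name, which is exactly the open question above.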
### Deployment per Cluster

#### Pros
1. More resource efficient. Since there is only one Ratify deployment in the cluster, all tenants share the same Ratify pods and dependency resources. If traffic load increases, the admin can easily scale out the Ratify service accordingly.
2. GK will keep working the way it does today, sending one ED request to Ratify for each admission review.
How does mutation work? For example, one namespace may enable it while another may not. This would require scoping mutation Assign resources to the exact namespaces that have it enabled, and thus updating the Assign resources every time a new tenant is added (maybe this is OK, but it is definitely overhead).
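To make that overhead concrete, the per-tenant scoping would live in the `match.namespaces` list of the Assign resource. A rough sketch, loosely following the shape of Ratify's default image-mutation Assign; the resource name, namespace list, and provider name are assumptions for illustration:

```yaml
apiVersion: mutations.gatekeeper.sh/v1
kind: Assign
metadata:
  name: ratify-mutation-opt-in   # hypothetical
spec:
  applyTo:
    - groups: [""]
      versions: ["v1"]
      kinds: ["Pod"]
  match:
    scope: Namespaced
    # every tenant that enables mutation must be listed here, so onboarding
    # a new tenant means editing (or templating) this resource
    namespaces: ["team-a", "team-b"]
    kinds:
      - apiGroups: ["*"]
        kinds: ["Pod"]
  location: "spec.containers[name: *].image"
  parameters:
    assign:
      externalData:
        provider: ratify-mutation-provider   # assumed provider name
        dataSource: ValueAtLocation
        failurePolicy: UseDefault
        default: ""
```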
3. Users only need to deploy once when enabling the Ratify feature in the cluster. Afterwards, admins can configure CRs in their own namespaces.
4. Since GK and Ratify can be deployed to the same namespace, we can enable the `cert-controller` of GK to set up mTLS automatically.
#### Cons
Another con: how do we provide more granular control over configs defined at the Ratify service level, like cache type, size, and TTL, or log level? All of these are currently configured at Ratify startup and not as CRs.
Ratify's first stable version for production has been officially released, and it is seeing increased adoption as an external data provider among a growing number of customers. In production environments, many customers are opting for a multi-tenant architecture to improve resource efficiency within their clusters. To support this scenario, Ratify needs some essential refactoring. This document presents several proposed design approaches and their respective modifications for further scrutiny and discussion.
# Multi-tenancy category
Firstly, the concept of multi-tenancy within a Kubernetes cluster can be broadly categorized into two distinct classifications: multi-team and multi-customer.
Would it be better to add this link as a reference for concept details? https://kubernetes.io/docs/concepts/security/multi-tenancy/
3. No need to add a namespace to the external data request from GK to Ratify. We can keep the same request struct.
4. Users could easily set up ResourceQuota and LimitRange to assign appropriate resources (see the sketch after this list).
5. As Ratify pods are isolated by namespace, each team can only access its own logs.
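For item 4, standard per-namespace quota objects would be enough. A minimal sketch, where the tenant namespace and the limit values are placeholders rather than recommended settings:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ratify-quota
  namespace: team-a          # hypothetical tenant namespace
spec:
  hard:
    requests.cpu: "2"        # caps the total requests of the tenant's Ratify pods
    requests.memory: 2Gi
    limits.cpu: "4"
    limits.memory: 4Gi
---
apiVersion: v1
kind: LimitRange
metadata:
  name: ratify-limits
  namespace: team-a
spec:
  limits:
    - type: Container        # defaults applied to containers that omit requests/limits
      defaultRequest:
        cpu: 200m
        memory: 256Mi
      default:
        cpu: 500m
        memory: 512Mi
```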
I have many concerns about Deployment per Namespace. Users really care about resource usage and maintenance effort; that is to say, users prefer a light-weight, low-maintenance solution.
In addition, since Ratify is an admission controller, users may be concerned about single-point failure of a Ratify instance. Users would need to maintain multiple Ratify instances in different namespaces, which would be a pain point.
@binbin-li I have updated the required e2e tests based on the k8s bump and GK bump. Please merge main when you have a chance.
From the perspective of user experience, there is a great difference between the two deployment strategies.

### Deployment per Namespace
1. Cluster admin or team admin must deploy a Ratify service in each required namespace.
How do we ensure "must"? Will we have a Ratify controller to make sure each namespace has the Ratify deployment, or will it be a manual operation by the cluster admin?
If it is automatic, will it cause an unexpected node memory/CPU usage spike when a new namespace is created? The Ratify default CPU/memory request is not small.
It's a valid concern. A new namespace would require a new Ratify deployment with related resources, and it might get worse when multiple namespaces are created at the same time.
Note: a Ratify service includes all dependency services, such as Redis.
An overview of deployment per namespace:
Do we have a detailed design for the CRD behavior change? I guess most of them will become namespace-scoped?
this is under discussion, and I'll have a detailed design soon.
All CRs would be updated to be namespace-scoped. And we probably need some refactoring on the cosign verifier and certStore to avoid mounting keys.
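For illustration only, a namespace-scoped verifier might end up looking like the sketch below. This assumes the current Verifier CRD shape stays the same and only gains a `metadata.namespace`; the namespace name is a placeholder and none of this is a finalized design:

```yaml
apiVersion: config.ratify.deislabs.io/v1beta1
kind: Verifier
metadata:
  name: verifier-notation
  namespace: team-a   # new: scoped to the tenant's namespace instead of cluster-wide
spec:
  name: notation
  artifactTypes: application/vnd.cncf.notary.signature
```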
Thank you for all the info, Binbin. Left some suggestions to help finalize the design doc.
* One solution is adding a namespace field to all Ratify logs. Although this option cannot prevent cross-tenant access without disabling `kubectl logs`, it can be used to filter logs in Kibana.
* The other solution is to let Ratify partition logs by path. For example, logs from namespaceA are saved to `/var/log/namespaceA` and logs from namespaceB are saved to `/var/log/namespaceB`. This solution can prevent cross-tenant access without disabling `kubectl logs`. However, it requires Ratify to explicitly configure logrus output paths for each team and to dynamically manage paths for all namespaces, including creating and deleting directories.
#### File system/PVC |
We could add priority to the areas that require multi-tenancy support. The FileSystem is probably a higher-pri item and should be listed towards the top of the doc.
My initial plan was to have another doc for implementation details once we decide the model.
Here is a brief work breakdown: https://hackmd.io/@H7a8_rG4SuaKwzu4NLT-9Q/By-jVTBST
Description

What this PR does / why we need it:
It's not a final design for multi-tenancy but includes 2 options with a comparison for discussion.

Which issue(s) this PR fixes (optional, using `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when the PR gets merged):
Investigating: #743

Type of change
Please delete options that are not relevant.

How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Please also list any relevant details for your test configuration.

Checklist:

Post Merge Requirements

Helm Chart Change