Support for Multiple Tenant Access Pods #1610

Closed
caitlinbestler opened this issue Mar 27, 2018 · 14 comments

@caitlinbestler

The goal of multiple tenant access pods is to enable access to the same backend service, presumably storage, from multiple tenants each with their own tenant scoped network access.

Kubernetes already supports the ability to define a network namespace to which only specified containers have access. This is typically enforced by VLANs or VxLANs.

A typical tenant-scoped virtual network would contain client pods, a pod implementing a user authentication service (usually AD or LDAP), and pods providing access to the storage cluster.

This feature would enable a single pre-existing cluster to provide service to multiple tenants, each via a virtual frontend network defined for that tenant. There may also be a backend storage network defined for the storage cluster, but the backend storage network would pre-exist any tenant access network.

Providing a shared storage cluster enables economies of scale in providing storage. But no economy of scale would be acceptable if it enabled leakage of content from one tenant to another. Tenant isolation must exist independently of whatever access control the storage cluster itself provides.

Layering of access control enables re-use of open-source daemons designed for legacy protocols: NFS and iSCSI targets are traditionally designed to provide service within a single corporate intranet, not to the world-wide internet. Second, enforcing tenant isolation in the network itself provides guaranteed isolation that does not depend on the storage servers. Tenant isolation is easily understood; ACLs require reading manuals.

The proposed life-cycle is as follows:

  1. Provision the storage cluster, giving it a persistent identity. For this issue we will leave the question of how to identify the hosts for this cluster to the existing multiple-storage-provider issue. However the cluster was provisioned, there is now a set of hosts that have been marked as eligible for scheduling Tenant Access Pods.
  2. Tenant Access networks can then be added (or dropped). Adding one involves creating tenant client pods, subject to normal Kubernetes scheduling rules, and Tenant Access Pods, which can only be scheduled on hosts previously marked as supporting Tenant Access Pods for this specific storage cluster (a sketch of such a pod spec follows this list).
  3. When the Tenant Access Pod is activated it will establish communication with the storage cluster backend pod already running on the same host. Those pods will continue to communicate based on their own preference. They may use generic IPC, establish message queues in shared memory, or even share files in a shared directory (although I wouldn't recommend that approach). If the storage pod is flexible enough, the Tenant Access Pod could even just identify the virtual network interface to the backend pod.
  4. The backend pod would be responsible for confining all work done for a Tenant Access Pod to resources owned by the matching tenant. This typically means that the set of mount points available is limited to those created for that specific tenant.
  5. Each frontend pod would identify, and provide proxy access to tenant-specific authentication services. The backend storage service pod needs to know what role each tenant user has been admitted under so that it can enforce any file/object/volume ACL rules that it is implementing. The syntax of the ACL rules will be tenant independent, but the method of validating identity before the ACL rules can be applied will be tenant specific.
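
To make step 2 concrete, here is a minimal sketch of what a Tenant Access Pod spec could look like. The node label key, namespace layout, and image are hypothetical placeholders for discussion, not an existing Rook or Kubernetes convention:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// tenantAccessPod builds a Tenant Access Pod that can only be scheduled on
// hosts previously labeled as eligible for the given storage cluster.
// Label key, namespace layout, and image are placeholders.
func tenantAccessPod(tenant, cluster string) *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "tenant-access-" + tenant,
			Namespace: tenant, // one namespace per tenant-scoped network
		},
		Spec: corev1.PodSpec{
			// Only hosts marked for this storage cluster are candidates.
			NodeSelector: map[string]string{
				"storage.example.com/tenant-access": cluster,
			},
			Containers: []corev1.Container{{
				Name:  "gateway",
				Image: "example/tenant-gateway:latest", // placeholder image
			}},
		},
	}
}

func main() {
	pod := tenantAccessPod("tenant-a", "cluster-x")
	fmt.Println(pod.Name, pod.Spec.NodeSelector)
}
```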

Ideally, no backend storage pod could be deactivated while Tenant Access Pods are still active on the same host. In general, storage cluster pods are very poor candidates for migration because their purpose is to provide persistent storage, and providing persistent storage is incompatible with quick migration. So even without a special hook to enforce this, it is probably not a major issue.

@dyusupov
Contributor

Currently Kubernetes cannot provide "multi-homed" or true "multi-tenant" networking isolation. The good news is that the Network Plumbing Working Group folks are working on it:

https://docs.google.com/document/d/1Ny03h6IDVy_e_vmElOqR7UdTPAG_RNydhVE1Kx54kFQ/edit

Meanwhile, a multi-homed network can be constructed fairly easily with Intel's "multus" CNI plugin:

https://github.com/Intel-Corp/multus-cni
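
For example (a rough sketch only): once multus is installed and a NetworkAttachmentDefinition for the backend network exists, a pod requests the extra interface via the network-selection annotation. The attachment name, pod name, and image below are placeholders:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// A pod requesting a second, multus-managed interface in addition to the
	// default cluster network. "storage-backend" stands in for a
	// NetworkAttachmentDefinition that must be created separately.
	backendPod := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name: "storage-backend-0",
			Annotations: map[string]string{
				"k8s.v1.cni.cncf.io/networks": "storage-backend",
			},
		},
		Spec: corev1.PodSpec{
			Containers: []corev1.Container{{
				Name:  "storage",
				Image: "example/storage-backend:latest", // placeholder
			}},
		},
	}
	fmt.Println(backendPod.Name, backendPod.Annotations)
}
```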

@caitlinbestler
Author

Kubernetes does not yet provide "multi-homed" or "multi-tenant" Pods. It is not clear when Intel's Multus project will provide plug-and-play Virtual NICs rather than just multiple Virtual NICs.

To support on-demand creation and deletion of Tenants we need to add and drop the Tenant Access networks. What I am proposing is doing this by having single-interface Pods connect with each other through localhost IPC rather than through virtual NICs.

Without dynamic creation, I am not certain how a fixed set of networks that must be known when the storage cluster is provisioned could support the potentially large number of Tenants that a single storage cluster might have to serve across all gateways.
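
As one concrete (and purely illustrative) way to realize the localhost-IPC idea: the Tenant Access Pod and the backend pod on the same host could rendezvous over a Unix domain socket in a shared hostPath directory, so neither pod needs an extra virtual NIC. Paths, names, and images are placeholders:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	hostPathType := corev1.HostPathDirectoryOrCreate

	// Shared host directory in which the backend pod and the Tenant Access
	// Pod rendezvous over a Unix domain socket. The path is illustrative.
	ipcVolume := corev1.Volume{
		Name: "tenant-ipc",
		VolumeSource: corev1.VolumeSource{
			HostPath: &corev1.HostPathVolumeSource{
				Path: "/var/run/storage-tenant-ipc",
				Type: &hostPathType,
			},
		},
	}

	// Frontend (Tenant Access) pod; the backend pod on the same host would
	// mount the same hostPath volume.
	frontend := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "tenant-a-access"},
		Spec: corev1.PodSpec{
			Volumes: []corev1.Volume{ipcVolume},
			Containers: []corev1.Container{{
				Name:  "gateway",
				Image: "example/tenant-gateway:latest", // placeholder
				VolumeMounts: []corev1.VolumeMount{{
					Name:      "tenant-ipc",
					MountPath: "/ipc",
				}},
			}},
		},
	}
	fmt.Println(frontend.Name)
}
```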

@caitlinbestler
Author

Clarification:
Where the Intel Multus project is relevant is in provisioning the Backend Pod to use a 2nd NIC for the Backend Network.

Without that it would either:

  • Have to be controlled via both NICs, with Kubernetes still understanding that it is one host.
  • Have Kubernetes control it via the frontend NIC but only provision the Pod to access the backend NIC.
  • Provision the backend NIC as an anonymous non-network device that just happens to provide network services.

It would also be ideal if Kubernetes could restrict which Backend Pods any specific frontend Pod could communicate with. This could be solved by provisioning only a single Backend Pod per host and only scheduling on that host the frontend pods which are authorized to communicate with it. In other words, if a Pod does not require access to a Storage Cluster X host, do not schedule it there.
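
One way to express "only schedule a frontend pod on hosts where its authorized backend pod runs" with existing Kubernetes primitives is required pod affinity. This is a sketch of that idea only; the label keys and values below are illustrative, not something Kubernetes or Rook defines today:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Require that the frontend pod lands only on a node already running a
	// backend pod labeled for the same storage cluster. Labels are illustrative.
	affinity := &corev1.Affinity{
		PodAffinity: &corev1.PodAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: []corev1.PodAffinityTerm{{
				LabelSelector: &metav1.LabelSelector{
					MatchLabels: map[string]string{
						"app":             "storage-backend",
						"storage-cluster": "cluster-x",
					},
				},
				// Co-locate on the same node as the backend pod.
				TopologyKey: "kubernetes.io/hostname",
			}},
		},
	}

	frontend := corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: "tenant-a-frontend"},
		Spec: corev1.PodSpec{
			Affinity: affinity,
			Containers: []corev1.Container{{
				Name:  "gateway",
				Image: "example/tenant-gateway:latest", // placeholder
			}},
		},
	}
	fmt.Println(frontend.Name)
}
```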

@travisn
Member

travisn commented Apr 20, 2018

@caitlinbestler could you summarize how Rook can help with this design? There appear to be a number of broader design issues around networking and multi-tenancy beyond what Rook could help with. Rook is more about orchestrating the needs of the storage backend, and the networking and data path are ultimately outside Rook.

@caitlinbestler
Author

While network-related code would remain a Kubernetes issue, there are a few areas where specific Kubernetes strategies can benefit all storage clusters:

  • Enabling multi-tenant isolated access to a shared storage cluster, as opposed to needing to create tenant-specific storage clusters.
  • Provisioning high-performance, low-latency backend networks where switch ports can be dynamically assigned to storage networks. Of course, "dynamically" here means at the rate at which you add or remove a storage server from a storage cluster, which is considerably less often than a compute resource is added or removed from a compute job.
    While the high-performance, low-latency backend network may be of use for some compute clusters, there are enough issues of common interest to storage clusters that it makes sense to work on a consensus template in a more focused group rather than in the Kubernetes community at large.

Plus, Rook's effort to define storage clusters in a somewhat vendor-neutral way should be aware of these issues, and may even mandate that the previously mentioned templates be followed if you want to be a Rook-compatible storage cluster. Vendors are told that they can schedule their network resources however they want, but Rook will only understand what they are doing if they do as specified by some Rook-published HOW-TO document.

@travisn
Member

travisn commented Apr 20, 2018

Certainly, wherever there is a common benefit to the storage clusters, Rook either documenting or facilitating the feature is the goal. With multi-tenant access to the storage, however, I am struggling to see where the commonality is. The data protocol is likely where the tenant enforcement comes from. For example, object storage naturally has users that present their credentials to the S3 interface, and Ceph RGW has a multi-tenant capability. Fundamentally, either the data protocol needs to support multi-tenancy natively, or else the networking needs to keep the tenants isolated.

As mentioned in your design discussion, another option is to have tenant access pods. In that case, Rook would orchestrate starting those access pods. I could see Rook defining a CRD that allows the multi-tenant access via the access pods. Does Nexenta have tenant access pods today? I'm not sure how commonly Ceph users would find the tenant access pods desirable since it would not give full performance access to the OSDs.
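
Purely as a strawman for discussion (no such types exist in Rook today, and all field names are illustrative), a tenant access CRD could look roughly like this:

```go
// Package v1alpha1 sketches a hypothetical TenantAccess CRD for discussion.
// None of these types exist in Rook; field names are illustrative only.
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// TenantAccess asks the operator to run access pods for one tenant against
// an existing storage cluster.
type TenantAccess struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              TenantAccessSpec `json:"spec"`
}

// TenantAccessSpec describes what the tenant's access pods should expose.
type TenantAccessSpec struct {
	// ClusterRef names the backing storage cluster.
	ClusterRef string `json:"clusterRef"`
	// Protocols to expose to this tenant, e.g. "nfs", "iscsi", "s3".
	Protocols []string `json:"protocols"`
	// AuthEndpoint points at the tenant's own AD/LDAP service.
	AuthEndpoint string `json:"authEndpoint"`
	// Replicas is the number of access pods to run for this tenant.
	Replicas int32 `json:"replicas"`
}
```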

@caitlinbestler
Author

caitlinbestler commented Apr 20, 2018 via email

@travisn
Copy link
Member

travisn commented Apr 20, 2018

Good point that NFS Ganesha is similar. Wherever there is similarity, that's where we can get the benefits of sharing the platform. And as we add more backends, it will become clearer where there is overlap that would benefit from common orchestration.

@jbw976
Member

jbw976 commented Feb 6, 2019

There has been recent discussion on this (related) topic of providing support for multiple network interfaces, not necessarily from the multi-tenant perspective, but from the perspective of a storage solution being able to separate its backend (internal) traffic from its frontend (client) traffic. This is related to this issue, but perhaps would fit better in a new issue altogether.

Recent discussion on this topic: #2408 (comment)

@caitlinbestler
Author

caitlinbestler commented Feb 6, 2019 via email

@jseguillon

Hi there. Sorry to be late, but it is unclear to me whether #2048 made this available: choosing which interface in the Pod should route/listen for Ceph traffic?

If nothing is available "in the box", does anyone know whether it is feasible to tune routes and start command lines, or is there no way to achieve this?

(Interfaces are provisioned by multus: one 1G interface on the control plane, the other a 10G interface.)

@caitlinbestler
Author

caitlinbestler commented Apr 20, 2019 via email

@travisn
Member

travisn commented Feb 6, 2020

It's still not clear to me what Rook can provide here. At the K8s and network layer there are features such as IPv4/IPv6 dual-stack support, multus, etc. Each storage provider will need to implement the multi-tenancy itself. We can discuss again how it can be shared after one storage provider implements something here.

travisn closed this as completed Feb 6, 2020