
RFC: BOSH-Provided Dynamic Disks via Volume Services#1453

Open
rkoster wants to merge 2 commits into cloudfoundry:main from rkoster:rfc-bosh-dynamic-disks-volume-services

Conversation

@rkoster
Contributor

@rkoster rkoster commented Mar 16, 2026

This PR adds the RFC "BOSH-Provided Dynamic Disks via Volume Services".

For easier viewing, you can see the full RFC as a rendered preview.

Summary

This RFC proposes a mechanism for BOSH to provide IaaS-managed persistent disks to Diego containers through the existing volume services architecture.

  • BOSH Director gains a permissions model controlling which instance groups may create, attach, detach, and delete disks
  • BOSH Agent gains subcommands that relay disk requests to the Director over NATS
  • Volume driver implements Docker Volume Plugin v1.12, translating volume operations into Agent disk commands

Diego's volman discovers this driver automatically — zero changes to Diego, CAPI, or the CF CLI.

This is a foundation technology enabling Cloud Foundry to run stateful single-container workloads — agentic coding sessions, cloud-based developer environments, and long-running AI agent processes.

Key Design Points

  • No Director credentials in workloads — Agent uses existing NATS mTLS
  • Director-enforced permissions declared in deployment manifest
  • Disk CIDs flow through Cloud Controller service bindings (same as NFS/SMB)
  • VM-level locking allows dynamic disk ops during bosh deploy
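The manifest-declared permissions could look something like the fragment below; the property name, operation names, and size limit are assumptions sketched for illustration, not the RFC's final schema.

```yaml
# Hypothetical instance-group property granting diego-cell a subset of
# dynamic-disk operations (names illustrative, not the RFC's schema).
instance_groups:
- name: diego-cell
  jobs:
  - name: volume-driver
    release: dynamic-disk
  properties:
    dynamic_disk_permissions:
      allowed_operations: [create, attach, detach]   # no delete
      max_disk_size_mb: 102400
```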

@beyhan beyhan requested review from a team, Gerg, beyhan, cweibel and stephanme and removed request for a team March 16, 2026 15:41
@beyhan beyhan added toc rfc CFF community RFC labels Mar 16, 2026
@Gerg
Member

Gerg commented Mar 16, 2026

So, this is clearly related to #1401. As referenced in this RFC:

This approach has been prototyped in community PR #1401 and bosh PR #2652.

Is this intended as an extension of that RFC, or an alternative?

@mariash
Member

mariash commented Mar 16, 2026

It is unexpected to see a new RFC from a new author rather than questions or suggestions on the original one, #1401. We discussed the approach at the meeting and the majority agreed on it. I will reiterate the points we covered at the meeting.


1. Security: IaaS operations should not originate from workload VMs

This RFC proposes that Diego cells initiate disk create, attach, detach, and delete operations via Agent commands over NATS. Diego cells run untrusted tenant workloads. Today they have no ability to trigger IaaS resource manipulation. This proposal gives them that ability, which is not secure.

In the original RFC, all IaaS disk operations are initiated from the control plane. Workload VMs are not part of the disk management request path.

2. Decision initiation: Director vs. Agent

The Agent already sends messages to the Director over NATS (heartbeats, etc.), but those are status reports. The Director is the decision-maker.

This RFC changes that. The Agent becomes the initiator of IaaS-mutating decisions: "create this disk," "attach this disk to that VM." The Director becomes an execution backend.

In the original RFC, the Director remains the decision-maker. Control-plane clients call the Director HTTP API, the Director decides how and when to execute CPI operations. The Agent's role does not change.

3. Different authorization mechanisms

The original RFC adds HTTP endpoints to the Director using existing authentication (UAA) and existing request patterns. The Agent gets Director-initiated instructions to resolve device symlinks — same kind of thing it already does.

This RFC introduces:

  • Agent: Five new subcommands that turn the Agent into an IaaS operation relay. New responsibility that does not exist today.
  • Director: A new permissions manifest property, deploy-time validation, runtime enforcement, and NATS handlers for Agent-initiated requests. A new authorization model with no precedent in BOSH.

4. Incorrect credential sprawl characterization

Appendix A characterizes the Director HTTP API approach as requiring UAA credentials on every Diego cell. That assumes the volume driver on each cell calls the Director directly, which is not the case.

A centralized control-plane component can hold Director credentials with disk management scope only and make API calls. No credentials on workload VMs.

5. Reconciliation

The Director HTTP API supports centralized reconciliation: a controller observes desired state from BBS, compares to actual disk state, and converges. Retries, race handling, and edge cases are managed in one place.

This RFC places retry logic only in the volume driver, which is not as robust. For example:

  1. If a cell dies mid-operation, the controller can detect the incomplete state and clean up. In the per-cell model, if a cell dies after attaching a disk but before mounting, who detects that and recovers?
  2. One component owns all disk operations, so debugging and auditing happen in one place. In the per-cell model, you have to correlate logs across N cells to understand what happened.
  3. In the per-cell model, VM A's driver detaches and VM B's driver attaches independently. During an app restart, a replacement LRP is started first and then the old LRP is shut down. At that point the volume driver for the replacement LRP will keep failing to attach the disk. In the centralized model, the controller sees the whole picture and can issue the detach first and then the attach, without retrying Director tasks that will keep failing.

6. "Zero Diego changes"

This RFC states "zero changes to Diego, CAPI, or the CF CLI" as a design advantage. The original RFC does not propose any changes to Diego or CAPI either — it is scoped entirely to BOSH.

Also, this RFC's own Future Work section proposes InstanceIndexedVolumes — a Diego change — for disk sets.

@rkoster
Contributor Author

rkoster commented Mar 17, 2026

So, this is clearly related to #1401. As referenced in this RFC:

This approach has been prototyped in community PR #1401 and bosh PR #2652.

Is this intended as an extension of that RFC, or an alternative?

It is a formalisation of an alternative implementation that was discussed in the working group. It also takes on a bigger scope, proposing a clear path for adopting this BOSH feature in the context of Diego and CF.

@rkoster rkoster closed this Mar 17, 2026
@github-project-automation github-project-automation bot moved this from Inbox to Done in CF Community Mar 17, 2026
@rkoster rkoster reopened this Mar 17, 2026
@cweibel

cweibel commented Mar 17, 2026

The integration with the volume driver to add support to Cloud Foundry would be a very large win for those of us who do not have access to SMB/NFS solutions on AWS.

I reread #1401; it only seems to address adding dynamic disks to BOSH for a Kubernetes project that does not appear to be open sourced. While that isn't a problem in itself, if there is an alternative solution that can benefit Cloud Foundry directly, that gives this proposal quite a bit of weight.

I also appreciate that the #1453 solution proposes:

  • Granularity over which instance groups (read: diego-cells) can perform create, delete, attach, and detach disk operations
  • Leverages NATS on the BOSH Agents for communication
  • Would not conflict with a rolling bosh deploy of CF
  • A fairly simple integration with the existing Cloud Foundry architecture
  • I know I mentioned this at the beginning, but adding dedicated persistent storage to the CF app container layer would be a large benefit and is among the most-demanded features CF app developers have been asking for on my foundations.

- Add Layer 4: Diego Changes section
  - Scope: private for exclusive-access volumes (stop-first evacuation)
  - InstanceIndexedVolumeIds spec flag (stable volume IDs across restarts)
  - These are independent: private scope triggers evacuation change,
    InstanceIndexedVolumeIds triggers volume ID suffix change

- Rewrite Summary with security-first value proposition
  - No credential distribution (Agent relay, not API credentials)
  - Per-instance blast radius (no lateral movement)
  - Manifest-driven permissions (easy onboarding)

- Rewrite Problem section with integration challenge framing
  - Who holds IaaS credentials?
  - How are operations authorized?
  - What's the blast radius of compromise?

- Expand Security Model section with detailed explanations

- Update Appendix D to acknowledge remote drivers are supported
  - Centralized proxy is technically feasible
  - Trade-offs: SPOF, per-cell auth still needed, complexity

- Update Future Work: Disk Sets to reference Layer 4
@beyhan beyhan moved this from Done to In Progress in CF Community Mar 17, 2026
@mariash
Member

mariash commented Mar 17, 2026

@cweibel #1401 does use volume services, similar to NFS and SMB. The difference is that in #1401 the volume driver's responsibility is only to mount/unmount the disk for the container (as with NFS/SMB), whereas in this RFC the volume driver is additionally responsible for attaching and detaching disks to the VM. In the original RFC that is the responsibility of a separate component, deployed apart from the user workload: it watches BBS events and attaches/detaches disks by issuing HTTP API calls to the Director. Disk orchestration is better done in a centralized controller that has the whole picture of what is happening with LRPs, rather than in a distributed model where the volume driver is only aware of the current LRP. If the volume driver fails to detach for any reason, who is responsible for cleaning up and reconciling disk state? What happens if the Director becomes unavailable? That is why Cloud Foundry has always had a centralized controller like BBS responsible for reconciliation.

In case of #1401 we have:

  • Service broker, deployed as an app - responsible for creating the service and binding it to an application; it does not perform any disk operation. Once the service is bound to an application, Cloud Controller adds volume mount information to the LRP (this is already current behavior).
  • Disk provisioner, deployed as a separate control-plane component and the only component with access to the Director API - watches BBS events for actual LRPs (UNCLAIMED to CLAIMED) and attaches disks to the VM; watches BBS events from RUNNING to STOPPED/CRASHED, or for a missing LRP, and detaches disks from the VM. In error cases it moves the disk to wherever the LRP is scheduled. If it sees that an LRP restarted on the same VM, it skips the unnecessary detach/attach.
  • Volume driver, deployed on Diego cells - responsible for disk mount and unmount only.

As you can see, the #1401 model is actually closer to NFS/SMB. NFS/SMB were never responsible for creating/attaching disks; they operated on existing disks. In fact, the service broker and volume driver code from NFS/SMB can be completely reused for #1401: SMB/NFS already share the same service broker and volume driver code, and the same can be done with dynamic disks.

@cweibel

cweibel commented Mar 17, 2026

While the end goal of #1401 may be to provide dynamic disks to a closed-source solution (which doesn't bother me), #1453 addresses provisioning dynamic disks and presenting them to application instances inside a CF application. The end goals are very different, but the overlap in requiring BOSH to manage the disk-allocation lifecycle means I would like these two not to block each other. From what I read of #1401, there is no mention of a service broker or any other orchestration beyond the additional BOSH Director API endpoints for handling dynamic disks. For me, this makes it hard to evaluate where it conflicts with adding support for persistent volumes inside of Cloud Foundry, which I would very much like to have.

@mariash
Member

mariash commented Mar 17, 2026

To summarize this RFC key design points:

  • No Director credentials in workloads - Agent uses existing NATS mTLS: the original RFC has better protection, since Director credentials can only be stored in a separate component and never on the same machine as workloads. #1453 opens a back door to the Director on VMs that run workloads. If ASGs are not applied properly, a malicious application could gain access and send requests to the bosh agent the same way the proposed volume driver does, issuing commands to detach, attach, and delete disks. This is a security concern. Today the agent does not issue commands to the Director.

  • Director-enforced permissions declared in deployment manifest - this might be a useful broader feature and could be a separate RFC. The original RFC (#1401) uses the current authentication model via UAA with a disk management scope.

  • Disk CIDs flow through Cloud Controller service bindings (same as NFS/SMB) - when using a service broker, volume mount information is passed by Cloud Controller from the Service Broker into the LRP. This is already how the Service Broker works today, and the original RFC also works with the Service Broker/Driver model.

  • VM-level locking allows dynamic disk ops during bosh deploy - this was introduced in #1453.

@mariash
Member

mariash commented Mar 17, 2026

@cweibel thank you for your feedback; let me add a use case for a CF application to #1401.

@mariash
Member

mariash commented Mar 17, 2026

@cweibel Updated RFC with the Cloud Foundry use case https://github.com/mariash/community/blob/dynamic-disks/toc/rfc/rfc-draft-dynamic-disks.md#cloud-foundry-integration Please let me know if this will cover your concern.

@mariash
Member

mariash commented Mar 17, 2026

@cweibel to address the rest of your concerns:

  • Granularity on which instance groups (read: diego-cells) can perform create, delete, attach and detach disk operations

Thank you for raising this concern (@beyhan also mentioned it). I added an explicit "Authorization model" section to cover how the current model with UAA scopes will be used to limit access to disk operations - https://github.com/mariash/community/blob/dynamic-disks/toc/rfc/rfc-draft-dynamic-disks.md#authorization-model @rkoster I suggest having a separate RFC for the Director permissions model you proposed in this RFC.

  • Leverages NATS on the BOSH Agents for communication.

Could you please specify why NATS communication is preferred over the existing Director HTTP API model? The main concern I have is security: it opens a back door on Diego cells to issue IaaS commands to the Director.

  • Would not conflict with a rolling bosh deploy of CF

This was actually covered in the original RFC under "VM lock", which provides coordination between the VM lifecycle and disk management operations - https://github.com/mariash/community/blob/dynamic-disks/toc/rfc/rfc-draft-dynamic-disks.md#vm-lock

  • A fairly simple integration with the existing Cloud Foundry architecture.

Hopefully the CF use case I added to the RFC will cover this. The difference between the RFCs is the communication channel (NATS vs. HTTP). Calling the bosh-agent from the volume driver is not secure, since that is where workloads run. An HTTP API would allow only authorized users on separate VMs to perform requests. The volume driver could even issue these requests itself, although I recommend not opening a path from workload VMs and instead doing it on a control-plane VM, which is possible with an HTTP API and UAA scopes.
