Managed clusters: Support storing stateful service data on local VM temp disk + Ephemeral OS disks #1157
@juho-hanhimaki where have you read that stateful services are not supported on SF Managed Clusters?
@olitomlinson I think you misunderstood me. I am well aware that stateful services are supported on managed clusters. The problem is that managed clusters store data in Azure Storage instead of on the local VM disk. That can be suboptimal for some scenarios and users. With normal SF clusters you don't need Azure Storage, because the data is replicated and kept available within the cluster itself.
@juho-hanhimaki My apologies. I've not come across this limitation in the docs; can you point me to it? Many thanks!
I don't know if there's actual documentation about this. But the managed clusters announcement blog post talks about the fact that the storage is now based on managed disks instead of the temp disk. The topic has also come up during SF community Q&As. |
Thanks @juho-hanhimaki. I interpreted the blog post as additional support for Managed Disks. But, yes, additional clarity would be nice here.
Thank you for the feedback @juho-hanhimaki @olitomlinson. In the preview that is currently available, the data disks for stateful services on a managed cluster only use managed disks. We are working to enable support that will allow you to select a specific managed disk SKU in the near future. I have added this work item to the backlog and will update it when we have more information to share about support for using the VM temp disk for stateful services.
Thanks @peterpogorski. A couple of further questions:
As @juho-hanhimaki mentioned, the benefits of local data performance in the cluster are huge, and are pretty much one of the biggest attractions/differentiators against other orchestrators. Assuming there is a significant latency difference here, and that it is within the tolerance of most customers, does this mean that read/write latency of state is no longer a killer feature for Service Fabric moving forward?
If so, I could understand the move towards Managed Disks being able to satisfy the local performance requirements that we expect with the traditional Service Fabric temp disk model.
@craftyhouse what's the plan to support ephemeral disks in SFMC?
What will the migration scenario look like once temp disk support is available?
We are working to add support for temp disks on stateless node types. The migration will require creating a new node type and moving the workload over. We haven't seen any concerns from customers since launch around the performance or latency of managed disks, but we would love to hear more if there has been any impact there. A lot of what makes SFMC work for stateful workloads relies on managed data disks, and we will continue to expand those options too.
I'm actually more concerned about the additional cost of using managed disks. How is SFMC different from plain old SF when it comes to stateful workloads?
I should add: it was the disk operations cost of using Standard SSD disks that caught us by surprise. It looks like we should be able to bring that cost under control by switching to Premium SSD, in which case we can probably live with running our stateful workloads on managed disks. I suppose support for ephemeral OS disks is still interesting, though.
There is no difference in how the Service Fabric runtime handles stateful workloads whether deployed using classic or managed clusters. By using managed disks, SFMC is able to benefit customers in several ways. Hope that helps!
On the topic of encryption at rest: will SFMC support for temp disks work with encrypted disks? Encryption at host requires enabling a
Having stateful services use temporary disks is still a top priority for us. We can't afford to pay extra for managed disks (assuming poor IOPS/$). As I understand it, managed disks also hurt overall reliability since they are LRS: if a single availability zone is impacted, data disks can be down even if the cluster VMs are up in the two other zones. Having VMs up with data disks down makes the whole cluster useless for any stateful workloads. Managed clusters are not very interesting to us until temp disks for data and ephemeral OS disks are supported. We don't want to depend on any external disk service. VMs have all the resources locally, and Service Fabric coordinates services/replication: maximum performance, cost effectiveness, and availability. Backups and long-term storage (historical IoT data) can be done to external ZRS storage, so service availability is not dependent on external storage.
Please direct questions to one of the forums documented here: https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-support#post-a-question-to-microsoft-qa. In short, as long as the restrictions do not apply to your scenario, we recommend using host encryption as called out in the documentation.
Classic is the currently supported path for this, but we hear you that you want to use managed clusters, and we will continue to think about ways to address this.
I did some rough math on VMSS SKUs with and without a temp disk, per month, using the calculator. I'd be glad to discuss more offline if it would be helpful. In summary, there are newer SKUs that are now GA and are cheaper than what was previously available, as they do not have a temp disk. Example:

- VM SKU with temp disk (100 GB): ~$184
- D2 v5 (2 vCPU, 8 GB RAM, no temp disk): ~$149, or ~$167 total with a managed disk, compared to $184

With Dv5 and a managed disk you get more RAM and storage, but you are correct that there is a lower ceiling for IOPS. We haven't heard feedback where this has come into play with real-world workloads yet, but if you have any data to share, that would be helpful.
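The rough math above can be sketched as follows (all figures are the approximate monthly prices quoted in this comment, not current Azure pricing; the managed-disk add-on is inferred from the $149 vs. $167 difference):

```python
# Napkin cost comparison from the figures quoted above.
# Illustrative only; not current Azure pricing.
vm_with_temp_disk = 184.0        # SKU with ~100 GB local temp disk
vm_no_temp_disk = 149.0          # newer D2 v5 SKU without temp disk
managed_disk_addon = 167.0 - 149.0  # implied managed-disk cost (~$18)

sfmc_total = vm_no_temp_disk + managed_disk_addon
savings = vm_with_temp_disk - sfmc_total
print(f"VM + managed disk: ~${sfmc_total:.0f}/mo")
print(f"Temp-disk SKU:     ~${vm_with_temp_disk:.0f}/mo (difference ~${savings:.0f})")
```

On these numbers the managed-disk route is slightly cheaper per month, which is the commenter's point; the open question is whether the lower IOPS ceiling matters for a given workload.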
Not sure I follow the concern about availability. Each VM has its own managed disk, and if you have a zone-resilient cluster and experience a zone outage (say, az01 goes down), az02/az03 would still be up and fully operational, given that the VMs and disks are localized to each zone.
I asked this question earlier in the thread, but I don't think it ever got an answer.
And now you’ve just confirmed with
I strongly suspect there is confusion here from people with experience of non-managed clusters who have not yet understood that you still use one managed disk per node; a managed disk is not shared across all nodes. Are writes still performed and committed using the same quorum semantics as on non-managed clusters? It might be worth making that point very explicit in the public documentation of managed clusters, as it was never clear to me until now. I also asked the following question, but never got an answer:
I would still hypothesise that the disaggregated-architecture innovation would raise the ceiling of what is possible with Managed Disks…? Any noise coming from Azure on disaggregated architecture being utilised yet?
Ah, I see. I agree it would be helpful to show how we wire it up, to help clarify: a diagram depicting the disk > VM > node type > cluster relationship. In text form, I have a managed cluster with two node types, with NT2 here as an example. As you can see, they are unique disks. We support creating and attaching many per VM (a...z, basically) with the latest preview API.
SFMC does not modify the way the Service Fabric runtime behaves and leverages the exact same bits. The semantics you are familiar with are still the same.
Ack :). Thank you
We haven't had the time to benchmark managed clusters yet, but the cost/perf disadvantage seems obvious:

- SF: total per VM: $159
- SFMC: OS disk Standard SSD E4, data disk Premium SSD P40, total per VM: $406.50

As you can see, SFMC costs over twice as much and still has fewer IOPS. Our workload requires a performant disk (a lot of IIoT devices constantly logging). SFMC makes no sense unless there is something obviously wrong with my napkin math. The non-temp-disk variant of the v5 VM is 14 dollars cheaper, and 14 dollars is not enough for performant premium storage, only a poor one. Of course we could try and see what we get from the cheapish P10 (500 IOPS) disk, but my assumption would be that our ability to process data would be severely degraded compared to the v5 VM temp disk, and still we would pay a few extra dollars. Considering the P10 disk is itself replicated storage, I'd find it really strange if it were on par with simple local physical storage on the VM.
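The napkin math above can be checked directly (figures as quoted in this comment, not current Azure pricing):

```python
# Per-VM monthly cost comparison from the napkin math above.
# Figures are as quoted in the comment; illustrative only.
sf_total_per_vm = 159.0     # classic SF: VM using local temp disk
sfmc_total_per_vm = 406.50  # SFMC: VM + Standard SSD E4 OS disk + Premium SSD P40 data disk

ratio = sfmc_total_per_vm / sf_total_per_vm
print(f"SFMC costs {ratio:.2f}x the classic SF setup per VM")
```

The ratio comes out above 2.5x, which supports the "over twice as much" claim, though the comparison mixes disk tiers with very different IOPS characteristics.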
There is now a useEphemeralOSDisk property on SFMC: https://learn.microsoft.com/en-us/azure/service-fabric/how-to-managed-cluster-ephemeral-os-disks The only problem is the managed data disk letter: it cannot use the reserved letter C or D, and it cannot be changed after creation. So we still need a managed disk for stateful services :(
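For context, the property mentioned above is set on the managed cluster node type resource. A minimal ARM template fragment might look like the following; the resource name, API version, VM size, and disk values here are illustrative assumptions, not copied from the linked doc:

```json
{
  "type": "Microsoft.ServiceFabric/managedClusters/nodeTypes",
  "apiVersion": "2022-01-01",
  "name": "[concat(parameters('clusterName'), '/NT1')]",
  "properties": {
    "isPrimary": true,
    "vmSize": "Standard_D8s_v3",
    "useEphemeralOSDisk": true,
    "dataDiskSizeGB": 256,
    "dataDiskLetter": "S"
  }
}
```

Note how the data disk letter must avoid the reserved C and D letters, which is the constraint the comment above is pointing out.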
Storing state reliably and locally (with automatic replication) on the virtual machines is a powerful and truly differentiating feature of Service Fabric. Combined with the ephemeral OS disks feature, Service Fabric clusters can function entirely without any dependency on external storage.
As the SF team should well know, this kind of system has multiple benefits, including:
I find it difficult to understand why managed clusters have abandoned this core differentiating feature of Service Fabric. Yes, there are tradeoffs in each model, but the customer should have the choice. I came to Service Fabric because of the powerful stateful services story, which I feel is somewhat crippled in managed clusters.
Please consider enabling managed cluster scenario where all data (and OS) is kept local to the VM.
Please correct me if I have understood something wrong about managed clusters.