Does Kubeflow work well in a multi-cluster setup, i.e. when multiple Kubeflow installations in different data centers / regions connect to the same MySQL instance (cross-region) and a multi-region GCS bucket?
Kubeflow Cluster 1 - US East
Kubeflow Cluster 2 - US West
MySQL State store (Active (US East) -> Passive (US West))
GCS (multi region)
An additional load balancer at the cross-region level distributes the traffic. Obviously, all settings need to be applied on both clusters.
Is this setup recommended?
Does it produce any inconsistency when one or more components are running, e.g. Kubeflow Pipelines, where steps are executed and kfp clients distribute traffic between the two clusters?
Any Kubeflow cluster setting change has to be applied to each cluster independently. Are there any other caveats in a multi-cluster setup? I couldn't find related details in the Kubeflow documentation.
Can both clusters be Active / Active, or should it be Active / Passive?
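To make the Active / Passive option concrete, here is a minimal client-side sketch of what I have in mind. It is only an illustration under assumptions: the endpoint URLs, the `Cluster` type, and the `pick_cluster` helper are hypothetical names of mine, not part of Kubeflow or the kfp SDK, and the health check is left as a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Cluster:
    name: str       # region label, e.g. "us-east"
    endpoint: str   # hypothetical Kubeflow Pipelines API URL for this cluster
    priority: int   # lower value is preferred; 0 marks the active cluster


# Hypothetical endpoints (assumption): replace with the real per-cluster URLs.
CLUSTERS = (
    Cluster("us-east", "https://kubeflow.us-east.example.com/pipeline", 0),
    Cluster("us-west", "https://kubeflow.us-west.example.com/pipeline", 1),
)


def pick_cluster(
    clusters: Sequence[Cluster],
    is_healthy: Callable[[Cluster], bool],
) -> Optional[Cluster]:
    """Return the highest-priority healthy cluster, or None if all are down."""
    for cluster in sorted(clusters, key=lambda c: c.priority):
        if is_healthy(cluster):
            return cluster
    return None


if __name__ == "__main__":
    # With every cluster healthy, the active one (priority 0) is chosen.
    chosen = pick_cluster(CLUSTERS, lambda c: True)
    print(chosen.name if chosen else "no healthy cluster")
```

The chosen endpoint would then be what each client passes as `host` when constructing `kfp.Client`, so that a given run is submitted and polled against a single cluster rather than being spread across both by the load balancer.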
@senthilsivanath Thanks for your question. East/West Active/Active replication of your entire Kubeflow data science environment will often raise cost and performance concerns. We could whiteboard your use cases and architect a data management solution to meet your RTO/RPO, performance, and budget requirements. Arrikto is a leading code contributor to Kubeflow, and its data management solution provides a full-featured, standards-based, scale-out architecture, which is described here: https://www.arrikto.com/rok-data-management/. I would be glad to set up some time to discuss with our SMEs.
@senthilsivanath Checking back to see if you would like to discuss this architecture. You might check out Rok (and the Rok Registry), which are based on a K8s storage class and enable your data science environment (ML code, data, metadata, and dependencies) to be re-created in another cluster: https://www.arrikto.com/rok-data-management/