[Cluster launcher] [vSphere] dynamic passthrough gpu support (#41087)
ESXi 7.0 introduced Dynamic DirectPath IO, an additional option for GPU passthrough.

DirectPath IO allows a physical PCIe device to be mapped directly to a VM, as does Dynamic DirectPath IO. The main difference is that DirectPath IO gives the VM direct access to the physical device at all times, whereas Dynamic DirectPath IO gives the VM access to the physical device only while the VM is powered on.

This commit adds support for Dynamic DirectPath IO. To enable it, add the following to the cluster config:

provider:
    ...
    vsphere_config:    
      .....
      gpu_config:
        dynamic_pci_passthrough: True
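
For illustration, once the cluster YAML is loaded, the fragment above would correspond to a nested dictionary like the one below. This is a minimal sketch, not code from the commit: only the key names `provider`, `vsphere_config`, `gpu_config`, and `dynamic_pci_passthrough` come from the message, and the helper `dynamic_passthrough_requested` is a hypothetical name used here to show how the flag can be read.

```python
# Sketch of the loaded cluster config. Only the keys named in the commit
# message are taken from the source; the elided entries ("...") are omitted.
cluster_config = {
    "provider": {
        "vsphere_config": {
            "gpu_config": {"dynamic_pci_passthrough": True},
        },
    },
}


def dynamic_passthrough_requested(config: dict) -> bool:
    """Hypothetical helper: read the flag from the provider section."""
    vsphere_config = config["provider"]["vsphere_config"]
    # Treat a missing or empty gpu_config as "not requested".
    gpu_config = vsphere_config.get("gpu_config") or {}
    return bool(gpu_config.get("dynamic_pci_passthrough", False))


print(dynamic_passthrough_requested(cluster_config))
```

Leaving out `gpu_config` altogether makes the sketch return `False`.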

Signed-off-by: Chen Hui <huchen@vmware.com>
huchen2021 authored and pull[bot] committed Apr 3, 2024
1 parent daf9a49 commit 3965040
Showing 4 changed files with 299 additions and 145 deletions.
21 changes: 21 additions & 0 deletions python/ray/autoscaler/_private/vsphere/config.py
@@ -96,6 +96,17 @@ def validate_frozen_vm_configs(conf: dict):
)


def update_gpu_config_in_provider_section(
    config, head_node_config, worker_node_configs
):
    provider_config = config["provider"]
    vsphere_config = provider_config["vsphere_config"]
    if "gpu_config" in vsphere_config:
        head_node_config["gpu_config"] = vsphere_config["gpu_config"]
        for worker_node_config in worker_node_configs:
            worker_node_config["gpu_config"] = vsphere_config["gpu_config"]


def check_and_update_frozen_vm_configs_in_provider_section(
    config, head_node_config, worker_node_configs
):
@@ -208,6 +219,8 @@ def update_vsphere_configs(config):
        config, head_node_config, worker_node_configs
    )

    update_gpu_config_in_provider_section(config, head_node_config, worker_node_configs)


def configure_key_pair(config):
    logger.info("Configuring keys for Ray Cluster Launcher to ssh into the head node.")
@@ -231,3 +244,11 @@ def configure_key_pair(config):
    config["file_mounts"][public_key_remote_path] = PUBLIC_KEY_PATH

    return config


def is_dynamic_passthrough(node_config):
    if "gpu_config" in node_config:
        gpu_config = node_config["gpu_config"]
        if gpu_config and gpu_config["dynamic_pci_passthrough"]:
            return True
    return False
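
The two helpers added in this diff can be exercised together as in the sketch below. It rests on several assumptions: the propagation helper is re-declared locally under the shortened, hypothetical name `copy_gpu_config` rather than imported from Ray's private module, the node configs are empty placeholders, and `is_dynamic_passthrough_tolerant` is a hypothetical variant (not part of the commit) that uses `.get` so a `gpu_config` without the `dynamic_pci_passthrough` key yields `False` instead of raising `KeyError`.

```python
def copy_gpu_config(config, head_node_config, worker_node_configs):
    # Local re-declaration of the propagation logic shown in the diff:
    # copy the provider-level gpu_config into every node config.
    vsphere_config = config["provider"]["vsphere_config"]
    if "gpu_config" in vsphere_config:
        head_node_config["gpu_config"] = vsphere_config["gpu_config"]
        for worker_node_config in worker_node_configs:
            worker_node_config["gpu_config"] = vsphere_config["gpu_config"]


def is_dynamic_passthrough_tolerant(node_config):
    # Hypothetical variant of is_dynamic_passthrough: a missing gpu_config,
    # a None value, or a missing flag all count as "not enabled".
    gpu_config = node_config.get("gpu_config") or {}
    return bool(gpu_config.get("dynamic_pci_passthrough", False))


config = {
    "provider": {
        "vsphere_config": {"gpu_config": {"dynamic_pci_passthrough": True}}
    }
}
head = {}              # placeholder head node config
workers = [{}, {}]     # placeholder worker node configs

copy_gpu_config(config, head, workers)

print(is_dynamic_passthrough_tolerant(head))
print([is_dynamic_passthrough_tolerant(w) for w in workers])
```

As written, every node config receives a reference to the same `gpu_config` dictionary, so changing it through one node changes it for all of them.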