kiukchung
Contributor

Summary:
TorchX's `status` API returns a struct that has a `replica.hostname` field. However, not every scheduler fills it in. meta-pytorch/torchx#1080 makes the Slurm scheduler in TorchX fill in the hostname information.

This PR adds a `hostnames` field to `monarch.tools.mesh_spec.MeshSpec` and populates it with the hostnames returned by TorchX.

This information will be used in PR (5/n) to implement a `TorchXAllocator`.
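
To make the data flow concrete, here is a minimal sketch of copying per-replica hostnames from a status struct into a mesh spec. It is an illustration under stated assumptions, not the code in this PR: `ReplicaStatus`, `RoleStatus`, and `MeshSpec` below are simplified stand-ins defined locally, and `fill_hostnames`, `name`, and `num_hosts` are hypothetical names. Only the `hostnames` field and the `replica.hostname` source are taken from the summary above.

```python
from dataclasses import dataclass, field
from typing import List


# Stand-ins for the status struct. These are NOT TorchX's actual classes;
# they only model the `replica.hostname` field mentioned in the summary.
@dataclass
class ReplicaStatus:
    id: int
    hostname: str = ""  # may be empty when the scheduler does not report it


@dataclass
class RoleStatus:
    role: str
    replicas: List[ReplicaStatus] = field(default_factory=list)


# Simplified stand-in for monarch's MeshSpec. Only `hostnames` comes from
# the PR text; `name` and `num_hosts` are assumed for illustration.
@dataclass
class MeshSpec:
    name: str
    num_hosts: int
    hostnames: List[str] = field(default_factory=list)


def fill_hostnames(mesh: MeshSpec, role_status: RoleStatus) -> MeshSpec:
    """Copy the reported hostnames into the mesh spec, ordered by replica id.

    Replicas with an empty hostname are skipped, since not every scheduler
    fills the field in.
    """
    replicas = sorted(role_status.replicas, key=lambda r: r.id)
    mesh.hostnames = [r.hostname for r in replicas if r.hostname]
    return mesh
```

A caller could compare `len(mesh.hostnames)` against `mesh.num_hosts` to detect a scheduler that reported only some of the hostnames, or none.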

Differential Revision: D76847192

@facebook-github-bot added the CLA Signed label Jun 17, 2025
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D76847192

facebook-github-bot pushed a commit that referenced this pull request Jun 17, 2025
…ion and hostnames to mesh_spec (#296)

Summary:

TorchX's `status` API returns a struct that has a `replica.hostname` field. However, not every scheduler fills it in. meta-pytorch/torchx#1080 makes the Slurm scheduler in TorchX fill in the hostname information.

This PR adds a `hostnames` field to `monarch.tools.mesh_spec.MeshSpec` and populates it with the hostnames returned by TorchX.

This information will be used in PR (5/n) to implement a `TorchXAllocator`.

Differential Revision: D76847192
kiukchung added a commit that referenced this pull request Jun 18, 2025
…ion and hostnames to mesh_spec (#296)

Summary:

TorchX's `status` API returns a struct that has a `replica.hostname` field. However, not every scheduler fills it in. meta-pytorch/torchx#1080 makes the Slurm scheduler in TorchX fill in the hostname information.

This PR adds a `hostnames` field to `monarch.tools.mesh_spec.MeshSpec` and populates it with the hostnames returned by TorchX.

This information will be used in PR (5/n) to implement a `TorchXAllocator`.

Reviewed By: suo

Differential Revision: D76847192
kiukchung added a commit that referenced this pull request Jun 18, 2025
…ion and hostnames to mesh_spec (#296)

Summary:
Pull Request resolved: #296

TorchX's `status` API returns a struct that has a `replica.hostname` field. However, not every scheduler fills it in. meta-pytorch/torchx#1080 makes the Slurm scheduler in TorchX fill in the hostname information.

This PR adds a `hostnames` field to `monarch.tools.mesh_spec.MeshSpec` and populates it with the hostnames returned by TorchX.

This information will be used in PR (5/n) to implement a `TorchXAllocator`.

Reviewed By: suo

Differential Revision: D76847192
facebook-github-bot pushed a commit that referenced this pull request Jun 18, 2025
…ion and hostnames to mesh_spec (#296)
@facebook-github-bot
Contributor

This pull request has been merged in 844e0bb.

Labels
CLA Signed, fb-exported, Merged