Description
The following is described in the node roles doc:
## Dedicated GPU & CPU Nodes
Separate nodes into those that:
* Run GPU workloads
* Run CPU workloads
* Do not run Run:ai at all. These jobs will not be monitored using the Run:ai Administration User interface.
This is actually not true: all nodes in the cluster are displayed under the Nodes
tab in the Administration UI, including Run:ai worker nodes, Run:ai system nodes, regular workers, and cluster masters.
Any node that contains GPUs and has DCGM exporting metrics from it counts as a "GPU node" in the Overview
dashboard.
That includes nodes that do not have the runai-container-toolkit
and runai-container-toolkit-exporter
DaemonSets running on them; no Run:ai pod will be scheduled on such nodes, yet they are still counted.
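One way to see which nodes the Overview dashboard would count is to list nodes by advertised GPU capacity. The sketch below assumes the standard `nvidia.com/gpu` extended resource name exposed by the NVIDIA device plugin; the `gpu_nodes` helper is hypothetical, not part of Run:ai.

```shell
# Print the names of nodes that advertise at least one NVIDIA GPU.
# Expects "NAME GPUS" columns on stdin; nodes without the resource
# show up as "<none>" and are skipped, as are nodes reporting 0 GPUs.
gpu_nodes() {
  awk 'NR > 1 && $2 != "<none>" && $2 + 0 > 0 { print $1 }'
}

# Query the cluster only when kubectl is available.
# nvidia.com/gpu is an assumption; adjust if your cluster uses
# a different extended resource name.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,GPUS:.status.capacity.nvidia\.com/gpu' \
    | gpu_nodes
fi
```

Comparing this list against the nodes where the runai-container-toolkit DaemonSet actually runs shows which "GPU nodes" are counted without being schedulable for Run:ai pods.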
Review node names using `kubectl get nodes`. For each such node, run:
```
runai-adm set node-role --gpu-worker <node-name>
```
or
```
runai-adm set node-role --cpu-worker <node-name>
```
Nodes not marked as GPU worker or CPU worker will not run Run:ai at all.
That is also not true: nodes marked neither as GPU workers nor as CPU workers will still run any kind of Run:ai workload.
The same behavior occurs when both roles are assigned to a node.