Figure out how to do multi-node jobs and to compute the load of multi-node systems #6
I think we already have what we need to compute cross-node utilization (your first point), but sonar does not currently capture any data about communication (your second point), be it volume or topography. It is a sampling profiler, and its only means of sampling is to probe system tables. (I'll add communication volume to the set of use cases.)
Technical quirk: with the synthesized job IDs (as on the ML nodes) there's a risk that the same PID is used as the job ID on two different machines in an overlapping timeframe, even though these are two different jobs. It's important for sonalyze not to be confused by this. In the case where we're interacting with a batch queue, I think there will be a command line argument to sonalyze that identifies the system as such, e.g. by pointing to a data directory. The default, in the absence of such a switch, should be to treat hosts as independent. In a query that runs against the logs of multiple hosts, the same job ID may thus appear multiple times in a listing, but it is always relative to the host; the consumer of the data must be aware of this.
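The host-relative treatment above can be sketched as keying jobs by the pair (hostname, job ID) rather than by job ID alone. This is a minimal illustration, not sonalyze's actual code; the record type and field names are assumptions.

```rust
use std::collections::HashMap;

// Hypothetical sample record; the field names are assumptions for
// illustration, not sonar's actual schema.
#[derive(Debug)]
struct SampleRecord {
    hostname: String,
    job_id: u32,
    cpu_pct: f64,
}

/// Group samples so that the same synthesized job ID observed on two
/// different hosts is treated as two distinct jobs.
fn group_by_job(records: Vec<SampleRecord>) -> HashMap<(String, u32), Vec<SampleRecord>> {
    let mut jobs: HashMap<(String, u32), Vec<SampleRecord>> = HashMap::new();
    for r in records {
        jobs.entry((r.hostname.clone(), r.job_id)).or_default().push(r);
    }
    jobs
}

fn main() {
    // PID 4242 reused as a job ID on two hosts in the same timeframe:
    // two entries, not one.
    let recs = vec![
        SampleRecord { hostname: "ml1".to_string(), job_id: 4242, cpu_pct: 50.0 },
        SampleRecord { hostname: "ml2".to_string(), job_id: 4242, cpu_pct: 75.0 },
    ];
    let jobs = group_by_job(recs);
    println!("distinct jobs: {}", jobs.len());
}
```

With a batch queue, the key would instead collapse to the queue's global job ID, which is what the proposed command line switch would select.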
This is pretty much done now; I'm just doing some final testing and will then merge. I'll cut NordicHPC/sonar#67 loose; it doesn't need to block this bug and can come later. There are other mop-up issues too, like #54, but again, not really blocking us here.
Fixed, for now. We'll file additional things as follow-up bugs.
For the ML and light-HPC systems there's at most one node per job, but this is not true on the bigger systems: there, jobs can span multiple nodes. The sonar records will carry the same job ID (these are SLURM jobs), so we'll collect records into jobs properly. But there's the matter of filtering and printing the node names sensibly, as well as computing and presenting the cross-node load. For system-relative load data we must also, in some way, account for the capacities of the individual nodes when computing proper values; it's not enough to sum things and hope for the best.
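To make the last point concrete, here is a minimal sketch of a capacity-weighted cross-node load: sum usage and capacity over the job's nodes and divide, rather than averaging per-node percentages, so that nodes of different sizes are weighted correctly. The type and field names are assumptions for illustration, not sonalyze's actual code.

```rust
// Hypothetical per-node figures for one job; names are assumptions.
struct NodeUsage {
    cores_used: f64,  // average cores kept busy by the job on this node
    cores_total: f64, // cores available on this node
}

/// System-relative load of a multi-node job. Summing before dividing
/// weights each node by its capacity; averaging the per-node ratios
/// would overweight small nodes.
fn cross_node_load(nodes: &[NodeUsage]) -> f64 {
    let used: f64 = nodes.iter().map(|n| n.cores_used).sum();
    let total: f64 = nodes.iter().map(|n| n.cores_total).sum();
    if total == 0.0 { 0.0 } else { used / total }
}

fn main() {
    // A job spanning one 64-core node (half busy) and one 32-core node
    // (half busy): 48 of 96 cores in use overall.
    let nodes = [
        NodeUsage { cores_used: 32.0, cores_total: 64.0 },
        NodeUsage { cores_used: 16.0, cores_total: 32.0 },
    ];
    println!("load = {:.2}", cross_node_load(&nodes));
}
```

The same shape works for memory or GPU load; only the numerator and denominator change.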
Evolving task list: