.\" -*- nroff -*-
.\" Copyright © 2010-2023 Inria. All rights reserved.
.\" Copyright © 2010 Université of Bordeaux
.\" Copyright © 2009-2010 Cisco Systems, Inc. All rights reserved.
.\" See COPYING in top-level directory.
.TH HWLOC "7" "%HWLOC_DATE%" "%PACKAGE_VERSION%" "%PACKAGE_NAME%"
.SH NAME
hwloc - General information about hwloc ("hardware locality").
.
.\" **************************
.\" Description Section
.\" **************************
.SH DESCRIPTION
.
hwloc provides command line tools and a C API to obtain the
hierarchical map of key computing elements, such as: NUMA memory
nodes, shared caches, processor packages, processor cores, and
processor "threads". hwloc also gathers various attributes such as
cache and memory information, and is portable across a variety of
different operating systems and platforms.
.
.
.SS Definitions
hwloc has some specific definitions for terms that are used in this
man page and other hwloc documentation.
.
.TP 5
.B hwloc CPU set:
A set of processors included in an hwloc object, expressed as a bitmask
indexed by the physical numbers of the CPUs (as announced by the OS).
The hwloc definition
of "CPU set" does not carry any of the same connotations as Linux's "CPU
set" (e.g., process affinity, cgroup, etc.).
.
.TP
.B hwloc node set:
A set of NUMA memory nodes near an hwloc object, expressed as a bitmask
indexed by the physical numbers of the NUMA nodes (as announced by the OS).
.
.TP
.B Linux CPU set:
See http://www.mjmwired.net/kernel/Documentation/cpusets.txt for a
discussion of Linux CPU sets. A
super-short-ignoring-many-details description (taken from that page)
is:
.br
.br
"Cpusets provide a mechanism for assigning a set of CPUs and Memory
Nodes to a set of tasks."
.
.TP
.B Linux Cgroup:
See http://www.mjmwired.net/kernel/Documentation/cgroups.txt for a
discussion of Linux control groups. A
super-short-ignoring-many-details description (taken from that page)
is:
.br
.br
"Control Groups provide a mechanism for aggregating/partitioning sets
of tasks, and all their future children, into hierarchical groups
with specialized behaviour."
.
.PP
To be clear, hwloc supports all of the above concepts. It is simply
worth noting that they are different things.
.
.SS Location Specification
.
Locations refer to specific regions within a topology. Before reading
the rest of this man page, it may be useful to read lstopo(1) and/or
run lstopo on your machine to see the reported topology tree. Seeing
and understanding a topology tree will definitely help in
understanding the concepts that are discussed below.
.
.PP
Locations can be specified in multiple ways:
.
.TP 10
.B Tuples:
Tuples of hwloc "objects" and associated indexes can be specified in
the form
.IR object:index .
hwloc objects represent types of mapped items (e.g., packages, cores,
etc.) in a topology tree; indexes are non-negative integers that
specify a unique physical object in a topology tree. Both concepts
are described in detail, below.
.br
.br
Some filters may be added after the type to further specify which
objects are wanted.
\fI<type>[subtype=<subtype>]\fR selects objects matching the given
type and also its subtype string attribute.
For instance \fINUMA[HBM]\fR selects NUMA nodes of subtype "HBM".
The prefix \fIsubtype=\fR may be avoided if there is no ambiguity.
\fINUMA[tier=X]\fR selects NUMA nodes of tier <X> ("MemoryTier" info attribute).
\fIOS[Net]\fR also selects OS devices whose OS-device-specific type is Network.
.br
.br
Indexes may also be specified as ranges.
\fIx-y\fR enumerates from index \fIx\fR to \fIy\fR.
\fIx:y\fR enumerates \fIy\fR objects starting from index \fIx\fR
(wrapping around the end of the index range if needed).
\fIx-\fR enumerates all objects starting from index \fIx\fR.
\fIall\fR, \fIodd\fR, and \fIeven\fR are also supported for listing
all objects, or only those with odd or even indexes.
.br
.br
Chaining multiple tuples together in the more general form
.I object1:index[.object2:index2[...]]
is permissible. While the first tuple's object may appear anywhere in
the topology, the Nth tuple's object must have a shallower topology
depth than the (N+1)th tuple's object. Put simply: as you move right
in a tuple chain, objects must go deeper in the topology tree.
When using logical indexes (which is the default),
indexes specified in chained tuples are relative to the scope of the
parent object. For example, "package:0.core:1" refers to the second
core in the first package.
.br
.br
When using OS/physical indexes, the first object matching the given
index is used.
.br
.br
PCI and OS devices may also be designated using their identifier.
For example, "\fBpci=02:03.1\fR" is the PCI device with bus ID "02:03.1".
.
"\fBos=eth0\fR" is the network interface whose software name is "eth0".
.
PCI devices may also be filtered based on their vendor and/or device IDs,
for instance "\fBpci[15b3:]:2\fR" for the third Mellanox PCI device (vendor ID 0x15b3).
.
OS devices may also be filtered based on their subtype,
for instance "\fBos[gpu]:all\fR" for all GPU OS devices.
.
.TP
.B Hex:
For tools that manipulate objects as sets (e.g. hwloc-calc and hwloc-bind),
locations can also be specified as hexadecimal bitmasks prefixed
.
with "0x". Commas must be used to separate the hex digits into blocks
of 8, such as "0xffc0140,0x00020110".
.
Leading zeros in each block do not need to be specified.
.
For example, "0xffc0140,0x20110" is equivalent to the prior example,
and "0x0000000f" is exactly equivalent to "0xf". Intermediate blocks
of 8 digits that are all zero can be left empty; "0xff0,,0x13" is
equivalent to "0xff0,0x00000000,0x13".
.
If the location is prefixed with the special string "0xf...f", then
all unspecified bits are set (as if the set were infinite). For
example, "0xf...f,0x1" sets both the first bit and all bits starting
with the 33rd. The string "0xf...f" -- with no other specified values
-- sets all bits.
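.PP
The hexadecimal syntax can be illustrated with a short Python sketch.
It is not how hwloc itself parses locations: the function names
.I parse_hex_location
and
.I bit_is_set
are hypothetical, and the sketch assumes that each block holds at most
8 hex digits (32 bits).
.PP
.nf
```python
def parse_hex_location(location):
    """Return (bits, infinite, nblocks) for a hex location string.

    bits holds the explicitly listed 32-bit blocks as one integer,
    infinite is True when the string starts with "0xf...f", and
    nblocks is the number of explicit blocks.
    """
    blocks = location.split(",")
    infinite = blocks[0] == "0xf...f"
    if infinite:
        blocks = blocks[1:]
    bits = 0
    for block in blocks:
        # An empty block stands for 32 zero bits.
        value = int(block, 16) if block else 0
        if value >= 1 << 32:
            raise ValueError("block wider than 8 hex digits: " + block)
        bits = (bits << 32) | value
    return bits, infinite, len(blocks)


def bit_is_set(location, index):
    """Tell whether bit number index (starting from 0) is set."""
    bits, infinite, nblocks = parse_hex_location(location)
    if index >= 32 * nblocks:
        # Beyond the explicit blocks, only the infinite prefix sets bits.
        return infinite
    return (bits >> index) & 1 == 1
```
.fi
.PP
Under these assumptions, "0xff0,,0x13" and "0xff0,0x00000000,0x13"
parse to the same value, and bit 32 (the 33rd bit) of "0xf...f,0x1" is
reported as set while bit 1 is not.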
.
.PP
"all" and "root" are special locations that both designate the root
object of the tree, which contains the entire current topology.
.
.PP
Some tools directly operate on these objects (e.g. hwloc-info and hwloc-annotate).
They do not support hexadecimal locations because each location may
correspond to multiple objects.
For instance, when there is exactly one L3 cache per package and per NUMA node,
these objects all cover the same location.
.
If multiple locations are given on the command-line,
these tools will operate on each location individually and consecutively.
.
.PP
Some other tools internally manipulate objects as sets (e.g. hwloc-calc and hwloc-bind).
They translate each input location into a hexadecimal location.
When I/O or Misc objects are used, they are translated into the set
of processors (or NUMA nodes) that are close to the given object
(because I/O or Misc objects do not contain processors or NUMA nodes).
.
.PP
If multiple locations are specified on the command-line (delimited by whitespace),
they are combined (the overall location is wider).
.
If prefixed with "~", the given location
will be cleared instead of added to the current list of locations. If
prefixed with "x", the given location will be and'ed instead of added
to the current list. If prefixed with "^", the given location will be
xor'ed.
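.PP
The effect of these prefixes can be sketched in Python as plain bitmask
arithmetic. The function name
.I combine_locations
is hypothetical, and the sketch assumes that each location has already
been translated into an integer bitmask and that locations are
processed from left to right.
.PP
.nf
```python
def combine_locations(locations):
    """Fold (prefix, mask) pairs, in order, into a single bitmask."""
    result = 0
    for prefix, mask in locations:
        if prefix == "":
            result |= mask       # no prefix: add to the current set
        elif prefix == "~":
            result &= ~mask      # clear the given bits
        elif prefix == "x":
            result &= mask       # keep only the bits that are in both
        elif prefix == "^":
            result ^= mask       # toggle the given bits
        else:
            raise ValueError("unknown prefix: " + prefix)
    return result
```
.fi
.PP
For instance, combining 0xf with "~" applied to 0x3 leaves 0xc, and
combining 0xf with "x" applied to 0x6 leaves 0x6.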
.
.PP
More complex operations may be performed by using
.IR hwloc-calc
to compute intermediate values.
.
.SS hwloc Objects
.
.PP
Objects in tuples can be any of the following strings
.
(listed from "biggest" to "smallest"):
.
.TP 10
.B machine
A set of processors and memory.
.
.TP
.B numanode
A NUMA node; a set of processors around memory which the processors
can directly access.
.
If \fBnuma[hbm]\fR is used instead of \fBnumanode\fR in locations,
command-line tools only consider NUMA nodes marked as high-bandwidth memory
(subtype "HBM").
.
.TP
.B package
Typically a physical package or chip that goes into a socket;
it is a grouping of one or more processors.
.
.TP
.B l1cache ... l5cache
A data (or unified) cache.
.
.TP
.B l1icache ... l3icache
An instruction cache.
.
.TP
.B core
A single, physical processing unit which may still contain multiple
logical processors, such as hardware threads.
.
.TP
.B pu
Short for
.I processor unit
(not
.IR process !).
The smallest physical execution unit that hwloc recognizes. For
example, there may be multiple PUs on a core (e.g.,
hardware threads).
.PP
\fBosdev\fR, \fBpcidev\fR, \fBbridge\fR, and \fBmisc\fR may also be used
to specify special devices, although some of them have a dedicated
identification syntax, as explained in \fBLocation Specification\fR.
.
.PP
Finally, note that an object can be denoted by its numeric "depth" in
the topology graph.
.
.SS hwloc Indexes
Indexes are integer values that uniquely specify a given object of a
specific type. Indexes can be expressed either as
.I logical
values or
.I physical
values. Most hwloc utilities accept logical indexes by default.
Passing
.B --physical
switches to physical/OS indexes.
Both logical and physical indexes are described on this man page.
.
.PP
.I Logical
indexes are relative to the object order in the output from the
lstopo command. They always start with 0 and increment by 1 for each
successive object.
.
.PP
.I Physical
indexes are how the operating system refers to objects. Note that
while physical indexes are non-negative integer values, the hardware
and/or operating system may choose arbitrary values -- they may not
start with 0, and successive objects may not have consecutive values.
.
.PP
For example, if the first few lines of lstopo -p output are the
following:
.
.nf
  Machine (47GB)
    NUMANode P#0 (24GB) + Package P#0 + L3 (12MB)
      L2 (256KB) + L1 (32KB) + Core P#0 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#1 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#2 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#8 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#9 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#10 + PU P#0
    NUMANode P#1 (24GB) + Package P#1 + L3 (12MB)
      L2 (256KB) + L1 (32KB) + Core P#0 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#1 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#2 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#8 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#9 + PU P#0
      L2 (256KB) + L1 (32KB) + Core P#10 + PU P#0
.fi
.PP
In this example, the first core on the second package is logically
number 6 (i.e., logically the 7th core, starting from 0). Its
physical index is 0, but note that another core
.I also
has a physical index of 0. Hence, physical indexes may only be
relevant within the scope of their parent (or set of ancestors).
In this example, to uniquely identify logical core 6 with
physical indexes, you must specify (at a minimum) both a package and a
core: package 1, core 0.
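.PP
The relationship between the two numbering schemes in this example can
be sketched in Python. The table below is transcribed by hand from the
lstopo listing above; the names
.IR CORES ,
.IR physical_of ,
.IR logical_cores_with_physical ,
and
.I chained_logical
are hypothetical helpers, not part of hwloc, and the last one assumes
that every package holds the same number of cores.
.PP
.nf
```python
# Physical core numbers inside each package, copied from the listing.
CORE_PHYSICAL = [0, 1, 2, 8, 9, 10]

# Cores in lstopo order; the position in this list is the logical index.
CORES = [(package, core) for package in (0, 1) for core in CORE_PHYSICAL]


def physical_of(logical_core):
    """Return (package P#, core P#) for a logical core index."""
    return CORES[logical_core]


def logical_cores_with_physical(core_physical):
    """Return every logical core index whose physical index matches."""
    return [i for i, (_, core) in enumerate(CORES) if core == core_physical]


def chained_logical(package_logical, core_in_package):
    """Resolve a chained tuple such as package:1.core:0 (logical indexes)."""
    return package_logical * len(CORE_PHYSICAL) + core_in_package
```
.fi
.PP
Here physical_of(6) gives package 1, core 0, while
logical_cores_with_physical(0) returns two logical cores (0 and 6),
which is why a physical core index alone is ambiguous.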
.PP
Index values, regardless of whether they are logical or physical, can
be expressed in several different forms (where X, Y, and N are
non-negative integers):
.
.TP 10
.B X
The object with index value X.
.
.TP
.B X-Y
All the objects with index values >= X and <= Y.
.
.TP
.B X-
All the objects with index values >= X.
.
.TP
.B X:N
N objects starting with index X, possibly wrapping around the end of
the level.
.
.TP
.B all
A special index value indicating all valid index values.
.
.TP
.B odd
A special index value indicating all valid odd index values.
.
.TP
.B even
A special index value indicating all valid even index values.
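.PP
As an illustration, the index forms listed above can be expanded with
a few lines of Python. The function name
.I expand_indexes
is hypothetical; the sketch assumes logical indexes running from 0 to
count-1 with count greater than 0, and it models the wrap-around of
the X:N form with a modulo, which is an assumption about the exact
behavior.
.PP
.nf
```python
def expand_indexes(spec, count):
    """Expand an index specification over logical indexes 0..count-1."""
    valid = range(count)
    if spec == "all":
        return list(valid)
    if spec == "odd":
        return [i for i in valid if i % 2 == 1]
    if spec == "even":
        return [i for i in valid if i % 2 == 0]
    if ":" in spec:
        # X:N gives N objects starting at X, wrapping around the level.
        start, number = (int(part) for part in spec.split(":"))
        return [(start + k) % count for k in range(number)]
    if spec.endswith("-"):
        # X- gives every index >= X.
        first = int(spec[:-1])
        return [i for i in valid if i >= first]
    if "-" in spec:
        # X-Y gives every index >= X and <= Y.
        first, last = (int(part) for part in spec.split("-"))
        return [i for i in valid if first <= i <= last]
    return [int(spec)]
```
.fi
.PP
On a level of 8 objects, "2-4" expands to 2, 3, 4 and "6:3" expands to
6, 7, 0 under the wrap-around assumption stated above.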
.
.PP
.IR REMEMBER :
hwloc's command line tools accept
.I logical
indexes for location values by default.
Use
.BR --physical " and " --logical
to switch from one mode to another.
.
.\" **************************
.\" See also section
.\" **************************
.SH SEE ALSO
.
hwloc's command line tool documentation: lstopo(1), hwloc-bind(1),
hwloc-calc(1), hwloc-distrib(1), hwloc-ps(1).
.
.PP
hwloc has many C API functions, each of which has its own man page.
Some top-level man pages are also provided, grouping similar functions
together. A few good places to start might include:
hwlocality_objects(3), hwlocality_types(3), hwlocality_creation(3),
hwlocality_cpuset(3), hwlocality_information(3), and
hwlocality_binding(3).
.
.PP
For a listing of all available hwloc man pages, look at all "hwloc*"
files in the man1 and man3 directories.