Copyright © 2009 CNRS
Copyright © 2009-2014 Inria. All rights reserved.
Copyright © 2009-2013 Université Bordeaux 1
Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
$COPYRIGHT$
Additional copyrights may follow
$HEADER$
===========================================================================
This file contains the main features as well as overviews of specific
bug fixes (and other actions) for each version of hwloc since version
0.9 (as initially released as "libtopology", then re-branded to "hwloc"
in v0.9.1).
Version 1.10.0
--------------
* API
+ hwloc_distrib() does not ignore any objects anymore when there are
too many of them. They get merged with others instead.
Thanks to Tim Creech for reporting the issue.
+ Add hwloc_topology_export_synthetic() to export a topology to a
synthetic string without using lstopo. See the Synthetic topologies
section in the documentation.
+ Add hwloc_topology_set/get_userdata() to let the application save
a private pointer in the topology whenever it needs a way to find
its own object corresponding to a topology.
* Misc
+ Synthetic topology descriptions may now specify attributes such as
memory sizes and OS indexes. See the Synthetic topologies section
in the documentation.
- lstopo now exports in this fully-detailed format by default.
The new option --export-synthetic-flags may be used to revert
to the old format.
+ Clarify that memory sizes shown in lstopo are local by default
unless specified (total memory added in the root object).
+ Add --disable-cpuid configure flag to work around buggy processor
simulators reporting invalid CPUID information.
Thanks to Andrew Friedley for reporting the issue.
+ Fix a racy use of libltdl when manipulating multiple topologies in
different threads.
Thanks to Andra Hugo for reporting the issue and testing patches.
+ The plugin ABI has changed; this release will not load plugins
built against previous hwloc releases.
Version 1.9.1
-------------
* Fix a crash when the PCI locality is invalid. Attach to the root object
instead. Thanks to Nicolas Denoyelle for reporting the issue.
* Fix -f in lstopo manpage. Thanks to Jirka Hladky for reporting the issue.
* Fix hwloc_obj_type_sscanf() and others when strncasecmp() is not properly
available. Thanks to Nick Papior Andersen for reporting the problem.
* Mark Linux file descriptors as close-on-exec to avoid leaks on exec.
* Fix some minor memory leaks.
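The close-on-exec item above can be illustrated with plain POSIX calls.
This is only a minimal sketch of the general technique, not hwloc's actual
code; the helper name set_cloexec() is hypothetical.

```c
#include <fcntl.h>

/* Hypothetical helper (not part of hwloc): mark an already-open file
 * descriptor as close-on-exec so that it is not inherited by a program
 * started with one of the exec*() functions.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);   /* read the current descriptor flags */

    if (flags < 0)
        return -1;
    /* keep the existing flags and add FD_CLOEXEC */
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

Where the platform supports it, passing O_CLOEXEC directly to open() sets
the flag atomically and avoids the window between open() and fcntl() in
multithreaded programs.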
Version 1.9.0
-------------
* API
+ Add hwloc_obj_type_sscanf() to extend hwloc_obj_type_of_string() with
type-specific attributes such as Cache/Group depth and Cache type.
hwloc_obj_type_of_string() is moved to hwloc/deprecated.h.
+ Add hwloc_linux_get_tid_last_cpu_location() for retrieving the
last CPU where a Linux thread given by TID ran.
+ Add hwloc_distrib() to extend the old hwloc_distribute[v]() functions.
hwloc_distribute[v]() is moved to hwloc/deprecated.h.
+ Don't mix total and local memory when displaying verbose object attributes
with hwloc_obj_attr_snprintf() or in lstopo.
* Backends
+ Add CPUVendor, CPUModelNumber and CPUFamilyNumber info attributes for
x86, ia64 and Xeon Phi sockets on Linux, to extend the x86-specific
support added in v1.8.1. Requested by Ralph Castain.
+ Add many CPU- and Platform-related info attributes on ARM and POWER
platforms, in the Machine and Socket objects.
+ Add CUDA info attributes describing the number of multiprocessors and
cores and the size of the global, shared and L2 cache memories in CUDA
OS devices.
+ Add OpenCL info attributes describing the number of compute units and
the global memory size in OpenCL OS devices.
+ The synthetic backend now accepts extended types such as L2Cache, L1i or
Group3. lstopo also exports synthetic strings using these extended types.
* Tools
+ lstopo
- Do not overwrite output files by default anymore.
Pass -f or --force to enforce it.
- Display OpenCL, CUDA and Xeon Phi numbers of cores and memory sizes
in the graphical output.
- Fix export to stdout when specifying a Cairo-based output type
with --of.
+ hwloc-ps
- Add -e or --get-last-cpu-location to report where processes/threads
run instead of where they are bound.
- Report locations as likely-more-useful objects such as Cores or Sockets
instead of Caches when possible.
+ hwloc-bind
- Fix failure on Windows when not using --pid.
- Add -e as a synonym to --get-last-cpu-location.
+ hwloc-distrib
- Add --reverse to distribute using last objects first and singlify
into last bits first. Thanks to Jirka Hladky for the suggestion.
+ hwloc-info
- Report unified caches when looking for data or instruction cache
ancestor objects.
* Misc
+ Add experimental Visual Studio support under contrib/windows.
Thanks to Eloi Gaudry for his help and for providing the first draft.
+ Fix some overzealous assertions and warnings about the ordering of
objects on a level with respect to cpusets. The ordering is only
guaranteed for complete cpusets (based on the first bit in sets).
+ Fix some memory leaks when importing xml diffs and when exporting a
"too complex" entry.
Version 1.8.1
-------------
* Fix the cpuid code on Windows 64bits so that the x86 backend gets
enabled as expected and can populate CPU information.
Thanks to Robin Scher for reporting the problem.
* Add CPUVendor/CPUModelNumber/CPUFamilyNumber attributes when running
on x86 architecture. Thanks to Ralph Castain for the suggestion.
* Work around buggy BIOS reporting duplicate NUMA nodes on Linux.
Thanks to Jeff Becker for reporting the problem and testing the patch.
* Add a name to the lstopo graphical window. Thanks to Michael Prokop
for reporting the issue.
Version 1.8.0
-------------
* New components
+ Add the "linuxpci" component that always works on Linux even when
libpciaccess and libpci aren't available (and even with a modified
file-system root). By default the old "pci" component runs first
because "linuxpci" lacks device names (obj->name is always NULL).
* API
+ Add the topology difference API in hwloc/diff.h for manipulating
many similar topologies.
+ Add hwloc_topology_dup() for duplicating an entire topology.
+ hwloc.h and hwloc/helper.h have been reorganized to clarify the
documentation sections. The actual inline code has moved out of hwloc.h
into the new hwloc/inlines.h.
+ Deprecated functions are now in hwloc/deprecated.h, and not in the
official documentation anymore.
* Tools
+ Add hwloc-diff and hwloc-patch tools together with the new diff API.
+ Add hwloc-compress-dir to (de)compress an entire directory of XML files
using hwloc-diff and hwloc-patch.
+ Object colors in the graphical output of lstopo may be changed by adding
a "lstopoStyle" info attribute. See CUSTOM COLORS in the lstopo(1) manpage
for details. Thanks to Jirka Hladky for discussing the idea.
+ hwloc-gather-topology may now gather I/O-related files on Linux when
--io is given. Only the linuxpci component supports discovering I/O
objects from these extended tarballs.
+ hwloc-annotate now supports --ri to remove/replace info attributes with
a given name.
+ hwloc-info supports "root" and "all" special locations for dumping
information about the root object.
+ lstopo now supports --append-legend to append custom lines of text
to the legend in the graphical output. Thanks to Jirka Hladky for
discussing the idea.
+ hwloc-calc and friends have a more robust parsing of locations given
on the command-line and they report useful error messages about it.
+ Add --whole-system to hwloc-bind, hwloc-calc, hwloc-distances and
hwloc-distrib, and add --restrict to hwloc-bind for uniformity among
tools.
* Misc
+ Calling hwloc_topology_load() or hwloc_topology_set_*() on an already
loaded topology now returns an error (deprecated since release 1.6.1).
+ Fix the initialisation of cpusets and nodesets in Group objects added
when inserting PCI hostbridges.
+ Never merge Group objects that were added explicitly by the user with
hwloc_custom_insert_group_object_by_parent().
+ Add a sanity check during dynamic plugin loading to prevent some
crashes when hwloc is dynamically loaded by another plugin mechanism.
+ Add --with-hwloc-plugins-path to specify the install/load directories
of plugins.
+ Add the MICSerialNumber info attribute to the root object when running
hwloc inside a Xeon Phi to match the same attribute in the MIC OS device
when running in the host.
Version 1.7.2
-------------
* Do not create invalid block OS devices on very old Linux kernels such
as RHEL4 2.6.9.
* Fix PCI subvendor/device IDs.
* Fix the management of Misc objects inserted by parent.
Thanks to Jirka Hladky for reporting the problem.
* Add a Port<n>State info attribute to OpenFabrics OS devices.
* Add a MICSerialNumber info attribute to Xeon Phi/MIC OS devices.
* Improve verbose error messages when failing to load from XML.
Version 1.7.1
-------------
* Fix a failed assertion in the distance grouping code when loading an XML
file that already contains some groups.
Thanks to Laercio Lima Pilla for reporting the problem.
* Remove unexpected Group objects when loading XML topologies with I/O
objects and NUMA distances.
Thanks to Elena Elkina for reporting the problem and testing patches.
* Fix PCI link speed discovery when using libpciaccess.
* Fix invalid libpciaccess virtual function device/vendor IDs when using
SR-IOV PCI devices on Linux.
* Fix GL component build with old NVCtrl releases.
Thanks to Jirka Hladky for reporting the problem.
* Fix embedding breakage caused by libltdl.
Thanks to Pavan Balaji for reporting the problem.
* Always use the system-wide libltdl instead of shipping one inside hwloc.
* Document issues when enabling plugins while embedding hwloc in another
project, in the documentation section Embedding hwloc in Other Software.
* Add a FAQ entry "How to get useful topology information on NetBSD?"
in the documentation.
* Some fixes in the renaming code for embedding.
* Miscellaneous minor build fixes.
Version 1.7.0
-------------
* New operating system backends
+ Add BlueGene/Q compute node kernel (CNK) support. See the FAQ in the
documentation for details. Thanks to Jeff Hammond, Christopher Samuel
and Erik Schnetter for their help.
+ Add NetBSD support, thanks to Aleksej Saushev.
* New I/O device discovery
+ Add co-processor OS devices such as "mic0" for Intel Xeon Phi (MIC)
on Linux. Thanks to Jerome Vienne for helping.
+ Add co-processor OS devices such as "cuda0" for NVIDIA CUDA-capable GPUs.
+ Add co-processor OS devices such as "opencl0d0" for OpenCL GPU devices
on the AMD OpenCL implementation.
+ Add GPU OS devices such as ":0.0" for NVIDIA X11 displays.
+ Add GPU OS devices such as "nvml0" for NVIDIA GPUs.
Thanks to Marwan Abdellah and Stefan Eilemann for helping.
These new OS devices have some string info attributes such as CoProcType,
GPUModel, etc. to better identify them.
See the I/O Devices and Attributes documentation sections for details.
* New components
+ Add the "opencl", "cuda", "nvml" and "gl" components for I/O device
discovery.
+ "nvml" also improves the discovery of NVIDIA GPU PCIe link speed.
All of these new components may be built as plugins. They may also be
disabled entirely by passing --disable-opencl/cuda/nvml/gl to configure.
See the I/O Devices, Components and Plugins, and FAQ documentation
sections for details.
* API
+ Add hwloc_topology_get_flags().
+ Add hwloc/plugins.h for building external plugins.
See the Adding new discovery components and plugins section.
* Interoperability
+ Add hwloc/opencl.h, hwloc/nvml.h, hwloc/gl.h and hwloc/intel-mic.h
to retrieve the locality of OS devices that correspond to AMD OpenCL
GPU devices or indexes, to NVML devices or indexes, to NVIDIA X11
displays, or to Intel Xeon Phi (MIC) device indexes.
+ Add new helpers in hwloc/cuda.h and hwloc/cudart.h to convert
between CUDA devices or indexes and hwloc OS devices.
+ Add hwloc_ibv_get_device_osdev() and clarify the requirements
of the OpenFabrics Verbs helpers in hwloc/openfabrics-verbs.h.
* Tools
+ hwloc-info is not only a synonym of lstopo -s anymore, it also
dumps information about objects given on the command-line.
* Documentation
+ Add a section "Existing components and plugins".
+ Add a list of common OS devices in section "Software devices".
+ Add a new FAQ entry "Why is lstopo slow?" about lstopo slowness
issues because of GPUs.
+ Clarify the documentation of inline helpers in hwloc/myriexpress.h
and hwloc/openfabrics-verbs.h.
* Misc
+ Improve cache detection on AIX.
+ The HWLOC_COMPONENTS variable now excludes the components whose
names are prefixed with '-'.
+ lstopo --ignore PU now works when displaying the topology in
graphical and textual mode (not when exporting to XML).
+ Make sure I/O options always appear in lstopo usage, not only when
using pciutils/libpci.
+ Remove some unneeded Linux specific includes from some interoperability
headers.
+ Fix some inconsistencies in hwloc-distrib and hwloc-assembler-remote
manpages. Thanks to Guy Streeter for the report.
+ Fix a memory leak on AIX when getting memory binding.
+ Fix many small memory leaks on Linux.
+ The `libpci' component is now called `pci' but the old name is still
accepted in the HWLOC_COMPONENTS variable for backward compatibility.
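The '-' prefix rule for HWLOC_COMPONENTS listed above can be sketched as
follows. This is an illustration only, not hwloc's parser: it assumes the
variable holds a comma-separated list of component names, and the helper
name component_is_excluded() is hypothetical.

```c
#include <string.h>

/* Hypothetical helper (not part of hwloc): return 1 if `name` appears in
 * the comma-separated `list` with a leading '-', meaning the component is
 * excluded; return 0 otherwise (including when `list` is NULL or empty). */
int component_is_excluded(const char *list, const char *name)
{
    size_t namelen = strlen(name);

    while (list && *list) {
        size_t len = strcspn(list, ",");   /* length of the current entry */

        /* an excluded entry is exactly '-' followed by the name */
        if (len == namelen + 1 && list[0] == '-'
            && strncmp(list + 1, name, namelen) == 0)
            return 1;

        list += len;
        if (*list == ',')
            list++;                        /* skip the separator */
    }
    return 0;
}
```

A caller could pass getenv("HWLOC_COMPONENTS") as the list; with the value
"-x86,linux" the helper reports "x86" as excluded and "linux" as not
excluded.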
Version 1.6.2
-------------
* Use libpciaccess instead of pciutils/libpci by default for I/O discovery.
pciutils/libpci is only used if --enable-libpci is given to configure
because its GPL license may taint hwloc. See the Installation section
in the documentation for details.
* Fix get_cpubind on Solaris when bound to a single PU with
processor_bind(). Thanks to Eugene Loh for reporting the problem
and providing a patch.
Version 1.6.1
-------------
* Fix some crashes or buggy detection in the x86 backend when Linux
cgroups/cpusets restrict the available CPUs.
* Fix the pkg-config output with --libs --static.
Thanks to Erik Schnetter for reporting one of the problems.
* Fix the output of hwloc-calc -H --hierarchical when using logical
indexes in the output.
* Calling hwloc_topology_load() multiple times on the same topology
is officially deprecated. hwloc will warn in such cases.
* Add some documentation about existing plugins/components, package
dependencies, and I/O devices specification on the command-line.
Version 1.6.0
-------------
* Major changes
+ Reorganize the backend infrastructure to support dynamic selection
of components and dynamic loading of plugins. For details, see the
new documentation section Components and plugins.
- The HWLOC_COMPONENTS variable lets one replace the default discovery
components.
- Dynamic loading of plugins may be enabled with --enable-plugins
(except on AIX and Windows). It will build libxml2 and libpci
support as separate modules. This helps reduce the dependencies
of the core hwloc library when distributed as a binary package.
* Backends
+ Add CPUModel detection on Darwin and x86/FreeBSD.
Thanks to Robin Scher for providing ways to implement this.
+ The x86 backend now adds CPUModel info attributes to socket objects
created by other backends that do not natively support this attribute.
+ Fix detection on FreeBSD in case of cpuset restriction. Thanks to
Sebastian Kuzminsky for reporting the problem.
* XML
+ Add hwloc_topology_set_userdata_import/export_callback(),
hwloc_export_obj_userdata() and _userdata_base64() to let
applications specify how to save/restore the custom data they placed
in the userdata private pointer field of hwloc objects.
* Tools
+ Add hwloc-annotate program to add string info attributes to XML
topologies.
+ Add --pid-cmd to hwloc-ps to append the output of a command to each
PID line. May be used for showing Open MPI process ranks, see the
hwloc-ps(1) manpage for details.
+ hwloc-bind now exits with an error if binding fails; the executable
is not launched unless binding succeeded or --force was given.
+ Add --quiet to hwloc-calc and hwloc-bind to hide non-fatal error
messages.
+ Fix command-line pid support in windows tools.
+ All programs accept --verbose as a synonym to -v.
* Misc
+ Fix some DIR descriptor leaks on Linux.
+ Fix I/O device lists when some were filtered out after an XML import.
+ Fix the removal of I/O objects when importing an I/O-enabled XML topology
without any I/O topology flag.
+ When merging objects with HWLOC_IGNORE_TYPE_KEEP_STRUCTURE or
lstopo --merge, compare object types before deciding which one of two
identical objects to remove (e.g. keep sockets in favor of caches).
+ Add some GUID- and LID-related info attributes to OpenFabrics
OS devices.
+ Only add CPUType socket attributes on Solaris/Sparc. Other cases
don't report reliable information (Solaris/x86), and a replacement
is available as the Architecture string info in the Machine object.
+ Add missing Backend string info on Solaris in most cases.
+ Document object attributes and string infos in a new Attributes
section in the documentation.
+ Add a section about Synthetic topologies in the documentation.
Version 1.5.2 (some of these changes are in v1.6.2 but not in v1.6)
-------------
* Use libpciaccess instead of pciutils/libpci by default for I/O discovery.
pciutils/libpci is only used if --enable-libpci is given to configure
because its GPL license may taint hwloc. See the Installation section
in the documentation for details.
* Fix get_cpubind on Solaris when bound to a single PU with
processor_bind(). Thanks to Eugene Loh for reporting the problem
and providing a patch.
* Fix some DIR descriptor leaks on Linux.
* Fix I/O device lists when some were filtered out after an XML import.
* Add missing Backend string info on Solaris in most cases.
* Fix the removal of I/O objects when importing an I/O-enabled XML topology
without any I/O topology flag.
* Fix the output of hwloc-calc -H --hierarchical when using logical
indexes in the output.
* Fix the pkg-config output with --libs --static.
Thanks to Erik Schnetter for reporting one of the problems.
Version 1.5.1
-------------
* Fix block OS device detection on Linux kernel 3.3 and later.
Thanks to Guy Streeter for reporting the problem and testing the fix.
* Fix the cpuid code in the x86 backend (for FreeBSD). Thanks to
Sebastian Kuzminsky for reporting problems and testing patches.
* Fix 64bit detection on FreeBSD.
* Fix some corner cases in the management of the thissystem flag with
respect to topology flags and environment variables.
* Fix some corner cases in command-line parsing checks in hwloc-distrib
and hwloc-distances.
* Make sure we do not miss some block OS devices on old Linux kernels
when a single PCI device has multiple IDE hosts/devices behind it.
* Do not disable I/O devices or instruction caches in hwloc-assembler output.
Version 1.5.0
-------------
* Backends
+ Do not limit the number of processors to 1024 on Solaris anymore.
+ Gather total machine memory on FreeBSD.
+ XML topology files do not depend on the locale anymore. Float numbers
such as NUMA distances or PCI link speeds now always use a dot as a
decimal separator.
+ Add instruction caches detection on Linux, AIX, Windows and Darwin.
+ Add get_last_cpu_location() support for the current thread on AIX.
+ Support binding on AIX when threads or processes were bound with
bindprocessor(). Thanks to Hendryk Bockelmann for reporting the issue
and testing patches, and to Farid Parpia for explaining the binding
interfaces.
+ Improve AMD topology detection in the x86 backend (for FreeBSD) using
the topoext feature.
* API
+ Increase HWLOC_API_VERSION to 0x00010500 so that API changes may be
detected at build-time.
+ Add a cache type attribute describing Data, Instruction and Unified
caches. Caches with different types but same depth (for instance L1d
and L1i) are placed on different levels.
+ Add hwloc_get_cache_type_depth() to retrieve the hwloc level depth of
the given cache depth and type, for instance L1i or L2.
It helps disambiguate the case where hwloc_get_type_depth() returns
HWLOC_TYPE_DEPTH_MULTIPLE.
+ Instruction caches are ignored unless HWLOC_TOPOLOGY_FLAG_ICACHES is
passed to hwloc_topology_set_flags() before load.
+ Add hwloc_ibv_get_device_osdev_by_name() OpenFabrics helper in
openfabrics-verbs.h to find the hwloc OS device object corresponding to
an OpenFabrics device.
* Tools
+ Add lstopo-no-graphics, a lstopo built without graphical support to
avoid dependencies on external libraries such as Cairo and X11. When
supported, graphical outputs are only available in the original lstopo
program.
- Packagers splitting lstopo and lstopo-no-graphics into different
packages are advised to use the alternatives system so that lstopo
points to the best available binary.
+ Instruction caches are enabled in lstopo by default. Use --no-icaches
to disable them.
+ Add -t/--threads to show threads in hwloc-ps.
* Removal of obsolete components
+ Remove the old cpuset interface (hwloc/cpuset.h) which is deprecated and
superseded by the bitmap API (hwloc/bitmap.h) since v1.1.
hwloc_cpuset and nodeset types are still defined, but all hwloc_cpuset_*
compatibility wrappers are now gone.
+ Remove Linux libnuma conversion helpers for the deprecated and
broken nodemask_t interface.
+ Remove support for "Proc" type name, it was superseded by "PU" in v1.0.
+ Remove hwloc-mask symlinks, it was replaced by hwloc-calc in v1.0.
* Misc
+ Fix PCIe 3.0 link speed computation.
+ Non-printable characters are dropped from strings during XML export.
+ Fix importing of escaped characters with the minimalistic XML backend.
+ Assert hwloc_is_thissystem() in several I/O related helpers.
+ Fix some memory leaks in the x86 backend for FreeBSD.
+ Minor fixes to ease native builds on Windows.
+ Limit the number of retries when operating on all threads within a
process on Linux if the list of threads is being heavily modified.
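The HWLOC_API_VERSION bump to 0x00010500 in this release is what lets
applications adapt at build-time. The sketch below shows one possible way
to use it. The APP_HAVE_CACHE_TYPES macro and the app_have_cache_types()
helper are hypothetical names, and the fallback definition of
HWLOC_API_VERSION is only a stand-in so that the example compiles without
hwloc installed; real code would get the value from hwloc.h.

```c
/* Stand-in so that this sketch compiles without hwloc installed.
 * Real code would get HWLOC_API_VERSION from #include <hwloc.h>. */
#ifndef HWLOC_API_VERSION
#define HWLOC_API_VERSION 0x00010500
#endif

/* The cache type attribute and instruction caches appeared with API
 * version 0x00010500, so only rely on them when the headers are new
 * enough. */
#if HWLOC_API_VERSION >= 0x00010500
#define APP_HAVE_CACHE_TYPES 1
#else
#define APP_HAVE_CACHE_TYPES 0
#endif

/* Hypothetical application helper reporting the build-time decision. */
int app_have_cache_types(void)
{
    return APP_HAVE_CACHE_TYPES;
}
```

Code guarded this way can keep building against older hwloc headers while
using the newer cache-related API when it is available.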
Version 1.4.3
-------------
* This release is only meant to fix the pciutils license issue when upgrading
to hwloc v1.5 or later is not possible. It contains several other minor
fixes but ignores many of them that are only in v1.5 or later.
* Use libpciaccess instead of pciutils/libpci by default for I/O discovery.
pciutils/libpci is only used if --enable-libpci is given to configure
because its GPL license may taint hwloc. See the Installation section
in the documentation for details.
* Fix PCIe 3.0 link speed computation.
* Fix importing of escaped characters with the minimalistic XML backend.
* Fix a memory leak in the x86 backend.
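The entry above does not say what was wrong with the PCIe 3.0 link speed
computation. As background, PCIe 3.0 needs separate handling in such
computations because it moved from the 8b/10b encoding of PCIe 1.x and 2.x
to 128b/130b. The following is a minimal sketch of the arithmetic, not
hwloc's actual code; the helper name pcie_link_speed_gbs() is hypothetical.

```c
/* Hypothetical helper (not hwloc's code): estimate the usable bandwidth
 * of a PCIe link in GB/s from its generation (1, 2 or 3) and lane count.
 * Returns 0 for generations this sketch does not know about. */
double pcie_link_speed_gbs(unsigned gen, unsigned lanes)
{
    double gtps;      /* raw transfer rate per lane, in GT/s */
    double encoding;  /* fraction of transferred bits that carry payload */

    switch (gen) {
    case 1: gtps = 2.5; encoding = 8.0 / 10.0;    break; /* 8b/10b */
    case 2: gtps = 5.0; encoding = 8.0 / 10.0;    break; /* 8b/10b */
    case 3: gtps = 8.0; encoding = 128.0 / 130.0; break; /* 128b/130b */
    default: return 0.0;
    }
    /* Each transfer moves one bit per lane; divide by 8 to get bytes. */
    return gtps * encoding * lanes / 8.0;
}
```

With these numbers a PCIe 2.0 x8 link yields 4 GB/s and a PCIe 3.0 x16 link
yields about 15.75 GB/s. Applying the 8b/10b factor to a PCIe 3.0 link
would give 12.8 GB/s instead, an underestimate of roughly 19%.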
Version 1.4.2
-------------
* Fix build on Solaris 9 and earlier when fabsf() is not a compiler
built-in. Thanks to Igor Galić for reporting the problem.
* Fix support for more than 32 processors on Windows. Thanks to Hartmut
Kaiser for reporting the problem.
* Fix process-wide binding and cpulocation routines on Linux when some
threads disappear in the meantime. Thanks to Vlad Roubtsov for reporting
the issue.
* Make installed scripts executable. Thanks to Jirka Hladky for reporting
the problem.
* Fix libtool revision management when building for Windows. This fix was
also released as hwloc v1.4.1.1 Windows builds. Thanks to Hartmut Kaiser
for reporting the problem.
* Fix the __hwloc_inline keyword in public headers when compiling with a
C++ compiler.
* Add Port info attribute to network OS devices inside OpenFabrics PCI
devices so as to identify which interface corresponds to which port.
* Document requirements for interoperability helpers: I/O devices discovery
is required for some of them; the topology must match the current host
for most of them.
Version 1.4.1
-------------
* This release contains all changes from v1.3.2.
* Fix hwloc_alloc_membind. Thanks to Karl Napf for reporting the issue.
* Fix memory leaks in some get_membind() functions.
* Fix helpers converting from Linux libnuma to hwloc (hwloc/linux-libnuma.h)
in case of out-of-order NUMA node ids.
* Fix some overzealous assertions in the distance grouping code.
* Work around BIOS reporting empty I/O locality in CUDA and OpenFabrics
helpers on Linux. Thanks to Albert Solernou for reporting the problem.
* Install a valgrind suppressions file hwloc-valgrind.supp (see the FAQ).
* Fix memory binding documentation. Thanks to Karl Napf for reporting the
issues.
Version 1.4.0 (does not contain all v1.3.2 changes)
-------------
* Major features
+ Add "custom" interface and "assembler" tools to build multi-node
topology. See the Multi-node Topologies section in the documentation
for details.
* Interface improvements
+ Add symmetric_subtree object attribute to ease assumptions when consulting
regular symmetric topologies.
+ Add a CPUModel and CPUType info attribute to Socket objects on Linux
and Solaris.
+ Add hwloc_get_obj_index_inside_cpuset() to retrieve the "logical" index
of an object within a subtree of the topology.
+ Add more NVIDIA CUDA helpers in cuda.h and cudart.h to find hwloc objects
corresponding to CUDA devices.
* Discovery improvements
+ Add a group object above partial distance matrices to make sure
the matrices are available in the final topology, except when this
new object would contradict the existing hierarchy.
+ Grouping by distances now also works when loading from XML.
+ Fix some corner cases in object insertion, for instance when dealing
with NUMA nodes without any CPU.
* Backends
+ Implement hwloc_get_area_membind() on Linux.
+ Honor I/O topology flags when importing from XML.
+ Further improve XML-related error checking and reporting.
+ Hide synthetic topology error messages unless HWLOC_SYNTHETIC_VERBOSE=1.
* Tools
+ Add synthetic exporting of symmetric topologies to lstopo.
+ lstopo --horiz and --vert can now be applied to some specific object types.
+ lstopo -v -p now displays distance matrices with physical indexes.
+ Add hwloc-distances utility to list distances.
* Documentation
+ Fix and/or document the behavior of most inline functions in hwloc/helper.h
when the topology contains some I/O or Misc objects.
+ Backend documentation enhancements.
* Bug fixes
+ Fix missing last bit in hwloc_linux_get_thread_cpubind().
Thanks to Carolina Gómez-Tostón Gutiérrez for reporting the issue.
+ Fix FreeBSD build without cpuid support.
+ Fix several Windows build issues.
+ Fix inline keyword definition in public headers.
+ Fix dependencies in the embedded library.
+ Improve visibility support detection. Thanks to Dave Love for providing
the patch.
+ Remove references to internal symbols in the tools.
Version 1.3.3
-------------
* This release is only meant to fix the pciutils license issue when upgrading
to hwloc v1.4 or later is not possible. It contains several other minor
fixes but ignores many of them that are only in v1.4 or later.
* Use libpciaccess instead of pciutils/libpci by default for I/O discovery.
pciutils/libpci is only used if --enable-libpci is given to configure
because its GPL license may taint hwloc. See the Installation section
in the documentation for details.
Version 1.3.2
-------------
* Fix missing last bit in hwloc_linux_get_thread_cpubind().
Thanks to Carolina Gómez-Tostón Gutiérrez for reporting the issue.
* Fix build with -mcmodel=medium. Thanks to Devendar Bureddy for reporting
the issue.
* Fix build with Solaris Studio 12 compiler when XML is disabled.
Thanks to Paul H. Hargrove for reporting the problem.
* Fix installation with old GNU sed, for instance on Red Hat 8.
Thanks to Paul H. Hargrove for reporting the problem.
* Fix PCI locality when Linux cgroups restrict the available CPUs.
* Fix floating point issue when grouping by distance on mips64 architecture.
Thanks to Paul H. Hargrove for reporting the problem.
* Fix conversion from/to Linux libnuma when some NUMA nodes have no memory.
* Fix support for gccfss compilers with broken ffs() support. Thanks to
Paul H. Hargrove for reporting the problem and providing a patch.
* Fix FreeBSD build without cpuid support.
* Fix several Windows build issues.
* Fix inline keyword definition in public headers.
* Fix dependencies in the embedded library.
* Detect when a compiler such as xlc may not report compile errors
properly, causing some configure checks to be wrong. Thanks to
Paul H. Hargrove for reporting the problem and providing a patch.
* Improve visibility support detection. Thanks to Dave Love for providing
the patch.
* Remove references to internal symbols in the tools.
* Fix installation on systems with limited command-line size.
Thanks to Paul H. Hargrove for reporting the problem.
* Further improve XML-related error checking and reporting.
Version 1.3.1
-------------
* Fix pciutils detection with pkg-config when not installed in standard
directories.
* Fix visibility options detection with the Solaris Studio compiler.
Thanks to Igor Galić and Terry Dontje for reporting the problems.
* Fix support for old Linux sched.h headers such as those found
on Red Hat 8. Thanks to Paul H. Hargrove for reporting the problems.
* Fix inline and attribute support for Solaris compilers. Thanks to
Dave Love for reporting the problems.
* Print a short summary at the end of the configure output. Thanks to
Stefan Eilemann for the suggestion.
* Add --disable-libnuma configure option to disable libnuma-based
memory binding support on Linux. Thanks to Rayson Ho for the
suggestion.
* Make hwloc's configure script properly obey $PKG_CONFIG. Thanks to
Nathan Phillip Brink for raising the issue.
* Silence some harmless pciutils warnings. Thanks to Paul H. Hargrove
for reporting the problem.
* Fix the documentation with respect to hwloc_pid_t and hwloc_thread_t
being either pid_t and pthread_t on Unix, or HANDLE on Windows.
Version 1.3.0
-------------
* Major features
+ Add I/O devices and bridges to the topology using the pciutils
library. Only enabled after setting the relevant flag with
hwloc_topology_set_flags() before hwloc_topology_load(). See the
I/O Devices section in the documentation for details.
* Discovery improvements
+ Add associativity to the cache attributes.
+ Add support for s390/z11 "books" on Linux.
+ Add the HWLOC_GROUPING_ACCURACY environment variable to relax
distance-based grouping constraints. See the Environment Variables
section in the documentation for details about grouping behavior
and configuration.
+ Allow user-given distance matrices to remove or replace those
discovered by the OS backend.
* XML improvements
+ XML is now always supported: a minimalistic custom import/export
code is used when libxml2 is not available. It is only guaranteed
to read XML files generated by hwloc.
+ hwloc_topology_export_xml() and export_xmlbuffer() now return an
integer.
+ Add hwloc_free_xmlbuffer() to free the buffer allocated by
hwloc_topology_export_xmlbuffer().
+ Hide XML topology error messages unless HWLOC_XML_VERBOSE=1.
* Minor API updates
+ Add hwloc_obj_add_info to customize object info attributes.
* Tools
+ lstopo now displays I/O devices by default. Several options are
added to configure the I/O discovery.
+ hwloc-calc and hwloc-bind now accept I/O devices as input.
+ Add --restrict option to hwloc-calc and hwloc-distrib.
+ Add --sep option to change the output field separator in hwloc-calc.
+ Add --whole-system option to hwloc-ps.
Version 1.2.2
-------------
* Fix build on AIX 5.2. Thanks to Utpal Kumar Ray for the report.
* Fix XML import of very large page sizes or counts on 32-bit platforms.
Thanks to Karsten Hopp for the Red Hat ticket.
* Fix crash when administrator limitations such as Linux cgroups require
restricting distance matrices. Thanks to Ake Sandgren for reporting the
problem.
* Fix the removal of objects such as AMD Magny-Cours dual-node sockets
in case of administrator restrictions.
* Improve error reporting and messages in case of wrong synthetic topology
description.
* Several other minor internal fixes and documentation improvements.
Version 1.2.1
-------------
* Improve support of AMD Bulldozer "Compute-Unit" modules by detecting
logical processors with different core IDs on Linux.
* Fix hwloc-ps crash when listing processes from another Linux cpuset.
Thanks to Carl Smith for reporting the problem.
* Fix build on AIX and Solaris. Thanks to Carl Smith and Andreas Kupries
for reporting the problems.
* Fix cache size detection on Darwin. Thanks to Erkcan Özcan for reporting
the problem.
* Make configure fail if --enable-xml or --enable-cairo is given and
proper support cannot be found. Thanks to Andreas Kupries for reporting
the XML problem.
* Fix spurious L1 cache detection on AIX. Thanks to Hendryk Bockelmann
for reporting the problem.
* Fix hwloc_get_last_cpu_location(THREAD) on Linux. Thanks to Gabriele
Fatigati for reporting the problem.
* Fix object distance detection on Solaris.
* Add pthread_self weak symbol to ease static linking.
* Minor documentation fixes.
Version 1.2.0
-------------
* Major features
+ Expose latency matrices in the API as an array of distance structures
within objects. Add several helpers to find distances.
+ Add hwloc_topology_set_distance_matrix() and environment variables
to provide a matrix of distances between a given set of objects.
+ Add hwloc_get_last_cpu_location() and hwloc_get_proc_last_cpu_location()
to retrieve the processors where a process or thread recently ran.
- Add the corresponding --get-last-cpu-location option to hwloc-bind.
+ Add hwloc_topology_restrict() to restrict an existing topology to a
given cpuset.
- Add the corresponding --restrict option to lstopo.
* Minor API updates
+ Add hwloc_bitmap_list_sscanf/snprintf/asprintf to convert between bitmaps
and strings such as 4-5,7-9,12,15-
+ hwloc_bitmap_set/clr_range() now support infinite ranges.
+ Clarify the difference between inserting Misc objects by cpuset or by
parent.
+ hwloc_insert_misc_object_by_cpuset() now returns NULL in case of error.
* Discovery improvements
+ x86 backend (for freebsd): add x2APIC support
+ Support standard device-tree phandle, to get better support on e.g. ARM
systems providing it.
+ Detect cache size on AIX. Thanks to Christopher and IBM.
+ Improve grouping to support asymmetric topologies.
* Tools
+ Command-line tools now support "all" and "root" special locations
consisting of the entire topology, as well as type names with depth
attributes such as L2 or Group4.
+ hwloc-calc improvements:
- Add --number-of/-N option to report the number of objects of a given
type or depth.
- -I is now equivalent to --intersect for listing the indexes of
objects of a given type or depth that intersect the input.
- Add -H to report the output as a hierarchical combination of types
and depths.
+ Add --thissystem to lstopo.
+ Add lstopo-win, a console-less lstopo variant on Windows.
* Miscellaneous
+ Remove C99 usage from code base.
+ Rename hwloc-gather-topology.sh into hwloc-gather-topology
+ Fix AMD cache discovery on FreeBSD when there is no L3 cache. Thanks to
Andriy Gapon for the fix.
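As an illustration of the list string format handled by
hwloc_bitmap_list_sscanf() in Version 1.2.0 (for instance 4-5,7-9,12,15-),
the following standalone C sketch parses such a string into a fixed 64-bit
mask. It does not use hwloc and is not hwloc's implementation:
sketch_list_sscanf and SKETCH_NBITS are hypothetical names, and clamping an
open-ended range such as 15- to bit 63 is an assumption of the sketch,
whereas hwloc bitmaps can represent infinite ranges.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fixed capacity; real hwloc bitmaps are dynamically sized. */
#define SKETCH_NBITS 64

/* Parse a list string such as "4-5,7-9,12,15-" into a 64-bit mask.
 * An open-ended range ("15-") is clamped to the last bit of the mask.
 * Returns 0 on success, -1 on a malformed or out-of-range string. */
static int sketch_list_sscanf(unsigned long long *mask, const char *str)
{
    const char *p = str;

    assert(mask != NULL && str != NULL);
    *mask = 0;

    while (*p != '\0') {
        char *end;
        unsigned long begin, last, i;

        /* Each item must start with a decimal index. */
        if (*p < '0' || *p > '9')
            return -1;
        begin = strtoul(p, &end, 10);
        last = begin;
        p = end;

        /* Optional "-last" part; a bare "-" means "up to the end". */
        if (*p == '-') {
            p++;
            if (*p >= '0' && *p <= '9') {
                last = strtoul(p, &end, 10);
                p = end;
            } else {
                last = SKETCH_NBITS - 1;
            }
        }

        if (last < begin || begin >= SKETCH_NBITS)
            return -1;
        if (last >= SKETCH_NBITS)
            last = SKETCH_NBITS - 1;

        for (i = begin; i <= last; i++)
            *mask |= 1ULL << i;

        /* Items are separated by commas; a trailing comma is rejected. */
        if (*p == ',') {
            p++;
            if (*p == '\0')
                return -1;
        } else if (*p != '\0') {
            return -1;
        }
    }
    return 0;
}
```

With this sketch, the string "0-3" yields the mask 0xf, while a reversed
range such as "9-7" is rejected with -1.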
Version 1.1.2
-------------
* Fix a segfault in the distance-based grouping code when some objects
are not placed in any group. Thanks to Bernd Kallies for reporting
the problem and providing a patch.
* Fix the command-line parsing of hwloc-bind --mempolicy interleave.
Thanks to Guy Streeter for reporting the problem.
* Stop truncating the output in hwloc_obj_attr_snprintf() and in the
corresponding lstopo output. Thanks to Guy Streeter for reporting the
problem.
* Fix object levels ordering in synthetic topologies.
* Fix potential inconsistency between device tree and kernel information
when SMT is disabled on Power machines.
* Fix and document the behavior of hwloc_topology_set_synthetic() in case
of invalid argument. Thanks to Guy Streeter for reporting the problem.
* Add some verbose error message reporting when it looks like the OS
gives erroneous information.
* Do not include unistd.h and stdint.h in public headers on Windows.
* Move config.h files into their own subdirectories to avoid name
conflicts when AC_CONFIG_HEADERS adds -I's for them.
* Remove variable declarations inside "for" loops.
* Some other minor fixes.
* Many minor documentation fixes.
Version 1.1.1
-------------
* Add hwloc_get_api_version() which returns the version of hwloc used
at runtime. Thanks to Guy Streeter for the suggestion.
* Fix the number of hugepages reported for NUMA nodes on Linux.
* Fix hwloc_bitmap_to_ulong() right after allocating the bitmap.
Thanks to Bernd Kallies for reporting the problem.
* Fix hwloc_bitmap_from_ith_ulong() to properly zero the first ulong.
Thanks to Guy Streeter for reporting the problem.
* Fix hwloc_get_membind_nodeset() on Linux.
Thanks to Bernd Kallies for reporting the problem and providing a patch.
* Fix some file descriptor leaks in the Linux discovery.
* Fix the minimum width of NUMA nodes, caches and the legend in the graphical
lstopo output. Thanks to Jirka Hladky for reporting the problem.
* Various fixes to bitmap conversion from/to taskset-strings.
* Fix and document snprintf functions behavior when the buffer size is too
small or zero. Thanks to Guy Streeter for reporting the problem.
* Fix configure to avoid spurious enabling of the cpuid backend.
Thanks to Tim Anderson for reporting the problem.
* Clean up error management in hwloc-gather-topology.sh.
Thanks to Jirka Hladky for reporting the problem and providing a patch.
* Add a manpage and usage for hwloc-gather-topology.sh on Linux.
Thanks to Jirka Hladky for providing a patch.
* Memory binding documentation enhancements.
Version 1.1.0
-------------
* API
+ Increase HWLOC_API_VERSION to 0x00010100 so that API changes may be
detected at build-time.
+ Add a memory binding interface.
+ The cpuset API (hwloc/cpuset.h) is now deprecated. It is replaced by
the bitmap API (hwloc/bitmap.h) which offers the same features with more
generic names since it applies to CPU sets, node sets and more.
Backward compatibility with the cpuset API and ABI is still provided but
it will be removed in a future release.
Old types (hwloc_cpuset_t, ...) are still available as a way to clarify
what kind of hwloc_bitmap_t each API function manipulates.
Upgrading to the new API only requires replacing hwloc_cpuset_ function
calls with the corresponding hwloc_bitmap_ calls, with the following
renaming exceptions:
- hwloc_cpuset_cpu -> hwloc_bitmap_only
- hwloc_cpuset_all_but_cpu -> hwloc_bitmap_allbut
- hwloc_cpuset_from_string -> hwloc_bitmap_sscanf
+ Add an `infos' array in each object to store pairs of info names and
values. It enables generic storage of things like the old DMI board infos
that were previously stored in machine-specific attributes.
+ Add linesize cache attribute.
* Features
+ Bitmaps (and thus CPU sets and node sets) are dynamically (re-)allocated,
the maximal number of CPUs (HWLOC_NBMAXCPUS) has been removed.
+ Improve the distance-based grouping code to better support irregular
distance matrices.
+ Add support for device-tree to get cache information (useful on Power
architectures).
* Helpers
+ Add NVIDIA CUDA helpers in cuda.h and cudart.h to ease interoperability
with CUDA Runtime and Driver APIs.
+ Add Myrinet Express helper in myriexpress.h to ease interoperability.
* Tools
+ lstopo now displays physical/OS indexes by default in graphical mode
(use -l to switch back to logical indexes). The textual output still uses
logical by default (use -p to switch to physical indexes).
+ lstopo prefixes logical indexes with `L#' and physical indexes with `P#'.
Physical indexes are also printed as `P#N' instead of `phys=N' within
object attributes (in parentheses).
+ Add a legend at the bottom of the lstopo graphical output, use --no-legend
to remove it.
+ Add hwloc-ps to list process bindings.
+ Add --membind and --mempolicy options to hwloc-bind.
+ Improve tools command-line options by adding a generic --input option
(and more) which replaces the old --xml, --synthetic and --fsys-root.
+ Clean up lstopo output configuration by adding --output-format.
+ Add --intersect in hwloc-calc, and replace --objects with --largest.
+ Add the ability to work on standard input in hwloc-calc.
+ Add --from, --to and --at in hwloc-distrib.
+ Add taskset-specific functions and command-line tools options to
manipulate CPU set strings in the format of the taskset program.
+ Install hwloc-gather-topology.sh on Linux.
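As an illustration of the build-time detection enabled by HWLOC_API_VERSION
in Version 1.1.0, the following C sketch selects between the bitmap and
cpuset APIs and decodes the version number. It assumes the usual hwloc
encoding (X<<16)+(Y<<8)+Z for an API-changing release X.Y.Z. The fallback
definition of HWLOC_API_VERSION only stands in for <hwloc.h> so that the
sketch compiles on its own; every sketch_ and SKETCH_ name is hypothetical.

```c
#include <assert.h>

/* Stand-in for the constant normally provided by <hwloc.h> (assumption). */
#ifndef HWLOC_API_VERSION
#define HWLOC_API_VERSION 0x00010100
#endif

/* Build-time switch: the bitmap API is available from 0x00010100 on. */
#if HWLOC_API_VERSION >= 0x00010100
#define SKETCH_HAVE_BITMAP_API 1
#else
#define SKETCH_HAVE_BITMAP_API 0
#endif

/* Decoded X.Y.Z components of an API version number. */
struct sketch_version {
    unsigned maj;
    unsigned min;
    unsigned rel;
};

/* Split a version number encoded as (X<<16)+(Y<<8)+Z into its parts. */
static struct sketch_version sketch_decode_api_version(unsigned v)
{
    struct sketch_version out;

    out.maj = (v >> 16) & 0xffff;
    out.min = (v >> 8) & 0xff;
    out.rel = v & 0xff;
    return out;
}

/* Small self-check: 0x00010100 decodes to 1.1.0. */
static void sketch_check_decode(void)
{
    struct sketch_version v = sketch_decode_api_version(0x00010100);

    assert(v.maj == 1 && v.min == 1 && v.rel == 0);
}
```

Code that must build against both old and new releases could wrap its
hwloc_bitmap_ calls in #if SKETCH_HAVE_BITMAP_API and keep the deprecated
hwloc_cpuset_ calls in the #else branch.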
Version 1.0.3
-------------
* Fix support for Linux cpuset when emulated by a cgroup mount point.
* Remove unneeded runtime dependency on libibverbs.so in the library and
all utils programs.
* Fix hwloc_cpuset_to_linux_libnuma_ulongs in case of non-linear OS-indexes
for NUMA nodes.
* lstopo now displays physical/OS indexes by default in graphical mode
(use -l to switch back to logical indexes). The textual output still uses
logical by default (use -p to switch to physical indexes).
Version 1.0.2
-------------
* Public headers can now be included directly from C++ programs.
* Solaris fix for non-contiguous cpu numbers. Thanks to Rolf vandeVaart for
reporting the issue.
* Darwin 10.4 fix. Thanks to Olivier Cessenat for reporting the issue.
* Revert 1.0.1 patch that ignored sockets with unknown ID values since it
only slightly helped POWER7 machines with old Linux kernels while it
prevents recent kernels from getting the complete POWER7 topology.
* Fix hwloc_get_common_ancestor_obj().
* Remove arch-specific bits in public headers.
* Some fixes in the lstopo graphical output.
* Various man page clarifications and minor updates.
Version 1.0.1
-------------
* Various Solaris fixes. Thanks to Yannick Martin for reporting the issue.
* Fix "non-native" builds on x86 platforms (e.g., when building 32
bit executables with compilers that natively build 64 bit).
* Ignore sockets with unknown ID values (which fixes issues on POWER7
machines). Thanks to Greg Bauer for reporting the issue.
* Various man page clarifications and minor updates.
* Fixed memory leaks in hwloc_setup_group_from_min_distance_clique().
* Fix cache type filtering on MS Windows 7. Thanks to Αλέξανδρος
Παπαδογιαννάκ for reporting the issue.
* Fixed warnings when compiling with -DNDEBUG.
Version 1.0.0
-------------
* The ABI of the library has changed.
* Backend updates
+ Add FreeBSD support.
+ Add x86 cpuid based backend.
+ Add Linux cgroup support to the Linux cpuset code.
+ Support binding of entire multithreaded processes on Linux.
+ Fix and enable Group support in Windows.
+ Clean up XML export/import.
* Objects
+ HWLOC_OBJ_PROC is renamed into HWLOC_OBJ_PU for "Processing Unit",
its stringified type name is now "PU".
+ Use new HWLOC_OBJ_GROUP objects instead of MISC when grouping
objects according to NUMA distances or arbitrary OS aggregation.
+ Rework memory attributes.
+ Add different cpusets in each object to specify processors that
are offline, unavailable, ...
+ Clean up the storage of object names and DMI infos.
* Features
+ Add support for looking up specific PID topology information.
+ Add hwloc_topology_export_xml() to export the topology in an XML file.
+ Add hwloc_topology_get_support() to retrieve the supported features
for the current topology context.
+ Support non-SYSTEM object as the root of the tree, use MACHINE in
most common cases.
+ Add hwloc_get_*cpubind() routines to retrieve the current binding
of processes and threads.
* API
+ Add HWLOC_API_VERSION to help detect the currently used API version.
+ Add missing ending "e" to *compare* functions.
+ Add several routines to emulate PLPA functions.
+ Rename and rework the cpuset and/or/xor/not/clear operators to output
their result in a dedicated argument instead of modifying one input.
+ Deprecate hwloc_obj_snprintf() in favor of hwloc_obj_type/attr_snprintf().
+ Clarify the use of parent and ancestor in the API, do not use father.
+ Replace hwloc_get_system_obj() with hwloc_get_root_obj().
+ Return -1 instead of HWLOC_OBJ_TYPE_MAX in the API since the latter
isn't public.
+ Relax constraints in hwloc_obj_type_of_string().
+ Improve displaying of memory sizes.
+ Add 0x prefix to cpuset strings.
* Tools
+ lstopo now displays logical indexes by default, use --physical to
revert back to OS/physical indexes.
+ Add colors in the lstopo graphical outputs to distinguish between online,
offline, reserved, ... objects.
+ Extend lstopo to show cpusets, filter objects by type, ...
+ Renamed hwloc-mask into hwloc-calc which supports many new options.
* Documentation
+ Add a hwloc(7) manpage containing general information.
+ Add documentation about how to switch from PLPA to hwloc.
+ Clean up the distributed documentation files.
* Miscellaneous
+ Many compiler warning fixes.
+ Clean up the ABI by using the visibility attribute.
+ Add project embedding support.
Version 0.9.4 (unreleased)
--------------------------
* Fix resetting colors to normal in lstopo -.txt output.
* Fix Linux pthread_t binding error report.
Version 0.9.3
-------------
* Fix autogen.sh to work with Autoconf 2.63.
* Fix various crashes in particular conditions:
- xml files with root attributes
- offline CPUs
- partial sysfs support
- unparseable /proc/cpuinfo
- ignoring the NUMA level while Misc levels have been generated
* Tweak documentation a bit
* Do not require the pthread library for binding the current thread on Linux
* Do not erroneously assume that the sched_setaffinity prototype is the old
version when there is actually none.
* Fix _syscall3 compilation on archs for which we do not have the
sched_setaffinity system call number.
* Fix AIX binding.
* Fix library dependencies: only lstopo now depends on libtermcap. Fix
linking with binutils-gold.
* Have make check always build and run hwloc-hello.c
* Do not limit size of a cpuset.
Version 0.9.2
-------------
* Trivial documentation changes.
Version 0.9.1