Skip to content

Conversation

@jorickert
Copy link

No description provided.

sarnex and others added 30 commits January 9, 2025 15:59
…ary form (llvm#120266)

Currently we produce SPIR-V text with `spirv-dis` but assemble it with
`llvm-spirv`. The SPIR-V text format is different between the tools so
the assemble fails. Use `spirv-as` for assembly as it uses the same
format.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Make fast math the default for HLSL
Repair test cases broken by this change.
Closes llvm#108597
…#122113)

Rearrange the DepDistanceAndSizeInfo struct in preparation to scale
strides. getDependenceDistanceStrideAndSize now returns the data of
CommonStride, MaxStride, and clarifies when to retry with runtime
checks, in place of (unscaled) strides.
The test contained some non-const accesses when we're really trying to test
that the method is const.
Testing is failing in HEAD, Integration issue with the SPIR-V OpenMP
change and the SPIR-V change adding the assembler job.

I notice there is a consistency issue in the binding name "SPIRV" vs
"SPIR-V", I will fix that in a separate commit so we unbreak build ASAP.

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
GDB/MI requires unique names for each child, otherwise fails with
"Duplicate variable object name". Also wrapped containers printers
were flattened for cleaner visualization in IDEs and CLI.

Fixes llvm#62340
…llvm#121494)

This commit is a further incremental step toward moving the whole
mlir-vulkan-runner MLIR pass pipeline into mlir-opt (see llvm#73457). The
previous step was b225b3adf7b78387c9fcb97a3ff0e0a1e26eafe2, which moved
all device passes prior to SPIR-V serialization into a new mlir-opt test
pass, `-test-vulkan-runner-pipeline`.

This commit changes how SPIR-V serialization is accomplished for Vulkan
runner tests. Until now, this was done by the Vulkan-specific
ConvertGpuLaunchFuncToVulkanLaunchFunc pass. With this commit, this
responsibility is removed from that pass, and is instead done with the
existing generic GpuModuleToBinaryPass. In addition, the SPIR-V
serialization step is no longer done inside mlir-vulkan-runner, but
rather inside mlir-opt (in the `-test-vulkan-runner-pipeline` pass).
Both of these changes represent a greater alignment between
mlir-vulkan-runner and the other GPU integration tests. Notably, the IR
shapes produced by the mlir-opt pipelines for the Vulkan and SYCL
runners are now much more similar, with both using a gpu.binary op for
the serialized SPIR-V kernel.

In order to enable this, this commit includes these supporting changes:

- ConvertToSPIRVPass is enhanced to support producing the IR shape where
a spirv.module is nested inside a gpu.module, since this is what
GpuModuleToBinaryPass expects.
- ConvertGPULaunchFuncToVulkanLaunchFunc is changed to remove its SPIR-V
serialization functionality, and instead now extracts the SPIR-V from a
gpu.binary operation (as produced by ConvertToSPIRVPass).
- `-test-vulkan-runner-pipeline` now attaches SPIR-V target information
required by GpuModuleToBinaryPass.
- The WebGPU pass option, which had been removed from mlir-vulkan-runner
in the previous commit in this series, is restored as an option to
`-test-vulkan-runner-pipeline` instead, so that the WebGPU pass
continues being inserted into the pipeline just before SPIR-V
serialization.
To match output order of update_test_checks.py.
Fixing failed conflict resolution in llvm#122229.
…llvm#122147)

I think SDNodeIterator primarily exists because GraphTraits requires an
iterator that dereferences to SDNode*. op_iterator dereferences to
SDUse* which is implicitly convertible to SDValue.

This piece of code can use SDValue instead of SDNode* so we should
prefer to use the the more common op_iterator.
)

With the new benchmark phase, `dry-run-measurement`, llvm-exegesis can
run everything except the actual snippet execution. It is useful when we
want to test some parts of the code between the `assemble-measured-code`
and `measure` phase without actually running on native platforms.
These two clauses just take a 'var-list' and specify where the variables
should be copied from/to.  This patch implements the AST nodes for them
and ensures they properly take a var-list.
This completes the implementation of 'update' by implementing its last
restriction. This restriction requires at least 1 of the 'self', 'host',
  or 'device' clauses.
TimerGroup's dtor accesses this field.

once_flag has a trivial destructor so it doesn't really matter, but msan
checks that it's not accessed after destruction.
…Is (llvm#122167)

Adding test case to verify that SwiftReturnOwnership works correctly for
ObjC functions and methods as well.

rdar://142504115
It was superceeded by the emitBuiltinWithOneOverloadedType() some
time ago.
llvm#104683)

The radix sort (LSD) algorithm allows to speed up std::stable_sort
dramatically in case we sort integers.
The speed up varies from a relatively small to x10 times, depending on
type of sorted elements and the initial state of the sorted array.

```
Running ./libcxx/test/benchmarks/stable_sort.bench.out
Run on (12 X 2600 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB
  L1 Instruction 32 KiB
  L2 Unified 256 KiB (x6)
  L3 Unified 12288 KiB
Load Average: 3.48, 3.38, 3.08
---------------------------------------------------------------------------
Benchmark                                               After        Before 
---------------------------------------------------------------------------
BM_StableSort_int8_Random_1                           3.39 ns       3.58 ns 
BM_StableSort_int8_Random_4                           21.1 ns       21.9 ns 
BM_StableSort_int8_Random_16                           142 ns        147 ns 
BM_StableSort_int8_Random_64                           893 ns        903 ns 
BM_StableSort_int8_Random_256                          409 ns       5810 ns 
BM_StableSort_int8_Random_1024                        1235 ns      29973 ns 
BM_StableSort_int8_Random_4096                        4410 ns     141880 ns 
BM_StableSort_int8_Random_16384                      18044 ns     620540 ns 
BM_StableSort_int8_Random_65536                     144030 ns    2592013 ns 
BM_StableSort_int8_Random_262144                    858350 ns   10935814 ns 
BM_StableSort_int8_Random_524288                   2929988 ns   27060729 ns 
BM_StableSort_int8_Random_1048576                  6058292 ns   49622720 ns 
BM_StableSort_int8_Ascending_1                        3.42 ns       3.92 ns 
BM_StableSort_int8_Ascending_4                        5.86 ns       8.08 ns 
BM_StableSort_int8_Ascending_16                       10.6 ns       12.0 ns 
BM_StableSort_int8_Ascending_64                       28.9 ns       30.6 ns 
BM_StableSort_int8_Ascending_256                       415 ns        391 ns 
BM_StableSort_int8_Ascending_1024                     1666 ns       2309 ns 
BM_StableSort_int8_Ascending_4096                     7748 ns      12269 ns 
BM_StableSort_int8_Ascending_16384                   40588 ns      60181 ns 
BM_StableSort_int8_Ascending_65536                  178843 ns     298221 ns 
BM_StableSort_int8_Ascending_262144                 919959 ns    1402692 ns 
BM_StableSort_int8_Ascending_524288                2397397 ns    3036984 ns 
BM_StableSort_int8_Ascending_1048576               5080043 ns    7218581 ns 
BM_StableSort_int8_Descending_1                       3.44 ns       3.53 ns 
BM_StableSort_int8_Descending_4                       7.94 ns       8.29 ns 
BM_StableSort_int8_Descending_16                      59.6 ns       57.7 ns 
BM_StableSort_int8_Descending_64                      1051 ns       1027 ns 
BM_StableSort_int8_Descending_256                      422 ns       4718 ns 
BM_StableSort_int8_Descending_1024                    1676 ns      21044 ns 
BM_StableSort_int8_Descending_4096                    7766 ns      64827 ns 
BM_StableSort_int8_Descending_16384                  40230 ns      93981 ns 
BM_StableSort_int8_Descending_65536                 190978 ns     421151 ns 
BM_StableSort_int8_Descending_262144               1055141 ns    1918927 ns 
BM_StableSort_int8_Descending_524288               2875115 ns    3809153 ns 
BM_StableSort_int8_Descending_1048576              5854135 ns    8713690 ns 
BM_StableSort_int8_SingleElement_1                    3.52 ns       3.46 ns 
BM_StableSort_int8_SingleElement_4                    6.25 ns       5.79 ns 
BM_StableSort_int8_SingleElement_16                   10.7 ns       11.4 ns 
BM_StableSort_int8_SingleElement_64                   29.3 ns       30.3 ns 
BM_StableSort_int8_SingleElement_256                   858 ns        380 ns 
BM_StableSort_int8_SingleElement_1024                 3036 ns       2231 ns 
BM_StableSort_int8_SingleElement_4096                11580 ns      11866 ns 
BM_StableSort_int8_SingleElement_16384               44956 ns      59621 ns 
BM_StableSort_int8_SingleElement_65536              182006 ns     297853 ns 
BM_StableSort_int8_SingleElement_262144             962181 ns    1432857 ns 
BM_StableSort_int8_SingleElement_524288            2256687 ns    2975707 ns 
BM_StableSort_int8_SingleElement_1048576           4522556 ns    6949948 ns 
BM_StableSort_int8_PipeOrgan_1                        3.26 ns       3.64 ns 
BM_StableSort_int8_PipeOrgan_4                        6.21 ns       6.58 ns 
BM_StableSort_int8_PipeOrgan_16                       23.7 ns       25.4 ns 
BM_StableSort_int8_PipeOrgan_64                        250 ns        248 ns 
BM_StableSort_int8_PipeOrgan_256                       414 ns       2498 ns 
BM_StableSort_int8_PipeOrgan_1024                     1697 ns      10946 ns 
BM_StableSort_int8_PipeOrgan_4096                     7840 ns      37238 ns 
BM_StableSort_int8_PipeOrgan_16384                   41402 ns      74805 ns 
BM_StableSort_int8_PipeOrgan_65536                  180107 ns     357891 ns 
BM_StableSort_int8_PipeOrgan_262144                 988273 ns    1647296 ns 
BM_StableSort_int8_PipeOrgan_524288                2547374 ns    3245991 ns 
BM_StableSort_int8_PipeOrgan_1048576               5128783 ns    7342444 ns 
BM_StableSort_int8_QuickSortAdversary_1               3.14 ns       4.01 ns 
BM_StableSort_int8_QuickSortAdversary_4               6.05 ns       7.02 ns 
BM_StableSort_int8_QuickSortAdversary_16              10.5 ns       11.9 ns 
BM_StableSort_int8_QuickSortAdversary_64               520 ns        516 ns 
BM_StableSort_int8_QuickSortAdversary_256              920 ns        386 ns 
BM_StableSort_int8_QuickSortAdversary_1024            3083 ns       2299 ns 
BM_StableSort_int8_QuickSortAdversary_4096           11659 ns      12295 ns 
BM_StableSort_int8_QuickSortAdversary_16384          45721 ns      60931 ns 
BM_StableSort_int8_QuickSortAdversary_65536         186334 ns     295423 ns 
BM_StableSort_int8_QuickSortAdversary_262144        946262 ns    1399973 ns 
BM_StableSort_int8_QuickSortAdversary_524288       2282004 ns    2832266 ns 
BM_StableSort_int8_QuickSortAdversary_1048576      4691123 ns    6963253 ns 
BM_StableSort_uint8_Random_1                          3.11 ns       3.44 ns 
BM_StableSort_uint8_Random_4                          21.9 ns       23.1 ns 
BM_StableSort_uint8_Random_16                          154 ns        171 ns 
BM_StableSort_uint8_Random_64                         1000 ns       1051 ns 
BM_StableSort_uint8_Random_256                         402 ns       6498 ns 
BM_StableSort_uint8_Random_1024                       1176 ns      35310 ns 
BM_StableSort_uint8_Random_4096                       4415 ns     164087 ns 
BM_StableSort_uint8_Random_16384                     17849 ns     686769 ns 
BM_StableSort_uint8_Random_65536                    146109 ns    2932051 ns 
BM_StableSort_uint8_Random_262144                   876710 ns   12163988 ns 
BM_StableSort_uint8_Random_524288                  2858089 ns   26458830 ns 
BM_StableSort_uint8_Random_1048576                 5766942 ns   54836214 ns 
BM_StableSort_uint8_Ascending_1                       3.11 ns       3.43 ns 
BM_StableSort_uint8_Ascending_4                       6.18 ns       7.24 ns 
BM_StableSort_uint8_Ascending_16                      14.5 ns       17.0 ns 
BM_StableSort_uint8_Ascending_64                      50.7 ns       59.2 ns 
BM_StableSort_uint8_Ascending_256                      395 ns        536 ns 
BM_StableSort_uint8_Ascending_1024                    1752 ns       2956 ns 
BM_StableSort_uint8_Ascending_4096                    7785 ns      15146 ns 
BM_StableSort_uint8_Ascending_16384                  41442 ns      74136 ns 
BM_StableSort_uint8_Ascending_65536                 180879 ns     354261 ns 
BM_StableSort_uint8_Ascending_262144                945880 ns    1674256 ns 
BM_StableSort_uint8_Ascending_524288               2287832 ns    3138581 ns 
BM_StableSort_uint8_Ascending_1048576              4630290 ns    7296278 ns 
BM_StableSort_uint8_Descending_1                      3.19 ns       3.63 ns 
BM_StableSort_uint8_Descending_4                      9.60 ns       11.5 ns 
BM_StableSort_uint8_Descending_16                     78.3 ns       86.0 ns 
BM_StableSort_uint8_Descending_64                     1265 ns       1308 ns 
BM_StableSort_uint8_Descending_256                     395 ns       6556 ns 
BM_StableSort_uint8_Descending_1024                   1712 ns      24669 ns 
BM_StableSort_uint8_Descending_4096                   7748 ns      83407 ns 
BM_StableSort_uint8_Descending_16384                 40779 ns     104043 ns 
BM_StableSort_uint8_Descending_65536                181560 ns     467680 ns 
BM_StableSort_uint8_Descending_262144              1146627 ns    2102769 ns 
BM_StableSort_uint8_Descending_524288              2874096 ns    4572229 ns 
BM_StableSort_uint8_Descending_1048576             5873195 ns   10170663 ns 
BM_StableSort_uint8_SingleElement_1                   3.28 ns       3.58 ns 
BM_StableSort_uint8_SingleElement_4                   6.44 ns       7.40 ns 
BM_StableSort_uint8_SingleElement_16                  14.9 ns       16.4 ns 
BM_StableSort_uint8_SingleElement_64                  51.2 ns       52.9 ns 
BM_StableSort_uint8_SingleElement_256                  876 ns        490 ns 
BM_StableSort_uint8_SingleElement_1024                3041 ns       2750 ns 
BM_StableSort_uint8_SingleElement_4096               11947 ns      14326 ns 
BM_StableSort_uint8_SingleElement_16384              46669 ns      69984 ns 
BM_StableSort_uint8_SingleElement_65536             197903 ns     328961 ns 
BM_StableSort_uint8_SingleElement_262144           1031466 ns    1551436 ns 
BM_StableSort_uint8_SingleElement_524288           2447672 ns    3049553 ns 
BM_StableSort_uint8_SingleElement_1048576          4793087 ns    7615245 ns 
BM_StableSort_uint8_PipeOrgan_1                       3.38 ns       3.56 ns 
BM_StableSort_uint8_PipeOrgan_4                       7.16 ns       8.70 ns 
BM_StableSort_uint8_PipeOrgan_16                      31.7 ns       35.3 ns 
BM_StableSort_uint8_PipeOrgan_64                       326 ns        366 ns 
BM_StableSort_uint8_PipeOrgan_256                      409 ns       2942 ns 
BM_StableSort_uint8_PipeOrgan_1024                    1994 ns      12571 ns 
BM_StableSort_uint8_PipeOrgan_4096                    8086 ns      46278 ns 
BM_StableSort_uint8_PipeOrgan_16384                  41749 ns      79813 ns 
BM_StableSort_uint8_PipeOrgan_65536                 180697 ns     375120 ns 
BM_StableSort_uint8_PipeOrgan_262144               1004899 ns    1676143 ns 
BM_StableSort_uint8_PipeOrgan_524288               2456081 ns    3333949 ns 
BM_StableSort_uint8_PipeOrgan_1048576              5030857 ns    7591303 ns 
BM_StableSort_uint8_QuickSortAdversary_1              3.12 ns       3.46 ns 
BM_StableSort_uint8_QuickSortAdversary_4              7.25 ns       6.83 ns 
BM_StableSort_uint8_QuickSortAdversary_16             14.6 ns       16.2 ns 
BM_StableSort_uint8_QuickSortAdversary_64              650 ns        665 ns 
BM_StableSort_uint8_QuickSortAdversary_256             395 ns       2982 ns 
BM_StableSort_uint8_QuickSortAdversary_1024           3125 ns       2583 ns 
BM_StableSort_uint8_QuickSortAdversary_4096          11797 ns      13929 ns 
BM_StableSort_uint8_QuickSortAdversary_16384         45803 ns      66513 ns 
BM_StableSort_uint8_QuickSortAdversary_65536        190745 ns     313467 ns 
BM_StableSort_uint8_QuickSortAdversary_262144       974646 ns    1469014 ns 
BM_StableSort_uint8_QuickSortAdversary_524288      2317553 ns    3022065 ns 
BM_StableSort_uint8_QuickSortAdversary_1048576     4898703 ns    6854079 ns 
BM_StableSort_int16_Random_1                          3.94 ns       3.49 ns 
BM_StableSort_int16_Random_4                          20.8 ns       23.2 ns 
BM_StableSort_int16_Random_16                          133 ns        163 ns 
BM_StableSort_int16_Random_64                          903 ns        953 ns 
BM_StableSort_int16_Random_256                        5638 ns       6258 ns 
BM_StableSort_int16_Random_1024                       3056 ns      34587 ns 
BM_StableSort_int16_Random_4096                      10596 ns     168397 ns 
BM_StableSort_int16_Random_16384                     49908 ns     753031 ns 
BM_StableSort_int16_Random_65536                    444605 ns    3838368 ns 
BM_StableSort_int16_Random_262144                  2419345 ns   15657285 ns 
BM_StableSort_int16_Random_524288                  7984040 ns   32726933 ns 
BM_StableSort_int16_Random_1048576                16092424 ns   67999766 ns 
BM_StableSort_int16_Ascending_1                       3.40 ns       3.43 ns 
BM_StableSort_int16_Ascending_4                       5.45 ns       5.79 ns 
BM_StableSort_int16_Ascending_16                      12.0 ns       15.3 ns 
BM_StableSort_int16_Ascending_64                      39.6 ns       52.6 ns 
BM_StableSort_int16_Ascending_256                      470 ns        550 ns 
BM_StableSort_int16_Ascending_1024                    1686 ns       2707 ns 
BM_StableSort_int16_Ascending_4096                    5676 ns      14165 ns 
BM_StableSort_int16_Ascending_16384                  21413 ns      69483 ns 
BM_StableSort_int16_Ascending_65536                  88010 ns     334466 ns 
BM_StableSort_int16_Ascending_262144                567239 ns    1570620 ns 
BM_StableSort_int16_Ascending_524288               1553063 ns    3424666 ns 
BM_StableSort_int16_Ascending_1048576              3145577 ns    8499649 ns 
BM_StableSort_int16_Descending_1                      3.22 ns       3.54 ns 
BM_StableSort_int16_Descending_4                      6.85 ns       10.2 ns 
BM_StableSort_int16_Descending_16                     62.7 ns       62.2 ns 
BM_StableSort_int16_Descending_64                     1138 ns       1036 ns 
BM_StableSort_int16_Descending_256                    5541 ns       4696 ns 
BM_StableSort_int16_Descending_1024                   3046 ns      19577 ns 
BM_StableSort_int16_Descending_4096                  10962 ns      79149 ns 
BM_StableSort_int16_Descending_16384                 58182 ns     327709 ns 
BM_StableSort_int16_Descending_65536                447025 ns    1424896 ns 
BM_StableSort_int16_Descending_262144              1104973 ns    5921903 ns 
BM_StableSort_int16_Descending_524288              2547840 ns   17956789 ns 
BM_StableSort_int16_Descending_1048576             5093555 ns   17044318 ns 
BM_StableSort_int16_SingleElement_1                   3.56 ns       3.96 ns 
BM_StableSort_int16_SingleElement_4                   5.75 ns       6.72 ns 
BM_StableSort_int16_SingleElement_16                  12.4 ns       16.1 ns 
BM_StableSort_int16_SingleElement_64                  36.9 ns       54.4 ns 
BM_StableSort_int16_SingleElement_256                  473 ns        557 ns 
BM_StableSort_int16_SingleElement_1024                1828 ns       2826 ns 
BM_StableSort_int16_SingleElement_4096                6239 ns      14252 ns 
BM_StableSort_int16_SingleElement_16384              23695 ns      70369 ns 
BM_StableSort_int16_SingleElement_65536              93281 ns     361641 ns 
BM_StableSort_int16_SingleElement_262144            599078 ns    1640216 ns 
BM_StableSort_int16_SingleElement_524288           1659678 ns    3343087 ns 
BM_StableSort_int16_SingleElement_1048576          3184033 ns    7770271 ns 
BM_StableSort_int16_PipeOrgan_1                       3.75 ns       3.76 ns 
BM_StableSort_int16_PipeOrgan_4                       5.94 ns       7.74 ns 
BM_StableSort_int16_PipeOrgan_16                      26.7 ns       25.9 ns 
BM_StableSort_int16_PipeOrgan_64                       300 ns        263 ns 
BM_StableSort_int16_PipeOrgan_256                     2769 ns       2760 ns 
BM_StableSort_int16_PipeOrgan_1024                    2996 ns      10544 ns 
BM_StableSort_int16_PipeOrgan_4096                   11641 ns      44750 ns 
BM_StableSort_int16_PipeOrgan_16384                  57224 ns     200464 ns 
BM_StableSort_int16_PipeOrgan_65536                 416873 ns     887631 ns 
BM_StableSort_int16_PipeOrgan_262144                843264 ns    3588669 ns 
BM_StableSort_int16_PipeOrgan_524288               2027741 ns   11056924 ns 
BM_StableSort_int16_PipeOrgan_1048576              4223773 ns   13261276 ns 
BM_StableSort_int16_QuickSortAdversary_1              3.83 ns       3.68 ns 
BM_StableSort_int16_QuickSortAdversary_4              5.55 ns       6.93 ns 
BM_StableSort_int16_QuickSortAdversary_16             12.3 ns       15.2 ns 
BM_StableSort_int16_QuickSortAdversary_64              646 ns        632 ns 
BM_StableSort_int16_QuickSortAdversary_256            2751 ns       2542 ns 
BM_StableSort_int16_QuickSortAdversary_1024           3028 ns      16901 ns 
BM_StableSort_int16_QuickSortAdversary_4096          10862 ns      80222 ns 
BM_StableSort_int16_QuickSortAdversary_16384         57753 ns     317281 ns 
BM_StableSort_int16_QuickSortAdversary_65536         94064 ns     328502 ns 
BM_StableSort_int16_QuickSortAdversary_262144       557796 ns    1613208 ns 
BM_StableSort_int16_QuickSortAdversary_524288      1518451 ns    3479740 ns 
BM_StableSort_int16_QuickSortAdversary_1048576     3165129 ns    7655880 ns 
BM_StableSort_uint16_Random_1                         3.26 ns       3.44 ns 
BM_StableSort_uint16_Random_4                         21.1 ns       22.2 ns 
BM_StableSort_uint16_Random_16                         157 ns        156 ns 
BM_StableSort_uint16_Random_64                         955 ns        947 ns 
BM_StableSort_uint16_Random_256                       5886 ns       6097 ns 
BM_StableSort_uint16_Random_1024                      2787 ns      30776 ns 
BM_StableSort_uint16_Random_4096                      9973 ns     155652 ns 
BM_StableSort_uint16_Random_16384                    48628 ns     741072 ns 
BM_StableSort_uint16_Random_65536                   439609 ns    3478966 ns 
BM_StableSort_uint16_Random_262144                 2336983 ns   15197642 ns 
BM_StableSort_uint16_Random_524288                 7888701 ns   34234254 ns 
BM_StableSort_uint16_Random_1048576               14865180 ns   68516386 ns 
BM_StableSort_uint16_Ascending_1                      3.33 ns       4.00 ns 
BM_StableSort_uint16_Ascending_4                      5.79 ns       6.64 ns 
BM_StableSort_uint16_Ascending_16                     14.9 ns       15.5 ns 
BM_StableSort_uint16_Ascending_64                     50.2 ns       52.5 ns 
BM_StableSort_uint16_Ascending_256                     538 ns        546 ns 
BM_StableSort_uint16_Ascending_1024                   1645 ns       2652 ns 
BM_StableSort_uint16_Ascending_4096                   5559 ns      14517 ns 
BM_StableSort_uint16_Ascending_16384                 22803 ns      70275 ns 
BM_StableSort_uint16_Ascending_65536                 83109 ns     333446 ns 
BM_StableSort_uint16_Ascending_262144               562667 ns    1568670 ns 
BM_StableSort_uint16_Ascending_524288              1564646 ns    3059839 ns 
BM_StableSort_uint16_Ascending_1048576             3178826 ns    7048327 ns 
BM_StableSort_uint16_Descending_1                     3.34 ns       3.93 ns 
BM_StableSort_uint16_Descending_4                     8.75 ns       9.73 ns 
BM_StableSort_uint16_Descending_16                    55.9 ns       55.5 ns 
BM_StableSort_uint16_Descending_64                    1021 ns       1035 ns 
BM_StableSort_uint16_Descending_256                   4752 ns       4931 ns 
BM_StableSort_uint16_Descending_1024                  2982 ns      19727 ns 
BM_StableSort_uint16_Descending_4096                 10432 ns      83165 ns 
BM_StableSort_uint16_Descending_16384                56593 ns     326131 ns 
BM_StableSort_uint16_Descending_65536               439134 ns    1371346 ns 
BM_StableSort_uint16_Descending_262144             1220925 ns    5735665 ns 
BM_StableSort_uint16_Descending_524288             2767234 ns   16758330 ns 
BM_StableSort_uint16_Descending_1048576            5673769 ns   17541715 ns 
BM_StableSort_uint16_SingleElement_1                  3.53 ns       3.73 ns 
BM_StableSort_uint16_SingleElement_4                  6.27 ns       5.81 ns 
BM_StableSort_uint16_SingleElement_16                 14.8 ns       15.1 ns 
BM_StableSort_uint16_SingleElement_64                 51.5 ns       50.9 ns 
BM_StableSort_uint16_SingleElement_256                 536 ns        540 ns 
BM_StableSort_uint16_SingleElement_1024               1669 ns       2690 ns 
BM_StableSort_uint16_SingleElement_4096               5840 ns      14230 ns 
BM_StableSort_uint16_SingleElement_16384             22468 ns      68524 ns 
BM_StableSort_uint16_SingleElement_65536             89845 ns     332187 ns 
BM_StableSort_uint16_SingleElement_262144           590736 ns    1550868 ns 
BM_StableSort_uint16_SingleElement_524288          1573677 ns    3095703 ns 
BM_StableSort_uint16_SingleElement_1048576         3183421 ns    8251180 ns 
BM_StableSort_uint16_PipeOrgan_1                      3.70 ns       3.64 ns 
BM_StableSort_uint16_PipeOrgan_4                      7.01 ns       6.81 ns 
BM_StableSort_uint16_PipeOrgan_16                     25.7 ns       26.4 ns 
BM_StableSort_uint16_PipeOrgan_64                      283 ns        277 ns 
BM_StableSort_uint16_PipeOrgan_256                    2562 ns       2852 ns 
BM_StableSort_uint16_PipeOrgan_1024                   2863 ns      10892 ns 
BM_StableSort_uint16_PipeOrgan_4096                  10585 ns      45668 ns 
BM_StableSort_uint16_PipeOrgan_16384                 59151 ns     194358 ns 
BM_StableSort_uint16_PipeOrgan_65536                508579 ns     854692 ns 
BM_StableSort_uint16_PipeOrgan_262144               901294 ns    3606346 ns 
BM_StableSort_uint16_PipeOrgan_524288              2192498 ns   10449279 ns 
BM_StableSort_uint16_PipeOrgan_1048576             4204368 ns   11956606 ns 
BM_StableSort_uint16_QuickSortAdversary_1             3.20 ns       3.63 ns 
BM_StableSort_uint16_QuickSortAdversary_4             5.30 ns       6.38 ns 
BM_StableSort_uint16_QuickSortAdversary_16            14.5 ns       15.3 ns 
BM_StableSort_uint16_QuickSortAdversary_64             575 ns        611 ns 
BM_StableSort_uint16_QuickSortAdversary_256           2423 ns       2577 ns 
BM_StableSort_uint16_QuickSortAdversary_1024          2794 ns      16854 ns 
BM_StableSort_uint16_QuickSortAdversary_4096         10511 ns      75952 ns 
BM_StableSort_uint16_QuickSortAdversary_16384        56214 ns     333824 ns 
BM_StableSort_uint16_QuickSortAdversary_65536       422512 ns    1354867 ns 
BM_StableSort_uint16_QuickSortAdversary_262144      583301 ns    1564443 ns 
BM_StableSort_uint16_QuickSortAdversary_524288     1584319 ns    3265575 ns 
BM_StableSort_uint16_QuickSortAdversary_1048576    3197732 ns    7945245 ns 
BM_StableSort_int32_Random_1                          3.81 ns       3.70 ns 
BM_StableSort_int32_Random_4                          20.8 ns       23.4 ns 
BM_StableSort_int32_Random_16                          134 ns        161 ns 
BM_StableSort_int32_Random_64                          895 ns        984 ns 
BM_StableSort_int32_Random_256                        5640 ns       5897 ns 
BM_StableSort_int32_Random_1024                       6994 ns      32118 ns 
BM_StableSort_int32_Random_4096                      27367 ns     168960 ns 
BM_StableSort_int32_Random_16384                    183261 ns     843240 ns 
BM_StableSort_int32_Random_65536                    950914 ns    3953588 ns 
BM_StableSort_int32_Random_262144                  3673311 ns   16790171 ns 
BM_StableSort_int32_Random_524288                 11515700 ns   36023098 ns 
BM_StableSort_int32_Random_1048576                24492515 ns   78116028 ns 
BM_StableSort_int32_Ascending_1                       3.31 ns       4.48 ns 
BM_StableSort_int32_Ascending_4                       5.96 ns       6.99 ns 
BM_StableSort_int32_Ascending_16                      13.0 ns       16.0 ns 
BM_StableSort_int32_Ascending_64                      36.7 ns       53.0 ns 
BM_StableSort_int32_Ascending_256                      391 ns        471 ns 
BM_StableSort_int32_Ascending_1024                    2705 ns       2682 ns 
BM_StableSort_int32_Ascending_4096                    8773 ns      14231 ns 
BM_StableSort_int32_Ascending_16384                  34709 ns      70625 ns 
BM_StableSort_int32_Ascending_65536                 142907 ns     344482 ns 
BM_StableSort_int32_Ascending_262144                745483 ns    1591418 ns 
BM_StableSort_int32_Ascending_524288               1873701 ns    3190305 ns 
BM_StableSort_int32_Ascending_1048576              3851590 ns    7570095 ns 
BM_StableSort_int32_Descending_1                      3.22 ns       4.23 ns 
BM_StableSort_int32_Descending_4                      7.58 ns       11.2 ns 
BM_StableSort_int32_Descending_16                     63.9 ns       58.6 ns 
BM_StableSort_int32_Descending_64                     1133 ns       1017 ns 
BM_StableSort_int32_Descending_256                    4850 ns       4464 ns 
BM_StableSort_int32_Descending_1024                   7023 ns      18954 ns 
BM_StableSort_int32_Descending_4096                  28550 ns      75163 ns 
BM_StableSort_int32_Descending_16384                200880 ns     341104 ns 
BM_StableSort_int32_Descending_65536               1095910 ns    1398021 ns 
BM_StableSort_int32_Descending_262144              3818864 ns    5695486 ns 
BM_StableSort_int32_Descending_524288              5606779 ns   17593982 ns 
BM_StableSort_int32_Descending_1048576            16416366 ns   26649503 ns 
BM_StableSort_int32_SingleElement_1                   3.81 ns       3.71 ns 
BM_StableSort_int32_SingleElement_4                   6.57 ns       6.61 ns 
BM_StableSort_int32_SingleElement_16                  14.0 ns       15.8 ns 
BM_StableSort_int32_SingleElement_64                  38.7 ns       53.5 ns 
BM_StableSort_int32_SingleElement_256                  386 ns        554 ns 
BM_StableSort_int32_SingleElement_1024                2761 ns       3046 ns 
BM_StableSort_int32_SingleElement_4096                9179 ns      15188 ns 
BM_StableSort_int32_SingleElement_16384              34794 ns      70119 ns 
BM_StableSort_int32_SingleElement_65536             135190 ns     354755 ns 
BM_StableSort_int32_SingleElement_262144            760995 ns    1644072 ns 
BM_StableSort_int32_SingleElement_524288           1969575 ns    3343419 ns 
BM_StableSort_int32_SingleElement_1048576          4423816 ns    8346971 ns 
BM_StableSort_int32_PipeOrgan_1                       3.79 ns       3.63 ns 
BM_StableSort_int32_PipeOrgan_4                       6.21 ns       6.73 ns 
BM_StableSort_int32_PipeOrgan_16                      27.5 ns       26.0 ns 
BM_StableSort_int32_PipeOrgan_64                       291 ns        265 ns 
BM_StableSort_int32_PipeOrgan_256                     2557 ns       2518 ns 
BM_StableSort_int32_PipeOrgan_1024                    6765 ns      10976 ns 
BM_StableSort_int32_PipeOrgan_4096                   26373 ns      44537 ns 
BM_StableSort_int32_PipeOrgan_16384                 201466 ns     188582 ns 
BM_StableSort_int32_PipeOrgan_65536                1148533 ns     802368 ns 
BM_StableSort_int32_PipeOrgan_262144               2255177 ns    3477829 ns 
BM_StableSort_int32_PipeOrgan_524288               3947015 ns   10356637 ns 
BM_StableSort_int32_PipeOrgan_1048576             10274312 ns   16405366 ns 
BM_StableSort_int32_QuickSortAdversary_1              3.32 ns       4.36 ns 
BM_StableSort_int32_QuickSortAdversary_4              5.98 ns       7.44 ns 
BM_StableSort_int32_QuickSortAdversary_16             13.0 ns       16.3 ns 
BM_StableSort_int32_QuickSortAdversary_64              657 ns        616 ns 
BM_StableSort_int32_QuickSortAdversary_256            2569 ns       2483 ns 
BM_StableSort_int32_QuickSortAdversary_1024           6898 ns      19635 ns 
BM_StableSort_int32_QuickSortAdversary_4096          27092 ns      75108 ns 
BM_StableSort_int32_QuickSortAdversary_16384        190379 ns     316463 ns 
BM_StableSort_int32_QuickSortAdversary_65536       1109040 ns    1319018 ns 
BM_StableSort_int32_QuickSortAdversary_262144      4361925 ns    5472779 ns 
BM_StableSort_int32_QuickSortAdversary_524288      6528215 ns   17538983 ns 
BM_StableSort_int32_QuickSortAdversary_1048576    18345325 ns   27223926 ns 
BM_StableSort_uint32_Random_1                         3.67 ns       3.82 ns 
BM_StableSort_uint32_Random_4                         22.3 ns       21.8 ns 
BM_StableSort_uint32_Random_16                         155 ns        153 ns 
BM_StableSort_uint32_Random_64                         946 ns        976 ns 
BM_StableSort_uint32_Random_256                       5824 ns       6019 ns 
BM_StableSort_uint32_Random_1024                      4525 ns      32764 ns 
BM_StableSort_uint32_Random_4096                     17223 ns     158608 ns 
BM_StableSort_uint32_Random_16384                   134821 ns     748525 ns 
BM_StableSort_uint32_Random_65536                   716644 ns    3453325 ns 
BM_StableSort_uint32_Random_262144                 3628062 ns   16065414 ns 
BM_StableSort_uint32_Random_524288                10971334 ns   36567712 ns 
BM_StableSort_uint32_Random_1048576               22688377 ns   77533497 ns 
BM_StableSort_uint32_Ascending_1                      3.57 ns       3.44 ns 
BM_StableSort_uint32_Ascending_4                      5.73 ns       5.33 ns 
BM_StableSort_uint32_Ascending_16                     14.5 ns       14.0 ns 
BM_StableSort_uint32_Ascending_64                     50.3 ns       51.3 ns 
BM_StableSort_uint32_Ascending_256                     465 ns        467 ns 
BM_StableSort_uint32_Ascending_1024                   3042 ns       2530 ns 
BM_StableSort_uint32_Ascending_4096                   9842 ns      12207 ns 
BM_StableSort_uint32_Ascending_16384                 37994 ns      61726 ns 
BM_StableSort_uint32_Ascending_65536                148890 ns     294385 ns 
BM_StableSort_uint32_Ascending_262144               855080 ns    1422167 ns 
BM_StableSort_uint32_Ascending_524288              2154903 ns    3203018 ns 
BM_StableSort_uint32_Ascending_1048576             5002518 ns    7563817 ns 
BM_StableSort_uint32_Descending_1                     3.51 ns       3.40 ns 
BM_StableSort_uint32_Descending_4                     9.09 ns       7.95 ns 
BM_StableSort_uint32_Descending_16                    54.8 ns       74.4 ns 
BM_StableSort_uint32_Descending_64                    1003 ns       1305 ns 
BM_StableSort_uint32_Descending_256                   4545 ns       5300 ns 
BM_StableSort_uint32_Descending_1024                  4361 ns      21884 ns 
BM_StableSort_uint32_Descending_4096                 16018 ns      90534 ns 
BM_StableSort_uint32_Descending_16384               146274 ns     381943 ns 
BM_StableSort_uint32_Descending_65536               938248 ns    1536806 ns 
BM_StableSort_uint32_Descending_262144             3899300 ns    6387843 ns 
BM_StableSort_uint32_Descending_524288             5808157 ns   21959858 ns 
BM_StableSort_uint32_Descending_1048576           17520047 ns   26351912 ns 
BM_StableSort_uint32_SingleElement_1                  4.03 ns       3.97 ns 
BM_StableSort_uint32_SingleElement_4                  6.55 ns       6.41 ns 
BM_StableSort_uint32_SingleElement_16                 15.6 ns       15.8 ns 
BM_StableSort_uint32_SingleElement_64                 52.3 ns       58.7 ns 
BM_StableSort_uint32_SingleElement_256                 473 ns        485 ns 
BM_StableSort_uint32_SingleElement_1024               3020 ns       2407 ns 
BM_StableSort_uint32_SingleElement_4096               9998 ns      12527 ns 
BM_StableSort_uint32_SingleElement_16384             38072 ns      62228 ns 
BM_StableSort_uint32_SingleElement_65536            153706 ns     295662 ns 
BM_StableSort_uint32_SingleElement_262144           836532 ns    1477099 ns 
BM_StableSort_uint32_SingleElement_524288          2144900 ns    3157204 ns 
BM_StableSort_uint32_SingleElement_1048576         4995525 ns    7617233 ns 
BM_StableSort_uint32_PipeOrgan_1                      4.02 ns       3.99 ns 
BM_StableSort_uint32_PipeOrgan_4                      6.97 ns       6.84 ns 
BM_StableSort_uint32_PipeOrgan_16                     26.1 ns       29.7 ns 
BM_StableSort_uint32_PipeOrgan_64                      266 ns        333 ns 
BM_StableSort_uint32_PipeOrgan_256                    2462 ns       2892 ns 
BM_StableSort_uint32_PipeOrgan_1024                   4291 ns      12431 ns 
BM_StableSort_uint32_PipeOrgan_4096                  15638 ns      51449 ns 
BM_StableSort_uint32_PipeOrgan_16384                154563 ns     217460 ns 
BM_StableSort_uint32_PipeOrgan_65536                907724 ns     925873 ns 
BM_StableSort_uint32_PipeOrgan_262144              2394580 ns    4103575 ns 
BM_StableSort_uint32_PipeOrgan_524288              4177145 ns   13947158 ns 
BM_StableSort_uint32_PipeOrgan_1048576            11848224 ns   18807297 ns 
BM_StableSort_uint32_QuickSortAdversary_1             3.50 ns       3.43 ns 
BM_StableSort_uint32_QuickSortAdversary_4             5.88 ns       4.96 ns 
BM_StableSort_uint32_QuickSortAdversary_16            14.6 ns       14.0 ns 
BM_StableSort_uint32_QuickSortAdversary_64             576 ns        715 ns 
BM_StableSort_uint32_QuickSortAdversary_256           2353 ns       2797 ns 
BM_StableSort_uint32_QuickSortAdversary_1024          4176 ns      21775 ns 
BM_StableSort_uint32_QuickSortAdversary_4096         15565 ns      96188 ns 
BM_StableSort_uint32_QuickSortAdversary_16384       149092 ns     398332 ns 
BM_StableSort_uint32_QuickSortAdversary_65536       902488 ns    1552393 ns 
BM_StableSort_uint32_QuickSortAdversary_262144     3946517 ns    6560414 ns 
BM_StableSort_uint32_QuickSortAdversary_524288     6247114 ns   22420977 ns 
BM_StableSort_uint32_QuickSortAdversary_1048576   19892446 ns   26529576 ns 
BM_StableSort_int64_Random_1                          3.83 ns       3.98 ns 
BM_StableSort_int64_Random_4                          21.1 ns       24.0 ns 
BM_StableSort_int64_Random_16                          129 ns        136 ns 
BM_StableSort_int64_Random_64                          890 ns        906 ns 
BM_StableSort_int64_Random_256                        5542 ns       5901 ns 
BM_StableSort_int64_Random_1024                      16085 ns      33112 ns 
BM_StableSort_int64_Random_4096                      63895 ns     162181 ns 
BM_StableSort_int64_Random_16384                    348827 ns     790045 ns 
BM_StableSort_int64_Random_65536                   1488237 ns    3557506 ns 
BM_StableSort_int64_Random_262144                  8195713 ns   16315808 ns 
BM_StableSort_int64_Random_524288                 16586833 ns   38274075 ns 
BM_StableSort_int64_Random_1048576                40346644 ns   79182089 ns 
BM_StableSort_int64_Ascending_1                       3.76 ns       3.55 ns 
BM_StableSort_int64_Ascending_4                       5.82 ns       6.19 ns 
BM_StableSort_int64_Ascending_16                      11.7 ns       11.8 ns 
BM_StableSort_int64_Ascending_64                      32.9 ns       36.8 ns 
BM_StableSort_int64_Ascending_256                      415 ns        550 ns 
BM_StableSort_int64_Ascending_1024                    5352 ns       3347 ns 
BM_StableSort_int64_Ascending_4096                   17516 ns      19134 ns 
BM_StableSort_int64_Ascending_16384                  64147 ns      91099 ns 
BM_StableSort_int64_Ascending_65536                 322126 ns     434009 ns 
BM_StableSort_int64_Ascending_262144               1554669 ns    2057056 ns 
BM_StableSort_int64_Ascending_524288               3656527 ns    5016650 ns 
BM_StableSort_int64_Ascending_1048576             10469979 ns   12908613 ns 
BM_StableSort_int64_Descending_1                      4.09 ns       3.35 ns 
BM_StableSort_int64_Descending_4                      9.13 ns       8.01 ns 
BM_StableSort_int64_Descending_16                     76.8 ns       92.9 ns 
BM_StableSort_int64_Descending_64                     1336 ns       1417 ns 
BM_StableSort_int64_Descending_256                    5525 ns       5674 ns 
BM_StableSort_int64_Descending_1024                  17461 ns      22558 ns 
BM_StableSort_int64_Descending_4096                  64285 ns     102360 ns 
BM_StableSort_int64_Descending_16384                336946 ns     388940 ns 
BM_StableSort_int64_Descending_65536                837912 ns    1662169 ns 
BM_StableSort_int64_Descending_262144              3680806 ns    7494323 ns 
BM_StableSort_int64_Descending_524288             11023784 ns   24935033 ns 
BM_StableSort_int64_Descending_1048576            20023568 ns   33220712 ns 
BM_StableSort_int64_SingleElement_1                   3.37 ns       3.98 ns 
BM_StableSort_int64_SingleElement_4                   5.32 ns       6.92 ns 
BM_StableSort_int64_SingleElement_16                  10.9 ns       13.3 ns 
BM_StableSort_int64_SingleElement_64                  32.1 ns       43.8 ns 
BM_StableSort_int64_SingleElement_256                  420 ns        541 ns 
BM_StableSort_int64_SingleElement_1024                5689 ns       3381 ns 
BM_StableSort_int64_SingleElement_4096               19199 ns      17989 ns 
BM_StableSort_int64_SingleElement_16384              75754 ns      91963 ns 
BM_StableSort_int64_SingleElement_65536             357106 ns     500326 ns 
BM_StableSort_int64_SingleElement_262144           1672975 ns    2417734 ns 
BM_StableSort_int64_SingleElement_524288           3642891 ns    5200878 ns 
BM_StableSort_int64_SingleElement_1048576         11172007 ns   13729511 ns 
BM_StableSort_int64_PipeOrgan_1                       3.38 ns       3.94 ns 
BM_StableSort_int64_PipeOrgan_4                       5.73 ns       6.44 ns 
BM_StableSort_int64_PipeOrgan_16                      27.5 ns       29.0 ns 
BM_StableSort_int64_PipeOrgan_64                       310 ns        321 ns 
BM_StableSort_int64_PipeOrgan_256                     2761 ns       2918 ns 
BM_StableSort_int64_PipeOrgan_1024                   16105 ns      12525 ns 
BM_StableSort_int64_PipeOrgan_4096                   65289 ns      59990 ns 
BM_StableSort_int64_PipeOrgan_16384                 341757 ns     270636 ns 
BM_StableSort_int64_PipeOrgan_65536                 587452 ns    1126132 ns 
BM_StableSort_int64_PipeOrgan_262144               2837955 ns    5034180 ns 
BM_StableSort_int64_PipeOrgan_524288               6617313 ns   15267354 ns 
BM_StableSort_int64_PipeOrgan_1048576             15208796 ns   23162989 ns 
BM_StableSort_int64_QuickSortAdversary_1              3.77 ns       3.45 ns 
BM_StableSort_int64_QuickSortAdversary_4              5.55 ns       5.20 ns 
BM_StableSort_int64_QuickSortAdversary_16             12.5 ns       11.5 ns 
BM_StableSort_int64_QuickSortAdversary_64              646 ns        750 ns 
BM_StableSort_int64_QuickSortAdversary_256            2655 ns       3539 ns 
BM_StableSort_int64_QuickSortAdversary_1024          16373 ns      22349 ns 
BM_StableSort_int64_QuickSortAdversary_4096          62306 ns      97248 ns 
BM_StableSort_int64_QuickSortAdversary_16384        321755 ns     388084 ns 
BM_StableSort_int64_QuickSortAdversary_65536       1374694 ns    1596091 ns 
BM_StableSort_int64_QuickSortAdversary_262144      4374661 ns    6894139 ns 
BM_StableSort_int64_QuickSortAdversary_524288     12736074 ns   23932229 ns 
BM_StableSort_int64_QuickSortAdversary_1048576    22615219 ns   33355629 ns 
BM_StableSort_uint64_Random_1                         3.82 ns       3.49 ns 
BM_StableSort_uint64_Random_4                         22.4 ns       23.4 ns 
BM_StableSort_uint64_Random_16                         154 ns        146 ns 
BM_StableSort_uint64_Random_64                         924 ns        926 ns 
BM_StableSort_uint64_Random_256                       5864 ns       5913 ns 
BM_StableSort_uint64_Random_1024                      7168 ns      31746 ns 
BM_StableSort_uint64_Random_4096                     27668 ns     154224 ns 
BM_StableSort_uint64_Random_16384                   219526 ns     755205 ns 
BM_StableSort_uint64_Random_65536                   965251 ns    3490165 ns 
BM_StableSort_uint64_Random_262144                 6262162 ns   15889589 ns 
BM_StableSort_uint64_Random_524288                12530078 ns   36458581 ns 
BM_StableSort_uint64_Random_1048576               38462191 ns   75168445 ns 
BM_StableSort_uint64_Ascending_1                      3.30 ns       3.35 ns 
BM_StableSort_uint64_Ascending_4                      5.65 ns       5.84 ns 
BM_StableSort_uint64_Ascending_16                     14.7 ns       12.6 ns 
BM_StableSort_uint64_Ascending_64                     55.3 ns       34.6 ns 
BM_StableSort_uint64_Ascending_256                     513 ns        533 ns 
BM_StableSort_uint64_Ascending_1024                   5541 ns       3189 ns 
BM_StableSort_uint64_Ascending_4096                  17706 ns      20326 ns 
BM_StableSort_uint64_Ascending_16384                 66420 ns      93757 ns 
BM_StableSort_uint64_Ascending_65536                341425 ns     435016 ns 
BM_StableSort_uint64_Ascending_262144              1595691 ns    2088317 ns 
BM_StableSort_uint64_Ascending_524288              3808703 ns    5092832 ns 
BM_StableSort_uint64_Ascending_1048576            11060417 ns   13023250 ns 
BM_StableSort_uint64_Descending_1                     3.29 ns       3.35 ns 
BM_StableSort_uint64_Descending_4                     8.65 ns       7.92 ns 
BM_StableSort_uint64_Descending_16                    54.7 ns       80.2 ns 
BM_StableSort_uint64_Descending_64                    1028 ns       1307 ns 
BM_StableSort_uint64_Descending_256                   4521 ns       5635 ns 
BM_StableSort_uint64_Descending_1024                  7122 ns      23323 ns 
BM_StableSort_uint64_Descending_4096                 30538 ns      95892 ns 
BM_StableSort_uint64_Descending_16384               195565 ns     392721 ns 
BM_StableSort_uint64_Descending_65536               852002 ns    1720358 ns 
BM_StableSort_uint64_Descending_262144             3737884 ns    7484130 ns 
BM_StableSort_uint64_Descending_524288            11159345 ns   25690770 ns 
BM_StableSort_uint64_Descending_1048576           20648864 ns   33057383 ns 
BM_StableSort_uint64_SingleElement_1                  3.62 ns       4.10 ns 
BM_StableSort_uint64_SingleElement_4                  6.73 ns       6.64 ns 
BM_StableSort_uint64_SingleElement_16                 14.9 ns       11.3 ns 
BM_StableSort_uint64_SingleElement_64                 52.0 ns       33.0 ns 
BM_StableSort_uint64_SingleElement_256                 511 ns        582 ns 
BM_StableSort_uint64_SingleElement_1024               6499 ns       3287 ns 
BM_StableSort_uint64_SingleElement_4096              22190 ns      17616 ns 
BM_StableSort_uint64_SingleElement_16384             84378 ns      86885 ns 
BM_StableSort_uint64_SingleElement_65536            466257 ns     457144 ns 
BM_StableSort_uint64_SingleElement_262144          1993687 ns    2361999 ns 
BM_StableSort_uint64_SingleElement_524288          4759565 ns    5096771 ns 
BM_StableSort_uint64_SingleElement_1048576        12426111 ns   13468453 ns 
BM_StableSort_uint64_PipeOrgan_1                      3.73 ns       3.94 ns 
BM_StableSort_uint64_PipeOrgan_4                      7.18 ns       7.54 ns 
BM_StableSort_uint64_PipeOrgan_16                     25.2 ns       29.1 ns 
BM_StableSort_uint64_PipeOrgan_64                      260 ns        321 ns 
BM_StableSort_uint64_PipeOrgan_256                    2468 ns       2970 ns 
BM_StableSort_uint64_PipeOrgan_1024                   7025 ns      12912 ns 
BM_StableSort_uint64_PipeOrgan_4096                  28968 ns      53379 ns 
BM_StableSort_uint64_PipeOrgan_16384                194156 ns     239790 ns 
BM_StableSort_uint64_PipeOrgan_65536                599491 ns     993800 ns 
BM_StableSort_uint64_PipeOrgan_262144              2648585 ns    4689680 ns 
BM_StableSort_uint64_PipeOrgan_524288              7621109 ns   15401808 ns 
BM_StableSort_uint64_PipeOrgan_1048576            15608814 ns   23484821 ns 
BM_StableSort_uint64_QuickSortAdversary_1             3.38 ns       3.54 ns 
BM_StableSort_uint64_QuickSortAdversary_4             5.50 ns       6.03 ns 
BM_StableSort_uint64_QuickSortAdversary_16            14.2 ns       11.0 ns 
BM_StableSort_uint64_QuickSortAdversary_64             597 ns        688 ns 
BM_StableSort_uint64_QuickSortAdversary_256           2446 ns       2818 ns 
BM_StableSort_uint64_QuickSortAdversary_1024          7266 ns      20319 ns 
BM_StableSort_uint64_QuickSortAdversary_4096         31155 ns      89112 ns 
BM_StableSort_uint64_QuickSortAdversary_16384       201033 ns     390574 ns 
BM_StableSort_uint64_QuickSortAdversary_65536       871014 ns    1685639 ns 
BM_StableSort_uint64_QuickSortAdversary_262144     3978535 ns    7265830 ns 
BM_StableSort_uint64_QuickSortAdversary_524288    10279721 ns   25350004 ns 
BM_StableSort_uint64_QuickSortAdversary_1048576   20256585 ns   33054393 ns 
```
## Changes
- Delete DirectX length intrinsic
- Delete HLSL length lang builtin
- Implement length algorithm entirely in the header.

## History
- In the past if an HLSL intrinsic lowered to either a spirv op code or
a DXIL opcode we represented it with intrinsics

## Why we are moving away?
- To make HLSL apis more portable the team decided that it makes sense
for some intrinsics to be defined only in the header.
- Since there tends to be more SPIRV opcodes than DXIL opcodes the plan
is to support SPIRV opcodes either with target specific builtins or via
pattern matching.
…lvm#120916)

Also use getPointerAlignment when trying to use alignment and
dereferenceable assumptions. This catches cases where dereferencable is
known via the assumption but alignment is known via getPointerAlignment
(e.g. via argument attribute or align of 1)

PR: llvm#120916
llvm#122151 added this test with an
invalid SEW. Use a valid SEW here.
OpenACC data clause operations previously required that the variable
operand implemented PointerLikeType interface. This was a reasonable
constraint because the dialects currently mixed with `acc` do use
pointers to represent variables. However, this forces the "pointer"
abstraction to be exposed too early and some cases are not cleanly
representable through this approach (more specifically FIR's `fix.box`
abstraction).

Thus, relax this by allowing a variable to be a type which implements
either `PointerLikeType` interface or `MappableType` interface.
…uilds (llvm#120914)

The changes in llvm#87822 introduced a regression where Flang could no
longer be built standalone without explicitly specifying all of
LLVM_DIR, CLANG_DIR and MLIR_DIR. Restore the earlier logic that used
these paths as hints, and supported finding system-wide LLVM install via
default paths. Instead, make paths absolute after locating the packages,
using the paths CMake determined.

-----

@vzakhari, could you confirm that this doesn't break your use case?
…lvm#122316)

This is a NFC. Duplicate mc test file for gfx12 vop3c/vop3cx to
true16/fake16 mode and update it with +real-true16/-real-true16 flag.

This is for the upcoming true16 changes
artagnon and others added 28 commits January 9, 2025 20:18
Pre-commit some tests in preparation to teach ValueTracking's
implied-cond about samesign.
…lvm#120327)

The `sycl_kernel_entry_point` attribute is used to declare a function that
defines a pattern for an offload kernel entry point. The attribute requires
a single type argument that specifies a class type that meets the requirements
for a SYCL kernel name as described in section 5.2, "Naming of kernels", of
the SYCL 2020 specification. A unique kernel name type is required for each
function declared with the attribute. The attribute may not first appear on a
declaration that follows a definition of the function. The function is
required to have a non-deduced `void` return type. The function must not be
a non-static member function, be deleted or defaulted, be declared with the
`constexpr` or `consteval` specifiers, be declared with the `[[noreturn]]`
attribute, be a coroutine, or accept variadic arguments.

Diagnostics are not yet provided for the following:
- Use of a type as a kernel name that does not satisfy the forward
  declarability requirements specified in section 5.2, "Naming of kernels",
  of the SYCL 2020 specification.
- Use of a type as a parameter of the attributed function that does not
  satisfy the kernel parameter requirements specified in section 4.12.4,
  "Rules for parameter passing to kernels", of the SYCL 2020 specification
  (each such function parameter constitutes a kernel parameter).
- Use of language features that are not permitted in device functions as
  specified in section 5.4, "Language restrictions for device functions",
  of the SYCL 2020 specification.

There are several issues noted by various FIXME comments.
- The diagnostic generated for kernel name conflicts needs additional work
  to better detail the relevant source locations; such as the location of
  each declaration as well as the original source of each kernel name.
- A number of the tests illustrate spurious errors being produced due to
  attributes that appertain to function templates being instantiated too
  early (during overload resolution as opposed to after an overload is
  selected).

Included changes allow the `SYCLKernelEntryPointAttr` attribute to be
marked as invalid if a `sycl_kernel_entry_point` attribute is used incorrectly.
This is intended to prevent trying to emit an offload kernel entry point
without having to mark the associated function as invalid since doing so
would affect overload resolution; which this attribute should not do.
Unfortunately, Clang eagerly instantiates attributes that appertain to
functions with the result that errors might be issued for function
declarations that are never selected by overload resolution. Tests have
been added to demonstrate this. Further work will be needed to address
these issues (for this and other attributes).
Summary:
This isn't used anymore, I moved the GPU extensions into `offload/`.
After llvm#120563 malloc_size also
needs intercepting on Apple platforms, otherwise all type-sanitized
binaries crash on startup with an objc error:
realized class 0x12345 has corrupt data pointer: malloc_size(0x567) = 0

PR: llvm#122133
llvm#122371)

…llvm#121991)"

This reverts commit f8f8598.

This breaks ARMv7 and s390x buildbot with the following message:
```
llvm-exegesis error: No available targets are compatible with triple "armv8l-unknown-linux-gnueabihf"
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/tcwg-buildbot/worker/clang-armv7-2stage/stage2/bin/FileCheck /home/tcwg-buildbot/worker/clang-armv7-2stage/llvm/llvm/test/tools/llvm-exegesis/dry-run-measurement.test
```
…m#120662)

The Clang tablegen built-in function prototype parser has the `__bf16`
type missing. This patch adds the missing type to the parser.
…pace declaration of a negative test. (llvm#122375)

Commit 1a73654 added a missing
diagnostic for incorrect placement of an attribute in a namespace
declaration. This change corrects a SYCL test that inadvertently
exercised the `sycl_kernel_entry_point` attribute in the wrong
declaration location.
These make cross compiling the test suite more difficult, as you need
the
sysroot to contain these headers and libraries cross compiled for your
target.
It's straightforward to stick with the corresponding C headers.
If the pointer to be checked is statically known to be zero, the tag
check will always pass since:
1) the tag is zero
2) shadow memory for address 0 is initialized to 0 and never updated.
We can therefore elide the tag check.

We perform the elision in two places:
1) the HWASan pass
2) when lowering the CHECK_MEMACCESS intrinsic. Conceivably, the HWASan
pass may encounter a "cannot currently statically prove to be null"
pointer (and is therefore unable to omit the intrinsic) that later
optimization passes convert into a statically known-null pointer. As a
last line of defense, we perform elision here too.

This also updates the tests from
llvm#122186
This patch improves the linker’s ability to estimate stub reachability
in the `TextOutputSection::estimateStubsInRangeVA` function. It does so
by including thunks that have already been placed ahead of the current
call site address when calculating the threshold for direct stub calls.

Before this fix, the estimation process overlooked existing forward
thunks. This could result in some thunks not being inserted where
needed. In rare situations, particularly with large and specially
arranged codebases, this might lead to branch instructions being out of
range, causing linking errors.

Although this patch successfully addresses the problem, it is not
feasible to create a test for this issue. The specific layout and order
of thunk creation required to reproduce the corner case are too complex,
making test creation impractical.

Example error messages the issue could generate:
```
ld64.lld: error: banana.o:(symbol OUTLINED_FUNCTION_24949_3875): relocation BRANCH26 is out of range: 134547892 is not in [-134217728, 134217727]; references objc_autoreleaseReturnValue
ld64.lld: error: main.o:(symbol _main+0xc): relocation BRANCH26 is out of range: 134544132 is not in [-134217728, 134217727]; references objc_release
```
Looks like the smart pointer I removed was being used to free the underlying
object.

Fixes: llvm#122369
This allows changing the way pretty-printed code is formatted.
These were created with operator new (see `Test::createCallable`), so operator
delete should be used instead of free().

Fixes: llvm#122369
Fixes: llvm#122378
When creating a gSYM, `llvm-gsymutil` operates by default in
multi-threaded mode. If merged functions are present, determinism cannot
be guaranteed if parallel processing is enabled.

So, for our merged functions tests, we disable multi-threading during
gSYM creation.
1. add `static` for internal linkage functions
2. remove `clang` prefix for `QualType`
…er (llvm#122365)

The caller `AsmPrinter::emitJumpTableInfo` checks [1] `MJTI` is not a
null pointer before calling `emitJumpTableEntry` or
`emitJumpTableSizesSection`.

This patch updates callee function's signature to accept const
reference, this way it's explicit `MJTI` won't be nullptr inside the
callee.

[1]
https://github.com/llvm/llvm-project/blob/9d5299eb61a64cd4df5fefa0299b0cf8d917978f/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp#L2857
This should be reverted once the crash reported in llvm#118710 has been
analyzed.
@jorickert jorickert merged commit 9e66222 into bump_to_a7591767 Apr 14, 2025
12 checks passed
@jorickert jorickert deleted the bump_to_d0373dbe branch April 14, 2025 07:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.