
Update */Manifest.toml #100

Merged
mergify[bot] merged 62 commits into master from create-pull-request/pkg-update
Jun 28, 2020

Conversation

Collaborator

tkf-bot commented Mar 28, 2020

Commit Message

Update */Manifest.toml

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmarks:
    • Target: 28 Mar 2020 - 00:21
    • Baseline: 28 Mar 2020 - 00:26
  • Package commits:
    • Target: 0c2094
    • Baseline: 0d6d6b
  • Julia commits:
    • Target: b8e9a9
    • Baseline: b8e9a9
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|---|---|---|
| ["findfirst", "0%", "tx"] | 1.39 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "0%", "tx-noterm"] | 0.93 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "10%", "base"] | 1.74 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "10%", "tx"] | 1.16 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "10%", "tx-noterm"] | 1.07 (5%) ❌ | 0.94 (1%) ✅ |
| ["findfirst", "10%", "tx-seq"] | 1.16 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "20%", "base"] | 1.72 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "20%", "tx"] | 1.16 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "20%", "tx-noterm"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "20%", "tx-seq"] | 1.16 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "base"] | 1.57 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx"] | 1.16 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx-noterm"] | 1.10 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "base"] | 1.50 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx"] | 1.20 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx-noterm"] | 1.18 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx-seq"] | 1.16 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "50%", "base"] | 1.54 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "50%", "tx"] | 1.15 (5%) ❌ | 1.00 (1%) |
| ["foreach", "base", "A .= B .+ B'"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["foreach", "base", "A .= B .+ C"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["foreach", "broadcast", "A .= B .+ B'"] | 1.25 (5%) ❌ | 1.00 (1%) |
| ["foreach", "tx", "A .= B .+ B'"] | 1.21 (5%) ❌ | 1.00 (1%) |
| ["foreach", "tx", "A .= B .+ C"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "base", "Vector"] | 1.38 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "tx", "Transpose"] | 1.22 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "tx", "Vector"] | 1.34 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "man"] | 1.43 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] | 1.42 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "tx", ":simd => false"] | 1.24 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "tx", ":simd => true"] | 1.40 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "man"] | 1.40 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 61197.57 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 59434.08 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 61203.67 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 1.30 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 1.31 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "Base"] | 1.16 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "ThreadsX.MergeSort"] | 1.17 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "ThreadsX.QuickSort"] | 1.14 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (wide)", "ThreadsX.QuickSort"] | 1.13 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (narrow)", "ThreadsX.MergeSort"] | 1.15 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] | 1.15 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (wide)", "Base"] | 1.17 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (wide)", "ThreadsX.QuickSort"] | 1.35 (5%) ❌ | 1.00 (1%) |
| ["sort", "reversed", "Base"] | 1.16 (5%) ❌ | 1.00 (1%) |
| ["sort", "reversed", "ThreadsX.MergeSort"] | 1.17 (5%) ❌ | 1.00 (1%) |
| ["sort", "reversed", "ThreadsX.QuickSort"] | 1.17 (5%) ❌ | 1.00 (1%) |
| ["sort", "reversed", "ThreadsX.StableQuickSort"] | 1.15 (5%) ❌ | 1.00 (1%) |
| ["sort", "sorted", "Base"] | 1.21 (5%) ❌ | 1.00 (1%) |
| ["sort", "sorted", "ThreadsX.QuickSort"] | 1.24 (5%) ❌ | 1.00 (1%) |
| ["unique", "rand(1:10, 1000000)", "base"] | 1.17 (5%) ❌ | 1.00 (1%) |
| ["unique", "rand(1:10, 1000000)", "tx"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["unique", "rand(1:1000, 1000000)", "base"] | 1.15 (5%) ❌ | 1.00 (1%) |
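
The ratios in the table above can be reproduced from the per-job timings reported further down this page. The following is a minimal Python sketch, not the actual PkgBenchmark or BenchmarkTools code: the names `time_ratio` and `judge` are made up for illustration, and the threshold rule (a result is flagged only when the ratio leaves the band 1.0 ± tolerance) is an assumption inferred from the marks in the table.

```python
def time_ratio(target_ns: float, baseline_ns: float) -> float:
    """Ratio of the target measurement to the baseline measurement."""
    return target_ns / baseline_ns


def judge(ratio: float, tolerance: float) -> str:
    """Classify a ratio against a fractional noise tolerance (assumed rule)."""
    if ratio > 1.0 + tolerance:
        return "regression"
    if ratio < 1.0 - tolerance:
        return "improvement"
    return "invariant"


# Times copied from the 28 Mar 2020 target and baseline tables, in nanoseconds.
tx = time_ratio(25_502.0, 18_300.0)      # ["findfirst", "0%", "tx"]
noterm = time_ratio(17_901.0, 19_300.0)  # ["findfirst", "0%", "tx-noterm"]

print(round(tx, 2), judge(tx, 0.05))
print(round(noterm, 2), judge(noterm, 0.05))
```

For `["findfirst", "0%", "tx"]` this yields a ratio of 1.39 classified as a regression, and for `"tx-noterm"` a ratio of 0.93 classified as an improvement, matching the ❌ and ✅ marks. The same arithmetic accounts for the ratios of roughly 60,000 on the `["foreach_seq_double", "linear", "tx", ...]` rows: the baseline job reports 0.001 ns for them, so dividing a target time of about 61 ns by that near-zero value gives a very large number.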

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Target

Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.0.0-1032-azure #34-Ubuntu SMP Mon Feb 10 19:37:25 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      41448 s          0 s       2246 s      40396 s          0 s
       #2  2095 MHz      65841 s          0 s       2817 s      15947 s          0 s
       
  Memory: 6.782737731933594 GB (2671.3828125 MB free)
  Uptime: 862.0 sec
  Load Avg:  1.2685546875  1.33251953125  0.916015625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.0.0-1032-azure #34-Ubuntu SMP Mon Feb 10 19:37:25 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      63766 s          0 s       3127 s      51859 s          0 s
       #2  2095 MHz      89668 s          0 s       3461 s      26147 s          0 s
       
  Memory: 6.782737731933594 GB (3071.38671875 MB free)
  Uptime: 1211.0 sec
  Load Avg:  1.3525390625  1.40087890625  1.08251953125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Mar 2020 - 0:21
  • Package commit: 0c2094
  • Julia commit: b8e9a9
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| ["findfirst", "0%", "base"] | 2.200 ns (5%) | | | |
| ["findfirst", "0%", "tx"] | 25.502 μs (5%) | | 11.97 KiB (1%) | 219 |
| ["findfirst", "0%", "tx-noterm"] | 17.901 μs (5%) | | 11.97 KiB (1%) | 218 |
| ["findfirst", "0%", "tx-seq"] | 182.200 ns (5%) | | 544 bytes (1%) | 14 |
| ["findfirst", "10%", "base"] | 101.705 μs (5%) | | | |
| ["findfirst", "10%", "tx"] | 87.505 μs (5%) | | 14.36 KiB (1%) | 266 |
| ["findfirst", "10%", "tx-noterm"] | 238.213 μs (5%) | | 35.09 KiB (1%) | 641 |
| ["findfirst", "10%", "tx-seq"] | 102.005 μs (5%) | | 560 bytes (1%) | 15 |
| ["findfirst", "20%", "base"] | 203.511 μs (5%) | | | |
| ["findfirst", "20%", "tx"] | 165.209 μs (5%) | | 21.33 KiB (1%) | 393 |
| ["findfirst", "20%", "tx-noterm"] | 264.914 μs (5%) | | 28.30 KiB (1%) | 521 |
| ["findfirst", "20%", "tx-seq"] | 204.111 μs (5%) | | 560 bytes (1%) | 15 |
| ["findfirst", "30%", "base"] | 278.220 μs (5%) | | | |
| ["findfirst", "30%", "tx"] | 223.315 μs (5%) | | 28.27 KiB (1%) | 520 |
| ["findfirst", "30%", "tx-noterm"] | 297.221 μs (5%) | | 28.31 KiB (1%) | 522 |
| ["findfirst", "30%", "tx-seq"] | 263.418 μs (5%) | | 560 bytes (1%) | 15 |
| ["findfirst", "40%", "base"] | 354.122 μs (5%) | | | |
| ["findfirst", "40%", "tx"] | 313.718 μs (5%) | | 35.30 KiB (1%) | 650 |
| ["findfirst", "40%", "tx-noterm"] | 362.120 μs (5%) | | 35.28 KiB (1%) | 648 |
| ["findfirst", "40%", "tx-seq"] | 407.524 μs (5%) | | 560 bytes (1%) | 15 |
| ["findfirst", "50%", "base"] | 454.430 μs (5%) | | | |
| ["findfirst", "50%", "tx"] | 358.422 μs (5%) | | 37.67 KiB (1%) | 696 |
| ["findfirst", "50%", "tx-noterm"] | 403.125 μs (5%) | | 53.91 KiB (1%) | 993 |
| ["findfirst", "50%", "tx-seq"] | 438.628 μs (5%) | | 560 bytes (1%) | 15 |
| ["foreach", "base", "A .= B .+ B'"] | 289.874 ms (5%) | 25.588 ms | 305.18 MiB (1%) | 16000002 |
| ["foreach", "base", "A .= B .+ C"] | 227.475 ms (5%) | 25.606 ms | 305.18 MiB (1%) | 16000001 |
| ["foreach", "broadcast", "A .= B .+ B'"] | 8.767 ms (5%) | | | |
| ["foreach", "broadcast", "A .= B .+ C"] | 6.488 ms (5%) | | | |
| ["foreach", "tx", "A .= B .+ B'"] | 4.500 ms (5%) | | 25.92 KiB (1%) | 359 |
| ["foreach", "tx", "A .= B .+ C"] | 3.378 ms (5%) | | 12.75 KiB (1%) | 124 |
| ["foreach_seq", "base", "Matrix"] | 626.523 μs (5%) | | | |
| ["foreach_seq", "base", "Transpose"] | 1.768 ms (5%) | | | |
| ["foreach_seq", "base", "Vector"] | 857.031 μs (5%) | | | |
| ["foreach_seq", "tx", "Matrix"] | 636.524 μs (5%) | | | |
| ["foreach_seq", "tx", "Transpose"] | 1.218 ms (5%) | | 16 bytes (1%) | 1 |
| ["foreach_seq", "tx", "Vector"] | 834.530 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "man"] | 24.401 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] | 24.401 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "tx", ":simd => false"] | 25.601 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "tx", ":simd => true"] | 23.801 μs (5%) | | | |
| ["foreach_seq_double", "linear", "man"] | 61.299 ns (5%) | | | |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 61.198 ns (5%) | | | |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 59.434 ns (5%) | | | |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 61.204 ns (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 910.000 ns (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 920.000 ns (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] | 2.300 μs (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] | 2.300 μs (5%) | | | |
| ["sort", "F64 (narrow)", "Base"] | 2.479 ms (5%) | | | |
| ["sort", "F64 (narrow)", "ThreadsX.MergeSort"] | 2.702 ms (5%) | | 1.19 MiB (1%) | 534 |
| ["sort", "F64 (narrow)", "ThreadsX.QuickSort"] | 569.224 μs (5%) | | 965.14 KiB (1%) | 1228 |
| ["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] | 587.324 μs (5%) | | 1.02 MiB (1%) | 1248 |
| ["sort", "F64 (wide)", "Base"] | 5.399 ms (5%) | | | |
| ["sort", "F64 (wide)", "ThreadsX.MergeSort"] | 4.384 ms (5%) | | 1.19 MiB (1%) | 564 |
| ["sort", "F64 (wide)", "ThreadsX.QuickSort"] | 3.341 ms (5%) | | 1.01 MiB (1%) | 2149 |
| ["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] | 3.450 ms (5%) | | 1.39 MiB (1%) | 2196 |
| ["sort", "I64 (narrow)", "Base"] | 113.305 μs (5%) | | 160 bytes (1%) | 1 |
| ["sort", "I64 (narrow)", "ThreadsX.MergeSort"] | 118.905 μs (5%) | | 864 bytes (1%) | 17 |
| ["sort", "I64 (narrow)", "ThreadsX.QuickSort"] | 105.505 μs (5%) | | 864 bytes (1%) | 17 |
| ["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] | 118.304 μs (5%) | | 864 bytes (1%) | 17 |
| ["sort", "I64 (wide)", "Base"] | 6.198 ms (5%) | | | |
| ["sort", "I64 (wide)", "ThreadsX.MergeSort"] | 3.731 ms (5%) | | 1.19 MiB (1%) | 554 |
| ["sort", "I64 (wide)", "ThreadsX.QuickSort"] | 3.623 ms (5%) | | 1.01 MiB (1%) | 2236 |
| ["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] | 3.150 ms (5%) | | 1.40 MiB (1%) | 2272 |
| ["sort", "reversed", "Base"] | 886.635 μs (5%) | | | |
| ["sort", "reversed", "ThreadsX.MergeSort"] | 1.166 ms (5%) | | 1.18 MiB (1%) | 435 |
| ["sort", "reversed", "ThreadsX.QuickSort"] | 931.838 μs (5%) | | 998.77 KiB (1%) | 1872 |
| ["sort", "reversed", "ThreadsX.StableQuickSort"] | 1.293 ms (5%) | | 1.36 MiB (1%) | 1904 |
| ["sort", "sorted", "Base"] | 907.136 μs (5%) | | | |
| ["sort", "sorted", "ThreadsX.MergeSort"] | 828.132 μs (5%) | | 1.18 MiB (1%) | 431 |
| ["sort", "sorted", "ThreadsX.QuickSort"] | 925.036 μs (5%) | | 998.77 KiB (1%) | 1872 |
| ["sort", "sorted", "ThreadsX.StableQuickSort"] | 1.004 ms (5%) | | 1.36 MiB (1%) | 1903 |
| ["unique", "rand(1:10, 1000000)", "base"] | 9.300 ms (5%) | | 832 bytes (1%) | 8 |
| ["unique", "rand(1:10, 1000000)", "tx"] | 4.726 ms (5%) | | 50.98 KiB (1%) | 882 |
| ["unique", "rand(1:1000, 1000000)", "base"] | 8.235 ms (5%) | | 65.95 KiB (1%) | 27 |
| ["unique", "rand(1:1000, 1000000)", "tx"] | 4.341 ms (5%) | | 1.07 MiB (1%) | 1186 |
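
Each ID in the table above is a path of keys, read from the outermost group down to the individual benchmark. As a rough illustration of how such a path indexes into a nested suite, here is a minimal Python sketch. The `suite` dictionary and the `lookup` helper are hypothetical stand-ins (the real suite is a Julia `BenchmarkGroup`, not a Python dict), and the stored strings are simply times copied from the table.

```python
from functools import reduce

# Hypothetical nested suite: group -> subgroup -> benchmark key -> value.
suite = {
    "findfirst": {
        "0%": {"base": "2.200 ns", "tx": "25.502 μs"},
    },
    "sort": {
        "sorted": {"Base": "907.136 μs"},
    },
}


def lookup(groups, benchmark_id):
    """Follow each key of the ID in order, descending one level per key."""
    return reduce(lambda group, key: group[key], benchmark_id, groups)


print(lookup(suite, ["findfirst", "0%", "tx"]))
print(lookup(suite, ["sort", "sorted", "Base"]))
```

IDs of different lengths work the same way, since the helper descends once per key regardless of how deep the path goes.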

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.0.0-1032-azure #34-Ubuntu SMP Mon Feb 10 19:37:25 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      41448 s          0 s       2246 s      40396 s          0 s
       #2  2095 MHz      65841 s          0 s       2817 s      15947 s          0 s
       
  Memory: 6.782737731933594 GB (2671.3828125 MB free)
  Uptime: 862.0 sec
  Load Avg:  1.2685546875  1.33251953125  0.916015625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Mar 2020 - 0:26
  • Package commit: 0d6d6b
  • Julia commit: b8e9a9
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| ["findfirst", "0%", "base"] | 2.200 ns (5%) | | | |
| ["findfirst", "0%", "tx"] | 18.300 μs (5%) | | 11.97 KiB (1%) | 219 |
| ["findfirst", "0%", "tx-noterm"] | 19.300 μs (5%) | | 11.97 KiB (1%) | 218 |
| ["findfirst", "0%", "tx-seq"] | 178.629 ns (5%) | | 544 bytes (1%) | 14 |
| ["findfirst", "10%", "base"] | 58.406 μs (5%) | | | |
| ["findfirst", "10%", "tx"] | 75.208 μs (5%) | | 14.36 KiB (1%) | 266 |
| ["findfirst", "10%", "tx-noterm"] | 221.925 μs (5%) | | 37.44 KiB (1%) | 685 |
| ["findfirst", "10%", "tx-seq"] | 87.810 μs (5%) | | 560 bytes (1%) | 15 |
| ["findfirst", "20%", "base"] | 118.013 μs (5%) | | | |
| ["findfirst", "20%", "tx"] | 142.916 μs (5%) | | 21.33 KiB (1%) | 393 |
| ["findfirst", "20%", "tx-noterm"] | 238.726 μs (5%) | | 28.27 KiB (1%) | 519 |
| ["findfirst", "20%", "tx-seq"] | 175.720 μs (5%) | | 560 bytes (1%) | 15 |
| ["findfirst", "30%", "base"] | 177.006 μs (5%) | | | |
| ["findfirst", "30%", "tx"] | 192.607 μs (5%) | | 28.27 KiB (1%) | 520 |
| ["findfirst", "30%", "tx-noterm"] | 270.110 μs (5%) | | 28.30 KiB (1%) | 521 |
| ["findfirst", "30%", "tx-seq"] | 263.309 μs (5%) | | 560 bytes (1%) | 15 |
| ["findfirst", "40%", "base"] | 235.728 μs (5%) | | | |
| ["findfirst", "40%", "tx"] | 261.631 μs (5%) | | 35.30 KiB (1%) | 650 |
| ["findfirst", "40%", "tx-noterm"] | 308.035 μs (5%) | | 35.31 KiB (1%) | 650 |
| ["findfirst", "40%", "tx-seq"] | 350.842 μs (5%) | | 560 bytes (1%) | 15 |
| ["findfirst", "50%", "base"] | 294.310 μs (5%) | | | |
| ["findfirst", "50%", "tx"] | 312.739 μs (5%) | | 37.67 KiB (1%) | 696 |
| ["findfirst", "50%", "tx-noterm"] | 401.849 μs (5%) | | 53.91 KiB (1%) | 993 |
| ["findfirst", "50%", "tx-seq"] | 438.554 μs (5%) | | 560 bytes (1%) | 15 |
| ["foreach", "base", "A .= B .+ B'"] | 267.219 ms (5%) | 28.235 ms | 305.18 MiB (1%) | 16000002 |
| ["foreach", "base", "A .= B .+ C"] | 203.833 ms (5%) | 27.927 ms | 305.18 MiB (1%) | 16000001 |
| ["foreach", "broadcast", "A .= B .+ B'"] | 7.014 ms (5%) | | | |
| ["foreach", "broadcast", "A .= B .+ C"] | 6.332 ms (5%) | | | |
| ["foreach", "tx", "A .= B .+ B'"] | 3.715 ms (5%) | | 25.94 KiB (1%) | 360 |
| ["foreach", "tx", "A .= B .+ C"] | 3.195 ms (5%) | | 12.73 KiB (1%) | 123 |
| ["foreach_seq", "base", "Matrix"] | 621.045 μs (5%) | | | |
| ["foreach_seq", "base", "Transpose"] | 1.790 ms (5%) | | | |
| ["foreach_seq", "base", "Vector"] | 621.545 μs (5%) | | | |
| ["foreach_seq", "tx", "Matrix"] | 628.744 μs (5%) | | | |
| ["foreach_seq", "tx", "Transpose"] | 998.471 μs (5%) | | 16 bytes (1%) | 1 |
| ["foreach_seq", "tx", "Vector"] | 620.944 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "man"] | 17.102 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] | 17.202 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "tx", ":simd => false"] | 20.701 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "tx", ":simd => true"] | 17.002 μs (5%) | | | |
| ["foreach_seq_double", "linear", "man"] | 43.772 ns (5%) | | | |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 0.001 ns (5%) | | | |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 0.001 ns (5%) | | | |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 0.001 ns (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 700.000 ns (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 700.000 ns (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] | 2.300 μs (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] | 2.300 μs (5%) | | | |
| ["sort", "F64 (narrow)", "Base"] | 2.133 ms (5%) | | | |
| ["sort", "F64 (narrow)", "ThreadsX.MergeSort"] | 2.303 ms (5%) | | 1.19 MiB (1%) | 534 |
| ["sort", "F64 (narrow)", "ThreadsX.QuickSort"] | 501.142 μs (5%) | | 965.11 KiB (1%) | 1226 |
| ["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] | 554.747 μs (5%) | | 1.02 MiB (1%) | 1248 |
| ["sort", "F64 (wide)", "Base"] | 5.400 ms (5%) | | | |
| ["sort", "F64 (wide)", "ThreadsX.MergeSort"] | 4.394 ms (5%) | | 1.19 MiB (1%) | 564 |
| ["sort", "F64 (wide)", "ThreadsX.QuickSort"] | 2.961 ms (5%) | | 1.01 MiB (1%) | 2147 |
| ["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] | 3.435 ms (5%) | | 1.39 MiB (1%) | 2196 |
| ["sort", "I64 (narrow)", "Base"] | 112.910 μs (5%) | | 160 bytes (1%) | 1 |
| ["sort", "I64 (narrow)", "ThreadsX.MergeSort"] | 103.709 μs (5%) | | 864 bytes (1%) | 17 |
| ["sort", "I64 (narrow)", "ThreadsX.QuickSort"] | 102.909 μs (5%) | | 864 bytes (1%) | 17 |
| ["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] | 102.809 μs (5%) | | 864 bytes (1%) | 17 |
| ["sort", "I64 (wide)", "Base"] | 5.300 ms (5%) | | | |
| ["sort", "I64 (wide)", "ThreadsX.MergeSort"] | 3.631 ms (5%) | | 1.19 MiB (1%) | 554 |
| ["sort", "I64 (wide)", "ThreadsX.QuickSort"] | 2.690 ms (5%) | | 1.01 MiB (1%) | 2239 |
| ["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] | 3.237 ms (5%) | | 1.40 MiB (1%) | 2273 |
| ["sort", "reversed", "Base"] | 763.763 μs (5%) | | | |
| ["sort", "reversed", "ThreadsX.MergeSort"] | 1.000 ms (5%) | | 1.18 MiB (1%) | 434 |
| ["sort", "reversed", "ThreadsX.QuickSort"] | 794.165 μs (5%) | | 998.72 KiB (1%) | 1869 |
| ["sort", "reversed", "ThreadsX.StableQuickSort"] | 1.128 ms (5%) | | 1.36 MiB (1%) | 1902 |
| ["sort", "sorted", "Base"] | 748.960 μs (5%) | | | |
| ["sort", "sorted", "ThreadsX.MergeSort"] | 798.263 μs (5%) | | 1.18 MiB (1%) | 431 |
| ["sort", "sorted", "ThreadsX.QuickSort"] | 746.059 μs (5%) | | 998.75 KiB (1%) | 1871 |
| ["sort", "sorted", "ThreadsX.StableQuickSort"] | 961.074 μs (5%) | | 1.36 MiB (1%) | 1902 |
| ["unique", "rand(1:10, 1000000)", "base"] | 7.978 ms (5%) | | 832 bytes (1%) | 8 |
| ["unique", "rand(1:10, 1000000)", "tx"] | 4.203 ms (5%) | | 50.98 KiB (1%) | 882 |
| ["unique", "rand(1:1000, 1000000)", "base"] | 7.170 ms (5%) | | 65.95 KiB (1%) | 27 |
| ["unique", "rand(1:1000, 1000000)", "tx"] | 4.363 ms (5%) | | 1.07 MiB (1%) | 1186 |
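
The noise tolerances attached to the values above can be read as a band around each reported number. Below is a minimal Python sketch of that reading. It assumes the tolerance applies symmetrically as a fraction of the reported value, and the helper name `tolerance_interval` is hypothetical rather than anything provided by the benchmarking tools.

```python
from typing import Tuple


def tolerance_interval(reported: float, tolerance: float) -> Tuple[float, float]:
    """Return the (low, high) range implied by a fractional noise tolerance."""
    return reported * (1.0 - tolerance), reported * (1.0 + tolerance)


# Baseline time for ["findfirst", "0%", "tx"] in microseconds, 5% tolerance.
low, high = tolerance_interval(18.300, 0.05)
print(f"{low:.3f} to {high:.3f} μs")

# The target job reported 25.502 μs for the same benchmark.
inside = low <= 25.502 <= high
print("target within baseline band:", inside)
```

Here the target time falls well outside the baseline band, which is consistent with that benchmark being flagged as a possible regression in the judge result.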

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.0.0-1032-azure #34-Ubuntu SMP Mon Feb 10 19:37:25 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      63766 s          0 s       3127 s      51859 s          0 s
       #2  2095 MHz      89668 s          0 s       3461 s      26147 s          0 s
       
  Memory: 6.782737731933594 GB (3071.38671875 MB free)
  Uptime: 1211.0 sec
  Load Avg:  1.3525390625  1.40087890625  1.08251953125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Runtime information

| Runtime | Info |
|---|---|
| BLAS #threads | 2 |
| BLAS.vendor() | openblas64 |
| Sys.CPU_THREADS | 2 |

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.193
BogoMIPS:            4190.38
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves

| Cpu Property | Value |
|---|---|
| Brand | Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz |
| Vendor | :Intel |
| Architecture | :Skylake |
| Model | Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
| | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 1024, 36608) kbytes |
| | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 512 bit = 64 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
| | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmarks:
    • Target: 28 Jun 2020 - 04:10
    • Baseline: 28 Jun 2020 - 04:16
  • Package commits:
    • Target: f6b2fd
    • Baseline: a573a0
  • Julia commits:
    • Target: 44fa15
    • Baseline: 44fa15
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|---|---|---|
| ["findfirst", "0%", "tx-seq"] | 1.21 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "10%", "base"] | 1.49 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "10%", "tx"] | 1.14 (5%) ❌ | 1.01 (1%) |
| ["findfirst", "10%", "tx-noterm"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "10%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "20%", "base"] | 1.71 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "20%", "tx"] | 1.10 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "20%", "tx-noterm"] | 1.07 (5%) ❌ | 1.25 (1%) ❌ |
| ["findfirst", "20%", "tx-seq"] | 1.14 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "30%", "base"] | 1.49 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx"] | 1.13 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "40%", "base"] | 1.71 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx"] | 1.13 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "50%", "base"] | 1.56 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "50%", "tx-noterm"] | 0.95 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "50%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["foreach", "broadcast", "A .= B .+ B'"] | 1.22 (5%) ❌ | 1.00 (1%) |
| ["foreach", "tx", "A .= B .+ B'"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "base", "Matrix"] | 1.18 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "base", "Transpose"] | 1.13 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "base", "Vector"] | 1.18 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "tx", "Transpose"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "tx", "Vector"] | 1.31 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 51424.80 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 51736.73 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 49290.65 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 1.19 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 1.33 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] | 1.14 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "Base"] | 0.90 (5%) ✅ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "ThreadsX.QuickSort"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (wide)", "ThreadsX.MergeSort"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (wide)", "ThreadsX.QuickSort"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (wide)", "Base"] | 1.14 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (wide)", "ThreadsX.MergeSort"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (wide)", "ThreadsX.QuickSort"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["sort", "sorted", "ThreadsX.QuickSort"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["unique", "rand(1:10, 1000000)", "tx"] | 1.15 (5%) ❌ | 1.00 (1%) |
| ["unique", "rand(1:1000, 1000000)", "tx"] | 1.09 (5%) ❌ | 1.00 (1%) |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Target

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      59157 s          0 s       2791 s      23203 s          0 s
       #2  2095 MHz      48517 s          0 s       2789 s      33995 s          0 s
       
  Memory: 6.764884948730469 GB (2087.2890625 MB free)
  Uptime: 871.0 sec
  Load Avg:  1.32177734375  1.390625  0.93115234375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      82970 s          0 s       3712 s      34788 s          0 s
       #2  2095 MHz      72682 s          0 s       3415 s      45494 s          0 s
       
  Memory: 6.764884948730469 GB (2383.78125 MB free)
  Uptime: 1235.0 sec
  Load Avg:  1.23779296875  1.36328125  1.078125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 4:10
  • Package commit: f6b2fd
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|---|---|---|---|---|
| ["findfirst", "0%", "base"] | 3.000 ns (5%) | | | |
| ["findfirst", "0%", "tx"] | 25.302 μs (5%) | | 11.98 KiB (1%) | 220 |
| ["findfirst", "0%", "tx-noterm"] | 20.901 μs (5%) | | 12.02 KiB (1%) | 221 |
| ["findfirst", "0%", "tx-seq"] | 250.774 ns (5%) | | 560 bytes (1%) | 15 |
| ["findfirst", "10%", "base"] | 116.807 μs (5%) | | | |
| ["findfirst", "10%", "tx"] | 100.406 μs (5%) | | 14.44 KiB (1%) | 271 |
| ["findfirst", "10%", "tx-noterm"] | 268.116 μs (5%) | | 33.00 KiB (1%) | 612 |
| ["findfirst", "10%", "tx-seq"] | 68.804 μs (5%) | | 576 bytes (1%) | 16 |
| ["findfirst", "20%", "base"] | 233.613 μs (5%) | | | |
| ["findfirst", "20%", "tx"] | 179.110 μs (5%) | | 21.42 KiB (1%) | 399 |
| ["findfirst", "20%", "tx-noterm"] | 271.115 μs (5%) | | 35.42 KiB (1%) | 658 |
| ["findfirst", "20%", "tx-seq"] | 156.809 μs (5%) | | 576 bytes (1%) | 16 |
| ["findfirst", "30%", "base"] | 305.223 μs (5%) | | | |
| ["findfirst", "30%", "tx"] | 252.719 μs (5%) | | 28.34 KiB (1%) | 525 |
| ["findfirst", "30%", "tx-noterm"] | 298.022 μs (5%) | | 28.41 KiB (1%) | 528 |
| ["findfirst", "30%", "tx-seq"] | 204.815 μs (5%) | | 576 bytes (1%) | 16 |
| ["findfirst", "40%", "base"] | 467.230 μs (5%) | | | |
| ["findfirst", "40%", "tx"] | 346.122 μs (5%) | | 35.39 KiB (1%) | 656 |
| ["findfirst", "40%", "tx-noterm"] | 360.923 μs (5%) | | 35.41 KiB (1%) | 656 |
| ["findfirst", "40%", "tx-seq"] | 272.618 μs (5%) | | 576 bytes (1%) | 16 |
| ["findfirst", "50%", "base"] | 532.538 μs (5%) | | | |
| ["findfirst", "50%", "tx"] | 355.825 μs (5%) | | 37.84 KiB (1%) | 707 |
| ["findfirst", "50%", "tx-noterm"] | 448.730 μs (5%) | | 54.08 KiB (1%) | 1004 |
| ["findfirst", "50%", "tx-seq"] | 340.724 μs (5%) | | 576 bytes (1%) | 16 |
| ["foreach", "base", "A .= B .+ B'"] | 301.017 ms (5%) | 26.668 ms | 305.18 MiB (1%) | 16000002 |
| ["foreach", "base", "A .= B .+ C"] | 233.089 ms (5%) | 26.979 ms | 305.18 MiB (1%) | 16000001 |
| ["foreach", "broadcast", "A .= B .+ B'"] | 8.892 ms (5%) | | | |
| ["foreach", "broadcast", "A .= B .+ C"] | 6.487 ms (5%) | | | |
| ["foreach", "tx", "A .= B .+ B'"] | 4.365 ms (5%) | | 25.94 KiB (1%) | 360 |
| ["foreach", "tx", "A .= B .+ C"] | 3.258 ms (5%) | | 12.75 KiB (1%) | 124 |
| ["foreach_seq", "base", "Matrix"] | 737.831 μs (5%) | | | |
| ["foreach_seq", "base", "Transpose"] | 2.131 ms (5%) | | | |
| ["foreach_seq", "base", "Vector"] | 738.531 μs (5%) | | | |
| ["foreach_seq", "tx", "Matrix"] | 742.331 μs (5%) | | | |
| ["foreach_seq", "tx", "Transpose"] | 1.133 ms (5%) | | 16 bytes (1%) | 1 |
| ["foreach_seq", "tx", "Vector"] | 821.234 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "man"] | 19.600 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] | 20.400 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "tx", ":simd => false"] | 21.501 μs (5%) | | | |
| ["foreach_seq_double", "cartesian", "tx", ":simd => true"] | 20.001 μs (5%) | | | |
| ["foreach_seq_double", "linear", "man"] | 49.392 ns (5%) | | | |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 51.425 ns (5%) | | | |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 51.737 ns (5%) | | | |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 49.291 ns (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 1.190 μs (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 1.200 μs (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] | 3.078 μs (5%) | | | |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] | 3.078 μs (5%) | | | |
| ["sort", "F64 (narrow)", "Base"] | 2.209 ms (5%) | | | |
| ["sort", "F64 (narrow)", "ThreadsX.MergeSort"] | 2.826 ms (5%) | | 1.19 MiB (1%) | 535 |
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 629.130 μs (5%) 965.09 KiB (1%) 1225
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 594.927 μs (5%) 1.02 MiB (1%) 1245
["sort", "F64 (wide)", "Base"] 7.189 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.804 ms (5%) 1.19 MiB (1%) 564
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 3.958 ms (5%) 1.01 MiB (1%) 2148
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 4.524 ms (5%) 1.39 MiB (1%) 2196
["sort", "I64 (narrow)", "Base"] 128.907 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 119.106 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 118.406 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 117.605 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 7.149 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.758 ms (5%) 1.19 MiB (1%) 555
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 3.510 ms (5%) 1.01 MiB (1%) 2238
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 4.108 ms (5%) 1.40 MiB (1%) 2269
["sort", "reversed", "Base"] 699.432 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.156 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 874.140 μs (5%) 998.75 KiB (1%) 1871
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.350 ms (5%) 1.36 MiB (1%) 1903
["sort", "sorted", "Base"] 668.430 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 866.838 μs (5%) 1.18 MiB (1%) 431
["sort", "sorted", "ThreadsX.QuickSort"] 903.840 μs (5%) 998.75 KiB (1%) 1871
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.062 ms (5%) 1.36 MiB (1%) 1903
["unique", "rand(1:10, 1000000)", "base"] 10.272 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.534 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 9.521 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 6.160 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      59157 s          0 s       2791 s      23203 s          0 s
       #2  2095 MHz      48517 s          0 s       2789 s      33995 s          0 s
       
  Memory: 6.764884948730469 GB (2087.2890625 MB free)
  Uptime: 871.0 sec
  Load Avg:  1.32177734375  1.390625  0.93115234375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 4:16
  • Package commit: a573a0
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
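
The ID structure `[parent_group, child_group, ..., key]` can be pictured as a path through nested groups. The sketch below is only a Python analogue using nested dictionaries, not the Julia benchmark suite itself; `suite` and `lookup` are hypothetical names, and the two times are taken from the table that follows.

```python
from functools import reduce

# Hypothetical nested structure mirroring two rows of the table (times in μs).
suite = {
    "findfirst": {
        "0%": {"tx": 24.501, "tx-noterm": 20.800},
    },
}


def lookup(groups, benchmark_id):
    """Walk the nested groups along [parent_group, child_group, ..., key]."""
    return reduce(lambda node, key: node[key], benchmark_id, groups)


print(lookup(suite, ["findfirst", "0%", "tx"]))
```

A shorter path such as `["findfirst", "0%"]` returns the whole child group rather than a single benchmark.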

ID time GC time memory allocations
["findfirst", "0%", "base"] 3.000 ns (5%)
["findfirst", "0%", "tx"] 24.501 μs (5%) 11.97 KiB (1%) 219
["findfirst", "0%", "tx-noterm"] 20.800 μs (5%) 11.97 KiB (1%) 218
["findfirst", "0%", "tx-seq"] 207.545 ns (5%) 544 bytes (1%) 14
["findfirst", "10%", "base"] 78.503 μs (5%)
["findfirst", "10%", "tx"] 87.809 μs (5%) 14.36 KiB (1%) 266
["findfirst", "10%", "tx-noterm"] 248.625 μs (5%) 32.89 KiB (1%) 602
["findfirst", "10%", "tx-seq"] 68.803 μs (5%) 560 bytes (1%) 15
["findfirst", "20%", "base"] 137.013 μs (5%)
["findfirst", "20%", "tx"] 162.516 μs (5%) 21.34 KiB (1%) 394
["findfirst", "20%", "tx-noterm"] 253.424 μs (5%) 28.31 KiB (1%) 522
["findfirst", "20%", "tx-seq"] 137.013 μs (5%) 560 bytes (1%) 15
["findfirst", "30%", "base"] 204.608 μs (5%)
["findfirst", "30%", "tx"] 223.409 μs (5%) 28.27 KiB (1%) 520
["findfirst", "30%", "tx-noterm"] 293.111 μs (5%) 28.28 KiB (1%) 520
["findfirst", "30%", "tx-seq"] 204.608 μs (5%) 560 bytes (1%) 15
["findfirst", "40%", "base"] 273.211 μs (5%)
["findfirst", "40%", "tx"] 307.512 μs (5%) 35.31 KiB (1%) 651
["findfirst", "40%", "tx-noterm"] 345.514 μs (5%) 35.31 KiB (1%) 650
["findfirst", "40%", "tx-seq"] 272.310 μs (5%) 560 bytes (1%) 15
["findfirst", "50%", "base"] 341.614 μs (5%)
["findfirst", "50%", "tx"] 347.514 μs (5%) 37.70 KiB (1%) 698
["findfirst", "50%", "tx-noterm"] 472.519 μs (5%) 53.91 KiB (1%) 993
["findfirst", "50%", "tx-seq"] 340.413 μs (5%) 560 bytes (1%) 15
["foreach", "base", "A .= B .+ B'"] 293.353 ms (5%) 32.016 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 224.898 ms (5%) 30.661 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 7.290 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 6.229 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 4.020 ms (5%) 25.92 KiB (1%) 359
["foreach", "tx", "A .= B .+ C"] 3.200 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 626.143 μs (5%)
["foreach_seq", "base", "Transpose"] 1.894 ms (5%)
["foreach_seq", "base", "Vector"] 626.643 μs (5%)
["foreach_seq", "tx", "Matrix"] 727.349 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.063 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 625.942 μs (5%)
["foreach_seq_double", "cartesian", "man"] 19.701 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 19.501 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 21.601 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 20.101 μs (5%)
["foreach_seq_double", "linear", "man"] 49.393 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 0.001 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 1.000 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 900.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 3.000 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 2.700 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.455 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 2.724 ms (5%) 1.19 MiB (1%) 535
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 565.244 μs (5%) 965.14 KiB (1%) 1228
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 592.046 μs (5%) 1.02 MiB (1%) 1247
["sort", "F64 (wide)", "Base"] 6.904 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.495 ms (5%) 1.19 MiB (1%) 564
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 3.722 ms (5%) 1.01 MiB (1%) 2147
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 4.487 ms (5%) 1.39 MiB (1%) 2196
["sort", "I64 (narrow)", "Base"] 127.311 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 117.210 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 116.110 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 118.209 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 6.263 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.236 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 3.130 ms (5%) 1.01 MiB (1%) 2238
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 3.799 ms (5%) 1.40 MiB (1%) 2270
["sort", "reversed", "Base"] 697.253 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.109 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 868.766 μs (5%) 998.75 KiB (1%) 1871
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.306 ms (5%) 1.36 MiB (1%) 1904
["sort", "sorted", "Base"] 663.649 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 909.466 μs (5%) 1.18 MiB (1%) 431
["sort", "sorted", "ThreadsX.QuickSort"] 846.562 μs (5%) 998.73 KiB (1%) 1870
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.089 ms (5%) 1.36 MiB (1%) 1903
["unique", "rand(1:10, 1000000)", "base"] 9.820 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 4.800 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 9.247 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.666 ms (5%) 1.07 MiB (1%) 1185

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      82970 s          0 s       3712 s      34788 s          0 s
       #2  2095 MHz      72682 s          0 s       3415 s      45494 s          0 s
       
  Memory: 6.764884948730469 GB (2383.78125 MB free)
  Uptime: 1235.0 sec
  Load Avg:  1.23779296875  1.36328125  1.078125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.195
BogoMIPS:            4190.39
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves
Cpu Property Value
Brand Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Vendor :Intel
Architecture :Skylake
Model Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00
Cores 2 physical cores, 2 logical cores (on executing CPU)
No Hyperthreading detected
Clock Frequencies Not supported by CPU
Data Cache Level 1:3 : (32, 1024, 36608) kbytes
64 byte cache line size
Address Size 48 bits virtual, 44 bits physical
SIMD 512 bit = 64 byte max. SIMD vector size
Time Stamp Counter TSC is accessible via rdtsc
TSC increased at every clock cycle (non-invariant TSC)
Perf. Monitoring Performance Monitoring Counters (PMC) are not supported
Hypervisor Yes, Microsoft

@tkf
Owner

tkf commented Jun 28, 2020

45e2127 and a105f08 are equivalent but the actions are not run for the former (presumably) because it was created via a bot.

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmarks:
    • Target: 28 Jun 2020 - 08:27
    • Baseline: 28 Jun 2020 - 08:33
  • Package commits:
    • Target: 92741a
    • Baseline: 7362ea
  • Julia commits:
    • Target: 44fa15
    • Baseline: 44fa15
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|---|---|---|
| ["findfirst", "0%", "tx-noterm"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "0%", "tx-seq"] | 1.05 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "10%", "tx-noterm"] | 1.00 (5%) | 0.88 (1%) ✅ |
| ["findfirst", "10%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "20%", "tx-noterm"] | 1.03 (5%) | 1.20 (1%) ❌ |
| ["findfirst", "20%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "30%", "tx-noterm"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "40%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "50%", "tx-noterm"] | 1.01 (5%) | 0.92 (1%) ✅ |
| ["findfirst", "50%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 1.36 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 1.46 (5%) ❌ | 1.00 (1%) |
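
To make the ratio rule concrete, here is a minimal Python sketch that recomputes a few of these ratios from the Target and Baseline values reported further down in this comment. It assumes a simple symmetric band of one tolerance around 1.0; the function `judge` here is a hypothetical stand-in written for illustration, and the actual comparison performed by the benchmarking tool may differ in detail.

```python
def judge(target, baseline, tolerance):
    """Classify target/baseline as 'regression', 'improvement' or 'invariant'.

    Assumption: a ratio within `tolerance` of 1.0 counts as invariant.
    """
    ratio = target / baseline
    if ratio > 1 + tolerance:
        verdict = "regression"
    elif ratio < 1 - tolerance:
        verdict = "improvement"
    else:
        verdict = "invariant"
    return round(ratio, 2), verdict


# Times in μs for ["findfirst", "0%", "tx-noterm"] (5% tolerance)
print(judge(24.000, 21.500, 0.05))
# Memory in bytes for ["findfirst", "0%", "tx-seq"] (1% tolerance)
print(judge(560, 544, 0.01))
# Memory in KiB for ["findfirst", "10%", "tx-noterm"] (1% tolerance)
print(judge(33.00, 37.53, 0.01))
```

The tighter 1% memory tolerance is why a single extra 16-byte allocation in the `tx-seq` rows is enough to be flagged, while time differences of a few percent are not.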

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Target

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2394 MHz      55709 s          0 s       2581 s      31474 s          0 s
       #2  2394 MHz      53016 s          0 s       2985 s      34671 s          0 s
       
  Memory: 6.764884948730469 GB (2154.42578125 MB free)
  Uptime: 922.0 sec
  Load Avg:  1.34423828125  1.291015625  0.87158203125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, haswell)

Baseline

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2394 MHz      77596 s          0 s       3239 s      45302 s          0 s
       #2  2394 MHz      78760 s          0 s       3834 s      44446 s          0 s
       
  Memory: 6.764884948730469 GB (2446.9453125 MB free)
  Uptime: 1288.0 sec
  Load Avg:  1.24267578125  1.3291015625  1.0322265625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, haswell)

Target result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 8:27
  • Package commit: 92741a
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["findfirst", "0%", "base"] 3.200 ns (5%)
["findfirst", "0%", "tx"] 25.100 μs (5%) 11.97 KiB (1%) 219
["findfirst", "0%", "tx-noterm"] 24.000 μs (5%) 12.02 KiB (1%) 221
["findfirst", "0%", "tx-seq"] 244.749 ns (5%) 560 bytes (1%) 15
["findfirst", "10%", "base"] 68.200 μs (5%)
["findfirst", "10%", "tx"] 73.600 μs (5%) 14.44 KiB (1%) 271
["findfirst", "10%", "tx-noterm"] 191.300 μs (5%) 33.00 KiB (1%) 609
["findfirst", "10%", "tx-seq"] 68.700 μs (5%) 576 bytes (1%) 16
["findfirst", "20%", "base"] 135.900 μs (5%)
["findfirst", "20%", "tx"] 131.300 μs (5%) 21.42 KiB (1%) 399
["findfirst", "20%", "tx-noterm"] 199.201 μs (5%) 28.41 KiB (1%) 528
["findfirst", "20%", "tx-seq"] 136.400 μs (5%) 576 bytes (1%) 16
["findfirst", "30%", "base"] 203.601 μs (5%)
["findfirst", "30%", "tx"] 177.901 μs (5%) 28.34 KiB (1%) 525
["findfirst", "30%", "tx-noterm"] 223.801 μs (5%) 28.41 KiB (1%) 528
["findfirst", "30%", "tx-seq"] 204.300 μs (5%) 576 bytes (1%) 16
["findfirst", "40%", "base"] 271.900 μs (5%)
["findfirst", "40%", "tx"] 250.101 μs (5%) 35.39 KiB (1%) 656
["findfirst", "40%", "tx-noterm"] 241.001 μs (5%) 35.42 KiB (1%) 657
["findfirst", "40%", "tx-seq"] 272.000 μs (5%) 576 bytes (1%) 16
["findfirst", "50%", "base"] 339.801 μs (5%)
["findfirst", "50%", "tx"] 282.101 μs (5%) 37.84 KiB (1%) 707
["findfirst", "50%", "tx-noterm"] 308.301 μs (5%) 49.45 KiB (1%) 921
["findfirst", "50%", "tx-seq"] 339.901 μs (5%) 576 bytes (1%) 16
["foreach", "base", "A .= B .+ B'"] 381.991 ms (5%) 26.560 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 244.282 ms (5%) 27.969 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 18.395 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 7.900 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 9.632 ms (5%) 25.92 KiB (1%) 359
["foreach", "tx", "A .= B .+ C"] 6.511 ms (5%) 12.73 KiB (1%) 123
["foreach_seq", "base", "Matrix"] 651.501 μs (5%)
["foreach_seq", "base", "Transpose"] 2.306 ms (5%)
["foreach_seq", "base", "Vector"] 651.201 μs (5%)
["foreach_seq", "tx", "Matrix"] 656.101 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.028 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 651.101 μs (5%)
["foreach_seq_double", "cartesian", "man"] 23.200 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 23.100 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 22.900 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 23.200 μs (5%)
["foreach_seq_double", "linear", "man"] 100.636 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 100.000 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 97.031 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 100.751 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 2.044 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 2.050 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 2.978 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 2.989 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.465 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 2.794 ms (5%) 1.19 MiB (1%) 534
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 1.651 ms (5%) 965.11 KiB (1%) 1226
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 1.634 ms (5%) 1.02 MiB (1%) 1246
["sort", "F64 (wide)", "Base"] 5.941 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.244 ms (5%) 1.19 MiB (1%) 562
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 5.104 ms (5%) 1.01 MiB (1%) 2144
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 5.910 ms (5%) 1.39 MiB (1%) 2194
["sort", "I64 (narrow)", "Base"] 145.100 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 148.701 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 149.001 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 149.100 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 6.023 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.545 ms (5%) 1.19 MiB (1%) 553
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 4.289 ms (5%) 1.01 MiB (1%) 2236
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 5.239 ms (5%) 1.40 MiB (1%) 2272
["sort", "reversed", "Base"] 772.501 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.285 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 1.181 ms (5%) 998.75 KiB (1%) 1871
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.602 ms (5%) 1.36 MiB (1%) 1904
["sort", "sorted", "Base"] 718.902 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 911.901 μs (5%) 1.18 MiB (1%) 431
["sort", "sorted", "ThreadsX.QuickSort"] 1.193 ms (5%) 998.75 KiB (1%) 1871
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.337 ms (5%) 1.36 MiB (1%) 1904
["unique", "rand(1:10, 1000000)", "base"] 9.655 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.126 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 8.828 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.520 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2394 MHz      55709 s          0 s       2581 s      31474 s          0 s
       #2  2394 MHz      53016 s          0 s       2985 s      34671 s          0 s
       
  Memory: 6.764884948730469 GB (2154.42578125 MB free)
  Uptime: 922.0 sec
  Load Avg:  1.34423828125  1.291015625  0.87158203125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, haswell)

Baseline result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 8:33
  • Package commit: 7362ea
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["findfirst", "0%", "base"] 3.200 ns (5%)
["findfirst", "0%", "tx"] 24.300 μs (5%) 11.95 KiB (1%) 218
["findfirst", "0%", "tx-noterm"] 21.500 μs (5%) 11.97 KiB (1%) 218
["findfirst", "0%", "tx-seq"] 232.194 ns (5%) 544 bytes (1%) 14
["findfirst", "10%", "base"] 68.400 μs (5%)
["findfirst", "10%", "tx"] 72.100 μs (5%) 14.36 KiB (1%) 266
["findfirst", "10%", "tx-noterm"] 190.701 μs (5%) 37.53 KiB (1%) 690
["findfirst", "10%", "tx-seq"] 68.600 μs (5%) 560 bytes (1%) 15
["findfirst", "20%", "base"] 136.201 μs (5%)
["findfirst", "20%", "tx"] 131.700 μs (5%) 21.34 KiB (1%) 394
["findfirst", "20%", "tx-noterm"] 192.701 μs (5%) 23.64 KiB (1%) 435
["findfirst", "20%", "tx-seq"] 136.500 μs (5%) 560 bytes (1%) 15
["findfirst", "30%", "base"] 204.201 μs (5%)
["findfirst", "30%", "tx"] 174.801 μs (5%) 28.28 KiB (1%) 521
["findfirst", "30%", "tx-noterm"] 202.301 μs (5%) 28.30 KiB (1%) 521
["findfirst", "30%", "tx-seq"] 204.301 μs (5%) 560 bytes (1%) 15
["findfirst", "40%", "base"] 271.802 μs (5%)
["findfirst", "40%", "tx"] 250.702 μs (5%) 35.31 KiB (1%) 651
["findfirst", "40%", "tx-noterm"] 243.301 μs (5%) 35.33 KiB (1%) 651
["findfirst", "40%", "tx-seq"] 272.001 μs (5%) 560 bytes (1%) 15
["findfirst", "50%", "base"] 339.702 μs (5%)
["findfirst", "50%", "tx"] 275.802 μs (5%) 37.67 KiB (1%) 696
["findfirst", "50%", "tx-noterm"] 304.702 μs (5%) 53.83 KiB (1%) 990
["findfirst", "50%", "tx-seq"] 339.802 μs (5%) 560 bytes (1%) 15
["foreach", "base", "A .= B .+ B'"] 400.623 ms (5%) 37.565 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 243.054 ms (5%) 36.683 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 18.336 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 7.781 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 9.804 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 6.620 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 651.202 μs (5%)
["foreach_seq", "base", "Transpose"] 2.220 ms (5%)
["foreach_seq", "base", "Vector"] 651.203 μs (5%)
["foreach_seq", "tx", "Matrix"] 655.602 μs (5%)
["foreach_seq", "tx", "Transpose"] 983.604 μs (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 651.402 μs (5%)
["foreach_seq_double", "cartesian", "man"] 23.101 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 22.800 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 22.600 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 23.100 μs (5%)
["foreach_seq_double", "linear", "man"] 100.742 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 100.000 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 100.000 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 100.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 1.500 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 1.400 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 2.900 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 2.900 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.372 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 2.790 ms (5%) 1.19 MiB (1%) 535
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 1.637 ms (5%) 965.11 KiB (1%) 1226
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 1.647 ms (5%) 1.02 MiB (1%) 1245
["sort", "F64 (wide)", "Base"] 5.935 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.237 ms (5%) 1.19 MiB (1%) 564
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 5.075 ms (5%) 1.01 MiB (1%) 2145
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 5.921 ms (5%) 1.39 MiB (1%) 2195
["sort", "I64 (narrow)", "Base"] 143.501 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 147.101 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 147.201 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 147.000 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 6.038 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.497 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 4.225 ms (5%) 1.01 MiB (1%) 2237
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 4.997 ms (5%) 1.40 MiB (1%) 2271
["sort", "reversed", "Base"] 773.204 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.265 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 1.192 ms (5%) 998.75 KiB (1%) 1871
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.629 ms (5%) 1.36 MiB (1%) 1904
["sort", "sorted", "Base"] 715.503 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 900.404 μs (5%) 1.18 MiB (1%) 430
["sort", "sorted", "ThreadsX.QuickSort"] 1.191 ms (5%) 998.77 KiB (1%) 1872
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.341 ms (5%) 1.36 MiB (1%) 1903
["unique", "rand(1:10, 1000000)", "base"] 9.532 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.169 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 8.873 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.548 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2394 MHz      77596 s          0 s       3239 s      45302 s          0 s
       #2  2394 MHz      78760 s          0 s       3834 s      44446 s          0 s
       
  Memory: 6.764884948730469 GB (2446.9453125 MB free)
  Uptime: 1288.0 sec
  Load Avg:  1.24267578125  1.3291015625  1.0322265625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, haswell)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Stepping:            2
CPU MHz:             2394.453
BogoMIPS:            4788.90
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            30720K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt md_clear
| Cpu Property | Value |
| --- | --- |
| Brand | Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz |
| Vendor | :Intel |
| Architecture | :Haswell |
| Model | Family: 0x06, Model: 0x3f, Stepping: 0x02, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
| | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 256, 30720) kbytes |
| | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 256 bit = 32 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
| | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmarks:
    • Target: 28 Jun 2020 - 09:01
    • Baseline: 28 Jun 2020 - 09:08
  • Package commits:
    • Target: 1b4b6f
    • Baseline: 7362ea
  • Julia commits:
    • Target: 44fa15
    • Baseline: 44fa15
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
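
The judging rule described above can be sketched as follows. This is a minimal Python illustration, not the code used by the benchmarking tools; the function name `classify` and the exact comparison against `1 ± tolerance` are assumptions. The tolerances (5% for time, 1% for memory) are the ones shown in parentheses in the table, and the input values are copied from the target and baseline tables of this report.

```python
def classify(target, baseline, tolerance):
    """Return (ratio, verdict) for one metric; both inputs must use the same unit."""
    ratio = target / baseline
    if ratio > 1.0 + tolerance:
        verdict = "regression"    # shown as ❌ in the report
    elif ratio < 1.0 - tolerance:
        verdict = "improvement"   # shown as ✅ in the report
    else:
        verdict = "invariant"     # such rows are omitted from the judge table
    return ratio, verdict


# ["findfirst", "0%", "tx"]: 19.703 μs (target) vs 23.101 μs (baseline), 5% time tolerance
time_ratio, time_verdict = classify(19.703, 23.101, 0.05)

# ["findfirst", "0%", "tx-seq"]: 560 bytes vs 544 bytes, 1% memory tolerance
mem_ratio, mem_verdict = classify(560, 544, 0.01)

# ["foreach_seq_double", "linear", "tx", ":simd => true"]: 43.639 ns vs 0.001 ns
simd_ratio, simd_verdict = classify(43.639, 0.001, 0.05)

for ratio, verdict in [(time_ratio, time_verdict),
                       (mem_ratio, mem_verdict),
                       (simd_ratio, simd_verdict)]:
    print(f"{ratio:.2f} {verdict}")
```

The third case shows why the `linear` `tx` rows report ratios above 40000: the baseline time is listed as 0.001 ns, so the division is dominated by a near-zero denominator. That more plausibly indicates a degenerate baseline measurement than a real slowdown of that size.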

| ID | time ratio | memory ratio |
| --- | --- | --- |
| ["findfirst", "0%", "tx"] | 0.85 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "0%", "tx-seq"] | 1.06 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "10%", "tx-noterm"] | 1.02 (5%) | 1.21 (1%) ❌ |
| ["findfirst", "10%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "20%", "tx-noterm"] | 1.07 (5%) ❌ | 0.85 (1%) ✅ |
| ["findfirst", "20%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "30%", "base"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx-noterm"] | 0.92 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "30%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "40%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "50%", "tx-noterm"] | 1.01 (5%) | 1.28 (1%) ❌ |
| ["findfirst", "50%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["foreach", "tx", "A .= B .+ B'"] | 0.91 (5%) ✅ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] | 0.92 (5%) ✅ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "tx", ":simd => false"] | 0.89 (5%) ✅ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 45502.53 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 45805.86 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 43639.39 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 1.17 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 1.33 (5%) ❌ | 1.00 (1%) |
| ["sort", "sorted", "ThreadsX.StableQuickSort"] | 1.24 (5%) ❌ | 1.00 (1%) |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Target

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      49712 s          0 s       2562 s      31921 s          0 s
       #2  2095 MHz      57024 s          0 s       2882 s      23988 s          0 s
       
  Memory: 6.7648773193359375 GB (2300.4140625 MB free)
  Uptime: 861.0 sec
  Load Avg:  1.27392578125  1.3271484375  0.89697265625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      76308 s          0 s       3499 s      41498 s          0 s
       #2  2095 MHz      79033 s          0 s       3616 s      38372 s          0 s
       
  Memory: 6.7648773193359375 GB (2367.6484375 MB free)
  Uptime: 1234.0 sec
  Load Avg:  1.35302734375  1.38037109375  1.06787109375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 09:01
  • Package commit: 1b4b6f
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
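
As a rough illustration of the two points above (IDs as index paths, and percentages as noise tolerances), here is a small Python sketch. The nested dictionary `suite` and the helpers `lookup` and `tolerance_band` are hypothetical stand-ins, not part of any benchmarking package; the two entries are copied from the table below.

```python
from functools import reduce

# Hypothetical stand-in for a nested benchmark suite.
# Each leaf holds (time in μs, time noise tolerance).
suite = {
    "findfirst": {
        "0%": {
            "tx": (19.703, 0.05),
            "tx-noterm": (19.302, 0.05),
        },
    },
}


def lookup(groups, benchmark_id):
    """Follow the ID path [parent_group, child_group, ..., key] through nested groups."""
    return reduce(lambda group, key: group[key], benchmark_id, groups)


def tolerance_band(value, tolerance):
    """Interval in which the "true" value is expected to fall."""
    return value * (1.0 - tolerance), value * (1.0 + tolerance)


time_us, tol = lookup(suite, ["findfirst", "0%", "tx"])
low, high = tolerance_band(time_us, tol)
print(f"reported {time_us} μs, expected between {low:.3f} μs and {high:.3f} μs")
```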

ID time GC time memory allocations
["findfirst", "0%", "base"] 2.200 ns (5%)
["findfirst", "0%", "tx"] 19.703 μs (5%) 11.97 KiB (1%) 219
["findfirst", "0%", "tx-noterm"] 19.302 μs (5%) 12.02 KiB (1%) 221
["findfirst", "0%", "tx-seq"] 191.419 ns (5%) 560 bytes (1%) 15
["findfirst", "10%", "base"] 87.608 μs (5%)
["findfirst", "10%", "tx"] 76.707 μs (5%) 14.44 KiB (1%) 271
["findfirst", "10%", "tx-noterm"] 228.821 μs (5%) 39.89 KiB (1%) 736
["findfirst", "10%", "tx-seq"] 88.108 μs (5%) 576 bytes (1%) 16
["findfirst", "20%", "base"] 175.315 μs (5%)
["findfirst", "20%", "tx"] 141.913 μs (5%) 21.41 KiB (1%) 398
["findfirst", "20%", "tx-noterm"] 237.521 μs (5%) 26.08 KiB (1%) 485
["findfirst", "20%", "tx-seq"] 175.915 μs (5%) 576 bytes (1%) 16
["findfirst", "30%", "base"] 282.133 μs (5%)
["findfirst", "30%", "tx"] 202.022 μs (5%) 28.34 KiB (1%) 525
["findfirst", "30%", "tx-noterm"] 261.729 μs (5%) 28.41 KiB (1%) 528
["findfirst", "30%", "tx-seq"] 263.629 μs (5%) 576 bytes (1%) 16
["findfirst", "40%", "base"] 350.434 μs (5%)
["findfirst", "40%", "tx"] 279.326 μs (5%) 35.36 KiB (1%) 654
["findfirst", "40%", "tx-noterm"] 312.029 μs (5%) 35.39 KiB (1%) 655
["findfirst", "40%", "tx-seq"] 351.134 μs (5%) 576 bytes (1%) 16
["findfirst", "50%", "base"] 438.046 μs (5%)
["findfirst", "50%", "tx"] 311.432 μs (5%) 37.81 KiB (1%) 705
["findfirst", "50%", "tx-noterm"] 410.142 μs (5%) 54.08 KiB (1%) 1004
["findfirst", "50%", "tx-seq"] 438.746 μs (5%) 576 bytes (1%) 16
["foreach", "base", "A .= B .+ B'"] 261.138 ms (5%) 33.849 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 198.328 ms (5%) 22.242 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 7.087 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 6.314 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 3.589 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 3.304 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 625.639 μs (5%)
["foreach_seq", "base", "Transpose"] 1.795 ms (5%)
["foreach_seq", "base", "Vector"] 621.138 μs (5%)
["foreach_seq", "tx", "Matrix"] 628.939 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.010 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 625.338 μs (5%)
["foreach_seq_double", "cartesian", "man"] 17.001 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 17.201 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 18.601 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 17.201 μs (5%)
["foreach_seq_double", "linear", "man"] 43.998 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 45.503 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 45.806 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 43.639 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 932.828 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 929.379 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 2.300 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 2.300 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.129 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 2.353 ms (5%) 1.19 MiB (1%) 534
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 500.534 μs (5%) 965.11 KiB (1%) 1226
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 532.736 μs (5%) 1.02 MiB (1%) 1246
["sort", "F64 (wide)", "Base"] 5.390 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 4.403 ms (5%) 1.19 MiB (1%) 563
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 2.992 ms (5%) 1.01 MiB (1%) 2149
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 3.514 ms (5%) 1.39 MiB (1%) 2195
["sort", "I64 (narrow)", "Base"] 112.809 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 103.708 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 103.307 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 105.307 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 5.541 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 3.681 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 2.729 ms (5%) 1.01 MiB (1%) 2237
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 3.231 ms (5%) 1.40 MiB (1%) 2271
["sort", "reversed", "Base"] 596.341 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 987.165 μs (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 774.652 μs (5%) 998.72 KiB (1%) 1869
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.129 ms (5%) 1.36 MiB (1%) 1902
["sort", "sorted", "Base"] 562.737 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 756.049 μs (5%) 1.18 MiB (1%) 431
["sort", "sorted", "ThreadsX.QuickSort"] 772.651 μs (5%) 998.75 KiB (1%) 1871
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.117 ms (5%) 1.36 MiB (1%) 1904
["unique", "rand(1:10, 1000000)", "base"] 7.884 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 4.101 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 7.150 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 4.352 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      49712 s          0 s       2562 s      31921 s          0 s
       #2  2095 MHz      57024 s          0 s       2882 s      23988 s          0 s
       
  Memory: 6.7648773193359375 GB (2300.4140625 MB free)
  Uptime: 861.0 sec
  Load Avg:  1.27392578125  1.3271484375  0.89697265625
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 09:08
  • Package commit: 7362ea
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["findfirst", "0%", "base"] 2.200 ns (5%)
["findfirst", "0%", "tx"] 23.101 μs (5%) 11.97 KiB (1%) 219
["findfirst", "0%", "tx-noterm"] 18.501 μs (5%) 11.97 KiB (1%) 218
["findfirst", "0%", "tx-seq"] 181.392 ns (5%) 544 bytes (1%) 14
["findfirst", "10%", "base"] 87.605 μs (5%)
["findfirst", "10%", "tx"] 76.405 μs (5%) 14.36 KiB (1%) 266
["findfirst", "10%", "tx-noterm"] 223.433 μs (5%) 32.89 KiB (1%) 602
["findfirst", "10%", "tx-seq"] 87.805 μs (5%) 560 bytes (1%) 15
["findfirst", "20%", "base"] 178.525 μs (5%)
["findfirst", "20%", "tx"] 145.520 μs (5%) 21.34 KiB (1%) 394
["findfirst", "20%", "tx-noterm"] 221.832 μs (5%) 30.66 KiB (1%) 566
["findfirst", "20%", "tx-seq"] 175.825 μs (5%) 560 bytes (1%) 15
["findfirst", "30%", "base"] 265.916 μs (5%)
["findfirst", "30%", "tx"] 196.012 μs (5%) 28.27 KiB (1%) 520
["findfirst", "30%", "tx-noterm"] 285.417 μs (5%) 28.28 KiB (1%) 520
["findfirst", "30%", "tx-seq"] 263.316 μs (5%) 560 bytes (1%) 15
["findfirst", "40%", "base"] 352.921 μs (5%)
["findfirst", "40%", "tx"] 274.216 μs (5%) 35.31 KiB (1%) 651
["findfirst", "40%", "tx-noterm"] 308.518 μs (5%) 35.28 KiB (1%) 648
["findfirst", "40%", "tx-seq"] 350.921 μs (5%) 560 bytes (1%) 15
["findfirst", "50%", "base"] 442.226 μs (5%)
["findfirst", "50%", "tx"] 312.318 μs (5%) 37.67 KiB (1%) 696
["findfirst", "50%", "tx-noterm"] 405.823 μs (5%) 42.30 KiB (1%) 782
["findfirst", "50%", "tx-seq"] 438.525 μs (5%) 560 bytes (1%) 15
["foreach", "base", "A .= B .+ B'"] 261.270 ms (5%) 28.847 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 195.908 ms (5%) 27.876 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 7.285 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 6.339 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 3.950 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 3.192 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 621.262 μs (5%)
["foreach_seq", "base", "Transpose"] 1.793 ms (5%)
["foreach_seq", "base", "Vector"] 621.362 μs (5%)
["foreach_seq", "tx", "Matrix"] 624.961 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.023 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 625.561 μs (5%)
["foreach_seq_double", "cartesian", "man"] 17.402 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 18.602 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 21.002 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 17.802 μs (5%)
["foreach_seq_double", "linear", "man"] 43.697 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 0.001 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 800.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 700.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 2.300 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 2.300 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.122 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 2.304 ms (5%) 1.19 MiB (1%) 534
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 500.257 μs (5%) 965.11 KiB (1%) 1226
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 524.358 μs (5%) 1.02 MiB (1%) 1246
["sort", "F64 (wide)", "Base"] 5.414 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 4.349 ms (5%) 1.19 MiB (1%) 564
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 3.021 ms (5%) 1.01 MiB (1%) 2146
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 3.533 ms (5%) 1.39 MiB (1%) 2195
["sort", "I64 (narrow)", "Base"] 111.513 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 105.012 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 100.712 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 103.112 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 5.370 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 3.764 ms (5%) 1.19 MiB (1%) 553
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 2.807 ms (5%) 1.01 MiB (1%) 2231
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 3.225 ms (5%) 1.40 MiB (1%) 2272
["sort", "reversed", "Base"] 596.866 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.031 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 800.188 μs (5%) 998.73 KiB (1%) 1870
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.164 ms (5%) 1.36 MiB (1%) 1904
["sort", "sorted", "Base"] 564.961 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 755.080 μs (5%) 1.18 MiB (1%) 432
["sort", "sorted", "ThreadsX.QuickSort"] 763.581 μs (5%) 998.73 KiB (1%) 1870
["sort", "sorted", "ThreadsX.StableQuickSort"] 900.095 μs (5%) 1.36 MiB (1%) 1902
["unique", "rand(1:10, 1000000)", "base"] 7.911 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 4.072 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 7.281 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 4.292 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      76308 s          0 s       3499 s      41498 s          0 s
       #2  2095 MHz      79033 s          0 s       3616 s      38372 s          0 s
       
  Memory: 6.7648773193359375 GB (2367.6484375 MB free)
  Uptime: 1234.0 sec
  Load Avg:  1.35302734375  1.38037109375  1.06787109375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.244
BogoMIPS:            4190.48
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves
| Cpu Property | Value |
| --- | --- |
| Brand | Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz |
| Vendor | :Intel |
| Architecture | :Skylake |
| Model | Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
| | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 1024, 36608) kbytes |
| | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 512 bit = 64 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
| | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

@codecov

codecov Bot commented Jun 28, 2020

Codecov Report

Merging #100 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master     #100   +/-   ##
=======================================
  Coverage   78.88%   78.88%           
=======================================
  Files           8        8           
  Lines         412      412           
=======================================
  Hits          325      325           
  Misses         87       87           
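
The coverage figure in the diff above can be checked by hand, assuming coverage is computed as hits divided by total lines (the usual definition; the helper name `coverage_percent` is only illustrative):

```python
def coverage_percent(hits, lines):
    """Line coverage as a percentage of tracked lines."""
    return 100.0 * hits / lines


hits, misses = 325, 87          # values from the coverage diff
lines = hits + misses           # should match the reported "Lines" row
pct = coverage_percent(hits, lines)
print(f"{lines} lines, {pct:.2f}% covered")
```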

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 16d6534...6b1c37b.

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmarks:
    • Target: 28 Jun 2020 - 09:32
    • Baseline: 28 Jun 2020 - 09:39
  • Package commits:
    • Target: a67605
    • Baseline: 3dc611
  • Julia commits:
    • Target: 44fa15
    • Baseline: 44fa15
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
| --- | --- | --- |
| ["findfirst", "0%", "tx"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "0%", "tx-noterm"] | 1.06 (5%) ❌ | 1.01 (1%) |
| ["findfirst", "0%", "tx-seq"] | 1.14 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "10%", "tx-noterm"] | 0.99 (5%) | 1.18 (1%) ❌ |
| ["findfirst", "10%", "tx-seq"] | 0.87 (5%) ✅ | 1.03 (1%) ❌ |
| ["findfirst", "20%", "tx"] | 0.93 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "20%", "tx-seq"] | 0.87 (5%) ✅ | 1.03 (1%) ❌ |
| ["findfirst", "30%", "base"] | 0.93 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "30%", "tx"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx-noterm"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx-seq"] | 1.01 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "40%", "tx"] | 1.13 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx-noterm"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx-seq"] | 1.15 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "50%", "tx-noterm"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "50%", "tx-seq"] | 1.15 (5%) ❌ | 1.03 (1%) ❌ |
| ["foreach_seq", "base", "Matrix"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "tx", "Matrix"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "tx", "Vector"] | 1.05 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "tx", ":simd => true"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "man"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 1.29 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 1.29 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "Base"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (narrow)", "Base"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["sort", "I64 (narrow)", "ThreadsX.MergeSort"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["sort", "sorted", "Base"] | 0.89 (5%) ✅ | 1.00 (1%) |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Target

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      63869 s          0 s       3321 s      43746 s          0 s
       #2  2397 MHz      49972 s          0 s       3943 s      55574 s          0 s
       
  Memory: 6.764884948730469 GB (2164.390625 MB free)
  Uptime: 1246.0 sec
  Load Avg:  1.2744140625  1.36083984375  1.064453125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, haswell)

Baseline

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      92998 s          0 s       4428 s      53311 s          0 s
       #2  2397 MHz      72349 s          0 s       4810 s      72168 s          0 s
       
  Memory: 6.764884948730469 GB (2366.59765625 MB free)
  Uptime: 1647.0 sec
  Load Avg:  1.33251953125  1.39794921875  1.1796875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, haswell)

Target result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 9:32
  • Package commit: a67605
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
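
The ID structure and the noise tolerance described above can be illustrated with a small sketch. It is written in Python so that it runs without any packages, and it does not use the BenchmarkTools or PkgBenchmark API: the nested `suite` dictionary and the helpers `lookup` and `tolerance_band` are hypothetical names introduced here for illustration. Only the two timings are taken from this report (3.700 ns and 28.901 μs for the `"0%"` group).

```python
from functools import reduce

# Hypothetical stand-in for a benchmark suite: nested groups keyed by the
# components of an ID such as ["findfirst", "0%", "tx"]. Times are in ns.
suite = {
    "findfirst": {
        "0%": {
            "base": {"time_ns": 3.700},
            "tx": {"time_ns": 28_901.0},  # 28.901 μs
        }
    }
}


def lookup(groups, benchmark_id):
    """Walk the nested groups one key at a time: [parent, child, ..., key]."""
    return reduce(lambda node, key: node[key], benchmark_id, groups)


def tolerance_band(value, tolerance):
    """Interval in which the 'true' value is expected to fall."""
    return value * (1 - tolerance), value * (1 + tolerance)


entry = lookup(suite, ["findfirst", "0%", "tx"])
low, high = tolerance_band(entry["time_ns"], 0.05)  # 5% time tolerance
print(f"reported {entry['time_ns']:.0f} ns, expected within [{low:.0f}, {high:.0f}] ns")
```

Reading the table this way, a reported 28.901 μs with a 5% tolerance is best treated as "roughly 27.5 to 30.3 μs" rather than as an exact figure.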

ID time GC time memory allocations
["findfirst", "0%", "base"] 3.700 ns (5%)
["findfirst", "0%", "tx"] 28.901 μs (5%) 11.97 KiB (1%) 219
["findfirst", "0%", "tx-noterm"] 26.101 μs (5%) 12.02 KiB (1%) 221
["findfirst", "0%", "tx-seq"] 292.969 ns (5%) 560 bytes (1%) 15
["findfirst", "10%", "base"] 78.002 μs (5%)
["findfirst", "10%", "tx"] 85.003 μs (5%) 14.44 KiB (1%) 271
["findfirst", "10%", "tx-noterm"] 217.907 μs (5%) 30.70 KiB (1%) 569
["findfirst", "10%", "tx-seq"] 68.702 μs (5%) 576 bytes (1%) 16
["findfirst", "20%", "base"] 155.804 μs (5%)
["findfirst", "20%", "tx"] 148.304 μs (5%) 21.42 KiB (1%) 399
["findfirst", "20%", "tx-noterm"] 213.306 μs (5%) 23.77 KiB (1%) 443
["findfirst", "20%", "tx-seq"] 136.604 μs (5%) 576 bytes (1%) 16
["findfirst", "30%", "base"] 203.906 μs (5%)
["findfirst", "30%", "tx"] 209.506 μs (5%) 28.34 KiB (1%) 525
["findfirst", "30%", "tx-noterm"] 228.007 μs (5%) 28.42 KiB (1%) 529
["findfirst", "30%", "tx-seq"] 218.207 μs (5%) 576 bytes (1%) 16
["findfirst", "40%", "base"] 271.808 μs (5%)
["findfirst", "40%", "tx"] 296.910 μs (5%) 35.39 KiB (1%) 656
["findfirst", "40%", "tx-noterm"] 287.709 μs (5%) 35.42 KiB (1%) 657
["findfirst", "40%", "tx-seq"] 312.109 μs (5%) 576 bytes (1%) 16
["findfirst", "50%", "base"] 340.010 μs (5%)
["findfirst", "50%", "tx"] 328.510 μs (5%) 37.81 KiB (1%) 705
["findfirst", "50%", "tx-noterm"] 352.311 μs (5%) 40.22 KiB (1%) 754
["findfirst", "50%", "tx-seq"] 391.212 μs (5%) 576 bytes (1%) 16
["foreach", "base", "A .= B .+ B'"] 508.550 ms (5%) 43.270 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 284.659 ms (5%) 45.048 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 70.842 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 8.321 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 35.772 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 6.621 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 708.033 μs (5%)
["foreach_seq", "base", "Transpose"] 3.650 ms (5%)
["foreach_seq", "base", "Vector"] 675.732 μs (5%)
["foreach_seq", "tx", "Matrix"] 714.033 μs (5%)
["foreach_seq", "tx", "Transpose"] 2.268 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 685.932 μs (5%)
["foreach_seq_double", "cartesian", "man"] 23.301 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 25.501 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 22.701 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 25.801 μs (5%)
["foreach_seq_double", "linear", "man"] 107.838 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 99.897 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 104.242 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 100.647 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 2.189 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 2.200 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 3.425 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 3.425 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.633 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 2.982 ms (5%) 1.19 MiB (1%) 535
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 1.747 ms (5%) 965.09 KiB (1%) 1225
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 1.775 ms (5%) 1.02 MiB (1%) 1245
["sort", "F64 (wide)", "Base"] 6.510 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.592 ms (5%) 1.19 MiB (1%) 563
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 5.404 ms (5%) 1.01 MiB (1%) 2145
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 6.309 ms (5%) 1.39 MiB (1%) 2193
["sort", "I64 (narrow)", "Base"] 145.408 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 166.809 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 164.609 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 166.609 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 6.195 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.659 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 4.480 ms (5%) 1.01 MiB (1%) 2236
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 5.434 ms (5%) 1.40 MiB (1%) 2270
["sort", "reversed", "Base"] 800.142 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.310 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 1.280 ms (5%) 998.75 KiB (1%) 1871
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.784 ms (5%) 1.36 MiB (1%) 1902
["sort", "sorted", "Base"] 721.137 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 967.948 μs (5%) 1.18 MiB (1%) 431
["sort", "sorted", "ThreadsX.QuickSort"] 1.294 ms (5%) 998.75 KiB (1%) 1871
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.443 ms (5%) 1.36 MiB (1%) 1904
["unique", "rand(1:10, 1000000)", "base"] 10.354 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.588 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 9.594 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 6.031 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      63869 s          0 s       3321 s      43746 s          0 s
       #2  2397 MHz      49972 s          0 s       3943 s      55574 s          0 s
       
  Memory: 6.764884948730469 GB (2164.390625 MB free)
  Uptime: 1246.0 sec
  Load Avg:  1.2744140625  1.36083984375  1.064453125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, haswell)

Baseline result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 9:39
  • Package commit: 3dc611
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["findfirst", "0%", "base"] 3.600 ns (5%)
["findfirst", "0%", "tx"] 25.701 μs (5%) 11.95 KiB (1%) 218
["findfirst", "0%", "tx-noterm"] 24.701 μs (5%) 11.95 KiB (1%) 217
["findfirst", "0%", "tx-seq"] 257.194 ns (5%) 544 bytes (1%) 14
["findfirst", "10%", "base"] 78.303 μs (5%)
["findfirst", "10%", "tx"] 84.404 μs (5%) 14.36 KiB (1%) 266
["findfirst", "10%", "tx-noterm"] 219.709 μs (5%) 25.94 KiB (1%) 476
["findfirst", "10%", "tx-seq"] 78.603 μs (5%) 560 bytes (1%) 15
["findfirst", "20%", "base"] 156.006 μs (5%)
["findfirst", "20%", "tx"] 160.307 μs (5%) 21.33 KiB (1%) 393
["findfirst", "20%", "tx-noterm"] 209.008 μs (5%) 23.66 KiB (1%) 436
["findfirst", "20%", "tx-seq"] 156.206 μs (5%) 560 bytes (1%) 15
["findfirst", "30%", "base"] 218.109 μs (5%)
["findfirst", "30%", "tx"] 193.308 μs (5%) 28.27 KiB (1%) 520
["findfirst", "30%", "tx-noterm"] 204.909 μs (5%) 28.30 KiB (1%) 521
["findfirst", "30%", "tx-seq"] 217.009 μs (5%) 560 bytes (1%) 15
["findfirst", "40%", "base"] 271.711 μs (5%)
["findfirst", "40%", "tx"] 262.911 μs (5%) 35.31 KiB (1%) 651
["findfirst", "40%", "tx-noterm"] 272.611 μs (5%) 35.30 KiB (1%) 649
["findfirst", "40%", "tx-seq"] 272.211 μs (5%) 560 bytes (1%) 15
["findfirst", "50%", "base"] 339.715 μs (5%)
["findfirst", "50%", "tx"] 318.014 μs (5%) 37.69 KiB (1%) 697
["findfirst", "50%", "tx-noterm"] 331.014 μs (5%) 40.06 KiB (1%) 744
["findfirst", "50%", "tx-seq"] 341.314 μs (5%) 560 bytes (1%) 15
["foreach", "base", "A .= B .+ B'"] 520.405 ms (5%) 53.001 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 290.750 ms (5%) 49.456 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 71.793 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 8.576 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 37.229 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 6.751 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 650.923 μs (5%)
["foreach_seq", "base", "Transpose"] 3.487 ms (5%)
["foreach_seq", "base", "Vector"] 651.123 μs (5%)
["foreach_seq", "tx", "Matrix"] 656.624 μs (5%)
["foreach_seq", "tx", "Transpose"] 2.233 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 651.223 μs (5%)
["foreach_seq_double", "cartesian", "man"] 23.200 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 23.401 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 23.400 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 23.300 μs (5%)
["foreach_seq_double", "linear", "man"] 100.647 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 100.000 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 100.000 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 100.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 1.700 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 1.700 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 3.300 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 3.200 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.440 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 2.874 ms (5%) 1.19 MiB (1%) 535
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 1.702 ms (5%) 965.09 KiB (1%) 1225
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 1.738 ms (5%) 1.02 MiB (1%) 1246
["sort", "F64 (wide)", "Base"] 6.447 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.556 ms (5%) 1.19 MiB (1%) 563
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 5.425 ms (5%) 1.01 MiB (1%) 2143
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 6.290 ms (5%) 1.39 MiB (1%) 2194
["sort", "I64 (narrow)", "Base"] 154.305 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 149.706 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 158.706 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 149.706 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 6.253 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.686 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 4.497 ms (5%) 1.01 MiB (1%) 2236
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 5.419 ms (5%) 1.40 MiB (1%) 2272
["sort", "reversed", "Base"] 786.630 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.326 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 1.271 ms (5%) 998.73 KiB (1%) 1870
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.775 ms (5%) 1.36 MiB (1%) 1903
["sort", "sorted", "Base"] 809.330 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 977.135 μs (5%) 1.18 MiB (1%) 430
["sort", "sorted", "ThreadsX.QuickSort"] 1.321 ms (5%) 998.73 KiB (1%) 1870
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.410 ms (5%) 1.36 MiB (1%) 1902
["unique", "rand(1:10, 1000000)", "base"] 10.021 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.449 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 9.697 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.984 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz: 
              speed         user         nice          sys         idle          irq
       #1  2397 MHz      92998 s          0 s       4428 s      53311 s          0 s
       #2  2397 MHz      72349 s          0 s       4810 s      72168 s          0 s
       
  Memory: 6.764884948730469 GB (2366.59765625 MB free)
  Uptime: 1647.0 sec
  Load Avg:  1.33251953125  1.39794921875  1.1796875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, haswell)

Runtime information

Runtime Info
BLAS #threads       2
BLAS.vendor()       openblas64
Sys.CPU_THREADS     2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Stepping:            2
CPU MHz:             2397.220
BogoMIPS:            4794.44
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            30720K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm invpcid_single pti fsgsbase bmi1 avx2 smep bmi2 erms invpcid xsaveopt md_clear

Cpu Property        Value
Brand               Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz
Vendor              :Intel
Architecture        :Haswell
Model               Family: 0x06, Model: 0x3f, Stepping: 0x02, Type: 0x00
Cores               2 physical cores, 2 logical cores (on executing CPU)
                    No Hyperthreading detected
Clock Frequencies   Not supported by CPU
Data Cache          Level 1:3 : (32, 256, 30720) kbytes
                    64 byte cache line size
Address Size        48 bits virtual, 44 bits physical
SIMD                256 bit = 32 byte max. SIMD vector size
Time Stamp Counter  TSC is accessible via rdtsc
                    TSC increased at every clock cycle (non-invariant TSC)
Perf. Monitoring    Performance Monitoring Counters (PMC) are not supported
Hypervisor          Yes, Microsoft

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmarks:
    • Target: 28 Jun 2020 - 09:38
    • Baseline: 28 Jun 2020 - 09:44
  • Package commits:
    • Target: c017c6
    • Baseline: 3dc611
  • Julia commits:
    • Target: 44fa15
    • Baseline: 44fa15
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
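
As a rough illustration of how the marks in the table below relate to the ratios and tolerances, here is a minimal Python sketch. It assumes the judgement is a strict comparison of the ratio against 1 ± tolerance; that rule is consistent with the rows shown here but is an assumption, not something this report states. The function names are hypothetical, and the inputs are the `["findfirst", "10%", "base"]` and `["findfirst", "0%", "tx-seq"]` values from the target and baseline results of this run.

```python
def ratio(target, baseline):
    """Target measurement divided by baseline measurement (same units)."""
    return target / baseline


def judge(r, tolerance):
    """Classify a ratio; assumed rule: outside 1 ± tolerance is significant."""
    if r - tolerance > 1.0:
        return "regression"   # shown as ❌ in the report
    if r + tolerance < 1.0:
        return "improvement"  # shown as ✅ in the report
    return "invariant"        # no mark


# ["findfirst", "10%", "base"]: 101.700 μs (target) vs 67.900 μs (baseline)
t = ratio(101.700, 67.900)
print(f"time ratio {t:.2f}: {judge(t, 0.05)}")

# ["findfirst", "0%", "tx-seq"]: time 223.219 ns vs 215.306 ns,
# memory 560 bytes vs 544 bytes
t = ratio(223.219, 215.306)
m = ratio(560, 544)
print(f"time ratio {t:.2f}: {judge(t, 0.05)}")
print(f"memory ratio {m:.2f}: {judge(m, 0.01)}")
```

Under this rule the second benchmark appears in the table only because of its memory ratio: the time ratio of about 1.04 stays inside the 5% band, while the memory ratio of about 1.03 exceeds the 1% band.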

ID time ratio memory ratio
["findfirst", "0%", "tx-seq"] 1.04 (5%) 1.03 (1%) ❌
["findfirst", "10%", "base"] 1.50 (5%) ❌ 1.00 (1%)
["findfirst", "10%", "tx"] 1.15 (5%) ❌ 1.01 (1%)
["findfirst", "10%", "tx-noterm"] 1.09 (5%) ❌ 1.13 (1%) ❌
["findfirst", "10%", "tx-seq"] 1.50 (5%) ❌ 1.03 (1%) ❌
["findfirst", "20%", "base"] 1.48 (5%) ❌ 1.00 (1%)
["findfirst", "20%", "tx-noterm"] 1.10 (5%) ❌ 1.79 (1%) ❌
["findfirst", "20%", "tx-seq"] 1.65 (5%) ❌ 1.03 (1%) ❌
["findfirst", "30%", "base"] 1.72 (5%) ❌ 1.00 (1%)
["findfirst", "30%", "tx"] 1.14 (5%) ❌ 1.00 (1%)
["findfirst", "30%", "tx-noterm"] 1.16 (5%) ❌ 1.00 (1%)
["findfirst", "30%", "tx-seq"] 1.73 (5%) ❌ 1.03 (1%) ❌
["findfirst", "40%", "base"] 1.54 (5%) ❌ 1.00 (1%)
["findfirst", "40%", "tx-noterm"] 1.24 (5%) ❌ 1.00 (1%)
["findfirst", "40%", "tx-seq"] 1.74 (5%) ❌ 1.03 (1%) ❌
["findfirst", "50%", "base"] 1.72 (5%) ❌ 1.00 (1%)
["findfirst", "50%", "tx"] 1.07 (5%) ❌ 1.00 (1%)
["findfirst", "50%", "tx-noterm"] 1.22 (5%) ❌ 0.96 (1%) ✅
["findfirst", "50%", "tx-seq"] 1.57 (5%) ❌ 1.03 (1%) ❌
["foreach", "base", "A .= B .+ B'"] 0.93 (5%) ✅ 1.00 (1%)
["foreach", "base", "A .= B .+ C"] 0.93 (5%) ✅ 1.00 (1%)
["foreach", "broadcast", "A .= B .+ B'"] 1.11 (5%) ❌ 1.00 (1%)
["foreach", "tx", "A .= B .+ B'"] 0.90 (5%) ✅ 1.00 (1%)
["foreach_seq", "base", "Transpose"] 0.85 (5%) ✅ 1.00 (1%)
["foreach_seq", "tx", "Vector"] 1.16 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "cartesian", "man"] 1.28 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 1.09 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 1.19 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 1.09 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 51373.35 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 51678.54 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 49491.87 (5%) ❌ 1.00 (1%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 1.06 (5%) ❌ 1.00 (1%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 1.06 (5%) ❌ 1.00 (1%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 0.89 (5%) ✅ 1.00 (1%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 0.90 (5%) ✅ 1.00 (1%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 1.08 (5%) ❌ 1.00 (1%)
["sort", "F64 (wide)", "Base"] 0.94 (5%) ✅ 1.00 (1%)
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 1.11 (5%) ❌ 1.00 (1%)
["sort", "I64 (wide)", "Base"] 0.91 (5%) ✅ 1.00 (1%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 1.12 (5%) ❌ 1.00 (1%)
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 1.06 (5%) ❌ 1.00 (1%)
["sort", "reversed", "Base"] 1.06 (5%) ❌ 1.00 (1%)
["sort", "reversed", "ThreadsX.MergeSort"] 0.93 (5%) ✅ 1.00 (1%)
["sort", "reversed", "ThreadsX.QuickSort"] 0.95 (5%) ✅ 1.00 (1%)
["sort", "reversed", "ThreadsX.StableQuickSort"] 0.92 (5%) ✅ 1.00 (1%)
["sort", "sorted", "ThreadsX.MergeSort"] 0.86 (5%) ✅ 1.00 (1%)
["unique", "rand(1:1000, 1000000)", "base"] 1.05 (5%) ❌ 1.00 (1%)

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Target

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      50617 s          0 s       2256 s      35814 s          0 s
       #2  2095 MHz      58729 s          0 s       3503 s      26468 s          0 s
       
  Memory: 6.764884948730469 GB (2237.3984375 MB free)
  Uptime: 907.0 sec
  Load Avg:  1.32470703125  1.384765625  0.94921875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      76398 s          0 s       3172 s      45497 s          0 s
       #2  2095 MHz      80964 s          0 s       4227 s      39907 s          0 s
       
  Memory: 6.764884948730469 GB (2377.15625 MB free)
  Uptime: 1273.0 sec
  Load Avg:  1.35009765625  1.390625  1.091796875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 9:38
  • Package commit: c017c6
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["findfirst", "0%", "base"] 2.600 ns (5%)
["findfirst", "0%", "tx"] 24.901 μs (5%) 11.98 KiB (1%) 220
["findfirst", "0%", "tx-noterm"] 21.400 μs (5%) 12.02 KiB (1%) 221
["findfirst", "0%", "tx-seq"] 223.219 ns (5%) 560 bytes (1%) 15
["findfirst", "10%", "base"] 101.700 μs (5%)
["findfirst", "10%", "tx"] 88.301 μs (5%) 14.44 KiB (1%) 271
["findfirst", "10%", "tx-noterm"] 277.901 μs (5%) 39.91 KiB (1%) 737
["findfirst", "10%", "tx-seq"] 87.900 μs (5%) 576 bytes (1%) 16
["findfirst", "20%", "base"] 203.500 μs (5%)
["findfirst", "20%", "tx"] 171.201 μs (5%) 21.41 KiB (1%) 398
["findfirst", "20%", "tx-noterm"] 294.101 μs (5%) 42.23 KiB (1%) 778
["findfirst", "20%", "tx-seq"] 225.800 μs (5%) 576 bytes (1%) 16
["findfirst", "30%", "base"] 305.201 μs (5%)
["findfirst", "30%", "tx"] 256.201 μs (5%) 28.34 KiB (1%) 525
["findfirst", "30%", "tx-noterm"] 325.802 μs (5%) 28.39 KiB (1%) 527
["findfirst", "30%", "tx-seq"] 306.001 μs (5%) 576 bytes (1%) 16
["findfirst", "40%", "base"] 406.902 μs (5%)
["findfirst", "40%", "tx"] 319.201 μs (5%) 35.39 KiB (1%) 656
["findfirst", "40%", "tx-noterm"] 381.602 μs (5%) 35.44 KiB (1%) 658
["findfirst", "40%", "tx-seq"] 407.701 μs (5%) 576 bytes (1%) 16
["findfirst", "50%", "base"] 508.702 μs (5%)
["findfirst", "50%", "tx"] 386.601 μs (5%) 37.81 KiB (1%) 705
["findfirst", "50%", "tx-noterm"] 487.803 μs (5%) 51.77 KiB (1%) 963
["findfirst", "50%", "tx-seq"] 509.502 μs (5%) 576 bytes (1%) 16
["foreach", "base", "A .= B .+ B'"] 288.770 ms (5%) 24.363 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 221.742 ms (5%) 23.282 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 8.571 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 6.714 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 4.062 ms (5%) 25.92 KiB (1%) 359
["foreach", "tx", "A .= B .+ C"] 3.356 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 626.302 μs (5%)
["foreach_seq", "base", "Transpose"] 1.793 ms (5%)
["foreach_seq", "base", "Vector"] 626.402 μs (5%)
["foreach_seq", "tx", "Matrix"] 629.602 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.106 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 727.002 μs (5%)
["foreach_seq_double", "cartesian", "man"] 22.600 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 22.400 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 26.600 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 22.800 μs (5%)
["foreach_seq_double", "linear", "man"] 49.695 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 51.373 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 51.679 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 49.492 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 1.060 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 950.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 2.675 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 2.688 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.493 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 2.890 ms (5%) 1.19 MiB (1%) 535
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 571.902 μs (5%) 965.13 KiB (1%) 1227
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 615.602 μs (5%) 1.02 MiB (1%) 1246
["sort", "F64 (wide)", "Base"] 6.450 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.695 ms (5%) 1.19 MiB (1%) 563
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 3.784 ms (5%) 1.01 MiB (1%) 2148
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 4.403 ms (5%) 1.39 MiB (1%) 2195
["sort", "I64 (narrow)", "Base"] 129.701 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 118.701 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 119.600 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 115.201 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 6.389 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.704 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 3.543 ms (5%) 1.01 MiB (1%) 2238
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 3.964 ms (5%) 1.40 MiB (1%) 2270
["sort", "reversed", "Base"] 711.002 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.164 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 930.703 μs (5%) 998.77 KiB (1%) 1872
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.299 ms (5%) 1.36 MiB (1%) 1902
["sort", "sorted", "Base"] 573.402 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 841.302 μs (5%) 1.18 MiB (1%) 432
["sort", "sorted", "ThreadsX.QuickSort"] 888.302 μs (5%) 998.77 KiB (1%) 1872
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.077 ms (5%) 1.36 MiB (1%) 1902
["unique", "rand(1:10, 1000000)", "base"] 10.015 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.421 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 9.540 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.636 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      50617 s          0 s       2256 s      35814 s          0 s
       #2  2095 MHz      58729 s          0 s       3503 s      26468 s          0 s
       
  Memory: 6.764884948730469 GB (2237.3984375 MB free)
  Uptime: 907.0 sec
  Load Avg:  1.32470703125  1.384765625  0.94921875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 9:44
  • Package commit: 3dc611
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["findfirst", "0%", "base"] 2.600 ns (5%)
["findfirst", "0%", "tx"] 24.000 μs (5%) 11.97 KiB (1%) 219
["findfirst", "0%", "tx-noterm"] 21.700 μs (5%) 11.97 KiB (1%) 218
["findfirst", "0%", "tx-seq"] 215.306 ns (5%) 544 bytes (1%) 14
["findfirst", "10%", "base"] 67.900 μs (5%)
["findfirst", "10%", "tx"] 76.600 μs (5%) 14.36 KiB (1%) 266
["findfirst", "10%", "tx-noterm"] 255.102 μs (5%) 35.20 KiB (1%) 646
["findfirst", "10%", "tx-seq"] 58.700 μs (5%) 560 bytes (1%) 15
["findfirst", "20%", "base"] 137.201 μs (5%)
["findfirst", "20%", "tx"] 168.102 μs (5%) 21.34 KiB (1%) 394
["findfirst", "20%", "tx-noterm"] 268.402 μs (5%) 23.64 KiB (1%) 435
["findfirst", "20%", "tx-seq"] 137.001 μs (5%) 560 bytes (1%) 15
["findfirst", "30%", "base"] 177.402 μs (5%)
["findfirst", "30%", "tx"] 225.702 μs (5%) 28.27 KiB (1%) 520
["findfirst", "30%", "tx-noterm"] 280.702 μs (5%) 28.28 KiB (1%) 520
["findfirst", "30%", "tx-seq"] 176.601 μs (5%) 560 bytes (1%) 15
["findfirst", "40%", "base"] 264.002 μs (5%)
["findfirst", "40%", "tx"] 311.502 μs (5%) 35.31 KiB (1%) 651
["findfirst", "40%", "tx-noterm"] 307.302 μs (5%) 35.31 KiB (1%) 650
["findfirst", "40%", "tx-seq"] 234.901 μs (5%) 560 bytes (1%) 15
["findfirst", "50%", "base"] 295.202 μs (5%)
["findfirst", "50%", "tx"] 361.703 μs (5%) 37.72 KiB (1%) 699
["findfirst", "50%", "tx-noterm"] 400.803 μs (5%) 53.91 KiB (1%) 993
["findfirst", "50%", "tx-seq"] 324.802 μs (5%) 560 bytes (1%) 15
["foreach", "base", "A .= B .+ B'"] 311.899 ms (5%) 33.243 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 239.605 ms (5%) 33.841 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 7.754 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 6.690 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 4.519 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 3.348 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 626.003 μs (5%)
["foreach_seq", "base", "Transpose"] 2.102 ms (5%)
["foreach_seq", "base", "Vector"] 625.803 μs (5%)
["foreach_seq", "tx", "Matrix"] 625.803 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.083 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 626.203 μs (5%)
["foreach_seq_double", "cartesian", "man"] 17.700 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 20.600 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 22.300 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 21.000 μs (5%)
["foreach_seq_double", "linear", "man"] 49.593 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 0.001 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 1.000 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 900.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 3.000 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 3.000 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.475 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 2.681 ms (5%) 1.19 MiB (1%) 535
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 564.903 μs (5%) 965.14 KiB (1%) 1228
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 598.503 μs (5%) 1.02 MiB (1%) 1245
["sort", "F64 (wide)", "Base"] 6.878 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.682 ms (5%) 1.19 MiB (1%) 563
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 3.839 ms (5%) 1.01 MiB (1%) 2145
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 4.502 ms (5%) 1.39 MiB (1%) 2196
["sort", "I64 (narrow)", "Base"] 129.201 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 119.700 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 107.401 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 116.301 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 7.055 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.214 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 3.332 ms (5%) 1.01 MiB (1%) 2237
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 4.158 ms (5%) 1.40 MiB (1%) 2272
["sort", "reversed", "Base"] 672.004 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.254 ms (5%) 1.18 MiB (1%) 432
["sort", "reversed", "ThreadsX.QuickSort"] 983.305 μs (5%) 998.72 KiB (1%) 1869
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.415 ms (5%) 1.36 MiB (1%) 1904
["sort", "sorted", "Base"] 588.203 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 977.105 μs (5%) 1.18 MiB (1%) 432
["sort", "sorted", "ThreadsX.QuickSort"] 910.705 μs (5%) 998.75 KiB (1%) 1871
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.094 ms (5%) 1.36 MiB (1%) 1903
["unique", "rand(1:10, 1000000)", "base"] 9.804 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.364 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 9.047 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.528 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      76398 s          0 s       3172 s      45497 s          0 s
       #2  2095 MHz      80964 s          0 s       4227 s      39907 s          0 s
       
  Memory: 6.764884948730469 GB (2377.15625 MB free)
  Uptime: 1273.0 sec
  Load Avg:  1.35009765625  1.390625  1.091796875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Runtime information

Runtime Info
BLAS #threads    2
BLAS.vendor()    openblas64
Sys.CPU_THREADS  2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.076
BogoMIPS:            4190.15
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves
Cpu Property        Value
Brand               Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Vendor              :Intel
Architecture        :Skylake
Model               Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00
Cores               2 physical cores, 2 logical cores (on executing CPU)
                    No Hyperthreading detected
Clock Frequencies   Not supported by CPU
Data Cache          Level 1:3 : (32, 1024, 36608) kbytes
                    64 byte cache line size
Address Size        48 bits virtual, 44 bits physical
SIMD                512 bit = 64 byte max. SIMD vector size
Time Stamp Counter  TSC is accessible via rdtsc
                    TSC increased at every clock cycle (non-invariant TSC)
Perf. Monitoring    Performance Monitoring Counters (PMC) are not supported
Hypervisor          Yes, Microsoft

@tkf force-pushed the create-pull-request/pkg-update branch from dea8660 to d4939e0 on June 28, 2020 09:54
@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmarks:
    • Target: 28 Jun 2020 - 10:07
    • Baseline: 28 Jun 2020 - 10:13
  • Package commits:
    • Target: e30bca
    • Baseline: ab83e0
  • Julia commits:
    • Target: 44fa15
    • Baseline: 44fa15
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).
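
The rule above can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's actual code: the ratio is taken to be target divided by baseline, and a result counts as significant only when it differs from 1.0 by more than the tolerance shown in parentheses. The helper name `judge` is hypothetical here.

```python
def judge(ratio, tolerance):
    """Classify a target/baseline ratio against a noise tolerance."""
    if ratio - tolerance > 1.0:
        return "regression"
    if ratio + tolerance < 1.0:
        return "improvement"
    return "invariant"

# Times for ["findfirst", "20%", "base"] taken from the Target result and
# Baseline result tables of this report.
target_us, baseline_us = 203.500, 137.801
ratio = target_us / baseline_us
print(round(ratio, 2), judge(ratio, 0.05))
```

With these inputs the rounded ratio is 1.48, which agrees with the row for that ID in the table below.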

ID time ratio memory ratio
["findfirst", "0%", "base"] 0.87 (5%) ✅ 1.00 (1%)
["findfirst", "0%", "tx-seq"] 0.79 (5%) ✅ 1.03 (1%) ❌
["findfirst", "10%", "base"] 1.29 (5%) ❌ 1.00 (1%)
["findfirst", "10%", "tx"] 0.90 (5%) ✅ 1.01 (1%)
["findfirst", "10%", "tx-noterm"] 1.01 (5%) 0.95 (1%) ✅
["findfirst", "10%", "tx-seq"] 1.31 (5%) ❌ 1.03 (1%) ❌
["findfirst", "20%", "base"] 1.48 (5%) ❌ 1.00 (1%)
["findfirst", "20%", "tx"] 1.06 (5%) ❌ 1.00 (1%)
["findfirst", "20%", "tx-noterm"] 1.13 (5%) ❌ 1.25 (1%) ❌
["findfirst", "20%", "tx-seq"] 1.49 (5%) ❌ 1.03 (1%) ❌
["findfirst", "30%", "base"] 1.34 (5%) ❌ 1.00 (1%)
["findfirst", "30%", "tx-seq"] 1.29 (5%) ❌ 1.03 (1%) ❌
["findfirst", "40%", "base"] 1.28 (5%) ❌ 1.00 (1%)
["findfirst", "40%", "tx"] 0.91 (5%) ✅ 1.00 (1%)
["findfirst", "40%", "tx-noterm"] 1.05 (5%) ❌ 1.00 (1%)
["findfirst", "40%", "tx-seq"] 1.29 (5%) ❌ 1.03 (1%) ❌
["findfirst", "50%", "base"] 1.28 (5%) ❌ 1.00 (1%)
["findfirst", "50%", "tx-noterm"] 1.12 (5%) ❌ 0.96 (1%) ✅
["findfirst", "50%", "tx-seq"] 1.29 (5%) ❌ 1.03 (1%) ❌
["foreach_seq", "base", "Matrix"] 1.16 (5%) ❌ 1.00 (1%)
["foreach_seq", "base", "Transpose"] 1.30 (5%) ❌ 1.00 (1%)
["foreach_seq", "base", "Vector"] 1.16 (5%) ❌ 1.00 (1%)
["foreach_seq", "tx", "Matrix"] 1.17 (5%) ❌ 1.00 (1%)
["foreach_seq", "tx", "Transpose"] 1.06 (5%) ❌ 1.00 (1%)
["foreach_seq", "tx", "Vector"] 1.16 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "cartesian", "man"] 1.29 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 1.22 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 1.13 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 1.29 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "linear", "man"] 1.13 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 51434.43 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 51741.80 (5%) ❌ 1.00 (1%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 49337.41 (5%) ❌ 1.00 (1%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 1.06 (5%) ❌ 1.00 (1%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 1.18 (5%) ❌ 1.00 (1%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 0.89 (5%) ✅ 1.00 (1%)
["sort", "F64 (narrow)", "Base"] 1.09 (5%) ❌ 1.00 (1%)
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 1.06 (5%) ❌ 1.00 (1%)
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 1.11 (5%) ❌ 1.00 (1%)
["sort", "F64 (wide)", "Base"] 0.92 (5%) ✅ 1.00 (1%)
["sort", "I64 (wide)", "Base"] 1.14 (5%) ❌ 1.00 (1%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 1.13 (5%) ❌ 1.00 (1%)
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 1.10 (5%) ❌ 1.00 (1%)
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 1.16 (5%) ❌ 1.00 (1%)
["sort", "reversed", "Base"] 1.31 (5%) ❌ 1.00 (1%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.07 (5%) ❌ 1.00 (1%)
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.08 (5%) ❌ 1.00 (1%)
["sort", "sorted", "Base"] 1.26 (5%) ❌ 1.00 (1%)
["sort", "sorted", "ThreadsX.QuickSort"] 1.20 (5%) ❌ 1.00 (1%)
["unique", "rand(1:10, 1000000)", "base"] 1.12 (5%) ❌ 1.00 (1%)
["unique", "rand(1:1000, 1000000)", "tx"] 0.90 (5%) ✅ 1.00 (1%)
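
The three ["foreach_seq_double", "linear", "tx", ...] rows stand out with ratios of about 50000. The Target result table reports roughly 50 ns for them while the Baseline result table reports 0.001 ns, so the ratio is dominated by the tiny baseline. A baseline that small probably means no real work was timed in that run; this is an interpretation, not something the report states. A minimal Python sketch that reproduces the ratio and flags such rows follows. The helper name and the `floor_ns` threshold are arbitrary, hypothetical choices.

```python
def suspicious_ratio(target_ns, baseline_ns, floor_ns=0.01):
    """Return (ratio, flagged); flagged is True when the baseline is below floor_ns."""
    ratio = target_ns / baseline_ns
    return ratio, baseline_ns < floor_ns

# ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"]: 51.434 ns vs 0.001 ns
ratio, flagged = suspicious_ratio(51.434, 0.001)
print(round(ratio), flagged)
```

The computed value matches the reported 51434.43 up to the rounding of the displayed times.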

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Target

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      64405 s          0 s       2815 s      17716 s          0 s
       #2  2095 MHz      43783 s          0 s       2711 s      39216 s          0 s
       
  Memory: 6.764884948730469 GB (2131.84765625 MB free)
  Uptime: 874.0 sec
  Load Avg:  1.2177734375  1.28662109375  0.9033203125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      91778 s          0 s       3767 s      25798 s          0 s
       #2  2095 MHz      64585 s          0 s       3267 s      54301 s          0 s
       
  Memory: 6.764884948730469 GB (2300.484375 MB free)
  Uptime: 1240.0 sec
  Load Avg:  1.46484375  1.3935546875  1.078125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 10:07
  • Package commit: e30bca
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["findfirst", "0%", "base"] 2.600 ns (5%)
["findfirst", "0%", "tx"] 23.700 μs (5%) 11.98 KiB (1%) 220
["findfirst", "0%", "tx-noterm"] 23.501 μs (5%) 12.02 KiB (1%) 221
["findfirst", "0%", "tx-seq"] 192.421 ns (5%) 560 bytes (1%) 15
["findfirst", "10%", "base"] 87.700 μs (5%)
["findfirst", "10%", "tx"] 89.800 μs (5%) 14.44 KiB (1%) 271
["findfirst", "10%", "tx-noterm"] 263.702 μs (5%) 39.89 KiB (1%) 736
["findfirst", "10%", "tx-seq"] 102.200 μs (5%) 576 bytes (1%) 16
["findfirst", "20%", "base"] 203.500 μs (5%)
["findfirst", "20%", "tx"] 187.201 μs (5%) 21.41 KiB (1%) 398
["findfirst", "20%", "tx-noterm"] 302.202 μs (5%) 35.42 KiB (1%) 658
["findfirst", "20%", "tx-seq"] 204.301 μs (5%) 576 bytes (1%) 16
["findfirst", "30%", "base"] 275.802 μs (5%)
["findfirst", "30%", "tx"] 247.602 μs (5%) 28.34 KiB (1%) 525
["findfirst", "30%", "tx-noterm"] 304.402 μs (5%) 28.39 KiB (1%) 527
["findfirst", "30%", "tx-seq"] 263.602 μs (5%) 576 bytes (1%) 16
["findfirst", "40%", "base"] 350.402 μs (5%)
["findfirst", "40%", "tx"] 315.801 μs (5%) 35.38 KiB (1%) 655
["findfirst", "40%", "tx-noterm"] 365.001 μs (5%) 35.45 KiB (1%) 659
["findfirst", "40%", "tx-seq"] 351.501 μs (5%) 576 bytes (1%) 16
["findfirst", "50%", "base"] 438.102 μs (5%)
["findfirst", "50%", "tx"] 409.202 μs (5%) 37.81 KiB (1%) 705
["findfirst", "50%", "tx-noterm"] 510.102 μs (5%) 51.75 KiB (1%) 962
["findfirst", "50%", "tx-seq"] 439.103 μs (5%) 576 bytes (1%) 16
["foreach", "base", "A .= B .+ B'"] 289.800 ms (5%) 25.065 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 222.702 ms (5%) 34.203 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 8.137 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 6.451 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 4.300 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 3.271 ms (5%) 12.73 KiB (1%) 123
["foreach_seq", "base", "Matrix"] 726.602 μs (5%)
["foreach_seq", "base", "Transpose"] 2.271 ms (5%)
["foreach_seq", "base", "Vector"] 726.302 μs (5%)
["foreach_seq", "tx", "Matrix"] 730.702 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.067 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 726.702 μs (5%)
["foreach_seq_double", "cartesian", "man"] 22.500 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 22.500 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 24.600 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 22.900 μs (5%)
["foreach_seq_double", "linear", "man"] 49.642 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 51.434 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 51.742 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 49.337 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 950.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 1.064 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 2.678 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 2.678 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.688 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 3.054 ms (5%) 1.19 MiB (1%) 534
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 626.502 μs (5%) 965.13 KiB (1%) 1227
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 659.102 μs (5%) 1.02 MiB (1%) 1246
["sort", "F64 (wide)", "Base"] 6.283 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.723 ms (5%) 1.19 MiB (1%) 564
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 3.791 ms (5%) 1.01 MiB (1%) 2148
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 4.588 ms (5%) 1.39 MiB (1%) 2196
["sort", "I64 (narrow)", "Base"] 134.201 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 124.800 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 120.300 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 117.800 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 6.836 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.844 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 3.525 ms (5%) 1.01 MiB (1%) 2237
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 4.197 ms (5%) 1.40 MiB (1%) 2269
["sort", "reversed", "Base"] 791.203 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.205 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 938.004 μs (5%) 998.75 KiB (1%) 1871
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.396 ms (5%) 1.36 MiB (1%) 1904
["sort", "sorted", "Base"] 751.103 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 889.903 μs (5%) 1.18 MiB (1%) 432
["sort", "sorted", "ThreadsX.QuickSort"] 948.903 μs (5%) 998.73 KiB (1%) 1870
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.086 ms (5%) 1.36 MiB (1%) 1903
["unique", "rand(1:10, 1000000)", "base"] 10.234 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.227 ms (5%) 50.97 KiB (1%) 881
["unique", "rand(1:1000, 1000000)", "base"] 8.587 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.050 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      64405 s          0 s       2815 s      17716 s          0 s
       #2  2095 MHz      43783 s          0 s       2711 s      39216 s          0 s
       
  Memory: 6.764884948730469 GB (2131.84765625 MB free)
  Uptime: 874.0 sec
  Load Avg:  1.2177734375  1.28662109375  0.9033203125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 10:13
  • Package commit: ab83e0
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
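
To illustrate how an ID of the form [parent_group, child_group, ..., key] addresses one benchmark, here is a small Python sketch that walks a nested dictionary one key at a time. The real suite is a Julia object; the `suite` dictionary and the `lookup` helper below are hypothetical stand-ins that hold two times copied from the table below.

```python
from functools import reduce

def lookup(suite, benchmark_id):
    """Follow the ID from left to right through nested groups."""
    return reduce(lambda group, key: group[key], benchmark_id, suite)

# Hypothetical miniature suite with two entries from this baseline run.
suite = {
    "findfirst": {"0%": {"base": "3.000 ns", "tx": "23.800 μs"}},
}
print(lookup(suite, ["findfirst", "0%", "base"]))
```

Each element of the ID narrows the search by one level, so longer IDs such as the four-element "foreach_seq_double" entries simply descend one group further.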

ID time GC time memory allocations
["findfirst", "0%", "base"] 3.000 ns (5%)
["findfirst", "0%", "tx"] 23.800 μs (5%) 11.95 KiB (1%) 218
["findfirst", "0%", "tx-noterm"] 23.701 μs (5%) 11.97 KiB (1%) 218
["findfirst", "0%", "tx-seq"] 243.765 ns (5%) 544 bytes (1%) 14
["findfirst", "10%", "base"] 68.100 μs (5%)
["findfirst", "10%", "tx"] 99.701 μs (5%) 14.36 KiB (1%) 266
["findfirst", "10%", "tx-noterm"] 261.302 μs (5%) 42.08 KiB (1%) 769
["findfirst", "10%", "tx-seq"] 78.300 μs (5%) 560 bytes (1%) 15
["findfirst", "20%", "base"] 137.801 μs (5%)
["findfirst", "20%", "tx"] 177.401 μs (5%) 21.34 KiB (1%) 394
["findfirst", "20%", "tx-noterm"] 268.202 μs (5%) 28.31 KiB (1%) 522
["findfirst", "20%", "tx-seq"] 136.801 μs (5%) 560 bytes (1%) 15
["findfirst", "30%", "base"] 205.901 μs (5%)
["findfirst", "30%", "tx"] 251.002 μs (5%) 28.27 KiB (1%) 520
["findfirst", "30%", "tx-noterm"] 303.402 μs (5%) 28.31 KiB (1%) 522
["findfirst", "30%", "tx-seq"] 205.002 μs (5%) 560 bytes (1%) 15
["findfirst", "40%", "base"] 274.103 μs (5%)
["findfirst", "40%", "tx"] 347.103 μs (5%) 35.33 KiB (1%) 652
["findfirst", "40%", "tx-noterm"] 347.502 μs (5%) 35.31 KiB (1%) 650
["findfirst", "40%", "tx-seq"] 272.802 μs (5%) 560 bytes (1%) 15
["findfirst", "50%", "base"] 343.602 μs (5%)
["findfirst", "50%", "tx"] 392.104 μs (5%) 37.70 KiB (1%) 698
["findfirst", "50%", "tx-noterm"] 453.904 μs (5%) 53.91 KiB (1%) 993
["findfirst", "50%", "tx-seq"] 340.403 μs (5%) 560 bytes (1%) 15
["foreach", "base", "A .= B .+ B'"] 295.067 ms (5%) 33.031 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 233.477 ms (5%) 33.039 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 8.396 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 6.425 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 4.311 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 3.245 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 625.103 μs (5%)
["foreach_seq", "base", "Transpose"] 1.749 ms (5%)
["foreach_seq", "base", "Vector"] 625.403 μs (5%)
["foreach_seq", "tx", "Matrix"] 623.404 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.002 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 625.403 μs (5%)
["foreach_seq_double", "cartesian", "man"] 17.500 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 18.400 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 21.700 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 17.700 μs (5%)
["foreach_seq_double", "linear", "man"] 43.820 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 0.001 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 900.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 900.000 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 3.000 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 2.600 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.475 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 3.004 ms (5%) 1.19 MiB (1%) 535
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 589.504 μs (5%) 965.11 KiB (1%) 1226
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 591.203 μs (5%) 1.02 MiB (1%) 1247
["sort", "F64 (wide)", "Base"] 6.820 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.732 ms (5%) 1.19 MiB (1%) 563
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 3.858 ms (5%) 1.01 MiB (1%) 2145
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 4.453 ms (5%) 1.39 MiB (1%) 2196
["sort", "I64 (narrow)", "Base"] 130.600 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 120.901 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 120.901 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 119.301 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 5.980 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.290 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 3.205 ms (5%) 1.01 MiB (1%) 2237
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 3.629 ms (5%) 1.40 MiB (1%) 2271
["sort", "reversed", "Base"] 603.503 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.122 ms (5%) 1.18 MiB (1%) 433
["sort", "reversed", "ThreadsX.QuickSort"] 905.406 μs (5%) 998.73 KiB (1%) 1870
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.293 ms (5%) 1.36 MiB (1%) 1903
["sort", "sorted", "Base"] 596.104 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 873.004 μs (5%) 1.18 MiB (1%) 432
["sort", "sorted", "ThreadsX.QuickSort"] 791.404 μs (5%) 998.77 KiB (1%) 1872
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.059 ms (5%) 1.36 MiB (1%) 1904
["unique", "rand(1:10, 1000000)", "base"] 9.169 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.207 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 8.536 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.638 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      91778 s          0 s       3767 s      25798 s          0 s
       #2  2095 MHz      64585 s          0 s       3267 s      54301 s          0 s
       
  Memory: 6.764884948730469 GB (2300.484375 MB free)
  Uptime: 1240.0 sec
  Load Avg:  1.46484375  1.3935546875  1.078125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Runtime information

Runtime Info
BLAS #threads    2
BLAS.vendor()    openblas64
Sys.CPU_THREADS  2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.076
BogoMIPS:            4190.15
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves
Cpu Property        Value
Brand               Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Vendor              :Intel
Architecture        :Skylake
Model               Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00
Cores               2 physical cores, 2 logical cores (on executing CPU)
                    No Hyperthreading detected
Clock Frequencies   Not supported by CPU
Data Cache          Level 1:3 : (32, 1024, 36608) kbytes
                    64 byte cache line size
Address Size        48 bits virtual, 44 bits physical
SIMD                512 bit = 64 byte max. SIMD vector size
Time Stamp Counter  TSC is accessible via rdtsc
                    TSC increased at every clock cycle (non-invariant TSC)
Perf. Monitoring    Performance Monitoring Counters (PMC) are not supported
Hypervisor          Yes, Microsoft

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmarks:
    • Target: 28 Jun 2020 - 10:08
    • Baseline: 28 Jun 2020 - 10:14
  • Package commits:
    • Target: f20254
    • Baseline: ab83e0
  • Julia commits:
    • Target: 44fa15
    • Baseline: 44fa15
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|:---|:---|:---|
| ["findfirst", "0%", "base"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "0%", "tx"] | 1.19 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "0%", "tx-noterm"] | 1.20 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "0%", "tx-seq"] | 1.08 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "10%", "tx-noterm"] | 1.05 (5%) | 0.67 (1%) ✅ |
| ["findfirst", "10%", "tx-seq"] | 1.01 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "20%", "base"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "20%", "tx-noterm"] | 1.04 (5%) | 0.67 (1%) ✅ |
| ["findfirst", "20%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "30%", "tx"] | 1.10 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx-noterm"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "40%", "base"] | 0.89 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "40%", "tx-noterm"] | 1.10 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx-seq"] | 1.00 (5%) | 1.03 (1%) ❌ |
| ["findfirst", "50%", "base"] | 0.89 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "50%", "tx"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["findfirst", "50%", "tx-noterm"] | 0.99 (5%) | 1.24 (1%) ❌ |
| ["findfirst", "50%", "tx-seq"] | 0.89 (5%) ✅ | 1.03 (1%) ❌ |
| ["foreach", "base", "A .= B .+ B'"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["foreach", "base", "A .= B .+ C"] | 1.13 (5%) ❌ | 1.00 (1%) |
| ["foreach", "broadcast", "A .= B .+ B'"] | 1.18 (5%) ❌ | 1.00 (1%) |
| ["foreach", "broadcast", "A .= B .+ C"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq", "base", "Vector"] | 1.10 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "man"] | 0.87 (5%) ✅ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "tx", ":simd => false"] | 0.88 (5%) ✅ | 1.00 (1%) |
| ["foreach_seq_double", "cartesian", "tx", ":simd => true"] | 0.90 (5%) ✅ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 1.22 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 1.16 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 1.19 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 1.20 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "ThreadsX.MergeSort"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "ThreadsX.QuickSort"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] | 1.08 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (wide)", "Base"] | 1.17 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (wide)", "ThreadsX.QuickSort"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] | 1.05 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (wide)", "ThreadsX.MergeSort"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["sort", "I64 (wide)", "ThreadsX.QuickSort"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["sort", "reversed", "Base"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["sort", "reversed", "ThreadsX.MergeSort"] | 1.11 (5%) ❌ | 1.00 (1%) |
| ["sort", "reversed", "ThreadsX.StableQuickSort"] | 1.10 (5%) ❌ | 1.00 (1%) |
| ["sort", "sorted", "ThreadsX.MergeSort"] | 1.12 (5%) ❌ | 1.00 (1%) |
| ["sort", "sorted", "ThreadsX.StableQuickSort"] | 1.14 (5%) ❌ | 1.00 (1%) |
| ["unique", "rand(1:10, 1000000)", "base"] | 0.93 (5%) ✅ | 1.00 (1%) |
| ["unique", "rand(1:10, 1000000)", "tx"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["unique", "rand(1:1000, 1000000)", "base"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["unique", "rand(1:1000, 1000000)", "tx"] | 0.94 (5%) ✅ | 1.00 (1%) |
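
The time ratios and marks above can be reproduced from the timings in the Target result and Baseline result tables of this report. The following is a Python sketch for illustration only, not the report generator's own code. The function name `judge` and the exact threshold comparison (a ratio is significant only when it differs from 1.0 by more than the tolerance) are assumptions chosen to agree with the rows shown here.

```python
def judge(target, baseline, tolerance):
    """Classify a target/baseline ratio against a noise tolerance.

    Assumed rule: the change is significant only if the ratio differs
    from 1.0 by more than `tolerance`.
    """
    ratio = target / baseline
    if ratio - tolerance > 1.0:
        return ratio, "regression"
    if ratio + tolerance < 1.0:
        return ratio, "improvement"
    return ratio, "invariant"


# (target, baseline) timings in microseconds, copied from the
# Target result and Baseline result tables of this report.
rows = {
    '["findfirst", "0%", "tx"]': (26.400, 22.200),
    '["findfirst", "40%", "base"]': (234.502, 264.003),
    '["findfirst", "10%", "tx-noterm"]': (186.601, 178.201),
}

for name, (target, baseline) in rows.items():
    ratio, verdict = judge(target, baseline, tolerance=0.05)
    print(f"{name}: {ratio:.2f} ({verdict})")
```

The third row comes out as invariant in time (ratio 1.05 at a 5% tolerance). It still appears in the table because its memory ratio of 0.67 is significant.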

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Target

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      55017 s          0 s       2779 s      27222 s          0 s
       #2  2294 MHz      53062 s          0 s       2752 s      29621 s          0 s
       
  Memory: 6.7648773193359375 GB (2159.6875 MB free)
  Uptime: 870.0 sec
  Load Avg:  1.3271484375  1.365234375  0.91357421875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, broadwell)

Baseline

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      80553 s          0 s       3419 s      36362 s          0 s
       #2  2294 MHz      74134 s          0 s       3585 s      42995 s          0 s
       
  Memory: 6.7648773193359375 GB (2369.9609375 MB free)
  Uptime: 1225.0 sec
  Load Avg:  1.36767578125  1.42919921875  1.08935546875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, broadwell)

Target result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 10:08
  • Package commit: f20254
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|:---|:---|:---|:---|:---|
| ["findfirst", "0%", "base"] | 3.100 ns (5%) |  |  |  |
| ["findfirst", "0%", "tx"] | 26.400 μs (5%) |  | 11.97 KiB (1%) | 219 |
| ["findfirst", "0%", "tx-noterm"] | 23.700 μs (5%) |  | 12.02 KiB (1%) | 221 |
| ["findfirst", "0%", "tx-seq"] | 214.437 ns (5%) |  | 560 bytes (1%) | 15 |
| ["findfirst", "10%", "base"] | 59.900 μs (5%) |  |  |  |
| ["findfirst", "10%", "tx"] | 71.600 μs (5%) |  | 14.44 KiB (1%) | 271 |
| ["findfirst", "10%", "tx-noterm"] | 186.601 μs (5%) |  | 28.36 KiB (1%) | 525 |
| ["findfirst", "10%", "tx-seq"] | 59.801 μs (5%) |  | 576 bytes (1%) | 16 |
| ["findfirst", "20%", "base"] | 124.800 μs (5%) |  |  |  |
| ["findfirst", "20%", "tx"] | 130.400 μs (5%) |  | 21.42 KiB (1%) | 399 |
| ["findfirst", "20%", "tx-noterm"] | 186.801 μs (5%) |  | 28.41 KiB (1%) | 528 |
| ["findfirst", "20%", "tx-seq"] | 118.100 μs (5%) |  | 576 bytes (1%) | 16 |
| ["findfirst", "30%", "base"] | 176.501 μs (5%) |  |  |  |
| ["findfirst", "30%", "tx"] | 184.601 μs (5%) |  | 28.34 KiB (1%) | 525 |
| ["findfirst", "30%", "tx-noterm"] | 206.901 μs (5%) |  | 28.41 KiB (1%) | 528 |
| ["findfirst", "30%", "tx-seq"] | 176.801 μs (5%) |  | 576 bytes (1%) | 16 |
| ["findfirst", "40%", "base"] | 234.502 μs (5%) |  |  |  |
| ["findfirst", "40%", "tx"] | 247.601 μs (5%) |  | 35.39 KiB (1%) | 656 |
| ["findfirst", "40%", "tx-noterm"] | 267.601 μs (5%) |  | 35.44 KiB (1%) | 658 |
| ["findfirst", "40%", "tx-seq"] | 235.101 μs (5%) |  | 576 bytes (1%) | 16 |
| ["findfirst", "50%", "base"] | 293.102 μs (5%) |  |  |  |
| ["findfirst", "50%", "tx"] | 275.902 μs (5%) |  | 37.84 KiB (1%) | 707 |
| ["findfirst", "50%", "tx-noterm"] | 323.101 μs (5%) |  | 49.47 KiB (1%) | 922 |
| ["findfirst", "50%", "tx-seq"] | 293.301 μs (5%) |  | 576 bytes (1%) | 16 |
| ["foreach", "base", "A .= B .+ B'"] | 450.178 ms (5%) | 39.736 ms | 305.18 MiB (1%) | 16000002 |
| ["foreach", "base", "A .= B .+ C"] | 235.401 ms (5%) | 26.476 ms | 305.18 MiB (1%) | 16000001 |
| ["foreach", "broadcast", "A .= B .+ B'"] | 15.711 ms (5%) |  |  |  |
| ["foreach", "broadcast", "A .= B .+ C"] | 9.340 ms (5%) |  |  |  |
| ["foreach", "tx", "A .= B .+ B'"] | 7.662 ms (5%) |  | 25.94 KiB (1%) | 360 |
| ["foreach", "tx", "A .= B .+ C"] | 4.988 ms (5%) |  | 12.77 KiB (1%) | 125 |
| ["foreach_seq", "base", "Matrix"] | 630.902 μs (5%) |  |  |  |
| ["foreach_seq", "base", "Transpose"] | 2.041 ms (5%) |  |  |  |
| ["foreach_seq", "base", "Vector"] | 672.702 μs (5%) |  |  |  |
| ["foreach_seq", "tx", "Matrix"] | 597.402 μs (5%) |  |  |  |
| ["foreach_seq", "tx", "Transpose"] | 993.703 μs (5%) |  | 16 bytes (1%) | 1 |
| ["foreach_seq", "tx", "Vector"] | 594.001 μs (5%) |  |  |  |
| ["foreach_seq_double", "cartesian", "man"] | 20.300 μs (5%) |  |  |  |
| ["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] | 23.100 μs (5%) |  |  |  |
| ["foreach_seq_double", "cartesian", "tx", ":simd => false"] | 20.100 μs (5%) |  |  |  |
| ["foreach_seq_double", "cartesian", "tx", ":simd => true"] | 20.500 μs (5%) |  |  |  |
| ["foreach_seq_double", "linear", "man"] | 118.661 ns (5%) |  |  |  |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 121.547 ns (5%) |  |  |  |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 116.356 ns (5%) |  |  |  |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 118.880 ns (5%) |  |  |  |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 2.289 μs (5%) |  |  |  |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 2.122 μs (5%) |  |  |  |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] | 2.837 μs (5%) |  |  |  |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] | 2.850 μs (5%) |  |  |  |
| ["sort", "F64 (narrow)", "Base"] | 2.406 ms (5%) |  |  |  |
| ["sort", "F64 (narrow)", "ThreadsX.MergeSort"] | 2.874 ms (5%) |  | 1.19 MiB (1%) | 535 |
| ["sort", "F64 (narrow)", "ThreadsX.QuickSort"] | 1.713 ms (5%) |  | 965.09 KiB (1%) | 1225 |
| ["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] | 1.738 ms (5%) |  | 1.02 MiB (1%) | 1245 |
| ["sort", "F64 (wide)", "Base"] | 6.081 ms (5%) |  |  |  |
| ["sort", "F64 (wide)", "ThreadsX.MergeSort"] | 4.926 ms (5%) |  | 1.19 MiB (1%) | 562 |
| ["sort", "F64 (wide)", "ThreadsX.QuickSort"] | 5.072 ms (5%) |  | 1.01 MiB (1%) | 2146 |
| ["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] | 6.030 ms (5%) |  | 1.39 MiB (1%) | 2194 |
| ["sort", "I64 (narrow)", "Base"] | 141.600 μs (5%) |  | 160 bytes (1%) | 1 |
| ["sort", "I64 (narrow)", "ThreadsX.MergeSort"] | 146.900 μs (5%) |  | 864 bytes (1%) | 17 |
| ["sort", "I64 (narrow)", "ThreadsX.QuickSort"] | 147.600 μs (5%) |  | 864 bytes (1%) | 17 |
| ["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] | 144.401 μs (5%) |  | 864 bytes (1%) | 17 |
| ["sort", "I64 (wide)", "Base"] | 5.746 ms (5%) |  |  |  |
| ["sort", "I64 (wide)", "ThreadsX.MergeSort"] | 4.552 ms (5%) |  | 1.19 MiB (1%) | 554 |
| ["sort", "I64 (wide)", "ThreadsX.QuickSort"] | 4.387 ms (5%) |  | 1.01 MiB (1%) | 2236 |
| ["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] | 4.707 ms (5%) |  | 1.40 MiB (1%) | 2270 |
| ["sort", "reversed", "Base"] | 763.102 μs (5%) |  |  |  |
| ["sort", "reversed", "ThreadsX.MergeSort"] | 1.351 ms (5%) |  | 1.18 MiB (1%) | 435 |
| ["sort", "reversed", "ThreadsX.QuickSort"] | 1.221 ms (5%) |  | 998.75 KiB (1%) | 1871 |
| ["sort", "reversed", "ThreadsX.StableQuickSort"] | 1.720 ms (5%) |  | 1.36 MiB (1%) | 1905 |
| ["sort", "sorted", "Base"] | 710.602 μs (5%) |  |  |  |
| ["sort", "sorted", "ThreadsX.MergeSort"] | 949.503 μs (5%) |  | 1.18 MiB (1%) | 432 |
| ["sort", "sorted", "ThreadsX.QuickSort"] | 1.227 ms (5%) |  | 998.77 KiB (1%) | 1872 |
| ["sort", "sorted", "ThreadsX.StableQuickSort"] | 1.373 ms (5%) |  | 1.36 MiB (1%) | 1904 |
| ["unique", "rand(1:10, 1000000)", "base"] | 8.473 ms (5%) |  | 832 bytes (1%) | 8 |
| ["unique", "rand(1:10, 1000000)", "tx"] | 4.554 ms (5%) |  | 50.98 KiB (1%) | 882 |
| ["unique", "rand(1:1000, 1000000)", "base"] | 7.903 ms (5%) |  | 65.95 KiB (1%) | 27 |
| ["unique", "rand(1:1000, 1000000)", "tx"] | 4.849 ms (5%) |  | 1.07 MiB (1%) | 1186 |
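
The noise tolerances described above imply an interval around each reported value. As a rough Python sketch (the helper names `tolerance_band` and `bands_overlap` are hypothetical, and reading the tolerance as a symmetric relative band is an assumption based on that description), two timings whose bands do not overlap differ by more than the stated noise. This is only a reading aid, not the criterion behind the ❌/✅ marks.

```python
def tolerance_band(value, tolerance):
    """Return the (low, high) interval implied by a relative noise tolerance."""
    return value * (1.0 - tolerance), value * (1.0 + tolerance)


def bands_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# ["findfirst", "0%", "tx"]: 26.400 μs in the target table above,
# 22.200 μs in the Baseline result table of the same report.
target = tolerance_band(26.400, 0.05)
baseline = tolerance_band(22.200, 0.05)
print(target, baseline, bands_overlap(target, baseline))
```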

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      55017 s          0 s       2779 s      27222 s          0 s
       #2  2294 MHz      53062 s          0 s       2752 s      29621 s          0 s
       
  Memory: 6.7648773193359375 GB (2159.6875 MB free)
  Uptime: 870.0 sec
  Load Avg:  1.3271484375  1.365234375  0.91357421875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, broadwell)

Baseline result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 10:14
  • Package commit: ab83e0
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

| ID | time | GC time | memory | allocations |
|:---|:---|:---|:---|:---|
| ["findfirst", "0%", "base"] | 2.800 ns (5%) |  |  |  |
| ["findfirst", "0%", "tx"] | 22.200 μs (5%) |  | 11.97 KiB (1%) | 219 |
| ["findfirst", "0%", "tx-noterm"] | 19.801 μs (5%) |  | 11.97 KiB (1%) | 218 |
| ["findfirst", "0%", "tx-seq"] | 198.328 ns (5%) |  | 544 bytes (1%) | 14 |
| ["findfirst", "10%", "base"] | 59.501 μs (5%) |  |  |  |
| ["findfirst", "10%", "tx"] | 68.801 μs (5%) |  | 14.36 KiB (1%) | 266 |
| ["findfirst", "10%", "tx-noterm"] | 178.201 μs (5%) |  | 42.16 KiB (1%) | 774 |
| ["findfirst", "10%", "tx-seq"] | 59.301 μs (5%) |  | 560 bytes (1%) | 15 |
| ["findfirst", "20%", "base"] | 118.101 μs (5%) |  |  |  |
| ["findfirst", "20%", "tx"] | 124.801 μs (5%) |  | 21.34 KiB (1%) | 394 |
| ["findfirst", "20%", "tx-noterm"] | 180.201 μs (5%) |  | 42.14 KiB (1%) | 772 |
| ["findfirst", "20%", "tx-seq"] | 118.101 μs (5%) |  | 560 bytes (1%) | 15 |
| ["findfirst", "30%", "base"] | 176.002 μs (5%) |  |  |  |
| ["findfirst", "30%", "tx"] | 168.001 μs (5%) |  | 28.27 KiB (1%) | 520 |
| ["findfirst", "30%", "tx-noterm"] | 195.502 μs (5%) |  | 28.30 KiB (1%) | 521 |
| ["findfirst", "30%", "tx-seq"] | 176.502 μs (5%) |  | 560 bytes (1%) | 15 |
| ["findfirst", "40%", "base"] | 264.003 μs (5%) |  |  |  |
| ["findfirst", "40%", "tx"] | 236.203 μs (5%) |  | 35.31 KiB (1%) | 651 |
| ["findfirst", "40%", "tx-noterm"] | 242.502 μs (5%) |  | 35.31 KiB (1%) | 650 |
| ["findfirst", "40%", "tx-seq"] | 235.102 μs (5%) |  | 560 bytes (1%) | 15 |
| ["findfirst", "50%", "base"] | 329.804 μs (5%) |  |  |  |
| ["findfirst", "50%", "tx"] | 293.803 μs (5%) |  | 37.72 KiB (1%) | 699 |
| ["findfirst", "50%", "tx-noterm"] | 326.903 μs (5%) |  | 40.00 KiB (1%) | 740 |
| ["findfirst", "50%", "tx-seq"] | 329.903 μs (5%) |  | 560 bytes (1%) | 15 |
| ["foreach", "base", "A .= B .+ B'"] | 402.293 ms (5%) | 32.999 ms | 305.18 MiB (1%) | 16000002 |
| ["foreach", "base", "A .= B .+ C"] | 207.896 ms (5%) | 31.466 ms | 305.18 MiB (1%) | 16000001 |
| ["foreach", "broadcast", "A .= B .+ B'"] | 13.285 ms (5%) |  |  |  |
| ["foreach", "broadcast", "A .= B .+ C"] | 8.679 ms (5%) |  |  |  |
| ["foreach", "tx", "A .= B .+ B'"] | 7.589 ms (5%) |  | 25.94 KiB (1%) | 360 |
| ["foreach", "tx", "A .= B .+ C"] | 4.864 ms (5%) |  | 12.75 KiB (1%) | 124 |
| ["foreach_seq", "base", "Matrix"] | 611.704 μs (5%) |  |  |  |
| ["foreach_seq", "base", "Transpose"] | 2.071 ms (5%) |  |  |  |
| ["foreach_seq", "base", "Vector"] | 611.904 μs (5%) |  |  |  |
| ["foreach_seq", "tx", "Matrix"] | 597.303 μs (5%) |  |  |  |
| ["foreach_seq", "tx", "Transpose"] | 990.206 μs (5%) |  | 16 bytes (1%) | 1 |
| ["foreach_seq", "tx", "Vector"] | 593.503 μs (5%) |  |  |  |
| ["foreach_seq_double", "cartesian", "man"] | 23.400 μs (5%) |  |  |  |
| ["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] | 23.400 μs (5%) |  |  |  |
| ["foreach_seq_double", "cartesian", "tx", ":simd => false"] | 22.900 μs (5%) |  |  |  |
| ["foreach_seq_double", "cartesian", "tx", ":simd => true"] | 22.900 μs (5%) |  |  |  |
| ["foreach_seq_double", "linear", "man"] | 116.356 ns (5%) |  |  |  |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 100.000 ns (5%) |  |  |  |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 100.000 ns (5%) |  |  |  |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 100.000 ns (5%) |  |  |  |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 1.900 μs (5%) |  |  |  |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 1.900 μs (5%) |  |  |  |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] | 2.900 μs (5%) |  |  |  |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] | 2.900 μs (5%) |  |  |  |
| ["sort", "F64 (narrow)", "Base"] | 2.340 ms (5%) |  |  |  |
| ["sort", "F64 (narrow)", "ThreadsX.MergeSort"] | 2.671 ms (5%) |  | 1.19 MiB (1%) | 535 |
| ["sort", "F64 (narrow)", "ThreadsX.QuickSort"] | 1.581 ms (5%) |  | 965.11 KiB (1%) | 1226 |
| ["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] | 1.610 ms (5%) |  | 1.02 MiB (1%) | 1243 |
| ["sort", "F64 (wide)", "Base"] | 5.192 ms (5%) |  |  |  |
| ["sort", "F64 (wide)", "ThreadsX.MergeSort"] | 4.918 ms (5%) |  | 1.19 MiB (1%) | 563 |
| ["sort", "F64 (wide)", "ThreadsX.QuickSort"] | 4.801 ms (5%) |  | 1.01 MiB (1%) | 2146 |
| ["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] | 5.523 ms (5%) |  | 1.39 MiB (1%) | 2194 |
| ["sort", "I64 (narrow)", "Base"] | 142.101 μs (5%) |  | 160 bytes (1%) | 1 |
| ["sort", "I64 (narrow)", "ThreadsX.MergeSort"] | 141.201 μs (5%) |  | 864 bytes (1%) | 17 |
| ["sort", "I64 (narrow)", "ThreadsX.QuickSort"] | 144.301 μs (5%) |  | 864 bytes (1%) | 17 |
| ["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] | 137.401 μs (5%) |  | 864 bytes (1%) | 17 |
| ["sort", "I64 (wide)", "Base"] | 5.722 ms (5%) |  |  |  |
| ["sort", "I64 (wide)", "ThreadsX.MergeSort"] | 4.185 ms (5%) |  | 1.19 MiB (1%) | 554 |
| ["sort", "I64 (wide)", "ThreadsX.QuickSort"] | 4.086 ms (5%) |  | 1.01 MiB (1%) | 2238 |
| ["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] | 4.712 ms (5%) |  | 1.40 MiB (1%) | 2271 |
| ["sort", "reversed", "Base"] | 722.905 μs (5%) |  |  |  |
| ["sort", "reversed", "ThreadsX.MergeSort"] | 1.216 ms (5%) |  | 1.18 MiB (1%) | 435 |
| ["sort", "reversed", "ThreadsX.QuickSort"] | 1.190 ms (5%) |  | 998.75 KiB (1%) | 1871 |
| ["sort", "reversed", "ThreadsX.StableQuickSort"] | 1.569 ms (5%) |  | 1.36 MiB (1%) | 1905 |
| ["sort", "sorted", "Base"] | 684.804 μs (5%) |  |  |  |
| ["sort", "sorted", "ThreadsX.MergeSort"] | 848.105 μs (5%) |  | 1.18 MiB (1%) | 431 |
| ["sort", "sorted", "ThreadsX.QuickSort"] | 1.182 ms (5%) |  | 998.75 KiB (1%) | 1871 |
| ["sort", "sorted", "ThreadsX.StableQuickSort"] | 1.203 ms (5%) |  | 1.36 MiB (1%) | 1904 |
| ["unique", "rand(1:10, 1000000)", "base"] | 9.141 ms (5%) |  | 832 bytes (1%) | 8 |
| ["unique", "rand(1:10, 1000000)", "tx"] | 4.821 ms (5%) |  | 50.98 KiB (1%) | 882 |
| ["unique", "rand(1:1000, 1000000)", "base"] | 8.374 ms (5%) |  | 65.95 KiB (1%) | 27 |
| ["unique", "rand(1:1000, 1000000)", "tx"] | 5.144 ms (5%) |  | 1.07 MiB (1%) | 1186 |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz: 
              speed         user         nice          sys         idle          irq
       #1  2294 MHz      80553 s          0 s       3419 s      36362 s          0 s
       #2  2294 MHz      74134 s          0 s       3585 s      42995 s          0 s
       
  Memory: 6.7648773193359375 GB (2369.9609375 MB free)
  Uptime: 1225.0 sec
  Load Avg:  1.36767578125  1.42919921875  1.08935546875
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, broadwell)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Stepping:            1
CPU MHz:             2294.687
BogoMIPS:            4589.37
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            51200K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt md_clear

| Cpu Property | Value |
|:---|:---|
| Brand | Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz |
| Vendor | :Intel |
| Architecture | :Broadwell |
| Model | Family: 0x06, Model: 0x4f, Stepping: 0x01, Type: 0x00 |
| Cores | 2 physical cores, 2 logical cores (on executing CPU) |
|  | No Hyperthreading detected |
| Clock Frequencies | Not supported by CPU |
| Data Cache | Level 1:3 : (32, 256, 51200) kbytes |
|  | 64 byte cache line size |
| Address Size | 48 bits virtual, 44 bits physical |
| SIMD | 256 bit = 32 byte max. SIMD vector size |
| Time Stamp Counter | TSC is accessible via rdtsc |
|  | TSC increased at every clock cycle (non-invariant TSC) |
| Perf. Monitoring | Performance Monitoring Counters (PMC) are not supported |
| Hypervisor | Yes, Microsoft |

@github-actions
Contributor

Benchmark result

Judge result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmarks:
    • Target: 28 Jun 2020 - 10:42
    • Baseline: 28 Jun 2020 - 10:48
  • Package commits:
    • Target: ba9d14
    • Baseline: 16d653
  • Julia commits:
    • Target: 44fa15
    • Baseline: 44fa15
  • Julia command flags:
    • Target: None
    • Baseline: None
  • Environment variables:
    • Target: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2
    • Baseline: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

A ratio greater than 1.0 denotes a possible regression (marked with ❌), while a ratio less
than 1.0 denotes a possible improvement (marked with ✅). Only significant results - results
that indicate possible regressions or improvements - are shown below (thus, an empty table means that all
benchmark results remained invariant between builds).

| ID | time ratio | memory ratio |
|:---|:---|:---|
| ["findfirst", "0%", "tx"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "0%", "tx-seq"] | 1.06 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "10%", "base"] | 1.47 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "10%", "tx-noterm"] | 0.98 (5%) | 0.89 (1%) ✅ |
| ["findfirst", "10%", "tx-seq"] | 1.49 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "20%", "base"] | 1.49 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "20%", "tx"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "20%", "tx-noterm"] | 1.07 (5%) ❌ | 0.82 (1%) ✅ |
| ["findfirst", "20%", "tx-seq"] | 1.49 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "30%", "base"] | 1.49 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "30%", "tx-seq"] | 1.50 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "40%", "base"] | 1.49 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx-noterm"] | 1.18 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "40%", "tx-seq"] | 1.50 (5%) ❌ | 1.03 (1%) ❌ |
| ["findfirst", "50%", "base"] | 1.49 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "50%", "tx"] | 1.06 (5%) ❌ | 1.00 (1%) |
| ["findfirst", "50%", "tx-noterm"] | 1.12 (5%) ❌ | 1.23 (1%) ❌ |
| ["findfirst", "50%", "tx-seq"] | 1.50 (5%) ❌ | 1.03 (1%) ❌ |
| ["foreach", "base", "A .= B .+ C"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["foreach", "broadcast", "A .= B .+ B'"] | 1.09 (5%) ❌ | 1.00 (1%) |
| ["foreach", "tx", "A .= B .+ B'"] | 1.07 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] | 59003.05 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => false"] | 59409.97 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_double", "linear", "tx", ":simd => true"] | 56619.14 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "man"] | 1.19 (5%) ❌ | 1.00 (1%) |
| ["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] | 1.18 (5%) ❌ | 1.00 (1%) |
| ["sort", "sorted", "ThreadsX.MergeSort"] | 0.91 (5%) ✅ | 1.00 (1%) |
| ["sort", "sorted", "ThreadsX.StableQuickSort"] | 0.92 (5%) ✅ | 1.00 (1%) |

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Target

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      55628 s          0 s       2695 s      28277 s          0 s
       #2  2095 MHz      53489 s          0 s       2870 s      30110 s          0 s
       
  Memory: 6.7648773193359375 GB (1956.9140625 MB free)
  Uptime: 887.0 sec
  Load Avg:  1.44970703125  1.3740234375  0.94189453125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      78821 s          0 s       3564 s      40951 s          0 s
       #2  2095 MHz      78706 s          0 s       3580 s      40955 s          0 s
       
  Memory: 6.7648773193359375 GB (2308.09375 MB free)
  Uptime: 1256.0 sec
  Load Avg:  1.296875  1.4072265625  1.1005859375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Target result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 10:42
  • Package commit: ba9d14
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

Below is a table of this job's results, obtained by running the benchmarks.
The values listed in the ID column have the structure [parent_group, child_group, ..., key], and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmarks.
The percentages accompanying time and memory values in the below table are noise tolerances. The "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.

ID time GC time memory allocations
["findfirst", "0%", "base"] 3.000 ns (5%)
["findfirst", "0%", "tx"] 25.700 μs (5%) 11.97 KiB (1%) 219
["findfirst", "0%", "tx-noterm"] 21.100 μs (5%) 12.02 KiB (1%) 221
["findfirst", "0%", "tx-seq"] 255.829 ns (5%) 560 bytes (1%) 15
["findfirst", "10%", "base"] 116.800 μs (5%)
["findfirst", "10%", "tx"] 102.301 μs (5%) 14.44 KiB (1%) 271
["findfirst", "10%", "tx-noterm"] 260.202 μs (5%) 37.55 KiB (1%) 692
["findfirst", "10%", "tx-seq"] 117.200 μs (5%) 576 bytes (1%) 16
["findfirst", "20%", "base"] 233.701 μs (5%)
["findfirst", "20%", "tx"] 187.301 μs (5%) 21.41 KiB (1%) 398
["findfirst", "20%", "tx-noterm"] 286.902 μs (5%) 30.72 KiB (1%) 570
["findfirst", "20%", "tx-seq"] 234.402 μs (5%) 576 bytes (1%) 16
["findfirst", "30%", "base"] 350.403 μs (5%)
["findfirst", "30%", "tx"] 258.002 μs (5%) 28.34 KiB (1%) 525
["findfirst", "30%", "tx-noterm"] 306.802 μs (5%) 28.41 KiB (1%) 528
["findfirst", "30%", "tx-seq"] 351.203 μs (5%) 576 bytes (1%) 16
["findfirst", "40%", "base"] 467.203 μs (5%)
["findfirst", "40%", "tx"] 360.402 μs (5%) 35.39 KiB (1%) 656
["findfirst", "40%", "tx-noterm"] 414.103 μs (5%) 35.41 KiB (1%) 656
["findfirst", "40%", "tx-seq"] 468.003 μs (5%) 576 bytes (1%) 16
["findfirst", "50%", "base"] 584.004 μs (5%)
["findfirst", "50%", "tx"] 418.704 μs (5%) 37.81 KiB (1%) 705
["findfirst", "50%", "tx-noterm"] 524.104 μs (5%) 49.45 KiB (1%) 921
["findfirst", "50%", "tx-seq"] 584.804 μs (5%) 576 bytes (1%) 16
["foreach", "base", "A .= B .+ B'"] 306.849 ms (5%) 26.435 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 223.956 ms (5%) 25.539 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 8.729 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 6.636 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 4.601 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 3.367 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 827.504 μs (5%)
["foreach_seq", "base", "Transpose"] 2.312 ms (5%)
["foreach_seq", "base", "Vector"] 828.304 μs (5%)
["foreach_seq", "tx", "Matrix"] 838.004 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.277 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 828.403 μs (5%)
["foreach_seq_double", "cartesian", "man"] 22.600 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 22.600 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 25.100 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 22.600 μs (5%)
["foreach_seq_double", "linear", "man"] 56.707 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 59.003 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 59.410 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 56.619 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 1.190 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 1.180 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 3.075 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 3.075 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.835 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 3.091 ms (5%) 1.19 MiB (1%) 534
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 635.104 μs (5%) 965.14 KiB (1%) 1228
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 666.104 μs (5%) 1.02 MiB (1%) 1246
["sort", "F64 (wide)", "Base"] 7.195 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.779 ms (5%) 1.19 MiB (1%) 563
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 3.858 ms (5%) 1.01 MiB (1%) 2147
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 4.399 ms (5%) 1.39 MiB (1%) 2197
["sort", "I64 (narrow)", "Base"] 143.001 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 128.701 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 128.301 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 130.801 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 7.152 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.733 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 3.527 ms (5%) 1.01 MiB (1%) 2236
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 4.235 ms (5%) 1.40 MiB (1%) 2270
["sort", "reversed", "Base"] 783.704 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.214 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 935.505 μs (5%) 998.73 KiB (1%) 1870
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.359 ms (5%) 1.36 MiB (1%) 1904
["sort", "sorted", "Base"] 746.403 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 876.905 μs (5%) 1.18 MiB (1%) 432
["sort", "sorted", "ThreadsX.QuickSort"] 932.105 μs (5%) 998.75 KiB (1%) 1871
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.084 ms (5%) 1.36 MiB (1%) 1904
["unique", "rand(1:10, 1000000)", "base"] 10.473 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.449 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 9.543 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.748 ms (5%) 1.07 MiB (1%) 1186

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      55628 s          0 s       2695 s      28277 s          0 s
       #2  2095 MHz      53489 s          0 s       2870 s      30110 s          0 s
       
  Memory: 6.7648773193359375 GB (1956.9140625 MB free)
  Uptime: 887.0 sec
  Load Avg:  1.44970703125  1.3740234375  0.94189453125
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Baseline result

Benchmark Report for /home/runner/work/ThreadsX.jl/ThreadsX.jl

Job Properties

  • Time of benchmark: 28 Jun 2020 - 10:48
  • Package commit: 16d653
  • Julia commit: 44fa15
  • Julia command flags: None
  • Environment variables: OMP_NUM_THREADS => 1 JULIA_NUM_THREADS => 2

Results

The table below lists this job's benchmark results.
Each value in the ID column has the structure [parent_group, child_group, ..., key] and can be used to
index into the BaseBenchmarks suite to retrieve the corresponding benchmark.
The percentages next to the time and memory values are noise tolerances: the "true"
time/memory value for a given benchmark is expected to fall within this percentage of the reported value.
An empty cell means that the value was zero.
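
As a rough illustration of the two conventions just described, the sketch below walks an ID from parent group to key and turns a noise tolerance into an interval. It uses plain nested `Dict`s as a stand-in for a real suite; the names `suite`, `lookup`, and `tolerance_band` are hypothetical and are not part of BenchmarkTools or PkgBenchmark. The two timings are copied from the table that follows.

```julia
# Hypothetical stand-in for a benchmark suite: nested Dicts keyed like the ID column.
# Each leaf holds a time in nanoseconds and a relative noise tolerance (5% => 0.05).
suite = Dict(
    "findfirst" => Dict(
        "0%" => Dict(
            "base" => (time_ns = 3.0, tol = 0.05),
            "tx-seq" => (time_ns = 240.921, tol = 0.05),
        ),
    ),
)

# Walk an ID such as ["findfirst", "0%", "base"] one level at a time,
# from the parent group down to the final key.
lookup(suite, id) = foldl((node, k) -> node[k], id; init = suite)

# Interval in which the "true" value is expected to fall for a reported value.
tolerance_band(r) = (r.time_ns * (1 - r.tol), r.time_ns * (1 + r.tol))

r = lookup(suite, ["findfirst", "0%", "tx-seq"])
lower, upper = tolerance_band(r)
println("reported ", r.time_ns, " ns, expected between ", lower, " ns and ", upper, " ns")
```

With a real suite the same ID would be applied level by level in the same way; only the container type differs.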

ID time GC time memory allocations
["findfirst", "0%", "base"] 3.000 ns (5%)
["findfirst", "0%", "tx"] 24.100 μs (5%) 11.95 KiB (1%) 218
["findfirst", "0%", "tx-noterm"] 21.300 μs (5%) 11.97 KiB (1%) 218
["findfirst", "0%", "tx-seq"] 240.921 ns (5%) 544 bytes (1%) 14
["findfirst", "10%", "base"] 79.201 μs (5%)
["findfirst", "10%", "tx"] 99.701 μs (5%) 14.36 KiB (1%) 266
["findfirst", "10%", "tx-noterm"] 265.403 μs (5%) 42.17 KiB (1%) 775
["findfirst", "10%", "tx-seq"] 78.701 μs (5%) 560 bytes (1%) 15
["findfirst", "20%", "base"] 156.901 μs (5%)
["findfirst", "20%", "tx"] 177.303 μs (5%) 21.34 KiB (1%) 394
["findfirst", "20%", "tx-noterm"] 267.703 μs (5%) 37.63 KiB (1%) 692
["findfirst", "20%", "tx-seq"] 156.801 μs (5%) 560 bytes (1%) 15
["findfirst", "30%", "base"] 234.601 μs (5%)
["findfirst", "30%", "tx"] 252.701 μs (5%) 28.27 KiB (1%) 520
["findfirst", "30%", "tx-noterm"] 303.701 μs (5%) 28.30 KiB (1%) 521
["findfirst", "30%", "tx-seq"] 234.701 μs (5%) 560 bytes (1%) 15
["findfirst", "40%", "base"] 312.601 μs (5%)
["findfirst", "40%", "tx"] 341.605 μs (5%) 35.33 KiB (1%) 652
["findfirst", "40%", "tx-noterm"] 350.604 μs (5%) 35.30 KiB (1%) 649
["findfirst", "40%", "tx-seq"] 312.501 μs (5%) 560 bytes (1%) 15
["findfirst", "50%", "base"] 391.402 μs (5%)
["findfirst", "50%", "tx"] 394.802 μs (5%) 37.70 KiB (1%) 698
["findfirst", "50%", "tx-noterm"] 469.302 μs (5%) 40.05 KiB (1%) 743
["findfirst", "50%", "tx-seq"] 390.402 μs (5%) 560 bytes (1%) 15
["foreach", "base", "A .= B .+ B'"] 298.805 ms (5%) 32.717 ms 305.18 MiB (1%) 16000002
["foreach", "base", "A .= B .+ C"] 238.537 ms (5%) 33.139 ms 305.18 MiB (1%) 16000001
["foreach", "broadcast", "A .= B .+ B'"] 8.026 ms (5%)
["foreach", "broadcast", "A .= B .+ C"] 6.568 ms (5%)
["foreach", "tx", "A .= B .+ B'"] 4.290 ms (5%) 25.94 KiB (1%) 360
["foreach", "tx", "A .= B .+ C"] 3.364 ms (5%) 12.75 KiB (1%) 124
["foreach_seq", "base", "Matrix"] 826.907 μs (5%)
["foreach_seq", "base", "Transpose"] 2.315 ms (5%)
["foreach_seq", "base", "Vector"] 827.807 μs (5%)
["foreach_seq", "tx", "Matrix"] 829.107 μs (5%)
["foreach_seq", "tx", "Transpose"] 1.262 ms (5%) 16 bytes (1%) 1
["foreach_seq", "tx", "Vector"] 828.306 μs (5%)
["foreach_seq_double", "cartesian", "man"] 22.700 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => :ivdep"] 22.700 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => false"] 26.400 μs (5%)
["foreach_seq_double", "cartesian", "tx", ":simd => true"] 22.700 μs (5%)
["foreach_seq_double", "linear", "man"] 56.707 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => :ivdep"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => false"] 0.001 ns (5%)
["foreach_seq_double", "linear", "tx", ":simd => true"] 0.001 ns (5%)
["foreach_seq_sum_many", ":nvecs => 8", "man"] 1.000 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => :ivdep"] 1.000 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => false"] 3.000 μs (5%)
["foreach_seq_sum_many", ":nvecs => 8", "tx", ":simd => true"] 3.000 μs (5%)
["sort", "F64 (narrow)", "Base"] 2.821 ms (5%)
["sort", "F64 (narrow)", "ThreadsX.MergeSort"] 3.026 ms (5%) 1.19 MiB (1%) 535
["sort", "F64 (narrow)", "ThreadsX.QuickSort"] 634.705 μs (5%) 965.13 KiB (1%) 1227
["sort", "F64 (narrow)", "ThreadsX.StableQuickSort"] 657.206 μs (5%) 1.02 MiB (1%) 1247
["sort", "F64 (wide)", "Base"] 7.076 ms (5%)
["sort", "F64 (wide)", "ThreadsX.MergeSort"] 5.836 ms (5%) 1.19 MiB (1%) 564
["sort", "F64 (wide)", "ThreadsX.QuickSort"] 3.887 ms (5%) 1.01 MiB (1%) 2148
["sort", "F64 (wide)", "ThreadsX.StableQuickSort"] 4.552 ms (5%) 1.39 MiB (1%) 2197
["sort", "I64 (narrow)", "Base"] 146.402 μs (5%) 160 bytes (1%) 1
["sort", "I64 (narrow)", "ThreadsX.MergeSort"] 133.701 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.QuickSort"] 132.702 μs (5%) 864 bytes (1%) 17
["sort", "I64 (narrow)", "ThreadsX.StableQuickSort"] 132.901 μs (5%) 864 bytes (1%) 17
["sort", "I64 (wide)", "Base"] 7.153 ms (5%)
["sort", "I64 (wide)", "ThreadsX.MergeSort"] 4.814 ms (5%) 1.19 MiB (1%) 554
["sort", "I64 (wide)", "ThreadsX.QuickSort"] 3.637 ms (5%) 1.01 MiB (1%) 2238
["sort", "I64 (wide)", "ThreadsX.StableQuickSort"] 4.143 ms (5%) 1.40 MiB (1%) 2271
["sort", "reversed", "Base"] 782.907 μs (5%)
["sort", "reversed", "ThreadsX.MergeSort"] 1.257 ms (5%) 1.18 MiB (1%) 435
["sort", "reversed", "ThreadsX.QuickSort"] 945.408 μs (5%) 998.73 KiB (1%) 1870
["sort", "reversed", "ThreadsX.StableQuickSort"] 1.418 ms (5%) 1.36 MiB (1%) 1903
["sort", "sorted", "Base"] 743.306 μs (5%)
["sort", "sorted", "ThreadsX.MergeSort"] 963.408 μs (5%) 1.18 MiB (1%) 432
["sort", "sorted", "ThreadsX.QuickSort"] 914.807 μs (5%) 998.77 KiB (1%) 1872
["sort", "sorted", "ThreadsX.StableQuickSort"] 1.183 ms (5%) 1.36 MiB (1%) 1902
["unique", "rand(1:10, 1000000)", "base"] 10.285 ms (5%) 832 bytes (1%) 8
["unique", "rand(1:10, 1000000)", "tx"] 5.439 ms (5%) 50.98 KiB (1%) 882
["unique", "rand(1:1000, 1000000)", "base"] 9.478 ms (5%) 65.95 KiB (1%) 27
["unique", "rand(1:1000, 1000000)", "tx"] 5.658 ms (5%) 1.07 MiB (1%) 1186
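
The judge table at the top of this comment reports time ratios, where a ratio above 1.0 marks a possible regression. The sketch below recomputes a few of them by hand, assuming the table preceding "Baseline result" is the target run and that the ratio is target divided by baseline. The `classify` helper and its 1 ± tolerance thresholds are an assumption made for illustration, not the judging code itself.

```julia
# Times in microseconds copied from the two tables, as (target, baseline) pairs.
times = Dict(
    ["findfirst", "40%", "base"] => (467.203, 312.601),
    ["foreach_seq", "base", "Matrix"] => (827.504, 826.907),
    ["sort", "sorted", "ThreadsX.MergeSort"] => (876.905, 963.408),
)

# Classify a target/baseline ratio against a symmetric tolerance
# (5% is the time tolerance shown in the tables).
function classify(ratio; tol = 0.05)
    ratio > 1 + tol && return :regression
    ratio < 1 - tol && return :improvement
    return :invariant
end

for (id, (target, baseline)) in times
    ratio = target / baseline
    println(id, " => ", round(ratio; digits = 2), " ", classify(ratio))
end
```

Ratios that stay inside the tolerance band count as invariant, which is why most rows of the full tables do not appear in the judge table.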

Benchmark Group List

Here's a list of all the benchmark groups executed by this job:

  • ["findfirst", "0%"]
  • ["findfirst", "10%"]
  • ["findfirst", "20%"]
  • ["findfirst", "30%"]
  • ["findfirst", "40%"]
  • ["findfirst", "50%"]
  • ["foreach", "base"]
  • ["foreach", "broadcast"]
  • ["foreach", "tx"]
  • ["foreach_seq", "base"]
  • ["foreach_seq", "tx"]
  • ["foreach_seq_double", "cartesian"]
  • ["foreach_seq_double", "cartesian", "tx"]
  • ["foreach_seq_double", "linear"]
  • ["foreach_seq_double", "linear", "tx"]
  • ["foreach_seq_sum_many", ":nvecs => 8"]
  • ["foreach_seq_sum_many", ":nvecs => 8", "tx"]
  • ["sort", "F64 (narrow)"]
  • ["sort", "F64 (wide)"]
  • ["sort", "I64 (narrow)"]
  • ["sort", "I64 (wide)"]
  • ["sort", "reversed"]
  • ["sort", "sorted"]
  • ["unique", "rand(1:10, 1000000)"]
  • ["unique", "rand(1:1000, 1000000)"]

Julia versioninfo

Julia Version 1.4.2
Commit 44fa15b150* (2020-05-23 18:35 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
      Ubuntu 18.04.4 LTS
  uname: Linux 5.3.0-1028-azure #29~18.04.1-Ubuntu SMP Fri Jun 5 14:32:34 UTC 2020 x86_64 x86_64
  CPU: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz: 
              speed         user         nice          sys         idle          irq
       #1  2095 MHz      78821 s          0 s       3564 s      40951 s          0 s
       #2  2095 MHz      78706 s          0 s       3580 s      40955 s          0 s
       
  Memory: 6.7648773193359375 GB (2308.09375 MB free)
  Uptime: 1256.0 sec
  Load Avg:  1.296875  1.4072265625  1.1005859375
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-8.0.1 (ORCJIT, skylake)

Runtime information

Runtime Info
BLAS #threads 2
BLAS.vendor() openblas64
Sys.CPU_THREADS 2

lscpu output:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               85
Model name:          Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Stepping:            4
CPU MHz:             2095.079
BogoMIPS:            4190.15
Hypervisor vendor:   Microsoft
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            1024K
L3 cache:            36608K
NUMA node0 CPU(s):   0,1
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology cpuid pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt avx512cd avx512bw avx512vl xsaveopt xsavec xsaves

Cpu Property        Value
Brand               Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz
Vendor              :Intel
Architecture        :Skylake
Model               Family: 0x06, Model: 0x55, Stepping: 0x04, Type: 0x00
Cores               2 physical cores, 2 logical cores (on executing CPU)
                    No Hyperthreading detected
Clock Frequencies   Not supported by CPU
Data Cache          Level 1:3 : (32, 1024, 36608) kbytes
                    64 byte cache line size
Address Size        48 bits virtual, 44 bits physical
SIMD                512 bit = 64 byte max. SIMD vector size
Time Stamp Counter  TSC is accessible via rdtsc
                    TSC increased at every clock cycle (non-invariant TSC)
Perf. Monitoring    Performance Monitoring Counters (PMC) are not supported
Hypervisor          Yes, Microsoft

@mergify mergify Bot merged commit 0dcc034 into master Jun 28, 2020
@mergify mergify Bot deleted the create-pull-request/pkg-update branch June 28, 2020 10:52