Slow disk IO on MacOS (potentially Windows too) #35907

@IanButterworth

Community-submitted benchmark reports from SystemBenchmark.jl show that disk IO performance is considerably worse on macOS, and potentially on Windows too.

Example 1: A same-hardware test on a 2015 MacBook Pro, booted into Arch Linux and into macOS, shows mostly much worse disk IO on macOS.
The results are in milliseconds here (except peakflops, where higher is better), and the factor is test_res / ref_res, so a factor above 1 means that the test (macOS) is running slower than the ref (Arch Linux).

│ Row │ cat         │ testname          │ units   │ ref_res                                   │ test_res                                  │ factor    │
│     │ String      │ String            │ String? │ Any                                       │ Any                                       │ Any       │
├─────┼─────────────┼───────────────────┼─────────┼───────────────────────────────────────────┼───────────────────────────────────────────┼───────────┤
│ 1   │ info        │ SysBenchVer       │ missing │ 0.2.0                                     │ 0.2.0                                     │ Equal     │
│ 2   │ info        │ JuliaVer          │ missing │ 1.4.1                                     │ 1.4.1                                     │ Equal     │
│ 3   │ info        │ OS                │ missing │ Linux (x86_64-pc-linux-gnu)               │ macOS (x86_64-apple-darwin18.7.0)         │ Not Equal │
│ 4   │ info        │ CPU               │ missing │ Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz │ Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz │ Equal     │
│ 5   │ info        │ WORD_SIZE         │ missing │ 64                                        │ 64                                        │ Equal     │
│ 6   │ info        │ LIBM              │ missing │ libopenlibm                               │ libopenlibm                               │ Equal     │
│ 7   │ info        │ LLVM              │ missing │ libLLVM-8.0.1 (ORCJIT, haswell)           │ libLLVM-8.0.1 (ORCJIT, haswell)           │ Equal     │
│ 8   │ info        │ GPU               │ missing │ missing                                   │ missing                                   │ Not Equal │
│ 9   │ cpu         │ FloatMul          │ missing │ 1.3759999999999998e-6                     │ 1.3879999999999999e-6                     │ 1.00872   │
│ 10  │ cpu         │ FusedMulAdd       │ missing │ 1.4e-8                                    │ 3.1e-8                                    │ 2.21429   │
│ 11  │ cpu         │ FloatSin          │ missing │ 5.1799999999999995e-6                     │ 5.191e-6                                  │ 1.00212   │
│ 12  │ cpu         │ VecMulBroad       │ missing │ 3.6742943548387095e-5                     │ 5.503947368421053e-5                      │ 1.49796   │
│ 13  │ cpu         │ CPUMatMul         │ missing │ 0.022055                                  │ 0.0228935                                 │ 1.03802   │
│ 14  │ cpu         │ MatMulBroad       │ missing │ 0.003428                                  │ 0.005100375                               │ 1.48786   │
│ 15  │ cpu         │ 3DMulBroad        │ missing │ 0.0013702                                 │ 0.0015412                                 │ 1.1248    │
│ 16  │ cpu         │ peakflops         │ missing │ 1.530257840890542e11                      │ 1.4811308237589566e11                     │ 0.967896  │
│ 17  │ cpu         │ FFMPEGH264Write   │ missing │ 151.23031                                 │ 161.079387                                │ 1.06513   │
│ 18  │ mem         │ DeepCopy          │ missing │ 0.00016894915254237288                    │ 0.00017180699481865284                    │ 1.01692   │

│ 19  │ diskio      │ DiskWrite1KB      │ missing │ 0.0297705                                 │ 0.11039                                   │ 3.70803   │ <<
│ 20  │ diskio      │ DiskWrite1MB      │ missing │ 0.871518                                  │ 0.3283265                                 │ 0.376729  │ <<
│ 21  │ diskio      │ DiskRead1KB       │ missing │ 0.00951925                                │ 0.0330435                                 │ 3.47123   │ <<
│ 22  │ diskio      │ DiskRead1MB       │ missing │ 0.1272875                                 │ 1.0485435                                 │ 8.2376    │ <<

│ 23  │ loading     │ JuliaLoad         │ missing │ 119.903458                                │ 180.141969                                │ 1.50239   │
│ 24  │ compilation │ compilecache      │ missing │ 320.3946125                               │ 344.544143                                │ 1.07537   │
│ 25  │ compilation │ create_expr_cache │ missing │ 1.053377                                  │ 7.869244                                  │ 7.47049   │
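
As a sanity check on how to read the factor column, here is a minimal sketch that recomputes it for the diskio rows above. The values are copied from the table; the `factor` helper and the `diskio` list are names made up for this illustration and are not part of SystemBenchmark.jl.

```julia
# Factor as defined above: test result divided by reference result.
# `factor` is a hypothetical helper for illustration only.
factor(test_res, ref_res) = test_res / ref_res

# (ref_res, test_res) pairs in ms, copied from the diskio rows of the Example 1 table.
diskio = [
    "DiskWrite1KB" => (0.0297705, 0.11039),
    "DiskWrite1MB" => (0.871518, 0.3283265),
    "DiskRead1KB"  => (0.00951925, 0.0330435),
    "DiskRead1MB"  => (0.1272875, 1.0485435),
]

for (name, (ref_res, test_res)) in diskio
    f = factor(test_res, ref_res)
    # For timings, a factor above 1 means the test system (macOS) took longer.
    verdict = f > 1 ? "macOS slower" : "macOS faster"
    println(rpad(name, 14), round(f, sigdigits = 6), "  (", verdict, ")")
end
```

Note that DiskWrite1MB is the one diskio row where the factor is below 1, i.e. macOS was faster than Linux in that test.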

Example 2: A same-hardware test on a 2018 MacBook Pro with a Parallels-hosted Ubuntu 18.04 VM (Ref) vs. the native macOS (Test)

On a general note, I was surprised how well the Ubuntu VM did here.

32×6 DataFrame
│ Row │ cat         │ testname                  │ units   │ ref_res                                  │ test_res                                 │ factor    │
│     │ String      │ String                    │ String? │ Any                                      │ Any                                      │ Any       │
├─────┼─────────────┼───────────────────────────┼─────────┼──────────────────────────────────────────┼──────────────────────────────────────────┼───────────┤
│ 1   │ info        │ SysBenchVer               │ missing │ 0.3.0                                    │ 0.3.0                                    │ Equal     │
│ 2   │ info        │ JuliaVer                  │ missing │ 1.4.1                                    │ 1.4.1                                    │ Equal     │
│ 3   │ info        │ OS                        │ missing │ Linux (x86_64-pc-linux-gnu)              │ macOS (x86_64-apple-darwin18.7.0)        │ Not Equal │
│ 4   │ info        │ CPU                       │ missing │ Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz │ Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz │ Equal     │
│ 5   │ info        │ WORD_SIZE                 │ missing │ 64                                       │ 64                                       │ Equal     │
│ 6   │ info        │ LIBM                      │ missing │ libopenlibm                              │ libopenlibm                              │ Equal     │
│ 7   │ info        │ LLVM                      │ missing │ libLLVM-8.0.1 (ORCJIT, skylake)          │ libLLVM-8.0.1 (ORCJIT, skylake)          │ Equal     │
│ 8   │ info        │ GPU                       │ missing │ missing                                  │ missing                                  │ Not Equal │
│ 9   │ cpu         │ FloatMul                  │ ms      │ 1.7070000000000001e-6                    │ 1.7190000000000002e-6                    │ 1.00703   │
│ 10  │ cpu         │ FusedMulAdd               │ ms      │ 1.709e-6                                 │ 1.716e-6                                 │ 1.0041    │
│ 11  │ cpu         │ FloatSin                  │ ms      │ 5.093e-6                                 │ 4.927e-6                                 │ 0.967406  │
│ 12  │ cpu         │ VecMulBroad               │ ms      │ 3.742237903225806e-5                     │ 4.559270516717325e-5                     │ 1.21833   │
│ 13  │ cpu         │ CPUMatMul                 │ ms      │ 0.021145                                 │ 0.0367385                                │ 1.73746   │
│ 14  │ cpu         │ MatMulBroad               │ ms      │ 0.003607                                 │ 0.0189734375                             │ 5.26017   │
│ 15  │ cpu         │ 3DMulBroad                │ ms      │ 0.0014235                                │ 0.00169535                               │ 1.19097   │
│ 16  │ cpu         │ peakflops                 │ flops   │ 1.4902694486333694e11                    │ 2.6443904974091913e11                    │ 1.77444   │
│ 17  │ cpu         │ FFMPEGH264Write           │ ms      │ 121.60521                                │ 143.683998                               │ 1.18156   │
│ 18  │ mem         │ DeepCopy                  │ ms      │ 0.00019974039938556066                   │ 0.00019046474820143885                   │ 0.953561  │
│ 19  │ mem         │ Bandwidth10kB             │ MiB/s   │ 96002.55773217192                        │ 99035.61046884404                        │ 1.03159   │
│ 20  │ mem         │ Bandwidth100kB            │ MiB/s   │ 41807.6505372956                         │ 43489.184021444205                       │ 1.04022   │
│ 21  │ mem         │ Bandwidth1MB              │ MiB/s   │ 21869.752938891692                       │ 26635.227381825163                       │ 1.2179    │
│ 22  │ mem         │ Bandwidth10MB             │ MiB/s   │ 10084.215207687876                       │ 11624.277395933892                       │ 1.15272   │
│ 23  │ mem         │ Bandwidth100MB            │ MiB/s   │ 10347.835483172948                       │ 8848.556421771302                        │ 0.855112  │

│ 24  │ diskio      │ DiskWrite1KB              │ ms      │ 0.051837                                 │ 0.131                                    │ 2.52715   │ <<
│ 25  │ diskio      │ DiskWrite1MB              │ ms      │ 1.1210205                                │ 0.518321                                 │ 0.462365  │ <<
│ 26  │ diskio      │ DiskRead1KB               │ ms      │ 0.0114043                                │ 0.0964005                                │ 8.453     │ <<
│ 27  │ diskio      │ DiskRead1MB               │ ms      │ 0.166096                                 │ 1.016004                                 │ 6.11697   │ <<

│ 28  │ loading     │ JuliaLoad                 │ ms      │ 126.233121                               │ 216.478817                               │ 1.71491   │
│ 29  │ compilation │ compilecache              │ ms      │ 277.6027165                              │ 357.7933795                              │ 1.28887   │
│ 30  │ compilation │ success_create_expr_cache │ ms      │ 269.5016405                              │ 370.772621                               │ 1.37577   │
│ 31  │ compilation │ create_expr_cache         │ ms      │ 3.351107                                 │ 12.492654                                │ 3.72792   │
│ 32  │ compilation │ output-ji-substart        │ ms      │ 40.68284                                 │ 27.275530500000002                       │ 0.670443  │

Further, plotting the full results so far shows the Linux disk IO results clustering at the fast end, while macOS and Windows struggle (bottom left here). There are some Linux outliers which, according to users, are systems using slower storage devices such as SD cards.

[image: plot of the full set of submitted disk IO results]

Windows needs more testing. If anyone with a Windows machine and a Linux VM could run this, it would be helpful:

using SystemBenchmark
res = runbenchmark()
savebenchmark("res_windows.txt", res)
# ... repeat the three lines above on the Linux VM, saving to "res_linux.txt"
comp = compare(readbenchmark("res_windows.txt"), readbenchmark("res_linux.txt"))
show(comp, allrows=true, allcols=true)
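
For a quick check without installing the package, something like the following rough sketch could be used to time small and large file writes and reads with Base Julia only. This is an assumption about what such a test might look like, not SystemBenchmark.jl's actual implementation; `time_disk_io` and its `samples` parameter are hypothetical names. Reads of a just-written file will mostly hit the OS page cache, so this measures call overhead more than raw device speed.

```julia
# Hypothetical helper, not part of SystemBenchmark.jl.
# Times writing and reading `nbytes` random bytes to a temporary file and
# returns the minimum time over `samples` runs, in milliseconds.
function time_disk_io(nbytes::Integer; samples::Integer = 100)
    data = rand(UInt8, nbytes)
    mktempdir() do dir
        path = joinpath(dir, "io_test.bin")
        # Warm-up so compilation time is not included in the measurements.
        write(path, data)
        read(path)
        wtimes = [@elapsed(write(path, data)) for _ in 1:samples]
        rtimes = [@elapsed(read(path)) for _ in 1:samples]
        # Make sure the file round-trips correctly.
        @assert read(path) == data
        return (write_ms = 1000 * minimum(wtimes), read_ms = 1000 * minimum(rtimes))
    end
end

# 1 KB and 1 MB, mirroring the sizes named in the diskio tests above.
for n in (1024, 1024^2)
    t = time_disk_io(n)
    println(n, " bytes: write ", t.write_ms, " ms, read ", t.read_ms, " ms")
end
```

Running the same script on both operating systems of one machine gives a crude comparison that does not depend on the benchmark package.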

Labels: io (Involving the I/O subsystem: libuv, read, write, etc.), performance (Must go faster), system:mac (Affects only macOS), system:windows (Affects only Windows)
