
Difference between asv dev and asv run (due to OpenMP) #671

Closed
prisae opened this issue Jul 12, 2018 · 22 comments
Labels
question Triaged as a question, not requiring action after answering

Comments

prisae (Contributor) commented Jul 12, 2018

I ran into a weird issue that I cannot track down (using asv v0.2.1), so I thought I'd ask here whether it is a known or common problem.

If I run a simple test using asv dev with a parameter that can take three different values, I get the following output:

asv/> asv dev -b dipole_time
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] ·· Building for existing-py_home_dtr_anaconda3_bin_python
[  0.00%] ·· Benchmarking existing-py_home_dtr_anaconda3_bin_python
[100.00%] ··· Running model.Dipole.time_dipole_time                                    ok
[100.00%] ···· 
               ====== ==========
                loop            
               ------ ----------
                None   139.11ms 
                freq   322.62ms 
                off    139.09ms 
               ====== ==========

These times agree with the times I get if I run the tests simply with %timeit in an IPython console.

Now if I run asv run instead of asv dev,

asv/> asv run -b time_dipole_time
· Fetching recent changes
· Creating environments
· Discovering benchmarks
·· Uninstalling from conda-py3.6
·· Building for conda-py3.6
·· Installing into conda-py3.6.
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] · For empymod commit hash b804477a:
[  0.00%] ·· Building for conda-py3.6.
[  0.00%] ·· Benchmarking conda-py3.6
[100.00%] ··· Running model.Dipole.time_dipole_time                          655.54ms;...

the result stored in b804477a-conda-py3.6.json is:

[snip]
    "results": {
        "model.Dipole.time_dipole_time": {
            "params": [
                [
                    "None",
                    "'freq'",
                    "'off'"
                ]
            ],
            "result": [
                0.6555442148,
                0.3023681289999999,
                0.6842122744999999
            ]
        }
[snip]

The times vastly differ between the asv dev and the asv run results. Any idea why that could be?


prisae commented Jul 12, 2018

I just saw that v0.2.2rc1 was released, and I tested the above issue with this version. However, the result is the same.

pv (Collaborator) commented Jul 12, 2018 via email


pv commented Jul 12, 2018 via email


pv commented Jul 12, 2018 via email


pv commented Jul 12, 2018 via email


prisae commented Jul 12, 2018

Thanks @pv. Running asv run --quick indeed gives the same result as asv dev. I learned something new here. Would you say, then, that the default %timeit is heavily flawed? I have always relied on the default %timeit to compare runtimes of different things. It seems that is not such a good idea.

Not running on a laptop, but on an Intel NUC, which might indeed behave similarly.

> In addition, maybe you get a different linear algebra library (e.g. MKL
> vs openblas) in the new environment created by asv run.

I hope not. I think in both cases it uses conda MKL.

prisae closed this as completed Jul 12, 2018

pv commented Jul 12, 2018 via email


prisae commented Jul 12, 2018

Nope. This is the default output:

In [22]: for loop in [None, 'freq', 'off']:
    ...:     %timeit dipole(loop=loop, **model)
    ...:     
141 ms ± 5.35 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
355 ms ± 28.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
137 ms ± 2.34 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

So the times agree with asv dev and asv run --quick. The %timeit numbers don't change even if I fix -r 1 -n 1, -r 7 -n 10, or -r 10 -n 10; they are all similar.
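For anyone without IPython, the %timeit runs above can be reproduced with the stdlib timeit module. A minimal sketch, with a stand-in workload because dipole and model come from the user's empymod setup:

```python
import timeit


def dipole_stub():
    # Stand-in workload for dipole(loop=loop, **model)
    return sum(i * i for i in range(10_000))


# repeat=7, number=10 mirrors %timeit's "7 runs, 10 loops each"
samples = timeit.repeat(dipole_stub, repeat=7, number=10)
per_loop_ms = min(samples) / 10 * 1e3
print(f"{per_loop_ms:.3f} ms per loop (best of 7 runs, 10 loops each)")
```

The -r and -n flags of %timeit map directly onto the repeat and number arguments of timeit.repeat.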

There seems to be another issue: If I run asv run, the CPU usage (CPU @ 1.60GHz × 4) goes to 100%. However, when I run asv dev, asv run --quick, or %timeit, it tops out at 50%. Can it be that asv run handles threads differently?


prisae commented Jul 12, 2018

Is it possible to limit the number of threads for asv?


pv commented Jul 12, 2018 via email


prisae commented Jul 12, 2018

I'll give it a shot with v0.3.dev and report back.


pv commented Jul 12, 2018 via email


prisae commented Jul 12, 2018

If I set number = 1 and repeat = 5, I get the same output with asv run as with asv run --quick, asv dev, and %timeit.
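For reference, number and repeat are per-benchmark attributes in asv's benchmark classes. A minimal sketch of what such a setting could look like; the class name and parameters follow the issue, but the benchmark body is a stand-in for the real call to empymod's dipole:

```python
class Dipole:
    # asv timing attributes: one call per sample, five samples
    number = 1
    repeat = 5

    # Parametrize over the `loop` argument, as in the issue
    params = [[None, 'freq', 'off']]
    param_names = ['loop']

    def setup(self, loop):
        # Stand-in for building the `model` dict used in the issue
        self.data = list(range(100_000))

    def time_dipole_time(self, loop):
        # Stand-in workload for dipole(loop=loop, **model)
        sum(self.data)
```

These attributes can also be set globally via asv.conf.json or per-class as shown here.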


prisae commented Jul 12, 2018

...and the CPU usage does not explode...


prisae commented Jul 12, 2018

With v0.3.dev (commit 747d928) it looks different.

  • asv dev yields slower times than in v0.2.2rc1.
  • asv run and asv run --quick are now the same, yielding similar results to asv run in v0.2.2rc1.
  • The trick with number = 1 and repeat = 5 does not seem to have an effect.

asv dev (v0.3.dev):

asv/> asv dev
· Discovering benchmarks
· Running 1 total benchmarks (1 commits * 1 environments * 1 benchmarks)
[  0.00%] ·· Building for existing-py_home_dtr_anaconda3_bin_python
[  0.00%] ·· Benchmarking existing-py_home_dtr_anaconda3_bin_python
[100.00%] ··· Running model.Dipole.time_dipole_time                           ok
[100.00%] ···· 
               ====== =======
                loop         
               ------ -------
                None   278ms 
                freq   321ms 
                off    341ms 
               ====== =======

asv run (v0.3.dev) :

        "model.Dipole.time_dipole_time": {
            "params": [
                [
                    "None",
                    "'freq'",
                    "'off'"
                ]
            ],
            "result": [
                0.6451392059999999,
                0.8246050810000001,
                0.6298039939999999
            ],
            "stats": [
                {
                    "ci_99": [
                        0.6274002239703472,
                        0.6647882010296525
                    ],
                    "max": 0.6549872219999999,
                    "mean": 0.6460942124999999,
                    "min": 0.6391112159999999,
                    "n": 4,
                    "q_25": 0.6402948367499998,
                    "q_75": 0.6509385817500001,
                    "std": 0.0065052649672145
                },
                {
                    "ci_99": [
                        0.5469139810000008,
                        1.1022961809999992
                    ],
                    "max": 0.8301589030000001,
                    "mean": 0.8246050810000001,
                    "min": 0.8190512590000001,
                    "n": 2,
                    "q_25": 0.8218281700000001,
                    "q_75": 0.8273819920000001,
                    "std": 0.005553821999999986
                },
                {
                    "ci_99": [
                        0.5361086224943821,
                        0.7356948780056181
                    ],
                    "max": 0.689744374,
                    "mean": 0.63590175025,
                    "min": 0.5942546389999999,
                    "n": 4,
                    "q_25": 0.6095456847499998,
                    "q_75": 0.6561600594999999,
                    "std": 0.03594302763343558
                }
            ]
        }
    },

Here again is %timeit for comparison:

In [22]: for loop in [None, 'freq', 'off']:
    ...:     %timeit dipole(loop=loop, **model)
    ...:     
141 ms ± 5.35 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
355 ms ± 28.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
137 ms ± 2.34 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


pv commented Jul 12, 2018

Finally, to check again for the environment effects, you can do asv run -e --python=same (without specifying commit) instead of asv run. Note that asv dev == asv run --quick -e --python=same. The -e makes it print the result tables (so you don't need to paste the json).

If you don't get the strange bigger timings with asv 0.2.x for any combination of number and repeat (e.g. larger number, smaller repeat), then I'm running out of guesses.


prisae commented Jul 12, 2018

If I set number and repeat higher (say number = 10, repeat = 5), then the times do go up to similar levels.

I ran the whole benchmark suite of my project (https://github.com/empymod/asv), paying close attention to the CPU. Almost the entire benchmark runs on one thread (25% CPU usage); as you say, asv does not know about threads. But there are 4 benchmarks where the CPU usage immediately shoots up to 100% (4 threads). So I wonder whether in some cases something goes wrong and it tries to distribute the tests, which then makes them slower?


pv commented Jul 12, 2018

Intel MKL is multithreaded, so that's one possibility, and perhaps it tries to enable them in a "smart" way that's problematic here, so you can try with MKL_NUM_THREADS=1 or setting some of the other MKL environment variables.


prisae commented Jul 12, 2018

If I run asv run -e --python=same, I get yet another result: slower than asv dev, but faster than asv run. It uses 50% of the CPU, versus 25% with asv dev and 100% with asv run.


prisae commented Jul 12, 2018

There we go. Setting export OMP_NUM_THREADS=1 in bash before running asv run did the trick. Now everything runs in one thread, and the times of asv run agree with %timeit and asv dev.
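A minimal sketch of the fix described above; the MKL variable follows pv's earlier suggestion, and the asv invocation itself (using the benchmark name from this issue) is left commented out:

```shell
# Pin OpenMP (and, per pv's suggestion, MKL) to a single thread so that
# `asv run` timings match `asv dev` and %timeit.
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS MKL_NUM_THREADS=$MKL_NUM_THREADS"
# asv run -b time_dipole_time
```

Note that these variables must be exported before the benchmarked process starts, as threading libraries typically read them at initialization.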

Not sure whether this is something that should be mentioned in the docs or set by asv itself to ensure consistency. Thanks @pv for resolving this!

pv changed the title from "Difference between asv dev and asv run" to "Difference between asv dev and asv run (due to OpenMP)" Jul 12, 2018
pv added the question label Jul 12, 2018

pv commented Jul 12, 2018

There could be a FAQ on setting up the benchmarking environment, as it does have caveats of this sort.


prisae commented Jul 12, 2018

That could help. I guess the question tag is also OK; I did check the issues before posting, so I would have found it that way. Thanks again for letting me pick your brain, and for maintaining asv.
