
Feature request: number=1 or max_rounds #9

Closed
gavingc opened this issue Jun 10, 2015 · 21 comments

gavingc commented Jun 10, 2015

I realise how much effort has gone into getting a reasonable average benchmark.

However, I have just run into a use case where the unit under test must run exactly once.
It's not so much a benchmark as an indicative measurement.
The unit inserts objects into a database (within a complex sequence), so runs after the first are not representative.

A bit of an edge case, I know.

For now I'm using:
import timeit

t = timeit.timeit(sync_objects, number=1)
assert t < 1

ionelmc commented Jun 10, 2015

Quick question: how long does this function usually take?

gavingc commented Jun 13, 2015

It purely depends on how many objects are set up for the test; > 1 second at, say, 1000 objects.

ionelmc commented Aug 9, 2015

TODO note for me: this requires fixing the result storage in the inner loop (the runner function). Currently the result isn't stored, and the function_to_benchmark is run one more time at the end (just for the result).
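For context, a minimal sketch of the pattern described above (illustrative names, not the actual pytest-benchmark internals):

import time

def run_benchmark(function_to_benchmark, rounds):
    timings = []
    for _ in range(rounds):
        start = time.perf_counter()
        function_to_benchmark()          # return value discarded inside the timed loop
        timings.append(time.perf_counter() - start)
    result = function_to_benchmark()     # extra call at the end, just to get a result
    return timings, result

The fix would be to keep the return value from one of the timed calls instead of calling the target again.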

ionelmc commented Aug 19, 2015

@gavingc Would the solution proposed in #14 fix this for you?

gavingc commented Aug 22, 2015

No, I don't think so; the case here does not have separable parts to measure.
The whole test needs to run together and is longer than the microsecond range.
The environment setup before the object sync starts can be expensive too.

The use case is to provide an indicative value and put an alert in place should the test suddenly exceed a nominal time.

Basically, I just need to be able to run the test once, without calibration and averaging, but still take advantage of the pytest report integration.

ionelmc commented Aug 23, 2015

@gavingc I'm thinking about something like this:

@pytest.mark.benchmark(max_rounds=1, calibration=False)
def test_a_thing_that_shouldnt_use_pytest_bechmark(benchmark):
    benchmark(...)

At least calibration would only be available as a marker option to discourage general use.

gavingc commented Aug 23, 2015

Looks the business 👍

ionelmc commented Sep 1, 2015

After some consideration I'm implementing an API that allows this:

def test_that_is_special(benchmark):
    benchmark.manual(func, args=(...), kwargs={...}, setup=some_setup_func, iterations=1, rounds=1)

This should cover what you asked for.
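For illustration, here's a hypothetical concrete call against that proposed signature; fill_database, db_connection and reset_database are made-up placeholders, not part of the plugin:

db_connection = object()          # placeholder for a real database connection

def reset_database():             # placeholder setup hook
    pass

def fill_database(connection, batch_size):
    pass                          # would insert batch_size objects via `connection`

def test_that_is_special(benchmark):
    benchmark.manual(
        fill_database,
        args=(db_connection,),        # positional args forwarded to fill_database
        kwargs={"batch_size": 1000},  # keyword args forwarded to fill_database
        setup=reset_database,         # setup hook from the proposed signature
        iterations=1,                 # one call per round
        rounds=1,                     # a single round, no averaging
    )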

gavingc commented Sep 1, 2015

Intense and looks wickedly flexible :-)

ionelmc commented Sep 1, 2015

Well, give it a try. Last chance to change this before I release 3.0.0.


gavingc commented Sep 1, 2015

OK, trying; my head is a million miles away in another project, so I might be missing something simple.

I tried to install from GitHub with pip into my project's virtualenv and got this:

$env/bin/pip install git+https://github.com/ionelmc/pytest-benchmark.git
Downloading/unpacking git+https://github.com/ionelmc/pytest-benchmark.git
Cloning https://github.com/ionelmc/pytest-benchmark.git to /tmp/pip-NVBPRR-build
Running setup.py (path:/tmp/pip-NVBPRR-build/setup.py) egg_info for package from git+https://github.com/ionelmc/pytest-benchmark.git
error in pytest-benchmark setup command: Invalid environment marker: python_version < "3.4"
Complete output from command python setup.py egg_info:
error in pytest-benchmark setup command: Invalid environment marker: python_version < "3.4"


Cleaning up...
Command python setup.py egg_info failed with error code 1 in /tmp/pip-NVBPRR-build
Storing debug log for failure in ~/.pip/pip.log

gavingc commented Sep 1, 2015

gavingc commented Sep 1, 2015

OK, so after the above change pytest-benchmark appears to install.

My test:

def sync_objects(*args, **kwargs):
    r = server.sync_objects(holder)

benchmark.manual(sync_objects, iterations=1, rounds=1)

It appears that the side effect of running the test twice still occurs, yes?
Also, I had to include *args, **kwargs, which are unused parameters, and IntelliJ IDEA rightfully complains.

Eyes falling out = bed time.

ionelmc commented Sep 1, 2015

@gavingc the install issue is caused by a too-old setuptools. Can you upgrade your setuptools?
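For reference, the error comes from an environment marker in the package metadata that older setuptools releases cannot parse. A minimal sketch of the kind of declaration that triggers it (the dependency name is illustrative, not necessarily what pytest-benchmark actually lists):

# setup.py fragment (illustrative): a conditional dependency keyed on an
# environment marker; old setuptools rejects the marker and egg_info fails.
from setuptools import setup

setup(
    name="example-package",
    extras_require={
        ':python_version < "3.4"': ["statistics"],  # backport only on older Pythons
    },
)

Upgrading setuptools inside the virtualenv (pip install --upgrade setuptools) gets past it.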

The example you pasted should only call sync_objects once. There is a test exactly for that: https://github.com/ionelmc/pytest-benchmark/blob/master/tests/test_manual.py#L4-L7

Can you show me how IntelliJ complains about the args (a screenshot or something would be nice)?

gavingc commented Sep 2, 2015

OK!
Setuptools upgraded and install now works, thanks.
In the above, sync_objects does indeed only get called once (it's a parameterised test on my side).
Also now I can remove *args, **kwargs from sync_objects without a Python error.

Essentially, IntelliJ IDEA will warn about any unused parameters/variables, since that is usually a mistake on the programmer's part.
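For completeness, a sketch of the simplified wrapper after dropping the unused parameters (same server and holder objects as in the earlier snippet):

def sync_objects():  # no unused *args/**kwargs for the IDE to warn about
    server.sync_objects(holder)

benchmark.manual(sync_objects, iterations=1, rounds=1)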

gavingc commented Sep 2, 2015

Given:

    def test_bench(self, benchmark):

        from time import sleep

        def slp():
            sleep(3)

        benchmark.manual(slp, iterations=1, rounds=1)

$ ./tests.py -k test_bench
collected 142 items
tests.py::TestServerUnit::test_bench PASSED
------- benchmark: 1 tests, min 5 rounds (of min 25.00us), 1.00s max time, timer: time.time --------
Name (time in s)        Min     Max    Mean  StdDev  Median     IQR  Outliers(*)  Rounds  Iterations
----------------------------------------------------------------------------------------------------
test_bench           3.0031  3.0031  3.0031  0.0000  3.0031  0.0000          0;0       1           1
----------------------------------------------------------------------------------------------------
(*) Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
=========================================================================== 141 tests deselected by '-k test_bench' ============================================================================
=========================================================================== 1 passed, 141 deselected in 3.08 seconds ===========================================================================

The only thing I can see that looks a little bit off is the "min 5 rounds".
Min/max in this case over several runs: 3.0001 - 3.0031.
Which is perfectly acceptable; no plans to optimise below that :-)

ionelmc commented Sep 2, 2015

Good catch. Maybe I should just remove the timing data from the header (it doesn't really relate to the tests run in that group - those are just the default settings).

ionelmc commented Sep 12, 2015

FYI: there's going to be a rename: benchmark.manual is going to be renamed to benchmark.pedantic to make it clearer what's going on at the call site.
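So the earlier example becomes (a sketch against the renamed method; the iterations/rounds arguments stay the same):

def test_that_is_special(benchmark):
    benchmark.pedantic(sync_objects, iterations=1, rounds=1)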

ionelmc added a commit that referenced this issue Sep 12, 2015
ionelmc commented Sep 12, 2015

gavingc commented Sep 13, 2015

I really like it, fantastic work on the docs.

For consideration: perhaps unsafe options shouldn't have defaults and shouldn't be optional.
Make me specify rounds=1, iterations=1 so that it's explicit.
(Provided it doesn't break things or make the implementation nasty.)

I think the implementation and options available will prompt me to improve the test too.

ionelmc commented Sep 13, 2015

The typical use case that I had in mind was benchmarking slow, side-effectful functions. For that situation rounds=1, iterations=1 are good defaults.

On the other hand, I don't want to require users to type so much and spell everything out :)

If you do use the defaults and target a fast function then your results are going to look obviously wrong.
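To make the trade-off concrete, a hedged sketch (slow_sync and fast_parse are made-up stand-ins): with the defaults a slow, side-effectful function needs no extra typing, while a fast function should be given explicit rounds/iterations or its single measurement will look obviously wrong.

import time

def slow_sync():
    time.sleep(3)        # stand-in for a slow, side-effectful operation

def fast_parse():
    return int("42")     # stand-in for a cheap, pure operation

def test_slow_side_effect(benchmark):
    benchmark.pedantic(slow_sync)                              # defaults: rounds=1, iterations=1

def test_fast_function(benchmark):
    benchmark.pedantic(fast_parse, rounds=100, iterations=10)  # average out timer noise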
