Use asv for scikit-bio #272

Closed
anderspitman opened this issue May 15, 2015 · 5 comments
Labels: question (Triaged as a question, not requiring action after answering)

@anderspitman

We are currently setting up asv to handle benchmark testing for scikit-bio. While exploring the features we would like to get working, we're thinking of using asv in the following ways:

  1. Automatic benchmarking for pull requests, a la Travis. The plan is to use GitHub Webhooks to trigger benchmarking on our dedicated machine (there would be a daemon with a job queue). The daemon would run the benchmarks and push to a new GitHub page for each PR; these would be cleaned up automatically after a time period or when the PR is closed. (A rough sketch of such a daemon follows this list.)
  2. Include benchmarking tests inline in our repo, and update them exactly as we do with unit tests. We would add an if __name__ == '__main__' block to each of these benchmarking files, which would import the timeit functionality from asv and run the benchmarks in that file. This would let developers run the benchmarks for the unit they are modifying before and after their change, as a smoke test before submitting a PR. These "unit" benchmarks would then be aliased and imported by the benchmarking file that asv actually runs. The goal is to avoid explicitly checking our repo's __version__ in the benchmarking file. (A second sketch of this pattern also follows the list.)
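
A rough, hypothetical sketch of what the daemon in (1) could look like, assuming Flask for the webhook endpoint and a plain in-process job queue; the endpoint, port, and asv invocation are illustrative, not a tested setup:

# pr_bench_daemon.py (hypothetical) -- receive GitHub pull_request webhooks,
# queue jobs, and run asv for each PR head commit.
import queue
import subprocess
import threading

from flask import Flask, request  # assumes Flask is available

app = Flask(__name__)
jobs = queue.Queue()

@app.route("/webhook", methods=["POST"])
def webhook():
    event = request.get_json(silent=True) or {}
    # Only react to pull_request events; everything else is ignored.
    if "pull_request" in event:
        jobs.put(event["pull_request"]["head"]["sha"])
    return "", 204

def worker():
    while True:
        sha = jobs.get()
        # "<sha>^!" is git rev-list syntax for benchmarking just that commit;
        # publishing the HTML report to a per-PR page would follow these steps.
        subprocess.run(["asv", "run", sha + "^!"], check=False)
        subprocess.run(["asv", "publish"], check=False)
        jobs.task_done()

if __name__ == "__main__":
    threading.Thread(target=worker, daemon=True).start()
    app.run(port=8000)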

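For (2), a minimal sketch of the inline idea, using the standard library's timeit as a stand-in for asv's timing machinery (asv does not document an importable timing API); the module and function names are made up for illustration:

# bench_example.py (hypothetical) -- benchmarks live next to the unit tests and
# can be smoke-tested directly before opening a PR; asv discovers and times the
# time_* functions when run normally.
import timeit

def time_some_operation():
    # Placeholder for the operation being benchmarked.
    sum(range(10000))

if __name__ == "__main__":
    for name, func in sorted(globals().items()):
        if name.startswith("time_") and callable(func):
            per_call = min(timeit.repeat(func, number=100, repeat=3)) / 100
            print("{}: {:.1f} us per call".format(name, per_call * 1e6))
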
Does anyone here have experience implementing any of this with asv? Any suggestions as we set out? Of course we're happy to push anything useful we get working upstream.

pv (Collaborator) commented May 15, 2015

On (2): You may be interested in looking at how we do it in Scipy: https://github.com/scipy/scipy/tree/master/benchmarks
Some examples of it in action: scipy/scipy#4642 scipy/scipy#4766 scipy/scipy#4712

In particular, the solution is to put the benchmarks into the same repository as the project itself, but effectively in a completely separate Python package. Running benchmarks against the current working-tree version of the package to be benchmarked can be done in a fairly straightforward way using asv's --python=same option.

Note that it is not very useful to put the benchmarks inside the package itself, because you want to keep the set of benchmarks fixed when running them against old versions of the package. It is, however, possible to keep both in the same repository.
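
(For reference, a minimal sketch of what such a separate benchmark module could look like, following asv's time_* and setup conventions; the path mirrors SciPy's benchmarks/benchmarks/ layout and the scikit-bio call is only illustrative.)

# benchmarks/benchmarks/bench_sequence.py (illustrative path and names)
import skbio

class SequenceConstruction:
    def setup(self):
        # Runs before each measurement, so input construction is not timed.
        self.data = "ACGT" * 10000

    def time_dna(self):
        skbio.DNA(self.data)

With that layout, asv run --python=same times the working tree directly, as mentioned above, while regular asv runs can install and benchmark older revisions.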

@anderspitman (Author)

That's very helpful, thanks! We'll definitely look at scipy's implementation.

For the last point, our thinking was that if you had something like a skbio.bench module defined for each git revision, then inside master's benchmarking.py you could simply do:

# benchmarking.py
import skbio.bench

def time_thing():
    skbio.bench.thing()

Then each revision would have its own implementation of thing, so revision X might be:

# bench.py
def thing():
    skbio.some_lib.thing()

while revision Y might be:

# bench.py
def thing():
    skbio.some_other_lib.thing()

The idea is that master does not need to know anything about compatibility with previous commits, given the constraint that the benchmark names remain the same.

But there might be some issues here we're not seeing.

pv (Collaborator) commented May 15, 2015

The technical constraint here is that the benchmark names and attributes (including parameters for parameterized benchmarks) must be fixed by your "benchmarking.py", and should not depend on the contents of "skbio.bench", which vary with the currently active version.
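
(Concretely, a sketch of that constraint under the proposed scheme; thing() and its parameters are hypothetical. The names and parameters asv sees are hard-coded in benchmarking.py, and only the implementation behind skbio.bench may change per revision.)

# benchmarking.py
import skbio.bench

def time_thing(n):
    # Only what skbio.bench.thing does may differ between revisions.
    skbio.bench.thing(n)

# Parameters are attached here, not read from skbio.bench, so the benchmark's
# identity stays the same for every revision asv runs against.
time_thing.params = [100, 10000]
time_thing.param_names = ["n"]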

Less technical thoughts on this: that approach adds more complexity, and there is the additional hassle of ensuring that the benchmark code in the different versions actually does the same thing, so that results remain comparable. This might be more transparent to do with explicit version checks. It also seems it would not be possible to add benchmarks a posteriori, to see how the performance of some feature has evolved in the past and whether past changes introduced performance regressions.

EDIT: another advantage of keeping benchmarks separate: if there's a mistake in a benchmark (e.g. not measuring the correct thing, or you want to slightly change the parameters it runs with to be more representative), you can easily fix it and rerun it for the whole history. However, if the benchmark is frozen inside the project's history, you cannot change it any more, and there's no way to see how things would work out with the fixed benchmark for any of the previous commits. So I think the way asv tries to force you to write your benchmarks separate from your code has merit.

mdboom (Collaborator) commented May 15, 2015

Just a bit of history to add to what @pv said: an earlier sketch of asv worked more like you described, but it ultimately turned out to be hard to manage, because you basically have to retroactively insert commits containing the benchmarks into old history (with all of the issues that rewriting git history entails). Perhaps with a new project this is tractable (even then, you'd probably want to add benchmarks over time and still run them on past project history).

@anderspitman (Author)

All valid points, glad we asked ahead of time. Thanks for your input!

pv added the question label (Triaged as a question, not requiring action after answering) on Jul 6, 2015
pv closed this as completed on Jun 5, 2016