
ENH: Add vbench #1070

Closed

wants to merge 1 commit

Conversation

TomAugspurger
Contributor

Closes #936

Still a WIP atm. This is just the skeleton and I've got a few details to iron out over the next couple of days.

Do we know what kind of machine this will be running on? I can test in a Windows VM if need be; I just need to set up a Python environment.

I'm keeping some docs too.

Here's an example of the output!

[screenshot: example output for the ols benchmark]

@TomAugspurger
Contributor Author

I should re-emphasize that this is still very much a WIP. It's probably not worth reviewing yet. I just wanted to get it onto GH.

@TomAugspurger
Contributor Author

Are there any questions you anticipate having that I should address while I'm writing up the developer docs for this? Right now I have headings for

Writing a good vbench
Pre-PR testing

which I think will suffice for most cases. I'll also write something up for

Running the full suite

which most people won't have to do. And then some notes on the implementation in case people need to change it down the line. Perhaps I can write up some vbench docs and push them to the vbench repo.

@vincentarelbundock
Contributor

So just to be clear, the process is: we need a computer that checks git periodically and runs the vbench suite every time there's a commit. Then we upload the results and graphs to a website somewhere, right?

I have a couple of always-on machines that could be used for that, long-term or just for testing. Both run relatively recent versions of Ubuntu.

@josef-pkt
Member

@TomAugspurger Thanks again for working on this.

I only gave it a quick browse.

If you can link to documentation in other packages, then our documentation could be pretty short.

Some questions, besides the general "How do you run it?", that I thought about:

  • How do we add benchmarks? Should they have a loop to run the same code several times (e.g. 10 times OLS(..).fit())?
  • Do we add conditional code directly in the benchmark if the API has changed?
  • Can we make selective runs, in addition to the scheduled runs? Selective either in terms of additional commit points, or in terms of which benchmarks are run.
  • Related: in the test suite we have some unit tests marked as slow. Is it possible to run basic benchmarks at a higher frequency than slow benchmarks? Or, put another way: can we define groups/sets of benchmarks that can be run on demand?

I'm trying to figure out how we can handle benchmarks for the different submodules.
We will want to run the core models (linear models, discrete models, RLM, GLM, and the tsa models) on a regular basis, and as soon as it is working (plus the formula and pandas versions).
nonparametric and emplike are largely isolated from the other parts but are the most time-consuming to run.
And then there are many smaller functions (stats, distributions) where it is currently less important to benchmark them.

To the question by @vincentarelbundock: should we really run on every commit, or periodically?
Every commit sounds like a lot of processing, when often there will only be changes that affect a small part.

@vincentarelbundock
Contributor

Oh yeah, well the cron job can check GitHub for new commits once a day or once a week. It doesn't have to be every second...

@TomAugspurger
Contributor Author

There are actually two main ways to use vbench. I'll talk to the pandas people to get a better idea of how they use it for the long-term testing. But the thing most contributors will use is the test_perf.py file (which I haven't added yet). This is a command-line tool that lets you compare your commit against a known baseline and gives you a list of the benchmarks that differ. You'll get a nice output like

frame_reindex_axis0                         0.6189     1.8109     0.3418
frame_reindex_axis0                         1.0079     1.2519     0.8051
frame_reindex_axis0                         0.5227     0.6118     0.8543
frame_reindex_axis0                         0.4115     0.4681     0.8792
frame_reindex_axis0                         0.4141     0.4454     0.9296
frame_reindex_axis0                         0.5567     0.5934     0.9382
frame_reindex_axis0                         0.4374     0.4661     0.9385
frame_reindex_axis0                         0.4723     0.5015     0.9417
frame_reindex_axis0                         0.4407     0.4583     0.9616

where the columns are head (your PR), base, and the ratio.
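
Not part of test_perf.py, just a hedged illustration of how to read those columns: the ratio is the head timing divided by the base timing, so values below 1 mean the PR is faster and values above 1 point at a slowdown. The threshold below is hypothetical.

    # illustration only: how the ratio column relates to the two timings
    def perf_line(name, head_ms, base_ms, threshold=1.25):
        """Format a line like the output above and flag possible regressions.

        `threshold` is a made-up cutoff; test_perf.py's actual reporting may differ.
        """
        ratio = head_ms / base_ms
        flag = "  <- possible regression" if ratio > threshold else ""
        return "%-40s %10.4f %10.4f %10.4f%s" % (name, head_ms, base_ms, ratio, flag)

    print(perf_line("frame_reindex_axis0", 0.6189, 1.8109))  # ratio 0.3418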

@vincentarelbundock
Contributor

It would be nice if the initial run could be taken at different points in the commit history, so we can "backfill" the benchmark history. :)

@TomAugspurger
Contributor Author

@vincentarelbundock If I'm understanding you correctly, then that should be possible.

@josef-pkt I'll get those answered more formally later, but for now:

How do we add benchmarks?

Basically you write a module (a single .py file) for a related suite of tests. Each of those tests is run once, either for each commit in a specified date range, or just as a diff between your current commit and some known-good baseline.
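
For a concrete picture of what such a module might contain, here is a rough sketch in the style of pandas' vb_suite modules; the import path, the Benchmark arguments, and the setup snippet are from memory of vbench's API and are not taken from this PR, so treat them as assumptions:

    # hypothetical vb_suite module, e.g. vb_suite/linear_model.py
    from datetime import datetime
    from vbench.benchmark import Benchmark

    # setup is passed as a string and executed once before the timed statement
    common_setup = """
    import numpy as np
    import statsmodels.api as sm

    np.random.seed(12345)
    exog = sm.add_constant(np.random.randn(1000, 5))
    endog = exog.sum(1) + np.random.randn(1000)
    """

    # the first argument is the statement that vbench times;
    # start_date controls how far back the scheduled runs go
    ols_fit = Benchmark("sm.OLS(endog, exog).fit()", common_setup,
                        name="ols_fit",
                        start_date=datetime(2013, 1, 1))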

Do we add conditional code directly in the benchmark, if the API has changed?

Yep. There are a few examples in the pandas repo that do this. I'll link to them.
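
As a hedged sketch of what that kind of conditional code can look like (hypothetical, not one of the pandas examples; it reuses the `Benchmark` class imported in the sketch above), the setup string simply guards the import that changed:

    # a setup string that tolerates an API/namespace change between revisions
    compat_setup = """
    import numpy as np
    try:
        from statsmodels.api import OLS
    except ImportError:
        # hypothetical fallback for old revisions that used the scikits namespace
        from scikits.statsmodels.api import OLS

    np.random.seed(12345)
    exog = np.column_stack([np.ones(1000), np.random.randn(1000, 5)])
    endog = exog.sum(1) + np.random.randn(1000)
    """

    ols_fit_compat = Benchmark("OLS(endog, exog).fit()", compat_setup,
                               name="ols_fit_compat")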

Can we make selective runs, additional to the scheduled runs?

Yes. This will be the test_perf.py file.

I'll check about running subsets. I think it's possible.

@TomAugspurger
Contributor Author

Should I add a new vbench file to docs/source/dev for the notes on this? Or should I stick them into docs/source/dev/test_notes.rst?

Getting close on this. One thing is a bit weird, and I want to compare it to the pandas results before saying this is ready. But I'm hitting some error when running the pandas vbench, so I'm waiting on that.

@TomAugspurger
Contributor Author

Okay this is probably about ready to go. I typed up some notes that will hopefully clear some questions up. I might clean them up and submit them to the vbench repo, but this should do for now.

One question for now. Each benchmark takes a start_date argument for how far back you want to go. I've got it set to just a few months ago for testing, but we'll want to push that back. Any idea how far you want to go back?

And obviously we need to expand the coverage. I basically just took the discrete examples and converted them into benchmarks. I can probably add more later, but there's no reason this can't be merged before I get around to that. And that way anyone who wants to can throw one in.

Anyone mind checking out my branch and giving it a shot? You just need a config file; see the notes I wrote up.

@TomAugspurger
Contributor Author

For future reference: wesm/vbench#34

@TomAugspurger
Contributor Author

There's a final bit that I haven't implemented yet. pandas and NumPy have webpages with the performance benchmarks. There are makefiles, which I haven't added yet, that automate the job. Any interest?

@josef-pkt
Member

I wanted to look at it today, but didn't find time yet.

A makefile for creating the HTML would be needed (maybe you have it already). The part that pushes the results automatically to a webpage won't be necessary until we have decided how to publish them.

@TomAugspurger
Contributor Author

No rush on my end. The steps to make the docs are

  1. run the suite :) python run_suite.py
  2. Generate the rst files: python generate_rst.py
  3. Make the html: python make.py html

All from the statsmodels/vb_suite/ directory.

@TomAugspurger
Contributor Author

The things we'll need to change are in make.py, specifically the function upload() and the dict funcd that I've commented out for now.

It also expects config and credentials files.
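
For anyone setting this up, a hedged sketch of what reading such a config could look like: the `[setup]` section and `tmp_dir` option mirror the excerpt quoted in the review comment further down, while the file name and the Python 2 import fallback are assumptions of mine, not taken from the PR:

    # hypothetical: read the vb_suite configuration used by suite.py/make.py
    try:
        from ConfigParser import ConfigParser   # Python 2, current at the time
    except ImportError:
        from configparser import ConfigParser   # Python 3

    config = ConfigParser()
    config.read("benchmarks.cfg")                 # file name is an assumption
    tmp_dir = config.get("setup", "tmp_dir")      # matches the excerpt below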

@vincentarelbundock
Contributor

I'll give this a serious look next week if nobody has gotten around to it by then. Sorry I can't do it any sooner!

@TomAugspurger
Contributor Author

Thanks! No rush.

-Tom

[Review thread on the following excerpt from the vb_suite configuration:]

        TMP_DIR = config.get('setup', 'tmp_dir')
    except:
        REPO_PATH = os.path.abspath(os.path.join(os.path.dirname(__file__), "../"))
        REPO_URL = 'git@github.com:statsmodels/statsmodels.git'
Contributor

On my machine, I had to change git@ to a URL: https://github.com/statsmodels/statsmodels.git

Contributor Author

Thanks. I forgot to check the except part of that block.

@vincentarelbundock
Contributor

Build works well for me. I uploaded it to my website in case @jseabold and @josef-pkt want to have a look.

http://umich.edu/~varel/vbench

This looks really great and it's easy to use. Thanks a lot Tom!

A couple things:

  • We probably need a very short blurb just to explain what this is
  • I like the folding TOC on the right-hand side, so perhaps we don't need to duplicate it with the ugly link hierarchy in the middle
  • If benchmark results are saved locally to benchmark.db, then I think we should go way back in time with benchmarks, like a year or two. Subsequent runs will be much cheaper than the first.

@vincentarelbundock
Contributor

Also, can we include a download link to the raw benchmark database?

@TomAugspurger
Contributor Author

Re: sharing the raw database, I'll have to check how vbench hashes the runs (so that it doesn't have to rerun one that it's already done). As long as it's not using anything specific to the local file system, that should be fine, I think.
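
Purely as a conceptual sketch (not vbench's actual implementation, and the table/column names are invented): if results are keyed by a checksum of the benchmark code plus the git revision, with nothing tied to local paths, the database should be safe to share.

    import hashlib
    import sqlite3

    def already_have_result(db_path, benchmark_code, revision):
        """Hypothetical lookup: is this (benchmark checksum, revision) pair
        already recorded in the results database?"""
        checksum = hashlib.md5(benchmark_code.encode("utf-8")).hexdigest()
        con = sqlite3.connect(db_path)
        try:
            cur = con.execute(
                "SELECT 1 FROM results WHERE checksum = ? AND revision = ?",
                (checksum, revision))
            return cur.fetchone() is not None
        finally:
            con.close()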

I'm guessing that this won't work on Windows right now. I've had a bit of trouble getting Python set up on Windows; I think my VM starves it of RAM. I might have a bit of time later this week to look into that.

@vincentarelbundock
Contributor

Well, I do know that vbench won't re-run the same benchmarks if they're already in the db. Just try running the suite twice in a row :)

Also, the build machine will likely be Linux or Mac, so that shouldn't be a problem.

@josef-pkt
Member

@TomAugspurger You don't need to get a Windows VM just for this. I can look at the Windows-specific problems when I try to run it.

(The cheapest way to get an almost fully loaded Python environment is https://code.google.com/p/winpython/, or pythonxy for integration with Windows. Git also has a portable package; only the SSH key needs to be set up.)

@vincentarelbundock
Contributor

For the download link, it would just be a matter of inserting a line in make.py to copy benchmark.db somewhere in the build/html path, and then including an appropriate link in the docs.
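
A minimal sketch of what that added line in make.py might be (the build/html path and current-directory layout are assumptions for illustration):

    import os
    import shutil

    # copy the raw benchmark database next to the generated HTML so the
    # docs can link to it
    shutil.copy("benchmark.db", os.path.join("build", "html", "benchmark.db"))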

@josef-pkt
Member

Thanks to both. It looks good on the website that Vincent posted.

One question will be figuring out how noisy the results are, or how to reduce the noise (the spike before the Sep 2013 label).

@vincentarelbundock
Contributor

Not sure about noise, but it looks like this varies from run to run. I uploaded a new set and it doesn't have the big Sep 2013 spike (it has other new ones). I think I put my laptop to sleep in the middle of the initial run, so that might explain it. It might be sensitive to what else is going on on the computer. I plan to run this on an "at rest" computer tonight, but I don't have that available to me now. Perhaps the noise will disappear.

@josef-pkt
Member

(aside for this PR:
ols 1.7 milliseconds, ols with formula 6 milliseconds)

Commits on this pull request:

  • ENH: Add vbench. Mostly self-contained in the `vb_suite` directory in the main statsmodels repo. Also added some docs under `/docs/source/vbench.rst`.
  • BUG: Change remote repo location to url.
  • BUG: Just ignore any database.
  • BUG: Change repo location to url.
  • Removed accidental addition of database.
  • BUG: Change version to .__version__
@TomAugspurger
Contributor Author

Just pushed those two fixes (the version and the git vs. https url).

Thanks for looking at this.

What else needs to be done? Vincent mentioned a short write-up of what this is and how to use it. I can trim down the note I put in the docs to something more manageable.

@TomAugspurger
Contributor Author

FYI, I just added some benchmarks for GLM, RLM, and WLS, and a bit more for ARIMA. I'm running those now, and if everything checks out I'll push them up too.

@josef-pkt josef-pkt added the PR label Feb 19, 2014
@jseabold
Member

jseabold commented Apr 2, 2014

Didn't forget about this (totally). Is this in decent shape? I was just thinking how I'd like some quick scripts that I can use to do some profiling.

@jseabold
Member

jseabold commented Apr 2, 2014

This looks ok to me. I made some changes locally, so I can add it to my cron jobs and push this along with the docs. I'm likely going to stick this in cron.monthly unless we start focusing on performance more. If everything looks ok after it finishes running and the docs build, then I'll probably open a new PR to add 1-2 commits to this.

@jseabold jseabold mentioned this pull request Apr 2, 2014
@jseabold
Member

jseabold commented Apr 2, 2014

@TomAugspurger I moved all of your work here https://github.com/statsmodels/vbench

Let me know if you'd like commit rights to this repo.

@jseabold jseabold closed this Apr 2, 2014
@josef-pkt
Member

@jseabold I think it would be better organized if we move the benchmarks into a subdirectory, to keep them separate from the vbench files.

something like:

__import__('benchmark_modules.' + modname) in suite.py
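
A rough sketch of what that could look like in suite.py, assuming a `benchmark_modules/` package (with an `__init__.py`) next to the vbench driver files; the module names and the collection loop are hypothetical:

    import importlib

    from vbench.benchmark import Benchmark

    # hypothetical benchmark modules living under benchmark_modules/
    module_names = ['discrete', 'glm', 'rlm']

    by_module = {}
    benchmarks = []
    for modname in module_names:
        mod = importlib.import_module('benchmark_modules.' + modname)
        # pick up every Benchmark instance defined at module level
        found = [v for v in vars(mod).values() if isinstance(v, Benchmark)]
        by_module[modname] = found
        benchmarks.extend(found)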

@jseabold
Member

jseabold commented Apr 2, 2014

Go ahead and file an issue on the other repo. Feel free to make a PR.

@TomAugspurger
Contributor Author

@jseabold Thanks. I've subscribed to the repo. I'll take a look this weekend to see if anything has gone stale. If you run into trouble, go ahead and ping me. I'm happy to maintain it.

@jseabold
Member

jseabold commented Apr 2, 2014

One thing I didn't do is add any config file to the repo. It seemed the defaults were fine. I also never looked at the vbench code. I just updated the e-mail and upload stuff and ran it.
