
Start building benchmarks, gradually #1372

Closed
rrnewton opened this issue Apr 18, 2016 · 15 comments

Comments

@rrnewton
Contributor

rrnewton commented Apr 18, 2016

There has been discussion on ghc-devs about revitalizing the Haskell benchmarking suites. Stackage benchmark suites would appear to be one useful source of data. Indeed, there are almost 200 benchmark suites among the ~1,700 current LTS packages.

However, running Stackage benchmarks is currently hard, for several reasons:

  • Tarballs generated by cabal sdist are often broken (e.g. pipes in lts-5.13 is missing a file)
  • There is no general way to map from a Hackage release back to the original source code commit.
  • Stackage doesn't enforce that any benchmark suites build, much less run. (It does, however, build the dependencies of benchmark suites, except for a list that are skipped.)
  • As a result, even popular packages like lens-4.13 error out when you try to build their benchmarks (see the command sketch after this list).
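
For concreteness, here is roughly the kind of invocation involved in reproducing this locally (the package and resolver versions are only illustrative):

```
# Unpack a released tarball and try to compile its benchmark suites
# without running them; lens-4.13 is one package reported above whose
# benchmarks fail to build.
stack unpack lens-4.13 && cd lens-4.13
stack init --resolver lts-5.13
stack build --bench --no-run-benchmarks
```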

While I understand that benchmarks were left out of the original stackage focus, is it perhaps time to start tightening the screws and weeding out these invalid tarballs?

Perhaps a first step would be a warning mechanism. The maintainers' agreement has some hard deadlines, but failing benchmark-suite builds could be only a soft warning for some period of time, and then later become a hard requirement along with the rest.

(CC @RyanGlScott @vollmerm @osa1)

@juhp
Contributor

juhp commented Apr 20, 2016

I think it is an interesting idea to at least try building the benchmarks in Stackage.
Of course it will add to the maintenance burden and build times.
I'm not sure whether it could be started as a separate effort before integrating it into Stackage itself - it would be good at least to do some initial testing first to get a better idea of how good or bad things currently are.

@phadej
Contributor

phadej commented Apr 20, 2016

I tried to build some subset of the benchmarks around Christmas; many depend on an old criterion. So even notifying maintainers about restrictive bounds would be nice.
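
For illustration only (the versions here are made up), the kind of benchmark stanza bound that causes this looks like:

```
benchmark bench
  type:           exitcode-stdio-1.0
  main-is:        Bench.hs
  build-depends:  base
                , criterion >= 1.0 && < 1.1  -- upper bound rejects current criterion releases
```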

@rrnewton
Contributor Author

rrnewton commented Apr 20, 2016

@juhp We will gather data on how many of them already build, and report that here. I hope that just building benchmarks does not add too much to the build time because Stackage already builds benchmark dependencies.

There may be a substantial one-time transition cost -- getting people to fix currently broken packages. But I hope that, long term, building benchmarks won't add too much friction, because it doesn't increase the "surface area" of maintainers and packages; it's just one more correctness check packages must pass to be included.

@DanBurton
Contributor

> So even notifying maintainers about restrictive bounds would be nice

As I understand it, we do notify maintainers about restrictive bounds on benchmarks, just the same as with their regular dependencies. But once they get put on the "skipped benchmarks" list, we stop notifying them.

@snoyberg
Contributor

I have no problem with turning on benchmark building. We'll just need to populate a field for expected benchmark failures pretty quickly. I'll make the tweak. @juhp I can wait to activate this until I'm back on curator duty if you'd like
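
Assuming it ends up looking like the other skip lists in build-constraints.yaml, a rough sketch of such a field might be (the key name and entries are only illustrative, using packages mentioned later in this thread):

```yaml
# Hypothetical sketch of a skip list in build-constraints.yaml;
# the real key name may differ.
expected-benchmark-failures:
    - attoparsec
    - lens
    - pipes
```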

@phadej
Contributor

phadej commented Apr 21, 2016

@DanBurton oh, I didn't know! That's very nice.

snoyberg added a commit to fpco/stackage-curator that referenced this issue Apr 21, 2016
@juhp
Contributor

juhp commented Apr 21, 2016

@snoyberg awesome! Well we can try it now I guess and see how it goes.
I see benchmarks are building now actually - I can try to report the initial results here when that finishes.

@snoyberg
Contributor

Yeah, sorry, I moved ahead with it right away to test out my code. If it
causes you trouble, let me know and I'll roll it back.


@snoyberg
Contributor

The following benchmarks appear to be failing:

  • Frames
  • attoparsec
  • bzlib-conduit
  • cacophony
  • carray
  • cipher-aes128
  • cryptohash
  • dbus
  • effect-handlers
  • fast-builder
  • gitson
  • hashable
  • http-link-header
  • idris
  • jose-jwt
  • lens
  • lucid
  • mongoDB
  • mutable-containers
  • picoparsec
  • pipes
  • psqueues
  • rethinkdb
  • stateWriter
  • streaming-commons
  • thyme
  • vector-binary-instances
  • vinyl
  • warp
  • web-routing
  • xmlgen
  • yesod-core
  • yi-rope

Not pinging maintainers on this (though I'll look into a few of those myself).

@juhp
Contributor

juhp commented Apr 21, 2016

Thanks, that's great!

@juhp
Contributor

juhp commented May 16, 2016

Can we close this now?

@snoyberg
Contributor

Yes, I think so

@rrnewton
Contributor Author

rrnewton commented May 18, 2016

Just FYI, we are now successfully grabbing and running more than 66 benchmark suites and recording their results. We're seeing various build errors that don't match up 100% with the list above, and we can gradually help by poking maintainers or sending PRs.

Many of them are simple fixes, like adding missing other-modules entries.
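
As a generic sketch of that kind of fix (the module and file names here are made up), the benchmark stanza in the .cabal file just needs the extra module listed so that cabal sdist includes it:

```
benchmark bench
  type:            exitcode-stdio-1.0
  hs-source-dirs:  bench
  main-is:         Main.hs
  other-modules:   Bench.Common   -- previously missing, so sdist tarballs omitted bench/Bench/Common.hs
  build-depends:   base, criterion
```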

@juhp
Contributor

juhp commented May 18, 2016

@rrnewton Thank you. If you want to create an issue to track progress, feel free; otherwise the current Stackage status can at least be checked in the "Expected benchmark failures" section of build-constraints.yaml.

@RyanGlScott
Contributor

I'm currently hunting down Stackage benchmark failures separately in this issue. As these packages get fixed and uploaded to Hackage, I'll submit PRs to this repo to add their benchmarks back into the fold.
