Detecting benchmark regressions #174

shahsaurabh0605 · 2016-07-05T12:19:02Z

Added a method to detect benchmark regressions. It fetches the results of the last 10 commits from the database and compares them with the newly created result.

tgxworld · 2016-07-05T12:39:05Z

Hmm have you seen my post in http://community.rubybench.org/t/gsoc-project-improving-rubybench/99/53?u=tgxworld? 😄

tgxworld · 2016-07-05T12:39:57Z

app/controllers/benchmark_runs_controller.rb

+      commit_objects.each do |object|
+        commit_results << object.result[category]
+      end
+    end


All this should be done in a background job

The background job can be called after the current benchmark_run is saved to the database successfully.

shahsaurabh0605 · 2016-07-06T09:44:35Z

app/jobs/benchmark_regression_job.rb

+
+  def perform(initiator, benchmark_type, benchmark_result_type, category)
+    commit_objects = BenchmarkRun.where( initiator: initiator, benchmark_type: benchmark_type,
+                     benchmark_result_type: benchmark_result_type).limit(10)


What should be the best value here instead of 10?

What values have you considered and why?

I think we can hardcode it to a specific number but this largely depends on what baseline we are considering to detect performance regression in this new commit.

What method have you decided to implement to detect regressions? Based on the method, you can then derive a reasonable value.

I added a post in http://community.rubybench.org/t/gsoc-project-improving-rubybench/99/55 to discuss the method.

shahsaurabh0605 · 2016-07-07T10:21:42Z

Have a look at this.

tgxworld · 2016-07-07T10:31:58Z

app/jobs/benchmark_regression_job.rb

+    results_standard_deviation = Math.sqrt(commit_results.inject(0){|accum, i| accum +(i-results_avg)**2 }/(commit_results.length - 1).to_f)
+    results_param = results_avg + 2*results_standard_deviation
+
+    if commit_objects[0].result[category] > results_param


I don't think the new result should be included in the calculation of the standard deviation

I am not including the new result in the calculation of the standard deviation: the first element is dropped in the loop (see commit_objects.drop(1)).

Ahh, I see. Anyway, the default scope orders by created_at; what we want here is to order by Commit#created_at instead.
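For reference, the check discussed in this thread can be sketched in plain Ruby roughly as follows. This is only a sketch under assumptions: `regression?` and its parameters are hypothetical names rather than the PR's code, results are assumed to be numeric with larger meaning slower, and the new result is passed separately so that it never enters the baseline statistics.

```ruby
# Hypothetical helper, not the PR's actual implementation.
# previous_results: numeric results of earlier commits (the baseline only).
# new_result: the result of the commit being checked.
def regression?(previous_results, new_result, sigmas: 2)
  # The sample standard deviation needs at least two data points.
  return false if previous_results.size < 2

  average = previous_results.sum / previous_results.size.to_f

  squared_deviations = previous_results.sum { |result| (result - average)**2 }
  standard_deviation = Math.sqrt(squared_deviations / (previous_results.size - 1))

  # Flag the new result only when it exceeds the baseline by `sigmas` deviations.
  new_result > average + sigmas * standard_deviation
end
```

With previous results of `[10.0, 10.2, 9.8, 10.1, 9.9]`, a new result of `11.0` would be flagged, while `10.2` would not.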

shahsaurabh0605 · 2016-07-08T07:05:45Z

app/jobs/benchmark_regression_job.rb

+
+  def perform(initiator, benchmark_type, benchmark_result_type, category)
+    commit_objects = BenchmarkRun.where(benchmark_type: benchmark_type,
+                     benchmark_result_type: benchmark_result_type).offset(10).limit(1000)


Since we are fetching 1000 results, it doesn't make much of a difference if we skip the first few. So we can set a safety offset (here 10) to be sure that we are fetching the correct results.

"Since we are fetching 1000 results, it doesn't make much of a difference if we skip the first few."

It does make a difference if the last 10 results are way off. What you want here is the benchmark results of the last 1000 commits, which you can easily fetch.

shahsaurabh0605 · 2016-07-16T15:02:10Z

app/jobs/benchmark_regression_job.rb

+                     benchmark_result_type: benchmark_result_type).order(:created_at).first.initiator_id
+
+    benchmark_runs = BenchmarkRun.where(initiator_type: Commit, benchmark_type: benchmark_type,
+                     benchmark_result_type: benchmark_result_type).offset(last_initiator - current_initiator).limit(1000)


This sets the offset so that the benchmarks which came after our benchmark are not considered.
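An alternative to offset arithmetic is to select the baseline by commit date directly. The sketch below illustrates the idea in memory; `Run`, `baseline_runs`, and the field names are hypothetical, and in the application this would presumably be a database query ordered by Commit#created_at rather than an array operation.

```ruby
# Hypothetical in-memory stand-in for a benchmark run joined with its commit.
Run = Struct.new(:commit_created_at, :result)

# Returns up to `limit` runs whose commits precede the current run's commit,
# oldest first. Runs belonging to later commits are left out of the baseline.
def baseline_runs(runs, current_run, limit: 1000)
  runs
    .select { |run| run.commit_created_at < current_run.commit_created_at }
    .sort_by(&:commit_created_at)
    .last(limit)
end
```

Selecting on the commit timestamp means runs that were recorded late for older commits still land in the right position.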

shahsaurabh0605 · 2016-07-16T15:35:46Z

Done 👍

tgxworld · 2016-07-17T01:45:17Z

Is this PR done?

shahsaurabh0605 · 2016-07-17T07:04:08Z

You can have a look at it once. According to me, it's done 😄

tgxworld · 2016-07-18T04:41:01Z

Hmm, I see WIP in the title, and we don't seem to be creating an issue on GitHub yet when a regression is detected? You need to add tests as well; otherwise, we won't know that your code is working as intended.

shahsaurabh0605 · 2016-07-18T16:20:07Z

Done 👍

tgxworld · 2016-07-18T16:23:55Z

test/jobs/benchmark_regression_job_test.rb

+    category = "sometime"
+    BenchmarkRegressionJob.new.perform(benchmark_run, initiator, benchmark_type, benchmark_result_type, category)
+  end
+end


erm.... what exactly is this test testing for?

It's testing for the new benchmark regression job which I added!

Can you please read through http://guides.rubyonrails.org/testing.html.

I am not following how your test is actually testing that the job you added works

How do I add tests for this job differently? Currently this test just runs the perform method with appropriate parameters, which is similar to what is implemented in the remote_server_job tests.

It isn't similar to the remote_server_job tests. Note how those tests set expectations about certain methods being called.

What you're doing here is just calling the method.

How do you know if it is correctly detecting a regression?

When it correctly detects a regression, what is the code expected to do?

These have to be included in the tests
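To make the point concrete, here is a minimal sketch of a test that sets expectations instead of only calling the method. Everything in it is an assumption for illustration: `FakeIssueTracker` and `RegressionCheck` are hypothetical stand-ins, not classes from this PR, and the threshold is assumed to be computed elsewhere.

```ruby
require "minitest/autorun"

# Hypothetical stand-in for whatever opens the GitHub issue; it only records calls.
class FakeIssueTracker
  attr_reader :created_issues

  def initialize
    @created_issues = []
  end

  def create_issue(title)
    @created_issues << title
  end
end

# Hypothetical, simplified version of the job's decision logic.
class RegressionCheck
  def initialize(issue_tracker)
    @issue_tracker = issue_tracker
  end

  # threshold is assumed to be precomputed from earlier results.
  def perform(category, new_result, threshold)
    return unless new_result > threshold

    @issue_tracker.create_issue("Regression detected in #{category}")
  end
end

class RegressionCheckTest < Minitest::Test
  def setup
    @tracker = FakeIssueTracker.new
    @check = RegressionCheck.new(@tracker)
  end

  def test_creates_an_issue_when_a_regression_is_detected
    @check.perform("sometime", 12.0, 10.5)

    assert_equal ["Regression detected in sometime"], @tracker.created_issues
  end

  def test_creates_no_issue_when_the_result_is_within_the_threshold
    @check.perform("sometime", 10.0, 10.5)

    assert_empty @tracker.created_issues
  end
end
```

Each test states what the code is expected to do in one scenario, so a failure shows which behaviour broke.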

shahsaurabh0605 · 2016-08-01T10:05:05Z

Done 👍

tgxworld · 2016-08-01T10:07:30Z

What is done ❓

shahsaurabh0605 · 2016-08-01T10:09:48Z

I have made the changes. You can have a look.

tgxworld · 2016-08-01T10:13:42Z

app/jobs/benchmark_regression_job.rb

+    uri = URI.parse(Rails.application.secrets.github_api+"?state=open&since=#{get_time}")
+    response = JSON.parse(Net::HTTP.get(uri))
+    response.each do |response|
+      puts response["body"].nil?


@shahsaurabh0605 Please check your PR... I've mentioned this before.

shahsaurabh0605 · 2016-08-01T12:18:33Z

Made the changes.

tgxworld · 2016-08-02T04:03:10Z

test/jobs/benchmark_regression_job_test.rb

+  test "check for similar issues" do
+    stub_request(:get, Rails.application.secrets.github_api+"?state=open&since=#{BenchmarkRegressionJob.new.get_time}").
+    with(:headers => {'Accept'=>'*/*', 'Accept-Encoding'=>'gzip;q=1.0,deflate;q=0.6,identity;q=0.3', 'User-Agent'=>'Ruby'}).
+    to_return(:status => 200, :body => %Q[["body"]], :headers => {})


Is there a reason why we are not using a VCR cassette here ❓

If we are using stub_request then we are not using an actual api request. So do we need vcr cassettes over here?

You are using VCR for the other request to GitHub so I'm curious why we end up stubbing here

Here we are using the current time in the API request, which changes continuously. So I think stubbing the request must be the solution.

There is https://github.com/travisjeffery/timecop for that. I just realized you're only unit testing each method individually. How are you sure that when we glue everything together, it'll work?

To make sure all the pieces glue together, I think I need to add a test for the perform method, which makes use of all the other methods.

Please do :) I actually think a single test will cover everything. We don't really need to unit test those methods individually. Be sure to cover failure cases as well, for example when an issue is not supposed to be created.
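On the changing timestamp discussed above: the property that Timecop provides in a test is a fixed notion of "now", which makes the request URL deterministic and therefore recordable. A gem-free sketch of the same idea passes the time in as an argument. The name `issues_uri` and the placeholder URL are assumptions for illustration; this is not the PR's `get_time` implementation.

```ruby
require "time"
require "uri"

# Hypothetical helper: builds the issue-search URL for a given point in time.
# base_url stands in for Rails.application.secrets.github_api.
def issues_uri(base_url, now: Time.now)
  since = now.utc.iso8601
  URI.parse("#{base_url}?state=open&since=#{since}")
end

# With a fixed time the URL is identical on every run.
frozen = Time.utc(2016, 6, 3, 8, 20, 41)
uri = issues_uri("https://api.example.test/issues", now: frozen)
```

Freezing time inside a test achieves the same effect without changing the method signature.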

tgxworld · 2016-08-02T04:08:07Z

The PR is looking good. One last thing. I need you to fix up your code style because it is all over the place right now.

Leave a blank line before and after code that spans multiple lines.
Leave a space after each comma in method arguments. For example, create_issue(@benchmark_run, @benchmark_run.result.keys[0],4.0) is missing a space before 4.0.
In Math.sqrt(previous_benchmark_results.inject(0){|accum, i| accum +(i-results_average)**2 }/(previous_benchmark_results.size - 1).to_f), some operators have spaces around them and some don't.

Please look through each and every line carefully to make sure the code style is consistent.

shahsaurabh0605 · 2016-08-02T09:57:01Z

test/jobs/benchmark_regression_job_test.rb

+    travel_to "2016-06-03 13:50:41 +0530" do
+      VCR.use_cassette('benchmark_regression') do
+        assert_equal ["sometime"], BenchmarkRegressionJob.new.perform(@benchmark_run.id)
+      end


I am a bit stuck here. I am able to travel to a particular time and use VCR cassettes. I need to add two tests: one where a regression is detected and an issue is created, and another where no issue is created. But how can I test this now that the method no longer returns a response code?

Well, then you have to think about what you are testing for. In this case, you care that certain HTTP calls are made under the right scenario. Whether the method returns the response code doesn't really matter, because the VCR cassette will return the same response code every time it is used.

Actually, just restore the previous unit tests. I realized the trouble of writing an integration test isn't worth it.

shahsaurabh0605 · 2016-08-02T11:17:49Z

Any errors in indentation?

shahsaurabh0605 · 2016-08-04T10:36:08Z

Anything left in this?

tgxworld · 2016-08-04T10:38:20Z

Looks good to me. I'll probably have to merge this over the weekend so that I can fix the tests after merging

tgxworld · 2016-08-04T10:38:47Z

@shahsaurabh0605 Thanks for the work :)

shahsaurabh0605 · 2016-08-05T06:03:31Z

Have you seen my reply http://community.rubybench.org/t/gsoc-project-improving-rubybench/99/70 ?

tgxworld · 2017-12-04T13:34:34Z

Closing as stale.

shahsaurabh0605 force-pushed the diverged3 branch from e8fbccf to 415236b Compare July 5, 2016 12:20

tgxworld reviewed Jul 5, 2016

shahsaurabh0605 force-pushed the diverged3 branch from 415236b to fc4f350 Compare July 6, 2016 09:43

shahsaurabh0605 reviewed Jul 6, 2016

shahsaurabh0605 force-pushed the diverged3 branch from fc4f350 to ae15b3c Compare July 7, 2016 10:20

tgxworld reviewed Jul 7, 2016

shahsaurabh0605 force-pushed the diverged3 branch 2 times, most recently from 4e551f3 to 13dd3cd Compare July 8, 2016 07:03

shahsaurabh0605 reviewed Jul 8, 2016

shahsaurabh0605 force-pushed the diverged3 branch from 13dd3cd to 1f6ad98 Compare July 16, 2016 14:58

shahsaurabh0605 reviewed Jul 16, 2016

shahsaurabh0605 force-pushed the diverged3 branch from 1f6ad98 to 68817ff Compare July 16, 2016 15:33

shahsaurabh0605 changed the title ~~[WIP] Detecting benchmark regressions~~ Detecting benchmark regressions Jul 18, 2016

shahsaurabh0605 force-pushed the diverged3 branch from 68817ff to 2394fb3 Compare July 18, 2016 16:12

tgxworld reviewed Jul 18, 2016

shahsaurabh0605 force-pushed the diverged3 branch from 2394fb3 to 2097328 Compare July 20, 2016 04:24

shahsaurabh0605 force-pushed the diverged3 branch from b2e862c to f9095dd Compare August 1, 2016 10:03

tgxworld reviewed Aug 1, 2016

shahsaurabh0605 force-pushed the diverged3 branch from f9095dd to 2d8689d Compare August 1, 2016 12:15

shahsaurabh0605 force-pushed the diverged3 branch from 2d8689d to a404814 Compare August 2, 2016 02:35

tgxworld reviewed Aug 2, 2016

shahsaurabh0605 force-pushed the diverged3 branch 2 times, most recently from 2177622 to 02e0b3b Compare August 2, 2016 09:51

shahsaurabh0605 reviewed Aug 2, 2016

Detecting benchmark regressions

d3800b1

shahsaurabh0605 force-pushed the diverged3 branch from 02e0b3b to d3800b1 Compare August 2, 2016 11:16

tgxworld added the GSOC label Aug 12, 2016

tgxworld closed this Dec 4, 2017
