
Conversation

@shahsaurabh0605 (Contributor):

Added a method to detect benchmark regressions. This fetches the results of the last 10 commits from the database and compares them with the newly created result.

@tgxworld (Member) commented Jul 5, 2016:

  commit_objects.each do |object|
    commit_results << object.result[category]
  end
end
Member:

All this should be done in a background job.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The background job can be called after the current benchmark_run is saved to the database successfully.
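
For illustration, a minimal sketch of the reviewer's suggestion, assuming an ActiveJob-style job; the after_commit hook and the id argument are assumptions for this sketch, not code from the PR:

    class BenchmarkRun < ActiveRecord::Base
      # Enqueue the regression check only once the run is safely committed.
      after_commit :check_for_regression, on: :create

      private

      def check_for_regression
        BenchmarkRegressionJob.perform_later(id)
      end
    end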


def perform(initiator, benchmark_type, benchmark_result_type, category)
  commit_objects = BenchmarkRun.where(initiator: initiator, benchmark_type: benchmark_type,
    benchmark_result_type: benchmark_result_type).limit(10)
Contributor Author:

What would be the best value here instead of 10?

Member:

What values have you considered and why?

Contributor Author:

I think we can hardcode it to a specific number, but this largely depends on what baseline we use to detect a performance regression in the new commit.

Member:

What method have you decided to implement to detect regressions? Based on the method, you can then derive a reasonable value.


@shahsaurabh0605 (Contributor Author):

Have a look at this.

results_standard_deviation = Math.sqrt(commit_results.inject(0){|accum, i| accum +(i-results_avg)**2 }/(commit_results.length - 1).to_f)
results_param = results_avg + 2*results_standard_deviation

if commit_objects[0].result[category] > results_param
Member:

I don't think the new result should be included in the calculation of the standard deviation.

Contributor Author:

I am not including the new result in the calculation of the standard deviation; I drop the first element in the loop (see commit_objects.drop(1)).

Member:

Ah, I see. Anyway, the default scope orders by created_at; what we want here is to order by Commit#created_at instead.
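
A rough sketch of what ordering by the commit's timestamp could look like, assuming benchmark_runs stores the commit in initiator_id/initiator_type (as later snippets in this thread suggest); the explicit join is illustrative, not the PR's code:

    commit_objects = BenchmarkRun
      .joins("INNER JOIN commits ON commits.id = benchmark_runs.initiator_id")
      .where(initiator_type: "Commit", benchmark_type: benchmark_type,
             benchmark_result_type: benchmark_result_type)
      .order("commits.created_at DESC")
      .limit(10)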

@shahsaurabh0605 force-pushed the diverged3 branch 2 times, most recently from 4e551f3 to 13dd3cd on July 8, 2016 at 07:03.

def perform(initiator, benchmark_type, benchmark_result_type, category)
  commit_objects = BenchmarkRun.where(benchmark_type: benchmark_type,
    benchmark_result_type: benchmark_result_type).offset(10).limit(1000)
Contributor Author:

Since we are fetching 1000 results, it doesn't make much of a difference if we leave out the first few. So we can set a safety offset (here 10) to be sure that we are fetching the correct results.

Member:

> Since we are fetching 1000 results, it doesn't make much of a difference if we leave out the first few.

It does make a difference if the last 10 results are way off. What you want here are the benchmark results of the last 1000 commits, which you can easily fetch.

  benchmark_result_type: benchmark_result_type).order(:created_at).first.initiator_id

benchmark_runs = BenchmarkRun.where(initiator_type: Commit, benchmark_type: benchmark_type,
  benchmark_result_type: benchmark_result_type).offset(last_initiator - current_initiator).limit(1000)
Contributor Author:

This sets the offset so that benchmarks that came after ours are not considered.

@shahsaurabh0605 (Contributor Author):

Done 👍

@tgxworld (Member):

Is this PR done?

@shahsaurabh0605 (Contributor Author):

You can have a look at it once more. As far as I'm concerned, it's done 😄

@tgxworld (Member):

Hmm, I see WIP in the title, and we don't seem to be creating an issue on GitHub yet when a regression is detected? You need to add tests as well; otherwise, we won't know that your code is working as intended.
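
For reference, a hedged sketch of creating an issue through the GitHub API with Net::HTTP; the endpoint reuses Rails.application.secrets.github_api seen elsewhere in this PR, while the method name and the github_token secret are hypothetical:

    require "net/http"
    require "json"

    # Hypothetical helper; not the create_issue the PR eventually adds.
    def post_github_issue(title, body)
      uri = URI.parse(Rails.application.secrets.github_api)
      request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
      # Assumes a personal access token stored in the app's secrets.
      request["Authorization"] = "token #{Rails.application.secrets.github_token}"
      request.body = { title: title, body: body }.to_json

      Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
        http.request(request)
      end
    end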

@shahsaurabh0605 changed the title from "[WIP] Detecting benchmark regressions" to "Detecting benchmark regressions" on Jul 18, 2016.
@shahsaurabh0605 (Contributor Author):

Done 👍

    category = "sometime"
    BenchmarkRegressionJob.new.perform(benchmark_run, initiator, benchmark_type, benchmark_result_type, category)
  end
end
Member:

erm.... what exactly is this test testing for?

Contributor Author:

It's testing for the new benchmark regression job which I added!

Member:

Can you please read through http://guides.rubyonrails.org/testing.html.

I am not following how your test actually verifies that the job you added works.

Contributor Author:

How should I test this job differently? Currently this test just runs the perform method with appropriate parameters, which is similar to what is implemented in the remote_server_job tests.

@tgxworld (Member), Jul 19, 2016:

It isn't similar to the remote_server_job tests. Note how those tests set expectations about certain methods being called.

Member:

What you're doing here is just calling the method.

  • How do you know if it is correctly detecting a regression?
  • When it correctly detects a regression, what is the code expected to do?

These have to be included in the tests.
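
To make that concrete, a sketch of the expectation-style tests the reviewer describes, assuming Mocha-style expects; the fixture names are purely illustrative, and create_issue is the method the job eventually gains later in this PR:

    test "creates an issue when a regression is detected" do
      # The job should call its issue-creation method exactly once.
      BenchmarkRegressionJob.any_instance.expects(:create_issue).once
      BenchmarkRegressionJob.new.perform(@regressed_run, @initiator, @benchmark_type, @benchmark_result_type, "sometime")
    end

    test "does not create an issue when the result is within range" do
      BenchmarkRegressionJob.any_instance.expects(:create_issue).never
      BenchmarkRegressionJob.new.perform(@normal_run, @initiator, @benchmark_type, @benchmark_result_type, "sometime")
    end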

@shahsaurabh0605 (Contributor Author):

Done 👍

@tgxworld (Member) commented Aug 1, 2016:

What is done ❓

@shahsaurabh0605 (Contributor Author):

I have made the changes. You can have a look.

uri = URI.parse(Rails.application.secrets.github_api+"?state=open&since=#{get_time}")
response = JSON.parse(Net::HTTP.get(uri))
response.each do |response|
  puts response["body"].nil?
Member:

@shahsaurabh0605 Please check your PR... I've mentioned this before.

@shahsaurabh0605 (Contributor Author):

Made the changes.

test "check for similar issues" do
stub_request(:get, Rails.application.secrets.github_api+"?state=open&since=#{BenchmarkRegressionJob.new.get_time}").
with(:headers => {'Accept'=>'*/*', 'Accept-Encoding'=>'gzip;q=1.0,deflate;q=0.6,identity;q=0.3', 'User-Agent'=>'Ruby'}).
to_return(:status => 200, :body => %Q[["body"]], :headers => {})
Member:

Is there a reason why we are not using a VCR cassette here ❓

Contributor Author:

If we are using stub_request, then we are not making an actual API request. So do we need VCR cassettes here?

Member:

You are using VCR for the other request to GitHub, so I'm curious why we end up stubbing here.

Contributor Author:

Here we are using the current time in the API request, which continuously changes. So I think stubbing the request is the solution.

Member:

There is https://github.com/travisjeffery/timecop for that. I just realized you're only unit testing each method individually. How are you sure that when we glue everything together, it'll work?
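
For reference, a small sketch of pinning the clock with the suggested timecop gem so the since parameter in the URL becomes deterministic; the frozen time and the perform argument are illustrative:

    require "timecop"

    # Freeze the clock so the "?since=..." query string is stable across runs.
    Timecop.freeze(Time.new(2016, 6, 3, 13, 50, 41, "+05:30")) do
      VCR.use_cassette("benchmark_regression") do
        BenchmarkRegressionJob.new.perform(@benchmark_run.id)
      end
    end

(The tests below end up using Rails' built-in travel_to helper, which serves the same purpose.)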

Contributor Author:

To make sure all the pieces glue together, I think I need to add a test for the perform method, which makes use of all the other methods.

Member:

Please do :) I actually think a single test will cover everything. We don't really need to unit test those methods individually. Be sure to cover failure cases as well, for example when an issue is not supposed to be created.

@tgxworld (Member) commented Aug 2, 2016:

The PR is looking good. One last thing. I need you to fix up your code style because it is all over the place right now.

  1. Leave a blank line before and after code that spans multiple lines.
  2. Leave a space after , for method parameters. For example: create_issue(@benchmark_run, @benchmark_run.result.keys[0],4.0)
  3. Math.sqrt(previous_benchmark_results.inject(0){|accum, i| accum +(i-results_average)**2 }/(previous_benchmark_results.size - 1).to_f) Some operators have spaces around them and some don't.

Please look through each and every line carefully to make sure the code style is consistent.

@shahsaurabh0605 force-pushed the diverged3 branch 2 times, most recently from 2177622 to 02e0b3b on August 2, 2016 at 09:51.
travel_to "2016-06-03 13:50:41 +0530" do
  VCR.use_cassette('benchmark_regression') do
    assert_equal ["sometime"], BenchmarkRegressionJob.new.perform(@benchmark_run.id)
  end
Contributor Author:

I am a bit stuck here. I am able to travel to a particular time and use VCR cassettes. I need to add two tests: one for when a regression is detected and an issue is created, and another for when no issue is created. But how can I test this, as the method does not return a response code now?

Member:

Well, then you've got to think about what you are testing for. In this case, you care that certain HTTP calls are made under the right scenario. Whether the response code is returned by the method doesn't really matter, because the VCR cassette will end up returning the same response code every time it is used.

Member:

Actually, just restore the previous unit tests. I realized the trouble of writing an integration test isn't worth it.

@shahsaurabh0605 (Contributor Author):

Any errors in indentation?

@shahsaurabh0605 (Contributor Author):

Anything left in this?

@tgxworld (Member) commented Aug 4, 2016:

Looks good to me. I'll probably have to merge this over the weekend so that I can fix the tests after merging.

@tgxworld (Member) commented Aug 4, 2016:

@shahsaurabh0605 Thanks for the work :)


@tgxworld added the GSOC label on Aug 12, 2016.
@tgxworld (Member) commented Dec 4, 2017:

Closing as stale.

@tgxworld closed this on Dec 4, 2017.