New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add manual runner #210

Merged
merged 9 commits into from Jun 15, 2017

Conversation

Projects
None yet
4 participants
@bmarkons
Copy link
Collaborator

bmarkons commented Jun 8, 2017

I needed to run Sequel benchmarks for last 2000 commits. So far I could do it in production console in a pretty hacky (unsafe) way. Therefore I want to add app/services/manual_runner.rb which would fetch, store and run last 2000 commits from github api. That way we could run last number of commits manually in a tested way.

@tgxworld

This comment has been minimized.

Copy link
Member

tgxworld commented Jun 8, 2017

Excellent work :) This is what I've been wanting to do but have not gotten around to do so. This can be a seperate PR but can we add an admin interface as well?

@@ -0,0 +1,25 @@
module CommitsRunner

This comment has been minimized.

@tgxworld

tgxworld Jun 9, 2017

Member

Let's just make this a class :) I usually prefer to use modules only if you're including them into other classes.

end

def create(commit)
Commit.find_or_create_by(sha1: commit[:sha]) do |c|

This comment has been minimized.

@tgxworld

tgxworld Jun 9, 2017

Member

I think we might need to check if the creation is successfully. Otherwise, it might silently fail.

This comment has been minimized.

@bmarkons

bmarkons Jun 9, 2017

Collaborator

👍 will switch to find_or_create_by!

private

def fetch_commits_since(date)
Octokit.auto_paginate = true

This comment has been minimized.

@tgxworld

tgxworld Jun 9, 2017

Member

Does this paginate the response? Otherwise we might be fetching 2000 commits into memory.

This comment has been minimized.

@bmarkons

bmarkons Jun 9, 2017

Collaborator

Yeah, this will cause 2000 commits to be fetched into memory.

Maybe I could disable auto_paginate and enqueue commits after each fetched page instead of fetching all and enqueuing them all after?

class CommitsRunnerTest < ActiveSupport::TestCase
setup do
@repo = create(:repo)
BenchmarkPool.stubs(:enqueue)

This comment has been minimized.

@tgxworld

tgxworld Jun 9, 2017

Member

Instead of stubbing, perhaps we can look into the using the helps provided by sidekiq?

https://github.com/mperham/sidekiq/wiki/Testing

This comment has been minimized.

@bmarkons

bmarkons Jun 9, 2017

Collaborator

sure 👍

@bmarkons

This comment has been minimized.

Copy link
Collaborator

bmarkons commented Jun 9, 2017

@tgxworld Yes 🎉 I was thinking about some admin UI, but wasn't sure. We can add it in later PR 👏

I would rather keep this PR separate and use ManualRunner from console to run sequel last 2000 commits.

bmarkons added some commits Jun 8, 2017

@bmarkons bmarkons force-pushed the bmarkons:manual-runner branch from 79cdf52 to be7c03f Jun 9, 2017

@bmarkons bmarkons force-pushed the bmarkons:manual-runner branch 3 times, most recently from 5bfe6eb to 1ea63b3 Jun 9, 2017

@bmarkons bmarkons force-pushed the bmarkons:manual-runner branch from 1ea63b3 to b5d54da Jun 9, 2017

@bmarkons bmarkons changed the title Add manual runner [WIP] Add manual runner Jun 9, 2017

@bmarkons

This comment has been minimized.

Copy link
Collaborator

bmarkons commented Jun 9, 2017

Please feel free to drop suggestions on PR. Once we have this merged, I will execute benchmarks for last 2000 commits for sequel repo.

BTW, we already have some sequel runs present https://rubybench.org/jeremyevans/sequel/commits

@noahgibbs

This comment has been minimized.

Copy link

noahgibbs commented Jun 9, 2017

Looks reasonable to me.

@robin850
Copy link
Member

robin850 left a comment

Cool work Marko ! 🎉

class CommitsRunner
def self.run(commits)
commits.each do |commit|
if (valid?(commit))

This comment has been minimized.

@robin850

robin850 Jun 9, 2017

Member

You can remove the braces around the valid? call here.

author: {
name: commit['commit']['author']['name']
}
}

This comment has been minimized.

@robin850

robin850 Jun 9, 2017

Member

Given the amount of data that this method will have to handle, it may be worth trying to optimize it as much as you can. This is certainly going to be faster using something like:

def format_commits(commits)
  commits.each do |commit|
    commit[:repo] = @repo
    commit.merge(commit['commit'])
    commit.symbolize_keys!
  end

  commits
end

You don't really care here if the objects are modified here. Also with the above example you don't have exactly the same API but not sure it's worth allocating a new hash to be able to do something like commit[:url] instead of commit[:html_url].

I may be wrong though (both on the fact that you can modify objects without copying them and on the fact that may be faster) ; maybe you can write a tiny benchmark with benchmark/ips to see how fast it is and also check the number of allocated objects. What do you think ? 🙂

This comment has been minimized.

@noahgibbs

noahgibbs Jun 9, 2017

If this is running the actual tests for each commit (and it is, right?) then the amount of work in this method will absolutely pale in comparison to the time to run the tests :-(

This comment has been minimized.

@noahgibbs

noahgibbs Jun 9, 2017

2000 commits (so 2000 times creating a new hash table) just won't be a big deal on that scale.

This comment has been minimized.

@robin850

robin850 Jun 9, 2017

Member

Since :repo and :author references a hash as well, it would rather be 6000 times creating a Hash (while they are already allocated somewhere by Octokit). Either way, your call Marko, I don't mind ; there's certainly no big deal. :-)

This comment has been minimized.

@bmarkons

bmarkons Jun 9, 2017

Collaborator

Thank you guys for reviews ! 🙇

not sure it's worth allocating a new hash

Excuse me for no explanation why do I format commits before passing them to CommitsRunner.

It is simply because commits fetched from api and webhook don't have the same format and CommitsRunner is intended to handle both manually and automatically (via webhook) fetched commits.

So far, we've only handled commits from webhook via GithubEventHandler. Since now we have introduced ManualRunner I want to move creating and running commits in CommitsRunner. Therefore GithubEventHandler and ManualRunner should only receive commits and pass them to CommitsRunner in specific format.

Since :repo and :author references a hash as well, it would rather be 6000 times creating a Hash

I wasn't thinking about performance too much here simply because I don't find it very crucial in this scenario as @noahgibbs noticed. But creating 2000 hashes instead of 6000 can be fixed with this:

{
  sha: commit['sha'],
  message: commit['commit']['message'],
  repo_id: @repo.id,
  repo_name: @repo.name,
  url: commit['html_url'],
  created_at: commit['commit']['author']['date'],
  author_name: commit['commit']['author']['name']
}

I'll change that 👍

This comment has been minimized.

@bmarkons

bmarkons Jun 9, 2017

Collaborator

I plan to move creating and running commits from GithubEventHandler in later PR. :)

This comment has been minimized.

@robin850

robin850 Jun 10, 2017

Member

Okay, never mind then, my comment isn't really relevant ! :-)

private

def self.valid?(commit)
!Commit.merge_or_skip_ci?(commit[:message]) && Commit.valid_author?(commit[:author_name])

This comment has been minimized.

@robin850

robin850 Jun 9, 2017

Member

Unless I'm missing something, this will always return nil, it should rather be commit[:author][:name].

This comment has been minimized.

@bmarkons

bmarkons Jun 9, 2017

Collaborator

Oh yeah, good catch. I wonder how this passed test then 😕 will fix it.

commits_hashes.each do |hash|
commit = Commit.find_by!(sha1: hash[:sha])

assert_equal commit.sha1, hash[:sha]

This comment has been minimized.

@robin850

robin850 Jun 9, 2017

Member

I guess you should put arguments the other way around to have a non-misleading error message in case things are not equal.

assert_equal hash[:sha], commit.sha1

@bmarkons bmarkons force-pushed the bmarkons:manual-runner branch from 3627df3 to 0d4e466 Jun 12, 2017

@bmarkons

This comment has been minimized.

Copy link
Collaborator

bmarkons commented Jun 12, 2017

Thanks @robin850, I've made suggested changes 😌

@bmarkons bmarkons force-pushed the bmarkons:manual-runner branch from 0d4e466 to 1ec2345 Jun 12, 2017

@bmarkons bmarkons force-pushed the bmarkons:manual-runner branch 3 times, most recently from 185b2c0 to 64b15aa Jun 12, 2017

@bmarkons bmarkons force-pushed the bmarkons:manual-runner branch from 64b15aa to 13f6ddc Jun 13, 2017

@bmarkons

This comment has been minimized.

Copy link
Collaborator

bmarkons commented Jun 15, 2017

@tgxworld can we merge this one? :)

@tgxworld tgxworld merged commit 9203992 into ruby-bench:master Jun 15, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@bmarkons bmarkons added GSOC gsoc2017 and removed bug GSOC labels Aug 21, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment