
WIP - Begin refactoring of Benchmark module #93

Closed
wants to merge 1 commit into from

Conversation

@devonestes (Collaborator):


This is a non-functional, very early, broken step in the process. I'm
just putting it up now to get some feedback before continuing. There
are a lot of things I still need to work out.

While I was working on adding the memory usage information, it struck
me that it would be a lot easier to add that without major changes if
there was already a struct representing an individual benchmark that I
could add it to. I decided to explore this refactoring a bit, and after
this initial step I kind of like it.

Basically, I split the `Benchee.Benchmark` module into two parts -
`Benchee.Benchmark`, which handles creating the benchmark structs and
passing them along to the actual functions that will run the benchmarks,
and another module (`Benchee.Benchmark.Runner`) that handles the actual
running of the benchmarks.

So, what do folks think? It made sense to me since we already had a
struct for a `Suite` of benchmarks.
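
To make the proposed split concrete, here is a rough sketch; the module bodies and function names below are hypothetical illustrations, not the PR's actual code:

```elixir
defmodule Benchee.Benchmark.Runner do
  # owns the actual measuring loop for one benchmark struct
  def run(benchmark) do
    # ... warmup, then measure and fill in run_times ...
    benchmark
  end
end

defmodule Benchee.Benchmark do
  defstruct [:name, :function, :run_times]

  # builds one struct per job and hands each off to the runner
  def measure(jobs) do
    for {name, function} <- jobs do
      Benchee.Benchmark.Runner.run(%__MODULE__{name: name, function: function})
    end
  end
end
```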
@PragTob (Member) left a comment:

👋

First, thanks for making this. I've been wanting to split a runner-like module out of Benchmark for a while as that code grew quite big. Also, I wanted to get rid of the functions with way too many arguments by putting them in some form of struct as well, so I very much approve of the direction 👍

I'm just not too sure about what data to include in the struct, or whether the struct should be visible in the suite at all or just used internally for better runner code.

For me this sort of turned into the question of whether jobs/run_times/statistics should stay separate dumb data structures or if maybe it'd be more useful to aggregate all their data together. I.e. a Benchmark or Job or whatever struct would hold all the data for one particular job, so .benchmark defines it and then .measure and .statistics fill it up with the appropriate data. That'd be a BIG change and I wanna be as certain as can be that it goes in the right direction. Reworking plugins isn't always fun :D But we're still pre 1.0 so it's still very much possible.

All in all, I really appreciate your input, but I also still have to think about it 😄 More of my ramblings are inline; few code comments because there's not too much to comment on there =D

If you want, we could also meet some time in good ol' Berlin and pair refactor/talk things through 😃

The bunnies tried to help by going up to their high up vantage point, but weren't really able to have a huge impact.

[photo: img_20170619_225131]


@type name :: String.t

defstruct [:name, :input, :input_name, :function, :printer, :config,
           :run_times, :memory_usage]
@PragTob (Member):

I thought some time about putting some of this data into a struct for easier handling in the benchmarking module, so this is a good thought; I'm not completely sure about the data here yet though :) Plus my thought was mostly to just use the struct during benchmarking and then report back the run_times/memory_usage into the general suite. I have to put further thought into this.

As is, it duplicates a lot of data that exists elsewhere already (input, input_name, function, config), and it also exposes data that I think should not end up in the suite as it's rather an implementation detail (printer). Specifically, as the config is global, I'd avoid it ending up again somewhere in the resulting Benchee Suite.

That said, this gave me another idea that is worth deliberating. Some of the repeated data (name, function and to some extent input/input_name) defines the benchmarking jobs themselves. So theoretically it might be an option that when we create benchmarks through Benchee.benchmark we already create this struct, and then have the measure step just fill out the run_times and memory_usage. Might be worthwhile.

If we did that, though, why not go all the way and include statistics in there as well? That'd save us the sort of "mirrored" data structures we have to access, but it'd migrate away from the rather nice "every layer has its own key that it just fills up and following layers may use it or not" approach. Also, with the struct, serialization might become slightly more involved, but then benchee_json would finally have something to do 😂
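
A hypothetical sketch of that "go all the way" variant; the struct name and field set below are invented for illustration, not an agreed-on design:

```elixir
defmodule Benchee.Job do
  defstruct [
    :name,            # set by Benchee.benchmark
    :function,        # set by Benchee.benchmark
    :input_name,      # set when inputs are expanded
    :input,
    run_times: [],    # filled in by the measure step
    memory_usage: nil,
    statistics: nil   # filled in by the statistics step
  ]
end
```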


%Suite{suite | run_times: run_times}
%Suite{suite | benchmarks: benchmarks_with_results}
@PragTob (Member):

This breaks our interface/data structure, which means that all plugins would have to be patched/simultaneously released again. Not that I'm totally opposed to doing this; if we do this I just wanna be super duper sure we're happy with the new structure/interface, because I wanna slowly gravitate towards a 1.0 and then I definitely don't wanna have breaking changes like this for quite some time :)

That goes for the struct contents, but also, if I read the code correctly, it is no longer divided in a map by inputs. Aka not %{input => %{"job1" => ...}} but [job_1_input_1, job_1_input_2, job_2_input_1, ...]. I believe the nested structure is preferable, as for comparisons etc. you mostly just wanna look at functions that ran with the same input to have an ok comparison.

Also, we used to be able to directly Map.fetch the benchmark we needed by name; I'd have to review the plugin code to see how important that is...
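
To make the two shapes concrete, a small sketch with invented job names and numbers:

```elixir
# Nested shape keyed by input, as the suite has it today; plugins can
# fetch a job's run times directly by name:
run_times = %{
  "small list" => %{"map" => [10, 12], "flat_map" => [8, 9]},
  "big list"   => %{"map" => [100, 110], "flat_map" => [80, 85]}
}
{:ok, _times} = Map.fetch(run_times["small list"], "map")

# Flat list as in this PR; finding one job/input pair becomes a search:
benchmarks = [
  %{name: "map", input_name: "small list", run_times: [10, 12]},
  %{name: "flat_map", input_name: "small list", run_times: [8, 9]}
]
_benchmark = Enum.find(benchmarks, fn b ->
  b.name == "map" and b.input_name == "small list"
end)
```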

@PragTob (Member):

Another thing that pops into my head is that I'm not too sure about the benchmarks naming. I still haven't totally gotten down all the vocabulary. When I say "I wrote a benchmark" I mostly mean the whole thing, e.g. all jobs and all inputs together are one benchmark, or in the benchee terminology somewhat of a benchmarking suite.

So calling each individual one a "benchmark" as well seems weird. I took to calling the individual functions "benchmarking jobs", which then gather "measurements" or something, but I'm not too certain about that and not too set on it.

alias Benchee.Suite
alias Benchee.Benchmark.Runner
@PragTob (Member):

👍 Like the name; I've often thought about doing the very same refactoring.

"""
alias Benchee.Utility.RepeatN
alias Benchee.Benchmark

@PragTob (Member):

I can see why you wanted so much data in the Benchmark struct, but I worry a bit about when/if that's supposed to be public. It might be an option to create another, more internal benchmark structure here (or, well, in another module) to hold the additional data if needed :)

collection
|> Enum.map(fn(_) -> Task.async(func) end)
|> Enum.map(&Task.await(&1, :infinity))
|> List.flatten
@PragTob (Member):

I know it was like this before; reading it I just wonder if we couldn't use flat_map?
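
A sketch of that suggestion, reusing collection and func from the snippet above. Note List.flatten flattens arbitrarily deep while Enum.flat_map flattens only one level, so this is equivalent only assuming each task returns a flat list of run times:

```elixir
collection
|> Enum.map(fn _ -> Task.async(func) end)
|> Enum.flat_map(&Task.await(&1, :infinity))
```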

defp run_warmup(benchmark = %Benchmark{config: config}) do
  print = Map.put(config.print, :fast_warning, false)
  config = config |> Map.put(:print, print) |> Map.put(:time, config.warmup)
  benchmark = %Benchmark{benchmark | config: config}
@PragTob (Member):

Not sure, but overriding the config for the benchmark and getting a new benchmark just for that feels a bit clumsy 🤔

Not sure if maybe tagging along a parameter to control this might be easier in the end.
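
A rough sketch of that "tag along a parameter" alternative; everything here (the module, measure/3 and its options) is hypothetical:

```elixir
defmodule RunnerSketch do
  defstruct [:function, config: %{time: 5, warmup: 2}]

  def run_warmup(benchmark = %__MODULE__{config: config}) do
    # warmup: same loop, shorter time, no fast-function warning
    measure(benchmark, config.warmup, fast_warning: false)
  end

  def run_benchmark(benchmark = %__MODULE__{config: config}) do
    measure(benchmark, config.time, fast_warning: true)
  end

  defp measure(_benchmark, seconds, opts) do
    # ... the actual measuring loop would live here ...
    {seconds, Keyword.get(opts, :fast_warning)}
  end
end
```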

:erlang.garbage_collect
{n, initial_run_time} = determine_n_times(benchmark)
benchmark = %Benchmark{benchmark | run_times: [initial_run_time]}
do_benchmark(benchmark, n, current_time(), finish_time)
@PragTob (Member):

I could even imagine the n becoming part of the public-facing struct (with a nicer name, like repeats or something); that way plugins could also output that information in the reports.
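
That could be as small as one more field on the struct quoted earlier; the name repeats is just this comment's suggestion:

```elixir
defstruct [:name, :input, :input_name, :function, :printer, :config,
           :run_times, :memory_usage, repeats: 1]
```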

end

defp do_benchmark(benchmark, n, current_time, finish_time)
defp do_benchmark(benchmark = %Benchmark{run_times: run_times}, _, current_time, finish_time)
@PragTob (Member):

finally a manageable number of parameters 😍

The function had been bugging me for some time, but I didn't quite know how best to tackle it. It's a lot of the reason why credo doesn't run on the CI yet.

@PragTob (Member) commented Jun 28, 2017:

Okay, we talked and thought about great new ways:

  • we'll call the combination of jobs and inputs, along with all their measurements, "scenarios"
  • these basically replace the jobs, run_times and statistics keys in the suite structure, as they are now all in one, and will be stored in a list (@devonestes has the list of keys we want to be in there)
  • the runner will use something like a ScenarioContext to do its work, holding the additional data that shouldn't end up in the scenarios themselves (config, printer, probably start and finish time); a rough sketch of both follows below
  • we'll probably move along by adding this structure first and then re-tackle the extraction of the runner
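
A hypothetical sketch of those two structures as described above; the field lists are illustrative, not the final set of keys:

```elixir
defmodule Benchee.Scenario do
  # one job/input combination plus all its measurements
  defstruct [:name, :function, :input_name, :input,
             :run_times, :memory_usage, :statistics]
end

defmodule Benchee.ScenarioContext do
  # runner-internal data that shouldn't end up in the scenarios
  defstruct [:config, :printer, :start_time, :finish_time]
end
```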

@devonestes (Collaborator, Author):

Closing this in favor of #95 & #96

@devonestes devonestes closed this Jul 9, 2017
@devonestes devonestes deleted the refactor-add-benchmark-struct branch July 9, 2017 11:07