From 59f9886dc5ec98e7c2d21cb8b1d3f6bed3811a15 Mon Sep 17 00:00:00 2001
From: Tobias Pfeiffer
Date: Thu, 14 Dec 2023 20:00:42 +0100
Subject: [PATCH] document newly introduced formatter limitations

---
 CHANGELOG.md                        | 13 +++++++++++++
 lib/benchee/formatter.ex            | 25 ++++++++++++++++++++++++-
 test/benchee/configuration_test.exs |  5 +++++
 3 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3e836848..19e36015 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -13,8 +13,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 * `Benchee.report/1` got introduced if you just want to load saves benchmarks and report on them.
 
 ### Bugfixes (User Facing)
+* Memory usage should be massively reduced when dealing with larger sets of data in inputs or benchmarking functions. They were needlessly sent to processes calculating statistics or to formatters, which could lead to memory blowing up.
+* Similarly, inputs and benchmarking functions will no longer be saved when using the `:save` option. This makes saving immensely faster and, depending on the size of the data, the saved file a lot smaller (I have an example with a factor 200x). The side effect of this is that you also can't use `:load` and run the saved benchmarks again from just the file. This was never an intended use case though (as loading happens after benchmarking by default). You also should still have the benchmarking script, so it's also not needed.
 * Fix a bug where relative statistics would always rely on the inputs provided in the config, which can break when you load saved benchmarks.
 
+### Breaking Changes (Plugins)
+Woopsie, didn't wanna do any of these in 1.x, sorry but there's good reason :(
+
+* Formatters have lost access to benchmarking functions and the inputs, this is to enable huge memory and run time savings when using a lot of data. I also believe they should not be needed for formatters, please get in touch if this is a problem so we can work it out. In detail this means:
+  * Each `Benchee.Scenario` struct in a formatter will have `:function` and `:input` set to `nil`
+  * The `inputs` list in `Configuration` retains the input names, but the values will be set to `:scrubbed_see_1_3_0_changelog`. It may be completely scrubbed in the future, use the newly introduced `input_names` instead if you need easy access to all the input names at once.
+  * Technically speaking formatters haven't generally lost access, only if they are processed in parallel - so not if it's the only formatter or if it's used via a function (`formatters: [fn suite -> MyFormatter.output(suite) end]`). Still, this should not be used or relied upon.
+
+### Features (Plugins)
+* `Configuration` now has an `input_names` key that holds the names of all inputs, for the reasoning, see above.
+
 ### Features (Plugins)
 * `jit_enabled?` is exposed as part of the `suite.system` struct
 * Yes, `Benchee.System` is now a struct so feel easier about relying on the fields
diff --git a/lib/benchee/formatter.ex b/lib/benchee/formatter.ex
index 9c251c66..259ab5fa 100644
--- a/lib/benchee/formatter.ex
+++ b/lib/benchee/formatter.ex
@@ -20,12 +20,35 @@ defmodule Benchee.Formatter do
   """
   @type options :: any
 
+  @typedoc """
+  A suite scrubbed of heavy data.
+
+  Type to bring awareness to the fact that `format/2` doesn't have access to
+  _all_ data in `Benchee.Suite` - please read the docs for `format/2` to learn
+  more.
+  """
+  @type scrubbed_suite :: Suite.t()
+
   @doc """
+  Takes a suite and returns a representation `write/2` can use.
+
   Takes the suite and returns whatever representation the formatter wants to use
   to output that information. It is important that this function **needs to be
   pure** (aka have no side effects) as Benchee will run `format/1` functions
   of multiple formatters in parallel. The result will then be passed to
-  `write/1`.
+  `write/2`.
+
+  **Note:** Due to memory consumption issues in benchmarks with big inputs, the suite
+  passed to the formatters **is missing anything referencing big input data** to avoid
+  huge memory consumption and run time. Namely, this means:
+  * `Benchee.Scenario` will have `function` and `input` set to `nil`
+  * `Benchee.Configuration` will have `inputs`, but only with the names, not the values;
+    it may be removed in the future, so please use `input_names` instead if needed (or
+    `input_name` of `Benchee.Scenario`)
+
+  Technically speaking this "scrubbing" of `Benchee.Suite` only occurs when formatters
+  are run in parallel, but you still shouldn't rely on those values (and they should not
+  be needed). If you do need them for some reason, please get in touch/open an issue.
   """
   @callback format(Suite.t(), options) :: any
 
diff --git a/test/benchee/configuration_test.exs b/test/benchee/configuration_test.exs
index 43d8e898..f6eca5e4 100644
--- a/test/benchee/configuration_test.exs
+++ b/test/benchee/configuration_test.exs
@@ -41,6 +41,11 @@ defmodule Benchee.ConfigurationTest do
                init(inputs: %{"A" => 1, "B" => 2})
     end
 
+    test "input_names are normalized" do
+      assert %Suite{configuration: %{input_names: ["a"]}} =
+               init(inputs: %{a: 1})
+    end
+
     test "no inputs, no input_names" do
       assert %Suite{configuration: %{input_names: []}} = init()
     end