
Introduce new benchmark framework. #57

Merged
merged 4 commits into boostorg:develop from stefanseefeld:benchmark on Sep 28, 2018

Conversation

@stefanseefeld (Member) commented Aug 27, 2018

To make the benchmarks more useful, I propose the following functional requirements:

  • measure timing over a range of sizes, rather than for a single fixed parameter
  • split benchmarks into separate executables, to be able to collect timings per benchmark
  • allow generation of graphs by overlaying plots from different benchmarks (or benchmark runs) for easier (visual) comparison.
  • parametrize the code to more easily select specific (sets of) parameters, such as value-type, dimension ordering, etc.

This PR prototypes the above, using a small sample of operations (matrix/matrix and matrix/vector products).
Once approved, I'd like to convert the (other) existing benchmarks to follow the same pattern.

@stefanseefeld stefanseefeld self-assigned this Aug 27, 2018

@stefanseefeld stefanseefeld force-pushed the stefanseefeld:benchmark branch from f3a9fcd to 3407a38 Aug 29, 2018

@bassoy (Collaborator) left a comment

produces compile error 😩

Resolved review comment on benchmarks/mm_prod.cpp
@bassoy (Collaborator) commented Aug 30, 2018

I suggest building and applying continuous integration to the examples and benchmarks.

bassoy (Collaborator) approved these changes Aug 31, 2018 and left a detailed review:

#include <string>
#include <vector>

class benchmark

bassoy (Collaborator), Aug 31, 2018:
Shouldn't benchmark be an abstract class, since each new function_benchmark is supposed to inherit from benchmark?

stefanseefeld (Author, Member), Sep 4, 2018:
Yes, but why should benchmark be abstract? There are three virtual member functions. Arguably, operation(long) could be made pure virtual (there is no point in deriving a subclass without implementing it). The other two (setup() and teardown()) may not be needed, so a default no-op implementation seems good enough.
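For reference, a minimal self-contained sketch of a base class along these lines (hooks with no-op defaults, timing loop as in the diff excerpts quoted below); the name handling and output format are illustrative assumptions, not necessarily what this PR implements:

```cpp
#include <chrono>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

class benchmark
{
public:
    explicit benchmark(std::string name) : name_(std::move(name)) {}
    virtual ~benchmark() = default;

    // Hooks with no-op defaults; a concrete benchmark overrides operation()
    // and, only if needed, setup() and teardown().
    virtual void setup(long) {}
    virtual void operation(long) {}
    virtual void teardown() {}

    void run(std::vector<long> const &sizes, unsigned times = 1)
    {
        using clock = std::chrono::steady_clock;
        std::cout << "# " << name_ << " (ms)\n";   // metainfo line with the unit
        for (long s : sizes)
        {
            setup(s);
            auto start = clock::now();
            for (unsigned i = 0; i != times; ++i)
                operation(s);
            auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(clock::now() - start);
            teardown();
            std::cout << s << ' ' << duration.count() / times << '\n';
        }
    }

private:
    std::string name_;
};
```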

virtual void operation(long) {}
virtual void teardown() {}

void run(std::vector<long> const &sizes, unsigned times = 1)

bassoy (Collaborator), Aug 31, 2018:
  • Isn't std::vector<long> const &sizes forcing tensors and matrices to be square?
  • The default value of times should be set to 10

stefanseefeld (Author, Member), Sep 4, 2018:
Agreed on times. As to the single sizes parameter: I'm merely trying to provide a uniform way to iterate over a "size" parameter, to allow sweeping across a range of "sizes". What "size" means for a specific benchmark is still up to the implementer. It could mean the size of a square matrix, or it could be something else.
Given the abstract nature of benchmarks, I don't want to impose any specific meaning. But neither does it seem appropriate to have some benchmarks use a single "size" parameter and others two or even more.
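For example (a hypothetical snippet, not code from this PR), a rectangular matrix product can still be driven by the single size parameter by deriving its dimensions from it:

```cpp
#include <boost/numeric/ublas/matrix.hpp>
#include <algorithm>

namespace ublas = boost::numeric::ublas;

// Hypothetical body of a benchmark's operation(): the single "size" s is
// mapped to rectangular dimensions (a fixed 2:1 aspect ratio here), so a
// sweep over s does not imply square matrices.
void rectangular_prod(long s)
{
    ublas::matrix<float> A(2 * s, s), B(s, 2 * s);
    std::fill(A.data().begin(), A.data().end(), 1.f);
    std::fill(B.data().begin(), B.data().end(), 2.f);
    ublas::matrix<float> C = ublas::prod(A, B);
    (void)C;
}
```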

operation(s);
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(clock::now() - start);
teardown();
std::cout << s << ' ' << duration.count()/times << std::endl;

bassoy (Collaborator), Aug 31, 2018:
  • Maybe we could add a std::ostream member variable to benchmark instead of forcing the results to go to standard output (std::cout).
  • It would be very nice to also output the units (here milliseconds).

stefanseefeld (Author, Member), Sep 4, 2018:
I'm not sure the ability to redirect to an arbitrary std::ostream would add any useful functionality. Unix pipes to the rescue.
Yes, the output (in particular the metainfo block) should contain more info, including units.

auto start = clock::now();
for (unsigned i = 0; i != times; ++i)
operation(s);
auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(clock::now() - start);

bassoy (Collaborator), Aug 31, 2018:

Maybe we could also parameterize benchmark with respect to the time unit?

stefanseefeld (Author, Member), Sep 4, 2018:
Time unit? I doubt that's useful. Clock type? Perhaps.

bassoy (Collaborator), Sep 4, 2018:
Well, people might want to set the accuracy and choose the time unit, e.g. std::chrono::milliseconds or std::chrono::nanoseconds.
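One way to allow that (a sketch only, not part of this PR) is to make the duration type a template parameter of the timing code, so the caller picks std::chrono::milliseconds, std::chrono::microseconds or std::chrono::nanoseconds:

```cpp
#include <chrono>

// Sketch: the measurement resolution is a std::chrono duration type chosen
// by the caller rather than being hard-wired to milliseconds.
template <typename Duration = std::chrono::milliseconds, typename F>
typename Duration::rep time_op(F &&f, unsigned times = 1)
{
    using clock = std::chrono::steady_clock;
    auto start = clock::now();
    for (unsigned i = 0; i != times; ++i)
        f();
    return std::chrono::duration_cast<Duration>(clock::now() - start).count() / times;
}

// Usage: time_op<std::chrono::nanoseconds>([]{ /* operation under test */ });
```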

@@ -0,0 +1,28 @@
#include <boost/numeric/ublas/vector.hpp>

bassoy (Collaborator), Aug 31, 2018:

I would prefer the initialization functions to be defined within the benchmark functions (see class prod) instead of putting them into an init.hpp. That lets benchmark writers initialize their data however they want.

stefanseefeld (Author, Member), Sep 4, 2018:
Hmm, perhaps. Note that you (the benchmark implementer) are free to use the init() functions or not. The reason I didn't put them into the benchmark class is that it's much harder (if not impossible) to overload and template-specialize if the base template is defined in the benchmark base class itself. But I remain open to suggestions.

bassoy (Collaborator), Sep 4, 2018:
I think it is possible to simply put the init function into the benchmark function so that the initialization becomes private to each benchmark. Adding different types of initialization to init (used by different benchmark functions) in separate files results in more dependencies between files and functions, and it will be harder to maintain as the number of benchmarks increases. While I like free functions, in this case the modularity becomes an issue, I think. One might argue that code should not be repeated (DRY), but the initialization of tensors/matrices/vectors for benchmarks is usually a one-liner using std::iota, std::fill or std::generate, and it can be specialized very easily. No template specialization needed.
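For illustration (a hypothetical snippet along those lines, not code from this PR), such an initialization can live directly inside the benchmark that needs it:

```cpp
#include <boost/numeric/ublas/matrix.hpp>
#include <numeric>

namespace ublas = boost::numeric::ublas;

// Hypothetical in-benchmark setup: the data is filled with a one-liner
// (std::iota here), so no shared init.hpp is required.
void setup_square(ublas::matrix<float> &A, long s)
{
    A.resize(s, s);
    std::iota(A.data().begin(), A.data().end(), 1.f);
}
```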



template <typename T>
void init(ublas::vector<T> &v, unsigned long size, int max_value)

bassoy (Collaborator), Aug 31, 2018:

Please consider parametrizing init with all template parameters of ublas::vector<T> or ublas::matrix<T,L>, i.e. including the storage array. Otherwise I cannot init, and therefore cannot benchmark, ublas::vector<T, std::vector<T,A>>.

stefanseefeld (Author, Member), Sep 4, 2018:
Good point. We should do that!
(Note that we can't specialize these functions. But we can overload them with additional argument types, such as matrices with specific dimension ordering.)
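A sketch of what that could look like (assuming the free-function init() style from this PR; the random generator used here is just an example):

```cpp
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/vector.hpp>
#include <algorithm>
#include <cstdlib>

namespace ublas = boost::numeric::ublas;

// Sketch: init() parametrized on the full template parameter lists (value
// type, layout, storage array), so e.g. ublas::vector<T, std::vector<T>>
// can be initialized and benchmarked as well.
template <typename T, typename A>
void init(ublas::vector<T, A> &v, unsigned long size, int max_value)
{
    v.resize(size);
    std::generate(v.begin(), v.end(), [max_value] { return T(std::rand() % max_value); });
}

template <typename T, typename L, typename A>
void init(ublas::matrix<T, L, A> &m, unsigned long size, int max_value)
{
    m.resize(size, size);
    std::generate(m.data().begin(), m.data().end(), [max_value] { return T(std::rand() % max_value); });
}
```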

p.run(std::vector<long>({1, 2, 4, 8, 16, 32, 64, 128, 256, 512}));//, 1024}));
}

int main(int argc, char **argv)

bassoy (Collaborator), Aug 31, 2018:

I am not sure whether the benchmarks should be runtime-configurable in terms of the data type rather than in terms of the vector, matrix and tensor sizes.

We might need a more generic approach where, from one main, we can select which function to benchmark, and maybe also the type.

stefanseefeld (Author, Member), Sep 4, 2018:
I'm not sure I understand what you have in mind. Can you elaborate?

bassoy (Collaborator), Sep 4, 2018:
The end-user is only interested in the results of the benchmarks; only contributors are interested in adding benchmarks. I would prefer all benchmarks to be called from one main, similar to unit tests. See my gist example.

stefanseefeld (Author, Member), Sep 28, 2018:
Let me come back to this (sorry for the high latency):
I disagree with the distinction between "contributors" and "end-users". I expect benchmarks to be of most value during development, where someone may want to run (and compare!) a specific set of benchmarks.
And to support that, I need benchmarks to be runnable individually, not en bloc.
(In fact, I'd argue the same way for tests: it's good to have a way to run an entire test suite with a single command, but ultimately it must be possible to invoke individual tests, too, especially if they fail and you want to investigate why.)

p.run(std::vector<long>({1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096}));
}

int main(int argc, char **argv)

bassoy (Collaborator), Aug 31, 2018:

I think too many of the same code lines are in mm_prod. Do we then have to repeat all those lines for every new benchmark? Let's try to stick to DRY.

stefanseefeld (Author, Member), Sep 4, 2018:
Please elaborate.

bassoy (Collaborator), Sep 4, 2018:
The main functions of mm_prod and mv_prod are almost the same. I think we can do better by not repeating the code (DRY). See my gist example.
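One way to avoid the repetition (a sketch only; the gist referenced above may propose something different) is a small shared driver that each benchmark's main() forwards to:

```cpp
#include <vector>

// Sketch: a shared driver so the size list and run() call are written once;
// each benchmark executable's main() just instantiates its benchmark type.
template <typename Benchmark>
int run_main(int /*argc*/, char ** /*argv*/)
{
    Benchmark b;
    b.run(std::vector<long>({1, 2, 4, 8, 16, 32, 64, 128, 256, 512}), 10);
    return 0;
}

// e.g. in mm_prod.cpp:
// int main(int argc, char **argv) { return run_main<mm_prod_benchmark>(argc, argv); }
```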


template <typename S> class prod;

template <typename R, typename O1, typename O2>

bassoy (Collaborator), Aug 31, 2018:

I would like to see a different prod class for each function to be benchmarked (matrix-matrix, matrix-vector, vector-matrix), because of the initialization of the data structures.

stefanseefeld (Author, Member), Sep 4, 2018:
I'd like the benchmark name to exactly match the operation name (and likewise the signature).
Why can't you do what you have in mind (use different initialization functions) with template specializations?
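A sketch of what that might look like (assuming prod is specialized on an operation signature, as the template <typename S> class prod; declaration quoted above suggests; the member names are illustrative):

```cpp
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/vector.hpp>

namespace ublas = boost::numeric::ublas;

template <typename S> class prod;   // primary template, as declared in the PR

// Hypothetical partial specialization for the matrix-vector product: it can
// set up its operands differently from the matrix-matrix case while keeping
// the same benchmark name and signature-based selection.
template <typename T>
class prod<ublas::vector<T>(ublas::matrix<T>, ublas::vector<T>)>
{
public:
    void setup(long s) { A.resize(s, s); x.resize(s); }
    void operation(long) { y = ublas::prod(A, x); }
    void teardown() {}

private:
    ublas::matrix<T> A;
    ublas::vector<T> x, y;
};
```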

bassoy (Collaborator), Sep 4, 2018:
Yes, template specialization would help. But why not simplify things and put the initialization and the call of the function into corresponding mv_prod and mm_prod classes and files? We would then have only two files with their corresponding initializations, where the responsible functions live in one file rather than being distributed. The prod.hpp and init.hpp files would become obsolete then.

stefanseefeld (Author, Member), Sep 28, 2018:
I expect that a few initialization functions are going to be reused across most benchmarks. And if not, specific benchmarks may pick specific initialization functions. The two are rather independent, so I chose not to bind the init functions to the benchmark classes themselves.

@stefanseefeld stefanseefeld force-pushed the stefanseefeld:benchmark branch 4 times, most recently from b8fdfaf to f23dee7 Sep 1, 2018

@stefanseefeld stefanseefeld force-pushed the stefanseefeld:benchmark branch from f23dee7 to fde7ab7 Sep 14, 2018

@stefanseefeld stefanseefeld force-pushed the stefanseefeld:benchmark branch 2 times, most recently from 63101de to 4d50031 Sep 27, 2018

@stefanseefeld stefanseefeld force-pushed the stefanseefeld:benchmark branch from 4d50031 to 9dfd60b Sep 28, 2018

@stefanseefeld stefanseefeld merged commit 44e7d7d into boostorg:develop Sep 28, 2018

2 checks passed:

continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed