Optimizer control primitives #57

Merged
rmartinho merged 7 commits into devel from feature/51_optimizer_control
Sep 1, 2016

@rmartinho (Collaborator) commented Aug 8, 2016

This adds two primitives for a tiny bit of control over the optimizer.

keep_memory(p); prevents the optimizer from optimizing away writes to *p that precede it and reads of *p that follow it. This is the "escape" mentioned in #51.

keep_memory(); does the same for all memory. It is currently not available for MSVC (any help with that is appreciated). This is the "clobber" mentioned in #51.
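
For readers unfamiliar with the "escape" and "clobber" idioms, here is a minimal sketch of how such primitives are commonly written with GCC/Clang extended inline assembly. This is an assumption about the general technique, not necessarily the code merged in this PR, and fill_and_sum is a hypothetical helper added only for illustration.

```cpp
// Sketch only: the bodies below are an assumption (GCC/Clang inline assembly),
// not necessarily what this PR implements.
#if !defined(__GNUC__)
#error "This sketch relies on GCC/Clang extended inline assembly."
#endif

// "Escape": the empty asm statement receives p as an input and declares a
// memory clobber, so the compiler must assume *p may be read or written here.
template <typename T>
inline void keep_memory(T* p) {
    asm volatile("" : : "g"(p) : "memory");
}

// "Clobber": no inputs, only the memory clobber, so the compiler must assume
// that any memory reachable from outside may have been read or written.
inline void keep_memory() {
    asm volatile("" : : : "memory");
}

// Hypothetical usage: the stores into buffer cannot be treated as dead,
// and the loads after the barriers cannot be folded into constants.
int fill_and_sum(int n) {
    int buffer[16] = {};
    for (int i = 0; i < 16; ++i) {
        buffer[i] = n + i;
    }
    keep_memory(buffer);  // writes that precede this call are kept
    keep_memory();        // all memory is considered touched
    int sum = 0;
    for (int i = 0; i < 16; ++i) {
        sum += buffer[i];
    }
    return sum;
}
```

The barriers do not change program results; they only restrict what the optimizer may assume, which is why fill_and_sum still returns the plain arithmetic sum.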


detail::optimizer_barrier();
(*this)(chronometer(model, plan.iterations_per_sample));
detail::optimizer_barrier();

@arximboldi (Contributor) commented Aug 9, 2016

I would suggest putting the optimizer_barrier around the function call to the concrete benchmark, instead of the whole measurement. That way we can also use the return value for the deoptimization, which would make the trick compatible with MSVC and with what the documentation suggests.

// Escape the value so the optimizer cannot discard the computation that produced it.
template <typename T>
void deoptimize_value(T&& x) {
    keep_memory(&x);
}

// Overload for callables that return a value: the result is escaped.
template <typename Fn, typename... Args>
auto invoke_deoptimized(Fn&& fn, Args&&... args) -> std::enable_if_t<!std::is_same<void, std::result_of_t<Fn(Args...)>>{}> {
    deoptimize_value(std::forward<Fn>(fn)(std::forward<Args>(args)...));
}

// Overload for callables that return void: there is no result to escape.
template <typename Fn, typename... Args>
auto invoke_deoptimized(Fn&& fn, Args&&... args) -> std::enable_if_t<std::is_same<void, std::result_of_t<Fn(Args...)>>{}> {
    std::forward<Fn>(fn)(std::forward<Args>(args)...);
    // maybe call optimizer_barrier() ??
}

And then in chronometer::measure:

        template <typename Fun>
        void measure(Fun&& fun, std::false_type) {
            measure([&fun](int) { return fun(); }); // added return !
        }

        template <typename Fun>
        void measure(Fun&& fun, std::true_type) {
            impl->start();
            for(int i = 0; i < k; ++i) 
                invoke_deoptimized(fun, i);
            impl->finish();
        }

I did not try the code and it is C++14, but you get the idea... What do you think?
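
Since the snippet above was posted untested, here is a self-contained C++14 sketch of the same idea that should compile on GCC/Clang. It swaps the enable_if overloads for tag dispatch on std::is_void, which is a substitution on my part rather than what was proposed. The sketch namespace, invoke_deoptimized_impl, run_k_times, and the inline-assembly body of keep_memory are all hypothetical stand-ins, not the library's actual code.

```cpp
#include <type_traits>
#include <utility>

namespace sketch {

// Assumed stand-in for the PR's keep_memory(p); GCC/Clang only.
template <typename T>
inline void keep_memory(T* p) {
    asm volatile("" : : "g"(p) : "memory");
}

// Non-void result: bind it to a local and let its address escape.
template <typename Fn, typename... Args>
void invoke_deoptimized_impl(std::false_type, Fn&& fn, Args&&... args) {
    auto&& result = std::forward<Fn>(fn)(std::forward<Args>(args)...);
    keep_memory(&result);
}

// void result: there is nothing to escape, so just make the call.
template <typename Fn, typename... Args>
void invoke_deoptimized_impl(std::true_type, Fn&& fn, Args&&... args) {
    std::forward<Fn>(fn)(std::forward<Args>(args)...);
}

// Dispatch on whether the call expression has type void.
template <typename Fn, typename... Args>
void invoke_deoptimized(Fn&& fn, Args&&... args) {
    using result_type =
        decltype(std::declval<Fn>()(std::declval<Args>()...));
    invoke_deoptimized_impl(std::is_void<result_type>{},
                            std::forward<Fn>(fn),
                            std::forward<Args>(args)...);
}

// Hypothetical stand-in for the loop in chronometer::measure, without timing.
// Returns the number of calls made.
template <typename Fun>
int run_k_times(Fun&& fun, int k) {
    int calls = 0;
    for (int i = 0; i < k; ++i) {
        invoke_deoptimized(fun, i);
        ++calls;
    }
    return calls;
}

} // namespace sketch
```

Tag dispatch keeps the two cases in separately named helpers, which avoids repeating the long enable_if condition in both signatures; the behaviour is intended to match the two overloads sketched above.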

@arximboldi (Contributor) commented

Cool! In general I like the approach and the technique. The only thing I am not sure about is the best place and way to put the barrier.

@rmartinho rmartinho merged commit 4c4b970 into devel Sep 1, 2016