Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make functions that do not propagate derivatives return prim types #2473

Closed
wants to merge 4 commits into from

Conversation

t4c1
Copy link
Contributor

@t4c1 t4c1 commented Apr 16, 2021

Summary

Make functions that do not propagate derivatives (ceil, floor, round, trunc) return prim types. This simplifies the code. Working with less autodiff variables could also be faster.

Tests

No changes. This is just a refactor.

Side Effects

None.

Release notes

Functions that do not propagate derivatives (ceil, floor, round, trunc) now return prim types.

Checklist

  • Math issue #(issue number)

  • Copyright holder: Tadej Ciglarič

    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass, (make test-headers)
    • dependencies checks pass, (make test-math-dependencies)
    • docs build, (make doxygen)
    • code passes the built in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@SteveBronder
Copy link
Collaborator

Wouldn't this break?

var x(1.0);
auto y = x + x;
auto z = ceil(y);
z.grad();

@t4c1
Copy link
Contributor Author

t4c1 commented Apr 19, 2021

The exact code you wrote would not work. But if you need to calculate gradient of a variable you should specify its type as var. Than double gets converted and it works fine.

@SteveBronder
Copy link
Collaborator

Personally idt it's a good idea for us to break code that works in user space currently. In the meta I think this also goes against the idea of how we promote to the least upper bound of our scalar types

@bob-carpenter
Copy link
Contributor

Personally idt it's a good idea for us to break code that works in user space currently

This will definitely break backward compatibility in the math lib by changing the return type.

Otherwise, it's safe in that no meaningful derivatives get lost. So re-promoting as @t4c1 suggests is sound.

Functions like ceil() are differentiable as written, it's just that derivatives are 0 at points where they're well defined. What does the ceil() function currently return for derivatives at integer-valued real inputs where the function's discontinuous?

@t4c1
Copy link
Contributor Author

t4c1 commented Apr 20, 2021

What does the ceil() function currently return for derivatives at integer-valued real inputs where the function's discontinuous?

They return 0.

@stan-buildbot
Copy link
Contributor


Name Old Result New Result Ratio Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan 3.41 3.4 1.0 0.29% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.01 1.37% faster
eight_schools/eight_schools.stan 0.11 0.11 1.01 0.52% faster
gp_regr/gp_regr.stan 0.16 0.16 0.96 -3.69% slower
irt_2pl/irt_2pl.stan 6.02 6.09 0.99 -1.13% slower
performance.compilation 92.1 89.53 1.03 2.8% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.88 8.74 1.02 1.49% faster
pkpd/one_comp_mm_elim_abs.stan 31.42 28.91 1.09 7.99% faster
sir/sir.stan 122.08 124.47 0.98 -1.96% slower
gp_regr/gen_gp_data.stan 0.03 0.04 0.95 -5.11% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 3.02 3.1 0.97 -2.7% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.39 0.43 0.91 -10.15% slower
arK/arK.stan 1.88 1.83 1.02 2.36% faster
arma/arma.stan 0.74 0.74 1.0 -0.02% slower
garch/garch.stan 0.57 0.57 1.0 0.07% faster
Mean result: 0.996309950158

Jenkins Console Log
Blue Ocean
Commit hash: 773771f


Machine information ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@t4c1
Copy link
Contributor Author

t4c1 commented May 3, 2021

This PR has been sitting here for a while. Given that there were some disagreements maybe we should vote on whether to merge it or not?

@wds15
Copy link
Contributor

wds15 commented May 3, 2021

Is it an option to discuss this at the monthly math meetings?

@t4c1
Copy link
Contributor Author

t4c1 commented May 3, 2021

Sure. When is the next meeting?

@SteveBronder
Copy link
Collaborator

The next one is scheduled on the 20th, but we could also just talk about this after the regular Stan meeting this week

@t4c1
Copy link
Contributor Author

t4c1 commented May 4, 2021

Either is fine by me. Just tell me when do you want me to attend.

@SteveBronder
Copy link
Collaborator

Let's put it on the agenda for this week's Stan meeting on Thursday

@wds15
Copy link
Contributor

wds15 commented May 4, 2021

I am not sure the Stan meeting is best to resolve right away, but we can try to settle on a path how to resolve.

@t4c1
Copy link
Contributor Author

t4c1 commented May 20, 2021

So on today's meeting we can talk about this.

@wds15
Copy link
Contributor

wds15 commented May 20, 2021

I just saw that I have a conflict today and can't join the meeting. My view is that it is a massive change given that the type changes between inputs and outputs. I do see the potential performance benefit of this and it does makes sense conceptually. One can also argue that the new C++ world is anyway more "auto" style such that this is not an issue. In short I am willing to go down this route, but I would require at least a major version bump.... and I still have my worries with such a change for backward compatibility worries.

@syclik
Copy link
Member

syclik commented Oct 21, 2021

We discussed the pros and cons of this change on May 20, 2021. See brief notes:

https://discourse.mc-stan.org/t/monthly-math-development-meeting-05-20-2021-10-am-edt/22590/9?u=syclik

This is going to introduce a breaking change to the Math library. I added a new label called breaking change and I'll label this and also mark it as a draft so it's not on our active queue of PRs to evaluate.

@syclik syclik added the breaking change Requires a breaking change to the Math library label Oct 21, 2021
@syclik syclik marked this pull request as draft October 21, 2021 14:23
@bob-carpenter
Copy link
Contributor

I don't see what's breaking about this change. Things like the following still work, because double can be promoted to var:

var x = 10.3;
var y = ceil(x);

@syclik wrote on Discourse,

the use of auto could break since a function that returns a var now could return double instead.

Is there a concrete example you're worried about? I don't know what you meant by "break" here. Sure, if we replaced the above with

auto y = ceil(x);

then we're changing the type of y. But I don't understand what will break. If there is an overloaded function f defined for both var and double, we'd want to call the double version, not the var version. I think that's the whole point---cut down on needless autodiff.

And how is this different than the changes that led to matrix functions returning expression templates and accepting them as arguments?

The other downside is that it changes the design of the Math functions… right now, the return type is the always promoted type (almost always now with the Eigen expressions). But with this change, a developer would need to think of when it is not.

Why is making an improved design a downside? This seems similar to where a function designer now needs to think about whether to return a base matrix or a template expression. And it seems to "break" auto in the same way by changing the return type to something that can be assigned to the original type but isn't the original type.

I don't care that much about this issue, but there are 20+ open PRs in math and this one's nearly a year old. So we should either close it or merge it.

@syclik
Copy link
Member

syclik commented Apr 29, 2022

If I recall, the problem we'd have is not with the scalar functions but with containers. For example, if there was code that wrote:

std::vector<var> x = foo(...);

where foo(...) used to return std::vector<var>, but now it returns std::vector<double>, then I believe it'll break. The fix is simple:

  1. use auto
  2. wrap the return in a different function to promote it.

I'm closing the PR.

@syclik syclik closed this Apr 29, 2022
@bob-carpenter
Copy link
Contributor

Thanks, @syclik---just what I was looking for. It totally breaks in C++.

P.S. It'd work in Stan, but that's because @WardBrian updated so containers are covariant in Stan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Requires a breaking change to the Math library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants