New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP:jitting Newton #409

Merged
merged 4 commits into from Jul 18, 2018

Conversation

2 participants
@nikita-astronaut
Copy link
Contributor

nikita-astronaut commented Jul 16, 2018

Hello, @Juanlu001

I jist copied the scipy Newton into our angles.py file and everything worked nice. But as I added @jit everywhere, it screwed units and now test_angles.py miserably fails. I investigated but seems like your help is needed, as you know numba better...

@Juanlu001

This comment has been minimized.

Copy link
Member

Juanlu001 commented Jul 16, 2018

(First of all: PEP8 issues!)

There are two problems with this approach, both related to numba limitations:

  1. Writing our own newton method and passing a function to optimize will be awfully slow because of numba/numba#2952
  2. numba functions won't respect array subclasses, and that includes astropy.units.Unit.

There are no easy solutions for either:

  1. We can't have a separate newton function, so we will have to hardcode the iteration, like it's done here:

@jit
def _halley(p0, T0, ll, tol, maxiter):
"""Find a minimum of time of flight equation using the Halley method.
Note
----
This function is private because it assumes a calling convention specific to
this module and is not really reusable.
"""
for ii in range(maxiter):
y = _compute_y(p0, ll)
fder = _tof_equation_p(p0, y, T0, ll)
fder2 = _tof_equation_p2(p0, y, T0, fder, ll)
if fder2 == 0:
raise RuntimeError("Derivative was zero")
fder3 = _tof_equation_p3(p0, y, T0, fder, fder2, ll)
# Halley step (cubic)
p = p0 - 2 * fder * fder2 / (2 * fder2 ** 2 - fder * fder3)
if abs(p - p0) < tol:
return p
p0 = p
raise RuntimeError("Failed to converge")
@jit
def _householder(p0, T0, ll, M, tol, maxiter):
"""Find a zero of time of flight equation using the Householder method.
Note
----
This function is private because it assumes a calling convention specific to
this module and is not really reusable.
"""
for ii in range(maxiter):
y = _compute_y(p0, ll)
fval = _tof_equation(p0, y, T0, ll, M)
T = fval + T0
fder = _tof_equation_p(p0, y, T, ll)
fder2 = _tof_equation_p2(p0, y, T, fder, ll)
fder3 = _tof_equation_p3(p0, y, T, fder, fder2, ll)
# Householder step (quartic)
p = p0 - fval * ((fder ** 2 - fval * fder2 / 2) /
(fder * (fder ** 2 - fval * fder2) + fder3 * fval ** 2 / 6))
if abs(p - p0) < tol:
return p
p0 = p
raise RuntimeError("Failed to converge")

  1. As done with norm and other functions, we will need a "fast" version and a high level version for everything to work.

Alternatives:

The first three are extremely complex and the last is not an option 😜

@Juanlu001

This comment has been minimized.

Copy link
Member

Juanlu001 commented Jul 16, 2018

What do you think about this idea?

In [29]: def accel(func):
    ...:     func_fast = jit(func)
    ...:     def wrapper(angle, ecc):
    ...:         return func_fast(angle.to(u.rad).value, ecc) * u.rad
    ...: 
    ...:     return wrapper
    ...: 

In [30]: @accel
    ...: def nu_to_E(nu, ecc):
    ...:     return 2 * np.arctan(np.sqrt((1 - ecc) / (1 + ecc)) * np.tan(nu / 2))
    ...: 

In [31]: nu_to_E((1 * u.rad).to(u.deg), 0.1 * u.one)
Out[31]: <Quantity 0.91791204 rad>

In [32]: nu_to_E((1 * u.rad), 0.1 * u.one)
Out[32]: <Quantity 0.91791204 rad>

In [33]: nu_to_E((1 * u.rad), 0.1)
Out[33]: <Quantity 0.91791204 rad>

In [34]: nu_to_E(1, 0.1)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-34-349da992ed75> in <module>()
----> 1 nu_to_E(1, 0.1)

<ipython-input-29-d2463f20b6c7> in wrapper(angle, ecc)
      2     func_fast = jit(func)
      3     def wrapper(angle, ecc):
----> 4         return func_fast(angle.to(u.rad).value, ecc) * u.rad
      5 
      6     return wrapper

AttributeError: 'int' object has no attribute 'to'

In [35]: from poliastro.twobody.angles import nu_to_E as nu_to_E_slow

In [36]: %timeit nu_to_E_slow((1 * u.rad).to(u.deg), 0.1 * u.one)
239 µs ± 12.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [37]: %timeit nu_to_E((1 * u.rad).to(u.deg), 0.1 * u.one)
56.6 µs ± 840 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Pros: We get an order of magnitude improvement.
Cons: The function now requires units, does not work with scalars.

nikita-astronaut added some commits Jul 18, 2018

@codecov

This comment has been minimized.

Copy link

codecov bot commented Jul 18, 2018

Codecov Report

Merging #409 into master will increase coverage by 0.08%.
The diff coverage is 96%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #409      +/-   ##
==========================================
+ Coverage   85.22%   85.31%   +0.08%     
==========================================
  Files          39       39              
  Lines        1922     1947      +25     
  Branches      143      149       +6     
==========================================
+ Hits         1638     1661      +23     
- Misses        248      249       +1     
- Partials       36       37       +1
Impacted Files Coverage Δ
src/poliastro/jit.py 100% <100%> (ø) ⬆️
src/poliastro/twobody/angles.py 98.3% <95%> (-1.7%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 946a462...9740e4a. Read the comment docs.

@Juanlu001

This comment has been minimized.

Copy link
Member

Juanlu001 commented Jul 18, 2018

These are the results:

$ asv run 9740e4a^!
[...]
$ asv compare 19ed5ff 9740e4a

All benchmarks:

       before           after         ratio
     [19ed5ff ]       [9740e4a ]
-      69.7±0.2ms       56.7±0.1ms     0.81  examples.time_propagate_iss_one_period

So now propagate(method=mean_motion) is a 20 % faster! 🎉

However, it's still going to be > 2x slower than kepler:

$ asv compare d32f3ab8 9740e4a

All benchmarks:

       before           after         ratio
     [d32f3ab8]       [9740e4a ]
+     24.3±0.06ms       56.7±0.1ms     2.33  examples.time_propagate_iss_one_period

So @nikita-astronaut I give you two options:

  • You spend some more time playing with the new numba inspect_types(pretty=True) to see if there's something that you can optimize a bit more.
  • You say "I'll optimize this further in a separate pull request", and I'll do a review and after that we can merge.
@nikita-astronaut

This comment has been minimized.

Copy link
Contributor Author

nikita-astronaut commented Jul 18, 2018

Let's do it in the other PR because I also want to speed up forward functions (the ones that don't use Newton)

@Juanlu001
Copy link
Member

Juanlu001 left a comment

Only one question left: it seems that adding @jit to the direct functions would be trivial. Any reason not to do it now? Perhaps other tests breaking?

@Juanlu001

This comment has been minimized.

Copy link
Member

Juanlu001 commented Jul 18, 2018

OK, I'm merging this as an starting point, but I have some reservations. In particular, it seems that the wrapper returned by accel_angles cannot be used inside a numba function, which stops us from jitting more functions above it. Perhaps the approach of trying to work well with everything (i.e. hasattr(angle, "unit")) is the wrong approach, and we have to try something like this:

@jit
def M_to_nu_fast(M, ecc, delta=1e-2):
    ...

M_to_nu = _wrap_units(M_to_nu_fast)

@jit
def M_to_E_fast(M, ecc):
    ...

M_to_E = _wrap_units(M_to_E_fast)

...

def _wrap_units(func):  # inside angles.py
    @wraps(func)
    def wrapper(angle, ecc, **kwargs):
        return func(angle.to(u.rad).value, ecc.value, **kwargs)) * u.rad

And the _fast version should be used inside numba code.

@Juanlu001 Juanlu001 merged commit 533b429 into poliastro:master Jul 18, 2018

9 of 10 checks passed

codeclimate 1 issue to fix
Details
ci/circleci: coverage Your tests passed on CircleCI!
Details
ci/circleci: docs Your tests passed on CircleCI!
Details
ci/circleci: quality Your tests passed on CircleCI!
Details
ci/circleci: test_py35 Your tests passed on CircleCI!
Details
ci/circleci: test_py36 Your tests passed on CircleCI!
Details
codecov/patch 96% of diff hit (target 85.22%)
Details
codecov/project 85.31% (+0.08%) compared to 946a462
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@nikita-astronaut nikita-astronaut deleted the nikita-astronaut:jitting_newton branch Jul 18, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment