-
-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid redundant computations in IRR calculation #60
Conversation
The IRR computation adds and then subtract the previous iteration value before comparing to the tolerance. We can just compute the delta and compare that to the tolerance instead. This should also make the computation more robust if there is a large difference in magnitude between `x` and `delta`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good find! I'm going to do some digging on why it's failing the test - it should be an identical result. I'm not quite sure where the errors are being introduced from.
If |
I haven't looked in detail at the tests, but this is all but guaranteed to give you numerical problems:
|
I agree, however, we went from that test passing to it failing. I'd like to investigate what those numerical problems are. Looking at the original issue #15 it looks like it was an issue with the old irr solver, not with the most recent newton's method solver. |
I think so. For practical purposes the cashflow in the test is basically "all the initial payout is lost" so that would represent a loss of 100% (which doesn't have a properly defined behaviour under the assumption of "reinvesting the profits" that IRR is based at). So a solution to that is "something close to -1" which seems to be right by the intended test. In the IRR formula, when |
A potential solution is to use a relative tolerance instead of an absolute one, but we are dealing with a very edge case here. Alternatively, we can have a breakout that returns -1 if |
I've been doing some further digging on this, and found out that, as suspected, the IRR algorithm is NOT converging for the test case (neither for the original nor for my proposed change). This can be seen by inserting a -12538172285.538937
1053069465.3266779
989220413.1617088
928321332.590804
870269688.6044048
814967045.1286157
762319263.6737074
712236751.376529
664634679.6006484
619432894.2146667
576554675.9649655
535921974.7416782
497440630.96024746
460958230.0502432
426148959.4876158
392208772.1405912
357083216.30890393
315714469.10958934
257360879.981704
168611272.6382977
64440716.36582911
7770992.814447409
100879.09078333691
16.736887134227963
4.711100328574186e-07
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
...
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08
-4.2828184805219944e-08 so, the iterations get exhausted without reaching the set tolerance of 1e-12. With the current algo, the floating point error on I found the easiest way to fix this is by setting a better guess. Simply doing something like this: inflow = sum(x for x in values if x > 0)
outflow = -sum(x for x in values if x < 0)
guess = 0.1 if inflow > outflow else inflow / outflow - 1 fixes the convergence problem (by setting an initial guess that is negative if the outflows are larger than the inflows): -0.9999999999202436
4.407969309258772e-12
4.208982322158506e-12
4.017399791083582e-12
3.832991104009405e-12
3.6555317783074664e-12
3.4848033205951103e-12
3.3205930897771546e-12
3.16269416317344e-12
3.0109052056699856e-12
2.8650303418619812e-12
2.7248790311801247e-12
2.5902659460126176e-12
2.461010852855957e-12
2.3369384965514225e-12
2.2178784876925083e-12
2.1036651933242854e-12
1.9941376311010488e-12
1.8891393671276444e-12
1.7885184177871862e-12
1.6921271559599094e-12
1.5998222221745013e-12
1.5114644414171902e-12
1.4269187465742507e-12
1.346054109828449e-12
1.2687434838096503e-12
1.1948637549757807e-12
1.1242957126649819e-12
1.0569240386559779e-12
9.92637324125317e-13 Would it be acceptable to change the default value of |
Actually, setting the guess to |
Thanks for taking the time to investigate. I think we're getting somewhere.
I see. Let's consider this a bug.
Please give me some time to experiment with it myself, but, I like the idea. The current version seems to have some funny results at times, having a better guess could help with that.
When I rewrote IRR, I wanted to match what Google Sheets/Excel had. Is that a good reason? Probably not. We haven't made a release of this version of the IRR, so I'm happy to change it and get it right come release time. |
inflow = sum(x for x in values if x > 0)
outflow = -sum(x for x in values if x < 0)
guess = inflow / outflow - 1 I've played around with this a bit. It seems to work well for me too. Would you like to implement it and see if that helps as part of this PR? I guess this could be tided up/optimised a bit by using NumPy functions, but I don't see this as being the bottleneck anyway. E.g. something like this (untested code) positive_cashflow = values > 0
inflow = values.sum(where=positive_cashflow)
outflow = -values.sum(where=~positive_cashflow) |
Sure, I will update the PR with the changes. While working at this I also thought of getting rid of the denominators altogether, by using |
Committed the proposed changes. Even if performance is slightly hurt in some cases I believe it makes sense to use the gain |
Sorry for taking so long to respond, I've been pretty bust lately – thank you for being patient. I really like what I'm seeing here. The PR looks good, works in theory, and works in practice. In addition, the CI is green.
I'm -0.5 on this. numpy-financial is on the cusp of not being maintained. I think this will add to the maintenance burden for any future contributors. In addition, it is supposed to support simple financial functions. So I think it will be fine to have only one method that works in the general case. |
Makes sense. Then my suggestion would be to use the denominator-free expression, the rationale being that if there is a single method to compute something, then numerical stability takes precedence over pure performance. I also believe the trick of getting rid of the denominators is standard enough that maintainers/contributors shouldn't be baffled by it |
I agree. I think the numerical stability trumps a minor performance gain. Is there anything else you would like to add to this P.R.? I'm happy to merge. |
Cool! I am happy with the current PR state, feel free to merge! |
Merged. Thanks @jlopezpena for taking the time to contribute |
I worry about the logic that this PR adds for the initial guess. Consider a bond with coupon rate
The IRR of the above is But the formula for the initial guess approximates it as See numpy-financial/numpy_financial/_financial.py Line 763 in fb63b04
|
The IRR computation adds and then subtract the previous iteration value before comparing to the tolerance. We can just compute the delta and compare that to the tolerance instead. This should also make the computation more robust if there is a large difference in magnitude between
x
anddelta