ENH: Add PDF, CDF and parameter estimation for Stable Distributions #7374

Merged: 35 commits, merged into scipy:master on Jun 16, 2018

Conversation

@bsdz (Contributor) commented May 3, 2017

Add Levy stable parameter estimation using the McCulloch (1986) quantiles method.
Add PDF and CDF calculation using Zolotarev's method and an FFT estimate of the continuous Fourier integral of the characteristic function.
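For illustration (not the PR's actual implementation), a minimal self-contained sketch of the FFT idea: the density is the inverse Fourier transform of the characteristic function, f(x) = (1/2π) ∫ φ(t) e^{-itx} dt, and sampling φ on a regular grid turns the Riemann sum into a single FFT. The function name and the h/q parameters below are illustrative.

import numpy as np

def pdf_via_cf_fft(cf, h=0.01, q=16):
    # Invert a characteristic function cf numerically: sample it on a grid of
    # N = 2**q points with spacing h, approximate the inversion integral by a
    # Riemann sum, and evaluate that sum on the conjugate x grid with one FFT.
    N = 2 ** q
    n = np.arange(N)
    t = (n - N / 2) * h                       # grid in the Fourier (t) domain
    x = (n - N / 2) * (2 * np.pi / (N * h))   # conjugate grid in the x domain
    dft = np.fft.fft(cf(t) * (-1.0) ** n)     # phase factors recentre the grids
    f = (h / (2 * np.pi)) * np.exp(-1j * np.pi * N / 2) * (-1.0) ** n * dft
    return x, f.real

# sanity check against the standard normal, whose cf is exp(-t**2 / 2):
# the returned density should be about 0.3989 near x = 0
x, f = pdf_via_cf_fft(lambda t: np.exp(-t ** 2 / 2))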

bsdz added 5 commits May 3, 2017 23:15
Add Levy Stable Parameter Estimation using McCulloch 1986 Quantiles method.
Add PDF and CDF calculation using FFT estimate on Continuous Fourier Integral of Characteristic Function.
Reduce tolerance on fit test (for python 2.7).
Improve documentation.
Choose slightly more illustrative example parameters.
@bsdz changed the title from "Add PDF, CDF and parameter estimation for Stable Distributions" to "ENH: Add PDF, CDF and parameter estimation for Stable Distributions" May 5, 2017
@pv (Member) left a comment

Some general comments. Not reviewing the algorithm/implementation itself.

"""Use McCullock 1986 method - Simple Consistent Estimators
of Stable Distribution Parameters
"""
return self._fitstart(data, *args, **kwds)
Member

Is this correct? Isn't the result from fitstart an approximation (it uses interpolation tables)?

@bsdz (Contributor Author) May 7, 2017

You are right that fitstart is an approximation and the quantile estimate works well here. However, the MLE method breaks for other reasons (optimiser?) so I thought it best to override fit() so at least one can estimate parameters.


It is really exciting to see stable laws implemented in SciPy. I have spent a few years working with different density implementations. There are asymptotic expansions and integral representations which perform pretty well and are numerically more stable than FFT. There are also different parametrizations of the complex parameter in the exponent of the characteristic function, which I believe are more intuitive. I guess that could be the next development.

@bsdz (Contributor Author) Mar 11, 2018

@pv I removed the fit() override, so this now uses MLE again, using the interpolation tables as the initial estimate. I simply reordered the test cases and it works; this is because the tests switch between different pdf methods and fit() converges best with Zolotarev's method, not FFT. I've added a note to the docstring.
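For context, a small usage sketch of what the behaviour described above looks like from the public API (parameter order alpha, beta, loc, scale; the exact call pattern and its speed are assumptions here, not taken from the diff):

import numpy as np
from scipy.stats import levy_stable

rng = np.random.RandomState(1234)
data = levy_stable.rvs(1.7, 0.0, size=2000, random_state=rng)

# fit() runs MLE, seeded by the McCulloch quantile interpolation tables via
# _fitstart(); it can be slow because the pdf is evaluated numerically.
alpha, beta, loc, scale = levy_stable.fit(data)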

q = 16 if fft_n_points_two_power is None else fft_n_points_two_power

density_x, density = levy_stable_gen._pdf_from_cf_with_fft(lambda t: levy_stable_gen._cf(t, _alpha, _beta), h=h, q=q)
f = interpolate.InterpolatedUnivariateSpline(density_x, density)
Member

Prefer splrep + BSpline over *UnivariateSpline.

Contributor Author

This needs to match the spline as obtained by interp1d. Would the two you suggest do that?
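For reference, a hedged sketch of the suggested alternative on stand-in data; whether it matches the spline obtained by interp1d exactly is precisely the question asked above, but both calls below build cubic interpolating splines, so they should agree:

import numpy as np
from scipy import interpolate

# illustrative data standing in for (density_x, density)
density_x = np.linspace(-5, 5, 201)
density = np.exp(-density_x ** 2 / 2) / np.sqrt(2 * np.pi)

# suggested style: build knots/coefficients with splrep, evaluate via BSpline
tck = interpolate.splrep(density_x, density, s=0)   # s=0: interpolating spline
f = interpolate.BSpline(*tck)

# style currently used in the PR, for comparison
g = interpolate.InterpolatedUnivariateSpline(density_x, density)

pts = np.array([-0.5, 0.0, 0.5])
print(np.allclose(f(pts), g(pts)))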

mu2 = 2
g1 = 0. if alpha == 2. else np.NaN
g2 = 0. if alpha == 2. else np.NaN
return mu, mu2, g1, g2
Member

Is it correct to return constant results here?

Contributor Author

I believe this is correct. It is similar to other distributions, e.g.:

semicircular_gen._stats
uniform_gen._stats
vonmises_gen._stats
wald_gen._stats
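For illustration of the point above, the kind of output these constant values produce through the public stats() call; assuming the standard parametrization, alpha == 2 reduces to a normal distribution with variance 2, while for alpha < 2 the variance and higher moments are undefined and reported as nan:

from scipy.stats import levy_stable

# alpha == 2, beta == 0: variance 2, zero skew and kurtosis
print(levy_stable.stats(2.0, 0.0, moments='mvsk'))

# alpha < 2: second and higher moments are undefined (nan)
print(levy_stable.stats(1.5, 0.0, moments='mv'))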

@ev-br added the enhancement (A new feature or improvement) and scipy.stats labels Jun 14, 2017
@rgommers added this to the 1.1.0 milestone Sep 16, 2017
@bsdz (Contributor Author) commented Nov 22, 2017

Hi @an81, sorry in advance if I have misinterpreted your question. I think if all goes well, and this PR passes review, it will be included in release 1.1.

@an81 commented Nov 22, 2017 via email

@an81 commented Nov 22, 2017 via email

@bsdz (Contributor Author) commented Nov 22, 2017

Hi @an81, thanks for your suggestions. I'm not sure if you've looked at the code in detail, but it actually uses FFT or quad, depending on how many points are requested. This is covered in the documentation I provided. I don't think FFT is as bad as you seem to suggest, provided you choose a sufficient number of points. Please do follow the references provided in the code. It might also be worth reading Stable Paretian Models in Finance by Rachev/Mittnik, where they provide fairly comprehensive coverage of the topic (albeit a little outdated). I've been using a similar FFT approach in various models since 2008 in a commercial setting, and a similar approach is used in other popular financial products. Also, please do submit code changes and references, as I'm sure this PR could be improved. It's certainly not perfect.

@an81 commented Nov 22, 2017 via email

@bsdz (Contributor Author) commented Nov 22, 2017

@an81, I feel the biggest challenge in finance is more often simplicity, in the sense: can you explain it to someone who doesn't have a PhD in maths; and also speed, in implementation and calculation. FFT allows one to produce a near-complete density in one go, allowing interpolation of several points at once. Perhaps this might be possible with special functions too. To be honest, I don't know much about representing densities with Fox H or Meijer G functions. I'll be happy to take a look at these methods and implement them in this PR, as ideally it would be nice to remove or reduce the Gibbs effect. Can you point to any papers with derivation and/or implementation details?

@an81 commented Nov 23, 2017

@bsdz (Contributor Author) commented Nov 24, 2017

Hi @an81. Thank you for the 550+ page book. Please can you be a bit more specific? Some sample code goes a long way too. Also, can you perhaps test the existing code and highlight where the Gibbs effect might be more prominent? E.g. low alpha, etc.; perhaps this can just be documented with a recommendation that users use quad in these cases (already in the code), until a better implementation is available.

@an81 commented Nov 28, 2017 via email

@rgommers (Member)
Code style looks fine and the CI checkers are happy. There are a lot of too-long lines, but let's leave those in so as not to make rebasing gh-8766 on top of this one too difficult.

@rgommers (Member)
Checked the code coverage (which is good, just 2 branches missing) and that the docs render fine and all formulas given are correct. I think for merging we'll rely mostly on completeness of the tests and comparisons against the output of Nolan's stablec.exe. Further accuracy improvements are then postponed until gh-8766 is ready. I'm +1 on merging this once the few comments I just made are resolved.

bsdz added 2 commits June 13, 2018 00:47
coverage of parameters.
Correct some bugs in formulas.
Improve test coverage and test each method around parameter domains.
Add new best method that incorporates quad and zolotarev methods.
Switch off FFT by default altogether for PDF.
Add warnings for certain methods and parameters when known to lose precision.
@rgommers (Member)
This now has several test failures (https://travis-ci.org/scipy/scipy/jobs/391535771) and a number of PEP8 issues (https://travis-ci.org/scipy/scipy/jobs/391535766)

bsdz added 3 commits June 13, 2018 09:27
support numpy 1.8.2 in1d vs newer isin for travis build.
more descriptive test failure message.
@bsdz (Contributor Author) commented Jun 13, 2018

@rgommers I've fixed the code style and most test failures. However, it appears that on the OSX (darwin) build server some of the tests fail, with some calculations returning lower precision. I'm not sure how to deal with this as I don't have access to that platform. One idea is to separate the tests by platform, e.g.:

tests = [
    ...
    # zolotarev is accurate except at alpha==1
    ['zolotarev', None, 8, lambda r: (sys.platform != 'darwin') & (r['alpha'] != 1)],
    ['zolotarev', None, 6, lambda r: (sys.platform == 'darwin') & (r['alpha'] != 1)],
    ...
]

However, I'd have to test what precision will pass on OSX through trial and error, via commits to this PR and Travis build triggers. Does that sound reasonable, or perhaps there's a better way?

@rgommers (Member)
However, I'd have to test what precision will pass on OSX through trial and error, via commits to this PR and Travis build triggers. Does that sound reasonable, or perhaps there's a better way?

I have a macOS machine, I can give it a try. CI is too slow to do trial and error without it getting annoying.

@rgommers (Member)
One idea is to separate the tests by platform, e.g.:

That's only a good idea if it's clear what in a given platform causes a specific problem. In this case I don't see a reason why macOS should be worse.

The test is not fixable by changing the test precision: the function return has NaN in it. The problem seems to be in these lines:

    with np.errstate(all="ignore"):
        intg_max = optimize.minimize_scalar(lambda theta: -f(theta), bounds=[-xi, np.pi/2])
        intg = integrate.quad(f, -xi, np.pi/2, points=[intg_max.x])[0]

@ilayn (Member) commented Jun 13, 2018

Just for future reference, please create separate branches to work on new features and to send PRs, so that your main working branch does not interfere with the PRs you have submitted. Once a PR is merged you can safely delete that branch and keep working on other branches.

@bsdz (Contributor Author) commented Jun 13, 2018

@ilayn Thanks for the heads up. I'll work from a separate branch as suggested and merge once resolved.

@rgommers It appears the integration is failing rather than the minimization. I'll see if I can figure out what parameter values are problematic.

bsdz added 2 commits June 16, 2018 15:05
'best' method uses 'zolotarev' for alpha==1 and beta==0.
delicate handling of quad inputs as less flexible on windows/macos compared to linux.
improve test output on failure.
@bsdz (Contributor Author) commented Jun 16, 2018

@rgommers I've fixed the MacOS issue and all tests pass now. It seems that quad() behaviour differs slightly between Linux, MacOS and Windows, with Linux being the most forgiving. For example, on Linux, if we pass in a point outside the integration bounds it will gracefully continue, but on Windows and MacOS it will fail. This happens because minimize_scalar() doesn't guarantee returning values within the requested bounds. Linux is also happy if we pass bounds that are the same (a null point integral) and just returns zero, but this fails on other platforms. This could be down to floating point differences and in one case I use isclose() to check if the endpoints match as "==" doesn't work on MacOS.
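A minimal sketch of the kind of guarding described above, assuming a generic integrand f, bounds a and b, and a candidate interior point x0 returned by an optimizer; the names are illustrative, not the PR's actual variables:

import numpy as np
from scipy import integrate

def robust_quad(f, a, b, x0):
    # A degenerate interval: comparing endpoints with == can behave differently
    # across platforms, so use a tolerance-based check and return zero directly.
    if np.isclose(a, b, rtol=1e-15, atol=1e-15):
        return 0.0
    # minimize_scalar may return a point slightly outside [a, b]; only pass it
    # to quad as a breakpoint when it lies strictly inside the bounds.
    points = [x0] if a < x0 < b else None
    return integrate.quad(f, a, b, points=points)[0]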

@rgommers (Member)
This could be down to floating point differences and in one case I use isclose() to check if the endpoints match as "==" doesn't work on MacOS.

In general, you can never rely on exact equality for floating point numbers. You can even get things like 0 and -0 not comparing equal. You normally want to check with isclose or some such function indeed, but with a relative/absolute tolerance that's appropriate (typically 1e-15 or so for float64).
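For illustration only (not code from this PR), the kind of tolerance-based comparison being recommended:

import numpy as np

a = 0.1 + 0.2
b = 0.3
print(a == b)                                  # False: exact equality fails
print(np.isclose(a, b, rtol=1e-15, atol=0.0))  # True: within relative tolerance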

@rgommers (Member)
There are 6 merges of scipy master into this branch. Typically we want to avoid that unless absolutely necessary, because it makes the history messier. In this case let's leave it, so as not to make the rebases in the other PR harder.

@rgommers (Member)
These two runtime warnings still appear (at least on macOS):

stats/tests/test_distributions.py::TestLevyStable::()::test_pdf_nolan_samples
  /Users/rgommers/Code/scipy/scipy/stats/_continuous_distns.py:3780: RuntimeWarning: Density calculation unstable for alpha=1 and beta!=0. Use quadrature method instead.
    ' Use quadrature method instead.', RuntimeWarning)
  /Users/rgommers/Code/scipy/scipy/stats/_continuous_distns.py:3880: RuntimeWarning: Density calculations experimental for FFT method. Use combination of zolatarev and quadrature methods instead.
    ' Use combination of zolatarev and quadrature methods instead.', RuntimeWarning)

stats/tests/test_distributions.py::TestLevyStable::()::test_cdf_nolan_samples
  /Users/rgommers/Code/scipy/scipy/stats/_continuous_distns.py:3917: RuntimeWarning: Cumulative density calculations experimental for FFT method. Use zolatarev method instead.
    ' Use zolatarev method instead.', RuntimeWarning)

They should be filtered out inside those tests by using:

with suppress_warnings() as sup:
    sup.record(RuntimeWarning, "<start of the warning message>*")
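For example, a self-contained sketch of that mechanism, raising the warning by hand rather than calling levy_stable (the message prefix is abbreviated from the output above):

import warnings
from numpy.testing import suppress_warnings

with suppress_warnings() as sup:
    # record (and thereby silence) warnings whose message matches this pattern
    r = sup.record(RuntimeWarning, "Density calculations experimental.*")
    warnings.warn("Density calculations experimental for FFT method.",
                  RuntimeWarning)

assert len(r) == 1  # the warning was captured instead of being emitted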

@rgommers (Member)

almost there @bsdz :)

@rgommers merged commit d48843d into scipy:master Jun 16, 2018
@rgommers (Member)

All right, in it goes. Thanks @bsdz, nice improvement! And thanks @an81, @mikofski, @pv for the reviews.

Further improvement of these methods in gh-8766.
