Fixed parameter range option currently not working for lognormal distribution #26

Closed
bostockm opened this issue Jul 29, 2015 · 3 comments


@bostockm

Hi Jeff,

The other day I asked you for the example2 data from this notebook:
http://nbviewer.ipython.org/gist/jeffalstott/3b69b400bbd8461c02c4
I couldn’t get the forced positive mu to work with my data set, so I wanted first to see
whether I could duplicate the results from your notebook. I can’t, and I suspect it’s a bug.

I attach two files, test1.py and test2.py, that contain the same instructions as the “actual data”
examples in your notebook (without and with forced positive mu, respectively). Their outputs
are the same (both produce the same negative mu), unlike the output in the notebook.

Thanks for your thoughts,

Michael

*********************Output from test1.py

In [46]: run test1.py
Values less than or equal to 0 in data. Throwing out 0 or negative values
Calculating best minimal value for power law fit
/Users/bostock/anaconda/lib/python3.4/site-packages/powerlaw.py:693: RuntimeWarning: invalid value encountered in true_divide
  (Theoretical_CDF * (1 - Theoretical_CDF))
Power law's alpha: 3.531867
Exponential's lambda: 0.119016
R: 61.774285, p: 0.000891
/Users/bostock/anaconda/lib/python3.4/site-packages/powerlaw.py:693: RuntimeWarning: divide by zero encountered in true_divide
  (Theoretical_CDF * (1 - Theoretical_CDF))
Lognormal's sigma: 15.197246, mu: -579.325239
R: -0.955970, p: 0.151001


**********************Output from test2.py
In [47]: run test2.py
Values less than or equal to 0 in data. Throwing out 0 or negative values
Calculating best minimal value for power law fit
/Users/bostock/anaconda/lib/python3.4/site-packages/powerlaw.py:693: RuntimeWarning: invalid value encountered in true_divide
  (Theoretical_CDF * (1 - Theoretical_CDF))
/Users/bostock/anaconda/lib/python3.4/site-packages/powerlaw.py:693: RuntimeWarning: divide by zero encountered in true_divide
  (Theoretical_CDF * (1 - Theoretical_CDF))
Lognormal's sigma: 15.197246, mu: -579.325239
R: -0.955970, p: 0.151001

*******************************************************
test1.py 
import powerlaw
import numpy as np

data2 = np.genfromtxt('Example2.csv', delimiter=' ')

d = data2
d = d[~np.isnan(d)]

fit = powerlaw.Fit(d)
fit.plot_ccdf(linewidth=4)
fit.power_law.plot_ccdf()
fit.exponential.plot_ccdf()
#fit.lognormal.plot_ccdf()

print("Power law's alpha: %f"%fit.power_law.alpha)
print("Exponential's lambda: %f"%fit.exponential.Lambda)
print("R: %f, p: %f"%fit.distribution_compare('power_law', 'exponential'))

print("Lognormal's sigma: %f, mu: %f"%(fit.lognormal.sigma, fit.lognormal.mu))
print("R: %f, p: %f"%fit.distribution_compare('power_law', 'lognormal'))

*******************************************************
test2.py 

import numpy as np
import powerlaw

data2 = np.genfromtxt('Example2.csv', delimiter=' ')

d = data2
d = d[~np.isnan(d)]

fit_positive = powerlaw.Fit(d)

range_dict = {'mu': [0.0, None]}
fit_positive.lognormal.parameter_range(range_dict)

print("Lognormal's sigma: %f, mu: %f"%(fit_positive.lognormal.sigma, fit_positive.lognormal.mu))
print("R: %f, p: %f"%fit_positive.distribution_compare('power_law', 'lognormal'))

fit_positive.plot_pdf(linewidth=4)
fit_positive.power_law.plot_pdf()
fit_positive.lognormal.plot_pdf()

@jeffalstott
Owner

At the moment it looks like the issue is with setting the boundary for 'mu' to exactly zero. If you set it to something like 0.001 it seems to work out. Work with that for now and I will try to isolate the cause of this problem.

jeffalstott added a commit that referenced this issue Jul 29, 2015
@jeffalstott
Owner

Ah! Figured it out. The problem stems from trying to be too clever with if statements and Nones at lines 903-906:

                if upper_bound:
                    result *= getattr(self, k) < upper_bound
                if lower_bound:
                    result *= getattr(self, k) > lower_bound

The intent of the if statements is to check that the upper bound or the lower bound is not None. However, 0 also evaluates to false, so a bound of exactly 0 was silently skipped. Hence the problem. I have now fixed it in the latest commit:

                if upper_bound is not None:
                    result *= getattr(self, k) < upper_bound
                if lower_bound is not None:
                    result *= getattr(self, k) > lower_bound
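
For anyone who wants to see the pitfall in isolation, here is a minimal standalone sketch. It is not the library's code: the helper names `in_range_buggy` and `in_range_fixed` are hypothetical and only mirror the two versions of the check above, using the negative mu from the reported output as the test value.

```python
def in_range_buggy(value, lower_bound=None, upper_bound=None):
    """Truthiness check: a bound of 0 is falsy, so it is silently ignored."""
    result = True
    if upper_bound:
        result = result and value < upper_bound
    if lower_bound:
        result = result and value > lower_bound
    return result


def in_range_fixed(value, lower_bound=None, upper_bound=None):
    """Identity check: only None means "no bound", so 0 is enforced."""
    result = True
    if upper_bound is not None:
        result = result and value < upper_bound
    if lower_bound is not None:
        result = result and value > lower_bound
    return result


mu = -579.325239  # the negative mu reported in the output above

print(in_range_buggy(mu, lower_bound=0.0))    # True: the bound 0.0 is skipped
print(in_range_buggy(mu, lower_bound=0.001))  # False: a nonzero bound is checked
print(in_range_fixed(mu, lower_bound=0.0))    # False: 0.0 is now enforced
```

This also shows why setting the bound to 0.001 appeared to work: any nonzero bound is truthy, so the old check was applied.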

I will wait until we sort out the lognormal CDF issue before updating the new version on PyPI.

@bostockm
Author

Thanks Jeff - sounds good!
