Fixed parameter range option currently not working for lognormal distribution #26

bostockm · 2015-07-29T17:03:51Z

Hi Jeff,

The other day I asked you for the example2 data from this link
http://nbviewer.ipython.org/gist/jeffalstott/3b69b400bbd8461c02c4
because I couldn’t get the forced positive mu to work with my data set, and I wanted to see first whether I could reproduce the results from your notebook. I can’t. I suspect it’s a bug.

I attach two files, test1.py and test2.py, that contain the same instructions as the “actual data” examples in your notebook (without and with forced positive mu, respectively). The outputs are the same (both produce the same negative mu), unlike the output in the notebook.

Thanks for your thoughts,

Michael

*********************Output from test1.py

In [46]: run test1.py
Values less than or equal to 0 in data. Throwing out 0 or negative values
Calculating best minimal value for power law fit
/Users/bostock/anaconda/lib/python3.4/site-packages/powerlaw.py:693: RuntimeWarning: invalid value encountered in true_divide
  (Theoretical_CDF * (1 - Theoretical_CDF))
Power law's alpha: 3.531867
Exponential's lambda: 0.119016
R: 61.774285, p: 0.000891
/Users/bostock/anaconda/lib/python3.4/site-packages/powerlaw.py:693: RuntimeWarning: divide by zero encountered in true_divide
  (Theoretical_CDF * (1 - Theoretical_CDF))
Lognormal's sigma: 15.197246, mu: -579.325239
R: -0.955970, p: 0.151001


*********************Output from test2.py
In [47]: run test2.py
Values less than or equal to 0 in data. Throwing out 0 or negative values
Calculating best minimal value for power law fit
/Users/bostock/anaconda/lib/python3.4/site-packages/powerlaw.py:693: RuntimeWarning: invalid value encountered in true_divide
  (Theoretical_CDF * (1 - Theoretical_CDF))
/Users/bostock/anaconda/lib/python3.4/site-packages/powerlaw.py:693: RuntimeWarning: divide by zero encountered in true_divide
  (Theoretical_CDF * (1 - Theoretical_CDF))
Lognormal's sigma: 15.197246, mu: -579.325239
R: -0.955970, p: 0.151001

*******************************************************
test1.py 
import powerlaw
import numpy as np

data2 = np.genfromtxt('Example2.csv', delimiter=' ')

d = data2
d = d[~np.isnan(d)]

fit = powerlaw.Fit(d)
fit.plot_ccdf(linewidth=4)
fit.power_law.plot_ccdf()
fit.exponential.plot_ccdf()
#fit.lognormal.plot_ccdf()

print("Power law's alpha: %f"%fit.power_law.alpha)
print("Exponential's lambda: %f"%fit.exponential.Lambda)
print("R: %f, p: %f"%fit.distribution_compare('power_law', 'exponential'))

print("Lognormal's sigma: %f, mu: %f"%(fit.lognormal.sigma, fit.lognormal.mu))
print("R: %f, p: %f"%fit.distribution_compare('power_law', 'lognormal'))

*******************************************************
test2.py 

import numpy as np
import powerlaw

data2 = np.genfromtxt('Example2.csv', delimiter=' ')

d = data2
d = d[~np.isnan(d)]

fit_positive = powerlaw.Fit(d)

range_dict = {'mu': [0.0, None]}
fit_positive.lognormal.parameter_range(range_dict)

print("Lognormal's sigma: %f, mu: %f"%(fit_positive.lognormal.sigma, fit_positive.lognormal.mu))
print("R: %f, p: %f"%fit_positive.distribution_compare('power_law', 'lognormal'))

fit_positive.plot_pdf(linewidth=4)
fit_positive.power_law.plot_pdf()
fit_positive.lognormal.plot_pdf()

jeffalstott · 2015-07-29T17:12:08Z

At the moment it looks like the issue is with setting the boundary for 'mu' to exactly zero. If you set it to something like 0.001 it seems to work out. Work with that for now and I will try to isolate the cause of this problem.

jeffalstott · 2015-07-29T17:21:06Z

Ah! Figured it out. The problem stems from trying to be too clever with if statements and Nones at lines 903-906:

                if upper_bound:
                    result *= getattr(self, k) < upper_bound
                if lower_bound:
                    result *= getattr(self, k) > lower_bound

The intent of the if statements is to check that the upper bound or the lower bound is not None. However, 0 also evaluates to False, so a bound of exactly 0 was skipped as if no bound had been set. Hence the problem. I have now fixed it in the latest commit:

                if upper_bound is not None:
                    result *= getattr(self, k) < upper_bound
                if lower_bound is not None:
                    result *= getattr(self, k) > lower_bound

I will wait until we sort out the lognormal CDF issue before updating the new version on PyPI.
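
For anyone who wants to see the effect in isolation, here is a minimal standalone sketch of the two checks. The function names `in_range_buggy` and `in_range_fixed` are hypothetical and are not part of powerlaw; they only mirror the if statements quoted above, applied to the negative mu from the reported output.

```python
def in_range_buggy(value, lower_bound=None, upper_bound=None):
    """Truthiness test: a bound of 0 is falsy, so it is skipped like None."""
    result = True
    if upper_bound:
        result *= value < upper_bound
    if lower_bound:
        result *= value > lower_bound
    return bool(result)


def in_range_fixed(value, lower_bound=None, upper_bound=None):
    """Explicit None test: a bound of 0 is enforced like any other number."""
    result = True
    if upper_bound is not None:
        result *= value < upper_bound
    if lower_bound is not None:
        result *= value > lower_bound
    return bool(result)


mu = -579.325239  # the negative mu reported in the outputs above

print(in_range_buggy(mu, lower_bound=0.0))    # True: the bound of 0 is ignored
print(in_range_buggy(mu, lower_bound=0.001))  # False: a nonzero bound is checked
print(in_range_fixed(mu, lower_bound=0.0))    # False: the bound of 0 is checked
```

This also shows why the 0.001 workaround behaved as expected: any nonzero bound is truthy, so the original check still ran for it.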

bostockm · 2015-07-29T17:30:08Z

Thanks Jeff - sounds good!

jeffalstott added a commit that referenced this issue Jul 29, 2015

Fixed parameter_range restrictions not respecting a bound of 0.

3430c80

Addresses issue 26: #26

jeffalstott closed this as completed Jul 29, 2015
