T Distribution Weirdness #23

Closed
angelgeek opened this issue Nov 1, 2022 · 2 comments

Comments

@angelgeek

We are using distfit to try to determine if some data we have can be modelled parametrically. For some of the data, the best fitting distribution was a t. Scale and loc are clearly documented, and that is great. There is one remaining parameter to fit a t distribution, and that is degrees of freedom. Except, the one parameter in the distfit output that isn't a scale or loc value is less than one. Obviously, degrees of freedom can't be less than one. So what is that parameter and why isn't degrees of freedom included in the output? It would be helpful for automating our process.

@erdogant
Owner

erdogant commented Nov 1, 2022

Maybe I missed something here, but distfit stores all the parameters returned by the distribution fit.

As an example:

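import numpy as np
from distfit import distfit

# Draw 1000 samples from a normal distribution (mean 0, standard deviation 2)
X = np.random.normal(0, 2, 1000)
# y is not used in the fit below
y = [-8, -6, 0, 1, 2, 3, 4, 5, 6]
# Fit the candidate distributions and score them with the KS statistic
dist = distfit(stats='ks', distr=['expon', 't', 'gamma', 'lognorm'])
results = dist.fit_transform(X)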

print(dist.model)
{'distr': <scipy.stats._continuous_distns.t_gen at 0x2d4882810f0>,
 'stats': 'ks',
 'params': (3518324.248643998, -0.08180702912809554, 2.0838347069246876),
 'name': 't',
 'model': <scipy.stats._distn_infrastructure.rv_continuous_frozen at 0x2d49debda80>,
 'score': 0.40237077133797083,
 'loc': -0.08180702912809554,
 'scale': 2.0838347069246876,
 'arg': (3518324.248643998,),
 'CII_min_alpha': -3.5094110072794593,
 'CII_max_alpha': 3.345796949023267}

When I now do the fit manually for only the t-distribution, the following parameters are returned:

import scipy.stats as st
# fit dist to data
params = st.t.fit(X)
print(params)
(3518324.248643998, -0.08180702912809554, 2.0838347069246876)

# Separate parts of parameters
arg = params[:-2]
loc = params[-2]
scale = params[-1]
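
To make the split above easier to automate, the shape values can be paired with the names scipy reports for them. This is a minimal sketch using only scipy, not distfit itself; `named` and `shape_names` are illustrative names, and the seeded generator is an assumption added for reproducibility. For `scipy.stats.t` the single shape parameter is named `df`, so the first element of `params` (and of `dist.model['arg']` in the output above) is the degrees of freedom.

```python
import numpy as np
import scipy.stats as st

# Seeded generator so the sketch is reproducible (assumption, not in the original)
rng = np.random.default_rng(0)
X = rng.normal(0, 2, 1000)

# scipy returns (shape parameters..., loc, scale)
params = st.t.fit(X)

# `shapes` is a comma-separated string of shape parameter names, 'df' for t
shape_names = [s.strip() for s in st.t.shapes.split(",")]
names = shape_names + ["loc", "scale"]

# Pair each fitted value with its name
named = dict(zip(names, params))
print(named)
```

A very large `df`, as in the output above, means the fitted t is close to a normal distribution, which matches the normally distributed sample used here.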

If I now compare the returned parameters with the ones stored in distfit, they are exactly the same:

params==dist.model['params']
True
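
On the original question about a value below one: as far as I can tell, `scipy.stats.t` accepts any positive `df`, including fractional values below one, so a fitted value under one is not necessarily an error. The sketch below is a hedged check under that assumption; `df = 0.5` is an arbitrary illustrative value, not one taken from the reporter's data.

```python
import numpy as np
import scipy.stats as st

# Arbitrary fractional degrees of freedom below one (illustrative value)
df = 0.5

# Freeze a t distribution with this df and standard loc/scale
frozen = st.t(df, loc=0.0, scale=1.0)

# The density and cumulative probability are still well defined
pdf_at_zero = frozen.pdf(0.0)
cdf_at_zero = frozen.cdf(0.0)
print(pdf_at_zero, cdf_at_zero)
```

Such a distribution has very heavy tails, so a small fitted `df` may simply indicate heavy-tailed data.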

@erdogant
Owner

erdogant commented Dec 1, 2022

I am closing this issue. Reopen if required.

erdogant closed this as completed on Dec 1, 2022