DOC: no docstring describes diagnostic part of regression summary #9099

Open
josef-pkt opened this issue Dec 18, 2023 · 3 comments

@josef-pkt
Member

I can't find a docstring that describes and defines the information shown in the summary.

Current point: I was not sure whether the kurtosis is Fisher or Pearson, i.e. 0 or 3 in the normal case.
It is Pearson (normal = 3), as I found after looking at the code in jarque_bera.
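
To make the two conventions concrete, here is a minimal pure-Python sketch. The helper names `pearson_kurtosis` and `excess_kurtosis` are made up for illustration and are not statsmodels functions; the sketch assumes the biased moment estimator m4 / m2**2.

```python
import random


def pearson_kurtosis(x):
    """Sample kurtosis m4 / m2**2 from biased central moments (about 3 for normal data)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2


def excess_kurtosis(x):
    """Fisher (excess) kurtosis: Pearson kurtosis minus 3 (about 0 for normal data)."""
    return pearson_kurtosis(x) - 3.0


# Illustrative data only: a large draw from a standard normal distribution.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

print(pearson_kurtosis(sample))  # close to 3 (Pearson convention)
print(excess_kurtosis(sample))   # close to 0 (Fisher convention)
```

The same number reported under the two conventions differs by exactly 3, which is why the docstring should say which one the summary uses.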

@luke396
Contributor

luke396 commented Apr 21, 2024

Hi @josef-pkt, I'd love to help improve the documentation. Could you provide more details?

Do you think the page at https://www.statsmodels.org/devel/examples/notebooks/generated/regression_diagnostics.html#Regression-diagnostics needs a more detailed description of the data imported via URL? Or do you have other suggestions for improving the page?

@josef-pkt
Member Author

As far as I remember from when I opened the issue, the missing docs are specifically for the summary results of linear regression.
They could go, for example, in a Notes section in https://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.summary.html

The two top tables are mostly self-explanatory. It's mainly the diagnostic table at the bottom that needs more description.

And/or we could add a link to a notebook that explains the regression summary output.
I saw something like that in a blog article some time ago, but I don't remember where.

The regression_diagnostics.html doc notebook is more about the extra functions we have available for diagnostic and specification testing.
It covers functions that the user can call, but it does not directly explain what the statistics in the summary table are, apart from normality.
The jarque_bera docstring also does not make it explicit that the kurtosis is standard kurtosis and not excess kurtosis.
That is confusing because "kurtosis" is often used as the name for excess kurtosis.

e.g. "kurtosis is the sample kurtosis, not the excess kurtosis. A sample from the normal distribution has kurtosis equal to 3."
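
The textbook Jarque-Bera statistic shows where the convention enters: the kurtosis term is (K - 3), so K has to be the standard (Pearson) kurtosis. Below is a rough pure-Python sketch of that formula; `jarque_bera_sketch` is a hypothetical helper written for this comment, not the statsmodels implementation, and it assumes biased sample moments.

```python
import math


def jarque_bera_sketch(resid):
    """Return (JB, p-value, skew, kurtosis) using the textbook formula.

    kurtosis is the standard (Pearson) kurtosis, i.e. about 3 for normal data.
    """
    n = len(resid)
    mean = sum(resid) / n
    m2 = sum((e - mean) ** 2 for e in resid) / n
    m3 = sum((e - mean) ** 3 for e in resid) / n
    m4 = sum((e - mean) ** 4 for e in resid) / n
    skew = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # Pearson kurtosis, NOT excess kurtosis
    jb = n / 6.0 * (skew ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under normality JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    pvalue = math.exp(-jb / 2.0)
    return jb, pvalue, skew, kurtosis


# Made-up residuals, only to show the call and the order of the return values.
jb, pvalue, skew, kurtosis = jarque_bera_sketch([1.0, 2.0, 3.0, 4.0, 5.0])
print(jb, pvalue, skew, kurtosis)
```

If the reported kurtosis were excess kurtosis, the `- 3.0` in the formula would be wrong, so the note suggested above removes a real ambiguity.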

Durbin-Watson needs an explanation that a value around 2 indicates no serial correlation.
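
A short sketch of why 2 is the reference value: the statistic is the sum of squared first differences of the residuals divided by the sum of squared residuals, which is approximately 2 * (1 - r), where r is the lag-1 autocorrelation. The function `durbin_watson_sketch` is a hypothetical pure-Python helper for illustration, not the statsmodels function.

```python
import random


def durbin_watson_sketch(resid):
    """Sum of squared first differences divided by the sum of squared residuals."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den


# Illustrative residual series (made up, not from a fitted model).
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
constant = [1.0, 1.0, 1.0, 1.0]       # perfectly positively correlated
alternating = [1.0, -1.0, 1.0, -1.0]  # strongly negatively correlated

print(durbin_watson_sketch(white_noise))  # near 2: no serial correlation
print(durbin_watson_sketch(constant))     # 0.0: positive serial correlation
print(durbin_watson_sketch(alternating))  # 3.0 here; approaches 4 for longer series
```

Values well below 2 point to positive serial correlation and values well above 2 to negative serial correlation.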

Asides:
Including Durbin-Watson was an "econometrics tradition" from a very long time ago.
There is the ancient issue #87, but I never decided how we can improve it.

@josef-pkt
Member Author

I also just noticed that neither of the two diagnostic notebooks includes autocorrelation tests:
https://www.statsmodels.org/dev/stats.html#module-statsmodels.stats.stattools
