test_glm_weights commented-out/unused #4397
In addition, I probably should have better documented this... Essentially, the comparisons came from R, and R took shortcuts for a couple families, so matching isn't possible/sensible.
To answer your other questions, "Known to fail" might be replaced with "prob_weights not yet implemented".
@thequackdaddy thanks for helping clear this up. Double-checking to make sure all the bases are covered:
Are you also a good person to ask about test_glm or other test files (results_glm would be a big win)?
Totally fair, you've done your good deed for the day. @josef-pkt the last unresolved commented-out stuff in results_glm is within reach:
It looks like there might be a couple of usable references in there, but most of this is adding more confusion than it's resolving. Thoughts?
These are comments from Skipper's initial implementation. This is part of the maintainer and developer documentation until inverse gaussian gets a proper review. Until then, I wouldn't have any idea about which information is useful and which is not. Similarly, gamma has numerical or convergence problems with the canonical link, but I never looked closely enough to learn anything about it. (gamma works fine with the exp link, which does not run into corner cases at zero, so figuring out the canonical inverse link is indefinitely postponed.)
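The corner case at zero can be sketched in standalone numpy (my own illustration of why the canonical link is fragile; none of this is statsmodels code):

```python
import numpy as np

# Standalone sketch (not statsmodels code) of the corner case at zero:
# the canonical link for the Gamma family is the inverse link, mu = 1/eta,
# so whenever the linear predictor eta crosses or touches zero the implied
# mean is infinite or negative, outside the Gamma support.  The log/exp
# link, mu = exp(eta), maps every real eta to a strictly positive mean.
eta = np.array([-0.5, -1e-8, 0.0, 1e-8, 0.5])

with np.errstate(divide="ignore"):
    mu_inverse = 1.0 / eta  # canonical (inverse) link: blows up around zero

mu_log = np.exp(eta)  # log link: always a valid positive mean

print(mu_inverse)  # contains negative values and inf at eta == 0
print(np.all(mu_log > 0))  # True
```

This is the sense in which the exp link "does not run into corner cases at zero" above: no optimizer step can push the fitted mean outside the Gamma support.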
@jseabold Can you help clear up this last unsolved mystery in glm_results? @josef-pkt It's totally reasonable to put off dealing with this until the right person shows up, but this is distinctly not documentation.
@jbrockmendel It's maintainer and developer documentation! It's distinctly not user documentation and it's not part of it. So it just takes a bit of effort to figure this out, if somebody takes the time.
The word you're looking for is "technical debt". If it a) explains/documents nothing and b) needs to be explained/documented, then it is not documentation of any variety. I acknowledged the paper references as being usable. It's the commented-out code and the "This isn't correct" that are unhelpful.
so I guess the corrected code should be something like |
"technical debt" is ok, it's another term for "agile software development". :) ( |
I'm not convinced "the corrected code" needs to exist. I think it's more likely this was an older attempt at producing results for glm_results that is superseded by a newer version, and should be deleted entirely. I'll wait for Skipper to chime in.
What do you think that word means? I don't think it means what you think it means.
And from https://www.atlassian.com/agile/software-development/technical-debt
Ring any bells?
I think both wikipedia and the technical-debt link describe pretty well what we are doing, except for timing and formal organization, which we lack because we are not commercial, i.e. we have volunteer work, a lack of maintainer wo/man power, and turnover in "team membership". "Focusing on delivering new functionality may result in increased technical debt. The team must allow themselves time for defect remediation and refactoring."

The atlassian link is good: each feature gets tested immediately and we refine and enhance in iterations. "Technical debt is the difference between what was promised and what was actually delivered." I think we are pretty good at not overselling things. Both freq_weights and var_weights have very good test coverage, at least for all the basic results, but some things are not yet adjusted to handle those weights, and we say so. prob_weights are still WIP, and it's not clear yet how various results are supposed to be defined in that case (i.e. I don't like how Stata defines things for pweights).

I don't like rigorous adherence to any philosophy, whether software development or statistics or economics, but I think our software development is pretty test driven and agile.

To the invgauss case:
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/genmod/tests/test_glm_weights.py#L213
"Known to fail" is not a helpful error message.
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/genmod/tests/test_glm_weights.py#L231
Is the cov_kwds={'use_correction': False} needed? As usual: should be deleted or explained. This shows up in a couple of places in this file.
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/genmod/tests/test_glm_weights.py#L275
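For context on what a use_correction flag typically toggles, here is a hedged sketch in plain numpy; this reflects my reading of the option's general intent, not statsmodels' exact code path:

```python
import numpy as np

# Hedged sketch of what a use_correction flag typically toggles: a
# small-sample scaling factor n/(n - k) applied to a sandwich
# (heteroscedasticity-robust) covariance estimate for OLS.  This is an
# illustration of the general idea, not statsmodels' internal code.
rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 0.5, -0.25]) + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
resid = y - X @ (XtX_inv @ (X.T @ y))

# HC0-style sandwich estimate: bread * meat * bread.
meat = (X * resid[:, None] ** 2).T @ X
cov_hc0 = XtX_inv @ meat @ XtX_inv

# With the correction, the estimate is scaled up by n/(n - k);
# use_correction=False would skip this scaling.
cov_corrected = n / (n - k) * cov_hc0

print(np.diag(cov_corrected) / np.diag(cov_hc0))  # constant factor 50/47
```

Whether to apply the correction is mostly a small-sample concern; for test comparisons the choice presumably has to match whatever the reference output (R here) used, which may be why the flag appears explicitly.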
Does the "no wnobs..." comment relate to the commented-out sqrt?
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/genmod/tests/test_glm_weights.py#L903
y is defined but unused. What is it for?
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/genmod/tests/test_glm_weights.py#L201
aweights is defined but unused. Should it be? This shows up a couple of times in this file.