Audit Test Suite #4966

Open · jbrockmendel (Contributor) opened this issue Aug 23, 2018 · 0 comments
There is a lot of heterogeneity in the quality of the tests. An audit-like process should attempt to identify/address (in no particular order):

  • What parts of the code have only smoke tests?
  • What parts of the tests are not getting run? (the first sketch below illustrates these patterns) For example, tests that are:
    • commented out
    • given mangled names that pytest does not collect
    • miscellany like the def junk function in discrete.tests.test_constrained
    • stranded in __main__ sections of test files
    • incorrectly located in __main__ sections of non-test files
  • Are there any xfailed tests that have since been fixed? Can they be marked with strict=True so that an unexpected pass fails loudly? (see the xfail sketch below)
  • A lot of effort went into creating the results files that tests compare against (props to our sm forebears). Are these reproducible? And if not, can they be made reproducible?
  • In some cases results were subsequently "hand-edited" for various reasons. Are these edits well documented?
  • Some "example" files have snuck into test directories; where should they go?
  • Can test runtime be significantly reduced by efficient use of pytest fixtures, e.g. sharing expensive model fits across tests? (see the fixture sketch below)
  • Are there combinations of parameters that can be tested more thoroughly using pytest.mark.parametrize? (see the parametrize sketch below)
  • Can the tests otherwise be made less verbose and clearer?
  • Other modernizations that should be made? E.g., IIUC assert_almost_equal is discouraged and assert_allclose should be used instead. (see the assert_allclose sketch below)
  • Are there places where assert_allclose tolerances can be tightened?
  • grep turns up 5 occurrences of "FIXME" and 218 occurrences of "TODO" in test directories
  • There are a whole bunch of occurrences of things like:

        cls.res1 = mymodel.fit(method="lbfgs", disp=0, maxiter=50000,
                               #m=12, pgtol=1e-7, factr=1e3,  # 5 failures
                               #m=20, pgtol=1e-8, factr=1e2,  # 3 failures
                               #m=30, pgtol=1e-9, factr=1e1,  # 1 failure
                               m=40, pgtol=1e-10, factr=5e0,
                               loglike_and_score=mymodel.loglike_and_score)

        get_robustcov_results(cls.res1._results, 'cluster',
                              groups=group,
                              use_correction=True,
                              df_correction=True,  # TODO has no effect
                              use_t=False,  # True,
                              use_self=True)

        model = sm.Logit(y_bin, x)  #, exposure=np.ones(nobs), offset=np.zeros(nobs))  # bug with default

The TODO is reasonably clear and helpful, but the commented-out True and the commented-out fit parameters are not helpful in their current form.
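As a concrete (and entirely hypothetical) illustration of the "not getting run" bucket, all of the following patterns are silently skipped under pytest's default discovery rules:

    # hypothetical file: none of these run under a plain `pytest` invocation

    def check_positive():             # name doesn't match test_*, never collected
        assert 1 > 0

    # def test_old_behavior():       # commented out entirely
    #     assert 1 > 0

    if __name__ == "__main__":
        # "stranded" tests: executed only when the file is run as a script
        check_positive()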
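A minimal sketch of the strict-xfail idea (the issue number and reason are placeholders): with strict=True, an xfailed test that unexpectedly passes is reported as a failure, so already-fixed xfails surface in CI instead of lingering.

    import pytest

    @pytest.mark.xfail(strict=True, reason="GH#0000: placeholder for a known bug")
    def test_known_bug():
        # if the underlying bug is ever fixed, this test will XPASS and,
        # because strict=True, the suite fails until the mark is removed
        assert 1 + 1 == 3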
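A sketch of the fixture idea, using made-up OLS data: with scope="module" the expensive fit runs once and is shared by every test in the module, instead of being repeated per test in setup_class-style boilerplate.

    import numpy as np
    import pytest
    import statsmodels.api as sm

    @pytest.fixture(scope="module")
    def fitted_ols():
        # hypothetical data; the fit happens once per module, not once per test
        rng = np.random.RandomState(12345)
        exog = sm.add_constant(rng.standard_normal((100, 2)))
        endog = exog @ np.array([1.0, 0.5, -0.3]) + rng.standard_normal(100)
        return sm.OLS(endog, exog).fit()

    def test_params_shape(fitted_ols):
        assert fitted_ols.params.shape == (3,)

    def test_resid_mean(fitted_ols):
        # residuals of an OLS fit with a constant average to ~0
        np.testing.assert_allclose(fitted_ols.resid.mean(), 0, atol=1e-8)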
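A sketch of the parametrize idea (the data and tolerances are made up): one test body expands into a collected case per optimizer, which is both less verbose and more thorough than hand-written near-duplicates.

    import numpy as np
    import pytest
    import statsmodels.api as sm
    from numpy.testing import assert_allclose

    @pytest.mark.parametrize("method", ["newton", "bfgs", "lbfgs"])
    def test_logit_optimizers_agree(method):
        # hypothetical data with real signal so the MLE is well separated from 0
        rng = np.random.RandomState(0)
        exog = sm.add_constant(rng.standard_normal((200, 1)))
        prob = 1 / (1 + np.exp(-exog @ np.array([0.25, 1.0])))
        endog = (rng.uniform(size=200) < prob).astype(float)
        res = sm.Logit(endog, exog).fit(method=method, disp=0)
        baseline = sm.Logit(endog, exog).fit(disp=0)
        # every optimizer should land on (nearly) the same MLE
        assert_allclose(res.params, baseline.params, rtol=1e-4, atol=1e-4)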
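A sketch of the assert_almost_equal -> assert_allclose migration: the decimal argument hides an absolute tolerance inside its name, while assert_allclose makes the relative/absolute tolerances explicit and greppable, which is what makes later tightening feasible.

    import numpy as np
    from numpy.testing import assert_allclose, assert_almost_equal

    actual = np.array([1.0000001, 2.0000002])
    desired = np.array([1.0, 2.0])

    # legacy style: passes iff abs(actual - desired) < 1.5 * 10**-6
    assert_almost_equal(actual, desired, decimal=6)

    # preferred: the tolerance is explicit and can be tightened over time
    assert_allclose(actual, desired, rtol=1e-6)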


This is a pretty huge task. A few steps in this direction: #4941, #4936, #4932, #4907, #4875, #4863, #4506, #4488, #4305
