SUMM: Roundup of sandbox exposure, unused files, redundant code, ... #5773

Open
5 of 92 tasks
jbrockmendel opened this issue May 22, 2019 · 3 comments

Comments

@jbrockmendel
Contributor

jbrockmendel commented May 22, 2019

Collecting these in one place to avoid Issue proliferation; will update as appropriate.

Sandbox

Parts of sandbox that are imported from non-sandbox (cc @ChadFulton if you were wondering how much of the code implicitly relies on sandbox):

  • sandbox.stats.multicomp
    • stats.multicomp imports tukeyhsd and MultiComparison (there is only one function defined directly in stats.multicomp)
    • examples.try_tukey_hsd gets tukeyhsd and MultiComparison directly from the sandbox module; it could get them from the non-sandbox file.
    • stats.contrast imports contrast_allpairs (also duplicated in sandbox.stats.contrast_tools)
    • stats.tests.test_qsturng uses get_tukeyQcrit
  • sandbox.tsa.fftarma.Armafft
    • In several of the places this is imported it is only for the generate_sample method, which is defined on the non-sandbox parent class ArmaProcess. Use ArmaProcess directly.
    • The only remaining place where fftarma is imported is in test_arima_process, where it has two tests for Armafft. These could be moved to a test file in sandbox specific to fftarma.
  • sandbox.stats.diagnostic: almost all of stats.diagnostic comes from here.
    • The good news is that aside from this pass-through, the only direct import from sandbox.stats.diagnostic is in examples/ex_arch_canada.py for acorr_lm.
  • statsmodels.sandbox.regression.predstd is imported in numerous test and example files to get wls_prediction_std (looks like 10 imports total, 13 if we count notebook files)
    • Also in stats.outliers_influence and graphics.regressionplots (i.e. user-facing code)
  • sandbox.regression.penalized is imported in several test files for TheilGLS. This includes a test_theil file in regression.tests that sends mixed messages as to the status of this file.
  • sandbox.tsa.garch is imported in statsmodels/examples/tsa/ex_arma.py for its Arma class
  • sandbox.nonparametric.kernels is imported into 2 files in nonparametric and 3 files in nonparametric.tests.
    • sandbox.nonparametric files are imported into 6 examples/ files
  • sandbox.distributions.mv_normal is imported into distributions.mixture_rvs and distributions.tests.test_mixture
    • sandbox.distributions.sppatch is imported into examples/ex_generic_mle_tdist.py
      • sppatch modified scipy objects in-place. Can we avoid this?
  • sandbox.tools.cross_val.LeaveOneOut is imported into stats.outliers_influence. This class is also duplicated in nonparametric._kernel_base: slightly different behavior with the same name.
  • sandbox.stats.runs provides Runs, runstest_1samp, and runstest_2samp directly to stats.api.
    • The same three are imported in stats.tests.test_nonparametric, along with sandbox.stats.runs.mcnemar
  • statsmodels.sandbox.panel.random_panel is imported for PanelSample in regression.tests.test_theil
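
A list like the one above can be regenerated mechanically. The following is a minimal sketch, not the method used to build this list: it parses a module's source with the standard-library `ast` module and reports imports that pass through a `sandbox` package, skipping modules that live in sandbox themselves. The function name `find_sandbox_imports` and the sample source string are hypothetical.

```python
import ast


def find_sandbox_imports(source, module_name):
    """Return sandbox modules imported by `source`, unless it lives in sandbox itself."""
    if ".sandbox." in f".{module_name}.":
        return []  # sandbox importing sandbox is not the concern here
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        elif isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        else:
            continue
        # Match on whole dotted components so "sandboxed_utils" is not a false hit.
        hits.extend(n for n in names if "sandbox" in n.split("."))
    return sorted(set(hits))


# Illustrative source only; not the actual contents of stats/api.py.
example = (
    "from statsmodels.sandbox.stats.runs import Runs, runstest_1samp\n"
    "import numpy as np\n"
)
print(find_sandbox_imports(example, "statsmodels.stats.api"))
```

Relative imports such as `from ..sandbox.tsa import fftarma` are reported by their relative module path (`sandbox.tsa`), so they are still caught.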

De Facto Sandbox

Non-Sandbox code that belongs in sandbox or dustbin:

Chopping Block

Partially moved from #5137
Files that appear unlikely to become useful.

Duplication

Copy/Pasted code that should be imported instead:

Disabled Code

Commenting out code is a way of temporarily disabling it that tends to become permanent and make things more difficult for subsequent readers.

Namespace Issues

Unsorted

  • miscmodels.tests.test_generic_mle contains tests for GenericLikelihoodModel, not anything in miscmodels (miscmodels.tests.test_generic_mle doesnt import miscmodels #5763). If, as I suspect, miscmodels serves primarily as tests/examples for GenericLikelihoodModel, that should be made explicit and implemented systematically.
  • Many files like sandbox.regression.example_kernridge may belong somewhere like sandbox/regression/examples/?
  • Overlap between datasets/README.txt, tools/dataset_rst.py, sandbox/dataset_notes.rst, possibly others in docs/?
@jbrockmendel
Contributor Author

jbrockmendel commented May 22, 2019

Moving from #5145

Mangled, Commented-Out, or Otherwise Disabled Tests

With pytest.mark.skip or pytest.mark.xfail we can easily find things that need attention. Other methods of "temporarily" skipping a test get lost easily. See also #5768.

Unmarked Smoke Tests

@jbrockmendel
Contributor Author

jbrockmendel commented May 22, 2019

Unused/Unmaintained __main__ Sections

Moved from #5128

#3515 (comment)

Searching 1493 files for "__main__" (regex, case sensitive)

  • setup.py
  • statsmodels/base/tests/test_data.py
  • statsmodels/base/tests/test_shrink_pickle.py
  • statsmodels/base/wrapper.py
  • statsmodels/compat/tests/test_itercompat.py
  • statsmodels/discrete/tests/test_discrete.py
  • statsmodels/discrete/tests/test_sandwich_cov.py
  • statsmodels/distributions/edgeworth.py
  • statsmodels/distributions/empirical_distribution.py
  • statsmodels/distributions/mixture_rvs.py
  • statsmodels/distributions/tests/test_discrete.py
  • statsmodels/distributions/tests/test_edgeworth.py
  • statsmodels/distributions/tests/test_mixture.py
  • statsmodels/duration/tests/test_phreg.py
  • statsmodels/examples/ex_kernel_regression2.py
  • statsmodels/examples/ex_kernel_regression3.py
  • statsmodels/examples/ex_kernel_regression_censored2.py
  • statsmodels/examples/ex_kernel_regression_dgp.py
  • statsmodels/examples/ex_kernel_regression_sigtest.py
  • statsmodels/examples/ex_kernel_semilinear_dgp.py
  • statsmodels/examples/ex_kernel_singleindex_dgp.py
  • statsmodels/examples/ex_kernel_test_functional.py
  • statsmodels/examples/ex_kernel_test_functional_li_wang.py
  • statsmodels/examples/ex_multivar_kde.py
  • statsmodels/examples/ex_outliers_influence.py
  • statsmodels/examples/ex_pairwise.py
  • statsmodels/examples/ex_rootfinding.py
  • statsmodels/examples/l1_demo/demo.py
  • statsmodels/examples/try_fit_constrained.py
  • statsmodels/examples/try_power2.py
  • statsmodels/genmod/_tweedie_compound_poisson.py
  • statsmodels/genmod/families/tests/test_link.py
  • statsmodels/genmod/generalized_linear_model.py
  • statsmodels/genmod/tests/gee_categorical_simulation_check.py
  • statsmodels/genmod/tests/gee_gaussian_simulation_check.py
  • statsmodels/genmod/tests/gee_poisson_simulation_check.py
  • statsmodels/genmod/tests/results/gee_generate_tests.py
  • statsmodels/genmod/tests/test_glm.py
  • statsmodels/graphics/tests/test_mosaicplot.py
  • statsmodels/graphics/tests/test_regressionplots.py
  • statsmodels/imputation/tests/test_mice.py
  • statsmodels/iolib/foreign.py
  • statsmodels/iolib/summary.py
  • statsmodels/iolib/tests/test_foreign.py
  • statsmodels/iolib/tests/test_summary.py
  • statsmodels/miscmodels/nonlinls.py
  • statsmodels/miscmodels/try_mlecov.py
  • statsmodels/nonparametric/kde.py
  • statsmodels/nonparametric/tests/test_bandwidths.py
  • statsmodels/nonparametric/tests/test_kde.py
  • statsmodels/nonparametric/tests/test_kernel_density.py
  • statsmodels/nonparametric/tests/test_kernel_regression.py
  • statsmodels/nonparametric/tests/test_kernels.py
  • statsmodels/nonparametric/tests/test_lowess.py
  • statsmodels/regression/linear_model.py
  • statsmodels/regression/tests/test_glsar_gretl.py
  • statsmodels/regression/tests/test_glsar_stata.py
  • statsmodels/robust/robust_linear_model.py
  • statsmodels/stats/_adnorm.py
  • statsmodels/stats/anova.py
  • statsmodels/stats/descriptivestats.py
  • statsmodels/stats/libqsturng/tests/test_qsturng.py
  • statsmodels/stats/tabledist.py
  • statsmodels/stats/tests/test_anova.py
  • statsmodels/stats/tests/test_diagnostic.py
  • statsmodels/stats/tests/test_groups_sw.py
  • statsmodels/stats/tests/test_moment_helpers.py
  • statsmodels/stats/tests/test_power.py
  • statsmodels/stats/tests/test_proportion.py
  • statsmodels/stats/tests/test_sandwich.py
  • statsmodels/stats/tests/test_statstools.py
  • statsmodels/stats/tests/test_tost.py
  • statsmodels/tools/catadd.py
  • statsmodels/tools/dump2module.py
  • statsmodels/tools/grouputils.py
  • statsmodels/tools/numdiff.py
  • statsmodels/tools/print_version.py
  • statsmodels/tools/tests/test_catadd.py
  • statsmodels/tools/tests/test_eval_measures.py
  • statsmodels/tools/tests/test_numdiff.py
  • statsmodels/tsa/ar_model.py
  • statsmodels/tsa/arima_model.py
  • statsmodels/tsa/base/tsa_model.py
  • statsmodels/tsa/coint_tables.py
  • statsmodels/tsa/filters/cf_filter.py
  • statsmodels/tsa/filters/tests/test_filters.py
  • statsmodels/tsa/interp/denton.py
  • statsmodels/tsa/interp/tests/test_denton.py
  • statsmodels/tsa/kalmanf/kalmanfilter.py
  • statsmodels/tsa/seasonal.py
  • statsmodels/tsa/tests/test_tsa_tools.py
  • statsmodels/tsa/varma_process.py
  • statsmodels/tsa/vector_ar/dynamic.py
  • statsmodels/tsa/vector_ar/var_model.py

Some of these are in test files and are harmless-albeit-annoying.
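
For anyone wanting to refresh this list, the following is a minimal sketch of an equivalent search using only the standard library. It assumes that only `*.py` files matter and that anything under a `tests/` directory counts as a test file; both are assumptions, and `files_with_main` is a hypothetical helper name.

```python
import re
from pathlib import Path

MAIN_RE = re.compile(r"__main__")  # case-sensitive, like the search above


def files_with_main(root):
    """Return (test_files, other_files) under `root` that mention __main__."""
    test_files, other_files = [], []
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if not MAIN_RE.search(text):
            continue
        rel_path = path.relative_to(root)
        # Assumed convention: anything under a tests/ directory is a test file.
        bucket = test_files if "tests" in rel_path.parts else other_files
        bucket.append(rel_path.as_posix())
    return test_files, other_files
```

Calling `files_with_main("statsmodels")` from a checkout would separate the harmless test-file hits from the ones in library and example code.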

@josef-pkt
Member

one comment to a group of __main__ issues as in statsmodels/examples/ex_kernel_regression_dgp.py

Those are/were required (at least on Windows) for multiprocessing.
I don't remember the details nor whether it would be really necessary in each case or whether the same pattern was used "just in case" if there is multiprocessing.
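
One way to triage this might be to check, per file, whether a `__main__` guard coexists with a direct import of a parallelism library. The sketch below is only a heuristic under stated assumptions: the module list in `PARALLEL_MODULES` is a guess, all function names are hypothetical, and a file that reaches multiprocessing indirectly through other imports would not be detected.

```python
import ast

PARALLEL_MODULES = {"multiprocessing", "concurrent", "joblib"}  # assumed list, extend as needed


def has_main_guard(tree):
    """True if the module has a top-level `if __name__ == "__main__":` block."""
    for node in tree.body:
        if isinstance(node, ast.If) and isinstance(node.test, ast.Compare):
            operands = [node.test.left, *node.test.comparators]
            names = {o.id for o in operands if isinstance(o, ast.Name)}
            consts = {o.value for o in operands if isinstance(o, ast.Constant)}
            if "__name__" in names and "__main__" in consts:
                return True
    return False


def imports_parallel(tree):
    """True if the module directly imports one of PARALLEL_MODULES."""
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            roots = {alias.name.split(".")[0] for alias in node.names}
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            roots = {node.module.split(".")[0]}
        else:
            continue
        if roots & PARALLEL_MODULES:
            return True
    return False


def classify(source):
    """Label a module's source by guard presence and direct parallel imports."""
    tree = ast.parse(source)
    if not has_main_guard(tree):
        return "no guard"
    if imports_parallel(tree):
        return "guard, parallel import"
    return "guard, no parallel import"
```

Files labelled "guard, no parallel import" would be the first candidates to review by hand; the label alone does not prove the guard is unnecessary.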


2 participants