SUMM: Non-reproducible file references #5755

Open
23 tasks
jbrockmendel opened this issue May 19, 2019 · 4 comments

Comments

@jbrockmendel
Contributor

jbrockmendel commented May 19, 2019

Most of these are hard-coded absolute paths in the R or Stata scripts used to create results files. They need to be replaced with paths relative to the repo so that the results can be reproduced on another machine.

  • koul_and_mc.py calls np.genfromtxt('/home/justin/rverify.csv', ...)
  • iolib/tests/gen_dates.do insheet using "/home/skipper/statsmodels/statsmodels-skipper/statsmodels/iolib/tests/stata_dates.csv"
  • sandbox/examples/example_sysreg.py data3 = np.genfromtxt('/home/skipper/school/MetricsII/Greene 150 TableF5-1.txt', names=True)
  • sandbox/tests/GreeneEx15_1.s dta <- read.table('/home/skipper/school/MetricsII/Greene\ TableF5-1.txt', header = TRUE)
  • stats/tests/results/anova.R source('/home/skipper/statsmodels/statsmodels/tools/topy.R')
  • tsa/tests/arima.do insheet using "/home/skipper/statsmodels/statsmodels-skipper/statsmodels/datasets/macrodata/macrodata.csv" (and a couple others)
  • tsa/tests/arima112.do insheet using "/home/skipper/statsmodels/statsmodels-skipper/statsmodels/datasets/macrodata/macrodata.csv" (and a couple others)
  • tsa/tests/arima211.do insheet using "/home/skipper/statsmodels/statsmodels-skipper/statsmodels/datasets/macrodata/macrodata.csv" (and a couple others)
  • tsa/tests/results/arima.R dta <- read.csv("/home/skipper/statsmodels/statsmodels-skipper/statsmodels/datasets/macrodata/macrodata.csv")
  • tsa/tests/results/arima_forecast.inp open /home/skipper/statsmodels/statsmodels/statsmodels/datasets/macrodata/macrodata.csv (one other)
  • tsa/tests/results/arma_forecast.inp open /home/skipper/statsmodels/statsmodels/statsmodels/tsa/tests/results/y_arma_data.csv
  • tsa/tests/results/corrgram.do insheet using "/home/skipper/statsmodels/statsmodels-skipper/statsmodels/datasets/macrodata/macrodata.csv", double clear (one other)
  • tsa/vector_ar/tests/svar.do insheet using "/home/skipper/statsmodels/statsmodels-skipper/scikits/statsmodels/datasets/macrodata/macrodata.csv", double clear
  • tsa/vector_ar/tests/svartest.R data <- read.csv("C:\\statsmodels\\statsmodels-bartbkr\\scikits\\statsmodels\\datasets\\macrodata\\macrodata.csv") (and 2 more commented-out)
  • tsa/vector_ar/tests/var.R data <- read.csv('/home/wesm/code/statsmodels/scikits/statsmodels/datasets/macrodata/macrodata.csv')
  • stats/libqsturng/CH.r setwd('D:\\USERS\\roger\\programming\\python\\development\\qsturng')
  • gam/tests/results/results_mpg_bs_poisson.r source("M:\\josef_new\\eclipse_ws\\statsmodels\\statsmodels_py34_pr\\tools\\R2nparray\\R\\R2nparray.R") (@josef-pkt I've seen R2nparray used elsewhere with just library(R2nparray); any chance that would work here?)
  • tsa/statespace/tests/results/test_var.R dta <- read.dta13('~/projects/statsmodels/statsmodels/tsa/tests/results/lutkepohl2.dta') (one other)
  • tsa/statespace/tests/results/test_exact_diffuse_filtering_multivariate.R setwd('~/projects/statsmodels-0.9/statsmodels/')
  • tsa/statespace/tests/results/test_exact_diffuse_filtering.R setwd('~/projects/statsmodels-0.9/statsmodels/')
  • regression/tests/results/test_rls.R macrodata <- read.csv('/Users/fulton/projects/statsmodels/statsmodels/datasets/macrodata/macrodata.csv') (one other)
  • regression/tests/results/test_rls.do insheet using /Users/fulton/projects/statsmodels/statsmodels/datasets/macrodata/macrodata.csv, clear
  • compare_arma.py references r'C:\Josef\eclipsegworkspace\statsmodels-josef-experimental-gsoc\scikits\statsmodels\tsa\y_arma22.txt'; that file doesn't appear to exist anymore.
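For the Python cases, a minimal sketch of the kind of change meant here: build the data path from the location of the script itself rather than from a home directory. The helper name `repo_relative` and the `results/` layout are hypothetical, chosen only for illustration; in a real script the anchor would normally be `__file__`.

```python
from pathlib import Path


def repo_relative(anchor_file, *parts):
    """Build a path relative to the directory that contains `anchor_file`."""
    return Path(anchor_file).resolve().parent.joinpath(*parts)


# Hypothetical layout: the CSV sits in a "results" directory next to the script.
# Inside koul_and_mc.py the first argument would be __file__.
csv_path = repo_relative("/repo/sandbox/koul_and_mc.py", "results", "rverify.csv")

# csv_path can then be passed to np.genfromtxt(csv_path, ...) in place of
# the hard-coded '/home/justin/rverify.csv'.
```

This only works when the referenced file is actually in the repo; files that were never committed still have to be tracked down first.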
@jbrockmendel jbrockmendel changed the title SUMM: Invalid file references SUMM: Non-reproducible file references May 19, 2019
@josef-pkt
Member

Doesn't work or not worth the trouble.
Everyone who wants to run the Stata or R files can replace the path.

I never tried to figure out if there would be a way to find the statsmodels file location from within Stata.
In any case, it is not worth the effort.
Same for R

(I'm running my Stata files from a stataworks directory and then copy the relevant files into statsmodels, and similarly for R.)
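One possible way around locating the repo from inside R, sketched here as an assumption rather than something the project does: have a small Python driver resolve the paths and hand them to the script as command-line arguments. `Rscript` forwards trailing arguments, which R reads with `commandArgs(trailingOnly = TRUE)`. The function name and the example paths are hypothetical.

```python
from pathlib import Path


def build_rscript_command(repo_root, script, data_file):
    """Return an argv list that runs `script` with the data path as its only argument."""
    root = Path(repo_root).resolve()
    return ["Rscript", str(root / script), str(root / data_file)]


# The R script would then read the path instead of hard-coding it:
#   args <- commandArgs(trailingOnly = TRUE)
#   dta <- read.csv(args[1])
cmd = build_rscript_command(
    ".",
    "statsmodels/tsa/tests/results/arima.R",
    "statsmodels/datasets/macrodata/macrodata.csv",
)
# cmd can be passed to subprocess.run(cmd, check=True) on a machine with R installed.
```

The sketch covers R only; whether something equivalent is practical for Stata is not addressed here.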

@jbrockmendel
Contributor Author

I care quite a bit about reproducibility, so I do judge it worth the effort.

Some of these can just be updated to point towards local paths. For others the original file is not in the repo, and they need to be tracked down.
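A rough sketch of how the two groups could be told apart automatically: scan the script files for machine-specific path prefixes and check whether a file with the same name exists anywhere in the repo. The regular expression, the suffix list, and the function names are assumptions for illustration, and the match is a heuristic (for example, a path containing spaces is cut off at the first space).

```python
import re
from pathlib import Path

# Heuristic: prefixes that usually indicate a machine-specific location.
HARDCODED = re.compile(r"""(?:/home/|/Users/|~/|(?<![A-Za-z])[A-Za-z]:\\)[^'"\s,)]*""")

# Assumed set of script types worth scanning.
SUFFIXES = {".py", ".R", ".r", ".do", ".inp", ".s"}


def find_hardcoded_paths(text):
    """Return (line_number, path) pairs for suspicious paths in `text`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in HARDCODED.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def scan_repo(root):
    """Yield (file, line_number, path, found_in_repo) for each hit under `root`."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    names = {p.name for p in files}
    for path in files:
        if path.suffix not in SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, ref in find_hardcoded_paths(text):
            # Split on both separators so Windows-style paths work too.
            basename = re.split(r"[\\/]", ref)[-1]
            yield path, lineno, ref, basename in names
```

Hits where the last field is true are candidates for a simple relative-path fix; the rest point at files that are probably missing from the repo.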

@josef-pkt
Member

Have you written a doc notebook yet? Or some additional unit tests using R or Stata?

We are not going to impose extra work on contributors who do provide R/Stata-verified unit tests.

@jbrockmendel
Contributor Author

Any thoughts on the R2nparray question? Also #5719 should be answerable in one sentence.
