Linear regression is not reproducible #27

mgralle · 2018-12-28T17:22:14Z

#!/usr/bin/env python3

# -*- coding: utf-8 -*-

"""
Created on Thu Dec 27 11:24:48 2018

Debugging script for the dowhy package, using the Lalonde data example.

Repetition of estimation using propensity score matching or weighting gives reproducible values, as expected. However, repetition of estimation using linear regression gives different values.
"""

#To simplify debugging, I obtained the Lalonde data as described on the DoWhy
#page and wrote it to a CSV file:

#from rpy2.robjects import r as R
#%load_ext rpy2.ipython
##%R install.packages("Matching")
#%R library(Matching)
#%R data(lalonde)
#%R -o lalonde
#lfile = open("lalonde.csv","w")
#lalonde.to_csv(lfile,index=False)
#lfile.close()

import pandas as pd
lalonde=pd.read_csv("lalonde.csv")

print("Lalonde data frame:")
print(lalonde.describe())

from dowhy.do_why import CausalModel

#1. Propensity score weighting

model=CausalModel(
data = lalonde,
treatment='treat',
outcome='re78',
common_causes='nodegr+black+hisp+age+educ+married'.split('+'))
identified_estimand = model.identify_effect()

psw_estimate = model.estimate_effect(identified_estimand,
method_name="backdoor.propensity_score_weighting")
print("\n(1) Causal Estimate from PS weighting is " + str(psw_estimate.value))

psw_estimate = model.estimate_effect(identified_estimand,
method_name="backdoor.propensity_score_weighting")
print("\n(2) Causal Estimate from PS weighting is " + str(psw_estimate.value))

#2. Propensity score matching
psm_estimate = model.estimate_effect(identified_estimand,
method_name="backdoor.propensity_score_matching")
print("\n(1) Causal estimate from PS matching is " + str(psm_estimate.value))

psm_estimate = model.estimate_effect(identified_estimand,
method_name="backdoor.propensity_score_matching")
print("\n(2) Causal estimate from PS matching is " + str(psm_estimate.value))

#3. Linear regression
linear_estimate = model.estimate_effect(identified_estimand,
method_name="backdoor.linear_regression",
test_significance=True)
print("\n(1) Causal estimate from linear regression is " + str(linear_estimate.value))

linear_estimate = model.estimate_effect(identified_estimand,
method_name="backdoor.linear_regression",
test_significance=True)
print("\n(2) Causal estimate from linear regression is " + str(linear_estimate.value))

#Recreate model from scratch for linear regression

model=CausalModel(
data = lalonde,
treatment='treat',
outcome='re78',
common_causes='nodegr+black+hisp+age+educ+married'.split('+'))

identified_estimand = model.identify_effect()

linear_estimate = model.estimate_effect(identified_estimand,
method_name="backdoor.linear_regression",
test_significance=True)
print("\n(3) Causal estimate from linear regression is " + str(linear_estimate.value))

print("\nLalonde Data frame hasn't changed:")
print(lalonde.describe())
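
The script above repeats each estimator by hand and compares the printed values by eye. A minimal sketch of a helper that automates that comparison follows; `distinct_results` and `is_reproducible` are hypothetical names introduced here, not part of DoWhy, and the relative tolerance of 1e-9 is an assumption chosen to ignore last-digit floating-point noise.

```python
import math
from typing import Callable, List


def distinct_results(estimate: Callable[[], float], runs: int = 3,
                     rel_tol: float = 1e-9) -> List[float]:
    """Call `estimate` `runs` times and return the values that differ
    from every earlier value by more than `rel_tol` (relative)."""
    seen: List[float] = []
    for _ in range(runs):
        value = float(estimate())
        # Treat values within rel_tol of an earlier one as the same result.
        if not any(math.isclose(value, old, rel_tol=rel_tol) for old in seen):
            seen.append(value)
    return seen


def is_reproducible(estimate: Callable[[], float], runs: int = 3,
                    rel_tol: float = 1e-9) -> bool:
    """True when all repeated calls agree within `rel_tol`."""
    return len(distinct_results(estimate, runs, rel_tol)) == 1


# Possible use with the objects defined in the script above:
# is_reproducible(lambda: model.estimate_effect(
#     identified_estimand,
#     method_name="backdoor.linear_regression").value)
```

A tolerance-based comparison is used rather than `==` so that a difference in the last printed digit does not count as a failure.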

amit-sharma · 2019-01-21T11:34:11Z

Thanks for raising this, @mgralle! I am trying to understand the source of the error. When I run this script on my local machine, I obtain the same value each time (1671.13) for linear regression.
Can you share samples of the different estimate values that you got?

mgralle · 2019-01-21T19:30:14Z

Thanks for paying attention to this problem!
When running my script on python 3.7 in Spyder 3.3.1 in Anaconda, I get the following estimated values from linear regression:
(1) Causal estimate from linear regression is 1671.1308841235102
(2) Causal estimate from linear regression is 937.2998778128333
(3) Causal estimate from linear regression is -333.2514015372243

Re-running the entire script, I get:
(1) Causal estimate from linear regression is 1671.1308841235102
(2) Causal estimate from linear regression is -782.0656222357087
(3) Causal estimate from linear regression is -861.948525208911

After exiting Spyder and re-entering, I get:
(1) Causal estimate from linear regression is 1671.1308841235111
(2) Causal estimate from linear regression is 741.7360515239643
(3) Causal estimate from linear regression is -74.26662536330865

Hope that helps!
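
One way to narrow down whether drifting values like these come from DoWhy or from the numerical libraries underneath it is to compute the regression coefficient independently of both. The sketch below is not DoWhy's implementation: it assumes the linear-regression estimate corresponds to the ordinary least squares coefficient on the treatment column when the outcome is regressed on treatment plus the common causes, and it solves the normal equations in plain Python so that it does not depend on numpy, scipy, or MKL. `ols_coefficients` is a hypothetical helper name.

```python
from typing import List, Sequence


def ols_coefficients(rows: Sequence[Sequence[float]],
                     y: Sequence[float]) -> List[float]:
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination.

    An intercept column is prepended, so the result is
    [intercept, coef_for_column_0, coef_for_column_1, ...].
    """
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular or collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    # The matrix is now diagonal; read off each coefficient.
    return [A[i][k] / A[i][i] for i in range(k)]
```

With the data frame from the script, the rows could be built as `lalonde[['treat'] + covariates].values.tolist()` and the outcome as `lalonde['re78'].tolist()`, in which case the coefficient on `treat` is element 1 of the result. If this value is stable across runs while the DoWhy estimate is not, the installed numerical stack becomes the more likely suspect.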

akelleh · 2019-02-22T20:30:37Z

unable to recreate on python3.6. Making a fresh install of python3.7 to test your script there

akelleh · 2019-02-22T21:25:00Z

on a fresh install of python3.7, i get

(1) Causal estimate from linear regression is 1671.130884123515
(2) Causal estimate from linear regression is 1671.130884123515
(3) Causal estimate from linear regression is 1671.130884123515

so can't reproduce. @mgralle can you provide minimal steps to reproduce in a fresh venv?

mgralle · 2019-02-23T01:21:26Z

I will see if I manage to build a docker image where I can reproduce the error.

akelleh · 2019-02-23T11:21:49Z

perfect! Thanks so much.

amit-sharma · 2019-02-27T12:18:48Z

Thanks @mgralle for spending the time. Let us know when you have an update.

mgralle · 2019-03-01T15:13:02Z

Hi, sorry it took me some days, I had other urgent tasks...

I created a minimal virtual environment "dowhy_test_list.txt" for the dowhy_lalonde_debug.py file (the one I posted on GitHub) using conda, and everything works just fine. On the other hand, using the much fuller "base" virtual environment that anaconda initially installed, I continue getting the bizarre behavior I posted. It is clear that the problem is with some other package and not with dowhy_test itself. Both environments run python 3.7.2.

Just in case you are interested, I have attached the output of "conda list -n base" and "conda list -n dowhy_test".

Thanks a lot for taking care of this seeming bug!
Best, Matthias


# packages in environment at /Users/mgralle/anaconda3: # # Name Version Build Channel _ipyw_jlab_nb_ext_conf 0.1.0 py37_0 _r-mutex 1.0.0 anacondar_1 alabaster 0.7.11 py37_0 anaconda 5.3.1 py37_0 anaconda-client 1.7.2 py37_0 anaconda-navigator 1.9.6 py37_0 anaconda-project 0.8.2 py37_0 appdirs 1.4.3 py37h28b3542_0 appnope 0.1.0 py37_0 appscript 1.0.1 py37h1de35cc_1 asn1crypto 0.24.0 py37_0 astroid 2.0.4 py37_0 astropy 3.0.4 py37h1de35cc_0 atomicwrites 1.2.1 py37_0 attrs 18.2.0 py37h28b3542_0 automat 0.7.0 py37_0 babel 2.6.0 py37_0 backcall 0.1.0 py37_0 backports 1.0 py37_1 backports.shutil_get_terminal_size 1.0.0 py37_2 beautifulsoup4 4.6.3 py37_0 bitarray 0.8.3 py37h1de35cc_0 bkcharts 0.2 py37_0 blas 1.0 mkl blaze 0.11.3 py37_0 bleach 2.1.4 py37_0 blosc 1.14.4 hd9629dc_0 bokeh 0.13.0 py37_0 boto 2.49.0 py37_0 bottleneck 1.2.1 py37h1d22016_1 bwidget 1.9.11 1 bzip2 1.0.6 h1de35cc_5 ca-certificates 2019.1.23 0 cairo 1.14.12 hc4e6be7_4 cctools 895 h7512d6f_0 certifi 2018.11.29 py37_0 cffi 1.11.5 py37h6174b99_1 chardet 3.0.4 py37_1 clang 4.0.1 h662ec87_0 clang_osx-64 4.0.1 h1ce6c1d_11 clangxx 4.0.1 hc9b4283_0 clangxx_osx-64 4.0.1 h22b1bf0_11 click 6.7 py37_0 cloudpickle 0.5.5 py37_0 clyent 1.2.2 py37_1 colorama 0.3.9 py37_0 compiler-rt 4.0.1 h5487866_0 conda 4.6.7 py37_0 conda-build 3.15.1 py37_0 conda-env 2.6.0 1 constantly 15.1.0 py37h28b3542_0 contextlib2 0.5.5 py37_0 cryptography 2.5 py37ha12b0ac_0 curl 7.63.0 ha441bb4_1000 cycler 0.10.0 py37_0 cyrus-sasl 2.1.26 hb48c43a_4 cython 0.28.5 py37h0a44026_0 cytoolz 0.9.0.1 py37h1de35cc_1 dask 0.19.1 py37_0 dask-core 0.19.1 py37_0 datashape 0.5.4 py37_1 dbus 1.13.2 h760590f_1 decorator 4.3.0 py37_0 defusedxml 0.5.0 py37_1 distributed 1.23.1 py37_0 docutils 0.14 py37_0 entrypoints 0.2.3 py37_2 et_xmlfile 1.0.1 py37_0 expat 2.2.6 h0a44026_0 fastcache 1.0.2 py37h1de35cc_2 filelock 3.0.8 py37_0 flask 1.0.2 py37_1 flask-cors 3.0.6 py37_0 font-ttf-dejavu-sans-mono 2.37 h6964260_0 font-ttf-inconsolata 2.001 hcb22688_0 
font-ttf-source-code-pro 2.030 h7457263_0 font-ttf-ubuntu 0.83 h8b1ccd4_0 fontconfig 2.13.0 h5d5b041_1 fonts-anaconda 1 h8fa9717_0 freetype 2.9.1 hb4e5f40_0 fribidi 1.0.5 h1de35cc_0 get_terminal_size 1.0.0 h7520d66_0 gettext 0.19.8.1 h15daf44_3 gevent 1.3.6 py37h1de35cc_0 gfortran_osx-64 4.8.5 h22b1bf0_5 glib 2.56.2 hd9629dc_0 glob2 0.6 py37_0 gmp 6.1.2 hb37e062_1 gmpy2 2.0.8 py37h6ef4df4_2 graphite2 1.3.12 h2098e52_2 greenlet 0.4.15 py37h1de35cc_0 gsl 2.4 h1de35cc_4 h5py 2.8.0 py37h878fce3_3 harfbuzz 1.8.8 hb8d4a28_0 hdf5 1.10.2 hfa1e0ec_1 heapdict 1.0.0 py37_2 html5lib 1.0.1 py37_0 hyperlink 18.0.0 py37_0 icu 58.2 h4b95b61_1 idna 2.7 py37_0 imageio 2.4.1 py37_0 imagesize 1.1.0 py37_0 incremental 17.5.0 py37_0 intel-openmp 2019.0 118 ipykernel 4.9.0 py37_1 ipython 6.5.0 py37_0 ipython_genutils 0.2.0 py37_0 ipywidgets 7.4.1 py37_0 isort 4.3.4 py37_0 itsdangerous 0.24 py37_1 jbig 2.1 h4d881f8_0 jdcal 1.4 py37_0 jedi 0.12.1 py37_0 jinja2 2.10 py37_0 jpeg 9b he5867d9_2 jsonschema 2.6.0 py37_0 jupyter 1.0.0 py37_7 jupyter_client 5.2.3 py37_0 jupyter_console 5.2.0 py37_1 jupyter_core 4.4.0 py37_0 jupyterlab 0.34.9 py37_0 jupyterlab_launcher 0.13.1 py37_0 keyring 13.2.1 py37_0 kiwisolver 1.0.1 py37h0a44026_0 krb5 1.16.1 hddcf347_7 lazy-object-proxy 1.3.1 py37h1de35cc_2 ld64 274.2 h7c2db76_0 libcurl 7.63.0 h051b688_1000 libcxx 4.0.1 h579ed51_0 libcxxabi 4.0.1 hebd6815_0 libdb 6.1.26 h0a44026_0 libedit 3.1.20170329 hb402a30_2 libffi 3.2.1 h475c297_4 libgfortran 3.0.1 h93005f0_2 libiconv 1.15 hdd342a3_7 libntlm 1.4 h1de35cc_2 libopenblas 0.3.3 hdc02c5d_3 libpng 1.6.34 he12f830_0 libsodium 1.0.16 h3efe00b_0 libssh2 1.8.0 ha12b0ac_4 libtiff 4.0.9 hcb84e12_2 libxml2 2.9.8 hab757c2_1 libxslt 1.1.32 hb819dd2_0 llvm 4.0.1 hc748206_0 llvm-lto-tapi 4.0.1 h6701bc3_0 llvm-openmp 4.0.1 hcfea43d_1 llvmlite 0.24.0 py37hc454e04_0 locket 0.2.0 py37_1 lxml 4.2.5 py37hef8c89e_0 lzo 2.10 h362108e_2 make 4.2.1 h3efe00b_1 markupsafe 1.0 py37h1de35cc_1 matplotlib 2.2.3 py37h54f8f79_0 mccabe 
0.6.1 py37_1 mistune 0.8.3 py37h1de35cc_1 mkl 2019.0 118 mkl-service 1.1.2 py37h6b9c3cc_5 mkl_fft 1.0.4 py37h5d10147_1 mkl_random 1.0.1 py37h5d10147_1 more-itertools 4.3.0 py37_0 mpc 1.1.0 h6ef4df4_1 mpfr 4.0.1 h3018a27_3 mpmath 1.0.0 py37_2 msgpack-python 0.5.6 py37h04f5b5a_1 multipledispatch 0.6.0 py37_0 navigator-updater 0.2.1 py37_0 nbconvert 5.4.0 py37_1 nbformat 4.4.0 py37_0 ncurses 6.1 h0a44026_0 networkx 2.1 py37_0 nltk 3.3.0 py37_0 nose 1.3.7 py37_2 notebook 5.6.0 py37_0 numba 0.39.0 py37h6440ff4_0 numexpr 2.6.8 py37h1dc9127_0 numpy 1.15.1 py37h6a91979_0 numpy-base 1.15.1 py37h8a80b8c_0 numpydoc 0.8.0 py37_0 odo 0.5.1 py37_0 olefile 0.46 py37_0 openpyxl 2.5.6 py37_0 openssl 1.1.1b h1de35cc_0 packaging 17.1 py37_0 pandas 0.23.4 py37h6440ff4_0 pandoc 1.19.2.1 ha5e8f32_1 pandocfilters 1.4.2 py37_1 pango 1.42.4 h060686c_0 parso 0.3.1 py37_0 partd 0.3.8 py37_0 path.py 11.1.0 py37_0 pathlib2 2.3.2 py37_0 patsy 0.5.0 py37_0 pcre 8.42 h378b8a2_0 pep8 1.7.1 py37_0 pexpect 4.6.0 py37_0 pickleshare 0.7.4 py37_0 pillow 5.2.0 py37hb68e598_0 pip 10.0.1 py37_0 pixman 0.34.0 hca0a616_3 pkginfo 1.4.2 py37_1 pluggy 0.7.1 py37h28b3542_0 ply 3.11 py37_0 prometheus_client 0.3.1 py37h28b3542_0 prompt_toolkit 1.0.15 py37_0 psutil 5.4.7 py37h1de35cc_0 ptyprocess 0.6.0 py37_0 py 1.6.0 py37_0 pyasn1 0.4.4 py37h28b3542_0 pyasn1-modules 0.2.2 py37_0 pycodestyle 2.4.0 py37_0 pycosat 0.6.3 py37h1de35cc_0 pycparser 2.18 py37_1 pycrypto 2.6.1 py37h1de35cc_9 pycurl 7.43.0.2 py37ha12b0ac_0 pydot 1.4.1 pypi_0 pypi pyflakes 2.0.0 py37_0 pygments 2.2.0 py37_0 pyhamcrest 1.9.0 py_2 conda-forge pylint 2.1.1 py37_0 pyodbc 4.0.24 py37h0a44026_0 pyopenssl 18.0.0 py37_0 pyparsing 2.2.0 py37_1 pyqt 5.9.2 py37h655552a_2 pysocks 1.6.8 py37_0 pytables 3.4.4 py37h13cba08_0 pytest 3.8.0 py37_0 pytest-arraydiff 0.2 py37h39e3cac_0 pytest-astropy 0.4.0 py37_0 pytest-doctestplus 0.1.3 py37_0 pytest-openfiles 0.3.0 py37_0 pytest-remotedata 0.3.0 py37_0 python 3.7.2 haf84260_0 python-dateutil 2.7.3 py37_0 
python.app 2 py37_8 pytz 2018.5 py37_0 pywavelets 1.0.0 py37h1d22016_0 pyyaml 3.13 py37h1de35cc_0 pyzmq 17.1.2 py37h1de35cc_0 qt 5.9.6 h45cd832_2 qtawesome 0.4.4 py37_0 qtconsole 4.4.1 py37_0 qtpy 1.5.0 py37_0 r-abind 1.4_5 r351hf348343_0 r r-assertthat 0.2.0 r351hf348343_0 r-backports 1.1.2 r351h6402f54_0 r-base 3.5.1 h539fb6c_1 r-base64enc 0.1_3 r351h6402f54_4 r-bh 1.66.0_1 r351hf348343_0 r-bindr 0.1.1 r351hf348343_0 r-bindrcpp 0.2.2 r351h32998d9_0 r-bit 1.1_14 r351h6402f54_0 r-bit64 0.9_7 r351h6402f54_0 r-bitops 1.0_6 r351h6402f54_4 r-blob 1.1.1 r351hf348343_0 r-broom 0.5.0 r351hf348343_0 r-callr 2.0.4 r351hf348343_0 r r-car 3.0_0 r351hf348343_0 r r-cardata 3.0_1 r351hf348343_0 r r-catools 1.17.1.1 r351h32998d9_0 r-cellranger 1.1.0 r351hf348343_0 r-cli 1.0.0 r351h6115d3f_0 r-clipr 0.4.1 r351hf348343_0 r r-colorspace 1.3_2 r351h6402f54_0 r r-config 0.3 r351hf348343_0 r-crayon 1.3.4 r351hf348343_0 r-curl 3.2 r351h6402f54_0 r-data.table 1.11.4 r351h6402f54_0 r r-dbi 1.0.0 r351hf348343_0 r-dbplyr 1.2.2 r351hf348343_0 r-dichromat 2.0_0 r351hf348343_4 r r-digest 0.6.15 r351h6402f54_0 r-dplyr 0.7.6 r351h32998d9_0 r-evaluate 0.11 r351hf348343_0 r-fansi 0.2.3 r351h6402f54_0 r-forcats 0.3.0 r351hf348343_0 r-foreign 0.8_71 r351h6402f54_0 r r-ggplot2 3.0.0 r351hf348343_0 r r-glue 1.3.0 r351h6402f54_0 r-gtable 0.2.0 r351hf348343_0 r r-haven 1.1.2 r351h32998d9_0 r-highr 0.7 r351hf348343_0 r-hms 0.4.2 r351hf348343_0 r-htmltools 0.3.6 r351h32998d9_0 r-htmlwidgets 1.2 r351hf348343_0 r-httpuv 1.4.5 r351h32998d9_0 r-httr 1.3.1 r351hf348343_0 r-jsonlite 1.5 r351h6402f54_0 r-knitr 1.20 r351hf348343_0 r-labeling 0.3 r351hf348343_4 r r-later 0.7.3 r351h32998d9_0 r-lattice 0.20_35 r351h6402f54_0 r-lavaan 0.6_2 r351hf348343_0 r-lazyeval 0.2.1 r351h6402f54_0 r-lme4 1.1_17 r351h32998d9_0 r r-lubridate 1.7.4 r351h32998d9_0 r r-magrittr 1.5 r351hf348343_4 r-maptools 0.9_3 r351h6402f54_0 r r-markdown 0.8 r351h6402f54_0 r-mass 7.3_50 r351h6402f54_0 r r-matching 4.9_3 r351h9d2a408_0 
conda-forge r-matrix 1.2_14 r351h6402f54_0 r r-matrixmodels 0.4_1 r351hf348343_4 r r-memoise 1.1.0 r351hf348343_0 r-mgcv 1.8_24 r351h6402f54_0 r r-mime 0.5 r351h6402f54_0 r-miniui 0.1.1.1 r351hf348343_0 r-minqa 1.2.4 r351h4496799_4 r r-mnormt 1.5_5 r351h0b560c1_0 r-modelr 0.1.2 r351hf348343_0 r r-mongolite 1.6 r351h46e59ec_1 r-munsell 0.5.0 r351hf348343_0 r r-nlme 3.1_137 r351h0b560c1_0 r-nloptr 1.0.4 r351h32998d9_4 r r-nnet 7.3_12 r351h6402f54_0 r r-numderiv 2016.8_1 r351hf348343_0 r-odbc 1.1.5 r351h0a44026_0 r-openssl 1.0.2 r351h46e59ec_1 r-openxlsx 4.1.0 r351h32998d9_0 r r-packrat 0.4.9_3 r351hf348343_0 r-pbivnorm 0.6.0 r351h0b560c1_1 r-pbkrtest 0.4_7 r351hf348343_0 r r-pillar 1.3.0 r351hf348343_0 r-pkgconfig 2.0.1 r351hf348343_0 r-pki 0.1_5.1 r351h46e59ec_1 r-plogr 0.2.0 r351hf348343_0 r-plyr 1.8.4 r351h32998d9_0 r-praise 1.0.0 r351hf348343_4 r r-prettyunits 1.0.2 r351hf348343_0 r-processx 3.1.0 r351h32998d9_0 r r-profvis 0.3.5 r351h6402f54_0 r-promises 1.0.1 r351h32998d9_0 r-purrr 0.2.5 r351h6402f54_0 r-quantreg 5.36 r351h0b560c1_0 r r-r6 2.2.2 r351hf348343_0 r-rappdirs 0.3.1 r351h6402f54_0 r-rcolorbrewer 1.1_2 r351hf348343_0 r r-rcpp 0.12.18 r351h32998d9_0 r-rcppeigen 0.3.3.4.0 r351h32998d9_0 r r-rcurl 1.95_4.11 r351h6402f54_0 r-readr 1.1.1 r351h32998d9_0 r-readxl 1.1.0 r351h32998d9_0 r-rematch 1.0.1 r351hf348343_0 r-reprex 0.2.0 r351hf348343_0 r r-reshape2 1.4.3 r351h32998d9_0 r-rio 0.5.10 r351hf348343_0 r r-rjava 0.9_10 r351h6402f54_0 r-rjdbc 0.2_7.1 r351hf348343_0 r-rjsonio 1.3_0 r351h32998d9_4 r-rlang 0.2.1 r351h6402f54_0 r-rmarkdown 1.10 r351hf348343_0 r-rprojroot 1.3_2 r351hf348343_0 r-rsconnect 0.8.8 r351hf348343_0 r-rsqlite 2.1.1 r351h32998d9_0 r-rstudioapi 0.7 r351hf348343_0 r-rvest 0.3.2 r351hf348343_0 r r-scales 0.5.0 r351h32998d9_0 r r-selectr 0.4_1 r351hf348343_0 r r-shiny 1.1.0 r351hf348343_0 r-sourcetools 0.1.7 r351h32998d9_0 r-sp 1.3_1 r351h6402f54_0 r r-sparklyr 0.8.4 r351hf348343_0 r-sparsem 1.77 r351h0b560c1_0 r r-stringi 1.2.4 
r351h32998d9_0 r-stringr 1.3.1 r351hf348343_0 r-testthat 2.0.0 r351h32998d9_0 r r-tibble 1.4.2 r351h6402f54_0 r-tidyr 0.8.1 r351h32998d9_0 r-tidyselect 0.2.4 r351h32998d9_0 r-tidyverse 1.2.1 r351hf348343_0 r r-tinytex 0.6 r351hf348343_0 r-utf8 1.1.4 r351h6402f54_0 r-viridislite 0.3.0 r351hf348343_0 r r-whisker 0.3_2 r351hf348343_4 r r-withr 2.1.2 r351hf348343_0 r-xfun 0.3 r351hf348343_0 r-xml2 1.2.0 r351h32998d9_0 r-xtable 1.8_2 r351hf348343_0 r-yaml 2.2.0 r351h6402f54_0 r-zip 1.0.0 r351h6402f54_0 r readline 7.0 h1de35cc_5 requests 2.19.1 py37_0 rope 0.11.0 py37_0 rpy2 2.9.4 py37r351h1d22016_0 rstudio 1.1.456 h04f5b5a_1 ruamel_yaml 0.15.46 py37h1de35cc_0 scikit-image 0.14.0 py37h0a44026_1 scikit-learn 0.19.2 py37h4f467ca_0 scipy 1.1.0 py37h28f7352_1 seaborn 0.9.0 py37_0 send2trash 1.5.0 py37_0 service_identity 17.0.0 py37h28b3542_0 setuptools 40.2.0 py37_0 simplegeneric 0.8.1 py37_2 singledispatch 3.4.0.3 py37_0 sip 4.19.8 py37h0a44026_0 six 1.11.0 py37_1 snappy 1.1.7 he62c110_3 snowballstemmer 1.2.1 py37_0 sortedcollections 1.0.1 py37_0 sortedcontainers 2.0.5 py37_0 sphinx 1.7.9 py37_0 sphinxcontrib 1.0 py37_1 sphinxcontrib-websupport 1.1.0 py37_1 spyder 3.3.1 py37_1 spyder-kernels 0.2.6 py37_0 sqlalchemy 1.2.11 py37h1de35cc_0 sqlite 3.26.0 ha441bb4_0 statsmodels 0.9.0 py37h1d22016_0 sympy 1.2 py37_0 tblib 1.3.2 py37_0 terminado 0.8.1 py37_1 testpath 0.3.1 py37_0 tk 8.6.8 ha441bb4_0 tktable 2.10 h1de35cc_0 toolz 0.9.0 py37_0 tornado 5.1 py37h1de35cc_0 tqdm 4.26.0 py37h28b3542_0 traitlets 4.3.2 py37_0 twisted 18.9.0 py37h470a237_0 conda-forge tzlocal 1.5.1 py37_0 unicodecsv 0.14.1 py37_0 unixodbc 2.3.7 h1de35cc_0 urllib3 1.23 py37_0 wcwidth 0.1.7 py37_0 webencodings 0.5.1 py37_1 werkzeug 0.14.1 py37_0 wheel 0.31.1 py37_0 widgetsnbextension 3.4.1 py37_0 wrapt 1.10.11 py37h1de35cc_2 xlrd 1.1.0 py37_1 xlsxwriter 1.1.0 py37_0 xlwings 0.11.8 py37_0 xlwt 1.3.0 py37_0 xz 5.2.4 h1de35cc_4 yaml 0.1.7 hc338f04_2 zeromq 4.2.5 h0a44026_1 zict 0.1.3 py37_0 zlib 1.2.11 
hf3cbc9b_2 zope 1.0 py37_1 zope.interface 4.5.0 py37h1de35cc_0

# packages in environment at /Users/mgralle/anaconda3/envs/dowhy_test:
#
# Name                    Version                   Build  Channel
blas                      1.0                         mkl
ca-certificates           2019.1.23                     0
certifi                   2018.11.29               py37_0
decorator                 4.3.2                    py37_0
fastcache                 1.0.2            py37h1de35cc_2
gmp                       6.1.2                hb37e062_1
gmpy2                     2.0.8            py37h6ef4df4_2
intel-openmp              2019.1                      144
libcxx                    4.0.1                hcfea43d_1
libcxxabi                 4.0.1                hcfea43d_1
libedit                   3.1.20181209         hb402a30_0
libffi                    3.2.1                h475c297_4
libgfortran               3.0.1                h93005f0_2
mkl                       2019.1                      144
mkl_fft                   1.0.10           py37h5e564d8_0
mkl_random                1.0.2            py37h27c97d8_0
mpc                       1.1.0                h6ef4df4_1
mpfr                      4.0.1                h3018a27_3
mpmath                    1.1.0                    py37_0
ncurses                   6.1                  h0a44026_1
networkx                  2.2                      py37_1
numpy                     1.15.4           py37hacdab7b_0
numpy-base                1.15.4           py37h6575580_0
openssl                   1.1.1b               h1de35cc_0
pandas                    0.24.1           py37h0a44026_0
pip                       1.5.4                    pypy_0    quasiben
python                    3.7.2                haf84260_0
python-dateutil           2.7.5                    py37_0
pytz                      2018.9                   py37_0
readline                  7.0                  h1de35cc_5
scikit-learn              0.20.2           py37h27c97d8_0
scipy                     1.2.1            py37h1410ff5_0
setuptools                40.8.0                   py37_0
six                       1.12.0                   py37_0
sqlite                    3.26.0               ha441bb4_0
sympy                     1.3                      py37_0
tk                        8.6.8                ha441bb4_0
xz                        5.2.4                h1de35cc_4
zlib                      1.2.11               h1de35cc_3
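
Comparing two listings like these by eye is error-prone. As a rough sketch, the raw text of two `conda list` outputs can be parsed and the shared packages with differing versions reported. The function names below are hypothetical, and the parser assumes the usual one-package-per-line layout (name, version, build, optional channel) with `#` header lines.

```python
from typing import Dict, List, Tuple


def parse_conda_list(text: str) -> Dict[str, str]:
    """Map package name -> version from raw `conda list` output."""
    packages: Dict[str, str] = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the '#' header lines.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        packages[fields[0]] = fields[1]
    return packages


def version_differences(a: Dict[str, str],
                        b: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (name, version_in_a, version_in_b) for shared packages
    whose versions differ, sorted by name."""
    return [(name, a[name], b[name])
            for name in sorted(a.keys() & b.keys())
            if a[name] != b[name]]
```

Packages present in only one environment are ignored here on purpose; a set difference of the two key sets would list those separately.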

mgralle · 2019-03-01T18:32:03Z

Hi, I couldn't resist digging a bit deeper. As I wrote in my last message, with the originally installed "base" environment, I get the bizarre behavior. I cloned "base" and removed

1) all packages related to R: still bizarre (sorry, didn't export a yml file), so I also removed
2) spyder, which gave the following output:

The following packages will be REMOVED:
anaconda-5.3.1-py37_0
mkl-service-1.1.2-py37h6b9c3cc_5
mkl_fft-1.0.4-py37h5d10147_1
mkl_random-1.0.1-py37h5d10147_1
numpy-base-1.15.1-py37h8a80b8c_0
scikit-learn-0.19.2-py37h4f467ca_0
spyder-3.3.1-py37_1

The following packages will be DOWNGRADED:
mkl 2019.0-118 --> 2018.0.3-1
numpy 1.15.1-py37h6a91979_0 --> 1.11.3-py37heee0a97_5
scipy 1.1.0-py37h28f7352_1 --> 1.1.0-py37hf5b7bf4_0

and then added back either scikit-learn=0.19.2 or scikit-learn=0.20.1: expected behavior.

Since the minimal dowhy_test environment has mkl 2019.1, numpy 1.15.4 and scipy 1.2.1, the problem probably resides in anaconda or spyder (see attached environment.yml files). I suppose I won't be using spyder anymore!

Thanks for keeping in touch!

mgralle · 2019-03-01T19:12:12Z

Just to make it a bit more concise:

- Virtual environment dowhy_test: expected behavior
- On including newest version of anaconda: expected behavior (only anaconda-custom installed)
- The base version of Anaconda+Spyder that I was using had anaconda=5.3.1, so I tried this. On including anaconda=5.3.1 with attendant installation and downgrading of other packages: bizarre behavior
- Virtual environment base: bizarre behavior

akelleh · 2019-03-06T17:56:29Z

Great! It sounds like the solution might be to state that dowhy requires anaconda>5.3.1 and to pin package versions in requirements.txt to ones that are tested to work. The alternative would be to dig deep into which packages are failing, and that's a lot of time that could be spent on higher priorities (other bug fixes, adding features, documentation, tech debt). Do you agree?
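
Pinning versions only helps if the running environment actually matches the pins. A minimal sketch of a runtime check follows, assuming Python 3.8 or later (for `importlib.metadata`) and assuming the installed packages expose standard distribution metadata. The helper names and the `TESTED` dictionary are hypothetical; the versions in it are copied from the dowhy_test listing earlier in this thread, where the script behaved as expected, and are not an official DoWhy requirement.

```python
from importlib import metadata
from typing import Dict, Optional, Tuple


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def pin_mismatches(pins: Dict[str, str]) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map name -> (pinned, installed) for every pin that is not met exactly."""
    found = {name: installed_version(name) for name in pins}
    return {name: (pins[name], found[name])
            for name in pins if found[name] != pins[name]}


# Versions taken from the dowhy_test listing in this thread (an assumption
# about what counts as "tested to work", not a published requirement).
TESTED = {"numpy": "1.15.4", "scipy": "1.2.1", "pandas": "0.24.1",
          "scikit-learn": "0.20.2"}
```

Calling `pin_mismatches(TESTED)` at the top of a debugging script would show at a glance which of the suspect packages differ from the known-good environment.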

mgralle · 2019-03-06T18:07:39Z

Yes, in fact I agree that it's not worth digging into the details of the packages. My hunch is that there is a problem with the downgrading of some packages forced by anaconda=5.3.1. In any case, for myself I won't use anaconda and spyder anymore since they seem to introduce unnecessary complications.

mgralle closed this as completed Mar 6, 2019
