# Regression Beyond Food

Although we were able to fit models relating food variables to several measures of the risk of death from non-communicable diseases, we would expect other variables to account for some of the variation those models did not capture. To check whether other features (e.g. exercise or access to doctors) are also important, we can take the residuals of the fitted models and use them as the responses in new linear regression models with the non-food variables as features.
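The two-stage idea can be illustrated on synthetic data. This is only a minimal sketch: the variables `food`, `exercise` and `risk` are made-up stand-ins rather than the project's data, and plain NumPy least squares is used in place of the lasso models fitted earlier.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-ins (assumptions for illustration only)
food = rng.normal(size=n)       # a food feature
exercise = rng.normal(size=n)   # a non-food feature
risk = 2.0 * food - 1.5 * exercise + rng.normal(scale=0.1, size=n)

# Stage 1: regress the response on the food feature (with an intercept)
X1 = np.column_stack([np.ones(n), food])
beta1, *_ = np.linalg.lstsq(X1, risk, rcond=None)
resid = risk - X1 @ beta1

# Stage 2: regress the stage-1 residuals on the non-food feature
X2 = np.column_stack([np.ones(n), exercise])
beta2, *_ = np.linalg.lstsq(X2, resid, rcond=None)
stage2_slope = beta2[1]
```

Because the omitted feature here really does drive the response, the stage-2 slope lands close to its true effect of -1.5. In the real data, a slope indistinguishable from zero is what "no detectable relationship with the residuals" looks like.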

In [1]:
# importing modules
import pickle
import pandas as pd
import numpy as np
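# statsmodels.api exposes the array-based OLS class used below
# (recent statsmodels releases no longer provide OLS via statsmodels.formula.api)
import statsmodels.api as sm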
import sklearn as sk
from sklearn import linear_model
%matplotlib inline

First we need to load in both the new data and the models we calculated earlier.

In [19]:
# loading in the data (pickle files must be opened in binary mode on Python 3)
with open('data/final/beds_1970_2000.p', 'rb') as f:
    beds_1970_2000 = pickle.load(f)
with open('data/final/doctors_1970_2000.p', 'rb') as f:
    doctors_1970_2000 = pickle.load(f)
with open('data/clean/exer.p', 'rb') as f:
    exercise = pickle.load(f)
# also loading in the index for the countries we need to drop
with open('data/clean/countries_to_drop.p', 'rb') as f:
    countries_to_drop = pickle.load(f)
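Since the same loading steps repeat for every file, they could be wrapped in a small helper. The sketch below is one possible shape; `load_pickle` and the throwaway demo file are hypothetical names, not part of the original notebook.

```python
import os
import pickle
import tempfile


def load_pickle(path):
    """Load a single pickled object, opening the file in binary mode."""
    with open(path, 'rb') as handle:
        return pickle.load(handle)


# Round-trip demo using a temporary file (illustrative data only)
example = {'beds': [1.2, 3.4], 'doctors': [0.5, 2.1]}
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'example.p')
    with open(demo_path, 'wb') as handle:
        pickle.dump(example, handle)
    restored = load_pickle(demo_path)
```

With such a helper, each dataset would load in one line, for example `exercise = load_pickle('data/clean/exer.p')`.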

In [22]:
# dropping countries that we dropped during model fitting
exercise_cleaned = exercise.drop(countries_to_drop)

Next we identify the countries that have data in each of the new datasets, so that each regression uses only those countries for both the features and the residuals.

In [32]:
# dropping countries without data
exercise_cleaned = exercise_cleaned.dropna()
# getting the index for later use
exercise_countries = exercise_cleaned.index

beds_1970_2000_cleaned = beds_1970_2000.dropna()
# getting the index
beds_countries = beds_1970_2000_cleaned.index

doctors_1970_2000_cleaned = doctors_1970_2000.dropna()
# getting the index
doctors_countries = doctors_1970_2000_cleaned.index
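Selecting residuals by a feature dataset's index assumes every country in that index also appears in the residuals. A more defensive variant, sketched here with made-up country labels and values (`resid`, `feature` and the labels are hypothetical), intersects the two indexes first so that both sides of the regression are guaranteed to line up.

```python
import numpy as np
import pandas as pd

# Toy residuals and a toy feature table (illustrative values only)
resid = pd.Series([0.5, -1.0, 0.2, 0.3], index=['A', 'B', 'C', 'D'])
feature = pd.DataFrame({'Percent': [10.0, np.nan, 30.0]}, index=['A', 'B', 'E'])

# Drop countries with missing feature values, as in the cell above
feature_clean = feature.dropna()

# Keep only countries present in both the features and the residuals
common = feature_clean.index.intersection(resid.index)
y = resid.loc[common]
X = feature_clean.loc[common]
```

Here country `B` is dropped for missing data and country `E` is dropped because it has no residual, leaving only `A` in both `y` and `X`.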

In [3]:
# loading in the lasso models we selected from the scoring notebook
# (binary mode is required to unpickle on Python 3)
with open('data/models/risk_results_lasso.p', 'rb') as f:
    risk_results_lasso = pickle.load(f)
with open('data/models/deaths_all_results_lasso.p', 'rb') as f:
    deaths_all_results_lasso = pickle.load(f)
with open('data/models/deaths_cancer_results_lasso.p', 'rb') as f:
    deaths_cancer_results_lasso = pickle.load(f)
with open('data/models/deaths_cardio_results_lasso.p', 'rb') as f:
    deaths_cardio_results_lasso = pickle.load(f)
with open('data/models/deaths_diabetes_results_lasso.p', 'rb') as f:
    deaths_diabetes_results_lasso = pickle.load(f)
with open('data/models/deaths_resp_results_lasso.p', 'rb') as f:
    deaths_resp_results_lasso = pickle.load(f)

Now we need to collect the residuals of each model so we can use them as responses.

In [11]:
# collecting the residuals from each model
risk_lasso_resid = risk_results_lasso.resid
deaths_all_lasso_resid = deaths_all_results_lasso.resid
deaths_cancer_lasso_resid = deaths_cancer_results_lasso.resid
deaths_cardio_lasso_resid = deaths_cardio_results_lasso.resid
deaths_diabetes_lasso_resid = deaths_diabetes_results_lasso.resid
deaths_resp_lasso_resid = deaths_resp_results_lasso.resid

## Exercise

Now we can fit models that use the relevant subset of each set of residuals as the response and each third-party dataset as the feature. We start with the exercise data.

In [36]:
risk_exercise_model = sm.OLS(risk_lasso_resid[exercise_countries], exercise_cleaned)
risk_exercise_results = risk_exercise_model.fit()
risk_exercise_results.summary()

Dep. Variable:,y,R-squared:,0.005
Model:,OLS,Adj. R-squared:,-0.003
Method:,Least Squares,F-statistic:,0.6043
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.438
Time:,21:32:21,Log-Likelihood:,-349.58
No. Observations:,120,AIC:,701.2
Df Residuals:,119,BIC:,703.9
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
Percent,0.0116,0.015,0.777,0.438,-0.018 0.041

Omnibus:,16.397,Durbin-Watson:,2.09
Prob(Omnibus):,0.0,Jarque-Bera (JB):,21.709
Skew:,0.733,Prob(JB):,1.93e-05
Kurtosis:,4.48,Cond. No.,1.0


Based on these results (p = 0.438 for the coefficient), the exercise data we collected does not appear to have a significant relationship with the residuals of the percent risk of death model. Let's run the same regression against the residuals of the age-standardized mortality rate lasso models as well.

In [37]:
deaths_all_exercise_model = sm.OLS(deaths_all_lasso_resid[exercise_countries], exercise_cleaned)
deaths_all_exercise_results = deaths_all_exercise_model.fit()
deaths_all_exercise_results.summary()

Dep. Variable:,y,R-squared:,0.0
Model:,OLS,Adj. R-squared:,-0.008
Method:,Least Squares,F-statistic:,0.0182
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.893
Time:,21:34:28,Log-Likelihood:,-728.79
No. Observations:,120,AIC:,1460.0
Df Residuals:,119,BIC:,1462.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
Percent,0.0473,0.351,0.135,0.893,-0.647 0.742

Omnibus:,11.281,Durbin-Watson:,1.992
Prob(Omnibus):,0.004,Jarque-Bera (JB):,13.081
Skew:,0.58,Prob(JB):,0.00144
Kurtosis:,4.128,Cond. No.,1.0


In [38]:
deaths_cancer_exercise_model = sm.OLS(deaths_cancer_lasso_resid[exercise_countries], exercise_cleaned)
deaths_cancer_exercise_results = deaths_cancer_exercise_model.fit()
deaths_cancer_exercise_results.summary()

Dep. Variable:,y,R-squared:,0.0
Model:,OLS,Adj. R-squared:,-0.008
Method:,Least Squares,F-statistic:,0.009
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.925
Time:,21:35:02,Log-Likelihood:,-566.34
No. Observations:,120,AIC:,1135.0
Df Residuals:,119,BIC:,1137.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
Percent,-0.0086,0.091,-0.095,0.925,-0.188 0.171

Omnibus:,63.451,Durbin-Watson:,1.657
Prob(Omnibus):,0.0,Jarque-Bera (JB):,355.576
Skew:,1.706,Prob(JB):,6.13e-78
Kurtosis:,10.712,Cond. No.,1.0


In [39]:
deaths_cardio_exercise_model = sm.OLS(deaths_cardio_lasso_resid[exercise_countries], exercise_cleaned)
deaths_cardio_exercise_results = deaths_cardio_exercise_model.fit()
deaths_cardio_exercise_results.summary()

Dep. Variable:,y,R-squared:,0.0
Model:,OLS,Adj. R-squared:,-0.008
Method:,Least Squares,F-statistic:,0.005395
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.942
Time:,21:35:25,Log-Likelihood:,-664.61
No. Observations:,120,AIC:,1331.0
Df Residuals:,119,BIC:,1334.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
Percent,0.0151,0.205,0.073,0.942,-0.392 0.422

Omnibus:,2.085,Durbin-Watson:,2.233
Prob(Omnibus):,0.353,Jarque-Bera (JB):,2.124
Skew:,0.303,Prob(JB):,0.346
Kurtosis:,2.758,Cond. No.,1.0


In [40]:
deaths_diabetes_exercise_model = sm.OLS(deaths_diabetes_lasso_resid[exercise_countries], exercise_cleaned)
deaths_diabetes_exercise_results = deaths_diabetes_exercise_model.fit()
deaths_diabetes_exercise_results.summary()

Dep. Variable:,y,R-squared:,0.007
Model:,OLS,Adj. R-squared:,-0.002
Method:,Least Squares,F-statistic:,0.7836
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.378
Time:,21:35:47,Log-Likelihood:,-500.34
No. Observations:,120,AIC:,1003.0
Df Residuals:,119,BIC:,1005.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
Percent,0.0463,0.052,0.885,0.378,-0.057 0.150

Omnibus:,68.572,Durbin-Watson:,1.874
Prob(Omnibus):,0.0,Jarque-Bera (JB):,349.388
Skew:,1.936,Prob(JB):,1.35e-76
Kurtosis:,10.409,Cond. No.,1.0


In [41]:
deaths_resp_exercise_model = sm.OLS(deaths_resp_lasso_resid[exercise_countries], exercise_cleaned)
deaths_resp_exercise_results = deaths_resp_exercise_model.fit()
deaths_resp_exercise_results.summary()

Dep. Variable:,y,R-squared:,0.002
Model:,OLS,Adj. R-squared:,-0.006
Method:,Least Squares,F-statistic:,0.255
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.614
Time:,21:36:16,Log-Likelihood:,-542.68
No. Observations:,120,AIC:,1087.0
Df Residuals:,119,BIC:,1090.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
Percent,0.0376,0.074,0.505,0.614,-0.110 0.185

Omnibus:,48.033,Durbin-Watson:,1.983
Prob(Omnibus):,0.0,Jarque-Bera (JB):,146.647
Skew:,1.462,Prob(JB):,1.43e-32
Kurtosis:,7.559,Cond. No.,1.0


In none of these models can we reject the null hypothesis of no relationship between the amount of exercise and the residuals: every coefficient has a p-value well above 0.05. This is not altogether surprising given how sparse the data is, since we have only one year of data and not every country is covered.
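One detail worth keeping in mind when reading these summaries: `sm.OLS` does not add an intercept on its own, and each table lists a single coefficient with Df Residuals equal to the number of observations minus one, so the fits appear to pass through the origin. In that case statsmodels reports an uncentered R-squared. The sketch below uses synthetic stand-ins (`x` and `y` are made up) and plain NumPy to show the two definitions side by side; with statsmodels, an intercept would be added via `sm.add_constant`.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 120

# Synthetic stand-ins (assumptions for illustration only)
x = rng.uniform(10, 60, size=n)             # a percent-style feature
y = rng.normal(loc=0.0, scale=5.0, size=n)  # residual-style response

# Fit through the origin: a single slope, no intercept
slope_no_const = (x @ y) / (x @ x)
ssr_no_const = np.sum((y - slope_no_const * x) ** 2)
r2_uncentered = 1 - ssr_no_const / np.sum(y ** 2)

# Fit with an intercept column added to the design matrix
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
ssr_const = np.sum((y - X @ beta) ** 2)
r2_centered = 1 - ssr_const / np.sum((y - y.mean()) ** 2)
```

Residuals from a model fitted with an intercept average to zero, so the two fits will often agree closely, but the distinction matters when comparing these R-squared values with ones from models that include a constant.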

## Doctors per 1000 people

Now we will check if the number of doctors available per 1000 people has a significant relationship with any of the unexplained variance in the lasso models.

In [42]:
risk_doctors_model = sm.OLS(risk_lasso_resid[doctors_countries], doctors_1970_2000_cleaned)
risk_doctors_results = risk_doctors_model.fit()
risk_doctors_results.summary()

Dep. Variable:,y,R-squared:,0.0
Model:,OLS,Adj. R-squared:,-0.007
Method:,Least Squares,F-statistic:,0.02247
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.881
Time:,21:40:57,Log-Likelihood:,-388.65
No. Observations:,134,AIC:,779.3
Df Residuals:,133,BIC:,782.2
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,0.0320,0.213,0.150,0.881,-0.390 0.454

Omnibus:,17.128,Durbin-Watson:,2.054
Prob(Omnibus):,0.0,Jarque-Bera (JB):,22.358
Skew:,0.726,Prob(JB):,1.4e-05
Kurtosis:,4.378,Cond. No.,1.0


Based on these results (p = 0.881), doctors per 1000 people does not appear to have a significant relationship with the unexplained variance in the percent risk of death model. Now we will check the models that consider mortality rate.

In [43]:
deaths_all_doctors_model = sm.OLS(deaths_all_lasso_resid[doctors_countries], doctors_1970_2000_cleaned)
deaths_all_doctors_results = deaths_all_doctors_model.fit()
deaths_all_doctors_results.summary()

Dep. Variable:,y,R-squared:,0.0
Model:,OLS,Adj. R-squared:,-0.007
Method:,Least Squares,F-statistic:,0.02327
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.879
Time:,21:44:03,Log-Likelihood:,-807.77
No. Observations:,134,AIC:,1618.0
Df Residuals:,133,BIC:,1620.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,-0.7425,4.867,-0.153,0.879,-10.370 8.885

Omnibus:,13.649,Durbin-Watson:,1.873
Prob(Omnibus):,0.001,Jarque-Bera (JB):,17.218
Skew:,0.605,Prob(JB):,0.000182
Kurtosis:,4.272,Cond. No.,1.0


In [44]:
deaths_cancer_doctors_model = sm.OLS(deaths_cancer_lasso_resid[doctors_countries], doctors_1970_2000_cleaned)
deaths_cancer_doctors_results = deaths_cancer_doctors_model.fit()
deaths_cancer_doctors_results.summary()

Dep. Variable:,y,R-squared:,0.008
Model:,OLS,Adj. R-squared:,0.001
Method:,Least Squares,F-statistic:,1.103
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.295
Time:,21:44:36,Log-Likelihood:,-631.19
No. Observations:,134,AIC:,1264.0
Df Residuals:,133,BIC:,1267.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,1.3689,1.303,1.050,0.295,-1.209 3.946

Omnibus:,66.411,Durbin-Watson:,1.815
Prob(Omnibus):,0.0,Jarque-Bera (JB):,354.158
Skew:,1.659,Prob(JB):,1.25e-77
Kurtosis:,10.24,Cond. No.,1.0


In [45]:
deaths_cardio_doctors_model = sm.OLS(deaths_cardio_lasso_resid[doctors_countries], doctors_1970_2000_cleaned)
deaths_cardio_doctors_results = deaths_cardio_doctors_model.fit()
deaths_cardio_doctors_results.summary()

Dep. Variable:,y,R-squared:,0.0
Model:,OLS,Adj. R-squared:,-0.007
Method:,Least Squares,F-statistic:,0.02745
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.869
Time:,21:45:04,Log-Likelihood:,-734.9
No. Observations:,134,AIC:,1472.0
Df Residuals:,133,BIC:,1475.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,-0.4682,2.825,-0.166,0.869,-6.057 5.121

Omnibus:,2.449,Durbin-Watson:,2.114
Prob(Omnibus):,0.294,Jarque-Bera (JB):,2.317
Skew:,0.321,Prob(JB):,0.314
Kurtosis:,2.944,Cond. No.,1.0


In [46]:
deaths_diabetes_doctors_model = sm.OLS(deaths_diabetes_lasso_resid[doctors_countries], doctors_1970_2000_cleaned)
deaths_diabetes_doctors_results = deaths_diabetes_doctors_model.fit()
deaths_diabetes_doctors_results.summary()

Dep. Variable:,y,R-squared:,0.004
Model:,OLS,Adj. R-squared:,-0.003
Method:,Least Squares,F-statistic:,0.5663
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.453
Time:,21:45:25,Log-Likelihood:,-562.94
No. Observations:,134,AIC:,1128.0
Df Residuals:,133,BIC:,1131.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,-0.5893,0.783,-0.753,0.453,-2.138 0.960

Omnibus:,58.167,Durbin-Watson:,1.877
Prob(Omnibus):,0.0,Jarque-Bera (JB):,256.46
Skew:,1.486,Prob(JB):,2.04e-56
Kurtosis:,9.091,Cond. No.,1.0


In [47]:
deaths_resp_doctors_model = sm.OLS(deaths_resp_lasso_resid[doctors_countries], doctors_1970_2000_cleaned)
deaths_resp_doctors_results = deaths_resp_doctors_model.fit()
deaths_resp_doctors_results.summary()

Dep. Variable:,y,R-squared:,0.01
Model:,OLS,Adj. R-squared:,0.002
Method:,Least Squares,F-statistic:,1.305
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.255
Time:,21:46:08,Log-Likelihood:,-597.45
No. Observations:,134,AIC:,1197.0
Df Residuals:,133,BIC:,1200.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,-1.1574,1.013,-1.142,0.255,-3.161 0.846

Omnibus:,60.299,Durbin-Watson:,1.972
Prob(Omnibus):,0.0,Jarque-Bera (JB):,230.152
Skew:,1.622,Prob(JB):,1.05e-50
Kurtosis:,8.541,Cond. No.,1.0


Because none of the coefficient p-values fell below the 0.05 significance threshold, we fail to reject the null hypothesis of no relationship between the number of doctors per 1000 people and the unexplained variance in the mortality rates from the different non-communicable diseases. Once again, given the relative sparsity of the data, this result is not altogether surprising.
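As a check on how to read these tables, the t statistic is the coefficient divided by its standard error, and the 95% interval is the coefficient plus or minus a critical t value times the standard error. The worked example below uses the respiratory model's reported values (coefficient -1.1574, standard error 1.013, 133 residual degrees of freedom) and assumes a critical value of roughly 1.978 for that many degrees of freedom. Small differences from the table come from rounding in the reported standard error.

```latex
t = \frac{\hat{\beta}}{\mathrm{SE}(\hat{\beta})}
  = \frac{-1.1574}{1.013} \approx -1.14

\hat{\beta} \pm t_{0.975,\,133}\,\mathrm{SE}(\hat{\beta})
  \approx -1.1574 \pm 1.978 \times 1.013
  \approx -1.1574 \pm 2.004
  \;\Rightarrow\; (-3.16,\; 0.85)
```

The interval contains zero, which agrees with the reported p-value of 0.255 being above the 0.05 threshold.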

## Hospital Beds per 1000 People

Lastly we will consider if the number of hospital beds available per 1000 people has a significant relationship with the unexplained variance in the lasso models.

In [48]:
risk_beds_model = sm.OLS(risk_lasso_resid[beds_countries], beds_1970_2000_cleaned)
risk_beds_results = risk_beds_model.fit()
risk_beds_results.summary()

Dep. Variable:,y,R-squared:,0.0
Model:,OLS,Adj. R-squared:,-0.008
Method:,Least Squares,F-statistic:,0.006134
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.938
Time:,21:52:16,Log-Likelihood:,-381.36
No. Observations:,132,AIC:,764.7
Df Residuals:,131,BIC:,767.6
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,-0.0050,0.064,-0.078,0.938,-0.131 0.121

Omnibus:,17.589,Durbin-Watson:,2.015
Prob(Omnibus):,0.0,Jarque-Bera (JB):,23.894
Skew:,0.728,Prob(JB):,6.48e-06
Kurtosis:,4.492,Cond. No.,1.0


Based on the summary we fail to reject the null hypothesis of no significant relationship between the number of hospital beds available per 1000 people and the unexplained variance of the percent risk of death model. Next we will look at the residuals of the models considering mortality rate from different non-communicable diseases.

In [49]:
deaths_all_beds_model = sm.OLS(deaths_all_lasso_resid[beds_countries], beds_1970_2000_cleaned)
deaths_all_beds_results = deaths_all_beds_model.fit()
deaths_all_beds_results.summary()

Dep. Variable:,y,R-squared:,0.001
Model:,OLS,Adj. R-squared:,-0.007
Method:,Least Squares,F-statistic:,0.1194
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.73
Time:,21:56:44,Log-Likelihood:,-793.39
No. Observations:,132,AIC:,1589.0
Df Residuals:,131,BIC:,1592.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,-0.5000,1.447,-0.345,0.730,-3.363 2.363

Omnibus:,13.753,Durbin-Watson:,1.795
Prob(Omnibus):,0.001,Jarque-Bera (JB):,18.521
Skew:,0.582,Prob(JB):,9.51e-05
Kurtosis:,4.419,Cond. No.,1.0


In [50]:
deaths_cancer_beds_model = sm.OLS(deaths_cancer_lasso_resid[beds_countries], beds_1970_2000_cleaned)
deaths_cancer_beds_results = deaths_cancer_beds_model.fit()
deaths_cancer_beds_results.summary()

Dep. Variable:,y,R-squared:,0.003
Model:,OLS,Adj. R-squared:,-0.005
Method:,Least Squares,F-statistic:,0.3691
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.545
Time:,21:57:07,Log-Likelihood:,-622.82
No. Observations:,132,AIC:,1248.0
Df Residuals:,131,BIC:,1251.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,0.2415,0.397,0.608,0.545,-0.545 1.028

Omnibus:,63.44,Durbin-Watson:,1.852
Prob(Omnibus):,0.0,Jarque-Bera (JB):,319.51
Skew:,1.613,Prob(JB):,4.16e-70
Kurtosis:,9.906,Cond. No.,1.0


In [51]:
deaths_cardio_beds_model = sm.OLS(deaths_cardio_lasso_resid[beds_countries], beds_1970_2000_cleaned)
deaths_cardio_beds_results = deaths_cardio_beds_model.fit()
deaths_cardio_beds_results.summary()

Dep. Variable:,y,R-squared:,0.0
Model:,OLS,Adj. R-squared:,-0.007
Method:,Least Squares,F-statistic:,0.03532
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.851
Time:,21:57:40,Log-Likelihood:,-721.53
No. Observations:,132,AIC:,1445.0
Df Residuals:,131,BIC:,1448.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,-0.1578,0.840,-0.188,0.851,-1.819 1.503

Omnibus:,1.704,Durbin-Watson:,2.04
Prob(Omnibus):,0.427,Jarque-Bera (JB):,1.6
Skew:,0.268,Prob(JB):,0.449
Kurtosis:,2.935,Cond. No.,1.0


In [52]:
deaths_diabetes_beds_model = sm.OLS(deaths_diabetes_lasso_resid[beds_countries], beds_1970_2000_cleaned)
deaths_diabetes_beds_results = deaths_diabetes_beds_model.fit()
deaths_diabetes_beds_results.summary()

Dep. Variable:,y,R-squared:,0.005
Model:,OLS,Adj. R-squared:,-0.003
Method:,Least Squares,F-statistic:,0.6253
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.431
Time:,21:57:57,Log-Likelihood:,-552.57
No. Observations:,132,AIC:,1107.0
Df Residuals:,131,BIC:,1110.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,-0.1846,0.233,-0.791,0.431,-0.646 0.277

Omnibus:,60.249,Durbin-Watson:,1.882
Prob(Omnibus):,0.0,Jarque-Bera (JB):,300.846
Skew:,1.516,Prob(JB):,4.70e-66
Kurtosis:,9.746,Cond. No.,1.0


In [53]:
deaths_resp_beds_model = sm.OLS(deaths_resp_lasso_resid[beds_countries], beds_1970_2000_cleaned)
deaths_resp_beds_results = deaths_resp_beds_model.fit()
deaths_resp_beds_results.summary()

Dep. Variable:,y,R-squared:,0.018
Model:,OLS,Adj. R-squared:,0.01
Method:,Least Squares,F-statistic:,2.373
Date:,"Wed, 14 Dec 2016",Prob (F-statistic):,0.126
Time:,21:58:13,Log-Likelihood:,-585.61
No. Observations:,132,AIC:,1173.0
Df Residuals:,131,BIC:,1176.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[95.0% Conf. Int.]
0.0,-0.4619,0.300,-1.541,0.126,-1.055 0.131

Omnibus:,62.51,Durbin-Watson:,1.903
Prob(Omnibus):,0.0,Jarque-Bera (JB):,264.709
Skew:,1.664,Prob(JB):,3.3e-58
Kurtosis:,9.087,Cond. No.,1.0


Based on these models, we fail to reject the null hypotheses of no significant relationship between the number of available hospital beds per 1000 people and the unexplained variance in any of these models.

# Conclusion

None of the third-party datasets was significantly related to the unexplained variance in the lasso models, which is not that surprising considering the sparsity of all three datasets. Moreover, the differences in units and original time frames of the individual datasets would have made any significant relationship harder to detect. However, this does not mean that exercise and the number of available doctors and hospital beds have no relationship whatsoever with these measures of percent risk of death and mortality rates from non-communicable diseases. More informative datasets could very well have led to different conclusions from the ones we reached here.