### HSML 6295 s2 Software Practice -- Linear Regression in R

#### I. Load the "Wealth and Health" Data Set


In [None]:
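library(stargazer)  # regression tables used in Section III

# Read the data and use the first column as row names
d = read.csv(file = "HSML 6295 ds Wealth and Health.csv", header = TRUE, sep = ",")
rownames(d) = d[, 1]
d = d[, -1]  # drop the first column now that it is stored as row names

# Shorten the column names generated by read.csv
colnames(d)[colnames(d) == 'Life.Expectancy.at.Birth..Years.'] = 'Life.Expectancy'
colnames(d)[colnames(d) == 'Health.Spending.per.Capita...000.US..'] = 'Health.Spending'
colnames(d)[colnames(d) == 'GDP.per.Capita...000.US..'] = 'GDP'
colnames(d)[colnames(d) == 'Countries.A.I'] = 'AI'

attach(d)  # lets later cells refer to columns such as GDP by name
names(d)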


#### II. Plot Life Expectancy Against GDP per Capita



In [None]:
plot(GDP, Life.Expectancy,
     ylab = "Life Expectancy at Birth (Years)", xlab = "GDP per Capita ('000 US$)")
with(d, text(Life.Expectancy ~ GDP, labels = row.names(d), pos = 1, cex = 0.6))


#### III. Linear Regression

Estimate `Life.Expectancy` as a **linear** function of `GDP` per capita and show the regression results.


In [None]:
m = lm(Life.Expectancy ~ GDP, data = d)

stargazer(m,
          type="text", 
          dep.var.labels=c("Life Expectancy at Birth (Years)"), 
          covariate.labels=c("Constant", "GDP per Capita ('000 US$)"),
          report = "vcsp",
          intercept.bottom = FALSE,
          df = FALSE)
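
The `report = "vcsp"` setting asks `stargazer` to print variable names, coefficients, standard errors, and p-values. The same quantities, along with the R-squared statistics, can also be read directly from the fitted model. The sketch below uses simulated stand-in data rather than the course data set; the data frame `sim`, its coefficients, and the object names `fit` and `s` are assumptions made for illustration only.

```r
# Simulated stand-in data (hypothetical; not the course data set)
set.seed(1)
sim = data.frame(GDP = runif(40, min = 1, max = 60))
sim$Life.Expectancy = 65 + 0.25 * sim$GDP + rnorm(40, sd = 2)

fit = lm(Life.Expectancy ~ GDP, data = sim)
s = summary(fit)

coef(fit)        # intercept and slope estimates
s$coefficients   # estimates, standard errors, t values, p-values
s$r.squared      # R-squared
s$adj.r.squared  # adjusted R-squared
```

Replacing `sim` with `d` would give the corresponding values for the model `m` estimated above.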


Estimate life expectancy as a **quadratic** function (a polynomial of degree 2) of GDP per capita.



In [None]:
m = lm(Life.Expectancy ~ poly(GDP,2), data = d)

stargazer(m,
          type="text", 
          dep.var.labels=c("Life Expectancy at Birth (Years)"), 
          covariate.labels=c("Constant", "GDP per Capita", "GDP per Capita Squared"),
          report = "vcsp",
          intercept.bottom = FALSE,
          df = FALSE)
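
A note on reading these coefficients: by default `poly()` builds orthogonal polynomial terms, so the reported coefficients are not the coefficients on GDP and GDP squared themselves. With `raw = TRUE`, `poly()` uses the plain powers instead. Both versions describe the same fitted curve and give the same R-squared. The sketch below illustrates this on simulated stand-in data; `sim`, `fit_orth`, and `fit_raw` are hypothetical names, and the simulated coefficients are arbitrary.

```r
# Simulated stand-in data (hypothetical; not the course data set)
set.seed(2)
sim = data.frame(GDP = runif(40, min = 1, max = 60))
sim$Life.Expectancy = 60 + 0.8 * sim$GDP - 0.008 * sim$GDP^2 + rnorm(40, sd = 2)

fit_orth = lm(Life.Expectancy ~ poly(GDP, 2), data = sim)              # orthogonal terms (default)
fit_raw  = lm(Life.Expectancy ~ poly(GDP, 2, raw = TRUE), data = sim)  # GDP and GDP^2 themselves

# Same fitted values and the same R-squared ...
all.equal(unname(fitted(fit_orth)), unname(fitted(fit_raw)))
all.equal(summary(fit_orth)$r.squared, summary(fit_raw)$r.squared)

# ... but the coefficients are expressed on different scales
coef(fit_orth)
coef(fit_raw)
```

Because the fit is identical, the choice between the two forms does not affect how the R-squared statistics compare across models in this exercise.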


Estimate life expectancy as a **cubic** function (a polynomial of degree 3) of GDP per capita.



In [None]:
m = lm(Life.Expectancy ~ poly(GDP,3), data = d)

stargazer(m,
          type="text", 
          dep.var.labels=c("Life Expectancy at Birth (Years)"), 
          covariate.labels=c("Constant", "GDP per Capita", "GDP per Capita Squared", "GDP per Capita Cubed"),
          report = "vcsp",
          intercept.bottom = FALSE,
          df = FALSE)


Now add a linear term of the predictor `Health.Spending` to the equation.



In [None]:
m = lm(Life.Expectancy ~ poly(GDP,3) + Health.Spending, data = d)

stargazer(m,
          type="text", 
          dep.var.labels=c("Life Expectancy at Birth (Years)"), 
          covariate.labels=c("Constant", "GDP per Capita", "GDP per Capita Squared", "GDP per Capita Cubed", "Health Spending"),
          report = "vcsp",
          intercept.bottom = FALSE,
          df = FALSE)


Now add a quadratic term of the predictor `Health.Spending` to the equation, so that it enters as a polynomial of degree 2.



In [None]:
m = lm(Life.Expectancy ~ poly(GDP,3) + poly(Health.Spending,2), data = d)

stargazer(m,
          type="text", 
          dep.var.labels=c("Life Expectancy at Birth (Years)"), 
          covariate.labels=c("Constant", "GDP per Capita", "GDP per Capita Squared", "GDP per Capita Cubed", "Health Spending", "Health Spending Squared"),
          report = "vcsp",
          intercept.bottom = FALSE,
          df = FALSE)


Now add a cubic term of the predictor `Health.Spending` to the equation, so that it enters as a polynomial of degree 3.



In [None]:
m = lm(Life.Expectancy ~ poly(GDP,3) + poly(Health.Spending,3), data = d)

stargazer(m,
          type="text", 
          dep.var.labels=c("Life Expectancy at Birth (Years)"), 
          covariate.labels=c("Constant", "GDP per Capita", "GDP per Capita Squared", "GDP per Capita Cubed", "Health Spending", "Health Spending Squared", "Health Spending Cubed"),
          report = "vcsp",
          intercept.bottom = FALSE,
          df = FALSE)


Add the predictor `ABC.Rank` to the equation.



In [None]:
m = lm(Life.Expectancy ~ poly(GDP,3) + poly(Health.Spending,3) + ABC.Rank, data = d)

stargazer(m,
          type="text", 
          dep.var.labels=c("Life Expectancy at Birth (Years)"), 
          covariate.labels=c("Constant", "GDP per Capita", "GDP per Capita Squared", "GDP per Capita Cubed", "Health Spending", "Health Spending Squared", "Health Spending Cubed", "ABC.Rank"),
          report = "vcsp",
          intercept.bottom = FALSE,
          df = FALSE)


Finally, add a quadratic term of the predictor `ABC.Rank` to the equation.



In [None]:
m = lm(Life.Expectancy ~ poly(GDP,3) + poly(Health.Spending,3) + poly(ABC.Rank,2), data = d)

stargazer(m,
          type="text", 
          dep.var.labels=c("Life Expectancy at Birth (Years)"), 
          covariate.labels=c("Constant", "GDP per Capita", "GDP per Capita Squared", "GDP per Capita Cubed", "Health Spending", "Health Spending Squared", "Health Spending Cubed", "ABC.Rank", "ABC.Rank Squared"),
          report = "vcsp",
          intercept.bottom = FALSE,
          df = FALSE)



How do the values of the R-squared and the adjusted R-squared statistics change as we add more terms to the equation?
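
One way to explore this question is to collect both statistics from a sequence of nested models and compare them side by side. For least-squares fits with an intercept, R-squared cannot decrease when terms are added, whereas adjusted R-squared, 1 - (1 - R²)(n - 1)/(n - p - 1) with n observations and p slope terms, penalizes each extra term and can move in either direction. The sketch below uses simulated stand-in data; `sim`, `degrees`, `fit_stats`, and `adj_by_hand` are hypothetical names, and the results say nothing about the course data set.

```r
# Simulated stand-in data (hypothetical; not the course data set)
set.seed(3)
n = 40
sim = data.frame(GDP = runif(n, min = 1, max = 60))
sim$Life.Expectancy = 60 + 0.8 * sim$GDP - 0.008 * sim$GDP^2 + rnorm(n, sd = 2)

# Fit polynomials of degree 1 to 5 and collect both statistics
degrees = 1:5
fit_stats = t(sapply(degrees, function(k) {
  s = summary(lm(Life.Expectancy ~ poly(GDP, k), data = sim))
  c(degree = k, r.squared = s$r.squared, adj.r.squared = s$adj.r.squared)
}))
fit_stats

# Recompute adjusted R-squared from its definition (p = number of slope terms)
p = fit_stats[, "degree"]
adj_by_hand = 1 - (1 - fit_stats[, "r.squared"]) * (n - 1) / (n - p - 1)
all.equal(unname(adj_by_hand), unname(fit_stats[, "adj.r.squared"]))
```

The same loop could be adapted to the models estimated above by replacing the formula and setting `data = d`.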
