This repository contains the simulated data for Assignment #7. Fork this repository and add your analysis as described in the Canvas assignment.
The cohort CSV file in the raw-data folder contains 5,000 observations on the variables smoke, female, age, cardiac, and cost.
The code generates a table of means and standard deviations of all variables by smoking status, along with the standardized mean differences in those covariates between smokers and non-smokers. I then fit a linear model predicting cost from age and smoking status and present that analysis in a figure.
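For readers unfamiliar with the standardized mean difference, here is a minimal Python sketch of one common definition: the difference in group means divided by the square root of the average of the two group variances. The function name and the toy ages are made up for illustration; the repository's own code may use a different language or a different formula.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Difference in means divided by a pooled standard deviation.

    Assumption: the pooled SD is sqrt((var_a + var_b) / 2), using sample
    variances. Other definitions exist; this is only one common choice.
    """
    pooled_sd = sqrt((variance(group_a) + variance(group_b)) / 2)
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Toy values (not taken from the cohort file): ages of smokers and non-smokers.
smoker_ages = [50, 60, 70]
nonsmoker_ages = [40, 50, 60]
print(standardized_mean_difference(smoker_ages, nonsmoker_ages))  # 1.0
```

Because the difference is scaled by the spread of the covariate, values can be compared across variables measured in different units, such as age and cost.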
I find that cost increases by $17.9 per additional year of age and is $638.6 higher for smokers than for non-smokers of the same age.
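To show how coefficients like these arise, the following is a small Python sketch of an ordinary least squares fit of cost on age and smoking status, solved through the normal equations using only the standard library. The helper `fit_ols` and the toy data are hypothetical and are not the repository's code or data; a real analysis would normally use a statistics package instead.

```python
def fit_ols(predictors, outcome):
    """Return [intercept, b1, b2, ...] minimizing squared error.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    Illustrative only: assumes the predictors are not collinear.
    """
    X = [[1.0] + [float(v) for v in row] for row in predictors]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, outcome))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Toy data built so that cost = 100 + 10 * age + 500 * smoke exactly.
rows = [(30, 0), (40, 1), (50, 0), (60, 1)]  # (age, smoke)
cost = [100 + 10 * age + 500 * smoke for age, smoke in rows]
intercept, per_year, smoker_gap = fit_ols(rows, cost)
print(round(intercept, 6), round(per_year, 6), round(smoker_gap, 6))
```

In this sketch, `per_year` plays the role of the age coefficient and `smoker_gap` the role of the smoking coefficient: each is the expected change in cost when that variable changes by one unit and the other is held fixed.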