Skip to content

NSS-Data-Science-Projects/un-linear-regression

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 

Repository files navigation

Linear Regression Practice

For this exercise, you've been provided with a csv file, gdp_le.csv, which contains the gdp per capita and life expectancy values that you were working with on the UN data exercise.

  1. Start by fitting a linear regression model with target being life expectancy and predictor variable year.
    a. What coefficients do you get? Interpret the meaning of these coefficents.
    b. Using just the year makes the intercept term difficult to interpret. Create a new model, but this time use years since 1990 as your predictor variable. Hint: You can the patsy identity function to modify your predictors in your model formula. Inspect the new coefficients and interpret the meaning of them. Are they statistically significant?
    c. Compare the actual mean life expectancy per year to the model's estimate. How well does it do?
    d. Plot the actual values against your model's estimates for the mean life expectancy.
    e. Inspect the R-squared value for the model. does it make sense, given the plot?

  2. Filter the full dataset down to just the year 2021. Fit a linear regression model with target being life expectancy and predictor variable gdp per capita.
    a. What coefficients do you get? Interpret the meaning of those coefficients.
    b. Refit your model, but this time use thousands of dollars of gdp per capita as your predictor. How does this change your coefficients?
    c. Are the coefficients statistically significant?
    d. What does your model estimate for the mean life expectancy for a country whose gdp per capita is $50,000? What about one whose gdp per capita is $100,000? e. Plot the actual values compared to your model's estimates for mean life expectancy. How would you assess the model's fit?

  3. Now, fit a model for life expectancy based on the log of gdp per capita.
    a. Inspect the coefficients for this model. Are they statistically significant? b. Interpret these coefficients. What does the model estimate for the average life expectancy for countries with a gdp per capita of $50,000? What about for those with a gdp per capita of $100,000? c. Plot the actual values compared to your models' estimates for the mean life expectancy. How does this compare to the non-logged model?

  4. Finally, return to the full dataset. a. First, fit a linear regression model for life expectancy based on the log of gdp per capita. b. Then, add the year variable to your model. How can you interpret the coefficient associated with year? What limitations or weaknesses might this model have?

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published