This project applies linear regression to several datasets to understand and evaluate how multiple variables affect a target outcome. It covers the following components:
- Model Development: Started with a basic linear regression model built on a few key variables, then progressively added variables to improve model performance.
- Statistical Evaluation: Performed in-depth statistical evaluations of each model iteration to assess accuracy, goodness-of-fit, and overall predictive capabilities.
- Outlier Analysis: Identified and examined outliers that could skew model results, and assessed their impact on overall model performance.
- High Leverage Points: Investigated high leverage points to understand their influence on regression coefficients and model predictions.
- Interaction Terms: Incorporated interaction terms to capture the combined effect of variables on the response, adding model complexity only where necessary.
- Linearity Assumption: Tested the linearity assumption of the model by evaluating residual plots and conducting appropriate statistical tests.
Together, these steps give a fuller picture of how linear regression behaves in practice, with insights into variable relationships, model robustness, and the effects of the factors examined above on predictions.
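
As a rough illustration of the workflow described above, the following is a minimal sketch, not the project's actual code. It assumes NumPy and uses synthetic data in place of the project's datasets. The helper `fit_ols`, the variable names, and the cutoffs (|studentized residual| > 3 for outliers, leverage > 2p/n for high leverage points) are illustrative assumptions based on common rules of thumb. It fits three nested models, compares their fit, and flags unusual observations in the largest one.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit of y on X (X must already contain an intercept column).

    Hypothetical helper for illustration; not part of the project.
    """
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Goodness of fit: R^2 and adjusted R^2.
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - rss / tss
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p)

    # Leverage is the diagonal of the hat matrix H = X (X'X)^-1 X'.
    # With the reduced QR factorisation X = QR we have H = QQ', so h_ii
    # is the squared norm of row i of Q.
    Q, _ = np.linalg.qr(X)
    leverage = (Q ** 2).sum(axis=1)

    # Internally studentized residuals: e_i / sqrt(sigma^2 * (1 - h_ii)).
    sigma2 = rss / (n - p)
    studentized = resid / np.sqrt(sigma2 * (1.0 - leverage))

    return {
        "beta": beta,
        "resid": resid,
        "r2": r2,
        "adj_r2": adj_r2,
        "leverage": leverage,
        "studentized": studentized,
    }


# Synthetic data (assumption): two predictors with a genuine interaction effect.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + 1.5 * x1 * x2 + rng.normal(scale=0.5, size=n)

# Model development: start small, then add a variable, then an interaction term.
ones = np.ones(n)
designs = {
    "small": np.column_stack([ones, x1]),
    "main": np.column_stack([ones, x1, x2]),
    "interaction": np.column_stack([ones, x1, x2, x1 * x2]),
}
models = {name: fit_ols(X, y) for name, X in designs.items()}

for name, m in models.items():
    print(f"{name:12s} R^2={m['r2']:.3f}  adj R^2={m['adj_r2']:.3f}")

# Outliers and high leverage points in the largest model (rule-of-thumb cutoffs).
full = models["interaction"]
p = designs["interaction"].shape[1]
outliers = np.flatnonzero(np.abs(full["studentized"]) > 3)
high_leverage = np.flatnonzero(full["leverage"] > 2 * p / n)
print("possible outliers:", outliers)
print("high leverage points:", high_leverage)
```

Because the three models are nested, R² cannot decrease as terms are added, which is why adjusted R² is reported alongside it. For the linearity check, plotting `full["resid"]` against the fitted values with a plotting library of your choice is the usual next step. A package such as statsmodels provides these diagnostics ready-made.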