In [32]:
# Q1. Explain the concept of R-squared in linear regression models. How is it calculated, and what does it
# represent?

In [33]:
# R-squared in Linear Regression
#     Concept: R-squared (R²) measures how much of the variation in the dependent variable can be explained by the independent
#     variables in the model.

# Calculation:
#     R-squared is calculated as R² = 1 - SS_res / SS_tot, where:
#         SS_res is the sum of squared residuals (the differences between the observed and predicted values), and
#         SS_tot is the total sum of squares (the differences between the observed values and their mean).
        
# Representation:
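#     Range: For ordinary least squares with an intercept, R-squared on the training data lies between 0 and 1.
#         0: Indicates that the model explains none of the variance in the dependent variable.
#         1: Indicates that the model explains all of the variance in the dependent variable.
#         Closer to 1: The closer R-squared is to 1, the more of the variance the model explains.
#         Below 0: Possible on held-out data or for models without an intercept; the model predicts worse than the mean.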
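
# A minimal sketch of the calculation described above, in plain Python. The function name r_squared and the
# sample values are illustrative assumptions, not part of the original exercise.

```python
def r_squared(y_true, y_pred):
    """Return 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_y = sum(y_true) / len(y_true)
    # Sum of squared residuals: observed minus predicted
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    # Total sum of squares: observed minus the mean of the observed values
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot


y_obs = [3.0, 5.0, 7.0, 9.0]          # made-up observations
y_hat = [2.5, 5.5, 7.0, 9.0]          # predictions close to the observations
y_hat_bad = [9.0, 7.0, 5.0, 3.0]      # predictions worse than always guessing the mean

r2_example = r_squared(y_obs, y_hat)
r2_bad = r_squared(y_obs, y_hat_bad)
print(r2_example, r2_bad)
```

# The second call shows that the same formula can return a negative value when predictions are worse than the mean.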

In [34]:
# Q2. Define adjusted R-squared and explain how it differs from the regular R-squared.

In [35]:
# Definition: Adjusted R-squared is a modified version of R-squared that adjusts for the number of predictors in the
# model: Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1), where n is the number of observations and p is the
# number of predictors. Because it accounts for model complexity, it rises only when a new predictor improves the
# fit by more than would be expected by chance.

In [36]:
# Differences: R-squared never decreases when a predictor is added, even an irrelevant one. Adjusted R-squared
# penalizes unnecessary predictors, so it can decrease, which discourages overfitting.
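
# A small sketch of the standard adjusted R-squared formula. The function name and the numbers are assumptions
# chosen for illustration: the same R-squared of 0.90 is adjusted for 2 and then for 10 predictors.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n_samples - 1) / dof


# Same raw R-squared, different model sizes (hypothetical values)
adj_small = adjusted_r_squared(0.90, n_samples=21, n_predictors=2)
adj_large = adjusted_r_squared(0.90, n_samples=21, n_predictors=10)
print(adj_small, adj_large)
```

# With the raw R-squared held fixed, the model with more predictors receives the lower adjusted score.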

In [37]:
# Q3. When is it more appropriate to use adjusted R-squared?

In [38]:
# Use Adjusted R-squared: 
#     When the model has multiple predictors, or when comparing models with different numbers of predictors that
#     were fitted to the same data.
    
# Why: It adjusts for the number of predictors, so adding a weak predictor does not automatically raise the score.

In [39]:
# Q4. What are RMSE, MSE, and MAE in the context of regression analysis? How are these metrics
# calculated, and what do they represent?

In [40]:
# MAE: Mean absolute error, (1/n) * sum(|y_i - ŷ_i|). Measures the average absolute error. Less sensitive to outliers.
# MSE: Mean squared error, (1/n) * sum((y_i - ŷ_i)²). Measures the average squared error. More sensitive to large errors.
# RMSE: Root mean squared error, sqrt(MSE). Provides error in the same units as the target variable.

# Use Cases:
# MAE: Preferred when all errors are equally important.
# MSE: Used when larger errors are more significant and should be penalized more.
# RMSE: Useful for interpreting model performance in the same units as the dependent variable.
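
# A minimal sketch of the three metrics in plain Python. The helper names and the sample values are illustrative
# assumptions; sklearn.metrics offers mean_absolute_error and mean_squared_error for real use.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(yt - yp) for yt, yp in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


y_obs = [10.0, 12.0, 14.0, 16.0]   # made-up observations
y_hat = [11.0, 12.0, 13.0, 20.0]   # made-up predictions; the last one is far off

mae_value = mae(y_obs, y_hat)
mse_value = mse(y_obs, y_hat)
rmse_value = rmse(y_obs, y_hat)
print(mae_value, mse_value, rmse_value)
```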

In [41]:
# Q5. Discuss the advantages and disadvantages of using RMSE, MSE, and MAE as evaluation metrics in
# regression analysis.

In [42]:
# MAE:
#     Pros: Easy to interpret, robust to outliers.
#     Cons: Does not penalize larger errors heavily.
    
# MSE:
#     Pros: Penalizes large errors, smooth derivatives for optimization.
#     Cons: Sensitive to outliers, less interpretable due to squared units.
    
# RMSE:
#     Pros: Interpretable in the same unit as the target variable, penalizes large errors.
#     Cons: Sensitive to outliers, harder to interpret than MAE because it is not a plain average of error sizes.
    
# Use Cases
#     MAE: When you want a straightforward measure of average error and need robustness against outliers.
#     MSE: When larger errors need to be penalized more, and you're using gradient-based optimization.
#     RMSE: When you want the interpretability of the metric in the same unit as the target and need to penalize larger errors more than MAE.
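
# To make the outlier sensitivity concrete, the sketch below compares MAE and RMSE on two hypothetical sets of
# residuals that differ only in one large error. The residual values are assumptions chosen for illustration.

```python
import math


def mae_and_rmse(errors):
    """Return (MAE, RMSE) for a list of residuals."""
    n = len(errors)
    mae_value = sum(abs(e) for e in errors) / n
    rmse_value = math.sqrt(sum(e ** 2 for e in errors) / n)
    return mae_value, rmse_value


clean_errors = [1.0, -1.0, 1.0, -1.0]
outlier_errors = [1.0, -1.0, 1.0, -9.0]   # same residuals, except one large miss

mae_clean, rmse_clean = mae_and_rmse(clean_errors)
mae_outlier, rmse_outlier = mae_and_rmse(outlier_errors)
print(mae_clean, rmse_clean)
print(mae_outlier, rmse_outlier)
```

# The single large residual raises both metrics, but RMSE grows more than MAE because the error is squared first.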

In [43]:
# Q6. Explain the concept of Lasso regularization. How does it differ from Ridge regularization, and when is
# it more appropriate to use?

In [44]:
# Lasso Regularization:
#     Adds L1 penalty (absolute values).
#     Can shrink some coefficients to zero, leading to feature selection.
#     Use when you need a simpler model with fewer predictors.
    
# Ridge Regularization:
#     Adds L2 penalty (squared values).
#     Shrinks coefficients but keeps all predictors.
#     Use when dealing with multicollinearity and when all features are believed to be relevant.
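
# A sketch of the difference on synthetic data, assuming a toy setup in which only the first two of five features
# affect the target. The data, the variable names, and alpha=0.5 are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
# Only features 0 and 1 influence the target; features 2-4 are pure noise
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.1, size=200)

lasso_demo = Lasso(alpha=0.5).fit(X_demo, y_demo)
ridge_demo = Ridge(alpha=0.5).fit(X_demo, y_demo)

# Count coefficients that were driven exactly to zero
n_zero_lasso = int(np.sum(lasso_demo.coef_ == 0))
n_zero_ridge = int(np.sum(ridge_demo.coef_ == 0))

print("Lasso coefficients:", lasso_demo.coef_)
print("Ridge coefficients:", ridge_demo.coef_)
```

# In this setup Lasso sets the noise-feature coefficients exactly to zero, while Ridge only shrinks them.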

In [45]:
# Q7. How do regularized linear models help to prevent overfitting in machine learning? Provide an
# example to illustrate.

In [46]:
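# Regularized linear models add a penalty on coefficient size to the least-squares loss. The penalty keeps the model
# from fitting noise in the training data with large coefficients, trading a little bias for lower variance.
import numpy as np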
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

In [47]:
X = np.array([[1500, 3], [1600, 3], [1700, 3], [1800, 4], [1900, 4], [2000, 4], [2100, 5]])
y = np.array([300000, 320000, 340000, 360000, 380000, 400000, 420000])

In [48]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [49]:
model = LinearRegression()
model.fit(X_train, y_train)

In [50]:
y_pred_test = model.predict(X_test)
test_error = mean_squared_error(y_test, y_pred_test)

In [51]:
print(f'Test Error (No Regularization): {test_error}')

Test Error (No Regularization): 1.1293772630057337e-21


In [52]:
from sklearn.linear_model import Ridge

In [53]:
ridge_model = Ridge(alpha=1.0) 
ridge_model.fit(X_train, y_train)

In [54]:
y_pred_test_ridge = ridge_model.predict(X_test)
test_error_ridge = mean_squared_error(y_test, y_pred_test_ridge)

In [55]:
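# Note: in this toy dataset y is exactly 200 * square footage, so plain least squares fits it perfectly and the
# Ridge penalty only adds a small bias. Regularization tends to help when the data are noisy or there are many
# correlated predictors, which this example does not have.
print(f'Test Error (Ridge): {test_error_ridge}')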

Test Error (Ridge): 1.0746941791861084
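
# The house-price data above is exactly linear, so it cannot show overfitting. The sketch below is an alternative
# illustration under stated assumptions: noisy samples of a sine curve, a degree-9 polynomial fit, and a Ridge
# penalty of alpha=1e-3. All names and values here are hypothetical and not part of the original exercise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(42)
# 15 noisy training points from a sine curve, and a dense noise-free test grid
x_train_demo = np.sort(rng.uniform(0, 1, size=15)).reshape(-1, 1)
y_train_demo = np.sin(2 * np.pi * x_train_demo).ravel() + rng.normal(scale=0.2, size=15)
x_test_demo = np.linspace(0, 1, 100).reshape(-1, 1)
y_test_demo = np.sin(2 * np.pi * x_test_demo).ravel()

# Same flexible feature set, with and without an L2 penalty
ols_poly = make_pipeline(PolynomialFeatures(degree=9, include_bias=False), LinearRegression())
ridge_poly = make_pipeline(PolynomialFeatures(degree=9, include_bias=False), Ridge(alpha=1e-3))
ols_poly.fit(x_train_demo, y_train_demo)
ridge_poly.fit(x_train_demo, y_train_demo)

# Size of the fitted coefficient vectors
ols_coef_norm = np.linalg.norm(ols_poly.named_steps["linearregression"].coef_)
ridge_coef_norm = np.linalg.norm(ridge_poly.named_steps["ridge"].coef_)

print("Coefficient norm (OLS):  ", ols_coef_norm)
print("Coefficient norm (Ridge):", ridge_coef_norm)
print("Test MSE (OLS):  ", mean_squared_error(y_test_demo, ols_poly.predict(x_test_demo)))
print("Test MSE (Ridge):", mean_squared_error(y_test_demo, ridge_poly.predict(x_test_demo)))
```

# The penalty keeps the Ridge coefficients smaller than the unpenalized ones. Whether the test error also improves
# depends on the data and on alpha, so the printed errors should be checked rather than assumed.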


In [56]:
# Q8. Discuss the limitations of regularized linear models and explain why they may not always be the best
# choice for regression analysis.

In [57]:
# Model Complexity: 
#     Regularized linear models may struggle with complex, non-linear relationships and interactions.
    
# Feature Selection:
#     Lasso may exclude important features, especially in the presence of correlated features.
    
# Handling Non-Linear Relationships: 
#     They assume linear relationships, which may not be suitable for all datasets.
    
# Sensitivity to Hyperparameters:
#     Performance relies on selecting the correct regularization strength.
        
# Interpretability:
#     Regularization can complicate the interpretation of model coefficients.
    
# Performance with Small Datasets:
#     Regularization might introduce excessive bias in small datasets.
    
# When to Consider Alternatives:
#     For capturing non-linear patterns, consider using non-linear models like decision trees, random forests, or
#     neural networks.
#     For feature selection in high-dimensional datasets with correlated features, techniques like Elastic Net or
#     advanced feature selection methods might be more effective.

In [58]:
# Q9. You are comparing the performance of two regression models using different evaluation metrics.
# Model A has an RMSE of 10, while Model B has an MAE of 8. Which model would you choose as the better
# performer, and why? Are there any limitations to your choice of metric?

In [59]:
# Model Comparison:
#     Model A: RMSE of 10 (sensitive to large errors and outliers).
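#     Model B: MAE of 8 (less sensitive to outliers and gives a clear average error).
#     The two numbers are not directly comparable: for any set of predictions RMSE is at least as large as MAE,
#     so a model with an MAE of 8 can have an RMSE of 10 or more.
    
# Choosing the Better Model:
#     Neither model can be declared better from these figures alone. Compute the same metric (ideally both) for
#     both models on the same test set, then compare.
#     Prefer the model with the lower MAE if the typical size of the error matters most.
#     Prefer the model with the lower RMSE if large errors need to be penalized more heavily.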
    
# Limitations:
#     RMSE: Can be skewed by outliers and might not reflect typical performance well.
#     MAE: Provides a straightforward average error but doesn’t account for the impact of larger errors.
    
# Metric Selection:
#     Choose based on what matters more: average error (MAE) or penalizing large errors (RMSE).
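
# A short sketch of why an RMSE of 10 and an MAE of 8 cannot be ranked against each other: the two hypothetical
# residuals below (an assumption chosen for illustration) give exactly those values for a single model.

```python
import math

errors_demo = [2.0, -14.0]   # hypothetical residuals of one model

mae_demo = sum(abs(e) for e in errors_demo) / len(errors_demo)
rmse_demo = math.sqrt(sum(e ** 2 for e in errors_demo) / len(errors_demo))
print(mae_demo, rmse_demo)
```

# The same predictions yield MAE = 8 and RMSE = 10, so the figures quoted for Model A and Model B could describe
# equally good models.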

In [60]:
# Q10. You are comparing the performance of two regularized linear models using different types of
# regularization. Model A uses Ridge regularization with a regularization parameter of 0.1, while Model B
# uses Lasso regularization with a regularization parameter of 0.5. Which model would you choose as the
# better performer, and why? Are there any trade-offs or limitations to your choice of regularization
# method?

In [61]:
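# The regularization parameters (0.1 for Ridge, 0.5 for Lasso) apply to different penalties and are not comparable,
# so they do not show which model performs better. Compare both models on the same validation data or with
# cross-validation before choosing.

# Choose Lasso Regularization (Model B) if you need feature selection and want a simpler model with fewer features.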

# Choose Ridge Regularization (Model A) if you want to handle multicollinearity and retain all features.

# Trade-Offs: Ridge regularization handles multicollinearity but doesn’t simplify the model, while Lasso regularization
# simplifies the model but might exclude relevant features.
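
# One way to make the comparison concrete is to score both models with the same cross-validation splits. The
# sketch below assumes synthetic data and 5-fold cross-validation; the dataset and variable names are
# hypothetical, and the ranking it prints applies only to this toy data.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X_cv = rng.normal(size=(120, 6))
# Two informative features plus noise (an assumption of this toy setup)
y_cv = 4.0 * X_cv[:, 0] + 1.5 * X_cv[:, 1] + rng.normal(scale=0.5, size=120)

candidates = {
    "Model A (Ridge, alpha=0.1)": Ridge(alpha=0.1),
    "Model B (Lasso, alpha=0.5)": Lasso(alpha=0.5),
}

cv_rmse = {}
for name, estimator in candidates.items():
    # scikit-learn reports negated MSE so that higher is better; flip the sign back
    neg_mse = cross_val_score(estimator, X_cv, y_cv, cv=5, scoring="neg_mean_squared_error")
    cv_rmse[name] = float(np.sqrt(-neg_mse.mean()))

for name, score in cv_rmse.items():
    print(f"{name}: cross-validated RMSE = {score:.3f}")
```

# On real data, the model with the lower cross-validated error would be the better performer, ideally after
# tuning alpha for each model separately.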