# Q1. What is Ridge Regression, and How Does It Differ from Ordinary Least Squares (OLS) Regression?
# Ridge Regression is a form of regularized linear regression that adds a penalty to the loss function
# proportional to the square of the magnitude of the coefficients (L2 regularization). The objective is to minimize:
# Loss = RSS (Residual Sum of Squares) + λ * (sum of squared coefficients)
# where λ >= 0 is the regularization parameter (called `alpha` in scikit-learn); the intercept is usually left unpenalized.
# OLS regression, by contrast, minimizes only the RSS with no penalty term, so it is the special case λ = 0.
# Ridge Regression helps prevent overfitting by shrinking coefficients toward zero, accepting a small bias in exchange for lower variance.

# Example of Ridge Regression
from sklearn.linear_model import Ridge
import numpy as np
import matplotlib.pyplot as plt

# Create a sample dataset
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 2, 3, 4, 5])

# Fit a Ridge model
ridge_model = Ridge(alpha=1.0)
ridge_model.fit(X, y)
print(f"Ridge Coefficients: {ridge_model.coef_}")
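
# A minimal sketch of the OLS-vs-Ridge difference on the same kind of toy data.
# The names X_demo, y_demo, ols_demo, and ridge_demo are illustrative only.
# Assumption: with a single feature and an unpenalized intercept, the ridge slope
# equals Sxy / (Sxx + alpha) computed on centered data, while the OLS slope is Sxy / Sxx.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

X_demo = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_demo = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

ols_demo = LinearRegression().fit(X_demo, y_demo)
ridge_demo = Ridge(alpha=1.0).fit(X_demo, y_demo)

# Closed-form ridge slope on centered data, for comparison with scikit-learn's fit
x_centered = X_demo[:, 0] - X_demo[:, 0].mean()
y_centered = y_demo - y_demo.mean()
ridge_slope_closed_form = (x_centered @ y_centered) / (x_centered @ x_centered + 1.0)

print(f"OLS slope: {ols_demo.coef_[0]:.4f}")
print(f"Ridge slope (sklearn): {ridge_demo.coef_[0]:.4f}")
print(f"Ridge slope (closed form): {ridge_slope_closed_form:.4f}")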

# Q2. What are the assumptions of Ridge Regression?
# Ridge Regression shares most of its assumptions with OLS regression, with some of them relaxed by the regularization:
# 1. Linearity: The relationship between the independent and dependent variables is linear.
# 2. Independence of errors: The residuals (errors) should be independent of each other.
# 3. Homoscedasticity: Constant variance of residuals (errors).
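# 4. No perfect multicollinearity (relaxed): OLS needs predictors that are not exact linear combinations of one another. Ridge does not strictly need this, because for λ > 0 the penalized problem has a unique solution even when predictors are highly or perfectly correlated.
# 5. Normally distributed residuals: Not required for fitting or prediction; it matters only for classical inference (confidence intervals, hypothesis tests).
# In addition, because the penalty depends on coefficient size, predictors are usually standardized before fitting.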

# Q3. How Do You Select the Value of the Tuning Parameter (λ) in Ridge Regression?
# The tuning parameter λ (also called alpha) controls the strength of the regularization.
# A larger λ results in more regularization (shrinking coefficients more), and a smaller λ results in less regularization.
# The optimal value of λ is often selected using cross-validation. Grid search or random search are commonly used for this purpose.

from sklearn.model_selection import GridSearchCV

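# Grid search to find the best alpha value.
# With only 5 samples and cv=5, each validation fold holds a single point, so the default
# R^2 score is undefined; mean squared error is used as the scoring metric instead.
parameters = {'alpha': [0.01, 0.1, 1, 10, 100]}
ridge_grid = GridSearchCV(Ridge(), parameters, cv=5, scoring='neg_mean_squared_error')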
ridge_grid.fit(X, y)
print(f"Best alpha value: {ridge_grid.best_params_['alpha']}")
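
# A hedged alternative sketch: RidgeCV searches a list of alphas directly and, by default,
# uses an efficient form of leave-one-out cross-validation.
# The synthetic data (X_cv, y_cv) and the alpha grid below are assumptions made for illustration.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)
X_cv = rng.normal(size=(60, 3))
y_cv = X_cv @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.5, size=60)

alphas_cv = [0.01, 0.1, 1, 10, 100]
ridge_cv = RidgeCV(alphas=alphas_cv).fit(X_cv, y_cv)
print(f"Alpha selected by RidgeCV: {ridge_cv.alpha_}")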

# Q4. Can Ridge Regression be Used for Feature Selection? If Yes, How?
# Ridge Regression does not perform feature selection in the same way that Lasso regression does.
# Lasso (L1 regularization) can shrink some coefficients to zero, effectively removing them from the model.
# Ridge, on the other hand, only shrinks coefficients towards zero without completely removing them.
# Thus, while Ridge helps with regularization, it does not provide a direct method for feature selection.
# However, it can still help to improve the model by reducing the influence of irrelevant features.
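
# A minimal sketch of the contrast described above, on synthetic data where only the
# first two of five features influence the target. The data, the names (X_fs, y_fs), and
# the penalty strengths are illustrative assumptions, not tuned values.
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X_fs = rng.normal(size=(100, 5))
y_fs = 3.0 * X_fs[:, 0] - 2.0 * X_fs[:, 1] + rng.normal(scale=0.1, size=100)

ridge_fs = Ridge(alpha=1.0).fit(X_fs, y_fs)
lasso_fs = Lasso(alpha=0.5).fit(X_fs, y_fs)

# Count coefficients that are exactly zero under each penalty
n_zero_ridge = int(np.sum(ridge_fs.coef_ == 0))
n_zero_lasso = int(np.sum(lasso_fs.coef_ == 0))
print(f"Ridge coefficients: {ridge_fs.coef_} (exact zeros: {n_zero_ridge})")
print(f"Lasso coefficients: {lasso_fs.coef_} (exact zeros: {n_zero_lasso})")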

# Q5. How Does the Ridge Regression Model Perform in the Presence of Multicollinearity?
# Ridge Regression performs well in the presence of multicollinearity (when independent variables are highly correlated).
# In ordinary least squares (OLS) regression, multicollinearity can lead to large variances for coefficient estimates,
# making the model sensitive to small changes in data.
# Ridge addresses this by adding a penalty to the size of coefficients, which stabilizes the estimates even in the presence of multicollinearity.
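
# A rough sketch of the stabilizing effect, assuming two nearly identical predictors.
# Coefficient variability is estimated by refitting on bootstrap resamples; the data and
# the helper coef_spread are hypothetical and exist only for this illustration.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(42)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)  # nearly a copy of x1
X_mc = np.column_stack([x1, x2])
y_mc = 2.0 * x1 + rng.normal(scale=0.5, size=200)

def coef_spread(model, n_boot=200):
    """Standard deviation of each coefficient across bootstrap refits."""
    coefs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_mc), size=len(y_mc))
        coefs.append(model.fit(X_mc[idx], y_mc[idx]).coef_.copy())
    return np.std(coefs, axis=0)

ols_spread = coef_spread(LinearRegression())
ridge_spread = coef_spread(Ridge(alpha=1.0))
print(f"OLS coefficient std across resamples:   {ols_spread}")
print(f"Ridge coefficient std across resamples: {ridge_spread}")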

# Q6. Can Ridge Regression Handle Both Categorical and Continuous Independent Variables?
# Ridge Regression can handle both categorical and continuous independent variables, but categorical variables must be encoded numerically first.
# Nominal categories are typically one-hot encoded; integer (ordinal/label) codes are appropriate only when the categories have a natural order, since the model treats the codes as numbers.
# Continuous variables can be used directly, although standardizing them is recommended because the L2 penalty is sensitive to feature scale.

# Example: handling categorical data using one-hot encoding
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Example data with categorical variable
X_categorical = np.array([['A'], ['B'], ['C'], ['A'], ['B']])
y = np.array([1, 2, 3, 4, 5])

# One-hot encoding for the categorical variable
encoder = ColumnTransformer(
    transformers=[('cat', OneHotEncoder(), [0])],
    remainder='passthrough'
)

# Ridge model with preprocessing
ridge_with_encoding = Pipeline(steps=[('preprocessor', encoder),
                                      ('regressor', Ridge(alpha=1))])

ridge_with_encoding.fit(X_categorical, y)
print(f"Ridge Coefficients: {ridge_with_encoding.named_steps['regressor'].coef_}")
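
# A sketch of the mixed case: one continuous and one categorical column in the same model.
# The DataFrame, its column names ('size_sqft', 'city'), and the target values are made up
# for illustration. The continuous column is standardized so the L2 penalty treats features
# on a comparable scale.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df_mixed = pd.DataFrame({
    'size_sqft': [850, 1200, 1500, 900, 2000, 1100],
    'city': ['A', 'B', 'C', 'A', 'B', 'C'],
})
price = np.array([100.0, 150.0, 210.0, 105.0, 260.0, 160.0])

mixed_preprocessor = ColumnTransformer(transformers=[
    ('num', StandardScaler(), ['size_sqft']),
    ('cat', OneHotEncoder(handle_unknown='ignore'), ['city']),
])

ridge_mixed = Pipeline(steps=[('preprocessor', mixed_preprocessor),
                              ('regressor', Ridge(alpha=1.0))])
ridge_mixed.fit(df_mixed, price)

# One coefficient for the scaled numeric column plus one per city category
mixed_coefs = ridge_mixed.named_steps['regressor'].coef_
print(f"Ridge Coefficients (mixed features): {mixed_coefs}")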

# Q7. How Do You Interpret the Coefficients of Ridge Regression?
# The sign of a Ridge coefficient is read the same way as in OLS regression:
# - A positive coefficient means that as the corresponding independent variable increases, the dependent variable increases.
# - A negative coefficient means that as the independent variable increases, the dependent variable decreases.
# - The magnitude indicates the strength of the relationship, but magnitudes are comparable across features only when the features are on the same scale (e.g., standardized).
# Ridge coefficients are biased toward zero by the penalty, so they are generally smaller in magnitude than OLS coefficients, especially when λ is large.
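
# A small sketch of how the penalty shrinks coefficients as alpha grows. The synthetic
# data and the alpha grid are assumptions; features are standardized first so that the
# coefficient magnitudes are on a comparable scale.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X_raw = rng.normal(size=(80, 2)) * np.array([1.0, 100.0])  # second feature on a much larger scale
y_int = X_raw[:, 0] + 0.01 * X_raw[:, 1] + rng.normal(scale=0.1, size=80)
X_std = StandardScaler().fit_transform(X_raw)

alphas_int = [0.01, 1, 100, 10000]
coef_norms = []
for a in alphas_int:
    coefs = Ridge(alpha=a).fit(X_std, y_int).coef_
    coef_norms.append(np.linalg.norm(coefs))
    print(f"alpha={a}: coefficients={coefs}")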

# Q8. Can Ridge Regression Be Used for Time-Series Data Analysis? If Yes, How?
# Yes, Ridge Regression can be used for time-series data analysis, but with some considerations:
# - Time-series data often have autocorrelation (dependency between observations), which needs to be handled.
# - Ridge can be used in time-series forecasting models, especially when there are multiple predictors (lag features, external variables).
# - Time dependence should be captured explicitly, for example with lag features or differencing; time-series-specific models (e.g., ARIMA) are an alternative.
# Ridge Regression can work with lagged values or rolling window features that represent past data points to predict future values.

# Example: Using Ridge Regression for Time-Series Forecasting (simplified)
import pandas as pd

# Create a simple time-series dataset
data = pd.DataFrame({
    't': [1, 2, 3, 4, 5],
    'y': [1, 2, 3, 4, 5]
})

# Use previous value (lag) as feature for Ridge
data['y_lag'] = data['y'].shift(1)

# Drop the first row (because of NaN in the lag column)
data = data.dropna()

X_ts = data[['y_lag']]
y_ts = data['y']

# Fit Ridge model
ridge_ts = Ridge(alpha=1.0)
ridge_ts.fit(X_ts, y_ts)
print(f"Ridge Coefficients (Time-Series): {ridge_ts.coef_}")
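
# A hedged sketch of validating a lag-feature Ridge model without leaking future data:
# TimeSeriesSplit keeps every training fold strictly earlier than its validation fold.
# The random-walk series, the two lag columns, and alpha=1.0 are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(0)
series = pd.Series(np.cumsum(rng.normal(size=120)))

# Build lag features and drop the rows that have no history yet
lagged = pd.DataFrame({
    'y': series,
    'lag_1': series.shift(1),
    'lag_2': series.shift(2),
}).dropna()
X_lag = lagged[['lag_1', 'lag_2']]
y_lag = lagged['y']

tscv = TimeSeriesSplit(n_splits=5)
ts_scores = cross_val_score(Ridge(alpha=1.0), X_lag, y_lag,
                            cv=tscv, scoring='neg_mean_squared_error')
print(f"MSE per chronological fold: {-ts_scores}")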
