### **Q1. What is Ridge Regression, and how does it differ from Ordinary Least Squares (OLS) Regression?**

**Ridge Regression** is a regularized version of linear regression that adds a penalty (called **L2 regularization**) to the loss function.  
This penalty term shrinks the coefficients to prevent overfitting, especially when predictors are highly correlated.

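**OLS Regression** minimizes the sum of squared residuals.  
**Ridge Regression** minimizes:  
`Sum of Squared Residuals + λ * Sum of Squared Coefficients`

Here λ ≥ 0, and the intercept is usually left out of the penalty.

➡️ **Key difference**: Ridge includes a penalty term controlled by **lambda (λ)**. Setting λ = 0 recovers OLS, and larger values of λ shrink the coefficients more strongly.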
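
To make the difference concrete, here is a minimal sketch that solves both problems in closed form with NumPy and checks the Ridge result against scikit-learn. The synthetic data, the `true_beta` values, and `lam = 10.0` are assumptions chosen only for illustration, and the intercept is dropped so that the two formulas line up exactly.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic data (illustrative assumption, not from any real dataset)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_beta = np.array([2.0, -1.0, 0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=100)

lam = 10.0  # penalty strength (lambda); arbitrary illustrative value

# OLS: solve (X^T X) beta = X^T y
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Ridge: solve (X^T X + lam * I) beta = X^T y
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# scikit-learn's Ridge minimizes the same objective when no intercept is fitted
sk_ridge = Ridge(alpha=lam, fit_intercept=False).fit(X, y)

print("OLS coefficients:   ", beta_ols)
print("Ridge coefficients: ", beta_ridge)
print("sklearn Ridge:      ", sk_ridge.coef_)
```

The only change between the two solves is the `lam * np.eye(...)` term added to `X.T @ X`, which is what pulls the Ridge coefficients toward zero.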

---

### **Q2. What are the assumptions of Ridge Regression?**

Ridge shares most assumptions with OLS regression:
1. **Linearity** – The relationship between features and target is linear.
2. **Independence** – Observations are independent of each other.
3. **Homoscedasticity** – Constant variance of residuals.
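4. **No perfect multicollinearity** – OLS requires this, but Ridge relaxes it: for any λ > 0 the Ridge solution is unique even when predictors are perfectly collinear, although the individual coefficients of collinear features are then hard to interpret.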
5. **Normality of errors** – Not strictly required but useful for inference.

---

### **Q3. How do you select the value of the tuning parameter (lambda) in Ridge Regression?**

Use **cross-validation** (e.g., K-Fold CV) to find the best lambda. In scikit-learn, λ is the `alpha` parameter, as in `Ridge(alpha=...)`.

Common approaches:
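- **Grid Search**: Try multiple values (typically spaced on a log scale) and pick the one with the lowest validation error.
- **Built-in functions**: `RidgeCV` in scikit-learn selects the best lambda from a list of candidates using CV (efficient leave-one-out CV by default).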
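
A minimal sketch of the built-in approach, assuming synthetic data from `make_regression` and an arbitrary log-spaced candidate grid. Features are standardized first because the penalty depends on feature scale.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression problem (illustrative assumption)
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

# Candidate penalty strengths on a log scale (arbitrary illustrative grid)
alphas = np.logspace(-3, 3, 13)

# Standardize features, then let RidgeCV choose alpha with 5-fold CV
model = make_pipeline(StandardScaler(), RidgeCV(alphas=alphas, cv=5))
model.fit(X, y)

best_alpha = model.named_steps["ridgecv"].alpha_
print("Selected alpha:", best_alpha)
```

If the selected value sits at either end of the grid, it is usually worth widening the range and searching again.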

---

### **Q4. Can Ridge Regression be used for feature selection?**

Not directly.

- **Ridge shrinks** coefficients toward zero but **not exactly to zero**.
- It **keeps all features**, unlike **Lasso**, which can zero out coefficients (i.e., remove features).

So Ridge is **not ideal** for feature selection, but it is well suited to reducing model variance while keeping every feature in the model.
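
The contrast with Lasso can be checked by counting exact zeros in the fitted coefficients. This is a sketch on synthetic data where only 5 of 20 features are informative. The penalty values are arbitrary assumptions, and note that `alpha` is scaled differently in `Ridge` and `Lasso`, so the two values are not directly comparable.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 20 features, only 5 actually influence y (illustrative assumption)
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=5.0, random_state=0
)

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=5.0).fit(X, y)

# Count coefficients that are exactly zero in each model
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0))

print("Exact zeros in Ridge coefficients:", n_zero_ridge)
print("Exact zeros in Lasso coefficients:", n_zero_lasso)
```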

---

### **Q5. How does Ridge Regression perform with multicollinearity?**

Ridge is **especially useful** in the presence of multicollinearity.

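- It **stabilizes** the regression coefficients by shrinking them.
- With highly correlated features, OLS coefficients can become very large, take opposite signs, and change sharply from sample to sample. The Ridge penalty prevents these extreme values.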
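
The stabilizing effect can be illustrated with a small bootstrap experiment. In this sketch, `x2` is constructed as a near copy of `x1` (an assumption made to force strong collinearity), and the helper `bootstrap_coef_std` is a hypothetical function written only for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(42)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)  # nearly a copy of x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=1.0, size=n)


def bootstrap_coef_std(model, X, y, n_rounds=50):
    """Refit the model on bootstrap resamples and return the std of each coefficient."""
    coefs = []
    for _ in range(n_rounds):
        idx = rng.integers(0, len(y), size=len(y))  # sample rows with replacement
        coefs.append(model.fit(X[idx], y[idx]).coef_.copy())
    return np.std(coefs, axis=0)


ols_std = bootstrap_coef_std(LinearRegression(), X, y)
ridge_std = bootstrap_coef_std(Ridge(alpha=1.0), X, y)

print("OLS coefficient std across resamples:  ", ols_std)
print("Ridge coefficient std across resamples:", ridge_std)
```

The OLS coefficients swing widely between resamples because the data cannot tell the two features apart, while the Ridge coefficients stay close together.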

---

### **Q6. Can Ridge Regression handle both categorical and continuous independent variables?**

Yes, but:
- **Categorical variables** must be **converted** (e.g., using One-Hot Encoding).
- Ridge works only with **numerical** data, so preprocessing is necessary.
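
One common way to do this preprocessing is a scikit-learn pipeline that encodes the categorical columns and scales the numeric ones before fitting Ridge. The tiny DataFrame and its column names (`size_sqft`, `city`, `price`) are made up for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data with one continuous and one categorical predictor (hypothetical values)
df = pd.DataFrame({
    "size_sqft": [850, 1200, 950, 1600, 700, 1400],
    "city": ["A", "B", "A", "C", "B", "C"],
    "price": [200, 310, 230, 420, 180, 370],
})

# Scale the numeric column, one-hot encode the categorical column
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["size_sqft"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

model = Pipeline([("prep", preprocess), ("ridge", Ridge(alpha=1.0))])
model.fit(df[["size_sqft", "city"]], df["price"])

preds = model.predict(df[["size_sqft", "city"]])
print(preds)
```

Keeping the encoding inside the pipeline means the same transformation is applied at prediction time and inside each cross-validation fold.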

---

### **Q7. How do you interpret the coefficients of Ridge Regression?**

Interpretation is similar to linear regression:
- Each coefficient shows the **expected change** in the output for a one-unit change in that variable (holding others constant).

However:
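- Due to regularization, coefficients are **shrunk** toward zero, so they tend to understate each variable's effect and may be **less interpretable**.
- The penalty depends on feature scale, so features are usually standardized first. Coefficients then describe the change per standard deviation rather than per original unit.
- They are biased but have **lower variance**, which can improve prediction.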
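
To see how much the penalty affects the coefficients, one option is to refit the model over a range of penalty strengths and track the size of the coefficient vector. The data and the `alphas` list below are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

# Synthetic data, standardized so every feature is penalized on the same scale
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=1)
X = StandardScaler().fit_transform(X)

alphas = [0.01, 1.0, 100.0, 10000.0]  # arbitrary illustrative penalty strengths
norms = []
for a in alphas:
    coef = Ridge(alpha=a).fit(X, y).coef_
    norms.append(float(np.linalg.norm(coef)))  # overall size of the coefficient vector
    print(f"alpha={a:>8}: coefficients={np.round(coef, 2)}")
```

The coefficient norm falls as `alpha` grows, so any reading of a Ridge coefficient is tied to the penalty strength that was used.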

---

### **Q8. Can Ridge Regression be used for time-series data analysis? If yes, how?**

Yes, but with care.

- You must handle **time dependencies** using lag features, rolling averages, etc.
- Use **time-based validation** (not random splits) when tuning lambda.

➡️ Ridge helps if your time-series model has **many features or collinearity** among lagged variables.
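
A minimal sketch of this workflow, assuming a synthetic random-walk series and five lag features (both chosen only for illustration). `TimeSeriesSplit` keeps every validation fold strictly after its training fold, so lambda is tuned without looking into the future.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic random-walk series (illustrative assumption)
rng = np.random.default_rng(0)
series = pd.Series(np.cumsum(rng.normal(size=300)), name="y")

# Build lag features: lag_k holds the value from k steps earlier
n_lags = 5
frame = pd.DataFrame({f"lag_{k}": series.shift(k) for k in range(1, n_lags + 1)})
frame["y"] = series
frame = frame.dropna()  # the first n_lags rows have incomplete lags

X = frame.drop(columns="y")
y = frame["y"]

# Tune alpha with time-ordered folds instead of random splits
search = GridSearchCV(
    make_pipeline(StandardScaler(), Ridge()),
    param_grid={"ridge__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]},
    cv=TimeSeriesSplit(n_splits=5),
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

best_alpha = search.best_params_["ridge__alpha"]
print("Best alpha:", best_alpha)
```

Adjacent lags of a series like this are strongly correlated, which is the situation where the Ridge penalty tends to help.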