Q1. Explain the difference between simple linear regression and multiple linear regression. Provide an
example of each.

Q2. Discuss the assumptions of linear regression. How can you check whether these assumptions hold in
a given dataset?

Q3. How do you interpret the slope and intercept in a linear regression model? Provide an example using
a real-world scenario.

Q4. Explain the concept of gradient descent. How is it used in machine learning?

Q5. Describe the multiple linear regression model. How does it differ from simple linear regression?

Q6. Explain the concept of multicollinearity in multiple linear regression. How can you detect and
address this issue?

Q7. Describe the polynomial regression model. How is it different from linear regression?

Q8. What are the advantages and disadvantages of polynomial regression compared to linear
regression? In what situations would you prefer to use polynomial regression?

### Q1: Difference between Simple Linear Regression and Multiple Linear Regression
- **Simple Linear Regression**: This model describes the relationship between two variables: one independent variable (predictor) and one dependent variable (response). The equation is:

  \[
  Y = b_0 + b_1X
  \]

  Example: Predicting house price based on square footage. 
  - **Y**: House price (dependent)
  - **X**: Square footage (independent)

- **Multiple Linear Regression**: In this model, the dependent variable is predicted based on two or more independent variables. The equation is:

  \[
  Y = b_0 + b_1X_1 + b_2X_2 + \dots + b_nX_n
  \]

  Example: Predicting house price based on square footage, number of bedrooms, and location. 
  - **Y**: House price
  - **X₁**: Square footage
  - **X₂**: Number of bedrooms
  - **X₃**: Location rating
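
As a rough illustration of the two models, the sketch below fits both with NumPy's least-squares solver. The data is hypothetical and noise-free, and it uses only two predictors (square footage and bedrooms) to stay short; the generating coefficients are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical, noise-free data: price = 10000 + 50*sqft + 5000*bedrooms
sqft = np.array([800.0, 1000.0, 1200.0, 1500.0, 1800.0, 2000.0])
bedrooms = np.array([2.0, 1.0, 3.0, 2.0, 4.0, 3.0])
price = 10000 + 50 * sqft + 5000 * bedrooms

# Simple linear regression: intercept + one predictor (square footage).
X_simple = np.column_stack([np.ones_like(sqft), sqft])
coef_simple, *_ = np.linalg.lstsq(X_simple, price, rcond=None)

# Multiple linear regression: intercept + two predictors.
X_multi = np.column_stack([np.ones_like(sqft), sqft, bedrooms])
coef_multi, *_ = np.linalg.lstsq(X_multi, price, rcond=None)

print("simple   [b0, b1]:    ", coef_simple)
print("multiple [b0, b1, b2]:", coef_multi)
```

Because the data is generated without noise, the multiple regression recovers the generating coefficients, while the simple regression folds the omitted bedroom effect into its intercept and slope.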

### Q2: Assumptions of Linear Regression
1. **Linearity**: The relationship between the independent and dependent variables should be linear.
2. **Independence**: Observations should be independent of each other.
3. **Homoscedasticity**: The errors (residuals) should have constant variance across all levels of the predictors.
4. **Normality of Errors**: The residuals should follow a normal distribution.
5. **No Multicollinearity**: Independent variables should not be highly correlated with one another.

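  **How to check these assumptions**:
  - **Linearity**: Plot residuals against fitted values (or against each predictor); a curved pattern suggests a nonlinear relationship.
  - **Independence**: Use the Durbin-Watson test on the residuals; values near 2 suggest no first-order autocorrelation. This matters most for time-ordered data.
  - **Homoscedasticity**: Check the residual vs. fitted plot for a funnel shape, or use a formal test such as Breusch-Pagan.
  - **Normality**: Use a Q-Q plot of the residuals or the Shapiro-Wilk test.
  - **Multicollinearity**: Compute the variance inflation factor (VIF) for each predictor.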
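
Some of these checks can be computed directly from the residuals. The sketch below is a minimal version using only NumPy: it fits a line to synthetic data (an assumption for illustration), extracts the residuals, and computes the Durbin-Watson statistic by hand. The helper names `ols_residuals` and `durbin_watson` are hypothetical, not from a library.

```python
import numpy as np

def ols_residuals(X, y):
    """Fit OLS with an intercept; return (fitted values, residuals)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ coef
    return fitted, y - fitted

def durbin_watson(residuals):
    """Durbin-Watson statistic, always between 0 and 4.

    Values near 2 suggest little first-order autocorrelation.
    """
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Synthetic data (assumed for the example): a linear trend plus independent noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 + 2.0 * x + rng.normal(scale=1.0, size=x.size)

fitted, resid = ols_residuals(x, y)
dw = durbin_watson(resid)
print("Durbin-Watson:", dw)
```

The `fitted` and `resid` arrays are also what a residual vs. fitted plot or a Q-Q plot would be drawn from, using whichever plotting library is at hand.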

### Q3: Interpretation of Slope and Intercept
- **Intercept (b₀)**: The predicted value of the dependent variable when all independent variables are zero. In the house price example, the intercept is the predicted price at square footage = 0. Since no real house has zero square footage, this is an extrapolation; it is best read as a baseline that positions the line rather than a meaningful price.

- **Slope (b₁, b₂, etc.)**: The expected change in the dependent variable for a one-unit increase in the corresponding independent variable, holding the other independent variables constant. For instance, if **b₁** = 50, each additional square foot is associated with a $50 increase in predicted price.

  **Example**: If the model is:

  \[
  \text{Price} = 10000 + 50 \times \text{Square footage}
  \]
  - The intercept (10,000) is the predicted price at 0 square feet. That value lies outside the range of real houses, so it acts as a baseline for the line rather than a realistic price.
  - The slope (50) indicates that each extra square foot adds $50 to the predicted price.
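
To make the interpretation concrete, here is a small sketch that evaluates the example model at a few inputs. The function name `predict_price` is hypothetical; the coefficients are the ones from the equation above.

```python
def predict_price(square_feet, intercept=10000.0, slope=50.0):
    """Price = intercept + slope * square footage (coefficients from the example)."""
    return intercept + slope * square_feet

price_1500 = predict_price(1500)  # 10000 + 50 * 1500 = 85000
price_1501 = predict_price(1501)

print(price_1500)               # predicted price for 1,500 sq ft
print(price_1501 - price_1500)  # effect of one extra square foot (the slope)
print(predict_price(0))         # the intercept
```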

### Q4: Concept of Gradient Descent
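Gradient descent is an iterative optimization algorithm used to minimize a cost function in machine learning. At each step it computes the gradient of the cost function with respect to the model's parameters and moves the parameters a small step in the opposite direction, because the negative gradient points toward the steepest decrease in cost. Repeating this step moves the parameters toward values that minimize the cost.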

In linear regression, the cost function is typically Mean Squared Error (MSE), and gradient descent minimizes this by updating the weights using the formula:

\[
\theta_j = \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)
\]

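Where:
- \(\theta_j\) is the \(j\)-th model parameter (weight); all parameters are updated simultaneously at each step
- \(\alpha\) is the learning rate, which controls the step size: too large and the updates can diverge, too small and convergence is slow
- \(J(\theta)\) is the cost function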
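
The update rule can be sketched in a few lines of NumPy. This is a minimal batch version for linear regression with MSE as the cost; the function name, learning rate, iteration count, and toy data are all assumptions for illustration.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, n_iters=2000):
    """Batch gradient descent minimizing J(theta) = mean((X @ theta - y) ** 2)."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(n_iters):
        # Gradient of the MSE with respect to theta.
        gradient = (2.0 / m) * X.T @ (X @ theta - y)
        # Simultaneous update of every parameter.
        theta = theta - alpha * gradient
    return theta

# Toy, noise-free data generated from y = 1 + 2x.
x = np.linspace(0, 1, 20)
X = np.column_stack([np.ones_like(x), x])  # column of ones for the intercept
y = 1.0 + 2.0 * x

theta = gradient_descent(X, y)
print(theta)  # should be close to [1, 2]
```

On real data the features are usually scaled first, since predictors on very different scales force a much smaller learning rate and slow convergence.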

### Q5: Multiple Linear Regression Model
Multiple linear regression extends the simple linear regression model to include more than one independent variable:

\[
Y = b_0 + b_1X_1 + b_2X_2 + \dots + b_nX_n
\]

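The primary difference is the number of predictors: multiple regression uses two or more. Each predictor has its own coefficient, interpreted as the expected change in Y for a one-unit increase in that predictor while the others are held constant. Geometrically, the model fits a plane or hyperplane rather than a line.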

### Q6: Multicollinearity in Multiple Linear Regression
**Multicollinearity** occurs when two or more independent variables are highly correlated with each other. It makes the coefficient estimates unstable and inflates their standard errors, so it becomes hard to isolate the effect of each variable.

**Detection**:
- Variance Inflation Factor (VIF): A VIF above 5 (or above 10, by a more lenient rule of thumb) is commonly taken to indicate problematic multicollinearity.
- Correlation Matrix: Check for high pairwise correlations between independent variables.

**Addressing multicollinearity**:
- Remove or combine highly correlated variables.
- Use regularization techniques like Ridge or Lasso regression.
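
The VIF check from the detection step can be sketched with NumPy alone, using the fact that the VIF of each predictor equals the corresponding diagonal entry of the inverse of the predictors' correlation matrix. The helper name `vif` and the synthetic predictors are assumptions for illustration.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (rows are observations).

    Uses VIF_j = [R^-1]_jj, where R is the predictor correlation matrix.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic predictors (assumed): rooms is nearly a linear function of sqft.
rng = np.random.default_rng(0)
n = 200
sqft = rng.normal(1500, 300, n)
rooms = sqft / 300 + rng.normal(0, 0.1, n)
age = rng.normal(30, 10, n)  # generated independently of the other two

X = np.column_stack([sqft, rooms, age])
vifs = vif(X)
print(dict(zip(["sqft", "rooms", "age"], vifs)))
```

Here `sqft` and `rooms` should show very large VIFs while `age` stays near 1, which points at dropping or combining one of the two correlated predictors.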

### Q7: Polynomial Regression Model
Polynomial regression is an extension of linear regression that models the dependent variable as an \(n\)th-degree polynomial in the independent variable:

\[
Y = b_0 + b_1X + b_2X^2 + \dots + b_nX^n
\]

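This allows for a curved relationship between X and Y. The model is still linear in the coefficients, so it can be fit with the same least-squares method as linear regression by treating \(X, X^2, \dots, X^n\) as separate predictors.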

**Difference**:
- **Linear regression** fits a straight line.
- **Polynomial regression** fits a curved line by introducing higher-degree terms of the independent variable.
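
A short sketch of the idea, assuming a noise-free quadratic as toy data: the polynomial terms become columns of a design matrix, and ordinary least squares does the rest.

```python
import numpy as np

# Toy data (assumed): an exact quadratic, y = 1 + 2x + 0.5x^2.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x + 0.5 * x**2

# Design matrix with columns [1, x, x^2].
X_poly = np.vander(x, N=3, increasing=True)
coef_poly, *_ = np.linalg.lstsq(X_poly, y, rcond=None)

# Straight-line fit for comparison: columns [1, x] only.
X_lin = X_poly[:, :2]
coef_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)

sse_poly = float(np.sum((X_poly @ coef_poly - y) ** 2))
sse_lin = float(np.sum((X_lin @ coef_lin - y) ** 2))
print("quadratic coefficients:", coef_poly)
print("SSE quadratic vs. line:", sse_poly, sse_lin)
```

The quadratic fit recovers the generating coefficients, while the straight line leaves a clearly nonzero sum of squared errors because it cannot follow the curvature.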

### Q8: Advantages and Disadvantages of Polynomial Regression
- **Advantages**:
  - Can capture more complex relationships compared to linear regression.
  - More flexibility in fitting data points.

- **Disadvantages**:
  - Overfitting: High-degree polynomials can fit the noise in the training data, leading to poor generalization.
  - Higher complexity and harder interpretation: individual coefficients no longer have a simple per-unit meaning.

**When to use polynomial regression**:
- When the relationship between variables is nonlinear.
- For example, in modeling growth rates that accelerate or decelerate over time, such as the spread of diseases.

In contrast, **linear regression** is preferred when the relationship between variables is approximately linear and simple.
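
The overfitting trade-off can be explored with a small experiment. The sketch below is illustrative only: the data is synthetic (a quadratic plus noise), and the degrees, sample sizes, and noise level are assumptions. It compares training and test error for polynomial fits of degree 1, 2, and 9.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic data: a quadratic trend plus Gaussian noise."""
    x = rng.uniform(-1, 1, n)
    y = 1.0 + 2.0 * x + 3.0 * x**2 + rng.normal(0, 0.1, n)
    return x, y

def mse(coefs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coefs, x) - y) ** 2))

x_train, y_train = make_data(30)
x_test, y_test = make_data(200)

results = {}
for degree in (1, 2, 9):
    coefs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coefs, x_train, y_train), mse(coefs, x_test, y_test))

for degree, (train_err, test_err) in results.items():
    print(f"degree {degree}: train MSE {train_err:.4f}, test MSE {test_err:.4f}")
```

Training error can only go down as the degree increases, so it is the test error that shows whether the extra flexibility helps. In setups like this one, the straight line underfits, and very high degrees often give little or no test improvement over the true degree.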