### Question 1: What is Simple Linear Regression?
Simple Linear Regression is a statistical method used to model the relationship between a dependent variable (target) and a single independent variable (predictor). It assumes a linear relationship between the variables, represented by the equation: 

**Y = β₀ + β₁X + ε**, where:
- Y is the dependent variable,
- X is the independent variable,
- β₀ is the y-intercept,
- β₁ is the slope of the line,
- ε is the error term.
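
As a minimal sketch of how β₀ and β₁ are commonly estimated, the ordinary least squares formulas can be computed directly with NumPy. The function name `fit_simple_ols` is hypothetical and the data values are a small illustrative sample, not real measurements.

```python
import numpy as np

def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: co-variation of x and y divided by the variation of x
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Illustrative data (assumed)
x = [1, 2, 3, 4, 5]
y = [2.1, 4.3, 6.1, 7.9, 10.2]
b0, b1 = fit_simple_ols(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
```

The slope formula is the sample covariance of X and Y divided by the sample variance of X; the intercept then follows from requiring the line to pass through the means of X and Y.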

### Question 2: What are the key assumptions of Simple Linear Regression?
1. Linearity: The relationship between X and Y is linear.
2. Independence: Observations are independent of each other.
3. Homoscedasticity: Constant variance of errors.
4. Normality: Residuals (errors) are normally distributed.
5. No multicollinearity: Trivially satisfied with a single predictor; this assumption only matters in multiple regression.

### Question 3: What is heteroscedasticity, and why is it important to address in regression models?
Heteroscedasticity occurs when the variance of the residuals is not constant across all levels of the independent variable. 
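It violates the assumption of homoscedasticity. The OLS coefficient estimates remain unbiased but become inefficient, and the usual standard errors are biased, which makes statistical tests (e.g., t-tests and F-tests) and confidence intervals unreliable. 
Addressing it restores valid inference from the regression results.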
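
To make the idea concrete, the following sketch simulates data whose noise grows with X and compares the residual spread at low and high values of X. The data-generating values (intercept 3, slope 2, noise proportional to X) are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Simulated data (assumed): noise standard deviation grows in proportion to x
x = np.linspace(1, 10, 200)
y = 3.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)

# Fit a straight line by ordinary least squares
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

# Compare residual spread for the lower and upper halves of x
half = len(x) // 2
spread_low = residuals[:half].std()
spread_high = residuals[half:].std()
print(f"residual std (low x):  {spread_low:.2f}")
print(f"residual std (high x): {spread_high:.2f}")
```

A clearly larger spread in the upper half is the numerical counterpart of the "funnel" shape often seen in a residuals plot.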

### Question 4: What is Multiple Linear Regression?
Multiple Linear Regression is a statistical technique used to model the relationship between one dependent variable and two or more independent variables. 
It extends simple linear regression by incorporating multiple predictors. The model is:
**Y = β₀ + β₁X₁ + β₂X₂ + ... + βₙXₙ + ε**
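
As a rough sketch, the coefficients of this model can be estimated by least squares on a design matrix that has a column of ones for the intercept. The data below are made up and noise-free, generated from y = 1 + 2·x1 − 3·x2, so the fit should recover those coefficients.

```python
import numpy as np

# Made-up, noise-free data generated from y = 1 + 2*x1 - 3*x2
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0, 3.0, 2.0])
y = 1.0 + 2.0 * x1 - 3.0 * x2

# Design matrix: a column of ones for the intercept, then one column per predictor
X = np.column_stack([np.ones_like(x1), x1, x2])

# Least-squares solution of X @ beta ≈ y
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # order: intercept, coefficient of x1, coefficient of x2
```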

### Question 5: What is polynomial regression, and how does it differ from linear regression?
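Polynomial regression is a type of regression where the relationship between the independent variable and the dependent variable is modeled as an nth-degree polynomial.
It captures non-linear (curved) relationships, unlike simple linear regression, which assumes a straight-line relationship. The model is still linear in its coefficients, so it can be fitted with ordinary least squares by treating X, X², ..., Xⁿ as separate predictors.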
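
One way to see the connection to linear regression is to build the powers of X as columns of a design matrix and solve an ordinary least squares problem. The sketch below uses made-up, noise-free data from y = 1 + 0.5·x + 2·x², so the fitted coefficients should match those values.

```python
import numpy as np

# Made-up, noise-free data from y = 1 + 0.5*x + 2*x**2
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 + 0.5 * x + 2.0 * x**2

# Columns are x**0, x**1, x**2: the polynomial terms act as separate predictors
X_poly = np.vander(x, N=3, increasing=True)

# Ordinary least squares on the expanded features
coeffs, *_ = np.linalg.lstsq(X_poly, y, rcond=None)
print(coeffs)  # order: constant, linear, quadratic
```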

### Question 6: Simple Linear Regression Implementation

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Sample data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
Y = np.array([2.1, 4.3, 6.1, 7.9, 10.2])

# Model fitting
model = LinearRegression()
model.fit(X, Y)
Y_pred = model.predict(X)

# Plot
plt.scatter(X, Y, color='blue', label='Data')
plt.plot(X, Y_pred, color='red', label='Regression Line')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.title('Simple Linear Regression')
plt.show()

### Question 7: Multiple Linear Regression and VIF

In [None]:
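import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant
from sklearn.linear_model import LinearRegression

# Data
data = pd.DataFrame({
    'Area': [1200, 1500, 1800, 2000],
    'Rooms': [2, 3, 3, 4],
    'Price': [250000, 300000, 320000, 370000]
})

# Features and target
X = data[['Area', 'Rooms']]
y = data['Price']

# VIF calculation: add an intercept column first; without it the
# auxiliary regressions have no constant and the VIFs are inflated
X_const = add_constant(X)
vif_data = pd.DataFrame()
vif_data['feature'] = X.columns
# Index 0 is the constant, so start at 1 to report only the real features
vif_data['VIF'] = [variance_inflation_factor(X_const.values, i) for i in range(1, X_const.shape[1])]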

# Model fitting
model = LinearRegression()
model.fit(X, y)

vif_data

### Question 8: Polynomial Regression

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
Y = np.array([2.2, 4.8, 7.5, 11.2, 14.7])

# Polynomial transformation
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Model
model = LinearRegression()
model.fit(X_poly, Y)
Y_pred = model.predict(X_poly)

# Plot
plt.scatter(X, Y, color='blue', label='Data')
plt.plot(X, Y_pred, color='green', label='Polynomial Fit')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.title('Polynomial Regression')
plt.show()

### Question 9: Residuals Plot and Heteroscedasticity

In [None]:
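import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Data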
X = np.array([10, 20, 30, 40, 50]).reshape(-1, 1)
Y = np.array([15, 35, 40, 50, 65])

# Model
model = LinearRegression()
model.fit(X, Y)
Y_pred = model.predict(X)
residuals = Y - Y_pred

# Plot
plt.scatter(X, residuals)
plt.axhline(y=0, color='red', linestyle='--')
plt.title('Residuals Plot')
plt.xlabel('X')
plt.ylabel('Residuals')
plt.show()

### Question 10: Addressing Heteroscedasticity and Multicollinearity
To address **heteroscedasticity**:
- Use transformations (e.g., log, sqrt) on the dependent variable.
- Apply Weighted Least Squares (WLS).
- Use robust standard errors.
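
As an illustration of the WLS option, the sketch below rescales each observation by the square root of its weight and then solves an ordinary least squares problem, which is equivalent to minimizing the weighted sum of squared residuals. The simulated data and the choice of weights 1/x² (error variance assumed proportional to x²) are assumptions; in practice the variance structure has to be estimated or justified.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed for reproducibility

# Simulated data (assumed): error standard deviation proportional to x
x = np.linspace(1, 10, 100)
y = 3.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)

# Design matrix with an intercept column
X = np.column_stack([np.ones_like(x), x])

# Weights are the inverse error variances, here assumed known up to a constant
w = 1.0 / x**2

# WLS is OLS after scaling each row by the square root of its weight
sqrt_w = np.sqrt(w)
beta_wls, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print("OLS:", beta_ols)
print("WLS:", beta_wls)
```

Both estimators target the same coefficients; the weighted version gives less influence to the noisier observations.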

To address **multicollinearity**:
- Remove or combine correlated predictors.
- Use Principal Component Analysis (PCA).
- Apply regularization techniques such as Ridge or Lasso regression.
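
To sketch how regularization helps, the example below applies the closed-form ridge solution to two nearly identical predictors. The simulated data, the helper name `ridge_coefficients`, and the penalty value `alpha=1.0` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed for reproducibility

# Simulated data (assumed): x2 is almost a copy of x1, so the predictors are highly collinear
n = 50
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(scale=0.5, size=n)

# Center predictors and target so the intercept is not penalized
X = np.column_stack([x1, x2])
Xc = X - X.mean(axis=0)
yc = y - y.mean()

def ridge_coefficients(Xc, yc, alpha):
    """Closed-form ridge solution (Xc'Xc + alpha*I)^-1 Xc'yc; alpha=0 gives OLS."""
    p = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ yc)

beta_ols = ridge_coefficients(Xc, yc, alpha=0.0)
beta_ridge = ridge_coefficients(Xc, yc, alpha=1.0)
print("OLS:  ", beta_ols)
print("Ridge:", beta_ridge)
```

With collinear predictors the unpenalized coefficients tend to be unstable, while the ridge penalty shrinks them toward a more balanced split. Predictors are usually standardized before applying ridge so that the penalty treats them on a common scale.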