# 1. What is Simple Linear Regression?
Simple Linear Regression models the relationship between one independent variable X and one dependent variable Y.
Equation: Y = m*X + c

# 2. What are the key assumptions of Simple Linear Regression?
1) Linearity: The expected value of Y is a linear function of X.
2) Independence: Observations are independent.
3) Homoscedasticity: Constant variance of residuals.
4) Normality: Residuals are approximately normally distributed.
5) No perfect multicollinearity (not applicable with a single predictor; it matters only in multiple regression).

# 3. What does the coefficient m represent in the equation Y = mX + c?
m is the slope. It represents the expected change in Y for a one-unit increase in X.

# 4. What does the intercept c represent in the equation Y = mX + c?
c is the intercept (bias). It is the predicted value of Y when X = 0.

# 5. How do we calculate the slope m in Simple Linear Regression?
Formula:
m = Σ((Xi - X̄)*(Yi - Ȳ)) / Σ((Xi - X̄)²)
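
The slope formula above can be sketched in plain Python. This is a minimal illustration on made-up toy data, not a reference implementation; the function name `fit_simple_ols` is hypothetical, and the intercept is obtained from the standard companion formula c = Ȳ - m*X̄.

```python
def fit_simple_ols(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope formula
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx
    c = y_bar - m * x_bar  # the fitted line passes through (x_bar, y_bar)
    return m, c


# Toy data, assumed purely for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, c = fit_simple_ols(xs, ys)
print(m, c)
```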

# 6. What is the purpose of the least squares method in Simple Linear Regression?
To find m and c that minimize the sum of squared residuals:
SSE = Σ(Yi - (m*Xi + c))²
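
As a rough check of what "minimize" means here, the sketch below evaluates SSE at the least-squares values for a small toy dataset (m = 0.6, c = 2.2, obtained from the slope formula in question 5) and at a few nearby values. The data and the helper name `sse` are assumptions made for illustration.

```python
def sse(m, c, xs, ys):
    """Sum of squared residuals for the line y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))


# Toy data, assumed purely for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

best = sse(0.6, 2.2, xs, ys)

# Nudging m or c away from the least-squares values should only increase SSE
nearby = [
    sse(0.6 + dm, 2.2 + dc, xs, ys)
    for dm in (-0.1, 0.1)
    for dc in (-0.1, 0.1)
]
print(best, nearby)
```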

# 7. How is the coefficient of determination (R²) interpreted?
R² measures the proportion of variance in Y explained by the model.
Formula:
R² = 1 - (SS_res / SS_tot)
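
The formula can be computed directly from observed and predicted values. The sketch below assumes a small toy dataset and a fitted line with m = 0.6 and c = 2.2; the function name `r_squared` is hypothetical.

```python
def r_squared(ys, preds):
    """R² = 1 - SS_res / SS_tot."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))  # unexplained variation
    ss_tot = sum((y - y_bar) ** 2 for y in ys)             # total variation
    return 1 - ss_res / ss_tot


# Toy data and fitted line, assumed purely for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
preds = [0.6 * x + 2.2 for x in xs]

r2 = r_squared(ys, preds)
print(r2)
```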

# 8. What is Multiple Linear Regression?
A regression with two or more independent variables predicting Y.
Equation:
Y = b0 + b1*X1 + b2*X2 + ... + bn*Xn
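
One possible way to estimate b0, b1, ..., bn is ordinary least squares on a design matrix with a leading column of ones. The sketch below uses NumPy's `np.linalg.lstsq` on noise-free toy data generated from assumed coefficients (1, 2, 3), so the fit should recover them.

```python
import numpy as np

# Two predictors (columns X1 and X2); toy values assumed for illustration
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]], dtype=float)

# Assumed "true" coefficients b0, b1, b2 used to generate noise-free targets
true_coefs = np.array([1.0, 2.0, 3.0])
y = true_coefs[0] + X @ true_coefs[1:]

# Prepend a column of ones so the first fitted coefficient is the intercept b0
design = np.column_stack([np.ones(len(X)), X])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
print(coefs)
```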

# 9. What is the main difference between Simple and Multiple Linear Regression?
Simple uses 1 predictor (X). Multiple uses 2 or more predictors (X1, X2, ...).

# 10. What are the key assumptions of Multiple Linear Regression?
1) Linearity in parameters
2) Independence of errors
3) Homoscedasticity
4) Normality of residuals
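5) No perfect multicollinearity among predictors (strong but imperfect correlation is allowed, though it inflates standard errors)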

# 11. What is heteroscedasticity, and how does it affect results?
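Heteroscedasticity is non-constant variance of the residuals across the range of fitted values (e.g., a funnel shape in a residual plot).
The OLS coefficient estimates remain unbiased but are no longer efficient, and the usual standard errors are biased, so t-tests, p-values, and confidence intervals can be misleading.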

# 12. How can you improve a Multiple Linear Regression model with high multicollinearity?
- Remove highly correlated variables
- Use PCA (Principal Component Analysis)
- Apply Ridge or Lasso regression
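
For the first option, a simple starting point is to flag pairs of predictors whose correlation exceeds a cutoff and then drop one from each pair. The sketch below is one way to do this with `np.corrcoef`; the helper name `highly_correlated_pairs`, the 0.9 threshold, and the toy data are all assumptions.

```python
import numpy as np


def highly_correlated_pairs(X, threshold=0.9):
    """Return (i, j) column index pairs with |correlation| above threshold."""
    corr = np.corrcoef(X, rowvar=False)  # columns are treated as variables
    n_features = corr.shape[0]
    return [
        (i, j)
        for i in range(n_features)
        for j in range(i + 1, n_features)
        if abs(corr[i, j]) > threshold
    ]


# Toy predictors: x2 is almost exactly 2 * x1, x3 is largely unrelated
x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
x2 = 2 * x1 + np.array([0.1, -0.1, 0.1, -0.1, 0.1, -0.1])
x3 = np.array([5, 1, 4, 2, 6, 3], dtype=float)
X = np.column_stack([x1, x2, x3])

pairs = highly_correlated_pairs(X)
print(pairs)
```

Pairwise correlation only catches two-variable redundancy; a predictor that is a near-linear combination of several others can slip through this check.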

# 13. What are some common techniques for transforming categorical variables?
- One-hot encoding
- Label (ordinal) encoding, best suited to categories with a natural order
- Target encoding
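
As an illustration of the first technique, here is a minimal one-hot encoder in plain Python. The function name `one_hot_encode` and the `drop_first` option are hypothetical; dropping one category is a common choice in regression with an intercept, because the full set of indicator columns always sums to 1.

```python
def one_hot_encode(values, drop_first=False):
    """Return (categories, rows) where each row holds 0/1 indicator values."""
    categories = sorted(set(values))
    if drop_first:
        # The dropped category becomes the baseline (all zeros)
        categories = categories[1:]
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return categories, rows


# Toy categorical column, assumed purely for illustration
colors = ["red", "green", "blue", "green"]

categories, rows = one_hot_encode(colors)
reduced_categories, reduced_rows = one_hot_encode(colors, drop_first=True)
print(categories, rows)
print(reduced_categories, reduced_rows)
```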

# 14. What is the role of interaction terms in Multiple Linear Regression?
Interaction terms (e.g., X1*X2) let the effect of one predictor on Y depend on the value of another, rather than assuming each predictor's effect is purely additive.
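
In practice, an interaction term is just an extra column holding the product of two predictors. The sketch below builds that column and fits by least squares with NumPy; the toy data are noise-free and generated from assumed coefficients (1, 2, 3, 0.5), so the fit should recover them.

```python
import numpy as np

# Toy predictors, assumed purely for illustration
x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
x2 = np.array([2, 1, 4, 3, 6, 5], dtype=float)

# Noise-free target with an assumed interaction effect of 0.5
y = 1 + 2 * x1 + 3 * x2 + 0.5 * x1 * x2

# Design matrix columns: intercept, X1, X2, and the interaction X1*X2
design = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
print(coefs)
```

With the interaction included, the effect of a one-unit increase in X1 is b1 + b3*X2, so it changes with X2.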

# 15. How can the interpretation of intercept differ between Simple and Multiple Linear Regression?
Simple: the predicted Y when X = 0.
Multiple: the predicted Y when all X variables = 0 simultaneously, a combination that may lie outside the observed data.

# 16. What is the significance of the slope in regression analysis?
It shows how much the predicted Y changes per 1-unit change in X; in multiple regression, this holds the other predictors constant.

# 17. How does the intercept in a regression model provide context?
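It gives the baseline predicted Y when all predictors = 0. This is meaningful only if zero is a plausible value for every predictor; otherwise the intercept mainly anchors the fitted line.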

# 18. What are the limitations of using R² as a sole measure of model performance?
- Does not imply causation
- Never decreases when predictors are added, even irrelevant ones
- Does not reveal bias, overfitting, or violated model assumptions

# 19. How would you interpret a large standard error for a regression coefficient?
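It means the coefficient is estimated imprecisely: its confidence interval is wide, and the estimate could change substantially with a different sample. Common causes are a small sample, noisy data, or multicollinearity.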
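
For simple linear regression, the standard error of the slope has a closed form: the square root of (SSE / (n - 2)) divided by Σ(Xi - X̄)². The sketch below computes it on toy data; the function name `slope_standard_error` and the data are assumptions for illustration.

```python
import math


def slope_standard_error(xs, ys):
    """Return (m, se_m) for the least-squares line y = m*x + c."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    c = y_bar - m * x_bar
    sse = sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))
    residual_variance = sse / (n - 2)  # two parameters (m and c) were estimated
    return m, math.sqrt(residual_variance / s_xx)


# Toy data, assumed purely for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, se_m = slope_standard_error(xs, ys)
print(m, se_m)
```

Comparing the two values gives a feel for precision: a slope that is only about twice its standard error, as here, is not pinned down very tightly.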

# 20. How can heteroscedasticity be identified in residual plots?
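Plot residuals against fitted values (or against a predictor). A random, evenly spread band suggests constant variance; a funnel or fan shape, where the spread grows or shrinks along the axis, suggests heteroscedasticity.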
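
A crude numeric companion to the visual check is to compare the residual variance in the lower and upper halves of the fitted values. The sketch below does this on synthetic data whose noise is deliberately made to grow with X; the data, the seed, and the name `spread_ratio` are assumptions, and this is not a formal statistical test.

```python
import numpy as np

# Synthetic data: noise standard deviation grows with x (assumed for illustration)
rng = np.random.default_rng(0)
x = np.linspace(1, 10, 200)
y = 2 + 3 * x + rng.normal(0, 0.5 * x)

# Fit a straight line and compute residuals
slope, intercept = np.polyfit(x, y, 1)
fitted = slope * x + intercept
residuals = y - fitted

# Split observations by fitted value and compare residual variance in each half
order = np.argsort(fitted)
half = len(order) // 2
low_var = residuals[order[:half]].var()
high_var = residuals[order[half:]].var()
spread_ratio = high_var / low_var
print(spread_ratio)
```

A ratio far from 1 hints at a funnel shape; a ratio near 1 is consistent with constant variance but does not prove it.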

# 21. What does it mean if a Multiple Linear Regression model has a high R² but low adjusted R²?
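It suggests the model includes predictors that add little explanatory power. R² never decreases when predictors are added, whereas adjusted R² penalizes each extra predictor, so a large gap points to irrelevant variables (or too many variables for the sample size).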
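
The penalty can be seen in the usual adjusted R² formula, 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The sketch below applies it to two hypothetical models; the R² values and sample size are made up for illustration.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


# Hypothetical models on 20 observations:
# adding 8 predictors raises R² only from 0.80 to 0.81
small_model = adjusted_r2(0.80, n_samples=20, n_predictors=2)
large_model = adjusted_r2(0.81, n_samples=20, n_predictors=10)
print(small_model, large_model)
```

In this made-up case the larger model has the higher R² but the lower adjusted R², which is the pattern the question describes.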

# 22. Why is it important to scale variables in Multiple Linear Regression?
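Scaling does not change the fit or predictions of plain OLS, but it makes coefficients comparable across predictors, improves numerical stability, and is essential for regularized methods (Ridge, Lasso), whose penalties would otherwise fall unevenly on predictors measured in different units.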
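
A common scaling choice is standardization: subtract each column's mean and divide by its standard deviation. The sketch below does this with NumPy on a toy matrix whose two columns have very different ranges; the values are assumed for illustration.

```python
import numpy as np

# Toy predictors on very different scales (assumed for illustration)
X = np.array(
    [[1000.0, 1.0], [2000.0, 2.0], [3000.0, 4.0], [4000.0, 3.0]]
)

# Per-column statistics; in practice these would come from the training data only
means = X.mean(axis=0)
stds = X.std(axis=0)

# After scaling, every column has mean 0 and standard deviation 1
X_scaled = (X - means) / stds
print(X_scaled)
```

The same `means` and `stds` would then be reused to transform any new data before prediction.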

# 23. What is polynomial regression?
A regression where the relationship is modeled as an nth-degree polynomial.
Equation:
Y = b0 + b1*X + b2*X² + ... + bn*Xⁿ
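
For a single predictor, one quick way to fit this equation is `np.polyfit`. The sketch below uses noise-free toy data generated from assumed coefficients b0 = 1, b1 = 2, b2 = 3. Note that `np.polyfit` returns coefficients from the highest power down, so they are reversed to match the b0, b1, b2 order.

```python
import numpy as np

# Noise-free toy data from y = 1 + 2x + 3x^2 (assumed for illustration)
x = np.array([-2, -1, 0, 1, 2, 3], dtype=float)
y = 1 + 2 * x + 3 * x**2

# polyfit returns [b2, b1, b0]; reverse to get [b0, b1, b2]
coefs_high_to_low = np.polyfit(x, y, 2)
b = coefs_high_to_low[::-1]
print(b)
```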

# 24. How does polynomial regression differ from linear regression?
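Polynomial regression models curved relationships by adding powers of X as predictors; linear regression with X alone models a straight line. The polynomial model is still linear in its coefficients, so it is fitted with the same least squares method.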

# 25. When is polynomial regression used?
When data shows a non-linear relationship between variables.

# 26. What is the general equation for polynomial regression?
Y = b0 + b1X + b2X² + ... + bnX^n

# 27. Can polynomial regression be applied to multiple variables?
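Yes, by including polynomial terms for each predictor and, optionally, cross terms such as X1*X2. The number of terms grows quickly with the degree and the number of predictors.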
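
For two predictors and degree 2, the expanded feature set can be written out by hand. The sketch below builds the columns X1, X2, X1², X1*X2, X2² with NumPy; the function name `quadratic_features` is hypothetical, and the input values are made up.

```python
import numpy as np


def quadratic_features(x1, x2):
    """Return degree-2 features for two predictors (without a bias column)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Column order: X1, X2, X1^2, X1*X2, X2^2
    return np.column_stack([x1, x2, x1**2, x1 * x2, x2**2])


# Toy inputs, assumed purely for illustration
features = quadratic_features([1, 2, 3], [4, 5, 6])
print(features)
```

These columns can then be passed to any linear regression routine. scikit-learn's `PolynomialFeatures` generates terms of this kind, including the cross term, and by default also adds a bias column.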

# 28. What are the limitations of polynomial regression?
- High degree polynomials may overfit
- Sensitive to outliers
- Poor at extrapolation

# 29. What methods can be used to evaluate model fit when selecting the degree of a polynomial?
- Cross-validation
- Adjusted R²
- AIC/BIC
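
To illustrate the first method, the sketch below runs a simple k-fold cross-validation over candidate degrees using `np.polyfit`. The helper name `cv_mse_for_degree`, the fold count, the seeds, and the synthetic quadratic data are all assumptions; it is meant as a starting point rather than a complete model selection procedure.

```python
import numpy as np


def cv_mse_for_degree(x, y, degree, n_folds=5, seed=0):
    """Mean squared error of a degree-`degree` polynomial, averaged over folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    errors = []
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Fit on the training folds only, then score on the held-out fold
        coefs = np.polyfit(x[train_idx], y[train_idx], degree)
        preds = np.polyval(coefs, x[test_idx])
        errors.append(np.mean((y[test_idx] - preds) ** 2))
    return float(np.mean(errors))


# Synthetic data: a quadratic trend plus noise (assumed for illustration)
rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 30)
y = x**2 + rng.normal(0, 0.5, size=x.size)

scores = {degree: cv_mse_for_degree(x, y, degree) for degree in range(1, 6)}
best_degree = min(scores, key=scores.get)
print(scores, best_degree)
```

On data like this, a straight line should score clearly worse than a quadratic, while higher degrees typically bring little or no improvement on held-out points.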

# 30. Why is visualization important in polynomial regression?
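Plotting the fitted curve against the data shows whether the polynomial follows the underlying trend or is oscillating to chase noise (overfitting), especially near the edges of the data.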

# 31. How is polynomial regression implemented in Python?
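from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Illustrative data (assumed for this example): y = (x + 1)^2
X = [[1], [2], [3], [4], [5]]   # one row per sample, one column per feature
y = [4, 9, 16, 25, 36]

# Expand X into the columns [1, x, x^2], then fit ordinary least squares on them
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)
model = LinearRegression().fit(X_poly, y)

# New inputs must go through the same transformer before predicting
print(model.predict(poly.transform([[6]])))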
