# Simple Linear Regression

# 1. What is Simple Linear Regression?
# - A statistical method that models the relationship between a dependent variable (Y)
#   and an independent variable (X) using the equation: Y = mX + c.

# 2. What are the key assumptions of Simple Linear Regression?
# - Linearity: Relationship between X and Y is linear.
# - Independence: Observations are independent.
# - Homoscedasticity: Constant variance of residuals.
# - Normality: Residuals should follow a normal distribution.
# - No Multicollinearity: Not applicable here, because the model has only one independent variable.

# 3. What does the coefficient m represent in Y = mX + c?
# - The slope m is the expected change in Y for a one-unit increase in X.

# 4. What does the intercept c represent in Y = mX + c?
# - The intercept c is the value of Y when X = 0.

# 5. How do we calculate the slope m?
# - Formula: m = (Σ(Xi - X̄)(Yi - Ȳ)) / (Σ(Xi - X̄)²)
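# - Sketch of the formula above on hypothetical toy data (not from the original notes),
#   chosen so that the fitted line is exactly Y = 2X + 1:

```python
import numpy as np

# Hypothetical toy data lying exactly on Y = 2X + 1
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

x_bar, y_bar = X.mean(), Y.mean()

# Slope: m = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)²
m = np.sum((X - x_bar) * (Y - y_bar)) / np.sum((X - x_bar) ** 2)

# Intercept: the least-squares line passes through (X̄, Ȳ)
c = y_bar - m * x_bar

print(m, c)  # 2.0 1.0
```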

# 6. Purpose of the least squares method:
# - It minimizes the sum of squared residuals to find the best-fit line.

# 7. Interpretation of R² in Simple Linear Regression:
# - R² is the proportion of the variance in Y explained by the regression line
#   (between 0 and 1 for a least-squares fit with an intercept; closer to 1 = better fit).
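# - Sketch: R² computed by hand as 1 - SS_res / SS_tot. The data are hypothetical and the
#   variable names are illustrative only.

```python
import numpy as np

# Hypothetical toy data with some scatter around a line
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Least-squares line; np.polyfit returns the highest-degree coefficient first
m, c = np.polyfit(X, Y, deg=1)
Y_hat = m * X + c

ss_res = np.sum((Y - Y_hat) ** 2)      # unexplained variation
ss_tot = np.sum((Y - Y.mean()) ** 2)   # total variation around the mean
r2 = 1 - ss_res / ss_tot

print(round(r2, 3))  # approximately 0.6
```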

# Multiple Linear Regression

# 8. What is Multiple Linear Regression?
# - A model that uses two or more independent variables to predict Y.

# 9. Difference between Simple and Multiple Linear Regression:
# - Simple: 1 independent variable.
# - Multiple: 2 or more independent variables.

# 10. Key assumptions of Multiple Linear Regression:
# - Linearity, Independence, Homoscedasticity, Normality, No Multicollinearity.

# 11. What is heteroscedasticity, and how does it affect the model?
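# - It refers to non-constant variance of the residuals. The OLS coefficient estimates stay
#   unbiased but become inefficient, and the usual standard errors are biased, so t-tests,
#   p-values, and confidence intervals are unreliable.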

# 12. How to handle multicollinearity?
# - Remove correlated predictors, use PCA, or apply Ridge/Lasso Regression.
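# - Sketch: one common way to find the correlated predictors is the variance inflation
#   factor, VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing predictor j on the others.
#   VIF is not mentioned in the notes above; the helper name `vif` and the simulated data are
#   assumptions for illustration. Values well above roughly 5 to 10 are often treated as a warning.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Simulated predictors: x2 is nearly a copy of x1, x3 is unrelated
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """Variance inflation factor of column j (hypothetical helper)."""
    others = np.delete(X, j, axis=1)
    r2_j = LinearRegression().fit(others, X[:, j]).score(others, X[:, j])
    return 1.0 / (1.0 - r2_j)

vifs = [vif(X, j) for j in range(X.shape[1])]
print(vifs)  # x1 and x2 get very large values, x3 stays close to 1
```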

# 13. Transforming categorical variables for regression:
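# - One-hot encoding or dummy variables (one-hot with one category dropped as the baseline)
#   for unordered categories; label (integer) encoding only when the categories have a
#   natural order, since it imposes one.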
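# - Sketch: one-hot and dummy encoding written out with NumPy on a hypothetical `colors`
#   column. In practice pandas.get_dummies or sklearn's OneHotEncoder do the same job.

```python
import numpy as np

# Hypothetical categorical feature
colors = np.array(["red", "green", "blue", "green"])

# np.unique returns the sorted categories: blue, green, red
categories = np.unique(colors)

# One-hot: one 0/1 column per category
one_hot = (colors[:, None] == categories[None, :]).astype(int)

# Dummy variables: drop the first column ("blue") so it becomes the baseline
# and the columns are no longer perfectly collinear with the intercept
dummies = one_hot[:, 1:]

print(one_hot)
print(dummies)
```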

# 14. Role of interaction terms:
# - Captures the combined effects of independent variables.
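# - Sketch: an interaction term is just the product of two predictors added as an extra
#   column. The data are simulated without noise from an assumed model
#   y = 1 + 2*x1 + 3*x2 + 0.5*x1*x2, so the fit should recover those coefficients.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
x1 = rng.uniform(0, 5, size=100)
x2 = rng.uniform(0, 5, size=100)

# Assumed data-generating model (noise-free, for illustration only)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 0.5 * x1 * x2

# Design matrix with the interaction column x1 * x2
X = np.column_stack([x1, x2, x1 * x2])
model = LinearRegression().fit(X, y)

# The third coefficient says how much the effect of x1 on y changes per unit of x2
print(model.intercept_, model.coef_)
```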

# 15. Interpretation of intercept in Simple vs. Multiple Linear Regression:
# - Simple: Intercept represents Y when X = 0.
# - Multiple: Intercept represents Y when all independent variables = 0.

# 16. Significance of slope in regression analysis:
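# - Quantifies the expected change in Y per one-unit increase in an independent variable
#   (holding the others constant in multiple regression); its sign gives the direction of the effect.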

# 17. How does the intercept provide context in regression?
# - Represents the baseline value of Y when all X values are zero. It is only meaningful
#   when X = 0 is a plausible value within the range of the data.

# 18. Limitations of R² as a model performance measure:
# - R² never decreases when predictors are added, so it cannot flag overfitting; a high R²
#   does not show that the model is correctly specified.
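# - Sketch: adding pure-noise predictors to a simulated dataset. Training R² does not go
#   down, even though the extra columns carry no information. All names and data are
#   illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 40
x = rng.normal(size=(n, 1))
y = 3.0 * x[:, 0] + rng.normal(size=n)

# Model with the one real predictor
r2_base = LinearRegression().fit(x, y).score(x, y)

# Same model plus 10 columns of random noise
X_big = np.hstack([x, rng.normal(size=(n, 10))])
r2_big = LinearRegression().fit(X_big, y).score(X_big, y)

print(r2_base, r2_big)  # r2_big is at least as large as r2_base
```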

# 19. Interpretation of a large standard error for a regression coefficient:
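# - Indicates that the coefficient is estimated imprecisely: its confidence interval is
#   wide, and the effect may not be statistically distinguishable from zero.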
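# - Sketch: the standard error of the slope in simple linear regression,
#   SE(m) = sqrt(s² / Σ(Xi - X̄)²) with s² = SS_res / (n - 2), on hypothetical toy data.

```python
import numpy as np

# Hypothetical toy data
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(X)

sxx = np.sum((X - X.mean()) ** 2)
m = np.sum((X - X.mean()) * (Y - Y.mean())) / sxx
c = Y.mean() - m * X.mean()

# Residual variance with n - 2 degrees of freedom (two estimated parameters)
residuals = Y - (m * X + c)
s2 = np.sum(residuals ** 2) / (n - 2)

se_m = np.sqrt(s2 / sxx)
print(m, se_m)  # slope about 0.6, standard error about 0.283
```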

# 20. How to identify heteroscedasticity in residual plots?
# - In a plot of residuals against fitted values, a funnel shape (spread that widens or
#   narrows as the fitted values increase) indicates heteroscedasticity.
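# - Sketch: a numeric stand-in for the visual check. The data are simulated so that the
#   noise grows with x (an assumption for illustration); the residual spread in the upper
#   half of x then exceeds the spread in the lower half, which is what a funnel looks like.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(1, 10, 200)

# Simulated response whose noise standard deviation is proportional to x
y = 2.0 * x + rng.normal(scale=0.5 * x)

# Fit a straight line and compute residuals
m, c = np.polyfit(x, y, deg=1)
residuals = y - (m * x + c)

# Compare residual spread below and above the median of x
low = residuals[x <= np.median(x)]
high = residuals[x > np.median(x)]
print(low.std(), high.std())
```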

# 21. Meaning of high R² but low adjusted R² in Multiple Linear Regression:
# - Suggests that additional predictors are not contributing meaningful information.
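# - Sketch: adjusted R² = 1 - (1 - R²)(n - 1) / (n - p - 1), where n is the number of
#   observations and p the number of predictors. The function name and the example numbers
#   are hypothetical.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R² for n observations and p predictors (hypothetical helper)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Same R² of 0.90 on 30 observations, with few vs. many predictors
few = adjusted_r2(0.90, n=30, p=2)
many = adjusted_r2(0.90, n=30, p=20)

print(round(few, 3), round(many, 3))  # 0.893 0.678
```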

# 22. Importance of scaling variables in Multiple Linear Regression:
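# - Plain OLS predictions do not change when features are rescaled. Scaling matters for
#   regularized models (Ridge/Lasso) and gradient-based solvers, where large-valued features
#   would otherwise dominate, and it makes coefficient magnitudes comparable.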
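# - Sketch: standardizing two simulated features on very different scales. The feature
#   names `income` and `age` are hypothetical. For ordinary least squares the predictions
#   are the same before and after scaling; only the coefficients change units.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
income = rng.uniform(20_000, 100_000, size=50)  # large-valued feature
age = rng.uniform(20, 60, size=50)              # small-valued feature
X = np.column_stack([income, age])
y = 0.0001 * income + 0.5 * age + rng.normal(scale=0.1, size=50)

# Standardize each column to mean 0 and standard deviation 1
X_scaled = StandardScaler().fit_transform(X)

raw = LinearRegression().fit(X, y)
scaled = LinearRegression().fit(X_scaled, y)

# Coefficients differ by orders of magnitude on the raw scale
print(raw.coef_, scaled.coef_)
same_predictions = np.allclose(raw.predict(X), scaled.predict(X_scaled))
print(same_predictions)
```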

# Polynomial Regression

# 23. What is polynomial regression?
# - A regression model that fits a polynomial equation instead of a straight line. It is
#   still linear in the coefficients, so it can be fit with ordinary least squares.

# 24. Difference between Linear & Polynomial Regression:
# - Linear Regression fits a straight line; Polynomial Regression fits a curved trendline.

# 25. When is polynomial regression used?
# - When the relationship between X and Y is non-linear.

# 26. General equation for polynomial regression:
# - Y = b0 + b1X + b2X² + ... + bnXⁿ

# 27. Can polynomial regression be applied to multiple variables?
# - Yes, it can be extended to Multiple Polynomial Regression.
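# - Sketch: with two input variables a and b, a degree-2 expansion adds the squares and the
#   cross term. The single-row input is a hypothetical example.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features: a = 2, b = 3
X = np.array([[2.0, 3.0]])

poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Columns are: 1, a, b, a², a*b, b²
print(X_poly)  # [[1. 2. 3. 4. 6. 9.]]
```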

# 28. Limitations of polynomial regression:
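# - Overfitting when the degree is too high, unreliable extrapolation outside the data range,
#   coefficients that are hard to interpret, and a feature count that grows quickly with the
#   degree and the number of variables.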

# 29. Methods to evaluate polynomial regression model fit:
# - R², Adjusted R², Mean Squared Error (MSE), Cross-validation.
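# - Sketch: comparing polynomial degrees by cross-validated MSE. The data are simulated from
#   an assumed quadratic with noise; the degrees tried and all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Simulated data: quadratic trend plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = 1.0 + X[:, 0] ** 2 + rng.normal(scale=0.5, size=60)

cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_mse = {}
for degree in (1, 2, 10):
    pipe = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    # scikit-learn reports negated MSE, so flip the sign back
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="neg_mean_squared_error")
    cv_mse[degree] = -scores.mean()

# Lower is better; the straight line (degree 1) underfits the quadratic trend
print(cv_mse)
```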

# 30. Why is visualization important in polynomial regression?
# - Plotting the fitted curve over the data helps detect patterns, check model fit, and
#   spot overfitting (for example, a curve that wiggles between the points).

# 31. Python Implementation of Polynomial Regression


from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
import numpy as np

# Example dataset
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 5, 10, 17, 26])

# Convert to polynomial features
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Train the model
model = LinearRegression()
model.fit(X_poly, y)

# Predict on the training inputs; the toy data follow y = x² + 1 exactly,
# so a degree-2 fit reproduces y
y_pred = model.predict(X_poly)
print(y_pred)

[ 2.  5. 10. 17. 26.]
