# 📌 Simple Linear Regression

# 1. What is Simple Linear Regression?
# Simple Linear Regression models the relationship between a dependent variable (Y)
# and an independent variable (X) using the equation: Y = mX + c.

# 2. What are the key assumptions of Simple Linear Regression?
# - Linearity: The relationship between X and Y is linear.
# - Independence: Observations are independent.
# - Homoscedasticity: Constant variance of residuals.
# - Normality: Residuals should follow a normal distribution.
# - No Multicollinearity: Not applicable here, because there is only one independent variable.

# 3. What does the coefficient m represent in the equation Y = mX + c?
# The coefficient m represents the slope, indicating how much Y changes when X increases by 1 unit.

# 4. What does the intercept c represent in the equation Y = mX + c?
# The intercept c is the predicted value of Y when X = 0.

# 5. How do we calculate the slope m in Simple Linear Regression?
# Formula: m = (Σ(Xi - X̄)(Yi - Ȳ)) / (Σ(Xi - X̄)²)
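# A minimal sketch of the slope formula with NumPy. The data values are made up for illustration,
# and the intercept line uses the standard least-squares result c = Ȳ - m·X̄, which is not stated above.

```python
import numpy as np

# Illustrative data only: the points lie exactly on y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

x_mean, y_mean = x.mean(), y.mean()

# Slope: sum of cross-deviations divided by the sum of squared X deviations
m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)

# Intercept: the least-squares line passes through (x_mean, y_mean)
c = y_mean - m * x_mean

print(m, c)
```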

# 6. What is the purpose of the least squares method in Simple Linear Regression?
# It minimizes the sum of squared residuals to find the best-fit regression line.

# 7. How is the coefficient of determination (R²) interpreted?
# R² is the proportion of the variance in Y that the regression line explains. A value close to 1 indicates a close fit to the data used for fitting.
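# R² can be computed as 1 - SS_res / SS_tot. The sketch below assumes that definition;
# the helper name r_squared and the sample values are hypothetical.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Return R² = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total variation around the mean
    return 1.0 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]
perfect = r_squared(y_true, y_true)        # predictions match exactly
baseline = r_squared(y_true, [6.0] * 4)    # always predicting the mean of y_true
print(perfect, baseline)
```

# A model that always predicts the mean scores 0, which is why R² is read as improvement over that baseline.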

# 📌 Multiple Linear Regression

# 8. What is Multiple Linear Regression?
# A regression model that includes two or more independent variables to predict Y.

# 9. What is the main difference between Simple and Multiple Linear Regression?
# - Simple Linear Regression: 1 independent variable.
# - Multiple Linear Regression: 2 or more independent variables.

# 10. What are the key assumptions of Multiple Linear Regression?
# - Linearity
# - Independence
# - Homoscedasticity
# - Normality of residuals
# - No Multicollinearity

# 11. What is heteroscedasticity, and how does it affect the model?
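# Heteroscedasticity means the variance of the residuals is not constant. The coefficient estimates
# remain unbiased, but their standard errors, confidence intervals, and p-values become unreliable.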

# 12. How can you improve a Multiple Linear Regression model with high multicollinearity?
# - Remove correlated predictors.
# - Use Principal Component Analysis (PCA).
# - Apply Ridge or Lasso Regression.
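# Before choosing a remedy, it helps to measure how strongly each predictor is explained by the others.
# The sketch below uses the variance inflation factor (VIF), a common diagnostic that is an addition
# to the list above. The helper name vif and the synthetic data are assumptions for illustration.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X, shape (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # nearly a copy of x1
x3 = rng.normal(size=200)                   # unrelated to the others
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

# Large values for x1 and x2 flag the correlated pair; dropping one of them is the first remedy listed above.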

# 13. What are common techniques for transforming categorical variables?
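# - One-hot encoding: one 0/1 column per category.
# - Label encoding: one integer per category; this implies an order, so it suits ordinal categories.
# - Dummy variables: one-hot columns with one category dropped as the baseline, which avoids perfect collinearity with the intercept.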
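# The three encodings can be sketched with NumPy alone. The color feature is hypothetical;
# in practice a library encoder would usually be used instead.

```python
import numpy as np

colors = np.array(["red", "green", "blue", "green"])  # hypothetical categorical feature

# Label encoding: each category becomes an integer code (sorted order: blue, green, red)
categories, labels = np.unique(colors, return_inverse=True)

# One-hot encoding: one 0/1 column per category
one_hot = np.eye(len(categories), dtype=int)[labels]

# Dummy variables: drop the first column ("blue") so it acts as the baseline category
dummies = one_hot[:, 1:]

print(labels)
print(one_hot)
print(dummies)
```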

# 14. What is the role of interaction terms in Multiple Linear Regression?
# An interaction term (for example X1·X2) lets the effect of one independent variable on Y depend on the value of another.
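# A small sketch of fitting an interaction term by least squares. The data are synthetic and noise-free,
# and the true coefficients (1, 2, 3, 0.5) are assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(0, 5, size=100)
x2 = rng.uniform(0, 5, size=100)

# Synthetic target with an interaction coefficient of 0.5
y = 1.0 + 2.0 * x1 + 3.0 * x2 + 0.5 * x1 * x2

# Design matrix: intercept, x1, x2, and the product x1 * x2 as the interaction term
A = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

print(coef)
```

# With the interaction included, the effect of x1 on y is 2 + 0.5·x2 rather than a single constant.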

# 15. How can the interpretation of intercept differ in Simple vs. Multiple Linear Regression?
# - Simple: Represents the predicted Y when X = 0.
# - Multiple: Represents the predicted Y when all independent variables = 0.

# 16. What is the significance of the slope in regression analysis?
# It gives the expected change in the dependent variable per unit increase in an independent variable,
# holding the other independent variables constant.

# 17. How does the intercept provide context for the relationship between variables?
# It provides a baseline value of Y when all independent variables are zero.

# 18. What are the limitations of using R² as a sole measure of model performance?
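# - It is computed on the fitted data, so it does not indicate whether the model is overfitting.
# - A high R² does not always mean a good model; the regression assumptions may still be violated.
# - It never decreases when predictors are added, so it does not account for the number of predictors.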

# 19. How would you interpret a large standard error for a regression coefficient?
# It indicates a high level of uncertainty in the coefficient estimate.

# 20. How can heteroscedasticity be identified in residual plots?
# If residuals show a funnel shape, heteroscedasticity is present.
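# The funnel shape can also be checked numerically by comparing residual spread at low and high fitted values.
# This is a rough sketch on synthetic data whose noise grows with x by construction, not a formal test.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(1, 10, 500)

# Synthetic data: noise standard deviation is proportional to x
y = 2.0 * x + 1.0 + rng.normal(scale=0.5 * x)

# Fit a straight line (np.polyfit returns the highest-degree coefficient first)
m, c = np.polyfit(x, y, deg=1)
fitted = m * x + c
residuals = y - fitted

# Compare residual spread in the lower and upper halves of the fitted values
order = np.argsort(fitted)
low_spread = residuals[order[:250]].std()
high_spread = residuals[order[250:]].std()

print(low_spread, high_spread)
```

# A clearly larger spread in one half matches the funnel pattern seen in a residuals-vs-fitted plot.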

# 21. What does it mean if a Multiple Linear Regression model has a high R² but low adjusted R²?
# It suggests that some of the added variables contribute little explanatory power relative to the complexity they add.
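# Adjusted R² is commonly defined as 1 - (1 - R²)(n - 1)/(n - p - 1), with n samples and p predictors.
# The sketch below assumes that definition; the function name and the example numbers are hypothetical.

```python
def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Same R² and sample size, different numbers of predictors
few = adjusted_r_squared(0.90, n_samples=30, n_predictors=2)
many = adjusted_r_squared(0.90, n_samples=30, n_predictors=20)

print(round(few, 4), round(many, 4))
```

# The penalty grows with the number of predictors, so the gap between R² and adjusted R² widens as variables are added.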

# 22. Why is it important to scale variables in Multiple Linear Regression?
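# Scaling does not change the fit or predictions of ordinary least squares. It matters because it makes
# coefficient magnitudes comparable across features, lets Ridge and Lasso penalties treat features evenly,
# and improves numerical stability.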
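# One way to see what scaling changes under ordinary least squares: fit the same synthetic data with raw
# and standardized features. The data and the helper fit_predict are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
# Two features on very different scales
X = np.column_stack([rng.normal(50, 10, 100), rng.normal(0.5, 0.1, 100)])
y = 3.0 * X[:, 0] + 200.0 * X[:, 1] + rng.normal(size=100)

def fit_predict(features, target):
    """Least-squares fit with an intercept; returns coefficients and fitted values."""
    A = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coef, A @ coef

# Standardize each column to zero mean and unit variance
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

coef_raw, pred_raw = fit_predict(X, y)
coef_std, pred_std = fit_predict(X_std, y)

same_predictions = np.allclose(pred_raw, pred_std)
print(same_predictions)
print(coef_raw[1:], coef_std[1:])
```

# The fitted values agree, while the standardized coefficients are expressed per standard deviation and can be compared directly.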

# 📌 Polynomial Regression

# 23. What is polynomial regression?
# A regression model that fits a polynomial equation instead of a straight line.

# 24. How does polynomial regression differ from linear regression?
# - Linear Regression fits a straight line.
# - Polynomial Regression fits a curve, although the model is still linear in its coefficients.

# 25. When is polynomial regression used?
# When the relationship between X and Y is non-linear.

# 26. What is the general equation for polynomial regression?
# Y = b0 + b1X + b2X² + ... + bnXⁿ

# 27. Can polynomial regression be applied to multiple variables?
# Yes, it can be extended to Multiple Polynomial Regression.
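# With several input variables, the polynomial expansion includes cross terms as well as powers.
# A short sketch using scikit-learn's PolynomialFeatures on a single made-up row with two features:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features: x0 = 2, x1 = 3
X = np.array([[2.0, 3.0]])

# Degree-2 expansion yields the columns 1, x0, x1, x0², x0·x1, x1²
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

print(X_poly)
```

# The expanded matrix can then be passed to LinearRegression in the same way as a single-variable expansion.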

# 28. What are the limitations of polynomial regression?
# - Overfitting when the degree is too high.
# - Difficult interpretation of coefficients.
# - Computationally expensive at high degrees, because the number of terms grows quickly, especially with several features.

# 29. What methods can be used to evaluate model fit for polynomial regression?
# - R² and Adjusted R²
# - Mean Squared Error (MSE)
# - Cross-validation
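# Cross-validated MSE can be used to compare polynomial degrees. The sketch below is a hand-rolled
# 5-fold split on synthetic quadratic data; the helper cv_mse and the candidate degrees are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-3, 3, 60)
y = x ** 2 + 1.0 + rng.normal(scale=0.5, size=x.size)  # synthetic quadratic data

# Shuffle once so every degree is scored on the same folds
indices = rng.permutation(x.size)
folds = np.array_split(indices, 5)

def cv_mse(degree):
    """Mean squared error on held-out folds for a polynomial of the given degree."""
    errors = []
    for fold in folds:
        train = np.setdiff1d(indices, fold)           # all indices not in this fold
        coefs = np.polyfit(x[train], y[train], deg=degree)
        pred = np.polyval(coefs, x[fold])
        errors.append(np.mean((y[fold] - pred) ** 2))
    return float(np.mean(errors))

scores = {degree: cv_mse(degree) for degree in (1, 2, 6)}
print(scores)
```

# A straight line underfits this data, so its held-out error is far higher than the degree-2 fit;
# higher degrees typically bring little or no further improvement.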

# 30. Why is visualization important in polynomial regression?
# It helps identify patterns, check model fit, and spot overfitting.

# 31. How is polynomial regression implemented in Python?

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
import numpy as np

# Example dataset
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 5, 10, 17, 26])

# Expand X into degree-2 polynomial features: columns 1, x, x²
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# Train the model
model = LinearRegression()
model.fit(X_poly, y)

# Predict
y_pred = model.predict(X_poly)
print(y_pred)

[ 2.  5. 10. 17. 26.]
