1. Simple Linear Regression is a statistical method used to model the relationship between a dependent variable (Y) and an independent variable (X) using a straight-line equation:

$$Y = mX + c$$

where $m$ is the slope and $c$ is the intercept.

2. Linearity: The relationship between X and Y is linear.

Independence: Observations are independent of each other.

Homoscedasticity: The variance of residuals is constant across all levels of X.

Normality: Residuals (errors) are normally distributed.

3. The coefficient $m$ represents the slope of the regression line, indicating the change in Y for a one-unit increase in X.

4. The intercept $c$ represents the predicted value of Y when X is zero. It essentially defines where the regression line crosses the Y-axis.

5. The slope $m$ is calculated using the formula:

$$m = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$$

where $X_i, Y_i$ are the individual data points and $\bar{X}, \bar{Y}$ are the mean values of X and Y.
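
As a rough illustration, the slope formula can be evaluated directly with NumPy. The data values below are made up for the example, and the intercept line uses the standard fact that the fitted line passes through the point of means.

```python
import numpy as np

# Hypothetical sample data, chosen to lie close to Y = 2X + 1.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

x_mean, y_mean = X.mean(), Y.mean()

# Slope: sum of cross-deviations divided by the sum of squared X deviations.
m = np.sum((X - x_mean) * (Y - y_mean)) / np.sum((X - x_mean) ** 2)

# Intercept: the fitted line passes through (x_mean, y_mean).
c = y_mean - m * x_mean

print(f"slope m = {m:.3f}, intercept c = {c:.3f}")
```

The same values should come back from `np.polyfit(X, Y, 1)`, which is a convenient cross-check.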

6. The least squares method minimizes the sum of squared residuals (differences between actual and predicted values) to find the best-fitting regression line.
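
One way to see what "minimizes" means here is to compare the fitted line against slightly shifted lines. This is only a sketch on made-up data; the helper name `sse` and the perturbation sizes are arbitrary choices for the example.

```python
import numpy as np

# Hypothetical sample data.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

def sse(m, c):
    """Sum of squared residuals for the line Y = m*X + c."""
    residuals = Y - (m * X + c)
    return float(np.sum(residuals ** 2))

# Least squares fit; np.polyfit returns coefficients from highest degree down.
m_hat, c_hat = np.polyfit(X, Y, deg=1)
best = sse(m_hat, c_hat)

# Every nearby line should give a larger sum of squared residuals.
perturbed = [sse(m_hat + dm, c_hat + dc)
             for dm in (-0.1, 0.1) for dc in (-0.2, 0.2)]

print("fitted line SSE:", best)
print("smallest SSE among shifted lines:", min(perturbed))
```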

7. R² measures the proportion of variance in Y explained by X. It ranges from 0 to 1:

R² = 1: Perfect fit, X explains all variation in Y.

R² = 0: The model explains none of the variation in Y; there is no linear relationship between X and Y.
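
A minimal sketch of how R² can be computed from residuals, assuming the usual definition of one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the data are illustrative only.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical sample data and a straight-line fit.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])
m, c = np.polyfit(X, Y, 1)

r2 = r_squared(Y, m * X + c)
print(f"R^2 of the fitted line: {r2:.4f}")
```

Predicting the mean of Y for every point gives R² = 0, and predicting every point exactly gives R² = 1, which matches the two boundary cases listed above.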

8. Multiple Linear Regression extends Simple Linear Regression to model relationships between multiple independent variables and a dependent variable:

$$Y = m_1 X_1 + m_2 X_2 + \cdots + m_n X_n + c$$

9. Simple Linear Regression: Involves only one independent variable.

Multiple Linear Regression: Involves two or more independent variables.
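
To make the difference concrete, here is a small sketch of a two-predictor fit using `np.linalg.lstsq`. The data are generated exactly from Y = 2·X1 − X2 + 5, which is an assumption made purely so the recovered coefficients are easy to check.

```python
import numpy as np

# Hypothetical data generated exactly from Y = 2*X1 - 1*X2 + 5.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y = 2.0 * X1 - 1.0 * X2 + 5.0

# Design matrix: one column per predictor plus a column of ones for c.
A = np.column_stack([X1, X2, np.ones_like(X1)])

# Least squares solution for [m1, m2, c].
coeffs, *_ = np.linalg.lstsq(A, Y, rcond=None)
m1, m2, c = coeffs

print(f"m1 = {m1:.3f}, m2 = {m2:.3f}, c = {c:.3f}")
```

Dropping the `X2` column from the design matrix turns the same code into a simple linear regression.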

10. Same assumptions as Simple Linear Regression (linearity, independence, homoscedasticity, normality).

No multicollinearity: Independent variables should not be highly correlated.

No autocorrelation: Errors should not be correlated.

11. Heteroscedasticity occurs when the variance of residuals is not constant across different values of the independent variables. This can lead to inefficient estimates, invalid significance tests, and unreliable confidence intervals.

12. Remove highly correlated predictors if they don’t add value.

Use Principal Component Analysis (PCA) to reduce dimensionality.

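Center or standardize variables, which mainly helps when the correlation comes from polynomial or interaction terms.

Use Ridge or Lasso regression, which add a penalty that stabilizes coefficient estimates when predictors are correlated.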
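
The effect of a ridge penalty on correlated predictors can be sketched with the closed-form solution in NumPy. The data, the penalty value `alpha=1.0`, and the helper name `ridge_fit` are all assumptions for illustration; a library implementation such as Scikit-learn's `Ridge` would normally be used instead.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)   # nearly a copy of x1
y = 3.0 * x1 + rng.normal(size=n)

# Center the data so no intercept column is needed.
X = np.column_stack([x1, x2])
X = X - X.mean(axis=0)
y_c = y - y.mean()

def ridge_fit(X, y, alpha):
    """Closed-form ridge solution: (X'X + alpha*I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

ols = ridge_fit(X, y_c, alpha=0.0)     # alpha = 0 is ordinary least squares
ridge = ridge_fit(X, y_c, alpha=1.0)

print("correlation between predictors:", np.corrcoef(x1, x2)[0, 1])
print("OLS coefficients:  ", ols)
print("Ridge coefficients:", ridge)
```

With near-duplicate predictors the ordinary least squares coefficients can swing far apart, while the penalized ones stay closer together; in both cases their sum should stay near the true combined effect of 3.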

13. One-hot encoding: Converts categorical values into binary variables.

Label encoding: Assigns numeric values to categories.

Dummy variables: Creates a separate binary variable for each category, usually dropping one category as the baseline to avoid perfect multicollinearity.

Target encoding: Maps categories based on the mean of the target variable.
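
The encodings above can be sketched in plain Python. The `colors` column and `target` values are invented for the example; in practice a library such as pandas or Scikit-learn would usually do this.

```python
# Hypothetical categorical column and target values.
colors = ["red", "green", "blue", "green", "red"]
target = [10.0, 20.0, 60.0, 40.0, 30.0]

categories = sorted(set(colors))  # ['blue', 'green', 'red']

# Label encoding: one integer per category (implies an order that may not exist).
label_map = {cat: i for i, cat in enumerate(categories)}
label_encoded = [label_map[c] for c in colors]

# One-hot encoding: one binary column per category.
one_hot = [[int(c == cat) for cat in categories] for c in colors]

# Dummy variables: the same columns with the first category dropped as baseline.
dummies = [row[1:] for row in one_hot]

# Target encoding: replace each category with the mean target for that category.
sums, counts = {}, {}
for c, t in zip(colors, target):
    sums[c] = sums.get(c, 0.0) + t
    counts[c] = counts.get(c, 0) + 1
target_map = {cat: sums[cat] / counts[cat] for cat in categories}
target_encoded = [target_map[c] for c in colors]

print(label_encoded)
print(one_hot)
print(dummies)
print(target_encoded)
```

Target encoding uses the target itself, so in a real project the category means would need to be computed on training data only to avoid leakage.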

14. Interaction terms help capture the combined effect of two independent variables on the dependent variable. They are useful when the impact of one predictor depends on the value of another.
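
A small sketch of an interaction term, assuming data generated exactly from Y = 1 + 2·X1 + 3·X2 + 0.5·X1·X2 so that the fit is easy to verify. The helper `slope_of_x1` is a hypothetical name used only to show how the effect of X1 changes with X2.

```python
import numpy as np

# Hypothetical data on a 3x3 grid of predictor values.
g1, g2 = np.meshgrid([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
X1, X2 = g1.ravel(), g2.ravel()
Y = 1.0 + 2.0 * X1 + 3.0 * X2 + 0.5 * X1 * X2

# The interaction term is simply the product of the two predictors.
A = np.column_stack([X1, X2, X1 * X2, np.ones_like(X1)])
coeffs, *_ = np.linalg.lstsq(A, Y, rcond=None)
m1, m2, m3, c = coeffs

def slope_of_x1(x2_value):
    """Change in Y per unit of X1, which depends on the value of X2."""
    return m1 + m3 * x2_value

print("effect of X1 when X2 = 0:", slope_of_x1(0.0))
print("effect of X1 when X2 = 2:", slope_of_x1(2.0))
```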

15. Simple Linear Regression: The intercept represents the expected value of Y when X is zero.

Multiple Linear Regression: The intercept represents the expected value of Y when all independent variables are zero, which may not always be meaningful.

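16. The slope indicates the rate of change in Y for a one-unit change in the predictor variable. A larger absolute slope means Y changes more steeply per unit of X; it does not by itself indicate a stronger relationship, because the slope depends on the units in which X and Y are measured.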

17. R² does not indicate whether individual predictors are meaningful.

It does not account for overfitting: a high R² does not guarantee good generalization.

It cannot show whether the model is the best among alternatives.

Adjusted R² is preferred when comparing models with different numbers of predictors.

18. A large standard error means the coefficient estimate varies widely from sample to sample, so the estimated effect of that predictor on Y is imprecise and may not be distinguishable from zero.

19. Polynomial regression is a variation of linear regression where the relationship between the independent and dependent variables is modeled as a polynomial function, rather than a straight line.

20. Polynomial regression is used when the relationship between X and Y is nonlinear, and a curved trend fits the data better than a straight line.

21. The intercept represents the predicted value of the dependent variable when all independent variables are zero. It helps establish the starting point of the regression equation and can sometimes have a meaningful interpretation, depending on the context.

22. Heteroscedasticity can be identified by plotting residuals against fitted values. If the spread of residuals increases or forms a cone-shaped pattern, it suggests heteroscedasticity. Addressing it is important because it affects the reliability of standard errors, leading to invalid statistical conclusions.
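
Alongside the residual plot, a rough numeric check is to compare the residual variance for low and high fitted values. This is an informal sketch on simulated data whose noise grows with X by construction; it is not a formal test such as Breusch-Pagan.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.linspace(1, 10, 200)
# Simulated data: the noise standard deviation grows with x.
y = 2.0 * x + 1.0 + rng.normal(scale=0.5 * x)

m, c = np.polyfit(x, y, 1)
fitted = m * x + c
resid = y - fitted

# Split residuals at the median fitted value and compare their spread.
order = np.argsort(fitted)
half = len(order) // 2
low, high = resid[order[:half]], resid[order[half:]]
ratio = high.var() / low.var()

print(f"variance ratio (high fitted / low fitted): {ratio:.2f}")
```

A ratio far from 1 points to the cone-shaped spread described above; a ratio near 1 is what constant variance would look like.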

23. A high R² but low adjusted R² suggests that the model is overfitting, meaning it includes too many predictors that don’t significantly contribute to explaining the dependent variable. Adjusted R² penalizes excessive predictors and helps assess the true predictive power of the model.
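
The penalty can be seen in the standard adjusted R² formula, sketched below. The function name and the example numbers (R² = 0.90 with 20 observations) are illustrative assumptions.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Same R^2 and sample size, different numbers of predictors.
few = adjusted_r2(0.90, n_samples=20, n_predictors=1)
many = adjusted_r2(0.90, n_samples=20, n_predictors=10)

print(f"1 predictor:   {few:.4f}")
print(f"10 predictors: {many:.4f}")
```

With the same R², the model that uses ten predictors receives a noticeably lower adjusted value than the one that uses a single predictor.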

24. Scaling ensures that all variables have comparable units and prevents certain predictors from dominating the model simply due to larger numerical values. Standardization (subtracting the mean and dividing by the standard deviation) or normalization (rescaling between 0 and 1) improves numerical stability and interpretation.
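
Both kinds of scaling take a line each in NumPy. The `income` values are made up for the example.

```python
import numpy as np

# Hypothetical predictor measured in large units.
income = np.array([30_000.0, 45_000.0, 60_000.0, 90_000.0])

# Standardization: subtract the mean, divide by the standard deviation.
standardized = (income - income.mean()) / income.std()

# Normalization: rescale to the range [0, 1].
normalized = (income - income.min()) / (income.max() - income.min())

print(standardized)
print(normalized)
```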

25. Linear regression assumes a straight-line relationship between X and Y.

Polynomial regression fits a curved relationship by including higher-degree terms such as $X^2$, $X^3$, etc.

26. The general equation for a degree-$n$ polynomial regression model is:

$$Y = m_1 X + m_2 X^2 + \cdots + m_n X^n + c$$

where $n$ is the degree of the polynomial, $m_1, \dots, m_n$ are the coefficients, and $c$ is the intercept.
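
As a quick sketch, a degree-2 model can be fitted with `np.polyfit`. The data are generated exactly from Y = 3X² + 2X + 1, an assumption made so the recovered coefficients are easy to read.

```python
import numpy as np

# Hypothetical data generated exactly from Y = 3*X^2 + 2*X + 1.
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
Y = 3.0 * X**2 + 2.0 * X + 1.0

# np.polyfit returns coefficients from the highest degree down: [m2, m1, c].
coeffs = np.polyfit(X, Y, deg=2)
m2, m1, c = coeffs

print(f"m2 = {m2:.3f}, m1 = {m1:.3f}, c = {c:.3f}")

# Predict at a new point with the fitted polynomial.
print("prediction at X = 6:", np.polyval(coeffs, 6.0))
```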

27. Yes! Polynomial regression can be extended to multiple variables by including polynomial terms for each predictor. Example:

$$Y = m_1 X_1 + m_2 X_1^2 + m_3 X_2 + m_4 X_2^2 + c$$

However, too many polynomial terms can lead to overfitting, so careful selection is necessary.

28. Overfitting: Higher-degree polynomials can fit training data too well but generalize poorly.

Complex interpretation: Coefficients become harder to interpret.

Extrapolation issues: Predictions outside the observed range can be highly unreliable.

Numerical instability: Large polynomial degrees can lead to computational instability.

29. Cross-validation: Splitting data into training and validation sets to check performance.

Adjusted R²: Helps prevent unnecessary complexity.

Mean Squared Error (MSE): Measures prediction error.

Residual plots: Check whether the residuals scatter randomly, with no systematic pattern.

Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC): Help balance model complexity and accuracy.
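
The first and third ideas can be combined in a small sketch: hold out part of the data, fit several polynomial degrees on the rest, and compare the validation MSE. The simulated quadratic data, the single 40/20 split, and the degree range 1 to 6 are all assumptions for illustration; full k-fold cross-validation repeats this over several splits.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 60)
# Simulated data: a quadratic trend plus noise.
y = 0.5 * x**2 - x + 2.0 + rng.normal(scale=0.5, size=x.size)

# Single random split into training and validation sets.
idx = rng.permutation(x.size)
train, val = idx[:40], idx[40:]

def val_mse(degree):
    """Fit on the training set, return mean squared error on the validation set."""
    coeffs = np.polyfit(x[train], y[train], degree)
    pred = np.polyval(coeffs, x[val])
    return float(np.mean((y[val] - pred) ** 2))

scores = {d: val_mse(d) for d in range(1, 7)}
best_degree = min(scores, key=scores.get)

for d, mse in scores.items():
    print(f"degree {d}: validation MSE = {mse:.3f}")
print("lowest validation MSE at degree", best_degree)
```

On data like this the straight line should score clearly worse than the quadratic, while higher degrees tend to bring little or no further improvement.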

30. Visualization helps:

Assess the fit of the regression curve.

Identify overfitting by checking whether the curve fluctuates excessively to follow individual points.

Verify trends in data.

Choose the appropriate polynomial degree before applying statistical techniques.

31. Polynomial regression can be implemented using Python libraries such as NumPy, Scikit-learn, and Matplotlib. A typical approach includes:

Generating polynomial features using PolynomialFeatures from Scikit-learn.

Fitting the model using LinearRegression().

Visualizing the results using Matplotlib.
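
The three steps above might look like the following sketch. The data are generated from Y = 1 + 2X + 3X² for the example, and `plot_fit` with its output file name is a hypothetical helper; Matplotlib is imported inside it so the fitting part runs without a display.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical data generated exactly from Y = 1 + 2*X + 3*X^2.
X = np.linspace(0, 5, 20).reshape(-1, 1)
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 0] ** 2

# Step 1 and 2: generate polynomial features, then fit a linear model on them.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
model.fit(X, y)

lin = model.named_steps["linearregression"]
print("coefficients [m1, m2]:", lin.coef_)
print("intercept c:", lin.intercept_)

def plot_fit(path="poly_fit.png"):
    """Step 3: save a scatter plot of the data with the fitted curve."""
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, writes to a file
    import matplotlib.pyplot as plt

    grid = np.linspace(0, 5, 200).reshape(-1, 1)
    plt.scatter(X[:, 0], y, label="data")
    plt.plot(grid[:, 0], model.predict(grid), label="degree-2 fit")
    plt.xlabel("X")
    plt.ylabel("Y")
    plt.legend()
    plt.savefig(path)
    plt.close()
```

Calling `plot_fit()` writes the figure to disk; changing `degree` in `PolynomialFeatures` is the only edit needed to try a different curve.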