### 1. What is Simple Linear Regression?
Simple Linear Regression is a statistical technique used to model the relationship between one independent variable (X) and a dependent variable (Y).
It fits a straight line that best represents the data using the least squares method.
The primary goal is to predict Y based on X and understand how changes in X influence Y.

### 2. What are the key assumptions of Simple Linear Regression?
The model assumes a linear relationship between X and Y, meaning each unit change in X shifts the expected value of Y by a constant amount.
Residuals should be independent of each other and have constant variance (homoscedasticity); normality of the residuals matters mainly for exact hypothesis tests and confidence intervals in small samples, not for the least squares fit itself.
Additionally, the predictor variable should be measured without significant error.

### 3. What does the coefficient m represent in Y = mX + c?
The coefficient m is the slope of the regression line and represents the rate of change in Y for each unit change in X.
A positive slope indicates that Y increases as X increases, while a negative slope indicates the opposite.
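The magnitude of m is the size of the predicted change in Y per unit of X; because it depends on the units of both variables, it does not measure how strong the relationship is (the correlation coefficient or R² does that).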

### 4. What does the intercept c represent in Y = mX + c?
The intercept c represents the predicted value of Y when X is zero.
It determines the vertical placement of the regression line on the graph.
Its interpretation depends on whether X=0 is meaningful within the context of the data.

### 5. How do we calculate the slope m in Simple Linear Regression?
The slope m is calculated using the formula m = Cov(X, Y) / Var(X), which follows from minimizing the sum of squared residuals.
The resulting line is the best fit in the sense that it minimizes the squared vertical distances between actual and predicted values.
The slope indicates how sensitive Y is to changes in X.
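
The formula above can be checked numerically. The following is a minimal sketch on made-up toy data (the `x` and `y` values are illustrative assumptions, not taken from any dataset); it also uses the standard result that the fitted line passes through the point of means, so c = mean(Y) - m * mean(X).

```python
import numpy as np

# Hypothetical toy data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Slope: sample covariance of x and y divided by sample variance of x.
# ddof=1 is used in both, so the (n - 1) factors cancel.
m = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Intercept: the least squares line passes through (mean(x), mean(y)).
c = y.mean() - m * x.mean()

# Cross-check against NumPy's degree-1 least squares polynomial fit.
m_ref, c_ref = np.polyfit(x, y, 1)

print(f"m = {m:.4f}, c = {c:.4f}")
print(f"polyfit: m = {m_ref:.4f}, c = {c_ref:.4f}")
```

Both routes should agree up to floating-point rounding, since `np.polyfit` solves the same least squares problem.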

### 6. What is the purpose of the least squares method?
The least squares method minimizes the sum of squared differences between observed and predicted values.
It yields the line that fits the data as closely as possible in the squared-error sense.
Under the model assumptions, the resulting slope and intercept estimates are unbiased and have the smallest variance among linear unbiased estimators (the Gauss-Markov result).
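
As a worked sketch of the minimization for the simple model Y = mX + c (the index i, the sample size n, and the bar notation for sample means are notation introduced here for illustration):

```latex
\begin{aligned}
S(m, c) &= \sum_{i=1}^{n} \left( y_i - m x_i - c \right)^2 \\[4pt]
\frac{\partial S}{\partial c} &= -2 \sum_{i=1}^{n} \left( y_i - m x_i - c \right) = 0
  \;\Rightarrow\; c = \bar{y} - m \bar{x} \\[4pt]
\frac{\partial S}{\partial m} &= -2 \sum_{i=1}^{n} x_i \left( y_i - m x_i - c \right) = 0
  \;\Rightarrow\; m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
  = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}
\end{aligned}
```

Substituting the expression for c into the second condition gives the slope formula; the common 1/(n - 1) factor in the sample covariance and variance cancels.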

### 7. How is R² interpreted in Simple Linear Regression?
R² indicates the proportion of variation in the dependent variable explained by the independent variable.
For a least squares fit with an intercept, evaluated on the data used for fitting, it ranges from 0 to 1, where higher values indicate a closer fit.
However, it does not indicate a causal relationship or prediction quality on new data.
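
A minimal sketch of the computation, assuming the usual definition R² = 1 - SS_res / SS_tot and the same kind of made-up toy data (all values here are illustrative). In simple linear regression R² also equals the squared Pearson correlation between X and Y, which the last lines check.

```python
import numpy as np

# Hypothetical toy data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit the least squares line and compute fitted values.
m, c = np.polyfit(x, y, 1)
y_hat = m * x + c

# R^2 = 1 - (residual sum of squares) / (total sum of squares).
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

# With a single predictor, R^2 equals the squared correlation of x and y.
r = np.corrcoef(x, y)[0, 1]

print(f"R^2 = {r2:.4f}, r^2 = {r ** 2:.4f}")
```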

### 8. What is Multiple Linear Regression?
Multiple Linear Regression models the relationship between multiple independent variables and one dependent variable.
It extends simple regression by estimating separate coefficients for each predictor.
This helps understand combined and individual effects of predictors on the outcome.

### 9. Main difference between Simple and Multiple Linear Regression?
Simple Linear Regression uses one predictor, while Multiple Linear Regression uses two or more.
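With interaction or polynomial terms added, multiple regression can capture more complex relationships among predictors.
However, it adds the requirement of no perfect multicollinearity, and each coefficient must be interpreted as the effect of its predictor with the other predictors held constant.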

### 10. Key assumptions of Multiple Linear Regression?
The model assumes linearity between predictors and outcome, normally distributed and homoscedastic residuals, and independence of observations.
It also requires no perfect multicollinearity among predictors.
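Model correctness also depends on including all relevant variables: omitting one that is correlated with the included predictors biases their coefficient estimates.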

### 11. What is heteroscedasticity?
Heteroscedasticity occurs when residual variance changes across levels of predictors rather than remaining constant.
It leads to unreliable standard errors, affecting hypothesis tests and confidence intervals.
Although coefficient estimates remain unbiased, inference becomes inaccurate.

### 12. How to improve a model with high multicollinearity?
You can remove or combine highly correlated predictors, or transform them using PCA.
Regularization techniques such as Ridge or Lasso regression also help stabilize coefficient estimates.
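Increasing sample size reduces the variance of the estimates, and centering variables helps when the collinearity comes from polynomial or interaction terms built from the same predictors.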
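
The stabilizing effect of Ridge can be sketched on synthetic data. Everything below is an illustrative assumption: the two nearly identical predictors `x1` and `x2`, the true coefficient of 3.0, the noise levels, and `alpha=1.0`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data: x2 is almost an exact copy of x1 (severe multicollinearity).
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.01, size=n)
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.5, size=n)

# Ordinary least squares: the two coefficients are poorly determined and
# can take large values of opposite sign.
ols = LinearRegression().fit(X, y)

# Ridge: the L2 penalty shrinks the unstable direction, so the effect is
# split more evenly between the two correlated predictors.
ridge = Ridge(alpha=1.0).fit(X, y)

print("OLS coefficients:  ", ols.coef_)
print("Ridge coefficients:", ridge.coef_)
```

The sum of the two Ridge coefficients should stay close to the true combined effect, while the individual values are much less erratic than with plain least squares. The penalty strength `alpha` would normally be chosen by cross-validation.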

### 13. Techniques for transforming categorical variables?
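One-hot encoding creates a binary variable for each category and is the usual choice in regression; when the model has an intercept, one category is dropped as the reference level to avoid perfect multicollinearity.
Label (ordinal) encoding assigns integers to categories and suits ordinal features, where the order carries meaning.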
For high-cardinality variables, target encoding or frequency encoding may be used.
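
A minimal sketch of one-hot encoding with scikit-learn's `OneHotEncoder`. The `colors` column is a made-up example, and `drop="first"` is one possible choice of reference level, used here so the dummy columns are not perfectly collinear with an intercept.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical column with three levels.
colors = np.array([["red"], ["green"], ["blue"], ["green"]])

# Categories are sorted alphabetically (blue, green, red); drop="first"
# removes "blue", which becomes the reference level.
encoder = OneHotEncoder(drop="first")
encoded = encoder.fit_transform(colors).toarray()

print(encoder.categories_[0])  # learned category order
print(encoded)                 # columns: green, red
```

A row of all zeros then stands for the dropped category, and each remaining coefficient is read as a difference from that reference level.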

### 14. Role of interaction terms?
Interaction terms capture situations where the effect of one predictor on Y depends on another predictor.
They help model non-additive relationships, adding flexibility to the regression equation.
They are essential when theory or data suggests variable interactions.
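
As a sketch, an interaction term is simply the product of two predictors added as an extra column. The data below is synthetic and noise-free (the coefficients 1.0, 2.0, 0.5, and 1.5 are arbitrary assumptions), so least squares recovers them exactly.

```python
import numpy as np

# Synthetic, noise-free data with a known interaction effect.
rng = np.random.default_rng(42)
x1 = rng.uniform(0, 5, size=50)
x2 = rng.uniform(0, 5, size=50)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + 1.5 * x1 * x2

# Design matrix: intercept, x1, x2, and the interaction column x1 * x2.
A = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b0, b1, b2, b3 = coef

# With an interaction, the effect of x1 on y depends on x2: b1 + b3 * x2.
for x2_value in (0.0, 2.0, 4.0):
    print(f"effect of x1 when x2 = {x2_value}: {b1 + b3 * x2_value:.2f}")
```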

### 15. Interpretation of intercept in Simple vs Multiple Regression?
In Simple Regression, the intercept is the predicted value of Y at X=0.
In Multiple Regression, the intercept represents Y when all predictors equal zero.
Depending on context, this may or may not be meaningful.

### 16. Significance of the slope in regression?
The slope indicates how much the dependent variable changes for a unit increase in the predictor.
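Its sign gives the direction of the relationship.
A larger absolute value means a larger predicted change in Y per unit of X, but because the slope depends on the units of measurement, it does not by itself measure the strength of the relationship.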

### 17. How does the intercept provide context?
The intercept establishes the baseline value of Y when predictors are zero.
It helps anchor the regression line and interpret predictions.
Its practical relevance depends on whether zero values make sense for the variables.

### 18. Limitations of R²?
R² does not indicate whether the model is appropriate or overfitted.
It also cannot assess predictive performance on unseen data.
High R² does not guarantee meaningful or reliable predictions.

### 19. Interpretation of large standard error?
A large standard error means the coefficient estimate is unstable and uncertain.
It suggests the model may suffer from multicollinearity or insufficient data.
A large SE relative to the size of the estimate gives a small t-statistic, so the coefficient is less likely to be statistically significant.
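
For simple linear regression the standard error of the slope can be computed by hand. This sketch assumes the standard formula SE(m) = sqrt(s² / Σ(x - mean(x))²) with s² = SS_res / (n - 2), applied to made-up toy data.

```python
import numpy as np

# Hypothetical toy data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

# Fit the line and compute residuals.
m, c = np.polyfit(x, y, 1)
resid = y - (m * x + c)

# Residual variance estimate with n - 2 degrees of freedom
# (two parameters, slope and intercept, were estimated).
s2 = np.sum(resid ** 2) / (n - 2)

# Standard error of the slope and the corresponding t-statistic.
sxx = np.sum((x - x.mean()) ** 2)
se_m = np.sqrt(s2 / sxx)
t_stat = m / se_m

print(f"m = {m:.3f}, SE(m) = {se_m:.4f}, t = {t_stat:.1f}")
```

The formula shows why more data or a wider spread in X lowers the standard error: both increase the denominator.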

### 20. Identifying heteroscedasticity in residual plots?
Residual plots show heteroscedasticity when residuals form a fan or cone shape.
Addressing it is important because it biases standard errors and test statistics.
Solutions include transformations or using robust standard errors.
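
Without a plot, the fan shape can be sketched numerically by comparing the residual spread at low and high values of the predictor. The data is synthetic, with noise that grows in proportion to `x` by construction.

```python
import numpy as np

# Synthetic data whose noise standard deviation grows with x.
rng = np.random.default_rng(1)
x = np.linspace(1, 10, 400)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5 * x)

# Fit a straight line and compute residuals.
m, c = np.polyfit(x, y, 1)
resid = y - (m * x + c)

# Compare residual spread in the lower and upper halves of x.
low = resid[x <= np.median(x)]
high = resid[x > np.median(x)]

print(f"residual std, low x:  {low.std():.2f}")
print(f"residual std, high x: {high.std():.2f}")
```

A clearly larger spread in one half is the numeric counterpart of the fan or cone seen in a residual plot; formal tests exist as well, but this comparison is often enough for a first check.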

### 21. Meaning of high R² but low adjusted R²?
This indicates the model contains unnecessary predictors that don't improve explanatory power.
The model may be overfitted, capturing noise rather than signal.
Adjusted R² penalizes extra predictors, revealing this issue.
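
The penalty can be sketched with the usual formula, adjusted R² = 1 - (1 - R²)(n - 1) / (n - p - 1), where n is the number of observations and p the number of predictors. The function name `adjusted_r2` and the example numbers are assumptions for illustration.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a model with p predictors fit on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Same R^2 and sample size, different numbers of predictors.
few = adjusted_r2(r2=0.90, n=20, p=2)
many = adjusted_r2(r2=0.90, n=20, p=10)

print(f"p = 2:  adjusted R^2 = {few:.3f}")
print(f"p = 10: adjusted R^2 = {many:.3f}")
```

With the same R² of 0.90, the model with 10 predictors is penalized far more heavily than the one with 2, which is the gap described above.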

### 22. Why scale variables in regression?
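Scaling puts predictors on a comparable numeric range so that no variable dominates simply because of its units.
For ordinary least squares, rescaling a predictor only rescales its coefficient and leaves predictions unchanged, but it improves numerical stability and speeds up gradient-based optimizers.
It matters most for regularization methods like Ridge and Lasso, whose penalties depend on coefficient size and therefore on each predictor's scale.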
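
One common way to do this is to standardize inside a pipeline so the scaler is fit only on the training data. The sketch below uses synthetic data; the `income` and `age` predictors, their distributions, and `alpha=1.0` are all hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic predictors on very different scales.
rng = np.random.default_rng(0)
n = 200
income = rng.normal(50_000, 15_000, size=n)  # tens of thousands
age = rng.normal(40, 10, size=n)             # tens
X = np.column_stack([income, age])
y = 0.0001 * income + 0.5 * age + rng.normal(scale=1.0, size=n)

# Standardize each column to mean 0 and unit variance, then fit Ridge,
# so the penalty treats both predictors on the same footing.
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X, y)

X_scaled = model.named_steps["standardscaler"].transform(X)
coefs = model.named_steps["ridge"].coef_

print("column means after scaling:", X_scaled.mean(axis=0).round(6))
print("column stds after scaling: ", X_scaled.std(axis=0).round(6))
print("coefficients per standard deviation:", coefs)
```

After scaling, each coefficient is the change in Y per one standard deviation of its predictor, which also makes the coefficients directly comparable.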

### 23. What is polynomial regression?
Polynomial regression models nonlinear relationships by adding polynomial terms (X², X³...).
It maintains linearity in coefficients while allowing curved fits.
It is useful for capturing smooth nonlinear trends.

### 24. Difference between polynomial and linear regression?
Linear regression fits a straight line, whereas polynomial regression fits curved patterns.
Polynomial regression introduces higher-degree terms to capture complexity.
It remains a linear model in terms of coefficients.

### 25. When is polynomial regression used?
It is used when the data shows a curved or nonlinear pattern that a straight line cannot capture.
Appropriate for trend analysis, growth curves, and smooth nonlinear relationships.
Care must be taken to avoid overfitting.

### 26. General equation for polynomial regression?
The equation is Y = b0 + b1X + b2X² + ... + bnXⁿ.
Each added term increases model flexibility.
Higher-degree polynomials allow more curvature in the fitted curve.
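
Because the equation is linear in the coefficients b0, b1, ..., bn, it can be fit with ordinary least squares on a matrix whose columns are powers of X. A minimal sketch for degree 2, using toy data that follows y = x² + 1 exactly (the data is an illustrative assumption):

```python
import numpy as np

# Toy data generated from y = x^2 + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 5.0, 10.0, 17.0])

# Design matrix with columns 1, x, x^2 (a Vandermonde matrix).
A = np.vander(x, N=3, increasing=True)

# Ordinary least squares on the expanded columns gives b0, b1, b2.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

print("b0, b1, b2 =", coef.round(6))
```

The recovered coefficients are b0 = 1, b1 = 0, b2 = 1 up to floating-point rounding, matching the generating equation.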

### 27. Can polynomial regression apply to multiple variables?
Yes, polynomial terms can be added for each predictor and their interactions.
This results in complex models with many terms.
Regularization is often required to prevent overfitting.

### 28. Limitations of polynomial regression?
High-degree polynomials often overfit the data.
They are sensitive to outliers and extrapolate poorly outside the data range.
Interpretability decreases as model complexity grows.
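
The overfitting and extrapolation problems can be sketched by fitting a degree-1 and a degree-9 polynomial to noisy data that is truly linear. The data, the noise level, the helper name `fit_and_score`, and the extrapolation point x = 2 are all illustrative assumptions.

```python
import numpy as np

# Synthetic data: a straight line plus noise, observed only on [-1, 1].
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 15)
y = 2.0 * x + 1.0 + rng.normal(scale=0.2, size=x.size)


def fit_and_score(degree):
    """Return training MSE and absolute error at x = 2 (outside the data)."""
    coefs = np.polyfit(x, y, degree)
    train_mse = np.mean((np.polyval(coefs, x) - y) ** 2)
    x_new = 2.0
    extrap_err = abs(np.polyval(coefs, x_new) - (2.0 * x_new + 1.0))
    return train_mse, extrap_err


mse_1, err_1 = fit_and_score(1)
mse_9, err_9 = fit_and_score(9)

print(f"degree 1: train MSE = {mse_1:.4f}, error at x=2 = {err_1:.2f}")
print(f"degree 9: train MSE = {mse_9:.4f}, error at x=2 = {err_9:.2f}")
```

The higher-degree fit can never have a larger training error, because it contains the straight line as a special case, yet its prediction outside the observed range is typically far worse.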

### 29. Selecting polynomial degree?
Use cross-validation to compare performance across degrees.
AIC/BIC scores and validation curves help choose optimal complexity.
Visualization of fit and residuals also guides selection.
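
A minimal sketch of degree selection by cross-validation with scikit-learn. The synthetic quadratic data, the candidate degrees 1 to 5, and the 5-fold split are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data with a true quadratic relationship plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = 1.0 + 2.0 * X[:, 0] + 1.5 * X[:, 0] ** 2 + rng.normal(scale=1.0, size=60)

# The same shuffled 5-fold split is reused for every candidate degree.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

cv_mse = {}
for degree in range(1, 6):
    model = make_pipeline(PolynomialFeatures(degree=degree), LinearRegression())
    # scikit-learn returns negated MSE, so flip the sign back.
    scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    cv_mse[degree] = -scores.mean()

best_degree = min(cv_mse, key=cv_mse.get)

for degree, mse in cv_mse.items():
    print(f"degree {degree}: CV MSE = {mse:.3f}")
print("lowest CV MSE at degree", best_degree)
```

When several degrees score about the same, preferring the lowest of them is a reasonable tie-break, since the simpler model is easier to interpret and extrapolates more safely.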

### 30. Why is visualization important?
Plots help assess how well the polynomial curve fits the data.
They reveal underfitting, overfitting, and anomalies.
Visualization improves interpretation and communication of results.

### 31. How is polynomial regression implemented in Python?
In Python, polynomial regression is commonly implemented with scikit-learn's `PolynomialFeatures`, which expands the input into polynomial terms.
A `LinearRegression` model is then fit to the transformed data.
A simple example is included below in a code cell.

```python
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
import numpy as np

# Toy data that follows y = x^2 + 1 exactly
X = np.array([1, 2, 3, 4]).reshape(-1, 1)
y = np.array([2, 5, 10, 17])

# Expand X into [1, x, x^2], then fit ordinary least squares on those columns
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)

# The first coefficient belongs to the constant column and is 0,
# because LinearRegression fits its own intercept.
print("Coefficients:", model.named_steps['linearregression'].coef_)
print("Intercept:", model.named_steps['linearregression'].intercept_)
```

Output:

```
Coefficients: [0.00000000e+00 2.10942375e-15 1.00000000e+00]
Intercept: 1.0
```
