# **Supervised Learning: Regression Models and Performance Metrics**




Question 1 : What is Simple Linear Regression (SLR)? Explain its purpose.
 - Simple Linear Regression (SLR) is a statistical method that models the relationship between two variables, one independent variable (X) and one dependent variable (Y), by fitting a straight line through the data points. Its purpose is to quantify how Y changes as X changes and to predict Y for a given value of X.

Question 2: What are the key assumptions of Simple Linear Regression?
 - The key assumptions of Simple Linear Regression (SLR) ensure that the model’s estimates and predictions are reliable and valid. They are as follows:

Linearity

- The relationship between the independent variable (X) and the dependent variable (Y) must be linear.

- This means a one-unit change in X is associated with the same change in the expected value of Y, regardless of the starting value of X.

Independence of Errors

 - The residuals (errors) should be independent of each other.

- This means the error for one observation should not influence another.

Homoscedasticity (Constant Variance of Errors)

- The variance of residuals should be constant across all values of X.

- If the spread of errors changes with X (called heteroscedasticity), it violates this assumption.

Normality of Errors

- The residuals (ε) should be normally distributed.

- This is important for valid hypothesis testing and confidence intervals.

No Multicollinearity (Not applicable in SLR)

- Since SLR uses only one independent variable, this assumption mainly applies to multiple regression, not simple regression.
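
An informal way to inspect these assumptions is to look at the residuals of a fitted line. The following is a minimal sketch rather than a formal diagnostic: the data are synthetic and hypothetical, `np.polyfit` is used to fit the line, and comparing the residual spread across the lower and upper halves of X is only a rough proxy for homoscedasticity.

```python
import numpy as np

# Hypothetical synthetic data that satisfy the SLR assumptions by construction
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 5.0 + 2.0 * x + rng.normal(0.0, 1.0, size=x.size)

# Fit a straight line; for degree 1, polyfit returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)

# With an intercept in the model, least-squares residuals average to about 0
mean_residual = residuals.mean()

# Informal homoscedasticity check: compare residual spread in each half of X
half = x.size // 2
spread_low = residuals[:half].std()
spread_high = residuals[half:].std()
spread_ratio = spread_high / spread_low

print("Mean residual:", mean_residual)
print("Residual spread ratio (upper half / lower half):", spread_ratio)
```

A ratio far from 1 would hint at heteroscedasticity. In practice a residual plot or a formal test would normally be used as well.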

Question 3: Write the mathematical equation for a simple linear regression model and
explain each term.
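- The mathematical equation for a Simple Linear Regression (SLR) model is:

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

- $Y$: the dependent (response) variable being predicted.
- $X$: the independent (predictor) variable.
- $\beta_0$: the intercept, the predicted value of Y when X = 0.
- $\beta_1$: the slope, the change in Y for a one-unit change in X.
- $\varepsilon$: the error term, the part of Y that the straight line does not explain.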

Question 4: Provide a real-world example where simple linear regression can be
applied.
- Real-World Example of Simple Linear Regression (SLR):

Scenario:
A company wants to predict sales revenue based on advertising expenditure.

Dependent Variable (Y): Sales revenue (in ₹ or $)

Independent Variable (X): Advertising spend (in ₹ or $)

Application:
The company collects past data on how much they spent on advertising and the corresponding sales generated. By applying Simple Linear Regression, they can model the relationship as:

$$\text{Sales} = \beta_0 + \beta_1 (\text{Advertising Spend}) + \varepsilon$$

Question 5: What is the method of least squares in linear regression?
- The method of least squares determines the line of best fit by choosing the intercept and slope that minimize the sum of the squared differences (residuals) between the actual and predicted values of the dependent variable.

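
For a single predictor, the least-squares line has a closed-form solution: the slope is the sum of cross-deviations of X and Y divided by the sum of squared deviations of X, and the intercept follows from the two means. The sketch below applies that formula to the experience and salary values used in Question 9; the variable names are illustrative.

```python
import numpy as np

# Data from the Question 9 example (years of experience, salary)
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([15000, 18000, 21000, 24000, 27000], dtype=float)

x_mean, y_mean = x.mean(), y.mean()

# Closed-form least-squares estimates for one predictor
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# The quantity that least squares minimizes: the sum of squared residuals
sse = np.sum((y - (intercept + slope * x)) ** 2)

print("Slope:", slope)
print("Intercept:", intercept)
print("Sum of squared errors:", sse)
```

Because this data lies exactly on a line, the sum of squared errors is zero. Real data would leave a positive remainder.
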
Question 6: What is Logistic Regression? How does it differ from Linear Regression?
 - Logistic Regression

Logistic Regression is a statistical technique used to model the relationship between a categorical dependent variable (usually binary: 0 or 1) and one or more independent variables.

It predicts the probability that an observation belongs to a particular category (e.g., “Yes” or “No”, “Spam” or “Not Spam”).

The logistic regression model is based on the logistic (sigmoid) function:

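$$P(Y = 1) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$$

Difference from Linear Regression:

- Linear regression predicts a continuous value of Y. Logistic regression predicts a probability between 0 and 1, which is then mapped to a category.

- Linear regression fits a straight line, usually by least squares. Logistic regression fits an S-shaped (sigmoid) curve, usually by maximum likelihood estimation.
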
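
The logistic function can be sketched in a few lines to show how a linear combination of X is turned into a probability. The coefficients below are hypothetical values chosen only for illustration, and `predict_probability` is an illustrative helper, not a library function.

```python
import math

def predict_probability(x, beta0, beta1):
    """Return P(Y=1 | x) under a one-feature logistic model."""
    z = beta0 + beta1 * x
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients (assumed, not estimated from any data)
beta0, beta1 = -4.0, 1.0

for x in (0, 4, 8):
    p = predict_probability(x, beta0, beta1)
    # 0.5 is a common, but not mandatory, decision threshold
    label = 1 if p >= 0.5 else 0
    print(f"x={x}: P(Y=1)={p:.3f}, predicted class={label}")
```

Whatever the value of X, the output stays strictly between 0 and 1, which a straight line cannot guarantee.
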
Question 7: Name and briefly describe three common evaluation metrics for regression
models
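- Mean Absolute Error (MAE): the average of the absolute differences between actual and predicted values. It is in the same units as Y, so it is easy to interpret.

- Mean Squared Error (MSE): the average of the squared differences between actual and predicted values. Squaring penalizes large errors more heavily.

- R² (coefficient of determination): the proportion of the variance in Y that the model explains.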
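
The three metrics can be computed directly from their definitions. This is a small sketch on hypothetical actual and predicted values; the numbers are made up to keep the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical actual and predicted values
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 11.0])

errors = y_true - y_pred  # [1, 0, -1, -2]

# MAE: mean of absolute errors -> (1 + 0 + 1 + 2) / 4 = 1.0
mae = np.mean(np.abs(errors))

# MSE: mean of squared errors -> (1 + 0 + 1 + 4) / 4 = 1.5
mse = np.mean(errors ** 2)

# R²: 1 - SS_res / SS_tot -> 1 - 6 / 20 = 0.7
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print("MAE:", mae)
print("MSE:", mse)
print("R²:", r2)
```

The single error of 2 contributes 4 of the 6 units of squared error, which shows how MSE weights large errors more than MAE does.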

Question 8: What is the purpose of the R-squared metric in regression analysis?
 - Explains Model Fit

R² indicates the proportion of the variance in the dependent variable (Y) that is explained by the independent variable(s) (X).

Example: An R² of 0.85 means 85% of the variation in Y is explained by X, and the remaining 15% is due to other factors or random error.

- Evaluates Model Performance

A higher R² value (closer to 1) means the model fits the data better.

A lower R² value (closer to 0) means the model explains very little of the variability in Y.

- Helps Compare Models

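R² can be used to compare different regression models fitted to the same dataset and the same dependent variable; the model with a higher R² generally provides a better fit. Because R² never decreases when predictors are added to a least-squares model, adjusted R² is the safer choice when the models have different numbers of predictors.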
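
To make the comparison concrete, the sketch below scores two models on the same hypothetical dataset: a baseline that always predicts the mean of Y, and a least-squares line. The helper `r_squared` is an illustrative function written for this example, not a library call.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R² = 1 - (residual sum of squares) / (total sum of squares)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Model A: baseline that ignores X and always predicts the mean of Y
pred_baseline = np.full_like(y, y.mean())

# Model B: least-squares line; for degree 1, polyfit returns [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
pred_line = intercept + slope * x

r2_baseline = r_squared(y, pred_baseline)
r2_line = r_squared(y, pred_line)

print("R² of mean-only baseline:", r2_baseline)
print("R² of fitted line:", r2_line)
```

The baseline scores 0 by construction, so R² can be read as the improvement of a model over simply predicting the mean.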

Question 9: Write Python code to fit a simple linear regression model using scikit-learn
and print the slope and intercept.


# Import necessary libraries
from sklearn.linear_model import LinearRegression
import numpy as np

# Example data
# X = independent variable (e.g., years of experience)
# y = dependent variable (e.g., salary)
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([15000, 18000, 21000, 24000, 27000])

# Create and fit the model
model = LinearRegression()
model.fit(X, y)

# Print the slope (coefficient) and intercept
print("Slope (β1):", model.coef_[0])
print("Intercept (β0):", model.intercept_)


Output:

Slope (β1): 3000.0000000000005
Intercept (β0): 11999.999999999998


Question 10: How do you interpret the coefficients in a simple linear regression model?
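- Intercept (β₀): the predicted value of Y when X = 0. In the Question 9 example, β₀ ≈ 12000 is the predicted salary at 0 years of experience. The intercept is only meaningful when X = 0 lies within or near the observed range of the data.

- Slope (β₁): the average change in the predicted value of Y for every one-unit increase in X. In the Question 9 example, β₁ ≈ 3000 means each additional year of experience is associated with a salary increase of about 3000. A positive slope means Y rises with X; a negative slope means Y falls as X rises.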
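
The interpretation can be checked with plain arithmetic. This sketch plugs in the rounded coefficients printed in Question 9; `predict_salary` is an illustrative helper, and the prediction at 6 years is an extrapolation beyond the 1 to 5 years in the data, so it should be read with caution.

```python
# Rounded coefficients from the Question 9 output
intercept = 12000.0  # β0
slope = 3000.0       # β1

def predict_salary(years):
    """Predicted salary for a given number of years of experience."""
    return intercept + slope * years

# Intercept: the prediction when X = 0
at_zero = predict_salary(0)

# Slope: the difference between predictions one unit of X apart
one_year_difference = predict_salary(4) - predict_salary(3)

print("Predicted salary at 0 years:", at_zero)
print("Change per additional year:", one_year_difference)
print("Predicted salary at 6 years (extrapolated):", predict_salary(6))
```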