# Supervised Learning: Regression Models and Performance Metrics | **Solution**

**Instructions:** Carefully read each question. Use Google Docs, Microsoft Word, or a similar tool to create a document where you type out each question along with its answer. Save the document as a PDF, and then upload it to the LMS. Please do not zip or archive the files before uploading them. Each question carries 20 marks.

**Question 1:** What is Simple Linear Regression (SLR)? Explain its purpose.

**Answer:** Simple Linear Regression (SLR) is a statistical method used to study the relationship between two variables:

- One independent variable (X) — the predictor or input

- One dependent variable (Y) — the outcome or target

The goal of SLR is to find a straight line (called the regression line) that best fits the data and can be used to predict the value of Y based on X.

**Purpose of SLR:**

**1. Prediction:**

To predict the value of one variable (Y) based on the value of another variable (X). Example: Predicting house price (Y) based on its size (X).

**2. Relationship Analysis:**

To determine whether and how strongly two variables are related. Example: Studying the relationship between advertising spend (X) and sales revenue (Y).

**3. Trend Estimation:**

To identify the general trend in the data, that is, whether Y tends to rise or fall as X increases and by roughly how much.

**Question 2:** What are the key assumptions of Simple Linear Regression?


**Answer:**
1. Linearity

The relationship between the independent variable (X) and dependent variable (Y) should be linear.

- This means the change in Y is proportional to the change in X.

- You can check this by plotting a scatter plot of X and Y.

- If the points roughly form a straight line, the assumption holds true.

Example: As study hours increase, exam scores increase roughly in a straight-line pattern.

2. Independence of Errors

The residuals (errors) should be independent of each other.

- This means one observation’s error shouldn’t influence another’s.

- It’s especially important in time-series data.

- You can test this using the Durbin-Watson test.

Example: The error in predicting sales for February shouldn’t depend on the error for January.

3. Homoscedasticity (Constant Variance of Errors)

The variance of residuals should be constant across all levels of X.

- The spread of residuals should look roughly the same for all X values.

- If the variance increases or decreases with X, this indicates heteroscedasticity, which makes standard errors and significance tests unreliable.

Example: The prediction errors should be about the same size for small and large values of X.

4. Normality of Errors

The residuals (differences between observed and predicted Y values) should follow a normal distribution.

- This assumption is important for hypothesis testing and confidence intervals.

- You can check it with a histogram or Q-Q plot of residuals.

Example: Most errors are small, with few very large positive or negative errors.

5. No or Minimal Multicollinearity

In simple linear regression there is only one independent variable, so this assumption does not apply. It matters in multiple regression, where the independent variables should not be highly correlated with each other.
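The independence check mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration, not a full diagnostic workflow: the residual values are hypothetical, and `durbin_watson` is a helper written for this example rather than a library function.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in observation order.

    Values near 2 suggest no first-order autocorrelation; values toward 0
    suggest positive autocorrelation and values toward 4 suggest negative.
    """
    squared_diffs = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    squared_total = sum(e ** 2 for e in residuals)
    return squared_diffs / squared_total


# Hypothetical residuals (actual minus predicted), made up for illustration
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]

dw = durbin_watson(residuals)
mean_residual = sum(residuals) / len(residuals)

print("Durbin-Watson statistic:", round(dw, 3))
print("Mean residual:", round(mean_residual, 3))
```

A statistic close to 2 is consistent with independent errors. With only five points, as here, the result is purely illustrative.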

**Question 3:** Write the mathematical equation for a simple linear regression model and explain each term.

**Answer:** The mathematical equation for a Simple Linear Regression (SLR) model is:

$$Y = a + bX + e$$


1. Y — Dependent Variable (Response Variable)

- This is the output or value we want to predict or explain.

- Example: House price, sales amount, exam score, etc.

2. X — Independent Variable (Predictor Variable)

- This is the input or factor used to predict Y.

- Example: House size, advertising spend, study hours, etc.

3. a — Intercept (Constant Term)

- The value of Y when X = 0.

- It represents the point where the regression line crosses the Y-axis.

- Example: If sales = 50 + 10X, the intercept (50) means even with no advertising, sales are expected to be 50 units.

4. b — Slope (Regression Coefficient)

- The rate of change in Y for every one-unit change in X.

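- It shows the direction of the relationship and the size of the change in Y per unit of X. (How strongly the two variables are related is measured by the correlation or R², not by the slope.)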

- If b > 0, Y increases as X increases (positive relationship).

- If b < 0, Y decreases as X increases (negative relationship).

- Example: If b = 10, every extra unit of X is associated with an average increase of 10 units in Y.

5. e — Error Term (Residual)

- The difference between the actual and predicted value of Y.

- It accounts for all other factors that affect Y but are not included in the model.

$$e = Y_{\text{actual}} - Y_{\text{predicted}}$$


**Question 4:** Provide a real-world example where simple linear regression can be applied.


**Answer:** **Example: Predicting Student Exam Scores**

Scenario:

A teacher wants to understand how the number of hours a student studies (X) affects their exam score (Y).

Model Setup:

- Independent Variable (X): Hours studied

- Dependent Variable (Y): Exam score

After collecting data from several students, the teacher finds this regression equation:

$$\text{Score} = 25 + 5X$$

**Interpretation:**

- Intercept (25):

If a student studies 0 hours, they can still score around 25 marks (maybe from basic knowledge or attendance marks).

- Slope (5):

For every additional hour studied, the exam score increases by 5 marks on average.

**Example Prediction:**

If a student studies for 8 hours, the predicted score is:

$$\text{Score} = 25 + 5(8) = 65$$
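The same calculation can be written as a small function. This is only a sketch of the example equation above; the function name `predict_score` is chosen for illustration.

```python
def predict_score(hours_studied, intercept=25.0, slope=5.0):
    """Predicted exam score from the example line Score = 25 + 5X."""
    return intercept + slope * hours_studied


for hours in (0, 4, 8):
    print(f"{hours} hours -> predicted score {predict_score(hours)}")
```

At 0 hours the function returns the intercept (25), and each extra hour adds the slope (5), so 8 hours gives 65.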

**Question 5:** What is the method of least squares in linear regression?

**Answer:** The method of least squares is the most common technique used to find the best-fitting line in a linear regression model.

Its main goal is to minimize the difference between the actual values and the predicted values from the regression line.

**Concept:**

In Simple Linear Regression, the model is:

$$Y = a + bX + e$$


- Y = Actual dependent variable

- a + bX = Predicted value of Y

- e = Error (difference between actual and predicted Y)

The method of least squares chooses values of a (intercept) and b (slope) so that the sum of squared errors (residuals) is as small as possible.
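For a single predictor, the least-squares solution has a standard closed form: the slope is b = Σ(Xᵢ − X̄)(Yᵢ − Ȳ) / Σ(Xᵢ − X̄)² and the intercept is a = Ȳ − bX̄. The sketch below applies these formulas to the small sample dataset from Question 9 and then checks that nudging the slope away from the fitted value increases the sum of squared errors. The function names are chosen for illustration.

```python
from statistics import mean


def least_squares_fit(x, y):
    """Return (intercept, slope) that minimise the sum of squared errors."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared differences between actual and predicted Y."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Sample data from Question 9 (hours studied, exam score)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = least_squares_fit(x, y)
print("Intercept (a):", round(a, 4))
print("Slope (b):", round(b, 4))

# Any other line should have a larger sum of squared errors
sse_fit = sum_squared_errors(x, y, a, b)
sse_nudged = sum_squared_errors(x, y, a, b + 0.1)
print("SSE at the least-squares fit:", round(sse_fit, 4))
print("SSE with the slope nudged by 0.1:", round(sse_nudged, 4))
```

The fitted values here are a slope of 0.6 and an intercept of 2.2, which is what a library fit on the same data should also report, up to floating-point rounding.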

**Question 6:** What is Logistic Regression? How does it differ from Linear Regression?

**Answer:** Logistic Regression is a statistical method used to predict a categorical (usually binary) outcome based on one or more independent variables.

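Instead of predicting a continuous value like in linear regression, logistic regression predicts the probability that an observation belongs to a particular class. It does this by passing the linear combination a + bX through the sigmoid (logistic) function, which maps any real number to a value between 0 and 1. Linear regression, by contrast, outputs an unbounded continuous value and is fitted by least squares, while logistic regression is typically fitted by maximum likelihood.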

**Example:**

Predicting whether a student passes (1) or fails (0) based on hours studied.
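A minimal sketch of how a logistic model turns hours studied into a pass probability is shown below. The intercept and slope are hypothetical values picked for illustration, not coefficients fitted to real data, and the 0.5 cut-off is one common choice of threshold.

```python
import math


def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def pass_probability(hours, intercept=-4.0, slope=1.0):
    """P(pass) for a given number of study hours (hypothetical coefficients)."""
    return sigmoid(intercept + slope * hours)


for hours in (1, 4, 8):
    p = pass_probability(hours)
    predicted_class = 1 if p >= 0.5 else 0  # 1 = pass, 0 = fail
    print(f"{hours} hours: P(pass) = {p:.3f}, predicted class = {predicted_class}")
```

Unlike the straight line in linear regression, the output never leaves the 0 to 1 range, however large or small the input.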

**Question 7:** Name and briefly describe three common evaluation metrics for regression models.


**Answer:**
1. Mean Absolute Error (MAE)

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| Y_i - \hat{Y}_i \right|$$


- What it means:

It measures the average absolute difference between actual and predicted values.

- Interpretation:

Lower MAE = better model accuracy.

- Example:

If MAE = 5, on average the model’s predictions are off by 5 units.

2. Mean Squared Error (MSE)

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2$$

- What it means:

It measures the average of squared errors between actual and predicted values.

- Key point:

Larger errors are penalized more because errors are squared.

- Interpretation:

A smaller MSE means the model fits the data better.

3. R-squared (Coefficient of Determination)

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

Where:

- $SS_{res} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ → Residual sum of squares

- $SS_{tot} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$ → Total sum of squares

- What it means:

R² shows how much of the variation in the dependent variable (Y) is explained by the model.

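- Range: 0 to 1 for a least-squares fit with an intercept, evaluated on the data it was fitted to. It can be negative when a model predicts worse than simply using the mean of Y (for example, on unseen data).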

- R² = 1: Perfect fit

- R² = 0: Model explains nothing
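The three metrics can be computed directly from their formulas. The sketch below uses plain Python with small illustrative values: the actual values are the sample scores from Question 9, and the predictions are assumed to come from the line Y = 2.2 + 0.6X evaluated at X = 1 to 5.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(yt - yp) for yt, yp in zip(y_true, y_pred))


def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return mean((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values: actual scores and predictions from Y = 2.2 + 0.6X
y_true = [2, 4, 5, 4, 5]
y_pred = [2.8, 3.4, 4.0, 4.6, 5.2]

print("MAE:", round(mean_absolute_error(y_true, y_pred), 4))
print("MSE:", round(mean_squared_error(y_true, y_pred), 4))
print("R^2:", round(r_squared(y_true, y_pred), 4))
```

MAE stays in the units of Y, MSE is in squared units, and R² is unitless, which is why the three are usually reported together rather than compared against each other.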

**Question 8:** What is the purpose of the R-squared metric in regression analysis?

**Answer:** **Purpose of the R-squared Metric in Regression Analysis**

R-squared (R²), also called the coefficient of determination, measures how well the regression model explains the variation in the dependent variable (Y) using the independent variable(s) (X).

Definition:
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

Where:

- $SS_{res} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$ → Residual (unexplained) variation

- $SS_{tot} = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$ → Total variation in Y

Purpose:

To show how well the model fits the data.

It tells us the proportion of the variance in the dependent variable that can be explained by the independent variable(s).
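One way to see what "proportion of variance explained" means is to compare a model against the simplest possible baseline, which always predicts the mean of Y. The sketch below does this on illustrative values (the Question 9 sample scores, with predictions assumed to come from the line Y = 2.2 + 0.6X); the helper `r_squared` is written for this example.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Proportion of the total variation in y_true explained by y_pred."""
    y_bar = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1 - ss_res / ss_tot


y_true = [2, 4, 5, 4, 5]

# Three sets of predictions to compare (illustrative values)
perfect = list(y_true)                    # every prediction is exactly right
fitted_line = [2.8, 3.4, 4.0, 4.6, 5.2]   # from Y = 2.2 + 0.6X at X = 1..5
mean_only = [mean(y_true)] * len(y_true)  # baseline: always predict the mean

print("Perfect predictions:", round(r_squared(y_true, perfect), 4))
print("Fitted line:", round(r_squared(y_true, fitted_line), 4))
print("Mean-only baseline:", round(r_squared(y_true, mean_only), 4))
```

The baseline scores 0 because it leaves all of the variation unexplained, perfect predictions score 1, and the fitted line falls in between at 0.6, meaning it accounts for 60% of the variation in these scores.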

**Question 9:** Write Python code to fit a simple linear regression model using scikit-learn and print the slope and intercept.

(Include your Python code and output in the code box below.)


```python
# Import necessary libraries
import numpy as np
from sklearn.linear_model import LinearRegression

# Sample data
# X = independent variable (e.g., hours studied)
# Y = dependent variable (e.g., exam score)
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
Y = np.array([2, 4, 5, 4, 5])

# Create and fit the model
model = LinearRegression()
model.fit(X, Y)

# Print slope and intercept
print("Slope (b):", model.coef_[0])
print("Intercept (a):", model.intercept_)
```


**Question 10:** How do you interpret the coefficients in a simple linear regression model?



**Answer:** In a Simple Linear Regression model, the equation is:

$$Y = a + bX + e$$


Where:

- Y = Dependent variable (what you’re trying to predict)

- X = Independent variable (the predictor)

- a = Intercept

- b = Slope (coefficient of X)

- e = Error term

1. Intercept (a)

- The intercept represents the predicted value of Y when X = 0.

- It’s the point where the regression line crosses the Y-axis.

- Interpretation depends on context—sometimes it makes sense, sometimes it doesn’t (for example, when X = 0 is outside the data range).

Example:
If the regression equation for exam scores is

$$\text{Score} = 25 + 5X$$

then when study hours (X) = 0, the predicted score (Y) = 25. It means a student who doesn’t study at all might still score around 25 marks.

2. Slope (b)

- The slope shows how much Y changes for every one-unit change in X.

- It tells you the direction of the relationship and the size of the change in Y per unit of X:

- b > 0 → positive relationship (Y increases as X increases)

- b < 0 → negative relationship (Y decreases as X increases)

Example:
From the same equation,

$$\text{Score} = 25 + 5X$$

the slope (b = 5) means that for every 1 additional hour of study, the exam score increases by 5 points on average.