# **Assignment 2: Supervised Learning: Regression Models and Performance Metrics**

**1. What is Simple Linear Regression (SLR)? Explain its purpose.**


>Simple Linear Regression (SLR) is a statistical method used to model the relationship between two variables — one independent variable (X) and one dependent variable (Y) — by fitting a straight line through the data points.

>**Equation of Simple Linear Regression**

 >>Y = β₀ + β₁X + ε

>Where:

>>Y = Dependent variable (the one we want to predict)

>>X = Independent variable (the predictor)

>>β₀ = Intercept (value of Y when X = 0)

>>β₁ = Slope (change in Y for each unit change in X)

>>ε = Error term (the part of Y that the line does not explain; it is estimated by the residual, the difference between actual and predicted values)

>**Purpose of Simple Linear Regression**

>1. Prediction:
>>To predict the value of a dependent variable based on the value of an independent variable.

>>Example: Predicting a student’s exam score (Y) based on study hours (X).

>2. Relationship Identification:
>>To determine whether and how strongly two variables are linearly related.

>>Example: Checking if sales increase with advertisement spending.

>3. Trend Analysis:
>>To understand trends and make future forecasts from past data.

>4. Quantification of Impact:
>>To measure how much change in the independent variable affects the dependent variable.

>**Example**

>>Suppose we have data on hours studied (X) and marks scored (Y):

| Hours Studied (X) | Marks (Y) |
| ----------------- | --------- |
| 2                 | 50        |
| 4                 | 60        |
| 6                 | 70        |
| 8                 | 80        |



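>>Fitting a straight line to these points gives Y = 40 + 5X (every point in the table lies exactly on this line).

>>This means: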

>>The intercept (40) is the expected marks if no hours are studied.

>>The slope (5) means for every extra hour studied, marks increase by 5 points.
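>The intercept and slope quoted above can be checked numerically. The sketch below is one possible way to do it, assuming NumPy is available; the variable names are illustrative and `np.polyfit` is used here only as a convenient least-squares line fitter.

```python
import numpy as np

# Data from the table above: hours studied (X) and marks scored (Y).
hours = np.array([2, 4, 6, 8], dtype=float)
marks = np.array([50, 60, 70, 80], dtype=float)

# np.polyfit with degree 1 fits a straight line by least squares and
# returns the coefficients from highest power to lowest: [slope, intercept].
slope, intercept = np.polyfit(hours, marks, 1)

print(f"slope (beta_1)     = {slope:.2f}")
print(f"intercept (beta_0) = {intercept:.2f}")
```

>Because the four points are perfectly collinear, the fitted line reproduces the table exactly; with real, noisy data the estimates would only approximate the underlying trend.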

**2. What are the key assumptions of Simple Linear Regression?**

>The **key assumptions of Simple Linear Regression (SLR)** ensure that the model is valid and the predictions are reliable.


 >**1. Linearity**

>* The relationship between the **independent variable (X)** and the **dependent variable (Y)** is **linear**.
>* This means that each one-unit change in X changes the expected value of Y by the same fixed amount (the slope), wherever X starts from.
  >> *Example:* Each extra hour studied should add roughly the same number of marks.

 >>**Check:** Use scatter plots of X vs Y — the points should form a roughly straight-line pattern.



> **2. Independence of Errors**

>* The residuals (errors) should be **independent** of each other.
>* There should be **no autocorrelation** (i.e., one error should not depend on another).

 >>**Check:** Use the **Durbin–Watson test** to detect autocorrelation.



>**3. Homoscedasticity (Constant Variance)**

>* The variance of the residuals (errors) should be **constant** across all levels of X.
>* That is, the spread of errors should be roughly the same for all predicted values.

 >>>If the spread increases or decreases with X, it indicates **heteroscedasticity**.

 >>**Check:** Plot residuals vs fitted values — the points should be randomly scattered without a funnel shape.


>**4. Normality of Errors**

>* The residuals (errors) should follow a **normal distribution**.
>* This is important for accurate hypothesis testing and confidence intervals.

>> **Check:** Use a **Q-Q plot** or **histogram of residuals** — they should look approximately normal.

>**5. No Multicollinearity (Not applicable for SLR)**

>* Since SLR has only **one independent variable**, this assumption applies only to **multiple regression**, where the predictors must not be perfectly correlated with one another.
>* In SLR the only requirement on X is that it **varies**: if every X value were the same, the slope could not be estimated.
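>The checks listed above all start from the residuals of a fitted line. The following is a minimal sketch, assuming NumPy and a small made-up data set (the five points are illustrative, not from a real study). It computes the residuals and the Durbin-Watson statistic by hand; the scatter, residual-vs-fitted, and Q-Q plots would be drawn from the same arrays with a plotting library of your choice.

```python
import numpy as np

# Illustrative data (an assumption for this sketch).
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Fit the line and compute fitted values and residuals.
slope, intercept = np.polyfit(x, y, 1)
fitted = intercept + slope * x
residuals = y - fitted

# Independence of errors: Durbin-Watson statistic, defined as the sum of
# squared differences between successive residuals divided by the sum of
# squared residuals. Values near 2 suggest little first-order autocorrelation.
durbin_watson = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)

# With an intercept in the model, least-squares residuals sum to (almost) zero.
print("residuals:", np.round(residuals, 2))
print("sum of residuals:", round(float(residuals.sum()), 10))
print("Durbin-Watson:", round(float(durbin_watson), 3))
```

>With only five points these diagnostics are not informative; the sketch shows the mechanics, and meaningful checks need a reasonably large sample.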






**3. Write the mathematical equation for a simple linear regression model and explain each term.**


>The **mathematical equation** for a **Simple Linear Regression (SLR)** model is:

>>Y = β₀ + β₁X + ε




>**Explanation of Each Term**

| **Term**           | **Name**                        | **Meaning / Role**                                                                                                                                |
| ------------------ | ------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Y**              | Dependent Variable              | The variable we want to **predict or explain**. <br>Example: Marks scored, house price, etc.                                                      |
| **X**              | Independent Variable            | The variable used to **predict Y**. <br>Example: Hours studied, area of the house, etc.                                                           |
| **β₀ (Beta-zero)** | Intercept                       | The **value of Y when X = 0**. <br>It represents the baseline or starting value.                                                                  |
| **β₁ (Beta-one)**  | Slope or Regression Coefficient | Shows how much **Y changes** for a **one-unit increase in X**. <br>If β₁ = 5, then for every 1 increase in X, Y increases by 5 units.             |
| **ε (Epsilon)**    | Error Term                      | Represents the part of Y that the line does not explain: randomness or factors not included in the model. <br>Its observed counterpart is the **residual**, the **difference** between the **actual** and **predicted** Y values. |


>**Example**

>Suppose we are predicting **marks (Y)** based on **hours studied (X)**:

>>Y = 40 + 5X + ε


>Here:

>* **β₀ = 40** → Base marks if no hours are studied.
>* **β₁ = 5** → For every extra hour studied, marks increase by 5.
>* **ε** → Random error due to factors like mood, environment, or luck.
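>To see the role of the error term ε, the sketch below simulates data from Y = 40 + 5X + ε and then re-estimates the line. It assumes NumPy; the sample size, the range of X, and the noise level (normal errors with standard deviation 2) are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

beta0, beta1 = 40.0, 5.0                 # coefficients from the example above
hours = rng.uniform(0, 10, size=200)     # assumed range of study hours
noise = rng.normal(0, 2, size=200)       # assumed error term: mean 0, std 2
marks = beta0 + beta1 * hours + noise    # Y = beta0 + beta1 * X + epsilon

# Re-estimate the line from the noisy data.
est_slope, est_intercept = np.polyfit(hours, marks, 1)

print(f"estimated slope     = {est_slope:.3f}")
print(f"estimated intercept = {est_intercept:.3f}")
```

>The estimates should land close to 5 and 40 but not exactly on them, because ε scatters the points around the true line.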


**4. Provide a real-world example where simple linear regression can be applied.**


> **Real-World Example: Predicting House Prices**

>>**Scenario:**
A real estate company wants to **predict the price of a house (Y)** based on its **size in square feet (X)**.



>>**Application of Simple Linear Regression**

>>We collect past data like this:

| House Size (sq. ft) (X) | Price (₹ Lakhs) (Y) |
| ----------------------- | ------------------- |
| 800                     | 40                  |
| 1000                    | 50                  |
| 1200                    | 60                  |
| 1500                    | 75                  |
| 1800                    | 90                  |

>>Fitting a least-squares line to this data gives the equation:

>>>Y = 0 + 0.05X


>> **Interpretation:**

>>* **β₀ = 0** → The fitted line passes through the origin: a house with 0 sq. ft is (theoretically) predicted to cost ₹0. No house in the data is anywhere near 0 sq. ft, so the intercept should not be read as a real price.
>>* **β₁ = 0.05** → For every **1 sq. ft increase in area**, the **price increases by ₹0.05 lakhs (₹5,000)**.



>> **Use:**

>>If a customer wants to know the estimated price of a **1600 sq. ft house**:

>>Y = 0 + 0.05(1600) = 80 lakhs
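>Fitting the five rows of the table directly and predicting the price of a 1600 sq. ft house can be sketched as follows. This assumes NumPy; the variable names are illustrative.

```python
import numpy as np

# House size (sq. ft) and price (lakhs of rupees) from the table above.
size_sqft = np.array([800, 1000, 1200, 1500, 1800], dtype=float)
price_lakhs = np.array([40, 50, 60, 75, 90], dtype=float)

# Least-squares straight line: returns [slope, intercept].
slope, intercept = np.polyfit(size_sqft, price_lakhs, 1)

# Predicted price for a 1600 sq. ft house.
predicted_1600 = intercept + slope * 1600

print(f"slope     = {slope:.4f} lakhs per sq. ft")
print(f"intercept = {intercept:.4f} lakhs")
print(f"predicted price for 1600 sq. ft = {predicted_1600:.2f} lakhs")
```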




**5. What is the method of least squares in linear regression?**

>The **Method of Least Squares** is a **mathematical approach** used in **linear regression** to find the **best-fitting line** through a set of data points by **minimizing the sum of the squared errors (residuals)**.


>**Concept** :

>>In **Simple Linear Regression**, the equation of the line is:

>>>Y = β₀ + β₁X + ε

>>But the actual data points (Xᵢ, Yᵢ) will not all lie exactly on this line: there will be **errors** (differences) between the **actual values (Yᵢ)** and the **predicted values (Ŷᵢ)**.

>>These differences are called **residuals**:

>>>eᵢ = Yᵢ − Ŷᵢ

>**Goal of Least Squares**

>>The goal is to choose values of **β₀** and **β₁** that **minimize the sum of squared residuals (errors)**:

>>>Minimize $S = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} \bigl(Y_i - (\beta_0 + \beta_1 X_i)\bigr)^2$


>>This ensures the regression line is the one that fits the data **as closely as possible**.

> **Formulas for Coefficients**

>>By solving the minimization equations, we get:

>>>$\beta_1 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$

>>>$\beta_0 = \bar{Y} - \beta_1 \bar{X}$

>>Where:

>>* $\bar{X}$ = Mean of X values
>>* $\bar{Y}$ = Mean of Y values
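>The closed-form solution can be coded directly: the slope is the sum of the products of the X and Y deviations from their means, divided by the sum of the squared X deviations, and the intercept follows from the two means. The sketch below assumes NumPy; the function name `least_squares_line` and the sample data are illustrative.

```python
import numpy as np

def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line through (x, y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: sum of cross-deviations over sum of squared X deviations.
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: forces the line through the point of means (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return slope, intercept

slope, intercept = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

>For this sample the result is a slope of 0.6 and an intercept of 2.2, which matches what a library fit on the same points returns.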

> **Intuitive Understanding**

>>* “Least Squares” means we’re **making the total squared distance** between actual and predicted Y values **as small as possible**.
>>* Squaring ensures that positive and negative errors don’t cancel out and gives more weight to larger errors.


>**Example**

| X (Hours studied) | Y (Marks) | Predicted Ŷ | Residual (Y–Ŷ) | (Y–Ŷ)² |
| ----------------- | --------- | ----------- | -------------- | ------ |
| 2                 | 50        | 52          | -2             | 4      |
| 4                 | 60        | 61          | -1             | 1      |
| 6                 | 70        | 70          | 0              | 0      |
| 8                 | 80        | 79          | 1              | 1      |

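>The Predicted column above comes from a candidate line Ŷ = 43 + 4.5X, for which the total of the last column is **∑(Y–Ŷ)² = 6**. The **least squares line** is the one that makes this total as small as possible; for this data that is Ŷ = 40 + 5X, which passes through every point and gives **∑(Y–Ŷ)² = 0**.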
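>The sum of squared residuals can be compared across candidate lines. The sketch below, assuming NumPy, evaluates the line that generates the Predicted column in the table (Ŷ = 43 + 4.5X) and the line Ŷ = 40 + 5X on the same four points; the helper name `sum_squared_residuals` is illustrative.

```python
import numpy as np

hours = np.array([2, 4, 6, 8], dtype=float)
marks = np.array([50, 60, 70, 80], dtype=float)

def sum_squared_residuals(intercept, slope):
    """Sum of squared differences between actual and predicted marks."""
    predicted = intercept + slope * hours
    return float(np.sum((marks - predicted) ** 2))

sse_table_line = sum_squared_residuals(43.0, 4.5)  # line behind the table's Predicted column
sse_exact_line = sum_squared_residuals(40.0, 5.0)  # line through all four points

print("SSE for Y = 43 + 4.5X:", sse_table_line)
print("SSE for Y = 40 + 5X:  ", sse_exact_line)
```

>Least squares selects whichever intercept and slope give the smaller total, so here it selects the second line.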




**6. What is Logistic Regression? How does it differ from Linear Regression?**

> **Logistic Regression**

>>**Logistic Regression** is a **supervised machine learning algorithm** used for **classification problems**, where the **dependent variable (Y)** is **categorical** — usually **binary (0 or 1)**.

>>It predicts the **probability** that an observation belongs to a particular class (e.g., “Yes” or “No”, “Spam” or “Not Spam”).

<br>


>**Mathematical Form**

>>Unlike Linear Regression, which predicts a continuous value, Logistic Regression predicts a **probability (p)** using the **sigmoid (logistic) function**:


>>>$p = \dfrac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$


>>Where:

>>* **p** = Probability that Y = 1
>>* **β₀, β₁** = Model coefficients
>>* **e** = Euler's number, the base of the natural logarithm (≈ 2.718)

>The predicted probability (p) is always between **0 and 1**.
To classify, we set a **threshold** (usually 0.5):

>>* If p ≥ 0.5, predict **1** (Yes/True)
>>* If p < 0.5, predict **0** (No/False)
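>The sigmoid and the 0.5 threshold can be sketched in a few lines. This assumes NumPy; the coefficients β₀ = −4 and β₁ = 1 are hypothetical values chosen for illustration, not fitted from data.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients (an assumption for this sketch).
beta0, beta1 = -4.0, 1.0

study_hours = np.array([1.0, 3.0, 4.0, 6.0])
probabilities = sigmoid(beta0 + beta1 * study_hours)

# Apply the 0.5 threshold: p >= 0.5 is class 1, otherwise class 0.
predictions = (probabilities >= 0.5).astype(int)

for h, p, c in zip(study_hours, probabilities, predictions):
    print(f"hours={h:.0f}  p={p:.3f}  predicted class={c}")
```

>With these coefficients the probability is exactly 0.5 at 4 hours, so 4 hours is the decision boundary between predicting fail (0) and pass (1).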

<br>

>**Purpose**

>>To model the **relationship between input variables (X)** and the **probability of an event occurring (Y=1)**.

>> Example: Predict whether a student will pass (1) or fail (0) based on study hours.

<br>


> **Difference Between Linear and Logistic Regression**

| **Aspect**               | **Linear Regression**                          | **Logistic Regression**                                              |
| ------------------------ | ---------------------------------------------- | -------------------------------------------------------------------- |
| **Purpose**              | Predicts a **continuous** value                | Predicts a **categorical** outcome (usually binary)                  |
| **Output Range**         | Output can be **any real number** (−∞ to +∞)   | Output is a **probability between 0 and 1**                          |
| **Equation**             | $Y = \beta_0 + \beta_1 X$                      | $p = \dfrac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$                      |
| **Type of Relationship** | Models **linear** relationship between X and Y | Models **non-linear** relationship using **sigmoid curve**           |
| **Use Case Example**     | Predicting house prices, sales, temperature    | Predicting if a customer will buy (Yes/No), disease (Present/Absent) |
| **Error Function Used**  | Mean Squared Error (MSE)                       | Log Loss (Cross-Entropy)                                             |
| **Nature of Model**      | Regression (continuous output)                 | Classification (discrete output)                                     |
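>The log loss named in the table can be computed by hand to see how it behaves. The sketch below assumes NumPy; the labels and predicted probabilities are made up for illustration, and the small `eps` clip is a common guard against taking log(0).

```python
import numpy as np

def log_loss(y_true, p, eps=1e-12):
    """Average cross-entropy between binary labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

labels = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.9, 0.1]  # high probability on the true class
confident_wrong = [0.1, 0.9, 0.1, 0.9]  # high probability on the wrong class

loss_right = log_loss(labels, confident_right)
loss_wrong = log_loss(labels, confident_wrong)

print(f"log loss, confident and right: {loss_right:.4f}")
print(f"log loss, confident and wrong: {loss_wrong:.4f}")
```

>Confident wrong predictions are penalised far more heavily than confident right ones are rewarded, which is why log loss suits probability outputs.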

<br>

>**Graphical View**

>* **Linear Regression:** Produces a **straight line** prediction
>* **Logistic Regression:** Produces an **S-shaped (sigmoid) curve**



**7. Name and briefly describe three common evaluation metrics for regression models.**

>**Three common evaluation metrics** used to assess the performance of **regression models** are Mean Absolute Error (MAE), Mean Squared Error (MSE), and R-squared (R²).

<br>

>**1. Mean Absolute Error (MAE)**

>>$\text{MAE} = \dfrac{1}{n} \sum_{i=1}^{n} \lvert Y_i - \hat{Y}_i \rvert$


>**Description:**

>* MAE measures the **average absolute difference** between the actual values (**Yᵢ**) and the predicted values (**Ŷᵢ**).
>* It gives an idea of how **far predictions are from actual values**, on average.
>* It treats all errors **equally**, regardless of direction.

>**Interpretation:**

>* Lower MAE → Better model performance.
>* Example: MAE = 3 means, on average, predictions are off by 3 units.

<br>

> **2. Mean Squared Error (MSE)**


>>$\text{MSE} = \dfrac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$


>**Description:**

>* MSE measures the **average of the squared differences** between actual and predicted values.
>* It gives **more weight to large errors**, because errors are **squared**.
>* It’s useful when **large errors are particularly undesirable**.

> **Interpretation:**

>* Smaller MSE = More accurate model.
>* Sensitive to outliers due to squaring.

<br>

>**3. R-squared (Coefficient of Determination)**

>>$R^2 = 1 - \dfrac{\sum (Y_i - \hat{Y}_i)^2}{\sum (Y_i - \bar{Y})^2}$

>**Description:**

>* R² measures how well the regression line **explains the variability** in the dependent variable (Y).
>* It represents the **proportion of variance** in Y that is explained by X.

 >**Interpretation:**

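>* For a least-squares line with an intercept, evaluated on the data it was fitted to, R² ranges from **0 to 1** (on new data it can be negative if the model predicts worse than the mean of Y):

>>* **0** → Model explains none of the variability
>>* **1** → Model perfectly explains all variability
>* Example: R² = 0.85 → 85% of the variation in Y is explained by X.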
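>All three metrics can be computed in a few lines. The sketch below assumes NumPy and reuses the sample Y values and predicted values from the scikit-learn example in question 9; R² is computed as one minus the residual sum of squares divided by the total sum of squares around the mean.

```python
import numpy as np

# Actual and predicted values from the question 9 example.
y_true = np.array([2, 4, 5, 4, 5], dtype=float)
y_pred = np.array([2.8, 3.4, 4.0, 4.6, 5.2])

errors = y_true - y_pred

mae = float(np.mean(np.abs(errors)))   # average absolute error
mse = float(np.mean(errors ** 2))      # average squared error

ss_res = np.sum(errors ** 2)                    # unexplained variation
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
r2 = float(1 - ss_res / ss_tot)

print(f"MAE = {mae:.2f}")
print(f"MSE = {mse:.2f}")
print(f"R^2 = {r2:.2f}")
```

>For this sample the values are MAE = 0.64, MSE = 0.48 and R² = 0.60. MAE is in the units of Y and MSE is in squared units, while R² has no units.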




**8. What is the purpose of the R-squared metric in regression analysis?**

> **Purpose of R-squared Metric in Regression Analysis**

>>**R-squared (R²)**, also known as the **coefficient of determination**, measures how well the **regression model explains the variability** of the dependent variable (**Y**) based on the independent variable(s) (**X**).

>>$R^2 = 1 - \dfrac{\sum (Y_i - \hat{Y}_i)^2}{\sum (Y_i - \bar{Y})^2}$
<br>

> **Key Points**

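>* **For a least-squares fit evaluated on its own training data, the R² value ranges from 0 to 1:**

  >>* **0** → Model explains none of the variation in Y.
  >>* **1** → Model perfectly explains all the variation in Y.
>* It represents the **proportion of variance** in the dependent variable that is **explained by the independent variable(s)**.
>* Higher **R²** indicates a **better fit to the data**, although it does not by itself show that the model is correct or that it will predict new data well.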

<br>

>  **Example**

>>If a regression model has **R² = 0.85**, it means:

>>>85% of the variation in the dependent variable is explained by the model, and the remaining 15% is due to other unexplained factors (errors or noise).
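>The reference points of the R² scale can be reproduced with three sets of predictions: perfect ones, the mean of Y for every observation, and deliberately poor ones. The sketch assumes NumPy; the helper name `r_squared` and the sample values are illustrative.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1 - ss_res / ss_tot)

y = np.array([2, 4, 5, 4, 5], dtype=float)

r2_perfect = r_squared(y, y)                       # predictions equal the actual values
r2_mean = r_squared(y, np.full_like(y, y.mean()))  # always predict the mean of Y
r2_poor = r_squared(y, y[::-1])                    # predictions in reversed order

print("perfect predictions:", r2_perfect)
print("mean-only predictions:", r2_mean)
print("poor predictions:", r2_poor)
```

>Predicting the mean gives R² = 0, which is the baseline the metric compares against; predictions worse than that baseline push R² below zero.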




**9. Write Python code to fit a simple linear regression model using scikit-learn and print the slope and intercept.**

```python
# Import necessary libraries
import numpy as np
from sklearn.linear_model import LinearRegression

# Sample data
# X = independent variable (scikit-learn expects a 2-D array, hence the reshape)
# Y = dependent variable
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
Y = np.array([2, 4, 5, 4, 5])

# Create and fit the linear regression model
model = LinearRegression()
model.fit(X, Y)

# Print the slope (coefficient) and intercept
print("Slope (β₁):", model.coef_[0])
print("Intercept (β₀):", model.intercept_)

# Optional: print predicted values
Y_pred = model.predict(X)
print("\nPredicted values:", Y_pred)
```

Output:

```
Slope (β₁): 0.6
Intercept (β₀): 2.2

Predicted values: [2.8 3.4 4.  4.6 5.2]
```


**10. How do you interpret the coefficients in a simple linear regression model?**

> **Interpretation of Coefficients in a Simple Linear Regression Model**

>>A **Simple Linear Regression** model is represented as:


>>>Y = β₀ + β₁X + ε


>>Where:

>>* **Y** → Dependent (response) variable
>>* **X** → Independent (predictor) variable
>>* **β₀** → Intercept
>>* **β₁** → Slope (coefficient of X)
>>* **ε** → Error term


> **1. Intercept (β₀)**

>* It represents the **predicted value of Y when X = 0**.
>* In other words, it is the **starting point** or **baseline value** of the dependent variable.

>> Example: If β₀ = 40, it means when X = 0, Y is expected to be 40.


> **2. Slope (β₁)**

>* It represents the **change in Y** for a **one-unit increase in X**.
>* Its **sign** gives the **direction** of the relationship, and its **size** gives the amount of change in Y per unit of X:

  >>* If **β₁ > 0**, Y increases as X increases.
  >>* If **β₁ < 0**, Y decreases as X increases.

>> Example: If β₁ = 5, then for every additional unit increase in X, Y increases by 5 units (on average).


> **Example**

>>If the fitted model is:

>>>Y = 40 + 5X


>>* **β₀ = 40** → When X = 0, predicted Y = 40.
>>* **β₁ = 5** → For each 1-unit increase in X, Y increases by 5.
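>Both interpretations can be read straight off the prediction function. The sketch below is plain Python; the function name `predict` is illustrative and the coefficients are the ones from the example above.

```python
def predict(x, intercept=40.0, slope=5.0):
    """Predicted Y for the fitted line Y = 40 + 5X."""
    return intercept + slope * x

# Intercept: the prediction when X is 0.
value_at_zero = predict(0)

# Slope: the change in the prediction when X increases by one unit,
# which is the same wherever X starts from.
change_from_2_to_3 = predict(3) - predict(2)
change_from_7_to_8 = predict(8) - predict(7)

print("prediction at X = 0:", value_at_zero)
print("change from X = 2 to 3:", change_from_2_to_3)
print("change from X = 7 to 8:", change_from_7_to_8)
```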


