<a href="https://colab.research.google.com/github/Mangesh0309/ML_Basics/blob/main/LogisticRegression.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Supervised Learning: Regression Models and Performance Metrics


1.  What is Simple Linear Regression (SLR)? Explain its purpose.

Simple Linear Regression (SLR) is one of the most fundamental and widely used algorithms in Supervised Learning, particularly for predictive modeling involving continuous numerical data. It establishes a linear relationship between one independent variable (predictor) and one dependent variable (target).

Simple Linear Regression is a statistical method that models the relationship between two variables by fitting a straight line to the observed data.

The general equation of the SLR model is:
$$Y = \beta_0 + \beta_1 X + \epsilon$$

Where:

Y = Dependent variable (the value we want to predict)

X = Independent variable (the input or predictor)

$\beta_0$ = Intercept (value of Y when X = 0)

$\beta_1$ = Slope (rate of change of Y with respect to X)

$\epsilon$ = Error term (difference between actual and predicted values)

The algorithm finds the best-fitting line through the data points by minimizing the sum of squared errors (SSE) between the actual and predicted values of Y. This method is called the Least Squares Method.


 - Purpose of Simple Linear Regression

The primary purposes of SLR are:

a) Prediction

To predict the value of a dependent variable based on the known value of an independent variable.
Example: Predicting house price (Y) based on area in square feet (X).

b) Relationship Analysis

To measure and understand the strength and direction of the relationship between two continuous variables.
If the slope ($\beta_1$) is positive → the relationship is direct;
if negative → the relationship is inverse.

c) Trend Analysis

Used to identify and forecast trends over time, such as sales, population growth, or temperature change.

d) Feature Importance

Helps determine whether the independent variable significantly influences the dependent variable, a key step before building more complex models.

 - Example
 Dataset :

 | Hours Studied (X) | Marks Scored (Y) |
| ----------------- | ---------------- |
| 1                 | 25               |
| 2                 | 45               |
| 3                 | 50               |
| 4                 | 60               |
| 5                 | 80               |

Model Equation (after fitting by least squares):

$$\hat{Y} = 14.5 + 12.5X$$

Interpretation:

Intercept $\beta_0 = 14.5$: A student who studies 0 hours is predicted to score 14.5 marks.

Slope $\beta_1 = 12.5$: For every additional hour of study, marks increase by 12.5 on average.
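As a check on the fitted line, the following minimal sketch computes the least-squares slope and intercept for the table above with NumPy (about 12.5 and 14.5). The 3.5-hour student is a hypothetical input added only to illustrate prediction.

```python
import numpy as np

# Data from the table above
hours = np.array([1, 2, 3, 4, 5], dtype=float)
marks = np.array([25, 45, 50, 60, 80], dtype=float)

# A degree-1 polynomial fit is the least-squares straight line.
# np.polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(hours, marks, deg=1)

print(f"slope     = {slope:.2f}")
print(f"intercept = {intercept:.2f}")

# Hypothetical input: predicted marks for 3.5 hours of study
predicted = intercept + slope * 3.5
print(f"predicted marks for 3.5 hours = {predicted:.2f}")
```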

 - Simple Linear Regression is an essential technique in data science and machine learning. It is:

    - Easy to interpret

    - Fast to compute

    - Useful for understanding variable relationships

It forms the foundation for multiple linear regression and other advanced regression models used in modern analytics.

2. What are the key assumptions of Simple Linear Regression?

Simple Linear Regression relies on several statistical assumptions to ensure that the estimated regression line is valid, unbiased, and reliable. If these assumptions are violated, the model's predictions and statistical inferences may become inaccurate.

Below are the key assumptions, explained clearly:

 - Linearity

The relationship between the independent variable (X) and the dependent variable (Y) must be linear.

$$Y = \beta_0 + \beta_1 X + \epsilon$$

Meaning:
A straight line should best describe how Y changes with X.

Violation example:
If the true relationship is curved, SLR will not fit well.

 - Independence of Errors

The residuals (errors) should be independent of each other.

$\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ are independent

Meaning:
One prediction error should not depend on another.

Violation example:
Time-series data often violates this (autocorrelation).

 - Homoscedasticity (Constant Variance of Errors)

The variance of residuals should remain constant across all values of X.

Meaning:
The spread of errors should not increase or decrease with X.

 - Normality of Residuals

Residuals should follow a normal distribution.

$$\epsilon \sim N(0, \sigma^2)$$

Meaning:
A histogram of the residuals should look roughly bell-shaped.

Why needed:
For hypothesis tests, confidence intervals, and significance tests (such as p-values) to be valid.

 - No (or minimal) Outliers

Outliers can strongly influence the regression line.

Meaning:
A few very large or very small points can pull the line and distort results.

Example:
A single extreme value can drastically change the slope.

 - The Error Term Has Mean Zero

The average of all residuals should be zero.

$$E(\epsilon) = 0$$

This ensures that the regression line correctly represents the central trend of the data.
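Several of these assumptions can be checked numerically from the residuals. The sketch below uses synthetic data (a hypothetical true line plus Gaussian noise, not data from this notebook) and a few simple rule-of-thumb checks. The Durbin-Watson statistic is computed by hand from its definition, and the half-split variance comparison is only a rough stand-in for a formal homoscedasticity test.

```python
import numpy as np

# Hypothetical synthetic data: a true line plus Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 + 2.0 * x + rng.normal(0.0, 1.0, size=x.size)

# Fit the line and compute residuals (actual minus predicted)
slope, intercept = np.polyfit(x, y, deg=1)
residuals = y - (intercept + slope * x)

# Mean-zero errors: a least-squares fit with an intercept makes this ~0
mean_residual = residuals.mean()

# Independence: Durbin-Watson statistic, always between 0 and 4;
# values near 2 suggest little first-order autocorrelation
durbin_watson = np.sum(np.diff(residuals) ** 2) / np.sum(residuals ** 2)

# Homoscedasticity (rough check): residual spread for small vs large x
half = x.size // 2
spread_low = residuals[:half].std()
spread_high = residuals[half:].std()

# Outliers: residuals more than 3 standard deviations from zero
n_outliers = int(np.sum(np.abs(residuals) > 3 * residuals.std()))

print(f"mean residual      = {mean_residual:.2e}")
print(f"Durbin-Watson      = {durbin_watson:.2f}")
print(f"spread (low x)     = {spread_low:.2f}")
print(f"spread (high x)    = {spread_high:.2f}")
print(f"possible outliers  = {n_outliers}")
```

In practice these numbers are usually read alongside a residual-versus-X plot and a histogram of the residuals.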

3. Write the mathematical equation for a simple linear regression model and
explain each term.


1. Mathematical Equation

The general equation for a Simple Linear Regression (SLR) model is:

$$Y = \beta_0 + \beta_1 X + \epsilon$$

This equation represents a straight-line relationship between the independent variable (X) and the dependent variable (Y).

2. Explanation of Each Term

 - Y: Dependent Variable (Output)

Also called the response or target variable.

It is the variable we want to predict or estimate using the regression model.

Example: Predicting salary based on years of experience → Salary = Y.

 - X: Independent Variable (Input)

Also known as the predictor, feature, or explanatory variable.

It is used to explain the variation in Y.

Example: Years of experience = X.

 - $\beta_0$: Intercept

Also called the constant term.

It is the predicted value of Y when X = 0.

Graphically, it is the point where the regression line intersects the Y-axis.

Example:
If $\beta_0 = 5$, then when X = 0, predicted Y = 5.

 - $\beta_1$: Slope Coefficient

Represents the change in Y for every one-unit change in X.

Indicates the direction and steepness of the relationship.

If $\beta_1 > 0$ → Positive relationship

If $\beta_1 < 0$ → Negative relationship

Example:
If $\beta_1 = 3$, then for every 1 unit increase in X, Y increases by 3 units.

 - $\epsilon$: Error Term (Residual)

Represents the difference between the actual observed value and the predicted value.

Accounts for:

randomness

measurement errors

unobserved factors

model imperfections

 - Example

Suppose the fitted model is:
$$\hat{Y} = 20 + 4X$$

Interpretation:

$\beta_0 = 20$: When X = 0, predicted Y = 20

$\beta_1 = 4$: For each 1 unit increase in X, predicted Y increases by 4

The error term $\epsilon$ accounts for differences between predictions and actual values

 - Conclusion

The equation of Simple Linear Regression provides a mathematical way to understand and predict the dependent variable based on the independent variable while accounting for errors. Each term in the equation has a clear statistical meaning and contributes to understanding how input variables influence outcomes.

4. Provide a real-world example where simple linear regression can be
applied.


Simple Linear Regression is widely used in various fields such as economics, engineering, business, healthcare, and education. It is most useful when we want to predict a continuous output based on one key input variable.

Below is a strong real-world example with explanation.

 - Context

In real estate, the price of a house often depends on its area. Larger houses usually cost more. This creates a linear relationship between:

X: Area of the house (in square feet)

Y: Price of the house (in ₹ lakhs)

This makes it an ideal scenario for applying Simple Linear Regression.

 - Problem Statement

A real estate company wants to predict the price of a house based on its size. They collect data on previously sold houses:

| Area (sq ft) | Price (₹ lakhs) |
| ------------ | --------------- |
| 600          | 35              |
| 800          | 50              |
| 1000         | 65              |
| 1200         | 80              |
| 1500         | 100             |

 - Applying Simple Linear Regression

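Using the data, we fit the SLR model:

$$Y = \beta_0 + \beta_1 X$$

Assume, for illustration, that training yields the following model (rounded, illustrative coefficients rather than the exact least-squares fit to the table above):

$$\text{Price} = 5 + 0.06 \times \text{Area}$$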

 - Interpretation of the Model

Intercept (β₀ = 5)
When the area = 0, the baseline price is ₹5 lakhs (this may represent land or basic cost).

Slope (β₁ = 0.06)
For every 1 sq ft increase in area, the price increases by ₹0.06 lakhs (₹6,000).

 - Prediction Example

For a house of 1100 sq ft:

$\text{Price} = 5 + (0.06 \times 1100) = 5 + 66 = 71$, i.e. ₹71 lakhs

The model provides a quick, data-driven estimate of house price.
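The coefficients 5 and 0.06 are assumed values. As a rough comparison, the sketch below computes the exact least-squares line for the five houses in the table. It gives a slope of about 0.0725 lakhs per sq ft and an intercept of about −8 lakhs, so the prediction for 1100 sq ft comes out close to the worked example even though the coefficients differ.

```python
import numpy as np

# Data from the table above
area = np.array([600, 800, 1000, 1200, 1500], dtype=float)   # sq ft
price = np.array([35, 50, 65, 80, 100], dtype=float)         # lakhs

# Least-squares straight line: coefficients come back as (slope, intercept)
slope, intercept = np.polyfit(area, price, deg=1)

print(f"slope     = {slope:.4f} lakhs per sq ft")
print(f"intercept = {intercept:.2f} lakhs")

# Prediction for the 1100 sq ft house from the worked example
predicted = intercept + slope * 1100
print(f"predicted price for 1100 sq ft = {predicted:.2f} lakhs")
```

A negative intercept here is a reminder that X = 0 lies far outside the observed range of areas, so the intercept should not be read as a real price.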

 - Why This is a Good Real-World Use Case

    - House area and price often show a strong, roughly linear relationship.

    - Helps real estate agents, buyers, and sellers make informed decisions.

    - Useful for forecasting, budgeting, and market analysis.

    - Simple Linear Regression is easy to explain and implement.

 - Conclusion

A highly practical real-world application of Simple Linear Regression is predicting house prices based on the house area. The model helps estimate prices, understand trends, and support decision-making in the real estate industry.

5.  What is the method of least squares in linear regression?


The Method of Least Squares is the most commonly used technique to estimate the parameters (β₀ and β₁) of a Simple Linear Regression model. It determines the best-fitting line through a set of data points by minimizing the sum of the squared differences between the observed values and the predicted values.

The Least Squares Method finds the regression line:

$$\hat{Y} = \beta_0 + \beta_1 X$$

by minimizing the Sum of Squared Errors (SSE):

$$SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$

Here,

$Y_i$ = Actual value

$\hat{Y}_i$ = Predicted value

$(Y_i - \hat{Y}_i)$ = Residual or error

 - Objective of Least Squares

To choose values of β₀ (intercept) and β₁ (slope) such that:

SSE is minimum

The line that minimizes these squared errors is called the best-fit line.
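For a single predictor, the minimizing values have a standard closed form: the slope is the sum of (x − x̄)(y − ȳ) divided by the sum of (x − x̄)², and the intercept is ȳ − slope · x̄. The sketch below implements these formulas. The helper names `least_squares_fit` and `sse` and the small data set are illustrative choices, not part of any library.

```python
import numpy as np

def least_squares_fit(x, y):
    """Return (intercept, slope) that minimize the sum of squared errors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    intercept = y_mean - slope * x_mean
    return intercept, slope

def sse(x, y, intercept, slope):
    """Sum of squared differences between actual and predicted values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum((y - (intercept + slope * x)) ** 2))

# Small illustrative data set
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares_fit(x, y)
best = sse(x, y, b0, b1)

# Moving a parameter away from the fitted value can only raise the SSE
worse = sse(x, y, b0 + 0.1, b1)

print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
print(f"SSE at the fitted line  = {best:.2f}")
print(f"SSE with intercept +0.1 = {worse:.2f}")
```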

 - Importance of Least Squares Method

✔ Simple and effective

✔ Provides unbiased estimates (under assumptions)

✔ Used in machine learning, statistics, economics

✔ Forms the base for advanced regression models

 - Conclusion

The Method of Least Squares is the core technique in linear regression. It determines the best-fit line by minimizing the sum of squared differences between actual and predicted values. When the model assumptions hold, it gives unbiased estimates of the regression parameters and a sound basis for prediction.

6. What is Logistic Regression? How does it differ from Linear Regression?


Logistic Regression is a supervised machine learning algorithm used for classification tasks, not regression.
It predicts the probability that a given input belongs to a particular category (usually binary classification: 0 or 1).

Examples:

Will a customer buy the product? (Yes/No)

Is the email spam? (Spam/Not Spam)

Will a student pass the exam? (Pass/Fail)

Mathematical Model

Instead of modeling a straight line like linear regression, logistic regression uses the sigmoid (logistic) function:

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$$

Where:

p = Probability that Y = 1

Output range = 0 to 1

The S-shaped curve ensures predictions are always probability values
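The sigmoid function can be sketched in a few lines of NumPy. The coefficients below are hypothetical values chosen so that the probability crosses 0.5 at X = 4; they are not fitted to any data.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical coefficients, chosen only for illustration
beta_0, beta_1 = -4.0, 1.0

x = np.array([0.0, 2.0, 4.0, 6.0, 8.0])
p = sigmoid(beta_0 + beta_1 * x)

for xi, pi in zip(x, p):
    print(f"x = {xi:.0f} -> p = {pi:.3f}")
```

The probability is exactly 0.5 where β₀ + β₁X = 0, and it rises toward 1 as X increases because β₁ is positive.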

| **Feature**                 | **Linear Regression**                    | **Logistic Regression**                                             |
| --------------------------- | ---------------------------------------- | ------------------------------------------------------------------- |
| **Purpose**                 | Predicts **continuous numerical values** | Predicts **probability** and classifies into categories (e.g., 0/1) |
| **Type of Problem**         | Regression                               | Classification                                                      |
| **Output Range**            | (−∞, +∞)                                 | 0 to 1 (probability)                                                |
| **Model Equation**          | $Y = \beta_0 + \beta_1 X$                | $p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$                      |
| **Relationship Assumed**    | Linear relationship between X and Y      | Linear relationship between X and **log-odds**                      |
| **Curve/Graph Shape**       | Straight line                            | S-shaped (sigmoid curve)                                            |
| **Loss Function**           | Mean Squared Error (MSE)                 | Log Loss / Cross-Entropy                                            |
| **Estimation Method**       | Ordinary Least Squares (OLS)             | Maximum Likelihood Estimation (MLE)                                 |
| **Output Interpretation**   | Direct predicted value                   | Probability → converted to class using threshold (e.g., 0.5)        |
| **Use Cases**               | House price, salary, sales forecasting   | Spam detection, disease prediction, churn prediction                |
| **Dependent Variable Type** | Continuous                               | Categorical (binary/multi-class)                                    |
| **Error Distribution**      | Assumes normally distributed errors      | Assumes a Bernoulli (binomial) distributed response                 |
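The contrast in output range can be seen by fitting both models to the same binary target. This is a minimal sketch using scikit-learn's `LinearRegression` and `LogisticRegression` with default settings. The pass/fail data set is hypothetical and far too small for real use.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Hypothetical data: hours studied and pass (1) / fail (0)
hours = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
passed = np.array([0, 0, 0, 1, 0, 1, 1, 1])

linear = LinearRegression().fit(hours, passed)
logistic = LogisticRegression().fit(hours, passed)

new_hours = np.array([[0], [2], [7], [12]])

# Linear regression output is unbounded and can fall outside [0, 1]
linear_out = linear.predict(new_hours)

# Logistic regression gives the probability of class 1, always within [0, 1]
probs = logistic.predict_proba(new_hours)[:, 1]

# predict() assigns class 1 when that probability exceeds 0.5
labels = logistic.predict(new_hours)

for h, lin, pr, lab in zip(new_hours.ravel(), linear_out, probs, labels):
    print(f"hours = {h:>4}: linear = {lin:6.2f}, probability = {pr:.2f}, class = {lab}")
```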


7. Name and briefly describe three common evaluation metrics for regression
models.


Regression models are evaluated using error-based metrics that measure how close the model's predictions are to the actual values. Three commonly used metrics are:

1. Mean Absolute Error (MAE)

Definition:
MAE measures the average of the absolute differences between the actual values and the predicted values.

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |Y_i - \hat{Y}_i|$$

Meaning:

It tells how far the model's predictions deviate from the actual values on average.

Less affected by outliers compared to MSE.

2. Mean Squared Error (MSE)

Definition:
MSE calculates the average of squared differences between actual and predicted values.

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$

Meaning:

Squaring the errors heavily penalizes large mistakes.

Useful when large errors are especially undesirable; note that its units are the square of Y's units.

3. R-squared (Coefficient of Determination)

Definition:
R-squared measures the proportion of variance in the dependent variable that is explained by the regression model.

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

Where:

$SS_{res}$ = Sum of squared residuals

$SS_{tot}$ = Total variation in actual values

Meaning:

For a least-squares fit with an intercept, R² ranges from 0 to 1 on the training data (it can be negative on new data if the model does worse than predicting the mean).

Higher R² → the model explains more of the variability in the data.

R² = 1 means perfect fit, R² = 0 means no explanation.
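The three metrics can be computed directly from their definitions. The sketch below uses a small illustrative data set and predictions from a hypothetical fitted line y = 2.2 + 0.6x; the function names are local helpers, not library functions.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean of the absolute errors."""
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    """Mean of the squared errors."""
    return float(np.mean((y_true - y_pred) ** 2))

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Illustrative actual values at x = 1..5
y_true = np.array([2, 4, 5, 4, 5], dtype=float)

# Predictions from the hypothetical line y = 2.2 + 0.6x
y_pred = 2.2 + 0.6 * np.arange(1, 6)

print(f"MAE = {mae(y_true, y_pred):.2f}")
print(f"MSE = {mse(y_true, y_pred):.2f}")
print(f"R^2 = {r_squared(y_true, y_pred):.2f}")
```

scikit-learn provides equivalent functions in `sklearn.metrics`: `mean_absolute_error`, `mean_squared_error`, and `r2_score`.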

 - Conclusion

The three common metrics (MAE, MSE, and R²) help measure model accuracy, error magnitude, and goodness of fit. A good regression model ideally has low MAE, low MSE, and high R².

8. What is the purpose of the R-squared metric in regression analysis?

In regression analysis, the R-squared (R²) metric, also called the Coefficient of Determination, is used to evaluate how well a regression model fits the observed data. It measures the proportion of the variance in the dependent variable (Y) that is explained by the independent variable(s).

The main purpose of R-squared is:

To quantify how well the regression model explains the variability in the output (Y).

In simple terms, it tells us how much of the change in Y can be predicted from X.

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

Where:

$SS_{res}$ = Unexplained variation (residual sum of squares)

$SS_{tot}$ = Total variation in the actual data

 - Interpretation of R-squared Values

✔ R² = 1

Perfect fit

Model explains 100% of the variance in Y

All observed points lie on the regression line

✔ R² = 0

Model explains 0% of the variance

Model performs no better than simply predicting the mean of Y

✔ 0 < R² < 1

Partial fit

Higher value → better explanatory power
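These reference points can be reproduced with a small sketch. The three sets of predictions below are hypothetical, chosen to show a perfect fit, a mean-only baseline, and a model that does worse than the baseline (for which R² falls below 0).

```python
import numpy as np

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, computed from the definition."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

y_true = np.array([2, 4, 5, 4, 5], dtype=float)

# Perfect predictions: every residual is zero
r2_perfect = r_squared(y_true, y_true.copy())

# Baseline: always predict the mean of Y
r2_mean = r_squared(y_true, np.full_like(y_true, y_true.mean()))

# Hypothetical poor predictions that are worse than the baseline
r2_bad = r_squared(y_true, np.array([5, 4, 2, 5, 4], dtype=float))

print(f"perfect fit   : R^2 = {r2_perfect:.2f}")
print(f"mean baseline : R^2 = {r2_mean:.2f}")
print(f"poor model    : R^2 = {r2_bad:.2f}")
```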

 - Why R-squared Is Useful

a) Measures Goodness of Fit

Helps determine how well the model matches the data.

b) Compares Competing Models

Higher R² indicates a model that explains the data better (when the number of predictors is equal).

c) Helps Evaluate Model Improvement

Adding a variable never lowers R² in a least-squares fit, so only a meaningful increase suggests that the variable helps explain more variance; adjusted R² is the usual check.

d) Assesses Predictive Strength

Though not a measure of prediction accuracy, it indicates how well the underlying trend is captured.

 - Example

If R² = 0.85:

85% of the variation in the target variable (Y) is explained by the model

Only 15% remains unexplained (due to noise, missing variables, or errors)

 - Conclusion

The purpose of R-squared in regression analysis is to measure how effectively the regression model explains the variability in the dependent variable. It provides a clear, numerical indication of model fit and helps compare and evaluate regression models.

9. Write Python code to fit a simple linear regression model using scikit-learn
and print the slope and intercept.


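Python Code (Simple Linear Regression using scikit-learn)

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Sample data: X must be 2-D (n_samples, n_features), y is 1-D
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 5, 4, 5])

# Fit the model by ordinary least squares
model = LinearRegression()
model.fit(X, y)

# coef_ holds one slope per feature; intercept_ is a scalar
slope = model.coef_[0]
intercept = model.intercept_

print(f"Slope: {slope:.1f}")
print(f"Intercept: {intercept:.1f}")
```

Output:

```
Slope: 0.6
Intercept: 2.2
```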

Interpretation

Slope (β₁) = 0.6
→ For every 1 unit increase in X, Y increases by 0.6 units (on average).

Intercept (β₀) = 2.2
→ When X = 0, the predicted Y value is 2.2.

10. How do you interpret the coefficients in a simple linear regression model?


A simple linear regression model is written as:

$$Y = \beta_0 + \beta_1 X + \epsilon$$

Where:

$\beta_0$ = Intercept

$\beta_1$ = Slope (coefficient of X)

Interpreting these coefficients correctly helps us understand the relationship between the independent variable (X) and the dependent variable (Y).

 - Interpretation of the Intercept (β₀)

Definition

The intercept is the predicted value of Y when X = 0.

Meaning

It tells us where the regression line crosses the Y-axis.

 - Interpretation of the Slope (β₁)

Definition

The slope tells us how much Y changes for a one-unit increase in X.

Meaning

If $\beta_1 > 0$: Positive relationship
→ Y increases as X increases

If $\beta_1 < 0$: Negative relationship
→ Y decreases as X increases

If $\beta_1 = 0$: No linear relationship

 - Key Points to Remember

    - Coefficients describe the strength and direction of the relationship.

    - They are interpreted in terms of change in Y due to a unit change in X.

    - Slope tells the rate of change.

    - Intercept tells the baseline value.

    - Interpretation is always done in context of the problem.
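The points above can be checked numerically. In this sketch (small illustrative data set, with `predict` as a local helper rather than a library function), the prediction at X = 0 equals the intercept, and the predicted change for a one-unit increase in X equals the slope regardless of the starting value.

```python
import numpy as np

# Small illustrative data set
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Least-squares line: coefficients come back as (slope, intercept)
slope, intercept = np.polyfit(x, y, deg=1)

def predict(x_value):
    """Predicted Y on the fitted line."""
    return intercept + slope * x_value

# Intercept: the prediction at X = 0
at_zero = predict(0.0)

# Slope: change in the prediction for a one-unit increase in X,
# identical at any starting point because the model is a straight line
change_low = predict(3.0) - predict(2.0)
change_high = predict(11.0) - predict(10.0)

print(f"intercept = {intercept:.2f}, prediction at X = 0 = {at_zero:.2f}")
print(f"slope = {slope:.2f}, change from 2 to 3 = {change_low:.2f}, "
      f"change from 10 to 11 = {change_high:.2f}")
```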

 - Conclusion

In a simple linear regression model, the intercept represents the predicted value of Y when X is zero, while the slope coefficient indicates how much Y changes for a one-unit increase in X. Together, they define the linear relationship between the predictor and the response variable.