# **Day 3 — Logistic Regression**

### **1. What is Logistic Regression?**
Logistic Regression is a classification algorithm used to predict binary or multi-class outcomes.  
Unlike Linear Regression, which predicts continuous values, it predicts the probability that an instance belongs to a class.

### **2. Sigmoid Function**
The sigmoid maps any real number to a value between 0 and 1:  
$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

**Why it matters**  
- Converts linear output → probability  
- Helps decide if an instance belongs to class 0 or 1  
- S-shaped curve ensures smooth probability transitions
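
As a minimal sketch of the points above, the sigmoid can be written in a few lines of plain Python. The function name `sigmoid` and the sample inputs are illustrative choices, not part of the notebook's dataset. The branch on the sign of `z` is only there to avoid overflow in `math.exp` for large negative inputs.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real number z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the algebraically equivalent form so exp() cannot overflow.
    e = math.exp(z)
    return e / (1.0 + e)

# Illustrative inputs: large negative -> near 0, zero -> 0.5, large positive -> near 1.
for z in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"z = {z:+.1f} -> sigmoid(z) = {sigmoid(z):.4f}")
```

Values of `z` far below zero give probabilities near 0, values far above zero give probabilities near 1, and `z = 0` gives exactly 0.5.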

### **3. Decision Boundary**
A decision boundary separates the classes.  
For binary logistic regression: 

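$$\hat{y} = \begin{cases} 1 & \text{if } \sigma(w^\top x + b) \ge 0.5 \\ 0 & \text{otherwise} \end{cases}$$

Since $\sigma(z) \ge 0.5$ exactly when $z \ge 0$, the boundary is the set of points where $w^\top x + b = 0$.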

**Types of decision boundaries**  
- Linear boundary → when the model is fit on the original features  
- Non-linear boundary → when the features are expanded first (e.g., polynomial terms or kernel-based features)
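
To make the two cases concrete, here is a small sketch with hand-picked (hypothetical) weights rather than fitted ones. `predict_linear` thresholds a linear score, so its boundary is the straight line where the score is zero. `predict_circle` applies the same rule to squared features, which yields a circular boundary in the original feature space.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights, chosen by hand for illustration (not learned from data).
W1, W2, B = 1.0, 1.0, -1.0

def predict_linear(x1: float, x2: float) -> int:
    """Class 1 when sigmoid(z) >= 0.5, which is the same as z >= 0.
    The boundary is the straight line x1 + x2 - 1 = 0."""
    z = W1 * x1 + W2 * x2 + B
    return int(sigmoid(z) >= 0.5)

def predict_circle(x1: float, x2: float) -> int:
    """Same rule applied to the polynomial features x1**2 and x2**2.
    The boundary x1**2 + x2**2 - 1 = 0 is a circle in the original feature space."""
    z = W1 * x1**2 + W2 * x2**2 + B
    return int(sigmoid(z) >= 0.5)

for point in [(0.0, 0.0), (0.5, 0.5), (-2.0, 0.0), (2.0, 2.0)]:
    print(point, predict_linear(*point), predict_circle(*point))
```

The point `(-2.0, 0.0)` is the interesting one: it lies on the class-0 side of the line but outside the unit circle, so the two models disagree on it.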

### **4. Binary vs Multi-class Logistic Regression**

| Aspect                     | Binary Logistic Regression       | Multi-class Logistic Regression         |
|----------------------------|---------------------------------|-----------------------------------------|
| Number of Classes          | Two (0/1)                       | Three or more (e.g., cat/dog/horse)     |
| Activation Function        | Sigmoid                        | Softmax                                 |
| Loss Function              | Binary Cross-Entropy            | Categorical Cross-Entropy                |
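
The two loss functions in the table can be sketched in plain Python as follows. The helper names and the sample probabilities are hypothetical, and the categorical version assumes a single true class index per example.

```python
import math
from typing import Sequence

def binary_cross_entropy(y_true: int, p: float) -> float:
    """Loss for one example, where p is the predicted probability of class 1."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def categorical_cross_entropy(true_class: int, probs: Sequence[float]) -> float:
    """Loss for one example, where probs holds one probability per class."""
    return -math.log(probs[true_class])

# Confident and correct gives a small loss; confident and wrong gives a large one.
print(binary_cross_entropy(1, 0.9))
print(binary_cross_entropy(1, 0.1))
print(categorical_cross_entropy(2, [0.1, 0.2, 0.7]))
```

With exactly two classes the two functions agree: `categorical_cross_entropy(1, [1 - p, p])` equals `binary_cross_entropy(1, p)`.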

### **5. Softmax Function (for Multi-class)**
Softmax converts raw scores into probabilities across multiple classes:  

$$\text{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$

**Why softmax?**  
- Ensures probabilities sum to 1  
- Higher score → higher probability  
- Suited to multi-class prediction where each instance belongs to exactly one class
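
A minimal softmax sketch in plain Python illustrates these properties. The scores below are made up for three classes, and subtracting the maximum score is a common numerical-stability step that leaves the result unchanged.

```python
import math
from typing import List, Sequence

def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the max score does not change the result but avoids overflow in exp().
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # hypothetical scores for cat / dog / horse
print([round(p, 3) for p in probs])
print(sum(probs))
```

The class with the highest score receives the highest probability, and the ordering of the scores is preserved in the probabilities.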


In [1]:
import pandas as pd

# Load the pass/fail dataset and preview the first five rows
df = pd.read_csv('pass_fail.csv')
df.head()

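|   | Hours_Studied | Practice_Questions | Pass_Fail |
|---|---------------|--------------------|-----------|
| 0 | 7             | 0                  | 1         |
| 1 | 4             | 10                 | 1         |
| 2 | 8             | 27                 | 1         |
| 3 | 5             | 24                 | 1         |
| 4 | 7             | 49                 | 1         |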


In [3]:
from sklearn.model_selection import train_test_split

# Hold out 20% of the rows for testing; random_state fixes the shuffle
X = df[['Hours_Studied', 'Practice_Questions']]
y = df['Pass_Fail']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)


In [4]:
from sklearn.linear_model import LogisticRegression

# Default settings (L2 penalty, lbfgs solver)
model = LogisticRegression()

In [5]:
model.fit(X_train, y_train)

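| Parameter         | Value   |
|-------------------|---------|
| penalty           | 'l2'    |
| dual              | False   |
| tol               | 0.0001  |
| C                 | 1.0     |
| fit_intercept     | True    |
| intercept_scaling | 1       |
| class_weight      | None    |
| random_state      | None    |
| solver            | 'lbfgs' |
| max_iter          | 100     |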


In [6]:
model.predict(X_test)

array([1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0])
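
`predict` returns hard labels. For a binary model these correspond to thresholding the class-1 probability from `predict_proba` at 0.5. The sketch below illustrates this on a small synthetic dataset, made up here so the block is self-contained; it is not `pass_fail.csv`, and the names `clf` and `X_new` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic, made-up data: one feature, label tends to be 1 when the feature is large.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = (X[:, 0] + rng.normal(0, 1, size=200) > 5).astype(int)

clf = LogisticRegression().fit(X, y)

X_new = np.array([[1.0], [4.0], [6.0], [9.0]])
proba = clf.predict_proba(X_new)  # one column per class, ordered as in clf.classes_
labels = clf.predict(X_new)       # hard 0/1 labels

for x, p, label in zip(X_new[:, 0], proba[:, 1], labels):
    print(f"x = {x:.1f}  P(class 1) = {p:.3f}  predicted = {label}")
```

Inspecting `predict_proba` alongside `predict` shows how confident the model is in each label, which the hard labels alone hide.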

In [7]:
# Mean accuracy on the held-out test set
model.score(X_test, y_test)

1.0
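
`score` returns accuracy, the fraction of test labels predicted correctly. Because the predictions above are mostly class 1, it can help to look at per-class counts as well. A minimal plain-Python sketch, using short made-up label lists rather than the notebook's data (the helper names are hypothetical):

```python
from collections import Counter

def accuracy(y_true, y_pred) -> float:
    """Fraction of positions where the prediction matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def confusion_counts(y_true, y_pred) -> Counter:
    """Count each (true label, predicted label) pair."""
    return Counter(zip(y_true, y_pred))

# Made-up labels for illustration only.
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 1, 0, 1, 1, 1]

print(accuracy(y_true, y_pred))
print(confusion_counts(y_true, y_pred))
```

In this toy example one of the two class-0 items is misclassified, which a single accuracy number does not reveal on its own.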