# ***Train Test Split and First Machine Learning Model***

## Objective
Understand the basic machine learning pipeline by training and evaluating a simple classification model.


### **1. Import Libraries**

In [1]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix


### **2. Sample Dataset**

In [2]:
data = {
    "age": [45, 54, 39, 61, 48, 52, 44, 59],
    "cholesterol": [233, 250, 204, 286, 210, 240, 220, 270],
    "blood_pressure": [140, 130, 120, 145, 135, 138, 128, 142],
    "target": [1, 0, 1, 0, 1, 0, 1, 0]
}

df = pd.DataFrame(data)
df


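| | age | cholesterol | blood_pressure | target |
|---|---|---|---|---|
| 0 | 45 | 233 | 140 | 1 |
| 1 | 54 | 250 | 130 | 0 |
| 2 | 39 | 204 | 120 | 1 |
| 3 | 61 | 286 | 145 | 0 |
| 4 | 48 | 210 | 135 | 1 |
| 5 | 52 | 240 | 138 | 0 |
| 6 | 44 | 220 | 128 | 1 |
| 7 | 59 | 270 | 142 | 0 |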


### **3. Feature / Target Split**

In [3]:
X = df.drop("target", axis=1)
y = df["target"]


### **4. Train Test Split**

In [4]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)


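The training set (6 of the 8 rows here, since `test_size=0.2`) is used to fit the model.
The test set (the remaining 2 rows) is held out and used only to evaluate how the model performs on data it has not seen.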
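With only 8 rows, a purely random split can leave one class out of the test set entirely, which is what the evaluation output in section 7 shows (class 1 has a support of 0). The sketch below is an illustrative variation, not part of the original cells: it passes `stratify=y` so that both subsets keep the class proportions of `y`. The `_s` suffixed names are hypothetical and chosen only to avoid clashing with the notebook's variables.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Same toy dataset as in section 2.
df = pd.DataFrame({
    "age": [45, 54, 39, 61, 48, 52, 44, 59],
    "cholesterol": [233, 250, 204, 286, 210, 240, 220, 270],
    "blood_pressure": [140, 130, 120, 145, 135, 138, 128, 142],
    "target": [1, 0, 1, 0, 1, 0, 1, 0],
})
X = df.drop("target", axis=1)
y = df["target"]

# stratify=y asks scikit-learn to preserve the class balance of y
# in both the training and the test subset (assumption: not used in In [4]).
X_train_s, X_test_s, y_train_s, y_test_s = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print("Train rows:", len(X_train_s), "| Test rows:", len(X_test_s))
print("Test class counts:", y_test_s.value_counts().to_dict())
```

Because the dataset has four rows per class, a stratified 2-row test set should contain one row of each class, so every metric is at least defined. Two test rows are still far too few for a meaningful estimate.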


### **5. Train Model**

In [5]:
model = LogisticRegression()
model.fit(X_train, y_train)
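After fitting, it can help to look at what the model actually learned. The following is a minimal sketch that rebuilds the same data and split, then prints the fitted coefficient for each feature. Raising `max_iter` to 1000 is an assumption made here as a precaution, because unscaled features can slow the solver's convergence; the original cell uses the default.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same toy dataset and split as in sections 2 to 4.
df = pd.DataFrame({
    "age": [45, 54, 39, 61, 48, 52, 44, 59],
    "cholesterol": [233, 250, 204, 286, 210, 240, 220, 270],
    "blood_pressure": [140, 130, 120, 145, 135, 138, 128, 142],
    "target": [1, 0, 1, 0, 1, 0, 1, 0],
})
X = df.drop("target", axis=1)
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# max_iter=1000 is an illustrative choice, not the notebook's setting.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# coef_ has shape (1, n_features) for a binary problem.
for name, coef in zip(X.columns, model.coef_[0]):
    print(f"{name}: {coef:+.4f}")
print("intercept:", model.intercept_[0])
```

A positive coefficient pushes the prediction toward class 1 as the feature grows, a negative one toward class 0. With six training rows the values are not stable and should not be interpreted beyond that.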


### **6. Prediction**

In [6]:
y_pred = model.predict(X_test)
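`predict` returns only the hard class labels. As a sketch of what sits behind them, `predict_proba` returns the estimated probability of each class per row; the data, split, and model are rebuilt here so the block runs on its own, and `max_iter=1000` is again an assumption rather than the notebook's setting.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same toy dataset and split as in sections 2 to 4.
df = pd.DataFrame({
    "age": [45, 54, 39, 61, 48, 52, 44, 59],
    "cholesterol": [233, 250, 204, 286, 210, 240, 220, 270],
    "blood_pressure": [140, 130, 120, 145, 135, 138, 128, 142],
    "target": [1, 0, 1, 0, 1, 0, 1, 0],
})
X = df.drop("target", axis=1)
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# One row per test sample, one column per class in model.classes_.
proba = model.predict_proba(X_test)
y_pred = model.predict(X_test)

for idx, p, label in zip(X_test.index, proba, y_pred):
    print(f"row {idx}: P(class 0)={p[0]:.3f}, P(class 1)={p[1]:.3f}, predicted={label}")
```

Looking at the probabilities shows how confident the model is in each prediction, which the label alone hides.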


### **7. Model Evaluation**

In [7]:
print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n", classification_report(y_test, y_pred))
print("\nConfusion Matrix:\n", confusion_matrix(y_test, y_pred))


Accuracy: 0.0

Classification Report:
               precision    recall  f1-score   support

           0       0.00      0.00      0.00       2.0
           1       0.00      0.00      0.00       0.0

    accuracy                           0.00       2.0
   macro avg       0.00      0.00      0.00       2.0
weighted avg       0.00      0.00      0.00       2.0


Confusion Matrix:
 [[0 2]
 [0 0]]


scikit-learn also raised `UndefinedMetricWarning` six times (from `_warn_prf`) while building this report. The test set contains no samples of class 1, so recall for class 1 is undefined, and the model predicted class 0 for no sample, so precision for class 0 is undefined. Both are reported as 0.
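The accuracy of 0.0 can be read directly off the confusion matrix. The sketch below reconstructs the labels from the printed matrix (two samples with true class 0, both predicted as class 1) rather than re-running the model, and recomputes accuracy as the sum of the diagonal divided by the total. Passing `zero_division=0` to `classification_report` is an optional addition that states explicitly how undefined metrics are reported.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Labels reconstructed from the confusion matrix printed above:
# row 0 (true class 0), column 1 (predicted class 1) holds both test samples.
y_true = [0, 0]
y_hat = [1, 1]

# labels=[0, 1] keeps a 2x2 matrix even when a class is missing.
cm = confusion_matrix(y_true, y_hat, labels=[0, 1])

# Accuracy = correctly classified (diagonal) / all samples.
manual_accuracy = np.trace(cm) / cm.sum()

print(cm)
print("Manual accuracy:", manual_accuracy)
print("accuracy_score: ", accuracy_score(y_true, y_hat))

# zero_division=0 reports undefined precision/recall as 0 without a warning.
print(classification_report(y_true, y_hat, labels=[0, 1], zero_division=0))
```

A score of 0.0 on two samples says very little about the model. It mostly reflects that the test set is tiny and contains a single class.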


## Key Learnings
- Importance of train-test split
- Basic ML pipeline
- Logistic Regression for classification
- Model evaluation techniques

## Next Steps
- Try different ML models
- Learn feature scaling
- Apply the pipeline to real datasets
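As a starting point for the first two items, here is a minimal sketch under stated assumptions: it wraps `StandardScaler` and a classifier in a scikit-learn pipeline and compares two models with stratified 4-fold cross-validation instead of a single 2-row test set. The choice of `DecisionTreeClassifier` as the second model, the fold count, and the `candidates` dictionary are illustrative, not part of the original notebook.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Same toy dataset as in section 2.
df = pd.DataFrame({
    "age": [45, 54, 39, 61, 48, 52, 44, 59],
    "cholesterol": [233, 250, 204, 286, 210, 240, 220, 270],
    "blood_pressure": [140, 130, 120, 145, 135, 138, 128, 142],
    "target": [1, 0, 1, 0, 1, 0, 1, 0],
})
X = df.drop("target", axis=1)
y = df["target"]

# Hypothetical set of candidate models; the scaler is fitted on the
# training folds only because it lives inside the pipeline.
candidates = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "decision_tree": make_pipeline(StandardScaler(), DecisionTreeClassifier(random_state=42)),
}

# 4 stratified folds: each held-out fold has one row of each class.
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)

results = {}
for name, pipeline in candidates.items():
    scores = cross_val_score(pipeline, X, y, cv=cv)  # default scoring: accuracy
    results[name] = scores
    print(f"{name}: mean accuracy = {scores.mean():.2f} over {len(scores)} folds")
```

Keeping the scaler inside the pipeline avoids leaking test-fold statistics into training. On eight rows the scores remain noisy, so the same pattern is best tried on a real dataset.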
