# Class 3 Notebook – Logistic Regression (Classification Basics)

This notebook introduces **Logistic Regression** as a first classification algorithm.

We’ll mirror the steps from the *Logistic_Regression_demo1* slides using a tiny example. If you have not yet worked through the **Class 3 Linear Regression** notebook (`class-3-linear-regression-basics.ipynb`), start there. It follows the same pattern, but predicts a numeric regression target instead of a class label.

- **Objective**: Predict whether a student **passes or fails** based on **study hours**.
- **Model type**: Logistic Regression (binary classification).
- **Key idea**: The model outputs a **probability between 0 and 1**, which we then convert into a class (0 = Fail, 1 = Pass) by applying a threshold, typically 0.5.
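
The key idea above can be sketched in a few lines. Logistic regression passes a linear score through the sigmoid function to get a probability, and a threshold turns that probability into a class. The coefficients `w` and `b` below are hypothetical values picked only for illustration; they are not fitted to any data.

```python
import numpy as np


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical weight and intercept, chosen for illustration only (not fitted)
w, b = 1.0, -5.5

study_hours = np.array([2.0, 5.0, 9.0])

# Linear score -> probability of passing
pass_probability = sigmoid(w * study_hours + b)

# Probability -> class using a 0.5 threshold (1 = Pass, 0 = Fail)
predicted_class = (pass_probability >= 0.5).astype(int)

print(np.round(pass_probability, 3))
print(predicted_class)  # [0 0 1]
```

With these made-up coefficients, only the student who studied 9 hours ends up above the 0.5 threshold.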

We’ll follow the same 10‑step supervised‑learning pattern you saw in the Class 3 Linear Regression notebook:

1. Define the objective
2. Install / import libraries
3. Create a small dataset
4. Separate features and target
5. Train/test split
6. Create the model
7. Train the model
8. Make predictions
9. Evaluate the model
10. (Optional) Visualize and interpret
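
As a rough preview, steps 3 to 9 can be condensed into the sketch below. It uses the same ten study-hour values as this notebook. The `test_size`, `random_state`, and `stratify` settings are assumptions made for this sketch rather than values taken from the slides.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Steps 3-4: tiny dataset; scikit-learn expects a 2-D feature matrix
X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]).reshape(-1, 1)  # study hours
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])  # 0 = Fail, 1 = Pass

# Step 5: hold out 30% of the rows for testing (assumed split settings);
# stratify=y keeps both classes present in the training set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Steps 6-7: create and train the model
model = LogisticRegression()
model.fit(X_train, y_train)

# Step 8: predicted classes and the probability of class 1 (Pass)
y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)[:, 1]

# Step 9: fraction of test rows classified correctly
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```

With only ten rows the test set holds just three students, so the accuracy figure is very coarse and should be read as a demonstration of the workflow, not as a reliable estimate.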

Run the next cell to make sure your environment works and the right libraries are installed.

In [None]:
# Environment sanity check + core classification libraries
import platform

print("Python:", platform.python_version())
print("OS:", platform.system(), platform.release())

try:
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

    print("NumPy:", np.__version__, "| Pandas:", pd.__version__)
except ModuleNotFoundError as exc:
    print("Missing dependency:", exc)
    print("Install with: python -m pip install numpy pandas matplotlib scikit-learn")
    raise

In [None]:
# Steps 3–4: Create a tiny dataset of study hours (feature) and pass/fail labels (target)
# (0 = Fail, 1 = Pass)

# Study hours for 10 students
X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Result: 0 = Fail, 1 = Pass (this is our label)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

mydata = pd.DataFrame({
    "StudyHours": X,
    "Result": y,
})

mydata
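
One detail worth checking at this point: `X` as defined above is a 1-D array of shape `(10,)`, whereas scikit-learn estimators expect a 2-D feature matrix of shape `(n_samples, n_features)`. A minimal sketch of the conversion follows. The names `study_hours` and `features_2d` are hypothetical and used only in this example.

```python
import numpy as np

study_hours = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# -1 lets NumPy infer the number of rows; 1 means a single feature column
features_2d = study_hours.reshape(-1, 1)

print(study_hours.shape)  # (10,)
print(features_2d.shape)  # (10, 1)
```

Selecting the column from the DataFrame with double brackets, `mydata[["StudyHours"]]`, is a common alternative that also yields a 2-D object.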