# Lab Exercise: Introduction to Supervised, Unsupervised, and Reinforcement Learning
## Objective:
- Understand and implement basic examples of supervised, unsupervised, and reinforcement learning.

- Use common datasets and simple models for clarity.

- Observe and interpret results.

In [1]:
# Setup: Import libraries and load datasets
# Common imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# For supervised and unsupervised tasks
from sklearn.datasets import load_iris, load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, adjusted_rand_score
from sklearn.cluster import KMeans

# For reinforcement learning task
!pip install gymnasium --quiet
import gymnasium as gym


In [2]:
# Part 1: Supervised Learning — Iris Dataset Classification
# Task:
# Train a Logistic Regression model to classify Iris flower species.
# Load data
iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Split data (70% train, 30% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Train logistic regression
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Predict and evaluate
y_pred = model.predict(X_test)
print("Classification Report:\n", classification_report(y_test, y_pred))
print("Test Accuracy:", accuracy_score(y_test, y_pred))


Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      1.00      1.00        13
           2       1.00      1.00      1.00        13

    accuracy                           1.00        45
   macro avg       1.00      1.00      1.00        45
weighted avg       1.00      1.00      1.00        45

Test Accuracy: 1.0


## What you learn:

- How a supervised model learns from labeled data.

- How to evaluate classification performance using accuracy and detailed reports.
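A confusion matrix is one more way to read the same evaluation: it shows which classes get mistaken for which. The sketch below repeats the split and model settings from the cell above so that it runs on its own; it is an illustrative addition, not part of the original lab.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Same data, split, and model settings as the supervised cell above
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
model = LogisticRegression(max_iter=200).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Correct predictions sit on the diagonal, so accuracy = trace / total
print("Accuracy from matrix:", cm.trace() / cm.sum())
```

Off-diagonal entries count misclassifications, which accuracy alone hides.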

In [3]:
# Part 2: Unsupervised Learning — K-Means Clustering on Digits Dataset
# Task:
# Cluster digit images into groups without using label information.

# Load digits data
digits = load_digits()
X_digits, y_digits = digits.data, digits.target

# K-means clustering (10 clusters for 10 digits)
kmeans = KMeans(n_clusters=10, random_state=42, n_init=10)
clusters = kmeans.fit_predict(X_digits)

# Evaluate clustering quality against true labels (adjusted rand index)
ari_score = adjusted_rand_score(y_digits, clusters)
print(f"Adjusted Rand Index (measures cluster-label agreement): {ari_score:.4f}")


Adjusted Rand Index (measures cluster-label agreement): 0.6669


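**Breakdown of the classification report (Part 1):**

Precision = TP / (TP + FP)
Of the samples predicted as a class, how many actually belong to it?

Recall = TP / (TP + FN)
Of the samples that actually belong to a class, how many did the model find?

F1-score = 2 × Precision × Recall / (Precision + Recall), the harmonic mean of precision and recall
A useful single-number summary when classes are imbalanced and accuracy alone can mislead.

The report also gives:

Support = number of true instances of each class in the test set

Macro avg = unweighted mean of the per-class scores

Weighted avg = mean of the per-class scores weighted by support (useful when class sizes vary)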
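The formulas above can be checked by hand on a small example. The labels below are hypothetical and chosen only to make the counts easy to follow. The sketch compares the hand-computed values with scikit-learn's `precision_score`, `recall_score`, and `f1_score`.

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical binary labels, chosen only to illustrate the formulas
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
y_hat = np.array([1, 1, 0, 0, 0, 0, 0, 1])

tp = int(np.sum((y_true == 1) & (y_hat == 1)))  # true positives
fp = int(np.sum((y_true == 0) & (y_hat == 1)))  # false positives
fn = int(np.sum((y_true == 1) & (y_hat == 0)))  # false negatives

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={tp} FP={fp} FN={fn}")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")

# Cross-check against scikit-learn's implementations
print(precision_score(y_true, y_hat), recall_score(y_true, y_hat), f1_score(y_true, y_hat))
```

Here precision (2/3) is higher than recall (1/2) because the model makes few positive predictions but misses half of the actual positives.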



## What you learn:

- How unsupervised learning discovers data structure without labels.

- The concept of clustering, and how cluster quality can be checked against the true labels afterwards even though the algorithm never sees them while fitting.

Unsupervised learning involves:

- No labels (no target output).

- The algorithm finds structure or patterns in the input data.

Examples include:

- Clustering: grouping similar items (e.g., KMeans).

- Dimensionality reduction: simplifying high-dimensional data (e.g., PCA).
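Clustering is demonstrated in the cell above, but dimensionality reduction is only mentioned. As a minimal sketch, the code below projects the 64-pixel digit vectors onto two principal components with scikit-learn's `PCA`. Using two components is an assumption made for easy plotting, not a recommendation from the lab.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

digits = load_digits()
X_digits = digits.data  # shape (1797, 64): each 8x8 image flattened

# Keep only the two directions of largest variance
pca = PCA(n_components=2, random_state=42)
X_2d = pca.fit_transform(X_digits)

print("Original shape:", X_digits.shape)
print("Reduced shape:", X_2d.shape)
print("Fraction of variance kept:", pca.explained_variance_ratio_.sum())
```

The reduced array can be passed to `plt.scatter` (colored by `digits.target`) to see how well the digits separate in two dimensions.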

In [4]:
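# Part 3: Reinforcement Learning (CartPole environment)
# Task: create the CartPole environment and inspect its initial observation.

import gymnasium as gym  # already imported in the setup cell; repeated so this cell runs alone

# render_mode="human" opens a window; use render_mode=None on a headless machine
env = gym.make("CartPole-v1", render_mode="human")
obs, info = env.reset(seed=42)  # obs = [cart position, cart velocity, pole angle, pole angular velocity]
print("Initial observation:", obs)
env.close()  # release the rendering window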


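**Adjusted Rand Index (ARI)**
📌 What is ARI?
The Rand Index (RI) is the fraction of sample pairs on which the clustering and the true labels agree: the pair is either in the same group in both, or in different groups in both.

ARI is a corrected-for-chance version of RI. It does not depend on which ID number each cluster receives.

ARI = 1 → perfect match

ARI ≈ 0 → agreement no better than random labeling

ARI < 0 → agreement worse than expected by chance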
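To make these three cases concrete, the sketch below computes ARI from pair counts in the contingency table and compares it with scikit-learn's `adjusted_rand_score`. The helper `ari_by_hand` and the four-sample label lists are hypothetical, written only for illustration. The helper does not handle the degenerate case where the denominator is zero (for example, a single cluster on both sides).

```python
from math import comb

from sklearn.metrics import adjusted_rand_score
from sklearn.metrics.cluster import contingency_matrix


def ari_by_hand(labels_true, labels_pred):
    """Adjusted Rand Index from pair counts (illustrative helper)."""
    table = contingency_matrix(labels_true, labels_pred)
    n = int(table.sum())
    # Pairs grouped together in both labelings
    together = sum(comb(int(v), 2) for v in table.ravel())
    # Pairs grouped together within each labeling on its own
    pairs_true = sum(comb(int(v), 2) for v in table.sum(axis=1))
    pairs_pred = sum(comb(int(v), 2) for v in table.sum(axis=0))
    expected = pairs_true * pairs_pred / comb(n, 2)  # chance-level agreement
    maximum = (pairs_true + pairs_pred) / 2
    return (together - expected) / (maximum - expected)


truth = [0, 0, 1, 1]
cases = {
    "same groups, renamed IDs": [1, 1, 0, 0],
    "one group split in two": [0, 0, 1, 2],
    "groups cut across truth": [0, 1, 0, 1],
}
for name, pred in cases.items():
    by_hand = ari_by_hand(truth, pred)
    library = adjusted_rand_score(truth, pred)
    print(f"{name}: by hand={by_hand:.4f}, sklearn={library:.4f}")
```

The first case scores 1 even though the cluster IDs are swapped, which is why ARI suits K-Means output, where cluster numbers are arbitrary.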

In [5]:
from sklearn.metrics import adjusted_rand_score

# y_digits = true labels, clusters = predicted clusters
ari_score = adjusted_rand_score(y_digits, clusters)
print("Adjusted Rand Index:", ari_score)


Adjusted Rand Index: 0.6669121092859385
