### **Unsupervised Learning**

**Data preparation**
- convert images into feature vectors (apply normalization and consistent preprocessing to every image)
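
A minimal sketch of this step, assuming the images are already loaded as a `uint8` array of shape `(N, H, W, C)`. The helper name `images_to_feature_matrix` and the random stand-in images are hypothetical; a real pipeline might use embeddings from a pretrained network instead of raw pixels.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

def images_to_feature_matrix(images):
    """Flatten (N, H, W, C) uint8 images into standardized (N, H*W*C) features."""
    # Scale pixel values to [0, 1] so every image shares the same range.
    scaled = np.asarray(images, dtype=np.float32) / 255.0
    # One row per image, one column per pixel/channel value.
    flat = scaled.reshape(len(scaled), -1)
    # Zero mean and unit variance per feature, which K-means and PCA both expect.
    return StandardScaler().fit_transform(flat)

# Stand-in data (assumption): 20 random 8x8 RGB images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(20, 8, 8, 3), dtype=np.uint8)
features = images_to_feature_matrix(fake_images)
print(features.shape)
```

In practice the scaler should be fitted on the training split only and reused for validation and test data.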

**Clustering**
- explore K-means, hierarchical clustering, and DBSCAN to see which is best at identifying patterns
- choose optimal number of clusters (K) using techniques like the elbow method or silhouette score
- evaluate performance and visualize clustering results for insights into data distribution and cluster relationships
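
The choice of K can be sketched as follows, assuming a standardized feature matrix `X`. Synthetic blobs stand in for the image features here, and the candidate range `2..8` is an arbitrary assumption.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Stand-in data (assumption): four well-separated synthetic clusters.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)

inertias, silhouettes = {}, {}
for k in range(2, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_                          # elbow method: look for the bend
    silhouettes[k] = silhouette_score(X, km.labels_)   # higher is better, range [-1, 1]

# Pick the K with the highest silhouette score.
best_k = max(silhouettes, key=silhouettes.get)
print(best_k, round(silhouettes[best_k], 3))
```

Plotting `inertias` against K gives the elbow curve. The silhouette score offers a single number to maximize, but the two criteria do not always agree, so it is worth inspecting both.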

**PCA**
- reduce dimensionality and select number of principal components based on variance ratio
- analyze contribution of each component and assess impact on data structure
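
A short sketch of selecting the number of components from the explained variance ratio. The 95% threshold and the synthetic low-rank data are assumptions for illustration, not values taken from this project.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in data (assumption): 50 features driven by 5 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 50))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

# A float n_components in (0, 1) keeps the smallest number of components
# whose cumulative explained variance exceeds that fraction.
pca = PCA(n_components=0.95, svd_solver="full").fit(X)
X_reduced = pca.transform(X)

cumulative = np.cumsum(pca.explained_variance_ratio_)
print(pca.n_components_, cumulative)

# Rows of pca.components_ are the loadings: how much each original
# feature contributes to each principal component.
loadings = pca.components_
```

Plotting `cumulative` against the component index gives the usual scree-style view for choosing a cutoff.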

**(Optional) Combination of clustering and PCA**
- compare performance using original features vs. PCA-reduced features
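
One way to run this comparison is to cluster both representations with the same settings and score each result. The sketch below assumes K-means with `k=3`, five principal components, and synthetic data; `kmeans_silhouette` is a hypothetical helper.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): 20-dimensional synthetic features.
X, _ = make_blobs(n_samples=300, centers=3, n_features=20, random_state=0)
X = StandardScaler().fit_transform(X)
X_pca = PCA(n_components=5, random_state=0).fit_transform(X)

def kmeans_silhouette(features, k=3):
    """Cluster with K-means and return the silhouette score in that feature space."""
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
    return silhouette_score(features, labels)

scores = {
    "original": kmeans_silhouette(X),
    "pca": kmeans_silhouette(X_pca),
}
print(scores)
```

Each silhouette score is computed in its own feature space, so the two numbers are only a rough guide. Comparing the cluster assignments themselves (for example with `sklearn.metrics.adjusted_rand_score`) is a useful complement.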

**Integration with supervised learning**
- explore clustering and PCA results (and, optionally, their combination) as additional inputs to the P2_supervised model (based on EfficientNet)
- evaluate enhancement of supervised learning performance
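
A hedged sketch of how clustering and PCA outputs could be turned into extra input features. It assumes the supervised model exposes fixed-length embeddings per image; the random arrays, the dimensions, and the `augment` helper are all hypothetical, and how the P2_supervised model consumes these features is not shown here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Stand-in embeddings (assumption): 32-dimensional vectors per image.
rng = np.random.default_rng(0)
train_emb = rng.normal(size=(100, 32))
val_emb = rng.normal(size=(25, 32))

# Fit the unsupervised models on the training split only to avoid leakage.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(train_emb)
pca = PCA(n_components=8, random_state=0).fit(train_emb)

def augment(emb):
    """Append distances to each cluster centre and PCA projections to the embeddings."""
    cluster_dist = kmeans.transform(emb)   # shape (N, n_clusters)
    pca_feats = pca.transform(emb)         # shape (N, n_components)
    return np.hstack([emb, cluster_dist, pca_feats])

train_aug = augment(train_emb)
val_aug = augment(val_emb)
print(train_aug.shape, val_aug.shape)
```

Training the same classifier head on `train_emb` and on `train_aug` then gives a direct measure of whether the extra features help.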

**Analyze model performance**
- plot loss and accuracy over epochs to visualize training progress and identify potential overfitting or underfitting
- create a confusion matrix to examine how well the model distinguishes between classes
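
Both diagnostics can be sketched as below. The labels are toy values, and the `history` dictionary with keys `train_loss`, `val_loss`, `train_acc`, and `val_acc` is an assumed format for whatever the training loop records.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels (assumption): three classes, seven samples.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
# Per-class recall: correct predictions divided by the number of true samples.
recall = cm.diagonal() / cm.sum(axis=1)
print(cm)
print(recall)

def plot_history(history):
    """Plot train/validation loss and accuracy per epoch from a history dict."""
    import matplotlib.pyplot as plt

    epochs = range(1, len(history["train_loss"]) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    ax_loss.plot(epochs, history["train_loss"], label="train")
    ax_loss.plot(epochs, history["val_loss"], label="validation")
    ax_loss.set(xlabel="epoch", ylabel="loss")
    ax_loss.legend()
    ax_acc.plot(epochs, history["train_acc"], label="train")
    ax_acc.plot(epochs, history["val_acc"], label="validation")
    ax_acc.set(xlabel="epoch", ylabel="accuracy")
    ax_acc.legend()
    return fig
```

A validation loss that rises while training loss keeps falling points to overfitting; both curves staying high points to underfitting. The matrix can be rendered as a heatmap with `sns.heatmap(cm, annot=True)`.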

In [2]:
# importing all necessary libraries
import glob
import warnings
import random
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy

import torch
import torchvision
import torchvision.transforms as transforms
from torchvision.models import VGG19_Weights, EfficientNet_V2_L_Weights
from torch.utils.data import Dataset, DataLoader
from torchinfo import summary
from tqdm.notebook import tqdm

from PIL import Image
from typing import List, Tuple
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN