## **Unsupervised Learning: Human Activity Recognition Analysis**

### **1. Project Overview**

**Description:**
The goal of this project is to analyze the **Human Activity Recognition (HAR)** dataset using unsupervised learning techniques. The dataset consists of smartphone accelerometer and gyroscope readings recorded while participants performed six physical activities: walking, walking upstairs, walking downstairs, sitting, standing, and laying.

**Objectives:**
- Exploratory Data Analysis (EDA): Understand the data structure, class balance, and feature distributions.

- Dimensionality Reduction: Visualize the high-dimensional data (561 features) in 2D space using PCA and t-SNE.

- Clustering: Apply the K-Means algorithm to group activities without using label information.

- Evaluation: Assess the model's ability to distinguish between static and dynamic activities by comparing clusters with ground truth labels.
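
As a rough illustration of the evaluation objective, the sketch below collapses the six activities into static and dynamic groups and measures how cleanly a set of cluster assignments separates them. Purity is used here only as a simple stand-in and is not necessarily the metric applied later in this notebook; the helper names and the toy data are hypothetical, and the activity names are assumed to match the labels used by the UCI HAR dataset.

```python
from collections import Counter

# Assumption: these are the six activity names used by the UCI HAR dataset.
STATIC = {"SITTING", "STANDING", "LAYING"}
DYNAMIC = {"WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS"}


def to_motion_group(activity):
    """Collapse an activity name into 'static' or 'dynamic'."""
    if activity in STATIC:
        return "static"
    if activity in DYNAMIC:
        return "dynamic"
    raise ValueError(f"unknown activity: {activity!r}")


def cluster_purity(cluster_ids, activities):
    """Fraction of samples whose motion group equals the majority group of their cluster."""
    groups_per_cluster = {}
    for cluster_id, activity in zip(cluster_ids, activities):
        counter = groups_per_cluster.setdefault(cluster_id, Counter())
        counter[to_motion_group(activity)] += 1
    # For each cluster, count the samples that belong to its most common group
    matched = sum(c.most_common(1)[0][1] for c in groups_per_cluster.values())
    return matched / len(cluster_ids)


# Toy data, invented for illustration only
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_activities = ["WALKING", "WALKING_UPSTAIRS", "SITTING",
                  "SITTING", "STANDING", "LAYING"]
purity = cluster_purity(toy_clusters, toy_activities)
print(f"static/dynamic purity: {purity:.3f}")
```

A purity close to 1.0 would suggest that the clusters line up with the static/dynamic split, while a value near 0.5 with two balanced groups would suggest they do not.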

### **2. Imports**
Importing necessary libraries for data manipulation, visualization, and machine learning.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from ucimlrepo import fetch_ucirepo

from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans
from sklearn.metrics import confusion_matrix, classification_report, adjusted_rand_score

### **3. Data Load**

Loading the UCI HAR dataset. Since this is an unsupervised learning task, we will merge the training and testing sets to maximize the amount of data available for clustering.

- Input: The UCI HAR dataset (repository ID 240), fetched with `ucimlrepo`.

- Output: A combined feature matrix X and a label vector y (used only for validation).

In [None]:
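# Fetch the Human Activity Recognition dataset (UCI repository ID 240)
har_data = fetch_ucirepo(id=240)

# Feature matrix and ground-truth activity labels (both pandas DataFrames)
X = har_data.data.features
y = har_data.data.targets  # ucimlrepo exposes the labels as `targets` (plural)

# Flatten the single-column label frame into a 1-D array; used only for validation
y = y.values.flatten()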
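
If the raw text files from the UCI archive are used instead of `fetch_ucirepo`, the train/test merge could look roughly like the sketch below. It assumes the folder layout of the original archive (`train/X_train.txt`, `train/y_train.txt`, `test/X_test.txt`, `test/y_test.txt`); the function name `load_har_from_text` is hypothetical and not part of this notebook.

```python
from pathlib import Path

import numpy as np


def load_har_from_text(root):
    """Load and merge the train/test splits from a raw 'UCI HAR Dataset' folder.

    Assumed layout (taken from the original UCI archive):
        <root>/train/X_train.txt, <root>/train/y_train.txt
        <root>/test/X_test.txt,   <root>/test/y_test.txt
    """
    root = Path(root)
    features, labels = [], []
    for split in ("train", "test"):
        # Feature files are whitespace-delimited, one sample per row
        features.append(np.loadtxt(root / split / f"X_{split}.txt", ndmin=2))
        # Label files hold one integer activity code per row
        labels.append(np.loadtxt(root / split / f"y_{split}.txt", dtype=int, ndmin=1))

    # Stack the two splits into one feature matrix and one label vector
    X_all = np.vstack(features)
    y_all = np.concatenate(labels)
    if len(X_all) != len(y_all):
        raise ValueError("feature and label row counts differ")
    return X_all, y_all
```

With the full archive, the merged matrix should come out to 10,299 rows (7,352 training plus 2,947 test samples) and 561 columns, which makes for a quick sanity check after loading.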
