# **Unsupervised Learning**

Unsupervised learning is a type of machine learning where the model works with unlabeled data. The algorithm identifies patterns, structures, or clusters within the data without explicit instructions on what to predict.

## Characteristics

- **No labeled data required**: The algorithm discovers hidden structures in the data.
- **Goal**: Group or cluster similar data points and find underlying patterns.
- **Common tasks**: Clustering, association rule mining, and dimensionality reduction.

---

## Workflow

1. **Data Collection**:
   - Gather unlabeled data.
2. **Model Training**:
   - Use algorithms to identify patterns or group data.
3. **Evaluation**:
   - Assess cluster quality with metrics such as silhouette score or inertia, or by visually inspecting the resulting groups.
4. **Application**:
   - Use discovered patterns for decision-making or further analysis.
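
The four steps above can be sketched end to end. This is a minimal illustration rather than a prescribed pipeline: the data is synthetic (two Gaussian blobs generated with NumPy), and the choice of K-Means, the candidate range `k = 2..5`, and the variable names `scores` and `best_k` are assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# 1. Data collection: synthetic, unlabeled 2-D points (a stand-in for real data)
rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2)),
])

# 2. Model training: fit K-Means for several candidate cluster counts
scores = {}
for k in range(2, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data)
    # 3. Evaluation: silhouette lies in [-1, 1], higher is better;
    #    inertia is the within-cluster sum of squared distances
    scores[k] = silhouette_score(data, model.labels_)
    print(f"k={k}  silhouette={scores[k]:.3f}  inertia={model.inertia_:.1f}")

# 4. Application: keep the cluster count with the best silhouette score
best_k = max(scores, key=scores.get)
print("Best k by silhouette:", best_k)
```

Inertia always decreases as `k` grows, so it is usually read with an elbow plot, whereas the silhouette score can be compared directly across values of `k`.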

---

## Techniques

### Clustering

Clustering groups data points into clusters based on similarity.

- **Algorithms**:
  - K-Means
  - DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
  - Hierarchical Clustering

- **Example**:
  Segment customers based on purchasing behavior.

  ```python
  from sklearn.cluster import KMeans
  import numpy as np

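  # Sample data: six 2-D points forming two well-separated groups
  data = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])

  # Fit K-Means with two clusters (n_init is set explicitly so behavior
  # is consistent across scikit-learn versions)
  kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)

  # Inspect the learned cluster centers and each point's cluster assignment
  print("Cluster Centers:", kmeans.cluster_centers_)
  print("Labels:", kmeans.labels_)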
  ```
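
The other two algorithms listed above can be tried on similar toy data. The sketch below is illustrative only: the extra point `[50, 50]` is added here to show how DBSCAN treats an isolated point, and `eps=3` and `min_samples=2` are hand-picked for this data, not recommended defaults.

```python
import numpy as np
from sklearn.cluster import DBSCAN, AgglomerativeClustering

# Toy data: two tight groups plus one isolated point at [50, 50]
data = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0], [50, 50]])

# DBSCAN: no cluster count is given; points with no dense neighborhood
# are labeled -1 (noise). eps and min_samples are chosen by hand here.
dbscan_labels = DBSCAN(eps=3, min_samples=2).fit_predict(data)
print("DBSCAN labels:", dbscan_labels)

# Hierarchical (agglomerative) clustering: merges the closest groups
# bottom-up until n_clusters remain. The isolated point is left out.
agglo_labels = AgglomerativeClustering(n_clusters=2).fit_predict(data[:6])
print("Hierarchical labels:", agglo_labels)
```

Unlike K-Means, DBSCAN does not force every point into a cluster, which is why it also appears under anomaly detection.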

### Dimensionality Reduction


Dimensionality reduction reduces the number of features while retaining important information.

- **Algorithms**:
  - PCA (Principal Component Analysis)
  - t-SNE (t-Distributed Stochastic Neighbor Embedding)

- **Example**:
  Visualizing high-dimensional data in 2D or 3D space.

  ```python
  from sklearn.decomposition import PCA
  import numpy as np

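  # Sample data: 100 random points with 5 features (seeded for reproducibility)
  rng = np.random.default_rng(0)
  data = rng.random((100, 5))

  # Project onto the 2 principal components with the largest variance
  pca = PCA(n_components=2)
  reduced_data = pca.fit_transform(data)

  # Show the first five rows of the 2-D projection
  print("Reduced Data:", reduced_data[:5])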
  ```
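
One way to check how much information a projection retains is PCA's `explained_variance_ratio_`. The sketch below assumes synthetic data built from two hidden factors plus a little noise; the names `latent` and `mixing` are hypothetical and exist only to construct that data.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic 5-D data that is really driven by two hidden factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))   # two underlying factors
mixing = rng.normal(size=(2, 5))     # hypothetical mixing matrix
data = latent @ mixing + 0.05 * rng.normal(size=(100, 5))

# Fit PCA and measure the share of total variance each component keeps
pca = PCA(n_components=2).fit(data)
retained = pca.explained_variance_ratio_.sum()
print("Variance ratio per component:", pca.explained_variance_ratio_)
print(f"Total variance retained: {retained:.3f}")
```

Because this data is close to two-dimensional by construction, two components keep nearly all of the variance. On purely random data, such as the example above, the retained share would be much lower.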


---

## Applications


1. **Customer Segmentation**:
   - Group customers based on purchasing behavior.
   - Techniques: K-Means, Hierarchical Clustering.

2. **Anomaly Detection**:
   - Identify outliers or unusual patterns in data.
   - Techniques: Isolation Forest, DBSCAN.

3. **Document Clustering**:
   - Group similar documents for topic analysis.
   - Techniques: Latent Dirichlet Allocation (LDA).

4. **Image Compression**:
   - Reduce image size while preserving important features.
   - Techniques: PCA.
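
As an illustration of the anomaly detection use case, the sketch below runs an Isolation Forest on synthetic data. The three injected outliers and `contamination=0.02` (the expected share of anomalies) are assumptions for the example; on real data that share is usually unknown and has to be estimated.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data: 200 ordinary points plus 3 obvious outliers at the end
rng = np.random.default_rng(0)
normal_points = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.0], [10.0, -10.0]])
data = np.vstack([normal_points, outliers])

# Fit without labels; contamination sets the share of points to flag
detector = IsolationForest(contamination=0.02, random_state=0).fit(data)

# predict() returns +1 for inliers and -1 for outliers
predictions = detector.predict(data)
print("Flagged indices:", np.where(predictions == -1)[0])
```

The model never sees which points were injected. It flags the points that random splits isolate most quickly, so a few ordinary points near the edge of the cloud may be flagged as well.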

---

Unsupervised learning is widely used for exploratory data analysis and has numerous practical applications in areas such as marketing, fraud detection, and biology. It helps discover the hidden structures within data without requiring labels.

## Unsupervised Learning Models

- [K-Means Clustering](../Unsupervised%20Learning/01%20-%20K-Means%20Clustering%20Algorithm.ipynb)
- [Hierarchical Clustering](../Unsupervised%20Learning/03%20-%20Hierarchical%20Clustering.ipynb)
- [Principal Component Analysis (PCA)](../Unsupervised%20Learning/05%20-%20Principal%20Component%20Analysis%20(PCA).ipynb)
- Singular Value Decomposition
- Independent Component Analysis

[More About Unsupervised Learning Models & Algorithms](Unupervised%20Learning/)

---