# Analysis of PCA Loadings and Clustering Results

## Top Contributors to PCA Components

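Below are the features with the highest contributions to PCA1 and PCA2. Rows are sorted by PCA1 weight, then by PCA2 weight for features listed only under PCA2. A dash marks a feature that is not listed among the top contributors to that component.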

| **Feature**                                   | **Weight (PCA1)** | **Weight (PCA2)** |
|-----------------------------------------------|--------------------|--------------------|
| mode_Voiture / Moto                           | 0.5445             | -                  |
| speed                                         | 0.4852             | -                  |
| purpose_Santé                                 | 0.1341             | -                  |
| latitude                                      | 0.1293             | -                  |
| purpose_Loisir                                | 0.1162             | -                  |
| purpose_Reconduire / aller chercher une personne | 0.0824             | -                  |
| purpose_Magasinage / emplettes                | 0.0726             | -                  |
| mode_À pied, Transport collectif, Voiture / Moto | 0.0504             | 0.1134             |
| mode_Transport collectif, Vélo               | 0.0040             | -                  |
| purpose_Travail / Rendez-vous d'affaires      | -0.0017            | 0.6970             |
| mode_Transport collectif                      | -                  | 0.1496             |
| mode_Vélo                                     | -                  | 0.0579             |
| purpose_Éducation                             | -                  | 0.0224             |
| mode_À pied, Transport collectif              | -                  | 0.0141             |
| mode_À pied                                   | -                  | 0.0074             |
| purpose_Repas / collation / café              | -                  | 0.0059             |
| mode_À pied, Vélo                             | -                  | 0.0029             |
| altitude                                      | -                  | -0.0023            |

## PCA Loadings Analysis

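PCA loadings show how strongly each original feature contributes to each principal component (PCA1 and PCA2). The magnitude of a loading gives the strength of the contribution, and its sign gives the direction along the component. The findings are examined in detail below.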
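
As a minimal sketch of how such a loadings table can be produced, the example below fits scikit-learn's `PCA` on a small synthetic feature matrix and reads the weights from `components_`. The data, the reduced feature list, and the standardization step are assumptions for illustration only; the original preprocessing is not stated in this analysis.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the trip feature matrix (not the real data).
rng = np.random.default_rng(0)
feature_names = ["speed", "latitude", "altitude", "mode_Voiture / Moto"]
X = pd.DataFrame(rng.normal(size=(200, 4)), columns=feature_names)
# Make speed co-vary with car use so the first component has clear structure.
X["speed"] += 2.0 * X["mode_Voiture / Moto"]

# Standardizing is an assumption; the original pipeline may differ.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
scores = pca.fit_transform(X_scaled)

# components_ has shape (n_components, n_features); transpose so rows are features.
loadings = pd.DataFrame(
    pca.components_.T, index=feature_names, columns=["PCA1", "PCA2"]
)

# Rank features by absolute weight, since the sign only gives direction.
top_pca1 = loadings["PCA1"].abs().sort_values(ascending=False)

print(loadings.round(4))
print(top_pca1.index[:2].tolist())
```

Ranking by absolute value matters because a large negative weight is as influential as a large positive one; only the direction differs.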

### PCA1 (Mode and Speed Axis)
- **Dominant Features**:
  - **`mode_Voiture / Moto` (Weight: 0.5445)**:
    - The highest contributor to PCA1.
    - Indicates private vehicle usage (cars/motorbikes) is a key differentiator along this axis.
    - Loads in the same direction as `speed`, consistent with private vehicles being faster on average.
  - **`speed` (Weight: 0.4852)**:
    - Loads in the same direction as `mode_Voiture / Moto`, so speed and private vehicle use increase together along PCA1.
    - Highlights that faster travel modes, such as cars or motorcycles, dominate this component.
  - **`purpose_Santé` (Health-related Purpose; Weight: 0.1341)**:
    - Suggests that health-related trips (e.g., hospital visits) lean toward the fast, motorized end of the axis, although the weight is modest.
  - **`latitude` (Weight: 0.1293)**:
    - Represents spatial variation.
    - Shows that the geographical distribution of trips plays a moderate role in the clusters.
  - **`purpose_Loisir` (Leisure-related Purpose; Weight: 0.1162)**:
    - Leisure trips, like health-related trips, lean toward motorized transport, again with a modest weight.
  - **`purpose_Reconduire / aller chercher une personne` (Pick-up/Drop-off Purpose; Weight: 0.0824)**:
    - A small positive weight, suggesting that these trips sit slightly toward the motorized end of the axis.

- **Interpretation**:
  - PCA1 separates trips primarily by **mode of transport** and **travel speed**, with **trip purpose** playing a secondary role.
  - High PCA1 scores correspond to fast, private motorized travel (cars and motorcycles).

---

### PCA2 (Purpose and Public Transport Axis)
- **Dominant Features**:
  - **`purpose_Travail / Rendez-vous d'affaires` (Work-related Purpose; Weight: 0.6970)**:
    - The most influential feature for PCA2.
    - High PCA2 scores correspond to commuting and business trips.
  - **`mode_Transport collectif` (Public Transport; Weight: 0.1496)**:
    - Suggests that public transport is associated with work trips along this axis.
    - Consistent with some work trips involving buses, trains, or shared transit.
  - **`mode_À pied, Transport collectif, Voiture / Moto` (Multi-modal; Weight: 0.1134)**:
    - Suggests some work trips combine walking, public transport, and a private vehicle.
  - **`mode_Vélo` (Cycling; Weight: 0.0579)**:
    - Indicates that cycling trips contribute to PCA2, but less prominently than public transit.
  - **`purpose_Éducation` (Education-related Purpose; Weight: 0.0224)**:
    - Low contribution but shows similarity with work-related trips in commuting patterns.

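- **Near-Zero Contributions**:
  - **`altitude` (Weight: -0.0023)**:
    - The weight is effectively zero, so altitude contributes almost nothing to PCA2; the negative sign is not meaningful at this magnitude.
  - The other weights below 0.01 (`mode_À pied`, `purpose_Repas / collation / café`, `mode_À pied, Vélo`) are similarly negligible on this axis.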

- **Interpretation**:
  - PCA2 highlights differences in **trip purpose** (especially work) and, to a lesser degree, **transport modes** (public transport, multi-modal travel, and cycling).
  - It is primarily a work and commuting axis.

---

### Key Observations from PCA Loadings
- **Travel Mode Dominance**:
  - PCA1 focuses on private vehicles and speed-related features.
  - PCA2 emphasizes public transport and work-related patterns.
- **Purpose-Specific Clusters**:
  - Leisure and health-related trips align with PCA1.
  - Work and education-related trips align with PCA2.
- **Geographical Impact**:
  - Latitude plays a moderate role, but altitude has negligible influence.

---

## Trends in Clustering Results

### K-Means Clustering
- **Cluster Overview**:
  - K-Means splits the data into three clusters based on PCA1 and PCA2.
  - Cluster distinctions are sharp, with boundaries defined by **travel modes** and **trip purposes**.

- **Trends Observed**:
  - **Cluster 0**:
    - Represents trips with high-speed motorized travel (e.g., cars).
    - Dominated by leisure and health-related purposes.
  - **Cluster 1**:
    - Includes trips with low- to medium-speed modes (e.g., walking, cycling).
    - Shows a mix of purposes, including education and errands.
  - **Cluster 2**:
    - Focused on work-related trips using public transport.
    - Represents commuting patterns with multi-modal travel.

- **Insights**:
  - K-Means effectively separates clusters based on both **speed** and **trip purpose**.
  - Cluster boundaries reflect behavior differences in travel modes.
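
The three-cluster K-Means step described above can be sketched as follows. The 2-D scores here are synthetic blobs standing in for the real PCA projection, and `n_init` and `random_state` are illustrative choices rather than the settings used in the original run.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical 2-D PCA scores: three synthetic blobs, not the real projection.
rng = np.random.default_rng(42)
centers = np.array([[3.0, 0.0], [-3.0, 0.0], [0.0, 4.0]])
scores = np.vstack(
    [c + rng.normal(scale=0.5, size=(100, 2)) for c in centers]
)

# Three clusters, as in the analysis; the other parameters are assumptions.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

# Cluster sizes and centroids in PCA space; the cluster numbering is arbitrary.
sizes = np.bincount(labels)
print(sizes)
print(kmeans.cluster_centers_.round(2))
```

Because the label numbers carry no meaning of their own, interpreting a cluster (for example as "high-speed motorized travel") requires looking at its centroid or at the original features of its members.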

---

### DBSCAN Clustering
- **Cluster Overview**:
  - DBSCAN identifies density-based clusters, forming natural groupings in the data.
  - It detects:
    - **Cluster 0**: Dense core of trips with similar characteristics.
    - **Cluster 1**: Smaller cluster of unique, possibly slower trips.
    - **Noise Points (-1)**: Outliers, representing trips with unique or extreme characteristics.

- **Trends Observed**:
  - DBSCAN highlights:
    - **Cluster 0**: Majority of trips involving commuting or routine travel patterns.
    - **Cluster 1**: Sparse trips that are slower or geographically distinct.
    - **Noise Points**: Outliers that may represent errors or exceptional travel behavior.

- **Insights**:
  - DBSCAN excels at identifying **outliers** and natural groupings.
  - It does not require the number of clusters in advance and can find irregularly shaped clusters, which makes it a useful complement to K-Means when the data contain outliers.
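
A minimal sketch of the DBSCAN behavior described above: a dense core, a smaller cluster, and noise points labeled `-1`. The synthetic points and the `eps` and `min_samples` values are assumptions for illustration; the parameters of the original run are not given.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical 2-D PCA scores: a dense core, a small group, and far outliers.
rng = np.random.default_rng(7)
dense = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(200, 2))
small = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(40, 2))
outliers = np.array([[10.0, -10.0], [-10.0, 10.0], [12.0, 12.0]])
scores = np.vstack([dense, small, outliers])

# eps and min_samples are illustrative; they need tuning on real data.
db = DBSCAN(eps=0.5, min_samples=5)
labels = db.fit_predict(scores)

# DBSCAN marks noise with the label -1; all other labels are clusters.
n_noise = int(np.sum(labels == -1))
n_clusters = len(set(labels.tolist())) - (1 if n_noise > 0 else 0)
print(f"clusters: {n_clusters}, noise points: {n_noise}")
```

The results are sensitive to `eps`: a value that is too small turns most trips into noise, while a value that is too large merges distinct groups into one cluster.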

---

## Combined Insights

### Mode-Purpose Relationship
- Fast travel modes (e.g., cars) align with health and leisure purposes (PCA1).
- Public transport and cycling are common in work and education trips (PCA2).
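
One way to check a mode-purpose relationship directly, rather than inferring it from loadings, is a row-normalized cross-tabulation. The sketch below uses a handful of made-up trips; the column names `mode` and `purpose` are assumed from the one-hot prefixes in the table and may not match the original dataset.

```python
import pandas as pd

# Hypothetical toy trips, not the real data.
trips = pd.DataFrame({
    "mode": [
        "Voiture / Moto", "Voiture / Moto", "Voiture / Moto",
        "Transport collectif", "Transport collectif", "Vélo",
    ],
    "purpose": [
        "Santé", "Loisir", "Loisir",
        "Travail / Rendez-vous d'affaires",
        "Travail / Rendez-vous d'affaires",
        "Éducation",
    ],
})

# Row-normalized: each row gives the share of purposes within one mode.
share = pd.crosstab(trips["mode"], trips["purpose"], normalize="index")
print(share.round(2))
```

On the real data, a table like this would show whether car trips are in fact concentrated in health and leisure purposes, which the loadings alone only suggest.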

### Spatial Influence
- Latitude contributes moderately to clustering, but altitude is negligible.

---

### Final Remarks
Both clustering methods provide valuable insights. K-Means excels in defining distinct behavioral clusters, while DBSCAN is effective for identifying outliers and natural groupings. The PCA analysis highlights the critical role of travel modes and purposes in differentiating clusters.


![dbscan_clusters.png](attachment:a17c091a-414e-4fb8-927e-3d31c064e24b.png)

![kmeans_clusters.png](attachment:b8ecf00c-37f4-4c63-b8eb-45c1c1d61716.png)