# Exploratory Data Analysis

This project uses the [**Drowning Detection Dataset**](https://github.com/Wang-Kaikai/drowning-detection-dataset), published as an open-access repository by Wang Kaikai et al.

This dataset was originally introduced in their research paper:
[**A Dataset for Drowning Detection Based on Surveillance Videos**](https://doi.org/10.3390/app14010200)

The dataset consists of labeled surveillance video clips that represent various human states in aquatic environments, including:

- **Drowning**
- **Treading Water**
- **Swimming**

In this section, we perform an exploratory data analysis (EDA) that covers:

- The distribution of instances across the behavior classes.
- Sample visualizations for each behavior class.

This analysis serves as a foundation for designing robust detection and classification models in the later stages.


In [1]:
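import os
import sys

from dotenv import load_dotenv

# Load environment variables (including SRC_DIR) from the local .env file.
load_dotenv()

# Make the project's source directory importable. Skip this step when
# SRC_DIR is unset so that None is never appended to sys.path.
src_dir = os.getenv('SRC_DIR')
if src_dir:
    sys.path.append(src_dir)

from utils.system import display_system_info

display_system_info(markdown=True)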


**Last Updated**: 2025-04-17 04:10:07

**Python Version**: 3.11.11  
**OS**: Windows 10.0.26100  
**Architecture**: 64bit  
**Hostname**: ShenLaptop  
**Processor**: Intel64 Family 6 Model 186 Stepping 3, GenuineIntel  
**RAM Size**: 15.65 GB

In [None]:
import cv2

import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns

According to the paper by [Wang Kaikai et al.](https://doi.org/10.3390/app14010200), the label numbers map to the class names as follows:

| Label Number | Class Name  |
|--------------|-------------|
| 0            | Swimming    |
| 1            | Treading Water |
| 2            | Drowning    |

## Class Label Distribution

Here we count the number of annotated instances in each class.
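
The counting step could be sketched as follows. This is a minimal illustration, not the notebook's original cell: it assumes the annotations are stored as YOLO-style `.txt` files (one `<class_id> <x_center> <y_center> <width> <height>` line per instance), which should be checked against the actual dataset layout. The function name `count_class_instances` and the `label_dir` argument are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Label mapping taken from the table above.
CLASS_NAMES = {0: "Swimming", 1: "Treading Water", 2: "Drowning"}


def count_class_instances(label_dir):
    """Count annotated instances per class across all .txt files in label_dir.

    Assumption: each non-empty line is a YOLO-style annotation whose first
    token is the integer class ID.
    """
    counts = Counter()
    for label_file in sorted(Path(label_dir).glob("*.txt")):
        for line in label_file.read_text().splitlines():
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            counts[int(parts[0])] += 1
    # Report every known class, including classes with zero instances.
    return {name: counts.get(class_id, 0) for class_id, name in CLASS_NAMES.items()}
```

The returned dictionary can be passed to `pd.Series(...)` and plotted as a bar chart to inspect class imbalance.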


## Sample Image Visualisation

This section provides a visual overview of sample frames from the drowning detection dataset. Each image is annotated with bounding boxes around individuals, and each box carries one of the three class labels: *swimming*, *treading water*, or *drowning*.
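
To draw a bounding box on a frame, the annotation first has to be expressed in pixel coordinates. The sketch below shows one way to do that, assuming the boxes are stored in normalized YOLO format (center and size as fractions of the image dimensions); that format is an assumption about the dataset, and `yolo_to_pixel_box` is a hypothetical helper.

```python
def yolo_to_pixel_box(x_center, y_center, width, height, img_w, img_h):
    """Convert a normalized YOLO box to pixel corners (x1, y1, x2, y2).

    Assumption: all four box values are fractions in [0, 1] of the image size.
    Corners are clamped so that they never fall outside the image.
    """
    x1 = int(round((x_center - width / 2) * img_w))
    y1 = int(round((y_center - height / 2) * img_h))
    x2 = int(round((x_center + width / 2) * img_w))
    y2 = int(round((y_center + height / 2) * img_h))
    x1, x2 = max(0, x1), min(img_w, x2)
    y1, y2 = max(0, y1), min(img_h, y2)
    return x1, y1, x2, y2
```

The resulting corners can be passed to `cv2.rectangle(image, (x1, y1), (x2, y2), color, thickness)` to draw the box, with a different color per class.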


## Sample ROI for Classes

Here we display sample cropped regions of interest (ROIs) for each class.
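
Cropping an ROI amounts to slicing the image array with the box's pixel corners. A minimal sketch, assuming the image is a NumPy array as returned by `cv2.imread` and the box is given as `(x1, y1, x2, y2)` in pixels; `crop_roi` is a hypothetical helper name.

```python
import numpy as np


def crop_roi(image, box):
    """Return a copy of the region of image covered by box = (x1, y1, x2, y2).

    The box is clamped to the image bounds before slicing. NumPy images are
    indexed as [row, column], that is [y, x].
    """
    x1, y1, x2, y2 = box
    img_h, img_w = image.shape[:2]
    x1, x2 = max(0, x1), min(img_w, x2)
    y1, y2 = max(0, y1), min(img_h, y2)
    return image[y1:y2, x1:x2].copy()


# Example with a synthetic 100x200 three-channel image.
frame = np.zeros((100, 200, 3), dtype=np.uint8)
roi = crop_roi(frame, (50, 25, 150, 75))
```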

The cropped samples are consistent with the mapping given in the paper, so we can confirm the class labels as follows:

| Label Number | Class Name  |
|--------------|-------------|
| 0            | Swimming    |
| 1            | Treading Water |
| 2            | Drowning    |

## Understanding Image Channels

To better understand how different color representations impact image perception and feature extraction, we visualize the individual channel components across various color spaces:

- **RGB**: Red, Green, Blue  
- **HSV**: Hue, Saturation, Value  
- **LAB**: Lightness, A (green–red), B (blue–yellow)  

By visualizing these channels independently, we can observe how each component contributes to the image, aiding in preprocessing and model interpretation tasks.
