

# Title: Dataset Preparation and Preprocessing for BD Sports-10


## Dataset Access:


### We used the resized version (224×224 pixels) of the dataset for our experiments. The original high-resolution version (1920×1080 pixels) is also available.


### 1) Resized Version (224×224): Mendeley Data: https://data.mendeley.com/datasets/rnh3x48nfb/1

### 2) Original Version (1920×1080): Science Data Bank: https://doi.org/10.57760/sciencedb.24216

  

## Citation:


### When using this dataset, please cite the following:


#### 1) Tanzim, Wazih Ullah; Minhaz Hossain, Syed Md.; Supta, Niloy Barua; Shifa, Shifatun Nur (2025). DOI: 10.17632/rnh3x48nfb.1



#### 2) Wazih Ullah Tanzim, Syed Md. Minhaz Hossain, Niloy Barua Supta, et al. (2025). DOI: 10.57760/sciencedb.24216






### Step-by-Step Description

1. **Install dependencies**

   * Install required Python libraries:

     ```bash
     pip install --upgrade tensorflow
     pip install -q git+https://github.com/tensorflow/docs
     pip install torch torchvision pillow
     ```
   * These packages provide the deep learning frameworks (`tensorflow`, `torch`), notebook visualization helpers (`tensorflow-docs`), and image handling (`pillow`). Video decoding relies on `opencv-python` (`cv2`), which these commands do not install, so it must already be present in the environment.

2. **Import necessary libraries**

   * `cv2` for video frame extraction and resizing.
   * `tensorflow` and `keras` for data preprocessing and model-ready tensors.
   * `numpy` for numerical operations.
   * `pandas` for handling metadata.
   * `tqdm` for progress bars.
   * `sklearn.model_selection.train_test_split` for splitting the dataset.

3. **Define configuration**

   * Set global hyperparameters:

     * Epochs: `100`
     * Batch size: `32`
     * Classes: `10` categories of Bangladeshi indigenous sports (Hari Vanga, Joldanga, Kabaddi, Kanamachi, Kho Kho, Kolagach, Lathim, Lathi Khela, Morog Lorai, Nouka Baich).
   * `num_classes` is calculated automatically from the length of the class list.

4. **Frame formatting**

   * Each frame is resized to **224×224 pixels with padding** using TensorFlow.
   * This ensures uniform input dimensions while preserving aspect ratio.

5. **Frame extraction from videos**

   * Videos are read using `cv2.VideoCapture`.
   * From each video, a fixed number of frames (**10**) is extracted.
   * Frames are sampled with `frame_step=15` (every 15th frame) to avoid redundancy between consecutive frames.
   * If the video ends before all 10 frames have been collected, all-zero frames are appended to fill the remaining positions.
   * Frames are converted from **BGR → RGB** format for compatibility with deep learning models.

6. **Train–test split**

   * Videos from each class folder are collected (`.mp4` format).
   * Data is shuffled for randomness.
   * An **80–20 split** is applied:

     * **80% training set**
     * **20% test set**
   * Targets (class labels) are stored as numeric indices.

7. **Feature extraction**

   * For each video in training and test sets:

     * Frames are extracted.
     * Each video is represented as a **NumPy array of shape** `(10, 224, 224, 3)`, that is, `(n_frames, height, width, channels)`.
   * This forms a dataset ready for model input.

8. **Validation split**

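   * The training set is further split into **training (80%)** and **validation (20%)** subsets using `train_test_split`, which gives roughly 64% / 16% / 20% of all videos for training / validation / test.
   * The validation subset is held out from training so it can be used to monitor model performance.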

9. **Convert to TensorFlow Datasets**

   * Training, validation, and test arrays are converted into `tf.data.Dataset` objects.
   * The training dataset is **shuffled**; the validation and test datasets keep their order.
   * All three datasets are:

     * **Batched** according to `CFG.batch_size`.
     * **Cached and prefetched** to keep the GPU supplied with data during training.

10. **Final dataset structure**

    * **Training dataset**: videos → frames → tensors.
    * **Validation dataset**: subset of training set for tuning.
    * **Test dataset**: unseen videos for evaluation.
    * All inputs standardized to **(224×224×3)**, with **10 frames per video**.
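
Because every video ends up as a `(10, 224, 224, 3)` tensor, the memory needed to hold a split as one array can be estimated before running the extraction. The sketch below works through that arithmetic. It assumes `float32` storage (the frames are converted to floating point during formatting), and the count of 1000 videos is purely hypothetical, not the size of BD Sports-10.

```python
# Sketch: size of one preprocessed video and of a whole split.
# Shape values come from the description above; float32 storage is assumed.
N_FRAMES, HEIGHT, WIDTH, CHANNELS = 10, 224, 224, 3
BYTES_PER_FLOAT32 = 4


def video_bytes(n_frames=N_FRAMES, height=HEIGHT, width=WIDTH, channels=CHANNELS):
    """Return the number of bytes one float32 video tensor occupies."""
    return n_frames * height * width * channels * BYTES_PER_FLOAT32


def split_gib(n_videos):
    """Return the size in GiB of `n_videos` stacked video tensors."""
    return n_videos * video_bytes() / 2**30


per_video = video_bytes()
print(per_video)                   # bytes for one (10, 224, 224, 3) tensor
print(round(split_gib(1000), 2))   # hypothetical split of 1000 videos
```

One video takes about 6 MB, so a split of a few thousand videos can exceed the RAM of a typical notebook session if it is held as a single NumPy array.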

---


In [None]:
pip install --upgrade tensorflow

In [None]:
import tensorflow as tf
import tensorflow_hub as hub
import keras

print("TensorFlow version:", tf.__version__)
print("TensorFlow Hub version:", hub.__version__)
print("Keras version:", keras.__version__)

In [None]:
pip install -q git+https://github.com/tensorflow/docs

In [None]:
import gc
import glob
import random

import cv2
import imageio
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tensorflow as tf
from IPython.display import display, Image
from sklearn.model_selection import train_test_split
from tensorflow_docs.vis import embed
from tqdm.notebook import tqdm

In [None]:
pip install torch torchvision

In [None]:
pip install pillow

In [None]:
class CFG:
    epochs = 100
    batch_size = 32
    classes = ["Hari_Vanga","Joldanga", "Kabaddi", "Kanamachi","Kho_Kho", "Kolagach","Lathim","Lathi_Khela","Morog_Lorai", "Nouka_Baich"]

In [None]:
num_classes = len(CFG.classes)
print(num_classes)
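
Since the targets are stored as numeric indices, a lookup in both directions is handy when reading predictions or confusion matrices later. The following is a minimal sketch; the class names are copied from `CFG.classes`, while `CLASSES`, `class_to_index`, and `index_to_class` are hypothetical helper names that the notebook itself does not define.

```python
# Class names copied from CFG.classes; the order defines the numeric label.
CLASSES = [
    "Hari_Vanga", "Joldanga", "Kabaddi", "Kanamachi", "Kho_Kho",
    "Kolagach", "Lathim", "Lathi_Khela", "Morog_Lorai", "Nouka_Baich",
]

# Hypothetical helpers: name -> index and index -> name.
class_to_index = {name: idx for idx, name in enumerate(CLASSES)}
index_to_class = dict(enumerate(CLASSES))

print(class_to_index["Kabaddi"])   # numeric label used as the target
print(index_to_class[9])           # class name for a predicted index
```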

# Convert Image Dtype and Resize with Padding

In [None]:
def format_frames(frame, output_size):
  """
    Pad and resize an image from a video.

    Args:
      frame: Image that needs to be resized and padded.
      output_size: Pixel size (height, width) of the output frame image.

    Return:
      Formatted float32 frame, padded to the specified output size.
  """
  frame = tf.image.convert_image_dtype(frame, tf.float32)
  frame = tf.image.resize_with_pad(frame, *output_size)
  return frame
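
To make the effect of resizing with padding concrete, the sketch below computes the geometry of an aspect-preserving resize followed by centered padding. It is a pure-Python approximation written for illustration, not TensorFlow's implementation, so the exact rounding inside `tf.image.resize_with_pad` may differ by a pixel. The function name `resize_with_pad_geometry` is hypothetical.

```python
def resize_with_pad_geometry(src_h, src_w, target_h, target_w):
    """Return (new_h, new_w, pad_top, pad_left) for an aspect-preserving
    resize into a target canvas with the image centered."""
    # The smaller of the two ratios makes the whole image fit inside the target.
    scale = min(target_h / src_h, target_w / src_w)
    new_h = round(src_h * scale)
    new_w = round(src_w * scale)
    # Leftover space is split between the two sides; this is the leading side.
    pad_top = (target_h - new_h) // 2
    pad_left = (target_w - new_w) // 2
    return new_h, new_w, pad_top, pad_left


# Original 1920x1080 frame (height 1080, width 1920) into a 224x224 canvas.
print(resize_with_pad_geometry(1080, 1920, 224, 224))
# Frame from the resized dataset, which is already 224x224.
print(resize_with_pad_geometry(224, 224, 224, 224))
```

For a 1920×1080 source, the picture occupies a 224×126 band with 49 rows of padding above and below. For the 224×224 version of the dataset, the call leaves the frame size unchanged and adds no padding.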

# Frame Extraction from Video

In [None]:
def frames_from_video_file(video_path, n_frames, output_size = (224,224), frame_step = 15):
  """
    Creates frames from each video file present for each category.

    Args:
      video_path: File path to the video.
      n_frames: Number of frames to be created per video file.
      output_size: Pixel size (height, width) of the output frame image.
      frame_step: Number of source frames to advance between sampled frames.

    Return:
      A NumPy array of frames in the shape of (n_frames, height, width, channels).
  """
  # Read each video frame by frame
  result = []
  src = cv2.VideoCapture(str(video_path))

  # CAP_PROP_FRAME_COUNT is returned as a float, so cast it to int
  video_length = int(src.get(cv2.CAP_PROP_FRAME_COUNT))

  # Number of source frames spanned by n_frames samples taken frame_step apart
  need_length = 1 + (n_frames - 1) * frame_step

  if need_length > video_length:
    start = 0
  else:
    # random.randint includes both ends, so max_start is the last valid start
    max_start = video_length - need_length
    start = random.randint(0, max_start)

  src.set(cv2.CAP_PROP_POS_FRAMES, start)
  # ret is a boolean indicating whether read was successful, frame is the image itself
  ret, frame = src.read()
  if not ret:
    src.release()
    raise ValueError(f"Could not read a frame from {video_path}")
  result.append(format_frames(frame, output_size))

  for _ in range(n_frames - 1):
    # Advance frame_step frames; only the last frame read is kept
    for _ in range(frame_step):
      ret, frame = src.read()
    if ret:
      frame = format_frames(frame, output_size)
      result.append(frame)
    else:
      # The video ended early: pad with an all-zero frame
      result.append(np.zeros_like(result[0]))
  src.release()
  # OpenCV decodes frames as BGR; reorder the channels to RGB
  result = np.array(result)[..., [2, 1, 0]]

  return result
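
The sampling pattern is easier to see without any video I/O. The sketch below lists which source frame indices are kept for a given video length and how many positions are filled with zero frames. It mirrors the `n_frames` and `frame_step` logic of the function above. `sampled_frame_indices` is a hypothetical helper, and the 100-frame video is an invented example.

```python
def sampled_frame_indices(video_length, n_frames=10, frame_step=15, start=0):
    """Return (kept_indices, n_padded) for the sampling scheme.

    kept_indices are the source frames that exist in the video;
    n_padded is the number of all-zero frames appended afterwards.
    """
    wanted = [start + i * frame_step for i in range(n_frames)]
    kept = [idx for idx in wanted if idx < video_length]
    return kept, n_frames - len(kept)


# Frames spanned by 10 samples taken 15 frames apart.
need_length = 1 + (10 - 1) * 15
print(need_length)

# A hypothetical 100-frame video is too short, so sampling starts at 0
# and the missing positions are zero-padded.
print(sampled_frame_indices(100))
```

With the default settings a video needs at least 136 frames to yield 10 real frames. Shorter videos contribute some all-zero frames at the end of their tensor.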

# Train Test Split

In [None]:
import os
import random

train_file_paths = []
test_file_paths = []

train_targets = []
test_targets = []

# For reproducibility
random.seed(42)

# Assuming CFG.classes contains the class names in order
for i, cls in enumerate(CFG.classes):
    class_dir = f"/kaggle/input/bd-sports-10-dataset-224x224-pixels-resized/BD_Sports_10/Dataset/{cls}"
    
    if os.path.exists(class_dir):
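        # Collect all mp4 files from this class. os.listdir returns files in arbitrary
        # order, so sort first to make the seeded shuffle below reproducible.
        files = sorted(os.path.join(class_dir, f) for f in os.listdir(class_dir) if f.endswith('.mp4'))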
        
        # Shuffle before splitting
        random.shuffle(files)
        
        # 80% train, 20% test split
        split_idx = int(0.8 * len(files))
        train_files = files[:split_idx]
        test_files = files[split_idx:]
        
        # Add to lists
        train_file_paths.extend(train_files)
        test_file_paths.extend(test_files)
        
        train_targets.extend([i] * len(train_files))
        test_targets.extend([i] * len(test_files))

# Print sample file paths and targets to verify
print("\033[1mTraining file paths:\033[0m")
print(train_file_paths[:5])

print("\n\033[1mTest file paths:\033[0m")
print(test_file_paths[:5])

print("\n\033[1mTrain Targets (first few):\033[0m")
print(train_targets[:5])

print("\n\033[1mTest Targets (first few):\033[0m")
print(test_targets[:5])

# Print total counts
print("\n\033[1mTotal Training files:\033[0m", len(train_file_paths))
print("\033[1mTotal Test files:\033[0m", len(test_file_paths))
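
The per-class split can be checked in isolation on made-up file names. The sketch below is one way to write it as a small function: it sorts the names and shuffles with a local `random.Random` instance, so the result does not depend on directory listing order or on the global random state. `split_class_files` and the `video_XX.mp4` names are hypothetical. The notebook cell above instead seeds the global generator once before looping over the classes.

```python
import random


def split_class_files(files, train_fraction=0.8, seed=42):
    """Shuffle one class's files reproducibly and split them into train/test."""
    rng = random.Random(seed)      # local generator, independent of global state
    ordered = sorted(files)        # fixed starting order before shuffling
    rng.shuffle(ordered)
    split_idx = int(train_fraction * len(ordered))
    return ordered[:split_idx], ordered[split_idx:]


# Ten invented file names standing in for one class folder.
files = [f"video_{i:02d}.mp4" for i in range(10)]
train, test = split_class_files(files)
print(len(train), len(test))
```

Because the split is done per class, each class contributes about 80% of its videos to training and 20% to testing, which keeps the class proportions similar in both sets.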


# Extract Train Feature

In [None]:
train_features = []
for train_file_path in tqdm(train_file_paths):
    train_features.append(frames_from_video_file(train_file_path, n_frames = 10))
train_features = np.array(train_features)

# Extract Test Feature

In [None]:
test_features = []
for test_file_path in tqdm(test_file_paths):
    test_features.append(frames_from_video_file(test_file_path, n_frames = 10))
test_features = np.array(test_features)

# Create Training and Validation Features and Targets

In [None]:
training_features, validation_features, training_targets, validation_targets = train_test_split(
    train_features,
    train_targets,
    test_size=0.2,
    random_state=42,
)
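
Applying a 20% validation split to a training pool that is already 80% of the data changes the overall proportions. The short calculation below makes them explicit, using exact fractions to avoid floating-point noise. Actual counts will deviate slightly because the per-class split truncates to whole videos and `train_test_split` rounds as well.

```python
from fractions import Fraction

test_fraction = Fraction(1, 5)            # 20% of all videos held out for testing
validation_of_pool = Fraction(1, 5)       # 20% of the remaining training pool

train_pool = 1 - test_fraction            # 80% of all videos
validation_fraction = train_pool * validation_of_pool
training_fraction = train_pool - validation_fraction

print(training_fraction, validation_fraction, test_fraction)
```

In other words, roughly 64% of the videos are used for fitting, 16% for validation, and 20% for the final test.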


In [None]:
print(
    f"Training features shape:    {training_features.shape}\n"
    f"Validation features shape:  {validation_features.shape}\n"
    f"Training targets length:    {len(training_targets)}\n"
    f"Validation targets length:  {len(validation_targets)}"
)


In [None]:
print(
    f"Test features shape: {test_features.shape}\n"
    f"Test targets length: {len(test_targets)}"
)


# Prepare Train and Validation Data

In [None]:
# Cache before shuffling so each epoch is reshuffled instead of replaying one cached order
train_ds = (
    tf.data.Dataset.from_tensor_slices((training_features, training_targets))
    .cache()
    .shuffle(CFG.batch_size * 4)
    .batch(CFG.batch_size)
    .prefetch(tf.data.AUTOTUNE)
)

valid_ds = tf.data.Dataset.from_tensor_slices((validation_features,validation_targets)).batch(CFG.batch_size).cache().prefetch(tf.data.AUTOTUNE)

# Prepare Test Data

In [None]:
# Prepare test dataset
test_ds = tf.data.Dataset.from_tensor_slices((test_features, test_targets)).batch(CFG.batch_size).cache().prefetch(tf.data.AUTOTUNE)
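
Since `batch` is called without `drop_remainder`, the last batch of each dataset may be smaller than `CFG.batch_size`. The sketch below shows how many batches a dataset yields per epoch and how large the final one is. The helper names and the example count of 100 videos are hypothetical.

```python
import math


def num_batches(n_examples, batch_size=32):
    """Number of batches per epoch when the remainder is kept."""
    return math.ceil(n_examples / batch_size)


def last_batch_size(n_examples, batch_size=32):
    """Size of the final batch; equals batch_size when it divides evenly."""
    remainder = n_examples % batch_size
    return remainder if remainder else min(batch_size, n_examples)


print(num_batches(100), last_batch_size(100))   # hypothetical 100-video split
```

This matters mainly when interpreting per-step metrics or when a model requires a fixed batch dimension.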

# The Dataset Is Now Ready to Be Fed into a Deep Learning Model