### Quick Run Block

In [1]:
# import torch
# from pytorch_lightning.callbacks import ModelCheckpoint

# # Path to the problematic checkpoint
# checkpoint_path = r"c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\pre_trained_models\reid\dukemtmcreid_resnet50_256_128_epoch_120.ckpt"

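# # Allowlist ModelCheckpoint for unpickling. Since PyTorch 2.6, torch.load defaults to weights_only=True,
# # which rejects classes that are not allowlisted. The allowlist only takes effect when weights_only=True.
# torch.serialization.add_safe_globals([ModelCheckpoint])

# # Load the full checkpoint. weights_only=False unpickles arbitrary objects, so only use it on trusted files.
# checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)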

# # Print out keys to inspect conflicting ones
# print("Checkpoint keys:", checkpoint.keys())

# # Remove ModelCheckpoint states (if they exist)
# keys_to_remove = [key for key in checkpoint.keys() if "ModelCheckpoint" in key]
# for key in keys_to_remove:
#     del checkpoint[key]

# # Save the cleaned checkpoint
# fixed_checkpoint_path = checkpoint_path.replace(".ckpt", "_fixed.ckpt")
# torch.save(checkpoint, fixed_checkpoint_path)

# print(f"Fixed checkpoint saved to: {fixed_checkpoint_path}")
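
The key filter in the commented-out block above only inspects top-level checkpoint keys. A minimal sketch of the same idea that also looks one level down is shown here, assuming the callback states are nested under a `callbacks` entry (an assumption about the checkpoint layout, not something verified against this file). The helper name `strip_callback_states` and the toy dictionary are hypothetical and stand in for a real checkpoint, so the sketch runs without torch.

```python
from typing import Any, Dict


def strip_callback_states(checkpoint: Dict[str, Any], needle: str = "ModelCheckpoint") -> Dict[str, Any]:
    """Return a shallow copy of `checkpoint` with matching callback states removed."""
    cleaned = dict(checkpoint)

    # Top-level keys, as in the commented-out block above.
    for key in [k for k in cleaned if needle in str(k)]:
        del cleaned[key]

    # Assumption: callback states are nested under a "callbacks" dict.
    # str(k) is used because the keys may be strings or class objects.
    callbacks = cleaned.get("callbacks")
    if isinstance(callbacks, dict):
        cleaned["callbacks"] = {k: v for k, v in callbacks.items() if needle not in str(k)}

    return cleaned


# Toy stand-in for a loaded checkpoint (hypothetical contents).
toy = {
    "epoch": 120,
    "state_dict": {"w": [1.0]},
    "callbacks": {
        "ModelCheckpoint{'monitor': 'val'}": {"best": 0.1},
        "EarlyStopping": {"wait": 0},
    },
}
cleaned = strip_callback_states(toy)
print(sorted(cleaned["callbacks"]))  # ['EarlyStopping']
```

The helper returns a copy rather than mutating its input, so the original dictionary can still be inspected if the cleaned version turns out to be missing something.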

## Code

In [2]:
import sys
from pathlib import Path
import os

sys.path.append(str(Path.cwd().parent.parent))
print(str(Path.cwd().parent.parent))
print("Current working directory: ", os.getcwd())

from ModelDevelopment.CentralPipeline import CentralPipeline
from ModelDevelopment.ImageBatchPipeline import ImageBatchPipeline
from DataProcessing.DataPreProcessing import DataPaths

c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition
Current working directory:  c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\ModelDevelopment\experiments


## Multithreaded Studies Research Findings
- ThreadPoolExecutor works best for CentralPipeline pre-processing step (i.e. everything before run_pose)
- Time taken to process one batch of 32 tracklets: 251 seconds => (251*32)/3600 ≈ 2.23 hours ETA for 32 such batches @ 200 images per batch cap, num_threads = 6x3 = 18
- Note on the GPU_SEMAPHORE constant inside ImageFeatureTransformPipeline: it is always capped at 1, so only 1 batch of images is offloaded to the GPU at a time
- Because this gate is capped at 1, the image_batch_size param to CentralPipeline is the only control over how much data is offloaded to the GPU
- Since caching is implemented, you can try setting the gate to 2 for even faster results; if the process crashes, keep restarting with use_cache=True
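
The gating described above can be sketched as follows. This is an illustration under stated assumptions, not the code inside ImageFeatureTransformPipeline: `GPU_GATE`, `preprocess`, `gpu_step`, and `process` are hypothetical names, and the GPU work is simulated with `time.sleep`. Many threads run the pre-processing concurrently, but the semaphore lets only one batch into the gated section at a time.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the GPU_SEMAPHORE constant; 1 lets one batch through at a time.
GPU_GATE = threading.Semaphore(1)

_lock = threading.Lock()
_active = 0        # batches currently inside the gated section
peak_active = 0    # highest concurrency observed inside the gated section


def preprocess(batch_id: int) -> int:
    """CPU-side work that may run on many threads at once (placeholder)."""
    time.sleep(0.01)
    return batch_id


def gpu_step(batch_id: int) -> int:
    """Placeholder for the GPU offload; the semaphore serialises entry."""
    global _active, peak_active
    with GPU_GATE:
        with _lock:
            _active += 1
            peak_active = max(peak_active, _active)
        time.sleep(0.01)  # simulated GPU work
        with _lock:
            _active -= 1
    return batch_id * 2


def process(batch_id: int) -> int:
    return gpu_step(preprocess(batch_id))


# 18 threads mirrors num_workers * num_threads_multiplier = 6 * 3.
with ThreadPoolExecutor(max_workers=18) as pool:
    results = list(pool.map(process, range(8)))

print(results, "peak batches in gated section:", peak_active)
```

Raising the semaphore count to 2 in this sketch would allow two batches into the gated section at once, which is the trade-off the last bullet describes: more throughput, more memory pressure.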

In [3]:
# NOTE: tracklets_to_process_override takes a list of tracklet IDs to process, e.g. tracklets_to_process_override=["8", "9", "10"]
# NOTE: It is recommended that you delete the entire processed_data/{challenge/test/train} directory before running this, to avoid issues with stale data.
# Also restart your kernel before every new run, because path problems sometimes occur otherwise.
pipeline = CentralPipeline(
  tracklets_to_process_override=["34"],
  #num_tracklets=32,
  #num_images_per_tracklet=50,
  input_data_path=DataPaths.TEST_DATA_DIR.value,
  output_processed_data_path=DataPaths.PROCESSED_DATA_OUTPUT_DIR_TEST.value,
  common_processed_data_dir=DataPaths.COMMON_PROCESSED_OUTPUT_DATA_TEST.value,
  gt_data_path=DataPaths.TEST_DATA_GT.value,
  single_image_pipeline=False,
  display_transformed_image_sample=False, # NOTE: DO NOT USE. The code is parallelized, so images can no longer be displayed. If True, the first image shows and then the code breaks.
  num_image_samples=1,
  use_cache=True, # Set to False if you encounter data inconsistencies.
  suppress_logging=False,
  
  # --- PARALLELIZATION PARAMS --- These settings are optimal for an NVIDIA RTX 3070 Ti Laptop GPU.
  num_workers=6,             # CRITICAL optimisation param. Adjust accordingly.
  tracklet_batch_size=32,    # CRITICAL optimisation param. Adjust accordingly.
  image_batch_size=200,      # CRITICAL optimisation param. Adjust accordingly.
  num_threads_multiplier=3   # CRITICAL optimisation param. Adjust accordingly. Interpretation: num_threads = num_workers * num_threads_multiplier
  )

2025-03-23 17:20:25 [INFO] DataPreProcessing initialized. Universe of available data paths:
2025-03-23 17:20:25 [INFO] ROOT_DATA_DIR: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted
2025-03-23 17:20:25 [INFO] TEST_DATA_GT: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\test_gt.json
2025-03-23 17:20:25 [INFO] TRAIN_DATA_GT: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\train_gt.json
2025-03-23 17:20:25 [INFO] TEST_DATA_DIR: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\images
2025-03-23 17:20:25 [INFO] TRAIN_DATA_DIR: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\images
2025-03-23 17:20:25 [INFO] CHALLENGE_DATA_DIR: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\challenge\images
2
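
For tuning the parallelization params, a rough back-of-the-envelope helper may be useful. The sketch below assumes a constant time per tracklet batch (the 251 seconds measured in the findings above) and takes the thread count as num_workers * num_threads_multiplier, as the config comment states. The function names and the total of 1024 tracklets are hypothetical placeholders, not values from the dataset.

```python
import math


def thread_count(num_workers: int, num_threads_multiplier: int) -> int:
    """Threads created, per the interpretation given in the config comment."""
    return num_workers * num_threads_multiplier


def eta_hours(total_tracklets: int, tracklet_batch_size: int, seconds_per_batch: float) -> float:
    """Estimated wall-clock hours, assuming every batch takes the same time."""
    num_batches = math.ceil(total_tracklets / tracklet_batch_size)
    return num_batches * seconds_per_batch / 3600


print(thread_count(6, 3))                   # 18
print(round(eta_hours(1024, 32, 251), 2))   # 2.23 (1024 tracklets is a placeholder)
```

Real batches vary in size (tracklets have different image counts), so this estimate is best treated as a rough guide rather than a prediction.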

In [4]:
pipeline.run_soccernet(
  run_soccer_ball_filter=True,
  generate_features=True,
  run_filter=True,
  run_legible=True,
  run_legible_eval=True,
  run_pose=True,
  run_crops=True,
  run_str=True,
  run_combine=True,
  run_eval=True)

2025-03-23 17:20:25 [INFO] Running the SoccerNet pipeline.
2025-03-23 17:20:25 [INFO] Tracklet override applied. Using provided tracklets: 34


  0%|          | 0/1 [00:00<?, ?it/s]

2025-03-23 17:20:25 [INFO] Tracklet batch size: 32
2025-03-23 17:20:25 [INFO] Image batch size: 200
2025-03-23 17:20:25 [INFO] Number of workers: 6
2025-03-23 17:20:25 [INFO] Number of threads created: 18
2025-03-23 17:20:25 [INFO] Using double parallelization: multiprocessing + CUDA batch processing.


Processing tracklets (CUDA + CPU):   0%|          | 0/1 [00:00<?, ?it/s]

2025-03-23 17:20:37 [INFO] Creating placeholder data files for Soccer Ball Filter.
2025-03-23 17:20:37 [INFO] Determine soccer balls in image(s) using pre-trained model.


Processing Batch Tracklets (0-32):   0%|          | 0/1 [00:00<?, ?it/s]

2025-03-23 17:20:37 [INFO] Found 1 balls, Ball list: ['34']


c:\Users\colin\miniconda3\envs\UBC\Lib\site-packages\pytorch_lightning\utilities\migration\migration.py:208: You have multiple `ModelCheckpoint` callback states in this checkpoint, but we found state keys that would end up colliding with each other after an upgrade, which means we can't differentiate which of your checkpoint callbacks needs which states. At least one of your `ModelCheckpoint` callbacks will not be able to reload the state.
Lightning automatically upgraded your loaded checkpoint from v1.1.4 to v2.5.0.post0. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\pre_trained_models\reid\dukemtmcreid_resnet50_256_128_epoch_120.ckpt`


2025-03-23 17:20:41 [INFO] Saved features for tracklet with shape (734, 2048) to c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\test\34\features.npy
2025-03-23 17:20:41 [INFO] Identifying and removing outliers by calling gaussian_outliers.py on feature file
2025-03-23 17:20:49 [INFO] 
2025-03-23 17:20:49 [ERROR] 
2025-03-23 17:20:49 [INFO] Done removing outliers
2025-03-23 17:20:49 [INFO] Running model chain on preprocessed image(s).
2025-03-23 17:20:49 [INFO] Classifying legibility of image(s) using pre-trained model.




2025-03-23 17:21:08 [INFO] Tracklet 34 is illegible.
2025-03-23 17:21:08 [INFO] Saving legible_tracklets to: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\test\34\legible_results.json
2025-03-23 17:21:08 [INFO] Saved legible_tracklets to: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\test\34\legible_results.json
2025-03-23 17:21:08 [INFO] Saving illegible_tracklets to: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\test\34\illegible_results.json
2025-03-23 17:21:08 [INFO] Saved illegible_tracklets to: c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\test\34\illegible_results.json
2025-03-23 17:21:08 [INFO] Legibility classification complete.
2025-03-23 17:21:08 [INFO] Processed tracklet: 34
2025-03-23 17:21:08 [INFO] Evaluating legibility results on 1 tracklets
2025-03-23 17:21

Evaluating legibility:   0%|          | 0/1 [00:00<?, ?it/s]

2025-03-23 17:21:08 [INFO] Correct 0 out of 1. Accuracy 0.0%.
2025-03-23 17:21:08 [INFO] TP=0, TN=0, FP=0, FN=1
2025-03-23 17:21:08 [INFO] Precision=0, Recall=0.0
2025-03-23 17:21:08 [INFO] F1=0
2025-03-23 17:21:08 [INFO] Generating json for pose
2025-03-23 17:21:08 [INFO] Reading legible & illegible results from cache (both global files exist).
2025-03-23 17:21:08 [INFO] Done generating json for pose
2025-03-23 17:21:09 [INFO] Detecting pose
2025-03-23 17:21:32 [INFO] Current working directory:  c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\pose\ViTPose
apex is not installed
apex is not installed
apex is not installed
Current working directory:  c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition
False 
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
pose/ViTPose/configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/ViTPose_huge_coco_256x192.py
pose/ViTPose/checkpoints/vitpose-h.pth
load checkpoint from local path: pose/ViTPose/che

Generating crops:   0%|          | 0/3266 [00:00<?, ?it/s]

skipping c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\images\5\5_35.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\images\5\5_36.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\images\5\5_39.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\images\5\5_40.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\images\5\5_48.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\images\6\6_125.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\images\6\6_14.