## Colin's Pipeline Experiments
### Objectives:
- Create a SingleImagePipeline so we can run fast-iteration experiments on just a single image
- Create a CentralPipeline that calls SingleImagePipeline
- Modularize the soccer_net_pipeline and migrate all operations to SingleImagePipeline
- Run all systematic ops from CentralPipeline (more than one tracklet)

### Main benefit of creating an `ImageBatchPipeline` and a `CentralPipeline`
Decoupling large-scale systematic training from testing. The main pipeline trains the model and writes processed output images by systematically traversing every tracklet. I want to see what pre-processing is applied to an image so I understand exactly what is fed to the model. I also want to pass a single raw image to the model and get fast results. Right now everything is coupled to the full tracklet traversal; I don't want to traverse all tracklets, or even one whole tracklet, but only a single image from a single tracklet.
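
The decoupling described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `SingleImageSketch`, `CentralSketch`, `normalize`, and `flip_horizontal` are hypothetical names, and an "image" is assumed to be a plain nested list so the example runs without any dependencies.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical stand-ins: an image is a list of pixel rows, and a step is any
# function that maps one image to another.
Image = List[List[float]]
Step = Callable[[Image], Image]


@dataclass
class SingleImageSketch:
    """Runs the preprocessing steps on one image and keeps every intermediate result."""
    steps: Dict[str, Step]
    history: Dict[str, Image] = field(default_factory=dict)

    def run(self, image: Image) -> Image:
        self.history = {"raw": image}
        for name, step in self.steps.items():
            image = step(image)
            self.history[name] = image  # lets us inspect what would reach the model
        return image


@dataclass
class CentralSketch:
    """Traverses tracklets and delegates all per-image work to SingleImageSketch."""
    steps: Dict[str, Step]

    def run(self, tracklets: Dict[str, List[Image]]) -> Dict[str, List[Image]]:
        return {
            tracklet_id: [SingleImageSketch(self.steps).run(img) for img in images]
            for tracklet_id, images in tracklets.items()
        }


def normalize(image: Image) -> Image:
    """Scale 0-255 pixel values into the 0-1 range."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def flip_horizontal(image: Image) -> Image:
    """Mirror each pixel row."""
    return [list(reversed(row)) for row in image]


steps = {"normalize": normalize, "flip": flip_horizontal}

# Fast iteration: one raw image, with every intermediate stage available.
single = SingleImageSketch(steps)
single_output = single.run([[0.0, 255.0]])
stage_names = list(single.history)

# Systematic run: the same steps applied across all tracklets.
batch_output = CentralSketch(steps).run(
    {"0": [[[0.0, 255.0]]], "1": [[[255.0, 0.0]]]}
)
print(stage_names, single_output, batch_output)
```

Because the per-image logic lives in one place, the single-image path and the all-tracklets path cannot drift apart, and the `history` dictionary gives a direct view of each pre-processing stage.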

### Quick Run Block

## Code

In [1]:
import sys
from pathlib import Path
import os

sys.path.append(str(Path.cwd().parent.parent))
print(str(Path.cwd().parent.parent))
print("Current working directory: ", os.getcwd())

from ModelDevelopment.CentralPipeline import CentralPipeline
from ModelDevelopment.ImageBatchPipeline import ImageBatchPipeline
from DataProcessing.DataPreProcessing import DataPaths

c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition
Current working directory:  c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\ModelDevelopment\experiments


In [2]:
# TODO: Add tracklets_override to pass a list of tracklets to process; tracklets_override=["8", "9", "10"]
# NOTE: Delete the entire processed_data/{challenge/test/train} directory before running this to avoid issues with old data.
# Also restart your kernel before every new run, because path problems sometimes occur otherwise.
pipeline = CentralPipeline(
  num_tracklets=2,
  #num_images_per_tracklet=1,
  input_data_path=DataPaths.TRAIN_DATA_DIR.value,
  output_processed_data_path=DataPaths.PROCESSED_DATA_OUTPUT_DIR_TRAIN.value,
  common_processed_data_dir=DataPaths.COMMON_PROCESSED_OUTPUT_DATA_TRAIN.value,
  gt_data_path=DataPaths.TRAIN_DATA_GT.value,
  single_image_pipeline=False,
  display_transformed_image_sample=False, # NOTE: DO NOT USE. Code is parallelized so we cannot show images anymore. Code breaks, but first one will show if True.
  num_image_samples=1,
  use_cache=False, # Better set to false, not stable
  suppress_logging=False,
  num_workers=12
  )

2025-03-21 22:56:11 [INFO] DataPreProcessing initialized. Universe of available data paths:
2025-03-21 22:56:11 [INFO] ROOT_DATA_DIR: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted
2025-03-21 22:56:11 [INFO] TEST_DATA_GT: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\test_gt.json
2025-03-21 22:56:11 [INFO] TRAIN_DATA_GT: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\train_gt.json
2025-03-21 22:56:11 [INFO] TEST_DATA_DIR: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\test\images
2025-03-21 22:56:11 [INFO] TRAIN_DATA_DIR: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\images
2025-03-21 22:56:11 [INFO] CHALLENGE_DATA_DIR: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extr
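
The NOTE in the cell above recommends deleting the processed data directory before each run. A minimal sketch of that cleanup step is shown here, assuming the output location is an ordinary directory; `reset_processed_dir` is a hypothetical helper, not part of the project, and the demo runs against a temporary directory so nothing real is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def reset_processed_dir(processed_dir: Path) -> Path:
    """Delete processed_dir if it exists, then recreate it empty."""
    if processed_dir.exists():
        shutil.rmtree(processed_dir)
    processed_dir.mkdir(parents=True)
    return processed_dir


# Demo on a throwaway directory that mimics processed_data/train/0/features.npy.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "processed_data" / "train"
    (demo_dir / "0").mkdir(parents=True)
    (demo_dir / "0" / "features.npy").write_bytes(b"stale")

    reset_processed_dir(demo_dir)

    recreated = demo_dir.is_dir()
    leftover = list(demo_dir.iterdir())
    print(recreated, leftover)
```

In this notebook the call would presumably be `reset_processed_dir(Path(DataPaths.PROCESSED_DATA_OUTPUT_DIR_TRAIN.value))`, on the assumption that this enum value holds the path of the train output directory. Run it before constructing `CentralPipeline`, and double-check the path first, since `shutil.rmtree` deletes recursively.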

In [3]:
pipeline.run_soccernet(
  run_soccer_ball_filter=True,
  generate_features=True,
  run_filter=True,
  run_legible=True,
  run_legible_eval=True,
  run_pose=True,
  run_crops=True,
  run_str=True,
  run_combine=True,
  run_eval=True)

2025-03-21 22:56:12 [INFO] Running the SoccerNet pipeline.
2025-03-21 22:56:12 [INFO] Using double parallelization: multiprocessing + CUDA batch processing.


Processing tracklets (CUDA + CPU):   0%|          | 0/2 [00:00<?, ?it/s]

2025-03-21 22:56:21 [INFO] DEBUG Number of images per tracklet (should be < max (1400+)): 578
2025-03-21 22:56:21 [INFO] Creating placeholder data files for Soccer Ball Filter.
2025-03-21 22:56:21 [INFO] Creating placeholder data files for Legibility Classifier.
2025-03-21 22:56:21 [INFO] Removed cached tracklet feature file (use_cache: False): c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\0\features.npy
2025-03-21 22:56:21 [INFO] Removed cached tracklet feature file (use_cache: False): c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\1\features.npy


Phase 1: Data Pre-Processing Pipeline Progress:   0%|          | 0/2 [00:00<?, ?it/s]

2025-03-21 22:56:21 [INFO] Determine soccer balls in image(s) using pre-trained model.
2025-03-21 22:56:21 [INFO] Determine soccer balls in image(s) using pre-trained model.
2025-03-21 22:56:21 [INFO] Found 0 balls, Ball list: []
2025-03-21 22:56:21 [INFO] Found 0 balls, Ball list: []


c:\Users\colin\miniconda3\envs\UBC\Lib\site-packages\pytorch_lightning\utilities\migration\migration.py:208: You have multiple `ModelCheckpoint` callback states in this checkpoint, but we found state keys that would end up colliding with each other after an upgrade, which means we can't differentiate which of your checkpoint callbacks needs which states. At least one of your `ModelCheckpoint` callbacks will not be able to reload the state.
Lightning automatically upgraded your loaded checkpoint from v1.1.4 to v2.5.0.post0. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\pre_trained_models\reid\dukemtmcreid_resnet50_256_128_epoch_120.ckpt`
Lightning automatically upgraded your loaded checkpoint from v1.1.4 to v2.5.0.post0. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint c:\Users\colin\OneDrive\Deskt

using GPU
using GPU
2025-03-21 22:57:01 [INFO] Saved features for tracklet with shape (578, 2048) to c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\0\features.npy
2025-03-21 22:57:01 [INFO] Identifying and removing outliers by calling gaussian_outliers_streamlined.py on feature file
2025-03-21 22:57:02 [INFO] Saved features for tracklet with shape (750, 2048) to c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\1\features.npy
2025-03-21 22:57:02 [INFO] Identifying and removing outliers by calling gaussian_outliers_streamlined.py on feature file
2025-03-21 22:57:08 [INFO] 
2025-03-21 22:57:08 [INFO] Done removing outliers
2025-03-21 22:57:08 [INFO] Running model chain on preprocessed image(s).
2025-03-21 22:57:08 [INFO] Classifying legibility of image(s) using pre-trained model.
2025-03-21 22:57:08 [INFO] Path checked: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Numb



2025-03-21 22:57:09 [INFO] 
2025-03-21 22:57:09 [INFO] Done removing outliers
2025-03-21 22:57:09 [INFO] Running model chain on preprocessed image(s).
2025-03-21 22:57:09 [INFO] Classifying legibility of image(s) using pre-trained model.
2025-03-21 22:57:09 [INFO] Path checked: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\common_data\soccer_ball.json
2025-03-21 22:57:27 [INFO] Saving legible_tracklets to: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\0\legible_results.json
2025-03-21 22:57:27 [INFO] Saved legible_tracklets to: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\0\legible_results.json
2025-03-21 22:57:27 [INFO] Saving illegible_tracklets to: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\0\illegible_results.json
2025-03-21 2

Phase 1: Data Pre-Processing Pipeline Progress:  50%|█████     | 1/2 [01:06<01:06, 66.41s/it]

2025-03-21 22:57:28 [INFO] Saving legible_tracklets to: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\1\legible_results.json
2025-03-21 22:57:28 [INFO] Saved legible_tracklets to: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\1\legible_results.json
2025-03-21 22:57:28 [INFO] Saving illegible_tracklets to: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\1\illegible_results.json
2025-03-21 22:57:28 [INFO] Saved illegible_tracklets to: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\1\illegible_results.json
2025-03-21 22:57:28 [INFO] Legibility classification complete.
2025-03-21 22:57:28 [INFO] Processed tracklet: 1


Phase 1: Data Pre-Processing Pipeline Progress: 100%|██████████| 2/2 [01:07<00:00, 33.68s/it]

2025-03-21 22:57:28 [INFO] Evaluating legibility results on 2 tracklets



Evaluating legibility: 100%|██████████| 2/2 [00:00<00:00, 9765.55it/s]

2025-03-21 22:57:28 [INFO] Correct 2 out of 2. Accuracy 100.0%.
2025-03-21 22:57:28 [INFO] TP=2, TN=0, FP=0, FN=0
2025-03-21 22:57:28 [INFO] Precision=1.0, Recall=1.0
2025-03-21 22:57:28 [INFO] F1=1.0
2025-03-21 22:57:28 [INFO] Generating json for pose
2025-03-21 22:57:28 [INFO] Aggregating legible & illegible results (cache not used or only one file is missing).
2025-03-21 22:57:28 [INFO] Saved global legible results to: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\common_data\legible_results.json
2025-03-21 22:57:28 [INFO] Saved global illegible results to: c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\processed_data\train\common_data\illegible_results.json
2025-03-21 22:57:28 [INFO] Done generating json for pose
2025-03-21 22:57:28 [INFO] Detecting pose





2025-03-21 22:58:33 [INFO] Current working directory:  c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\ModelDevelopment
apex is not installed
apex is not installed
apex is not installed
Show: False Out img root: 
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Total images: 313, Already processed: 0, Remaining: 313
Current working directory:  c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition
apex is not installed
apex is not installed
apex is not installed
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Current working directory:  c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition
apex is not installed
apex is not installed
apex is not installed
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
load checkpoint from local path: pose/ViTPose/checkpoints/vitpose-h.pth
load checkpoint from local path: pose/ViTPose/checkpoints/vitpose-h.pth
Saved combin

Generating crops: 100%|██████████| 313/313 [00:00<00:00, 6213.99it/s]

skipping c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\images\0\0_579.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\images\0\0_580.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\images\0\0_582.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\images\0\0_581.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\images\0\0_583.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jersey-2023\extracted\train\images\0\0_584.jpg, unreliable points
skipping c:\Users\colin\OneDrive\Desktop\UBC\Jersey-Number-Recognition\data\SoccerNet\jer




2025-03-21 22:58:50 [INFO] Additional keyword arguments: {'charset_test': '0123456789'}


  return torch.load(f, map_location=map_location)  # type: ignore[arg-type]


STR Inference:   0%|          | 1/281 [00:00<02:02,  2.28it/s]
STR Inference:   3%|▎         | 9/281 [00:00<00:15, 17.61it/s]
STR Inference:   6%|▌         | 17/281 [00:00<00:09, 26.64it/s]
STR Inference:   9%|▉         | 25/281 [00:00<00:07, 32.23it/s]
STR Inference:  12%|█▏        | 33/281 [00:01<00:06, 35.61it/s]
STR Inference:  15%|█▍        | 41/281 [00:01<00:06, 37.34it/s]
STR Inference:  17%|█▋        | 49/281 [00:01<00:06, 38.50it/s]
STR Inference:  20%|██        | 57/281 [00:01<00:05, 39.49it/s]
STR Inference:  23%|██▎       | 65/281 [00:01<00:05, 40.02it/s]
STR Inference:  26%|██▌       | 73/281 [00:02<00:05, 40.30it/s]
STR Inference:  29%|██▉       | 81/281 [00:02<00:04, 41.55it/s]
STR Inference:  32%|███▏      | 89/281 [00:02<00:04, 42.86it/s]
STR Inference:  35%|███▍      | 97/281 [00:02<00:04, 44.31it/s]
ST