# Image Preprocessing Usage Example

In [1]:
%load_ext autoreload
%autoreload 2

## Approach 1: Extract Features for All Images at Once (Non-Batch Approach)
__When to use:__
* For small to medium datasets whose features fit in memory and can be processed in a single pass.
* For simple use cases that do not need to save intermediate progress.

In [2]:
from vtt.data.image_preprocessing import (
    extract_features_from_directory,
    save_features,
    load_features,
)


In [3]:
# Path to the image directory
image_dir = "../../data/flickr8k_images/subset/"  # Subset of 100 images

# Path to the output file
output_file = "../../data/processed/flickr8k_features_nonbatch.npz"

# Extract features for all images in the directory
features = extract_features_from_directory(image_dir)

# Save the full dictionary of features to disk
save_features(features, output_file)

[INFO] Found 100 image(s) in '../../data/flickr8k_images/subset/'.


2025-07-14 18:26:55.089337: E external/local_xla/xla/stream_executor/cuda/cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: UNKNOWN ERROR (303)
Extracting image features: 100%|██████████████████████████████████████████████████████| 100/100 [00:15<00:00,  6.67it/s]


In [4]:
# Load the feature dictionary
features = load_features(output_file)

# Print the number of entries and preview the first 5
print(f"Total images in feature file: {len(features)}\n")

print("Previewing first 5 image feature entries:\n")
for i, (img_name, feature_vector) in enumerate(features.items()):
    print(f"{i+1}. {img_name} -> shape: {feature_vector.shape}")
    print(feature_vector[:5], "...")  # Show first 5 elements for brevity
    print()
    if i == 4:
        break

Total images in feature file: 100

Previewing first 5 image feature entries:

1. 101654506_8eb26cfb60.jpg -> shape: (2048,)
[0.0473685  0.         0.34836045 0.         1.9708986 ] ...

2. 101669240_b2d3e7f17b.jpg -> shape: (2048,)
[0.00277199 0.01376122 0.00498268 0.05396137 0.19276744] ...

3. 102351840_323e3de834.jpg -> shape: (2048,)
[0.00125399 0.03713456 0.07491528 0.05316498 0.70476127] ...

4. 102455176_5f8ead62d5.jpg -> shape: (2048,)
[0.13471708 0.24836668 0.27189383 0.04021228 0.9100197 ] ...

5. 103106960_e8a41d64f8.jpg -> shape: (2048,)
[0.00936221 0.10794617 0.33646026 0.24238296 0.4932172 ] ...
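
The `.npz` extension suggests the features are stored as a NumPy archive with one array per image name. The following is a minimal sketch of that round trip using plain NumPy, assuming this layout; it is not the actual implementation of `save_features` or `load_features`, and the image names and file path are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the real feature dictionary: image name -> 2048-d vector.
demo_features = {
    "img_a.jpg": np.zeros(2048, dtype=np.float32),
    "img_b.jpg": np.ones(2048, dtype=np.float32),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_features.npz")

    # np.savez stores each keyword argument as a named array in the archive.
    np.savez(demo_path, **demo_features)

    # np.load returns a lazily read NpzFile, so copy the arrays out
    # into a regular dict before the file is closed.
    with np.load(demo_path) as archive:
        restored = {name: archive[name] for name in archive.files}

print(sorted(restored))
print(restored["img_a.jpg"].shape)
```

Keying the archive by file name keeps each feature vector tied to its image, which is what the preview loop above relies on.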



## Approach 2: Extract Features in Batches (Batch Approach)
__When to use:__
* For large datasets whose features could exceed memory limits if extracted in a single pass.
* When you want to keep partial results on disk so that progress can be recovered after a crash or resumed later.

In [5]:
from vtt.data.image_preprocessing import (
    extract_features_in_batches,
    combine_feature_batches,
    save_features,
    load_features,
)

In [6]:
# Path to the image directory
image_dir = "../../data/flickr8k_images/subset/"  # Subset of 100 images

# Output directory to store partial .npz files
output_dir = "../../data/processed/batches"

# Set batch size (i.e., number of images to process at once)
batch_size = 30

# Process images in batches and save features as separate .npz files
extract_features_in_batches(image_dir, batch_size, output_dir)

[INFO] Processing batch 1 of 4 (30 images)
[INFO] Saved batch to: ../../data/processed/batches/features_batch_000.npz
[INFO] Processing batch 2 of 4 (30 images)
[INFO] Saved batch to: ../../data/processed/batches/features_batch_001.npz
[INFO] Processing batch 3 of 4 (30 images)
[INFO] Saved batch to: ../../data/processed/batches/features_batch_002.npz
[INFO] Processing batch 4 of 4 (10 images)
[INFO] Saved batch to: ../../data/processed/batches/features_batch_003.npz
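
With 100 images and a batch size of 30, the log shows four batches, the last holding the remaining 10 images. The arithmetic can be sketched as below; `split_into_batches` and the generated file names are hypothetical and only illustrate the chunking, not how `extract_features_in_batches` is implemented.

```python
import math


def split_into_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical file names standing in for the 100-image subset.
image_names = [f"image_{i:03d}.jpg" for i in range(100)]
batches = list(split_into_batches(image_names, batch_size=30))

print(math.ceil(len(image_names) / 30))   # 4
print([len(batch) for batch in batches])  # [30, 30, 30, 10]
```

A smaller batch size lowers peak memory use and shortens the work lost on a crash, at the cost of more files to combine afterwards.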





In [7]:
# Output path for the combined feature file
combined_file = "../../data/processed/flickr8k_features_batch.npz"

# Merge the per-batch .npz files in output_dir into a single feature file
combine_feature_batches(output_dir, combined_file)

Combining feature batches: 100%|██████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 25.49it/s]


Combined 100 features into '../../data/processed/flickr8k_features_batch.npz'.
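
Conceptually, combining batches means reading each partial archive and merging its entries into one dictionary. The sketch below shows that idea with plain NumPy, assuming each batch file is a `.npz` archive keyed by image name; `merge_npz_batches` is a hypothetical helper, not the library's `combine_feature_batches`, and the demo vectors are 4-dimensional to keep it small.

```python
import glob
import os
import tempfile

import numpy as np


def merge_npz_batches(batch_dir, pattern="features_batch_*.npz"):
    """Merge every matching .npz archive in `batch_dir` into one dictionary."""
    merged = {}
    # Sort the paths so batches are merged in a deterministic order (000, 001, ...).
    for path in sorted(glob.glob(os.path.join(batch_dir, pattern))):
        with np.load(path) as archive:
            for name in archive.files:
                merged[name] = archive[name]
    return merged


with tempfile.TemporaryDirectory() as tmp_dir:
    # Write two tiny hypothetical batches.
    np.savez(os.path.join(tmp_dir, "features_batch_000.npz"),
             **{"a.jpg": np.zeros(4), "b.jpg": np.ones(4)})
    np.savez(os.path.join(tmp_dir, "features_batch_001.npz"),
             **{"c.jpg": np.full(4, 2.0)})
    merged_demo = merge_npz_batches(tmp_dir)

print(len(merged_demo))
print(sorted(merged_demo))
```

If the same image name appeared in two batches, this sketch would keep the entry from the later file; whether the real function behaves the same way is not stated here.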


In [8]:
# Load the feature dictionary
features = load_features(combined_file)

# Print the number of entries and preview the first 5
print(f"Total images in feature file: {len(features)}\n")

print("Previewing first 5 image feature entries:\n")
for i, (img_name, feature_vector) in enumerate(features.items()):
    print(f"{i+1}. {img_name} -> shape: {feature_vector.shape}")
    print(feature_vector[:5], "...")  # Show first 5 elements for brevity
    print()
    if i == 4:
        break

Total images in feature file: 100

Previewing first 5 image feature entries:

1. 101654506_8eb26cfb60.jpg -> shape: (2048,)
[0.0473685  0.         0.34836045 0.         1.9708986 ] ...

2. 101669240_b2d3e7f17b.jpg -> shape: (2048,)
[0.00277199 0.01376122 0.00498268 0.05396137 0.19276744] ...

3. 102351840_323e3de834.jpg -> shape: (2048,)
[0.00125399 0.03713456 0.07491528 0.05316498 0.70476127] ...

4. 102455176_5f8ead62d5.jpg -> shape: (2048,)
[0.13471708 0.24836668 0.27189383 0.04021228 0.9100197 ] ...

5. 103106960_e8a41d64f8.jpg -> shape: (2048,)
[0.00936221 0.10794617 0.33646026 0.24238296 0.4932172 ] ...
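
The first five entries printed here match those from Approach 1. To check every entry rather than a preview, the two loaded dictionaries could be compared key by key. The following is a minimal sketch with a hypothetical helper, `feature_dicts_match`, and small made-up vectors; the tolerance value is an assumption.

```python
import numpy as np


def feature_dicts_match(first, second, atol=1e-6):
    """Return True if both dicts have the same keys and near-equal vectors."""
    if first.keys() != second.keys():
        return False
    return all(np.allclose(first[name], second[name], atol=atol) for name in first)


# Made-up two-element vectors; the real features have shape (2048,).
demo_a = {"a.jpg": np.array([0.1, 0.2]), "b.jpg": np.array([0.3, 0.4])}
demo_b = {"b.jpg": np.array([0.3, 0.4]), "a.jpg": np.array([0.1, 0.2])}
demo_c = {"a.jpg": np.array([0.1, 0.2]), "b.jpg": np.array([0.3, 0.5])}

print(feature_dicts_match(demo_a, demo_b))  # True: key order does not matter
print(feature_dicts_match(demo_a, demo_c))  # False: one vector differs
```

In this notebook the same check would apply to the dictionaries loaded from `output_file` and `combined_file`.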

