One of the images in `image_paths` is truncated, which raises an error in PyTorch's dataloaders. One easy fix is to set `ImageFile.LOAD_TRUNCATED_IMAGES = True`. However, in this setup that only takes effect when `num_workers = 0`, which can lead to low GPU usage during training because the CPU can't keep up with the GPU.

The code below tries to open every image with `ImageFile.LOAD_TRUNCATED_IMAGES = False` and save it to a temporary file. When this fails for a truncated image, we set `ImageFile.LOAD_TRUNCATED_IMAGES = True` so that Pillow loads the partial image (the missing pixels show up as a small black border) and resave the image to its original path. After that we can set `num_workers > 0` in our dataloaders and get much better GPU utilization.

In [9]:
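import glob
from PIL import Image
from PIL import ImageFile

ImageFile.LOAD_TRUNCATED_IMAGES = False

# image_paths = glob.glob('dogImages/*/*/*.jpg')
image_paths = glob.glob('dogImages/train/098.Leonberger/*.jpg')

for path in image_paths:
    try:
        # Saving forces Pillow to decode the whole file, which raises
        # OSError for a truncated image while the flag is False.
        with Image.open(path) as im:
            im.save("temp.jpg")
    except OSError:
        print("Corrupt image:", path)
        # Reopen with truncated loading allowed and overwrite the original.
        ImageFile.LOAD_TRUNCATED_IMAGES = True
        try:
            with Image.open(path) as im:
                repaired = im.copy()  # copy() decodes the partial image
            repaired.save(path)
        finally:
            # Always restore strict loading, even if the resave fails.
            ImageFile.LOAD_TRUNCATED_IMAGES = False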
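
Decoding every file is slow if you scan the whole `dogImages` tree rather than one folder. A cheaper first pass, sketched below, is a different technique from the Pillow check above: it only looks at the last two bytes of each file, relying on the fact that a complete JPEG stream ends with the End Of Image marker `FF D9`. The helper names `ends_with_eoi` and `find_suspect_jpegs` are hypothetical, and this is only a heuristic. Some valid JPEGs carry trailing bytes after the marker, and a truncated file could end in `FF D9` by chance, so anything it flags should still be confirmed with the Pillow check.

```python
import os

JPEG_EOI = b"\xff\xd9"  # End Of Image marker that closes a complete JPEG stream


def ends_with_eoi(path):
    """Return True if the file's last two bytes are the JPEG EOI marker."""
    if os.path.getsize(path) < 2:
        return False
    with open(path, "rb") as f:
        f.seek(-2, os.SEEK_END)  # jump to the final two bytes
        return f.read(2) == JPEG_EOI


def find_suspect_jpegs(paths):
    """Return the paths that do not end with EOI (hypothetical helper)."""
    return [p for p in paths if not ends_with_eoi(p)]
```

For example, `find_suspect_jpegs(glob.glob('dogImages/*/*/*.jpg'))` would return a short list of candidates to pass to the repair loop, instead of decoding every image.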
    