Description
Loading the CelebA dataset using torchvision.datasets.CelebA uses up all the memory of a Google Colab runtime, causing a crash.
I am loading the CelebA dataset in Google Colab. During loading, memory consumption rises until it reaches the maximum allocated to the runtime, i.e. 12 GB, and the Colab runtime then crashes.
To Reproduce
from torchvision import datasets, transforms

# Root directory for the dataset
data_root = 'data/celeba'
# Spatial size of training images; images are resized to this size.
image_size = 64
# Download CelebA (if needed) and attach the preprocessing pipeline
celeba_data = datasets.CelebA(data_root,
                              download=True,
                              transform=transforms.Compose([
                                  transforms.Resize(image_size),
                                  transforms.CenterCrop(image_size),
                                  transforms.ToTensor(),
                                  transforms.Normalize(mean=[0.5, 0.5, 0.5],
                                                       std=[0.5, 0.5, 0.5])
                              ]))
The notebook can be found here.
Expected behavior
I expected the script above to load the dataset and apply the transformations without requiring an enormous amount of memory.
Environment
- PyTorch version: 1.7.1+cu101
- Is debug build: False
- CUDA used to build PyTorch: 10.1
- ROCM used to build PyTorch: N/A
- OS: Ubuntu 18.04.5 LTS (x86_64)
- GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
- Clang version: 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
- CMake version: version 3.12.0
- Python version: 3.6 (64-bit runtime)
- Is CUDA available: True
- CUDA runtime version: 10.1.243
- GPU models and configuration: GPU 0: Tesla T4
- Nvidia driver version: 418.67
- cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
- HIP runtime version: N/A
- MIOpen runtime version: N/A
Versions of relevant libraries:
- [pip3] numpy==1.19.4
- [pip3] torch==1.7.1+cu101
- [pip3] torchaudio==0.7.2
- [pip3] torchsummary==1.5.1
- [pip3] torchtext==0.3.1
- [pip3] torchvision==0.8.2+cu101
- [conda] Could not collect
Additional context
I have also tried separating the download step from the data-loading step to see whether that solves the memory problem, i.e.:
# Download the data first
datasets.CelebA(data_root, download=True)
celeba_dataset = datasets.CelebA(data_root, ...)
The result is the same: the runtime crashes.
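To narrow down which of the two calls above is responsible for the memory growth, each one could be wrapped in a small measurement helper. The following is only a sketch using the standard-library tracemalloc module; the helper name measure_peak_memory is hypothetical and not part of torchvision, and tracemalloc only sees allocations made through Python's memory allocators, so memory allocated directly by native libraries may not show up.

```python
import tracemalloc


def measure_peak_memory(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, peak_bytes).

    peak_bytes is the peak size of Python-level allocations traced
    while fn was running. Hypothetical helper for illustration only.
    """
    tracemalloc.start()
    try:
        result = fn(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    # Stand-in workload. In the notebook this could instead be, e.g.:
    #   lambda: datasets.CelebA(data_root, download=False)
    data, peak = measure_peak_memory(lambda: [0] * 1_000_000)
    print(f"peak traced memory: {peak / 2**20:.1f} MiB")
```

Running the helper once around the download call and once around the plain constructor call would show whether the growth happens during extraction or while the dataset object is being built. For the process-wide peak on Linux, resource.getrusage(resource.RUSAGE_SELF).ru_maxrss (reported in kilobytes) is a complementary check.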
cc @pmeier