Training Error #7

jainnipun11 · 2022-06-09T07:53:02Z

Hey! I unzipped the images into the suggested path, but I still keep getting this error:

FileNotFoundError: [Errno 2] No such file or directory: '/home/data_storage/mimic-cxr/dataset/image_preprocessing/re_512_3ch/Train/s50328096.jpg'

Can you explain why this error occurs?

Thanks.

ttumyche · 2022-06-09T09:09:59Z

Hi, jainnipun

I guess the error occurred in this line.
How did you define the suggested path? Alternatively, you can just change 'fixed_path' to your own path.
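
Before changing 'fixed_path', it can help to confirm which files are actually missing from the directory. The snippet below is only a minimal sketch: the helper name `find_missing_images` is hypothetical and not part of this repository, and the directory and filename are simply the ones from the error message above.

```python
import os


def find_missing_images(image_dir, filenames):
    """Return the filenames that do not exist as regular files in image_dir."""
    return [
        name
        for name in filenames
        if not os.path.isfile(os.path.join(image_dir, name))
    ]


if __name__ == "__main__":
    # Assumed values taken from the traceback; replace them with your own path.
    fixed_path = "/home/data_storage/mimic-cxr/dataset/image_preprocessing/re_512_3ch/Train"
    missing = find_missing_images(fixed_path, ["s50328096.jpg"])
    print(f"{len(missing)} missing file(s): {missing}")
```

If the list is non-empty, either the images were unzipped somewhere else or 'fixed_path' still points at the default location.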

jainnipun11 · 2022-06-10T00:42:30Z

Thanks! I resolved the "fixed_path" issue, but now I'm getting a CUDA out of memory error:

CUDA out of memory. Tried to allocate 314.00 MiB (GPU 0; 15.78 GiB total capacity; 13.97 GiB already allocated; 240.75 MiB free; 14.33 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

What should I do? Should I decrease the batch size?

ttumyche · 2022-06-10T04:00:21Z

Yup, reduce the batch size to fit your GPU VRAM first.
If that does not solve the error, let me know.
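
One way to find a batch size that fits is to halve it until a single training step runs without an out-of-memory error. The following is a rough sketch, not code from this repository: `try_step` is a hypothetical callable that runs one forward/backward pass at the given batch size, and the check relies on the assumption that CUDA OOM errors surface as a `RuntimeError` whose message contains "out of memory", as in the message quoted above.

```python
def is_oom(err):
    """Heuristic check: a RuntimeError whose message mentions 'out of memory'."""
    return isinstance(err, RuntimeError) and "out of memory" in str(err).lower()


def find_fitting_batch_size(try_step, start, min_size=1):
    """Halve the batch size until try_step(batch_size) runs without an OOM error.

    try_step is a hypothetical callable supplied by the caller; any error
    that is not an OOM error is re-raised unchanged.
    """
    batch_size = start
    while batch_size >= min_size:
        try:
            try_step(batch_size)
            return batch_size
        except RuntimeError as err:
            if not is_oom(err):
                raise
            batch_size //= 2  # back off and try again with half the batch
    raise RuntimeError(f"no batch size >= {min_size} fits in memory")
```

With a real PyTorch step you would probably also want to call `torch.cuda.empty_cache()` between attempts so that memory cached by the failed step is released.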

jainnipun11 · 2022-06-10T06:01:12Z

Yes! Training started after I reduced the "batch_size" from 36 to 15. Thanks.
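
A smaller batch also changes the effective batch size per optimizer update. If that matters for your run, gradient accumulation is a common workaround, although it was not used or discussed in this thread. The helper below is a hypothetical sketch that only does the arithmetic: it picks the largest micro-batch that fits and divides the original batch size evenly.

```python
def accumulation_plan(target_batch, max_fit):
    """Return (micro_batch, steps) such that micro_batch * steps == target_batch
    and micro_batch <= max_fit, choosing the largest such micro_batch."""
    if target_batch < 1 or max_fit < 1:
        raise ValueError("target_batch and max_fit must be positive")
    for micro in range(min(target_batch, max_fit), 0, -1):
        if target_batch % micro == 0:
            return micro, target_batch // micro


# Original batch size 36, largest size known to fit is 15.
micro_batch, steps = accumulation_plan(36, 15)
print(micro_batch, steps)  # prints "12 3"
```

Here a micro-batch of 12 accumulated over 3 steps matches the original batch size of 36. Whether the training script can accumulate gradients this way depends on how its training loop is written.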

jainnipun11 closed this as completed Jun 10, 2022
