
CUDA failed with error out of memory #72

Closed

zorbaTheRainy opened this issue Mar 25, 2024 · 6 comments

Comments

@zorbaTheRainy

I get the error:

INFO:root:Error processing or transcribing /movies/The Hollywood Revue of 1929 (1929)/The.Hollywood.Revue.of.1929.1929.DVDRip.XviD-BBM(iLC).avi: CUDA failed with error out of memory
My compose file includes:
- "TRANSCRIBE_DEVICE=gpu"
I request that, if transcription fails with this error, it fall back to using the CPU.
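
Something like this is what I have in mind (a minimal sketch only, not subgen's actual code: it assumes models are loaded through faster-whisper's WhisperModel, which the log further down confirms, and the function name and compute types are just illustrative):

# Hypothetical sketch: try CUDA first, retry on the CPU if CTranslate2
# reports an out-of-memory error while loading the model.
from faster_whisper import WhisperModel

def load_model_with_fallback(size: str = "medium") -> WhisperModel:
    try:
        # float16 is the usual GPU compute type for faster-whisper
        return WhisperModel(size, device="cuda", compute_type="float16")
    except RuntimeError as err:
        # CTranslate2 raises RuntimeError with a message like
        # "CUDA failed with error out of memory"
        if "out of memory" not in str(err).lower():
            raise
        # int8 keeps the CPU memory footprint modest, at some speed cost
        return WhisperModel(size, device="cpu", compute_type="int8")

This only covers a load-time failure; an OOM that happens mid-transcription would need the same handling wrapped around model.transcribe() as well.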

Otherwise, thanks so much for this program. I started to try Whisper on my own and was not happy with the results. Thank you so much for doing all the hard work for me.

@McCloudS
Owner

McCloudS commented Mar 25, 2024 via email

@zorbaTheRainy
Author

zorbaTheRainy commented Mar 25, 2024

Oh yes, the GPU works fine for other movies. This one is just trouble.

EDIT: Actually, another movie just failed (an H.265 MKV), so I assume the problem is the lack of VRAM on my old GPU. The other movies were just simpler (480p AVIs).

It is only a GTX 1060 3 GB, so it isn't the best GPU, but for other movies it works fine.

It would be nice if you (1) made an ARM64 (CPU) image, and (2) made a GPU image for AMD / Intel N100 devices. But I am guessing the "fall back to CPU on failure" would be easier.

@hnorgaar

I have the same card as you and have never had any failures with the medium model. Don't go higher than that.

@zorbaTheRainy
Author

zorbaTheRainy commented Mar 26, 2024

I'll post my compose file:

#docker-compose.yml
version: '2'
services:
  subgen:
    container_name: subgen
    tty: true
    image: mccloud/subgen
    environment:
       - "TRANSCRIBE_DEVICE=gpu"
       - "WHISPER_MODEL=medium"
       - "WHISPER_THREADS=4"
       - "PROCADDEDMEDIA=True"
       - "PROCMEDIAONPLAY=False"
       - "NAMESUBLANG=aa"
       - "SKIPIFINTERNALSUBLANG=eng"
       # - "PLEXTOKEN=plextoken"
       # - "PLEXSERVER=http://plexserver:32400"
       # - "JELLYFINTOKEN=token here"
       # - "JELLYFINSERVER=http://jellyfin:8096"
       - "WEBHOOKPORT=9010"
       - "CONCURRENT_TRANSCRIPTIONS=1"
       - "WORD_LEVEL_HIGHLIGHT=False"
       - "DEBUG=True"
       - "USE_PATH_MAPPING=False"
       - "PATH_MAPPING_FROM=/tv"
       - "PATH_MAPPING_TO=/Volumes/TV"
       - "CLEAR_VRAM_ON_COMPLETE=True"
       - "HF_TRANSFORMERS=False"
       - "HF_BATCH_SIZE=24"
       - "MODEL_PATH=./models"
       - "UPDATE=False"
       - "APPEND=False"
       - "TRANSCRIBE_FOLDERS=/tv|/movies"
       - "MONITOR=True"
    volumes:
       - 'D:\docker\subgen\tv:/tv'
       - 'D:\docker\subgen\movies:/movies'
       - 'D:\docker\subgen\models:/subgen/models'
    ports:
       - "9010:9010"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

and here is the intro part of the Docker log:

==========
== CUDA ==
==========

CUDA Version 12.2.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Environment variable UPDATE is not set or set to False, skipping download.
INFO:root:Subgen v2024.3.19.17
INFO:root:Starting Subgen with listening webhooks!
INFO:root:Transcriptions are limited to running 1 at a time
INFO:root:Running 4 threads per transcription
INFO:root:Using cuda to encode
INFO:root:Using faster-whisper
INFO:root:Starting to search folders to see if we need to create subtitles.
WARNING:libav.matroska,webm:Could not find codec parameters for stream 5 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
WARNING:libav.matroska,webm:Could not find codec parameters for stream 6 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
WARNING:libav.matroska,webm:Could not find codec parameters for stream 5 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
WARNING:libav.matroska,webm:Could not find codec parameters for stream 6 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
INFO:root:Added Z.1969.Multi.Complete.Bluray-Oldham.mkv for transcription.
INFO:root:1 files in the queue for transcription
INFO:root:Transcribing file: Z.1969.Multi.Complete.Bluray-Oldham.mkv

BTW, when you translate rather than transcribe, the Docker log does not update the progress percentages as it goes. Instead, it waits until the whole file is done and then outputs the 1%...2%...3% lines all at once.

@McCloudS
Owner

There is no graceful way to handle this, as we have to know the architecture before loading the model. Your best bet is using a lower model size or manually handling your OOM issues. You could also try distil-medium.en, as it has a slightly smaller memory footprint.
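
For example, manual handling could mean picking a size that fits before loading (a sketch only, not subgen code: it assumes PyTorch is available in the image for the VRAM query, that the installed faster-whisper ships the distilled models under the name distil-medium.en, and the byte thresholds are rough guesses rather than measured numbers):

# Sketch: choose a model size based on free VRAM before loading.
import torch
from faster_whisper import WhisperModel

def pick_model() -> WhisperModel:
    if torch.cuda.is_available():
        free_bytes, _total = torch.cuda.mem_get_info()
        if free_bytes > 4 * 1024**3:
            # plenty of room: full medium model on the GPU
            return WhisperModel("medium", device="cuda", compute_type="float16")
        if free_bytes > 2 * 1024**3:
            # tight (e.g. a 3 GB card): distilled model with a smaller footprint
            return WhisperModel("distil-medium.en", device="cuda", compute_type="int8_float16")
    # no CUDA or very little free VRAM: stay on the CPU
    return WhisperModel("small", device="cpu", compute_type="int8")

In the compose file above, the equivalent one-line change would be setting WHISPER_MODEL to a smaller size (e.g. "small") or to "distil-medium.en".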

@zorbaTheRainy
Author

OK
Thanks for looking into it.
