
CUDA failed with error out of memory #72

Closed

zorbaTheRainy opened this issue Mar 25, 2024 · 6 comments

Comments

@zorbaTheRainy

I get the error:

INFO:root:Error processing or transcribing /movies/The Hollywood Revue of 1929 (1929)/The.Hollywood.Revue.of.1929.1929.DVDRip.XviD-BBM(iLC).avi: CUDA failed with error out of memory
My compose file includes:
- "TRANSCRIBE_DEVICE=gpu"
I request that, if transcription fails with this error, it fall back to using the CPU.
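
Something like this is what I have in mind (a minimal sketch only, not subgen's actual code: it assumes models are loaded through faster-whisper's WhisperModel, which the log further down confirms, and the function name and compute types are just illustrative):

# Hypothetical sketch: try CUDA first, retry on the CPU if CTranslate2
# reports an out-of-memory error while loading the model.
from faster_whisper import WhisperModel

def load_model_with_fallback(size: str = "medium") -> WhisperModel:
    try:
        # float16 is the usual GPU compute type for faster-whisper
        return WhisperModel(size, device="cuda", compute_type="float16")
    except RuntimeError as err:
        # CTranslate2 raises RuntimeError with a message like
        # "CUDA failed with error out of memory"
        if "out of memory" not in str(err).lower():
            raise
        # int8 keeps the CPU memory footprint modest, at some speed cost
        return WhisperModel(size, device="cpu", compute_type="int8")

This only covers a load-time failure; an OOM that happens mid-transcription would need the same handling wrapped around model.transcribe() as well.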

Otherwise, thanks so much for this program. I started to try Whisper on my own and was not happy with the results. Thank you so much for doing all the hard work for me.

@McCloudS
Owner

McCloudS commented Mar 25, 2024 via email

@zorbaTheRainy
Author

zorbaTheRainy commented Mar 25, 2024

Oh yes, the GPU works fine for other movies. This one is just trouble.

EDIT: Actually, another movie just failed (an H.265 MKV), so I assume the problem is the lack of VRAM on my old GPU. The other movies were just simpler (480p AVIs).

It is only a GTX 1060 3 GB, so it isn't the best GPU, but for other movies it works fine.

It would be nice if you (1) made an ARM64 (CPU) image, and (2) made a GPU image for AMD / Intel N100 devices. But I am guessing the "fall back to CPU on failure" would be easier.

@hnorgaar

I have the same card as you and have never had any failures with the medium model. Don't go higher than that.

@zorbaTheRainy
Author

zorbaTheRainy commented Mar 26, 2024

I'll post my compose file:

#docker-compose.yml
version: '2'
services:
  subgen:
    container_name: subgen
    tty: true
    image: mccloud/subgen
    environment:
       - "TRANSCRIBE_DEVICE=gpu"
       - "WHISPER_MODEL=medium"
       - "WHISPER_THREADS=4"
       - "PROCADDEDMEDIA=True"
       - "PROCMEDIAONPLAY=False"
       - "NAMESUBLANG=aa"
       - "SKIPIFINTERNALSUBLANG=eng"
       # - "PLEXTOKEN=plextoken"
       # - "PLEXSERVER=http://plexserver:32400"
       # - "JELLYFINTOKEN=token here"
       # - "JELLYFINSERVER=http://jellyfin:8096"
       - "WEBHOOKPORT=9010"
       - "CONCURRENT_TRANSCRIPTIONS=1"
       - "WORD_LEVEL_HIGHLIGHT=False"
       - "DEBUG=True"
       - "USE_PATH_MAPPING=False"
       - "PATH_MAPPING_FROM=/tv"
       - "PATH_MAPPING_TO=/Volumes/TV"
       - "CLEAR_VRAM_ON_COMPLETE=True"
       - "HF_TRANSFORMERS=False"
       - "HF_BATCH_SIZE=24"
       - "MODEL_PATH=./models"
       - "UPDATE=False"
       - "APPEND=False"
       - "TRANSCRIBE_FOLDERS=/tv|/movies"
       - "MONITOR=True"
    volumes:
       - 'D:\docker\subgen\tv:/tv'
       - 'D:\docker\subgen\movies:/movies'
       - 'D:\docker\subgen\models:/subgen/models'
    ports:
       - "9010:9010"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

and here is the intro part of the Docker log:

==========
== CUDA ==
==========

CUDA Version 12.2.2

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Environment variable UPDATE is not set or set to False, skipping download.
INFO:root:Subgen v2024.3.19.17
INFO:root:Starting Subgen with listening webhooks!
INFO:root:Transcriptions are limited to running 1 at a time
INFO:root:Running 4 threads per transcription
INFO:root:Using cuda to encode
INFO:root:Using faster-whisper
INFO:root:Starting to search folders to see if we need to create subtitles.
WARNING:libav.matroska,webm:Could not find codec parameters for stream 5 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
WARNING:libav.matroska,webm:Could not find codec parameters for stream 6 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
WARNING:libav.matroska,webm:Could not find codec parameters for stream 5 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
WARNING:libav.matroska,webm:Could not find codec parameters for stream 6 (Subtitle: hdmv_pgs_subtitle (pgssub)): unspecified size
Consider increasing the value for the 'analyzeduration' (0) and 'probesize' (5000000) options
INFO:root:Added Z.1969.Multi.Complete.Bluray-Oldham.mkv for transcription.
INFO:root:1 files in the queue for transcription
INFO:root:Transcribing file: Z.1969.Multi.Complete.Bluray-Oldham.mkv

BTW, when you translate rather than transcribe, the Docker log does not update the progress percentages as it goes. Instead, it waits until the whole file is done and then outputs the 1%...2%...3% lines all at once.

@McCloudS
Owner

There is no graceful way to handle this, as we have to know the architecture before loading the model. Your best bet is using a lower model size or manually handling your OOM issues. You could also try distil-medium.en, as it has a slightly smaller memory footprint.
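
For example, manual handling could mean picking a size that fits before loading (a sketch only, not subgen code: it assumes PyTorch is available in the image for the VRAM query, that the installed faster-whisper ships the distilled models under the name distil-medium.en, and the byte thresholds are rough guesses rather than measured numbers):

# Sketch: choose a model size based on free VRAM before loading.
import torch
from faster_whisper import WhisperModel

def pick_model() -> WhisperModel:
    if torch.cuda.is_available():
        free_bytes, _total = torch.cuda.mem_get_info()
        if free_bytes > 4 * 1024**3:
            # plenty of room: full medium model on the GPU
            return WhisperModel("medium", device="cuda", compute_type="float16")
        if free_bytes > 2 * 1024**3:
            # tight (e.g. a 3 GB card): distilled model with a smaller footprint
            return WhisperModel("distil-medium.en", device="cuda", compute_type="int8_float16")
    # no CUDA or very little free VRAM: stay on the CPU
    return WhisperModel("small", device="cpu", compute_type="int8")

In the compose file above, the equivalent one-line change would be setting WHISPER_MODEL to a smaller size (e.g. "small") or to "distil-medium.en".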

@zorbaTheRainy
Author

OK
Thanks for looking into it.
