GPU memory keeps growing during speech-to-text and eventually runs out of memory #2881

sunzhaoyang · 2023-02-06T08:33:27Z

General Question

Because the audio files are fairly large, I split them into 20 s segments before running recognition.

import auditok
from paddlespeech.cli.text.infer import TextExecutor
from paddlespeech.cli.asr.infer import ASRExecutor
import sys
from tempfile import NamedTemporaryFile
import os
from pydub import AudioSegment

def dot(txt):
    text_punc = TextExecutor()
    result = text_punc(txt)
    return result


# split returns a generator of AudioRegion objects

for root, dirs, files in os.walk(".", topdown=False):
    for name in files:
        if name.endswith('mp3'):
            full_path = os.path.join(root, name)
            print(full_path)
            wav_file = full_path.replace('.mp3', '.wav')
            txt_file = full_path.replace('.mp3', '.txt')
            # convert to wav
            sound = AudioSegment.from_mp3(full_path)
            sound.export(wav_file, format="wav")

            audio_regions = auditok.split(
                wav_file,
                min_dur=0.2,     # minimum duration of a valid audio event in seconds
                max_dur=20,       # maximum duration of an event
                max_silence=10,  # maximum duration of tolerated continuous silence within an event
                energy_threshold=55  # threshold of detection
            )

            with open(txt_file, 'w') as t:

                for i, r in enumerate(audio_regions):
                    with NamedTemporaryFile(suffix='.wav') as f:
                        r.save(f.name)
                        asr = ASRExecutor()
                        raw_result = asr(audio_file=f.name, force_yes=True)
                        t.write(dot(raw_result))

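As the audio chunks are recognized one by one, GPU memory usage keeps climbing, from a few hundred MB up to 8 GB, until the process finally runs out of memory.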

I tried setting FLAGS_use_cuda_managed_memory to both true and false; neither helped.

CUDA version: 11.2

W0206 08:26:41.720482 13625 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.2, Runtime API Version: 11.2
W0206 08:26:41.722322 13625 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.

paddlepaddle-gpu         2.4.1.post112

yt605155624 · 2023-02-06T08:43:10Z

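asr = ASRExecutor() only needs to be initialized once; try moving it outside the for loop. As written, the code not only creates a new object on every iteration, but repeated inference also gets no speedup. See #1256 (comment).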

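TextExecutor() also only needs to be initialized once. You could try creating a single global ASRExecutor() and a single global TextExecutor() at the outermost level of the program.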
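A minimal sketch of that suggestion, reusing the same calls as the original script. The helper names `build_executors` and `transcribe_chunks` are hypothetical and only serve the illustration; the executors are passed in as arguments so each is constructed exactly once and shared across all chunks.

```python
from typing import Callable, Iterable, List


def build_executors():
    """Create the ASR and punctuation executors once, for reuse across all chunks."""
    # Imported lazily so the rest of the sketch can be read and exercised
    # without PaddleSpeech installed.
    from paddlespeech.cli.asr.infer import ASRExecutor
    from paddlespeech.cli.text.infer import TextExecutor
    return ASRExecutor(), TextExecutor()


def transcribe_chunks(chunk_paths: Iterable[str],
                      asr: Callable[..., str],
                      punc: Callable[[str], str]) -> List[str]:
    """Run recognition and punctuation restoration on each chunk.

    `asr` and `punc` are created by the caller, so no executor is
    constructed inside the loop.
    """
    results = []
    for path in chunk_paths:
        # Same call signature as in the original script.
        raw_result = asr(audio_file=path, force_yes=True)
        results.append(punc(raw_result))
    return results
```

In the original script this would mean calling `asr, punc = build_executors()` once before the `os.walk` loop and passing both objects to every per-chunk call, instead of creating `ASRExecutor()` and `TextExecutor()` for each region.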

coderLinJ5945 · 2023-09-19T08:32:39Z

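If asr = ASRExecutor() is initialized once, for example when a microservice that exposes it through an API starts up, will concurrent calls to that API cause concurrency problems with the shared ASRExecutor?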
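This thread does not say whether a shared ASRExecutor is safe to call from several threads at once. One conservative option, assuming thread safety is unknown, is to serialize calls with a lock. The wrapper class `SerializedExecutor` below is hypothetical and not part of PaddleSpeech; it works with any callable.

```python
import threading


class SerializedExecutor:
    """Wrap a shared callable so that only one thread runs it at a time."""

    def __init__(self, executor):
        self._executor = executor
        self._lock = threading.Lock()

    def __call__(self, *args, **kwargs):
        # Concurrent callers block here until the current call finishes.
        with self._lock:
            return self._executor(*args, **kwargs)
```

A service could build `SerializedExecutor(ASRExecutor())` once at startup and have its request handlers call that wrapper. The trade-off is that requests are processed one at a time, so throughput is bounded by single-call latency.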

sunzhaoyang added the Question label Feb 6, 2023

yt605155624 added the S2T asr/st label Feb 6, 2023

iftaken assigned zh794390558 Feb 9, 2023

sunzhaoyang closed this as completed Feb 28, 2023

DazzlingGalaxy mentioned this issue Mar 15, 2023

PaddleOCR installation error PaddlePaddle/PaddleOCR#9426

Closed
