What does "Error: failed to call OrtRun(). error code = 6." mean? I know it is ONNX related, but how to fix? #732

Closed
jquintanilla4 opened this issue May 1, 2024 · 6 comments
Labels: question (Further information is requested)

Comments

@jquintanilla4

Question

I keep running into the same issue when using the transformers.js automatic speech recognition pipeline. I've tried solving it multiple ways, but I hit a wall every time. I've done lots of googling, asked LLMs, and drawn on my prior knowledge of how this stuff works in Python, but I can't seem to get it working.

I've tried setting up my environment with and without Vite. I've tried with React JavaScript and with React TypeScript. Nothing.

Am I missing a dependency or something? Is there a place I can find what the error code means? I couldn't find it documented anywhere.

I've fed it an array and I've fed it a .wav file. Nothing works. No matter what I do, I always get the same error:

An error occurred during model execution: "Error: failed to call OrtRun(). error code = 6.".
Inputs given to model: {input_features: Proxy(Tensor)}
Error transcribing audio: Error: failed to call OrtRun(). error code = 6.
    at e.run (wasm-core-impl.ts:392:1)
    at e.run (proxy-wrapper.ts:212:1)
    at e.OnnxruntimeWebAssemblySessionHandler.run (session-handler.ts:99:1)
    at InferenceSession.run (inference-session-impl.ts:108:1)
    at sessionRun (models.js:207:1)
    at encoderForward (models.js:520:1)
    at Function.seq2seqForward [as _forward] (models.js:361:1)
    at Function.forward (models.js:820:1)
    at Function.seq2seqRunBeam [as _runBeam] (models.js:480:1)
    at Function.runBeam (models.js:1373:1)

It seems to be an ONNX Runtime issue, but I don't know how to fix it. Any guidance would be appreciated.

Note: I'm currently testing with English. Nothing fancy.

jquintanilla4 added the question label on May 1, 2024
@xenova
Collaborator

xenova commented May 4, 2024

Hi there 👋 error code 6 is usually related to out-of-memory issues (the code comes from ONNX Runtime's OrtErrorCode enum, where 6 is ORT_RUNTIME_EXCEPTION). Can you provide the code you are running (as well as the model being used)?

@jquintanilla4
Author

jquintanilla4 commented May 6, 2024

The model I was trying to use was Whisper medium (Xenova/whisper-medium).
Here's the full code for the React component:

import React, { useRef, useState, useEffect } from 'react';
import { MediaRecorder, register } from 'extendable-media-recorder';
import { connect } from 'extendable-media-recorder-wav-encoder';
import { pipeline, env, read_audio } from '@xenova/transformers';

env.allowLocalModels = false;

interface AutomaticSpeechRecognitionOutput {
    text?: string;
}

const AudioInput: React.FC = () => {
    const [isRecording, setIsRecording] = useState<boolean>(false);
    const [audioBlob, setAudioBlob] = useState<Blob | null>(null);
    const [recordTime, setRecordTime] = useState<number>(0);
    const [transcription, setTranscription] = useState<string>('');
    const mediaRecorderRef = useRef<MediaRecorder | null>(null);
    const audioChunksRef = useRef<Blob[]>([]);
    const streamRef = useRef<MediaStream | null>(null);
    const recordIntervalRef = useRef<NodeJS.Timeout | null>(null);

    useEffect(() => {
        async function setupRecorder() {
            try {
                await register(await connect());
            } catch (error: any) {
                if (error.message.includes("already an encoder stored")) {
                    console.log("Encoder already registered, continuing...");
                } else {
                    console.error('Error registering encoder:', error);
                    return;
                }
            }

            try {
                const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
                streamRef.current = stream;
            } catch (error: any) {
                console.error('Error accessing microphone:', error);
            }
        }

        setupRecorder();

        return () => {
            if (streamRef.current) {
                streamRef.current.getTracks().forEach(track => track.stop());
            }
            if (recordIntervalRef.current) {
                clearInterval(recordIntervalRef.current);
            }
        };
    }, []);

    const transcribeAudio = async (audioBlob: Blob) => {
        // const arrayBuffer = await audioBlob.arrayBuffer();
        // const audioData = new Uint8Array(arrayBuffer);
        // const audioData = new Float32Array(arrayBuffer);
        const audioURL = URL.createObjectURL(audioBlob);
        const audioData = await read_audio(audioURL, 16000);

        try {
            const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium');
            console.log('Transcriber initialized.'); // Confirm that the transcriber has been initialized
            const output = await transcriber(audioData, { language: 'english', task: 'transcribe', chunk_length_s: 30, stride_length_s: 5 });
            console.log('Transcription output:', output); // Log the full output object
            if (output && !Array.isArray(output) && output.text) {
                setTranscription(output.text);
            } else {
                setTranscription('No transcription output');
            }

            URL.revokeObjectURL(audioURL); // clean up the object URL after use
        } catch (error: any) {
            console.error('Error transcribing audio:', error);
        }
    };
    
    const startRecording = () => {
        if (streamRef.current) {
            mediaRecorderRef.current = new MediaRecorder(streamRef.current, { mimeType: 'audio/wav' }) as any; // cast to any because this MediaRecorder comes from the extendable-media-recorder library
    
            if (mediaRecorderRef.current) { // Check if mediaRecorderRef.current is not null before adding event listeners
                mediaRecorderRef.current.addEventListener('dataavailable', (event: BlobEvent) => {
                    audioChunksRef.current.push(event.data);
                });
    
                mediaRecorderRef.current.addEventListener('stop', async () => {
                    if (mediaRecorderRef.current) { // Additional check before accessing mimeType
                        const mimeType = mediaRecorderRef.current.mimeType;
                        const audioBlob = new Blob(audioChunksRef.current, { type: mimeType });
                        setAudioBlob(audioBlob);
                        audioChunksRef.current = [];
                        await transcribeAudio(audioBlob);
                    }
                });
    
                audioChunksRef.current = [];
                mediaRecorderRef.current.start();
                setIsRecording(true);
                setRecordTime(0);
                recordIntervalRef.current = setInterval(() => {
                    setRecordTime(prevTime => prevTime + 1);
                }, 1000);
            } else {
                console.error('Failed to initialize MediaRecorder');
            }
        } else {
            console.error('Stream not initialized');
        }
    };

    const stopRecording = () => {
        if (mediaRecorderRef.current && mediaRecorderRef.current.state === 'recording') {
            mediaRecorderRef.current.stop();
            setIsRecording(false);
            if (recordIntervalRef.current) {
                clearInterval(recordIntervalRef.current);
            }
        } else {
            console.error('MediaRecorder not recording or not initialized');
        }
    };

    const playAudio = () => {
        if (audioBlob) {
            const audioURL = URL.createObjectURL(audioBlob);
            const audio = new Audio(audioURL);
            audio.play().catch((error: any) => {
                console.error('Error playing the audio:', error);
                URL.revokeObjectURL(audioURL);
            });
        }
    };

    return (
        <div className="audio-input-container">
            <button onClick={startRecording}>Start Recording</button>
            <button onClick={stopRecording}>Stop Recording</button>
            <button onClick={playAudio}>Play Audio</button>
            <p>Recording: {isRecording ? `${recordTime} seconds` : 'No'}</p>
            <p>Transcription: {transcription}</p>
        </div>
    );
};

export default AudioInput;

@xenova
Collaborator

xenova commented May 6, 2024

Note that every single call to

const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium');

allocates new memory for a pipeline (and takes a lot of time to construct the model). This is most likely the reason for your out-of-memory issues, since you call this every time you transcribe audio.
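
For example, you could construct the pipeline once and reuse it across calls. A rough sketch (with whisper-base swapped in; adapt it to your component as needed):

import { pipeline } from '@xenova/transformers';

// Construct the pipeline once at module level; every transcription then
// awaits the same promise instead of re-allocating the model each time.
const transcriberPromise = pipeline('automatic-speech-recognition', 'Xenova/whisper-base');

const transcribeAudio = async (audioData: Float32Array) => {
    const transcriber = await transcriberPromise;
    return transcriber(audioData, { language: 'english', task: 'transcribe', chunk_length_s: 30, stride_length_s: 5 });
};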

I would also recommend selecting a smaller model, like https://huggingface.co/Xenova/whisper-base, https://huggingface.co/Xenova/whisper-small, https://huggingface.co/Xenova/whisper-tiny, https://huggingface.co/distil-whisper/distil-medium.en, or https://huggingface.co/distil-whisper/distil-small.en.

Hope that helps!

@jquintanilla4
Author

jquintanilla4 commented May 7, 2024

That's good to know. I'll give those other models a shot. However, this happens on the first call.

The dev machine I'm running it on has an RTX 4090. I'm surprised memory is the issue, since I've never run into memory problems when running Whisper in Python. Does WebGPU have a memory ceiling?

Thanks for your help.

@xenova
Collaborator

xenova commented May 7, 2024

The dev machine I'm running it on has an RTX 4090. I'm surprised memory is the issue, since I've never run into memory problems when running Whisper in Python. Does WebGPU have a memory ceiling?

Assuming you are running Transformers.js v2, everything still runs with WASM/CPU. You can follow along with the development of v3 here, which will add WebGPU support.
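
For reference, in v3 the plan is to select the backend with a device option when constructing the pipeline — something along these lines, though the exact API may still change before release:

const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-base', { device: 'webgpu' });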

@jquintanilla4
Author

Gotcha, now it all makes sense. I'll keep an eye on v3. Thanks for your patience, and good luck with all the work ahead.
