What does "Error: failed to call OrtRun(). error code = 6." mean? I know it is ONNX related, but how to fix? #732

Closed
jquintanilla4 opened this issue May 1, 2024 · 6 comments
Labels: question (Further information is requested)

Comments

@jquintanilla4

Question

I keep running into the same issue when using the transformers.js automatic speech recognition pipeline. I've tried solving it multiple ways, but I hit a wall every time. I've done lots of googling, asked LLMs, and drawn on my prior knowledge of how this stuff works in Python, but I can't seem to get it working.

I've tried setting up my environment with and without Vite. I've tried with React JavaScript and with React TypeScript. Nothing.

Am I missing a dependency or something? Is there a place I can find what the error code means? I couldn't find it documented anywhere.

I've fed it an array and I've fed it a .wav file. Nothing works. No matter what I do, I always get the same error:

An error occurred during model execution: "Error: failed to call OrtRun(). error code = 6.".
Inputs given to model: {input_features: Proxy(Tensor)}
Error transcribing audio: Error: failed to call OrtRun(). error code = 6.
    at e.run (wasm-core-impl.ts:392:1)
    at e.run (proxy-wrapper.ts:212:1)
    at e.OnnxruntimeWebAssemblySessionHandler.run (session-handler.ts:99:1)
    at InferenceSession.run (inference-session-impl.ts:108:1)
    at sessionRun (models.js:207:1)
    at encoderForward (models.js:520:1)
    at Function.seq2seqForward [as _forward] (models.js:361:1)
    at Function.forward (models.js:820:1)
    at Function.seq2seqRunBeam [as _runBeam] (models.js:480:1)
    at Function.runBeam (models.js:1373:1)

It seems to be an ONNX Runtime issue, but I don't know how to fix it. Any guidance would be appreciated.

Note: I'm currently testing with English. Nothing fancy.

jquintanilla4 added the question label on May 1, 2024
@xenova
Collaborator

xenova commented May 4, 2024

Hi there 👋 error code 6 is usually related to out-of-memory issues (the code comes from ONNX Runtime's OrtErrorCode enum, where 6 is ORT_RUNTIME_EXCEPTION). Can you provide the code you are running (as well as the model being used)?

@jquintanilla4
Author

jquintanilla4 commented May 6, 2024

The model I was trying to use was Whisper medium (Xenova/whisper-medium).
Here's the full code for the React component:

import React, { useRef, useState, useEffect } from 'react';
import { MediaRecorder, register } from 'extendable-media-recorder';
import { connect } from 'extendable-media-recorder-wav-encoder';
import { pipeline, env, read_audio } from '@xenova/transformers';

env.allowLocalModels = false;

interface AutomaticSpeechRecognitionOutput {
    text?: string;
}

const AudioInput: React.FC = () => {
    const [isRecording, setIsRecording] = useState<boolean>(false);
    const [audioBlob, setAudioBlob] = useState<Blob | null>(null);
    const [recordTime, setRecordTime] = useState<number>(0);
    const [transcription, setTranscription] = useState<string>('');
    const mediaRecorderRef = useRef<MediaRecorder | null>(null);
    const audioChunksRef = useRef<Blob[]>([]);
    const streamRef = useRef<MediaStream | null>(null);
    const recordIntervalRef = useRef<NodeJS.Timeout | null>(null);

    useEffect(() => {
        async function setupRecorder() {
            try {
                await register(await connect());
            } catch (error: any) {
                if (error.message.includes("already an encoder stored")) {
                    console.log("Encoder already registered, continuing...");
                } else {
                    console.error('Error registering encoder:', error);
                    return;
                }
            }

            try {
                const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
                streamRef.current = stream;
            } catch (error: any) {
                console.error('Error accessing microphone:', error);
            }
        }

        setupRecorder();

        return () => {
            if (streamRef.current) {
                streamRef.current.getTracks().forEach(track => track.stop());
            }
            if (recordIntervalRef.current) {
                clearInterval(recordIntervalRef.current);
            }
        };
    }, []);

    const transcribeAudio = async (audioBlob: Blob) => {
        // const arrayBuffer = await audioBlob.arrayBuffer();
        // const audioData = new Uint8Array(arrayBuffer);
        // const audioData = new Float32Array(arrayBuffer);
        const audioURL = URL.createObjectURL(audioBlob);
        const audioData = await read_audio(audioURL, 16000);

        try {
            const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium');
            console.log('Transcriber initialized.'); // Confirm that the transcriber has been initialized
            const output = await transcriber(audioData, { language: 'english', task: 'transcribe', chunk_length_s: 30, stride_length_s: 5 });
            console.log('Transcription output:', output); // Log the full output object
            if (output && !Array.isArray(output) && output.text) {
                setTranscription(output.text);
            } else {
                setTranscription('No transcription output');
            }

            URL.revokeObjectURL(audioURL); // clean up the object URL after use
        } catch (error: any) {
            console.error('Error transcribing audio:', error);
        }
    };
    
    const startRecording = () => {
        if (streamRef.current) {
            mediaRecorderRef.current = new MediaRecorder(streamRef.current, { mimeType: 'audio/wav' }) as any; // cast to any because this MediaRecorder comes from the extendable-media-recorder library
    
            if (mediaRecorderRef.current) { // Check if mediaRecorderRef.current is not null before adding event listeners
                mediaRecorderRef.current.addEventListener('dataavailable', (event: BlobEvent) => {
                    audioChunksRef.current.push(event.data);
                });
    
                mediaRecorderRef.current.addEventListener('stop', async () => {
                    if (mediaRecorderRef.current) { // Additional check before accessing mimeType
                        const mimeType = mediaRecorderRef.current.mimeType;
                        const audioBlob = new Blob(audioChunksRef.current, { type: mimeType });
                        setAudioBlob(audioBlob);
                        audioChunksRef.current = [];
                        await transcribeAudio(audioBlob);
                    }
                });
    
                audioChunksRef.current = [];
                mediaRecorderRef.current.start();
                setIsRecording(true);
                setRecordTime(0);
                recordIntervalRef.current = setInterval(() => {
                    setRecordTime(prevTime => prevTime + 1);
                }, 1000);
            } else {
                console.error('Failed to initialize MediaRecorder');
            }
        } else {
            console.error('Stream not initialized');
        }
    };

    const stopRecording = () => {
        if (mediaRecorderRef.current && mediaRecorderRef.current.state === 'recording') {
            mediaRecorderRef.current.stop();
            setIsRecording(false);
            if (recordIntervalRef.current) {
                clearInterval(recordIntervalRef.current);
            }
        } else {
            console.error('MediaRecorder not recording or not initialized');
        }
    };

    const playAudio = () => {
        if (audioBlob) {
            const audioURL = URL.createObjectURL(audioBlob);
            const audio = new Audio(audioURL);
            audio.play().catch((error: any) => {
                console.error('Error playing the audio:', error);
                URL.revokeObjectURL(audioURL);
            });
        }
    };

    return (
        <div className="audio-input-container">
            <button onClick={startRecording}>Start Recording</button>
            <button onClick={stopRecording}>Stop Recording</button>
            <button onClick={playAudio}>Play Audio</button>
            <p>Recording: {isRecording ? `${recordTime} seconds` : 'No'}</p>
            <p>Transcription: {transcription}</p>
        </div>
    );
};

export default AudioInput;

@xenova
Collaborator

xenova commented May 6, 2024

Note that every single call to

const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-medium');

allocates new memory for a pipeline (and takes a lot of time to construct the model). This is most likely the reason for your out-of-memory issues, since you call this every time you transcribe audio.
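
For example, you could construct the pipeline once and reuse it across calls. A rough sketch (with whisper-base swapped in; adapt it to your component as needed):

import { pipeline } from '@xenova/transformers';

// Construct the pipeline once at module level; every transcription then
// awaits the same promise instead of re-allocating the model each time.
const transcriberPromise = pipeline('automatic-speech-recognition', 'Xenova/whisper-base');

const transcribeAudio = async (audioData: Float32Array) => {
    const transcriber = await transcriberPromise;
    return transcriber(audioData, { language: 'english', task: 'transcribe', chunk_length_s: 30, stride_length_s: 5 });
};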

I would also recommend selecting a smaller model, like https://huggingface.co/Xenova/whisper-base, https://huggingface.co/Xenova/whisper-small, https://huggingface.co/Xenova/whisper-tiny, https://huggingface.co/distil-whisper/distil-medium.en, or https://huggingface.co/distil-whisper/distil-small.en.

Hope that helps!

@jquintanilla4
Author

jquintanilla4 commented May 7, 2024

That's good to know. I'll give those other models a shot. However, this happens on the first call.

The dev machine I'm running it on has an RTX 4090. I'm surprised memory is the issue, since I've never run into memory problems when running Whisper in Python. Does WebGPU have a memory ceiling?

Thanks for your help.

@xenova
Collaborator

xenova commented May 7, 2024

The dev machine I'm running it on has an RTX 4090. I'm surprised memory is the issue, since I've never run into memory problems when running Whisper in Python. Does WebGPU have a memory ceiling?

Assuming you are running Transformers.js v2, everything still runs with WASM/CPU. You can follow along with the development of v3 here, which will add WebGPU support.
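
For reference, in v3 the plan is to select the backend with a device option when constructing the pipeline — something along these lines, though the exact API may still change before release:

const transcriber = await pipeline('automatic-speech-recognition', 'Xenova/whisper-base', { device: 'webgpu' });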

@jquintanilla4
Author

Gotcha, now it all makes sense. I'll keep an eye on v3. Thanks for your patience, and good luck with all the work ahead.
