V3 audio transcription: aud.subarray is not a function #845

flatsiedatsie · 2024-07-10T17:22:03Z

System Info

Cutting edge version of V3 (just compiled)

Environment/Platform

Description

I attempted a drop-in replacement of the V3 version in a V2 webworker, just to see if it would even work.

And it does! Trying to enable WebGPU is the next step.

However, while it works, I do see an error:

Reproduction

I could share code if need be.

flatsiedatsie · 2024-07-10T18:12:16Z

I spotted a small typo in the demo:

PipelineSingeton is missing an L

flatsiedatsie · 2024-07-10T18:28:07Z

A small question: is there a downside to simply always using the timestamp version?

flatsiedatsie · 2024-07-10T18:36:53Z

And some more questions:

The V2 demo has a number of values that could be manipulated. Are they still useful?

const isDistilWhisper = model.startsWith("distil-whisper/");
quantized

and these:

    // Greedy
    //top_k: 0,
    //do_sample: false,

    // Sliding window
    //chunk_length_s: isDistilWhisper ? 20 : 30,
    //stride_length_s: isDistilWhisper ? 3 : 5,

    // Language and task
    //language: language,
    task: subtask,

    // Return timestamps
    //return_timestamps: true,
    //force_full_sequences: false,

flatsiedatsie · 2024-07-10T23:05:18Z

I saw this error once:

flatsiedatsie · 2024-07-10T23:08:28Z

I'm having some trouble getting it to respond with more than 1 word when using WebGPU. It usually thinks it heard 'And'.

flatsiedatsie · 2024-07-11T00:39:05Z

FP32 was the key
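For readers landing here with the same one-word WebGPU output: the thread does not show the exact call that fixed it, so the following is only a sketch of what "FP32" most likely refers to. It assumes the `device` and `dtype` pipeline options from Transformers.js V3; the variable name `pipelineOptions` is hypothetical.

```javascript
// Assumed option names (Transformers.js V3); the thread does not show the actual call.
const pipelineOptions = {
  device: "webgpu", // run inference on the GPU via WebGPU
  dtype: "fp32",    // full-precision weights, as opposed to a quantized variant
};

// These options would be passed as the third argument when creating the pipeline, e.g.
// pipeline("automatic-speech-recognition", modelId, pipelineOptions)
```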

xenova · 2024-07-11T09:11:01Z

Thanks for the report! The function does require Float32Array or Float64Array inputs, but we could fall back to .slice() when .subarray isn't present (i.e. for normal arrays).
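The fallback described here could look roughly like the sketch below. This is not the library's code: `sliceSamples` is a hypothetical helper name, and only the parameter name `aud` is borrowed from the error message in the issue title.

```javascript
// Hypothetical helper, not part of the library: return samples [start, end)
// from either a typed array or a plain array.
function sliceSamples(aud, start, end) {
  if (typeof aud.subarray === "function") {
    // Typed arrays (Float32Array, Float64Array): a view on the same buffer, no copy.
    return aud.subarray(start, end);
  }
  // Plain arrays have no .subarray, so fall back to a shallow copy.
  return aud.slice(start, end);
}
```

Note the difference in semantics: `.subarray()` shares memory with the original buffer, while `.slice()` copies, so writes to the result only propagate back in the typed-array case.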

flatsiedatsie · 2024-07-11T12:26:19Z

Nah, no worries. Given what you're saying, I believe the issue was that I was feeding it a fake array to get it to preload. That worked with V2 but no longer works with V3. That's fine, though, as there are plenty of new ways to handle preloading.
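If a dummy input is still wanted for warm-up, one option is to make it a real Float32Array so that `.subarray` exists. The sketch below is an assumption-laden illustration: `toFloat32`, `SAMPLE_RATE`, and `warmupAudio` are hypothetical names, the 16 kHz rate is assumed for Whisper-style models, and whether silence is a suitable warm-up input is not confirmed in this thread.

```javascript
// Hypothetical helper: coerce plain arrays to Float32Array, pass typed arrays through.
function toFloat32(samples) {
  if (samples instanceof Float32Array || samples instanceof Float64Array) {
    return samples;
  }
  return Float32Array.from(samples);
}

// Assumed sample rate for Whisper-style models.
const SAMPLE_RATE = 16000;

// One second of silence (all zeros) as a dummy warm-up input.
const warmupAudio = new Float32Array(SAMPLE_RATE);
```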

flatsiedatsie added the bug (Something isn't working) label Jul 10, 2024

flatsiedatsie closed this as completed Jul 11, 2024
