Using a microphone #80

yakovw · 2023-06-19T18:19:19Z

Is there a way in the library to use the microphone, and not just transcribe an existing recording?
I ask because the original library, whisper.cpp, has this.

sandrohanea · 2023-06-19T18:46:02Z

@adamnova added something like this to the demo in: #9

However, I didn't want to add the NAudio dependency to the full demo. Since then, each example has moved into its own project, where having NAudio is no problem.

I think it makes sense to move it into a standalone example.

Also, for the best mic support, continuous recognition is a must: #25
Otherwise, the transcript can be bad near the points where segments are merged.

I would also add a mic example for Blazor, as that would be pretty cool.

adamnova · 2023-06-20T06:01:23Z

My demo was basically a proof of concept; it is not very usable in practice. Without continuous recognition, all you get is somewhat repetitive lines of text.

jbienz · 2023-10-23T03:10:05Z

It appears there is now continuous recognition here:

https://github.com/sandrohanea/whisper.net/tree/main/examples/ContinuousRecognition

It appears that's an example rather than part of core, though. Is there a chance of getting a microphone sample now?

danroot · 2023-11-07T01:28:04Z

I was able to get realtime transcription from the mic working on my M1 Mac using the code below, which uses OpenTK.OpenAL. It is stitched together from various Stack Overflow posts and could be improved, but it may be helpful to others looking to do something similar. I ended up having to download the CoreML model manually, unzip it, and put it in the current folder. Ideally, IMO, Whisper.net would "just work" and download this model when running on Apple silicon, similar to how it does for the base .bin model.

The other "gotcha" I ran into was that I needed to specify a float[] buffer and ALFormat.MonoFloat32Ext as the capture format.

    using OpenTK.Audio.OpenAL;
    using Whisper.net;
    using Whisper.net.Ggml;

    var modelName = "ggml-base.bin";
    // TODO: also https://huggingface.co/ggerganov/whisper.cpp/blob/main/ggml-base-encoder.mlmodelc.zip
    if (!File.Exists(modelName))
    {
        Console.WriteLine("Downloading whisper model...");
        using var modelStream = await WhisperGgmlDownloader.GetGgmlModelAsync(GgmlType.Base);
        using var fileWriter = File.OpenWrite(modelName);
        await modelStream.CopyToAsync(fileWriter);
    }
    using var whisperFactory = WhisperFactory.FromPath(modelName);

    using var processor = whisperFactory.CreateBuilder()
        .WithLanguage("en")
        .Build();

    // Whisper expects 16 kHz mono float samples.
    const int sampleRate = 16000;
    int bufferLength = 10 * sampleRate; // 10 sec
    var mic = ALC.CaptureOpenDevice(null, sampleRate, ALFormat.MonoFloat32Ext, bufferLength);
    Console.WriteLine("Using:");
    Console.WriteLine(ALC.GetString(new ALDevice(mic.Handle), AlcGetString.DeviceSpecifier));
    ALC.CaptureStart(mic);
    var buffer = new float[bufferLength];

    // Poll the capture device once per second and transcribe whatever has arrived.
    for (int i = 0; i < 100; ++i)
    {
        await Task.Delay(1000);
        int samplesAvailable = ALC.GetAvailableSamples(mic);

        if (samplesAvailable > 0)
        {
            ALC.CaptureSamples(mic, buffer, samplesAvailable);
            await foreach (var resultData in processor.ProcessAsync(buffer[..samplesAvailable]))
            {
                Console.WriteLine("RAW:" + resultData.Text);
            }
        }
    }
    ALC.CaptureStop(mic);
    ALC.CaptureCloseDevice(mic);

sandrohanea · 2023-12-06T19:17:38Z

I will close any issue related to streaming processing and link it to: #25

sandrohanea added the examples label Jun 19, 2023

sandrohanea added the enhancement New feature or request label Jul 1, 2023

sandrohanea mentioned this issue Nov 7, 2023

Can we provide an example of the input sound to Whisper.net through the websocket? #117

Closed

sandrohanea closed this as completed Dec 6, 2023
