How to create a PullAudioInputStreamCallback for translate #95

Closed
DennisLJ opened this issue Nov 5, 2018 · 10 comments
Comments

@DennisLJ

DennisLJ commented Nov 5, 2018

I want to capture audio loopback and use it as the source for the recognizer.
In C#, how can I create a PullAudioInputStreamCallback that captures the active speaker output as the input source?

@DennisLJ DennisLJ changed the title How to create a PullAudioInputStreamCallback to translate the voice come out from speaker How to create a PullAudioInputStreamCallback for translate Nov 5, 2018
@wolfma61
Contributor

wolfma61 commented Nov 5, 2018

I'm not sure I understand your scenario.

The audio stream you are hearing is created somewhere. You have to intercept it and send it to the SDK. There are programs that intercept and capture playing audio on Windows. To my knowledge, they inject themselves into the audio pipeline or install their own audio drivers.

This functionality is not supplied by the SDK.

Thanks,
Wolfgang

@DennisLJ
Author

DennisLJ commented Nov 5, 2018

I can capture the audio playing through the speaker with WasapiLoopbackCapture, provided by CSCore, and I can write the captured stream to a WAV file through a buffer. The stream is 16000 samples per second, 16 bit, 1 channel.
Unfortunately, I don't know how to copy the captured stream into a stream that the recognizer can use.
Should I use PushAudioInputStream? This is what I tried with PushAudioInputStream:

//code
private PushAudioInputStream inputstream;
var audioFormat = AudioStreamFormat.GetWaveFormatPCM(samplesPerSecond, bitsPerSample, channels);
inputstream = PushAudioInputStream.CreatePushStream(audioFormat);

public void StartCapture()
{
    devices = MMDeviceEnumerator.EnumerateDevices(CSCore.CoreAudioAPI.DataFlow.Render, DeviceState.Active);
    if (!devices.Any())
    {
        System.Windows.MessageBox.Show("No active speaker found!");
        this.Close();
    }

    for (int i = 0; i < devices.Count; i++)
    {
        if (devices[i].FriendlyName.IndexOf("phone", 0) > 0)
            SelectedDevice = devices[i];
    }
    //System.Windows.MessageBox.Show($"final:"+ SelectedDevice.FriendlyName);

    _soundIn = new WasapiLoopbackCapture();
    _soundIn.Device = SelectedDevice;
    _soundIn.Initialize();

    SoundInSource soundInSource = new SoundInSource(_soundIn) { FillWithZeros = false };

    convertedSource = soundInSource
        .ChangeSampleRate(16000) // sample rate
        .ToSampleSource()
        .ToWaveSource(16)  //bits per sample
        .ToMono();

    _writer = new WaveWriter("a.wav", convertedSource.WaveFormat);

    buffer = new byte[convertedSource.WaveFormat.BytesPerSecond];
    int read;
    soundInSource.DataAvailable += (s, e) =>
    {
        while ((read = convertedSource.Read(buffer, 0, buffer.Length)) > 0)
        {
            this.inputstream.Write(buffer);
            _writer.Write(buffer, 0, read);
        }
    };
    _soundIn.Start();
}
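
Editor's note: in the capture loop above, the WAV writer receives only the `read` bytes returned by each call, while the push stream receives the whole one-second `buffer`, including whatever stale bytes sit past `read`. A minimal sketch of a copy loop that forwards only the bytes actually produced follows. `AudioPump` and `Drain` are hypothetical names used for illustration, and the delegates stand in for the real source and sink so the snippet does not depend on CSCore or the Speech SDK.

```csharp
using System;

// Hypothetical helper for illustration; not part of CSCore or the Speech SDK.
public static class AudioPump
{
    // Drains `read` into `write`, forwarding only the bytes each call produced.
    // `read(buffer, offset, count)` has the same shape as Stream.Read and as the
    // convertedSource.Read call in the thread.
    // `write(buffer, count)` stands in for a sink that accepts a byte count.
    public static int Drain(
        Func<byte[], int, int, int> read,
        Action<byte[], int> write,
        byte[] buffer)
    {
        int total = 0;
        int n;
        while ((n = read(buffer, 0, buffer.Length)) > 0)
        {
            write(buffer, n);   // pass the count, not the whole buffer
            total += n;
        }
        return total;
    }
}
```

Assuming the SDK version in use offers a `PushAudioInputStream.Write(byte[] dataBuffer, int size)` overload, the handler body could then become something like `AudioPump.Drain(convertedSource.Read, inputstream.Write, buffer)`; that call is an untested assumption, not code taken from this thread.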

@wolfma61
Contributor

wolfma61 commented Nov 5, 2018

It looks like 'buffer' holds the audio that you write to the stream?
You could now use the pull or push stream functionality of the SDK to transfer this audio to the service.

You also need the wave-file header; it needs to be transmitted to the service (the size isn't relevant).

A new push/pull audio stream sample will be published in this repo later this week.

Wolfgang
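
Editor's note: for the pull route, the capture side pushes chunks whenever audio arrives, while the SDK asks for bytes through a `Read` call. A small queue can bridge the two. The sketch below uses only the .NET base class library; `CaptureBuffer` is a hypothetical name, not something from the SDK or CSCore. The assumption is that a `PullAudioInputStreamCallback` subclass would forward its `Read(byte[] dataBuffer, uint size)` override to `CaptureBuffer.Read` and that returning 0 signals end of stream.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: queues chunks pushed by a capture callback and hands
// them out through a blocking, pull-style Read.
public sealed class CaptureBuffer
{
    private readonly BlockingCollection<byte[]> _chunks = new BlockingCollection<byte[]>();
    private byte[] _current = new byte[0];
    private int _offset;

    // Capture side (e.g. a DataAvailable handler). Copies exactly `count`
    // bytes so the caller is free to reuse its buffer afterwards.
    public void Write(byte[] data, int count)
    {
        if (count <= 0) return;
        var copy = new byte[count];
        Buffer.BlockCopy(data, 0, copy, 0, count);
        _chunks.Add(copy);
    }

    // Signals that no more audio will arrive.
    public void Complete()
    {
        _chunks.CompleteAdding();
    }

    // Pull side: blocks until data is available; returns 0 only after
    // Complete() has been called and the queue has been drained.
    public int Read(byte[] destination, int size)
    {
        if (_offset == _current.Length)
        {
            byte[] next;
            if (!_chunks.TryTake(out next, Timeout.Infinite))
            {
                return 0; // completed and empty: end of stream
            }
            _current = next;
            _offset = 0;
        }
        int n = Math.Min(size, _current.Length - _offset);
        Buffer.BlockCopy(_current, _offset, destination, 0, n);
        _offset += n;
        return n;
    }
}
```

A single consumer is assumed here; `Read` is not safe to call from several threads at once. The capture handler would call `Write(buffer, read)` for every chunk and `Complete()` when capture stops.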

@DennisLJ
Author

DennisLJ commented Nov 5, 2018

@wolfma61 Thank you. I will try adding the wave-file header to the stream before writing the data from "buffer".

@DennisLJ
Author

DennisLJ commented Nov 6, 2018

@wolfma61 I tried adding a wave header to the stream before writing the audio data to it, but the recognizer still does not work.

Can you help check the code?
inputstream is the PushAudioInputStream.

        //Wave Header
        var stream1 =new MemoryStream();
        int totalSampleCount ;
        uint sampleRate  ;
        int bitDepth;
        bool isFloatingPoint ;
        int channelCount;

        totalSampleCount = 1000;
        sampleRate = 16000;
        bitDepth = 16;
        isFloatingPoint = false;
        channelCount = 1;

        stream1.Position = 0;
        // RIFF header.
        // Chunk ID.
        stream1.Write(Encoding.ASCII.GetBytes("RIFF"), 0, 4);
        // Chunk size.
        stream1.Write(BitConverter.GetBytes(((bitDepth / 8) * totalSampleCount) + 36), 0, 4);
        // Format.
        stream1.Write(Encoding.ASCII.GetBytes("WAVE"), 0, 4);

        // Sub-chunk 1.
        // Sub-chunk 1 ID.
        stream1.Write(Encoding.ASCII.GetBytes("fmt "), 0, 4);
        // Sub-chunk 1 size.
        stream1.Write(BitConverter.GetBytes(16), 0, 4);

        // Audio format (floating point (3) or PCM (1)). Any other format indicates compression.
        stream1.Write(BitConverter.GetBytes((ushort)(isFloatingPoint? 3 : 1)), 0, 2);

        // Channels.
        stream1.Write(BitConverter.GetBytes(channelCount), 0, 2);

        // Sample rate.
        stream1.Write(BitConverter.GetBytes(sampleRate), 0, 4);

        // Bytes rate.
        stream1.Write(BitConverter.GetBytes(sampleRate* channelCount * (bitDepth / 8)), 0, 4);

        // Block align.
        stream1.Write(BitConverter.GetBytes((ushort) channelCount * (bitDepth / 8)), 0, 2);

        // Bits per sample.
        stream1.Write(BitConverter.GetBytes(bitDepth), 0, 2);
   
        // Sub-chunk 2.
        // Sub-chunk 2 ID.
        stream1.Write(Encoding.ASCII.GetBytes("data"), 0, 4);

        // Sub-chunk 2 size.
        stream1.Write(BitConverter.GetBytes((bitDepth / 8) * totalSampleCount), 0, 4);

        buffer = new byte[convertedSource.WaveFormat.BytesPerSecond];
        int read;
        read=stream1.Read(buffer,0,Convert.ToInt32(stream1.Length));
        inputstream.Write(buffer);
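
Editor's note: `MemoryStream.Read` starts at the stream's current `Position`, so calling it right after a series of writes, without rewinding, returns 0 bytes and leaves `buffer` untouched. The sketch below builds the same 44-byte PCM header with `BinaryWriter` and returns it through `ToArray()`, which does not depend on `Position`. `WavHeader.Build` is a hypothetical helper name; whether the service needs this header on a push stream is not settled in this thread.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration.
public static class WavHeader
{
    // Builds the canonical 44-byte RIFF/WAVE header for uncompressed PCM.
    // BinaryWriter always writes integers in little-endian order, as WAV requires.
    public static byte[] Build(int sampleRate, short bitsPerSample, short channels, int dataBytes)
    {
        short blockAlign = (short)(channels * (bitsPerSample / 8));
        int byteRate = sampleRate * blockAlign;

        using (var ms = new MemoryStream(44))
        using (var w = new BinaryWriter(ms, Encoding.ASCII))
        {
            w.Write(Encoding.ASCII.GetBytes("RIFF")); // chunk ID
            w.Write(36 + dataBytes);                  // chunk size
            w.Write(Encoding.ASCII.GetBytes("WAVE")); // format
            w.Write(Encoding.ASCII.GetBytes("fmt ")); // sub-chunk 1 ID
            w.Write(16);                              // sub-chunk 1 size (PCM)
            w.Write((short)1);                        // audio format: 1 = PCM
            w.Write(channels);                        // 2 bytes
            w.Write(sampleRate);                      // 4 bytes
            w.Write(byteRate);                        // 4 bytes
            w.Write(blockAlign);                      // 2 bytes
            w.Write(bitsPerSample);                   // 2 bytes
            w.Write(Encoding.ASCII.GetBytes("data")); // sub-chunk 2 ID
            w.Write(dataBytes);                       // sub-chunk 2 size
            w.Flush();
            return ms.ToArray();                      // independent of Position
        }
    }
}
```

For the format used in this thread, the call would be `WavHeader.Build(16000, 16, 1, dataBytes)`. Typing each field as `short` or `int` makes the written width explicit instead of relying on the byte count passed to `Stream.Write`.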

@zhouwangzw
Contributor

Can you get the sample for PullAudioInputStream to work? You can also try PushAudioInputStream, like the sample here.

@fmegen

fmegen commented Dec 21, 2018

Hi @DennisLJ ,

read=stream1.Read(buffer,0,Convert.ToInt32(stream1.Length));

The sample size must be 16 bits, i.e., converting to 32 bits will not work.

@wolfma61
Contributor

wolfma61 commented Jan 7, 2019

@DennisLJ - is this solved?

@wolfma61
Contributor

wolfma61 commented Feb 4, 2019

Assuming this is solved.
Please reopen if you still experience the issue.

@wolfma61 wolfma61 closed this as completed Feb 4, 2019
@rupeshtech

Hi @DennisLJ,
I am facing the same issue.
Were you able to solve the problem?
