How to stream audio to the speech API while recording? #98

Open
rena1234 opened this issue Jul 30, 2017 · 0 comments


Hello guys, I'm trying to reduce the latency in my project, so I have been
trying to send chunks of audio while I record it. What I have tried so far is to combine some code that sends a previously recorded audio file in chunks:

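from wit import Wit

def RecognizeSpeech(AUDIO_FILENAME, CHUNK_SIZE):
    client = Wit('MYTOKENHERE')

    def wavIterator():
        # yield the file in CHUNK_SIZE-byte pieces; 'with' closes it when done
        with open(AUDIO_FILENAME, 'rb') as wav:
            chunk = wav.read(CHUNK_SIZE)
            while chunk:
                yield chunk
                chunk = wav.read(CHUNK_SIZE)

    # passing a generator makes the upload use chunked transfer
    resp = client.speech(wavIterator(), None,
                         {'Content-Type': 'audio/wav', 'Transfer-encoding': 'chunked'})
    return resp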

with the code from this tutorial: https://indianpythonista.wordpress.com/2017/04/10/speech-recognition-using-wit-ai/
and ended up with this Frankenstein:

import pyaudio
from wit import Wit

def recReturnWavIterator(RECORD_SECONDS, CHUNK_SIZE, client):

    #--------- SETTING PARAMS FOR OUR AUDIO FILE ------------#
    FORMAT = pyaudio.paInt16    # sample format: 16-bit signed integers
    CHANNELS = 2                # no. of audio channels
    RATE = 44100                # sample rate (frames per second)
    CHUNK = CHUNK_SIZE          # frames read per chunk
    #--------------------------------------------------------#

    # creating PyAudio object
    audio = pyaudio.PyAudio()

    # open a new stream for microphone
    # It creates a PortAudio Stream Wrapper class object
    stream = audio.open(format=FORMAT, channels=CHANNELS,
                        rate=RATE, input=True,
                        frames_per_buffer=CHUNK)
    print("Listening")
    for i in range(int(RATE / CHUNK * RECORD_SECONDS)-1):
        # read audio stream from microphone (raw PCM bytes)
        data = stream.read(CHUNK)
        yield data
    print("Finished recording")
    # release the microphone once recording is done
    stream.stop_stream()
    stream.close()
    audio.terminate()


def RecognizeSpeech(CHUNK_SIZE):
    client = Wit('MYTOKENHERE')
    resp = client.speech(recReturnWavIterator(5, CHUNK_SIZE, client), None,
                         {'Content-Type': 'audio/wav', 'Transfer-encoding': 'chunked'})
    print('Yay, got Wit.ai response: ' + str(resp))

which always returns an empty text:
Yay, got Wit.ai response: {'_text': None, 'entities': {}, 'msg_id': '7ed74ba3-698a-41a9-8158-8dd3857c3808'}

Is it possible to do something like that? How?
PS: Sorry, I am not very experienced with programming.
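
One possible explanation, not confirmed anywhere in this thread: the chunks read from the microphone are raw PCM samples with no WAV header, while the request declares `audio/wav`. Below is a minimal sketch under that assumption, which labels the stream as raw audio instead. The helper names (`raw_content_type`, `pcm_chunks`, `speech_headers`, `record_and_recognize`) are hypothetical. The `audio/raw` parameter names, the mono channel, and the 16000 Hz rate are assumptions that should be checked against the Wit.ai speech documentation.

```python
from typing import Callable, Dict, Iterator


def raw_content_type(rate: int, bits: int = 16) -> str:
    # ASSUMPTION: parameter names follow Wit.ai's 'audio/raw' content type.
    return ("audio/raw;encoding=signed-integer;bits=%d;rate=%d;endian=little"
            % (bits, rate))


def speech_headers(rate: int) -> Dict[str, str]:
    # Same header layout as in the question, with the content type swapped.
    return {"Content-Type": raw_content_type(rate),
            "Transfer-encoding": "chunked"}


def pcm_chunks(read_chunk: Callable[[int], bytes], rate: int,
               frames_per_chunk: int, seconds: float) -> Iterator[bytes]:
    # Yield each chunk as soon as it is read, for roughly `seconds` of audio.
    for _ in range(int(rate / frames_per_chunk * seconds)):
        yield read_chunk(frames_per_chunk)


def record_and_recognize(token: str, seconds: float = 5,
                         rate: int = 16000, frames_per_chunk: int = 1024):
    # Imported here so the helpers above work without a microphone installed.
    import pyaudio
    from wit import Wit

    audio = pyaudio.PyAudio()
    # ASSUMPTION: one channel (mono); the question's code records two.
    stream = audio.open(format=pyaudio.paInt16, channels=1, rate=rate,
                        input=True, frames_per_buffer=frames_per_chunk)
    try:
        chunks = pcm_chunks(stream.read, rate, frames_per_chunk, seconds)
        return Wit(token).speech(chunks, headers=speech_headers(rate))
    finally:
        # Release the microphone even if the request fails.
        stream.stop_stream()
        stream.close()
        audio.terminate()
```

Calling `record_and_recognize('MYTOKENHERE')` would record for about five seconds and upload each chunk as it is captured. If the response is still empty, sending a short saved recording with the same headers may help separate an audio-format problem from a streaming problem.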
