speech_recognize_continuous_from_file : how to print all results together in a text file #345

Closed
Khushu06 opened this issue Aug 11, 2019 · 10 comments

Comments

@Khushu06

Khushu06 commented Aug 11, 2019

# coding: utf-8

# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license. See LICENSE.md file in the project root for full license information.

import time
import wave

try:
    import azure.cognitiveservices.speech as speechsdk
except ImportError:
    print("""
    Importing the Speech SDK for Python failed.
    Refer to
    https://docs.microsoft.com/azure/cognitive-services/speech-service/quickstart-python for
    installation instructions.
    """)
    import sys
    sys.exit(1)

# Set up the subscription info for the Speech Service:
# Replace with your own subscription key and service region (e.g., "westus").
speech_key, service_region = "key", "region"

# Specify the path to an audio file containing speech (mono WAV / PCM with a sampling rate of 16
# kHz).
hindi = "C:/Users/Khushboo.Girotra/Desktop/audionew1.wav"

def speech_recognize_continuous_from_file():
    """performs continuous speech recognition with input from an audio file"""
    # <SpeechContinuousRecognitionWithFile>
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region, speech_recognition_language='hi-IN')
    audio_config = speechsdk.audio.AudioConfig(filename=hindi)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    done = False

    def stop_cb(evt):
        """callback that stops continuous recognition upon receiving an event `evt`"""
        print('CLOSING on {}'.format(evt))
        speech_recognizer.stop_continuous_recognition()
        nonlocal done
        done = True
    
                
    # Connect callbacks to the events fired by the speech recognizer
    speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
    all_results = []
    def handle_final_result(evt):
    all_results.append(evt.result.text)

    speech_recognizer.recognized.connect(handle_final_result)
    speech_recognizer.start_continuous_recognition()
    print(all_results)
    
    speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
    speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
    speech_recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
    speech_recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))
    # stop continuous recognition on either session stopped or canceled events
    speech_recognizer.session_stopped.connect(stop_cb)
    speech_recognizer.canceled.connect(stop_cb)
    
    #speech_recognizer.start_continuous_recognition()

    # Start continuous speech recognition
    speech_recognizer.start_continuous_recognition()
    
    while not done:
        time.sleep(.5)
    # </SpeechContinuousRecognitionWithFile>

The above code is not working.

@Khushu06 Khushu06 changed the title speech_recognize_continuous_from_file : print all results together in a text file speech_recognize_continuous_from_file : how to print all results together in a text file Aug 11, 2019
@Khushu06
Author

Hi, kindly help!

@chlandsi
Contributor

You're trying to print results before any recognition has happened. Move the print(all_results) command after the call to stop_continuous_recognition, and you should see the accumulated transcribed utterances.
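To illustrate why the list is still empty at that point, here is a minimal sketch that does not use the Speech SDK at all. `FakeRecognizer` is a hypothetical stand-in that, like continuous recognition, returns from `start()` immediately and delivers results later from a background thread; the class, its method names, and the simulated latency are all assumptions made for the illustration.

```python
import threading
import time


class FakeRecognizer:
    """Hypothetical stand-in for an event-driven recognizer (not part of the Speech SDK)."""

    def __init__(self, utterances):
        self._utterances = utterances
        self._callbacks = []
        self._thread = None

    def connect(self, callback):
        # Register a callback that receives each recognized text.
        self._callbacks.append(callback)

    def start(self):
        # Returns immediately; results arrive later from a worker thread.
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def _run(self):
        for text in self._utterances:
            time.sleep(0.2)  # simulated recognition latency (assumption)
            for callback in self._callbacks:
                callback(text)

    def wait(self):
        # Block until every utterance has been delivered.
        self._thread.join()


all_results = []
recognizer = FakeRecognizer(["hello", "world"])
recognizer.connect(all_results.append)
recognizer.start()

# Taken right after start(): typically still empty, because no callback has fired yet.
snapshot_right_after_start = list(all_results)

recognizer.wait()

# Taken after recognition has finished: contains every utterance.
snapshot_after_finish = list(all_results)

print(snapshot_right_after_start)
print(snapshot_after_finish)
```

The same ordering applies to the real recognizer: read `all_results` only once the session has stopped, not directly after starting recognition.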

@Khushu06
Author


# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license. See LICENSE.md file in the project root for full license information.

import time
import wave

try:
    import azure.cognitiveservices.speech as speechsdk
except ImportError:
    print("""
    Importing the Speech SDK for Python failed.
    Refer to
    https://docs.microsoft.com/azure/cognitive-services/speech-service/quickstart-python for
    installation instructions.
    """)
    import sys
    sys.exit(1)

# Set up the subscription info for the Speech Service:
# Replace with your own subscription key and service region (e.g., "westus").
speech_key, service_region = "key", "region"

# Specify the path to an audio file containing speech (mono WAV / PCM with a sampling rate of 16
# kHz).
hindi = "C:/Users/Khushboo.Girotra/Desktop/audionew1.wav"

def speech_recognize_continuous_from_file():
    """performs continuous speech recognition with input from an audio file"""
    # <SpeechContinuousRecognitionWithFile>
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region, speech_recognition_language='hi-IN')
    audio_config = speechsdk.audio.AudioConfig(filename=hindi)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    done = False

    def stop_cb(evt):
        """callback that stops continuous recognition upon receiving an event `evt`"""
        print('CLOSING on {}'.format(evt))
        speech_recognizer.stop_continuous_recognition()
        print(all_results)
        nonlocal done
        done = True
    
                
    # Connect callbacks to the events fired by the speech recognizer
    speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
    
    all_results = []
    def handle_final_result(evt):
        all_results.append(evt.result.text)
        speech_recognizer.recognized.connect(handle_final_result)
        speech_recognizer.start_continuous_recognition()
        
    speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
    speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
    speech_recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
    speech_recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))
    # stop continuous recognition on either session stopped or canceled event
    
    speech_recognizer.session_stopped.connect(stop_cb)
    speech_recognizer.canceled.connect(stop_cb)
    
   
    #speech_recognizer.start_continuous_recognition()

    # Start continuous speech recognition
    speech_recognizer.start_continuous_recognition()
    
    while not done:
        time.sleep(.5)
    # </SpeechContinuousRecognitionWithFile>

As you suggested, I moved the print(all_results) command after the call to stop_continuous_recognition, but the output is still the same. I am not getting all the text together. Kindly help.

@chlandsi
Contributor

Please try this snippet:

def speech_recognize_continuous_from_file():
    """performs continuous speech recognition with input from an audio file"""
    # <SpeechContinuousRecognitionWithFile>
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    audio_config = speechsdk.audio.AudioConfig(filename=weatherfilename)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    done = False

    def stop_cb(evt):
        """callback that stops continuous recognition upon receiving an event `evt`"""
        print('CLOSING on {}'.format(evt))
        speech_recognizer.stop_continuous_recognition()
        nonlocal done
        done = True

    all_results = []
    def handle_final_result(evt):
        all_results.append(evt.result.text)

    speech_recognizer.recognized.connect(handle_final_result)
    # Connect callbacks to the events fired by the speech recognizer
    speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
    speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
    speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
    speech_recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
    speech_recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))
    # stop continuous recognition on either session stopped or canceled events
    speech_recognizer.session_stopped.connect(stop_cb)
    speech_recognizer.canceled.connect(stop_cb)

    # Start continuous speech recognition
    speech_recognizer.start_continuous_recognition()
    while not done:
        time.sleep(.5)

    print("Printing all results:")
    print(all_results)

@Khushu06
Author

There are 2 speakers in the conversation. How can I separate the statements of the 2 speakers?

@chlandsi
Contributor

This feature (diarization) is currently not available in the real-time transcription service, but it is available via the batch service. You can find a Python sample here, and #286 describes how to enable diarization.

@Khushu06
Author

One more question: the accuracy of the conversion is very low (the transcribed text is irrelevant).
Is there any way I can improve the accuracy of the text conversion?

@chlandsi
Contributor

If possible, try to improve the audio quality of the input (different audio settings, better microphone equipment, try to reduce environment noise, etc.). On the service side, Custom Speech can be used to train models that are specific to your applications.

@Khushu06
Author

Can I export or download this output?
Right now I am getting the text result in the shell. If I want to save it as a text file, what code should I write?

@chlandsi
Contributor

You can use standard Python methods to save the text to a file, e.g.

# Open in write mode; UTF-8 keeps non-ASCII text (such as Hindi) intact.
with open('output.txt', 'w', encoding='utf-8') as f:
    f.write('\n'.join(all_results))
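
A slightly fuller sketch of the same idea, assuming `all_results` is the list of recognized strings collected by the callback. The file has to be opened in write mode (`'w'`), and the encoding is set explicitly to UTF-8 so that Hindi text is stored correctly regardless of the platform default. The helper name `save_transcript`, the sample data, and the output location are hypothetical, and skipping empty strings is an assumption about segments where nothing was recognized.

```python
import os
import tempfile


def save_transcript(all_results, path):
    """Write each recognized utterance on its own line and return the written text."""
    # Assumption: empty strings stand for segments with no recognized speech, so skip them.
    text = '\n'.join(result for result in all_results if result)
    # Mode 'w' creates the file or overwrites an existing one.
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)
    return text


# Sample data standing in for the list filled by the recognizer callback.
sample_results = ["नमस्ते", "", "आप कैसे हैं"]
out_path = os.path.join(tempfile.gettempdir(), "output.txt")
saved = save_transcript(sample_results, out_path)
print("Saved transcript to", out_path)
```

Call the helper after the `while not done` loop has finished, so that the list already holds every recognized utterance.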

Please note that this forum is primarily for SDK-related issues, not for general programming questions (try, for example, Stack Overflow for these), so I'm closing this issue.
