<div style="font-size: 28px; color: #e91e63; font-weight: bold; background-color: #fff; padding: 10px; border-radius: 8px; border: 2px solid #e91e63; text-align: center;">
    Lab Exercise 3: Speech/Audio to Text Conversion
</div>


<div style="font-size: 24px; color: #e91e63; font-weight: bold; background-color: #fff; padding: 8px; border-left: 5px solid #e91e63; margin-bottom: 10px;">
    Importing Necessary Libraries
</div>


In [3]:
import speech_recognition as sr

<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    Main Function: Speech to Text Conversion
</div>

<p style="font-size: 18px; color: #333; margin-bottom: 15px;">
    We'll create a <code>Recognizer</code> object from the <code>speech_recognition</code> library to capture the speech and convert it to text.
</p>

<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    System Feedback
</div>

<p style="font-size: 18px; color: #333; margin-bottom: 15px;">
    The system will give users feedback at the following stages:
</p>

<ul style="background-color: #ffe6f0; border: 1px solid #f8bbd0; border-radius: 8px; padding: 15px; list-style-type: none; color: #333; font-size: 16px; margin-bottom: 15px;">
    <li style="margin-bottom: 10px;"><strong style="color: #e91e63;">Speak something</strong>: Prompting the user to speak.</li>
    <li style="margin-bottom: 10px;"><strong style="color: #e91e63;">Recognizing</strong>: Informing the user that their speech is being processed.</li>
    <li style="margin-bottom: 10px;"><strong style="color: #e91e63;">Speech recognized</strong>: If recognition succeeds, the system displays the converted text.</li>
    <li style="margin-bottom: 10px;"><strong style="color: #e91e63;">Error handling</strong>: For unclear speech or connection issues, the system will notify the user.</li>
</ul>


In [4]:
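def speech_to_text():
    """Capture one phrase from the microphone and print its transcription."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        try:
            # Calibrate against background noise before prompting, so the
            # user's speech is not consumed by the calibration window.
            recognizer.adjust_for_ambient_noise(source)
            print("Speak something:")
            # Block until a phrase is detected and followed by silence.
            audio = recognizer.listen(source)
            print("Recognizing...")
            # Send the captured audio to Google's Web Speech API.
            text = recognizer.recognize_google(audio)
            print(f'Speech recognized: "{text}"')
            print('Speech successfully converted to text!')

        except sr.UnknownValueError:
            # The service returned no usable transcription.
            print("Speech Recognition could not understand audio, Please try speaking more clearly")

        except sr.RequestError:
            # The API could not be reached or the request failed.
            print("Could not request results from the speech recognition service. Check your internet connection.")

        except Exception as e:
            print(f"An error occurred: {e}")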

<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    Explanation
</div>

<ul style="background-color: #ffe6f0; border: 1px solid #f8bbd0; border-radius: 8px; padding: 15px; list-style-type: none; color: #333; font-size: 16px; margin-bottom: 15px;">
    <li style="margin-bottom: 10px;"><strong style="color: #e91e63;">Initialization</strong>: We initialize the <code>Recognizer</code> object from the <code>speech_recognition</code> library.</li>
    <li style="margin-bottom: 10px;"><strong style="color: #e91e63;">Microphone Input</strong>: We use the microphone as the source of input and adjust for ambient noise.</li>
    <li style="margin-bottom: 10px;"><strong style="color: #e91e63;">Speech Recognition</strong>: We call <code>recognize_google()</code>, which sends the captured audio to Google's Web Speech API and returns the transcribed text.</li>
    <li style="margin-bottom: 10px;"><strong style="color: #e91e63;">Error Handling</strong>: 
        <ul style="margin-top: 5px;">
            <li style="margin-bottom: 5px;"><strong style="color: #e91e63;">UnknownValueError</strong>: Raised when the speech cannot be recognized (e.g., mumbling or unclear speech).</li>
            <li style="margin-bottom: 5px;"><strong style="color: #e91e63;">RequestError</strong>: Raised when the request to the Google API fails (e.g., no internet connection).</li>
        </ul>
    </li>
    <li style="margin-bottom: 10px;"><strong style="color: #e91e63;">Feedback</strong>: We provide clear messages at each step (e.g., "Speak something", "Recognizing", etc.).</li>
</ul>
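
<p style="font-size: 18px; color: #333; margin-bottom: 15px;">
    The error handling described above reports a failure once and then returns. As one possible extension, the sketch below retries when speech is not understood. The names <code>retry_until_recognized</code>, <code>attempt</code>, <code>max_attempts</code>, and <code>retry_on</code> are hypothetical and not part of the notebook or the library; a stand-in function replaces the microphone so the logic can be run anywhere.
</p>

```python
def retry_until_recognized(attempt, max_attempts=3, retry_on=(ValueError,)):
    """Call attempt() until it returns text; give up after max_attempts."""
    for n in range(1, max_attempts + 1):
        try:
            return attempt()
        except retry_on:
            # Only the listed exception types trigger a retry; others propagate.
            print(f"Attempt {n} of {max_attempts}: speech not understood.")
    return None


# Stand-in for a listen-and-recognize step: fails twice, then succeeds.
_outcomes = iter([ValueError(), ValueError(), "turn on the lights"])


def fake_attempt():
    outcome = next(_outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


print(retry_until_recognized(fake_attempt))
```

<p style="font-size: 18px; color: #333; margin-bottom: 15px;">
    With the real library, one would presumably pass <code>retry_on=(sr.UnknownValueError,)</code> and an <code>attempt</code> function that calls <code>recognizer.listen()</code> followed by <code>recognizer.recognize_google()</code>. Connection failures are deliberately not retried here, since repeating the request is unlikely to help while the network is down.
</p>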


<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    Sample Run and Inference
</div>

<ul style="background-color: #ffe6f0; border: 1px solid #f8bbd0; border-radius: 8px; padding: 15px; list-style-type: none; color: #333; font-size: 16px; margin-bottom: 15px;">
    <li style="margin-bottom: 10px;">The system displays: <strong style="color: #e91e63;">"Speak something:"</strong>.</li>
    <li style="margin-bottom: 10px;">The user speaks into the microphone.</li>
    <li style="margin-bottom: 10px;">The system displays: <strong style="color: #e91e63;">"Recognizing..."</strong>.</li>
    <li style="margin-bottom: 10px;">If recognized successfully, the system displays:
        <strong style="color: #e91e63;">Speech recognized: "turn on the lights in the living room"</strong><br>
        <strong style="color: #e91e63;">Speech successfully converted to text!</strong>
    </li>
    <li style="margin-bottom: 10px;">If unclear speech is detected, it will display:  
        <strong style="color: #e91e63;">Speech Recognition could not understand audio, Please try speaking more clearly</strong>
    </li>
    <li style="margin-bottom: 10px;">If there is a connectivity issue:  
        <strong style="color: #e91e63;">Could not request results from the speech recognition service. Check your internet connection.</strong>
    </li>
</ul>


<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    Example Inference
</div>

<p style="background-color: #ffe6f0; border: 1px solid #f8bbd0; border-radius: 8px; padding: 15px; color: #333; font-size: 16px; margin-bottom: 15px;">
    <em style="color: #e91e63;">In this example, I spoke clearly into the microphone, so the function recognized the speech correctly and converted it to text.</em>
</p>


In [8]:
# Running the speech-to-text function
if __name__ == "__main__":
    speech_to_text()

Speak something:
Recognizing...
Speech recognized: "turn on the lights in the living room"
Speech successfully converted to text!


<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    Example Inference - Error Case
</div>

<p style="background-color: #ffe6f0; border: 1px solid #f8bbd0; border-radius: 8px; padding: 15px; color: #333; font-size: 16px; margin-bottom: 15px;">
    <em style="color: #e91e63;">In this example, I turned off the internet connection before calling the function. Without a connection, the request to Google's speech recognition service failed, so the function caught the <code>RequestError</code> and printed the connectivity message.</em>
</p>


In [9]:
speech_to_text()

Speak something:
Recognizing...
Could not request results from the speech recognition service. Check your internet connection.
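
<p style="font-size: 18px; color: #333; margin-bottom: 15px;">
    In the run above, the connectivity problem only surfaces after the user has already spoken. A minimal sketch of a pre-check follows, assuming that opening a TCP connection to <code>www.google.com</code> on port 443 is an adequate proxy for reaching the recognition service (the actual endpoint may differ). The function name <code>is_online</code> and its <code>connect</code> parameter are hypothetical; <code>connect</code> exists only so the check can be exercised without a network.
</p>

```python
import socket


def is_online(host="www.google.com", port=443, timeout=3.0,
              connect=socket.create_connection):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # The connection is closed as soon as the with-block exits.
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    if is_online():
        print("Network looks reachable.")
    else:
        print("No connection detected. Check your internet connection.")
```

<p style="font-size: 18px; color: #333; margin-bottom: 15px;">
    A successful check does not guarantee that the recognition request itself will succeed, so the <code>RequestError</code> handler in <code>speech_to_text()</code> is still needed.
</p>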


<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    Example Inference - Unclear Voice Case
</div>

<p style="background-color: #ffe6f0; border: 1px solid #f8bbd0; border-radius: 8px; padding: 15px; color: #333; font-size: 16px; margin-bottom: 15px;">
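    <em style="color: #e91e63;">In this example, I used an unclear voice: a sample recording of a patient with a germ cell tumor. Such patients sometimes develop paraneoplastic encephalitis, and the recording demonstrates dysarthria in a patient with autoimmune KLHL11 encephalitis (see References). The recognizer could not transcribe the audio, so the function reported that the speech was not understood.</em>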
</p>


In [10]:
speech_to_text()

Speak something:
Recognizing...
Speech Recognition could not understand audio, Please try speaking more clearly
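
<p style="font-size: 18px; color: #333; margin-bottom: 15px;">
    When speech is unclear, the function reports a failure and no candidate text is available. <code>recognize_google()</code> also accepts <code>show_all=True</code>, which returns the raw response instead of a single string. The sketch below picks the most confident candidate from such a response. The response shape shown in the comments is an assumption, the sample data is made up for illustration, and <code>best_alternative</code> is a hypothetical helper.
</p>

```python
def best_alternative(raw):
    """Return (transcript, confidence) from a show_all-style response, or None."""
    # Assumed shape: {"alternative": [{"transcript": str, "confidence": float}, ...]}
    # Anything that is not a dict (for example an empty list) means "nothing recognized".
    if not isinstance(raw, dict):
        return None
    alternatives = raw.get("alternative") or []
    if not alternatives:
        return None
    # Confidence may be absent on some entries; treat a missing value as 0.0.
    best = max(alternatives, key=lambda alt: alt.get("confidence", 0.0))
    return best.get("transcript", ""), best.get("confidence")


# Illustrative data only, not captured from a real run.
sample = {
    "alternative": [
        {"transcript": "turn on the lights", "confidence": 0.92},
        {"transcript": "turn on the light"},
    ]
}
print(best_alternative(sample))
print(best_alternative([]))
```

<p style="font-size: 18px; color: #333; margin-bottom: 15px;">
    Returning the confidence alongside the transcript would let a caller decide whether to accept a low-confidence result or ask the user to repeat the phrase.
</p>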


<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    Inference
</div>

<div style="background-color: #ffe6f0; border: 1px solid #f8bbd0; border-radius: 8px; padding: 15px; color: #333; font-size: 16px; margin-bottom: 15px;">
    <ul style="margin-left: 20px;">
        <li><strong>Execution</strong>: The system correctly records and processes the user's voice. It responds with feedback at each stage and handles exceptions gracefully.</li>
        <li><strong>Concept clarity</strong>: The core concept is the use of speech recognition for accessibility. The system demonstrates how voice input can be converted to text in near real time.</li>
        <li><strong>Self-learning</strong>: Built with error handling and feedback mechanisms, the system provides a good base for developing more complex voice-control features.</li>
    </ul>
    <p>This simple implementation can be extended in future versions by integrating it into smart devices or applications.</p>
</div>

<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    Conclusion
</div>

<p style="background-color: #ffe6f0; border: 1px solid #f8bbd0; border-radius: 8px; padding: 15px; color: #333; font-size: 16px; margin-bottom: 15px; font-weight: bold;">
    This basic speech-to-text system is a functional prototype that can serve as a foundation for more advanced voice-controlled accessibility features. Its performance depends on clear speech and a working internet connection, and it handles both kinds of failure gracefully, which keeps the experience user-friendly.
</p>

<hr style="border: 0; height: 2px; background-color: #e91e63; margin: 20px 0;">

<div style="font-size: 24px; color: #e91e63; font-weight: bold; margin-bottom: 10px;">
    References
</div>
<div style="background-color: #ffe6f0; border: 1px solid #f8bbd0; border-radius: 8px; padding: 15px; color: #333; font-size: 16px;">
    <ul>
        <li>Python Speech Recognition Documentation: <a href="https://pypi.org/project/SpeechRecognition/" target="_blank">https://pypi.org/project/SpeechRecognition/</a></li>
        <li>Google Cloud Speech-to-Text API: <a href="https://cloud.google.com/speech-to-text/docs" target="_blank">https://cloud.google.com/speech-to-text/docs</a></li>
        <li>Speech Recognition GitHub Repository: <a href="https://github.com/Uberi/speech_recognition" target="_blank">https://github.com/Uberi/speech_recognition</a></li>
        <li>Unclear Speech Sound: <a href="https://youtu.be/2Pw2mc02iDg?si=TzvwkWaNMj5I5Nh0" target="_blank">https://youtu.be/2Pw2mc02iDg?si=TzvwkWaNMj5I5Nh0</a></li>
    </ul>
</div>
