# **Lab Exercise 3: Speech/Audio to Text Conversion**

**Importing Necessary Libraries**

In [3]:
import speech_recognition as sr

#### **Main Function: Speech to Text Conversion**

We'll use a `Recognizer` object from the `speech_recognition` library to capture audio from the microphone and convert it to text.

### System Feedback

The system will give users feedback at the following stages:
- **Speak something**: Prompting the user to speak.
- **Recognizing**: Informing the user that their speech is being processed.
- **Speech recognized**: If the text is successfully recognized, it will display the converted text.
- **Error handling**: For unclear speech or connection issues, the system will notify the user.

In [4]:
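def speech_to_text():
    """Capture one phrase from the default microphone and print its transcription."""
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        try:
            # Calibrate the energy threshold on background noise before
            # prompting, so the user's first words are not used for calibration.
            recognizer.adjust_for_ambient_noise(source)
            print("Speak something:")
            audio = recognizer.listen(source)  # returns once a pause ends the phrase
            print("Recognizing...")
            # Send the audio to Google's Web Speech API (requires internet access).
            text = recognizer.recognize_google(audio)
            print(f'Speech recognized: "{text}"')
            print('Speech successfully converted to text!')

        except sr.UnknownValueError:
            # The service could not produce a transcription for this audio.
            print("Speech Recognition could not understand audio, Please try speaking more clearly")

        except sr.RequestError:
            # The service could not be reached or the request failed.
            print("Could not request results from the speech recognition service. Check your internet connection.")

        except Exception as e:
            print(f"An error occurred: {e}")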


#### Explanation
1. **Initialization**: We initialize the `Recognizer` object from the `speech_recognition` library.
2. **Microphone Input**: We use the microphone as the source of input and adjust for ambient noise.
3. **Speech Recognition**: We call `recognize_google()` to convert the captured audio to text.
4. **Error Handling**: 
   - `UnknownValueError`: When the system cannot recognize the speech (e.g., mumbling or unclear speech).
   - `RequestError`: If there is an issue connecting to the Google API (e.g., no internet).
5. **Feedback**: We provide clear messages at each step (e.g., "Speak something", "Recognizing", etc.).
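The steps above wait indefinitely if nobody speaks. A minimal sketch of a bounded variant follows, assuming the `timeout` and `phrase_time_limit` arguments of `Recognizer.listen()` and the `WaitTimeoutError` exception from the same library. The function name `listen_once` and the status strings are hypothetical, chosen only for illustration. It returns a status instead of printing, so the caller decides how to give feedback.

```python
def listen_once(timeout=5, phrase_time_limit=10):
    """Return (status, text). status is one of:
    'ok', 'timeout', 'unclear', 'service_error' (illustrative labels)."""
    # Imported inside the function so this sketch can be loaded
    # even on a machine where the library is not installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    try:
        with sr.Microphone() as source:
            recognizer.adjust_for_ambient_noise(source, duration=1)
            # Wait at most `timeout` seconds for speech to start and
            # record at most `phrase_time_limit` seconds of it.
            audio = recognizer.listen(
                source, timeout=timeout, phrase_time_limit=phrase_time_limit
            )
        text = recognizer.recognize_google(audio)
        return "ok", text
    except sr.WaitTimeoutError:
        return "timeout", None        # nobody started speaking in time
    except sr.UnknownValueError:
        return "unclear", None        # audio captured but not understood
    except sr.RequestError:
        return "service_error", None  # service unreachable or request failed
```

Calling `status, text = listen_once()` and branching on `status` keeps the recognition logic separate from the messages shown to the user.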

### Sample Run


1. The system displays: **"Speak something:"**
2. The user speaks into the microphone.
3. The system displays: **"Recognizing..."**
4. If the speech is recognized, the system displays:  
   **Speech recognized: "turn on the lights in the living room"**  
   **Speech successfully converted to text!**
   
   If unclear speech is detected, it will display:  
   **Speech Recognition could not understand audio, Please try speaking more clearly**  
   
   If there is a connectivity issue:  
   **Could not request results from the speech recognition service. Check your internet connection.**
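Because an unclear utterance only asks the user to speak more clearly, a natural follow-up is to let them try again. The sketch below shows one way to do that, assuming a callable `attempt` that returns the recognized text or `None` on failure. Both `attempt` and `recognize_with_retries` are hypothetical names, and `speech_to_text()` as written prints its result, so it would need to return the text to be used this way.

```python
from typing import Callable, Optional


def recognize_with_retries(
    attempt: Callable[[], Optional[str]], max_attempts: int = 3
) -> Optional[str]:
    """Call `attempt` until it returns text or the attempts run out."""
    for n in range(1, max_attempts + 1):
        text = attempt()
        if text is not None:
            return text
        if n < max_attempts:
            # Tell the user another attempt is coming.
            print(f"Attempt {n} of {max_attempts} failed. Please try again.")
    return None  # every attempt failed
```

Capping the number of attempts avoids an endless loop when the microphone or the connection is the real problem.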

*In this example, I spoke clearly into the microphone, so the function recognized the speech correctly and converted it to text.*

In [8]:
# Running the speech-to-text function
if __name__ == "__main__":
    speech_to_text()

Speak something:
Recognizing...
Speech recognized: "turn on the lights in the living room"
Speech successfully converted to text!


*In this example, I turned off the internet connection before calling the function. Without a connection, the request to Google's speech recognition service failed and the function printed the connectivity error message.*

In [9]:
speech_to_text()

Speak something:
Recognizing...
Could not request results from the speech recognition service. Check your internet connection.


*In this example, I used unclear speech: a sample audio recording of a patient with a germ cell tumor. Such patients sometimes develop paraneoplastic encephalitis, and the recording demonstrates dysarthria in a patient with autoimmune KLHL11 encephalitis.*

In [10]:
speech_to_text()

Speak something:
Recognizing...
Speech Recognition could not understand audio, Please try speaking more clearly
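A test with a recorded sample is easier to repeat if the recording is transcribed directly from a file rather than through the microphone. The sketch below shows one possible approach, assuming the sample is saved in a format that `sr.AudioFile` can read (WAV, AIFF or FLAC). The function name `transcribe_file` and any path passed to it are hypothetical, and this is not necessarily how the run above was produced.

```python
def transcribe_file(path):
    """Return the transcription of an audio file, or None if it is unclear."""
    # Imported inside the function so this sketch can be loaded
    # even on a machine where the library is not installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        return None  # the service could not understand the recording
```

A connection problem still raises `sr.RequestError` here, so the caller can tell "unclear speech" apart from "service unavailable".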


### **Inference**

- **Execution**: The system records and processes the user's voice, gives feedback at each stage, and handles exceptions gracefully.
- **Concept clarity**: The main concept is the use of speech recognition for accessibility. The system demonstrates how voice input can be converted to text shortly after the user finishes speaking.
- **Self-learning**: With its error handling and feedback mechanisms, the system provides a good base for developing more complex voice-control features.

This simple implementation could be extended in future versions by integrating it into smart devices or applications.
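As a rough illustration of what such an integration might look like, the sketch below turns a recognized sentence into a structured command. The command grammar ("turn on/off the <device> in the <room>") and the names `COMMAND_PATTERN` and `parse_command` are assumptions made for this example only, not part of the lab's code.

```python
import re
from typing import Optional, Tuple

# Hypothetical grammar: "turn on|off the <device> in the <room>"
COMMAND_PATTERN = re.compile(
    r"^turn (?P<state>on|off) the (?P<device>[a-z ]+?) in the (?P<room>[a-z ]+)$"
)


def parse_command(text: str) -> Optional[Tuple[str, str, str]]:
    """Return (state, device, room), or None if the text is not a command."""
    # Normalize: lowercase, drop punctuation, collapse repeated spaces.
    normalized = re.sub(r"[^a-z ]", "", text.lower())
    normalized = re.sub(r"\s+", " ", normalized).strip()
    match = COMMAND_PATTERN.match(normalized)
    if match is None:
        return None
    return match.group("state"), match.group("device"), match.group("room")


print(parse_command("turn on the lights in the living room"))
# → ('on', 'lights', 'living room')
```

Normalizing first means the parser accepts both the lowercase, unpunctuated text seen in the sample run and a conventionally written sentence.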

### **Conclusion**

This basic speech-to-text system is a functional prototype that can serve as a foundation for more advanced voice-controlled accessibility features. Its accuracy depends on the clarity of the speech and on a working internet connection, but it reports both failure cases with clear messages, which keeps the experience user-friendly.