# **Speech-to-Text using Google Web Speech API**

## **Introduction**
This project demonstrates a **Speech-to-Text Application** that captures speech from a microphone and transcribes it to text. Transcription is performed by the **Google Web Speech API**, accessed through the `SpeechRecognition` library, so an internet connection is required. The program calibrates for ambient noise before listening and reports recognition, network, and timeout errors with clear messages.

---

## **Project Components**

### **1. Key Features**
- **Ambient Noise Adjustment**: Samples background noise for 2 seconds before listening and calibrates the recognizer to it.
- **Dynamic Energy Threshold**: Keeps adjusting the speech-detection threshold while listening, so changes in the audio environment are tracked.
- **Customizable Language Support**: Configured to recognize French (`fr-FR`); other languages can be selected by changing the `language` argument.
- **Timeout Settings**: Waits up to 10 seconds for speech to begin, caps each phrase at 15 seconds, and treats 3 seconds of silence as the end of a phrase.

---

### **2. Libraries Used**
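1. **SpeechRecognition**: Captures audio from a microphone and sends it to a speech recognition API for transcription.
2. **PyAudio**: Required by SpeechRecognition's `sr.Microphone` class for microphone access; the code does not import it directly.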
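
Before running the program, it can help to confirm that the required modules are importable. The following is a minimal sketch, not part of the original project: `missing_modules` is a hypothetical helper, and it assumes the usual import names `speech_recognition` (for SpeechRecognition) and `pyaudio` (for PyAudio, which `sr.Microphone` relies on).

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# speech_recognition is the import name of the SpeechRecognition package;
# pyaudio is the import name of PyAudio, needed for microphone input.
missing = missing_modules(["speech_recognition", "pyaudio"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```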

---

### **3. The Code**

```python
import speech_recognition as sr


def speech_to_text():
    """Convert speech to text using a microphone."""
    # Initialize recognizer
    recognizer = sr.Recognizer()

    # Use microphone as input
    with sr.Microphone() as source:
        print("Adjusting for ambient noise, please wait...")
        recognizer.adjust_for_ambient_noise(source, duration=2)
        print("Listening... Speak now.")

        # Adjust how long to wait for silence before considering speech done
        recognizer.pause_threshold = 3  # Seconds of silence before stopping
        recognizer.dynamic_energy_threshold = True  # Dynamically adjust to noise

        try:
            # Wait up to 10 s for speech to start; record at most 15 s per phrase
            audio = recognizer.listen(source, timeout=10, phrase_time_limit=15)
            print("Processing the audio...")

            # Use Google Web Speech API for recognition.
            # Omit `language` to fall back to the library default (en-US).
            text = recognizer.recognize_google(audio, language='fr-FR')
            print(f"Recognized Text: {text}")
            return text

        except sr.UnknownValueError:
            print("Sorry, could not understand the audio.")
        except sr.RequestError as e:
            print(f"Request error from Google Speech Recognition service; {e}")
        except sr.WaitTimeoutError:
            print("No speech detected within the time limit.")


# Run the speech-to-text function
if __name__ == "__main__":
    print("Starting the Speech-to-Text program.")
    recognized_text = speech_to_text()
    if recognized_text:
        print(f"You said: {recognized_text}")
```

---

## **System Workflow**

### **1. Speech Recognition Setup**
The program creates a `speech_recognition.Recognizer` instance (the module is imported as `sr`), which handles audio capture and processing.

```python
recognizer = sr.Recognizer()
```

---

### **2. Microphone Input Handling**
The microphone is used as the audio source. Before listening, the program samples 2 seconds of ambient noise and sets the recognizer's sensitivity (its energy threshold) accordingly.

```python
with sr.Microphone() as source:
    print("Adjusting for ambient noise, please wait...")
    recognizer.adjust_for_ambient_noise(source, duration=2)
    print("Listening... Speak now.")
```
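
To make the idea of noise calibration concrete, here is a simplified sketch of how an energy threshold can be tracked with exponential smoothing. It is an illustration only: the function names, the damping factor `0.15`, the ratio `1.5`, and the starting threshold `300.0` are assumptions chosen for the example, and the code is not taken from the library's internals.

```python
import math


def rms(samples):
    """Root-mean-square energy of a sequence of numeric audio samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def update_threshold(threshold, energy, seconds_per_buffer,
                     damping=0.15, ratio=1.5):
    """Move the threshold toward ratio * energy by exponential smoothing."""
    # A weight close to 1 means the old threshold dominates (slow adaptation).
    weight = damping ** seconds_per_buffer
    target = energy * ratio
    return threshold * weight + target * (1 - weight)


def calibrate(buffers, seconds_per_buffer, initial_threshold=300.0):
    """Feed consecutive audio buffers and return the adapted threshold."""
    threshold = initial_threshold
    for buffer in buffers:
        threshold = update_threshold(threshold, rms(buffer), seconds_per_buffer)
    return threshold
```

With steady background noise, the threshold settles just above the noise energy, so only louder input is treated as speech.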

---

### **3. Speech Processing**
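The program listens for audio input and transcribes it using the Google Web Speech API. Four settings control when listening starts and stops:

- **Timeout** (`timeout=10`): Maximum time to wait for speech to begin. If nothing is heard within 10 seconds, `listen` raises `sr.WaitTimeoutError`.
- **Phrase Time Limit** (`phrase_time_limit=15`): Maximum length of a recorded phrase. Recording is cut off after 15 seconds even if the speaker is still talking.
- **Pause Threshold** (`pause_threshold = 3`): Seconds of silence after which the phrase is considered complete.
- **Dynamic Energy Threshold** (`dynamic_energy_threshold = True`): Keeps adjusting sensitivity to varying noise levels while listening.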

```python
audio = recognizer.listen(source, timeout=10, phrase_time_limit=15)
text = recognizer.recognize_google(audio, language='fr-FR')
print(f"Recognized Text: {text}")
```
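
The listening settings can be grouped in one place so they are easy to tune. The sketch below is one possible arrangement, not the project's code: `ListenSettings` is a hypothetical class, and it only assumes that the recognizer exposes the `pause_threshold` and `dynamic_energy_threshold` attributes and that `listen` accepts `timeout` and `phrase_time_limit`, as in the code above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListenSettings:
    """Hypothetical container for the listening parameters used above."""
    timeout: float = 10              # max seconds to wait for speech to start
    phrase_time_limit: float = 15    # max seconds recorded for one phrase
    pause_threshold: float = 3       # seconds of silence that end a phrase
    dynamic_energy_threshold: bool = True

    def apply(self, recognizer):
        """Copy the recognizer-level settings onto a Recognizer-like object."""
        recognizer.pause_threshold = self.pause_threshold
        recognizer.dynamic_energy_threshold = self.dynamic_energy_threshold
        return recognizer

    def listen_kwargs(self):
        """Keyword arguments intended for recognizer.listen()."""
        return {"timeout": self.timeout,
                "phrase_time_limit": self.phrase_time_limit}
```

A call such as `settings.apply(recognizer).listen(source, **settings.listen_kwargs())` would then reproduce the behaviour of the snippet above.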

---

### **4. Exception Handling**
The program handles the following common errors:
- **No Speech Detected** (`sr.WaitTimeoutError`): Alerts the user when no speech starts within the timeout.
- **Recognition Failure** (`sr.UnknownValueError`): Handles cases where the speech is unclear or unintelligible.
- **API Errors** (`sr.RequestError`): Reports the underlying error if the speech recognition service is unreachable or rejects the request.

```python
except sr.UnknownValueError:
    print("Sorry, could not understand the audio.")
except sr.RequestError as e:
    print(f"Request error from Google Speech Recognition service; {e}")
except sr.WaitTimeoutError:
    print("No speech detected within the time limit.")
```
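
Beyond printing a message, a caller may want to give the user another chance after a recoverable error. The following is a minimal sketch under stated assumptions: `transcribe_with_retries` is a hypothetical helper, `attempt` is any zero-argument function that returns the recognized text and raises on failure, and `retry_on` is the tuple of exception classes that should trigger another try.

```python
def transcribe_with_retries(attempt, retry_on, max_attempts=3):
    """Call attempt() until it returns text or max_attempts is exhausted.

    attempt  -- zero-argument callable that returns the recognized text
    retry_on -- tuple of exception classes that should trigger another try
    """
    for number in range(1, max_attempts + 1):
        try:
            return attempt()
        except retry_on as error:
            print(f"Attempt {number}/{max_attempts} failed: {type(error).__name__}")
    # Every attempt raised one of the retryable errors.
    return None
```

With this project's exceptions, one reasonable choice is `retry_on=(sr.UnknownValueError, sr.WaitTimeoutError)`. Note that `attempt` would need to let those exceptions propagate rather than catch them itself, unlike `speech_to_text()` as written.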

---

## **Key Features**

| Feature                   | Description                                                                 |
|---------------------------|-----------------------------------------------------------------------------|
| **Noise Adjustment**      | Adapts to ambient noise for clearer transcription.                         |
| **Language Customization**| Configured for French (`fr-FR`) but can be adjusted for other languages.   |
| **Error Handling**        | Manages common errors gracefully to ensure smooth user experience.         |
| **Speech Timeout**        | Gives up after 10 seconds without speech and caps each phrase at 15 seconds. |
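
Switching languages comes down to passing a different tag as the `language` argument of `recognize_google`. The sketch below shows one way to do that; the `LANGUAGE_TAGS` table and both helper functions are hypothetical, and the tags other than `fr-FR` are assumed to be accepted by the service.

```python
# Hypothetical lookup table from friendly names to language tags.
LANGUAGE_TAGS = {
    "french": "fr-FR",
    "english": "en-US",
    "spanish": "es-ES",
    "german": "de-DE",
}


def resolve_language(name, default="fr-FR"):
    """Map a friendly language name to a tag, falling back to the default."""
    return LANGUAGE_TAGS.get(name.strip().lower(), default)


def transcribe(recognizer, audio, language_name):
    """Transcribe audio with the recognizer in the requested language."""
    tag = resolve_language(language_name)
    return recognizer.recognize_google(audio, language=tag)
```

Unknown names fall back to French here to match the project's current configuration; raising an error instead would be an equally valid design.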

---

## **Applications**
- **Multilingual Speech Transcription**: Convert speech into text in various languages.
- **Voice Command Interfaces**: Integrate into virtual assistants for hands-free operation.
- **Educational Tools**: Assist language learners by transcribing spoken phrases.

---

## **Future Enhancements**
- **Multilingual Support**: Add automatic language detection.
- **Extended API Options**: Integrate other speech recognition APIs like Azure or IBM Watson.
- **Real-Time Feedback**: Display transcribed text in real time as the user speaks.
- **Offline Mode**: Incorporate offline speech recognition for enhanced privacy and independence.
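
As a rough illustration of how additional or offline recognizers could be combined with the current one, the sketch below tries a list of backends in order. It is not an implementation of these enhancements: `recognize_with_fallback` is a hypothetical helper, each backend is simply a callable that takes the audio and returns text, and `fallback_on` is the tuple of exception classes that should cause the next backend to be tried.

```python
def recognize_with_fallback(audio, backends, fallback_on):
    """Try each (name, recognize) pair in order.

    Returns (name, text) from the first backend that succeeds,
    or (None, None) if every backend raised one of fallback_on.
    """
    for name, recognize in backends:
        try:
            return name, recognize(audio)
        except fallback_on as error:
            print(f"{name} backend failed: {error}")
    return None, None
```

With SpeechRecognition, the first backend could wrap `recognizer.recognize_google` and a second could wrap `recognizer.recognize_sphinx`, with `fallback_on=(sr.RequestError,)`. The Sphinx recognizer needs PocketSphinx installed, and recognizing French with it would additionally require a French language model.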

---

