# Introduction to Speech Recognition

## 1. Key Concepts

* **What is Speech Recognition?**
    - Speech recognition refers to the process of converting spoken words into text.
    - It involves recognizing audio signals, processing them, and translating them into meaningful text.
    - **Real-life applications**:
      - **Virtual Assistants**: Siri, Alexa, Google Assistant.
      - **Dictation Software**: Tools for converting speech into text for documents, emails, etc.
      - **Voice-controlled Systems**: Smart homes, in-car systems.
      - **Accessibility**: Assisting people with disabilities by converting spoken commands to digital actions.

* **Challenges in Speech Recognition**:
    - **Accents and Dialects**: Varying ways of pronouncing words can confuse systems.
    - **Background Noise**: Competing sounds (music, people talking) can degrade the accuracy.
    - **Multiple Speakers**: Difficulty distinguishing between voices when more than one person speaks.
    - **Language Variations**: Handling different languages or multilingual speakers.
    
* **Popular Speech Recognition Systems**:
    - **Google Speech API**: One of the most widely used; supports many languages. This is the service used in this tutorial.
    - **Amazon Transcribe**: Offers scalable speech recognition for businesses.
    - **IBM Watson Speech to Text**: Known for accuracy and advanced language models.
    - **Microsoft Azure Speech**: Provides real-time transcription with support for multiple languages.

## 2. Speech Recognition in Python 

* **SpeechRecognition Library**:
    - The **SpeechRecognition** library is a simple Python package that allows speech-to-text conversion.
    - It supports several speech engines and APIs like Google Web Speech API, Microsoft Bing Voice Recognition, and others.
    - **Why use it**: It abstracts much of the complexity of handling audio input and communicating with APIs, making it easier to integrate speech recognition into Python applications.

* **Streamlit Library**:
    - **Streamlit** is a Python framework that helps you create web apps quickly.
    - It’s particularly useful for building interactive applications like the **Speech Recognition** app because it can display buttons, text, and real-time output with very little code.
    - **Why use Streamlit**: It simplifies the front-end work, allowing developers to focus on the functionality (like speech transcription), and is great for rapid prototyping.


## 3. Setting Up Speech Recognition in Python

1. **Install Required Libraries**:
    - **SpeechRecognition**: For converting spoken words into text.
      ```bash
      pip install SpeechRecognition
      ```
    - **Streamlit**: For building a simple web interface.
      ```bash
      pip install streamlit
      ```

2. **Install PyAudio (if using microphone input)**:
    - PyAudio is required for capturing real-time audio input from the microphone.
    - It can be tricky to install on some platforms, especially Windows. Install with:
      ```bash
      pip install pyaudio
      ```
    - **Note**: PyAudio is not required if you're transcribing pre-recorded audio files.
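
Before moving on, it can help to confirm that the packages above are importable from the current environment. The snippet below is a minimal sketch that uses only the standard library; the helper name `check_dependencies` is an illustrative choice, not part of any of these packages. Note that the import names differ from the pip names: `SpeechRecognition` is imported as `speech_recognition`, and PyAudio as `pyaudio`.

```python
import importlib.util

# Import names (not pip names) of the packages used in this tutorial.
# "speech_recognition" is what `pip install SpeechRecognition` provides.
REQUIRED = ["speech_recognition", "streamlit"]
OPTIONAL = ["pyaudio"]  # only needed for microphone input


def check_dependencies(required=REQUIRED, optional=OPTIONAL):
    """Return a dict mapping each module name to True if it can be found."""
    status = {}
    for name in list(required) + list(optional):
        # find_spec returns None when a top-level module is not installed
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for module, found in check_dependencies().items():
        print(f"{module}: {'installed' if found else 'missing'}")
```

If `pyaudio` is reported as missing, the app can still transcribe pre-recorded files; only microphone capture needs it.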

## 4. Code Breakdown

The cell below writes a Streamlit app to `speech_recognition_app.py`. This version records audio from the microphone and transcribes it with Google's speech API.

```python
# Create the file speech_recognition_app.py in write mode
with open("speech_recognition_app.py", "w") as file:
    # Write the Streamlit code into the file
    file.write('''
# A simple app that records audio from the microphone and transcribes it into text using Google's Speech API.

# Step 1: Import Required Libraries
import streamlit as st
import speech_recognition as sr


# Step 2: Define the Speech Recognition Function
def transcribe_speech():
    # Initialize the recognizer
    r = sr.Recognizer()

    # Use the microphone as the audio source
    with sr.Microphone() as source:
        st.info("Speak now...")
        audio_text = r.listen(source)
        st.info("Transcribing...")

        try:
            # Using Google Speech Recognition
            text = r.recognize_google(audio_text)
            return text
        except sr.UnknownValueError:
            # The speech could not be understood
            return "Sorry, I did not get that."
        except sr.RequestError as e:
            # The API was unreachable or rejected the request
            return f"Could not request results from the speech service: {e}"

## Explanation:
#  - "sr.Recognizer()": Initializes the recognizer object.
#  - "sr.Microphone()": Captures real-time audio from the microphone.
#  - "r.listen()": Listens to the speech and stores it as an audio object.
#  - "r.recognize_google()": Uses Google's speech recognition engine to convert audio into text.

## Working with Audio Files
# If you want to transcribe pre-recorded audio files instead of using the microphone, PyAudio is not needed.
# The function below is the file-based counterpart of transcribe_speech():

def transcribe_audio_file(file_path):
    # Initialize the recognizer
    r = sr.Recognizer()

    # Use the pre-recorded audio file as the source
    with sr.AudioFile(file_path) as source:
        audio_text = r.record(source)  # Read the audio file
        st.info("Transcribing...")

        try:
            # Using Google Speech Recognition
            text = r.recognize_google(audio_text)
            return text
        except sr.UnknownValueError:
            return "Sorry, I did not get that."
        except sr.RequestError as e:
            return f"Could not request results from the speech service: {e}"

## Explanation:
# sr.AudioFile(): Opens the audio file as a source.
# r.record(): Reads the audio from the file, instead of capturing it live.


# Step 3: Create the Main Function
def main():
    st.title("Speech Recognition App")
    st.write("Click the button below and start speaking:")

    if st.button("Start Recording"):
        text = transcribe_speech()
        # For a pre-recorded file, call transcribe_audio_file(file_path) instead
        st.write("Transcription: ", text)


## Explanation:
#  - "st.button()": Creates an interactive button in the Streamlit interface.
#  - "transcribe_speech()": When the button is clicked, the function is called to transcribe the spoken words into text.


# Step 4: Run the App:

if __name__ == "__main__":
    main()

# This calls main() when the script is executed.
''')

print("speech_recognition_app.py creation executed successfully!")
```

Running the cell prints `speech_recognition_app.py creation executed successfully!`.


The cell below rewrites `speech_recognition_app.py` so that the app transcribes a pre-recorded audio file instead of listening to the microphone.

```python
# Create the file speech_recognition_app.py in write mode
with open("speech_recognition_app.py", "w") as file:
    # Write the Streamlit code into the file
    file.write('''
# A simple app that transcribes audio into text using Google's Speech API,
# either from the microphone or from a pre-recorded audio file.

# Step 1: Import Required Libraries
import streamlit as st
import speech_recognition as sr


# Step 2: Define the Speech Recognition Function
def transcribe_speech():
    # Initialize the recognizer
    r = sr.Recognizer()

    # Use the microphone as the audio source
    with sr.Microphone() as source:
        st.info("Speak now...")
        audio_text = r.listen(source)
        st.info("Transcribing...")

        try:
            # Using Google Speech Recognition
            text = r.recognize_google(audio_text)
            return text
        except sr.UnknownValueError:
            # The speech could not be understood
            return "Sorry, I did not get that."
        except sr.RequestError as e:
            # The API was unreachable or rejected the request
            return f"Could not request results from the speech service: {e}"

## Explanation:
#  - "sr.Recognizer()": Initializes the recognizer object.
#  - "sr.Microphone()": Captures real-time audio from the microphone.
#  - "r.listen()": Listens to the speech and stores it as an audio object.
#  - "r.recognize_google()": Uses Google's speech recognition engine to convert audio into text.

## Working with Audio Files
# If you want to transcribe pre-recorded audio files instead of using the microphone, PyAudio is not needed.
# The function below is the file-based counterpart of transcribe_speech():

def transcribe_audio_file(file_path):
    # Initialize the recognizer
    r = sr.Recognizer()

    # Use the pre-recorded audio file as the source
    with sr.AudioFile(file_path) as source:
        audio_text = r.record(source)  # Read the audio file
        st.info("Transcribing...")

        try:
            # Using Google Speech Recognition
            text = r.recognize_google(audio_text)
            return text
        except sr.UnknownValueError:
            return "Sorry, I did not get that."
        except sr.RequestError as e:
            return f"Could not request results from the speech service: {e}"

## Explanation:
# sr.AudioFile(): Opens the audio file as a source.
# r.record(): Reads the audio from the file, instead of capturing it live.


# Step 3: Create the Main Function
def main():
    st.title("Speech Recognition App")
    st.write("Click the button to transcribe the audio file:")

    if st.button("Start Transcription"):
        file_path = r"C:\\Users\\pc\\Desktop\\B-older\\Data and Stuff\\GMC\\ML GMC\\harvard.wav"
        text = transcribe_audio_file(file_path)
        # text = transcribe_speech()  # use this instead for microphone input
        st.write("Transcription: ", text)


## Explanation:
#  - "st.button()": Creates an interactive button in the Streamlit interface.
#  - "transcribe_audio_file()": When the button is clicked, the function is called to transcribe the audio file into text.


# Step 4: Run the App:

if __name__ == "__main__":
    main()

# This calls main() when the script is executed.
''')

print("speech_recognition_app.py creation executed successfully!")
```

Running the cell prints `speech_recognition_app.py creation executed successfully!`. To launch the app, run `streamlit run speech_recognition_app.py` from a terminal.


## 5. Working with Audio Files

If you want to transcribe pre-recorded audio files instead of using the microphone, PyAudio is **not** needed. The `transcribe_audio_file()` function below takes the place of `transcribe_speech()` in that case:

```python
import speech_recognition as sr
import streamlit as st


def transcribe_audio_file(file_path):
    # Initialize the recognizer
    r = sr.Recognizer()

    # Use the pre-recorded audio file as the source
    with sr.AudioFile(file_path) as source:
        audio_text = r.record(source)  # Read the entire audio file
        st.info("Transcribing...")

        try:
            # Using Google Speech Recognition
            text = r.recognize_google(audio_text)
            return text
        except sr.UnknownValueError:
            # The speech could not be understood
            return "Sorry, I did not get that."
        except sr.RequestError as e:
            # The API was unreachable or rejected the request
            return f"Could not request results from the speech service: {e}"
```

* **Explanation**:
  - `sr.AudioFile()`: Opens the audio file as a source.
  - `r.record()`: Reads the audio from the file, instead of capturing it live.
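
Since `sr.AudioFile()` expects uncompressed audio such as PCM WAV (it also reads AIFF and FLAC), it can be useful to inspect a file before transcribing it. The following is a minimal sketch using only Python's standard `wave` module; the helper name `describe_wav` is hypothetical and not part of SpeechRecognition.

```python
import wave


def describe_wav(file_path):
    """Return basic properties of a PCM WAV file using only the standard library."""
    # wave.open raises wave.Error if the file is not a readable PCM WAV file
    with wave.open(file_path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_seconds": frames / float(rate),
        }


# Usage sketch (the file name is a placeholder for your own recording):
#   print(describe_wav("harvard.wav"))
```

A `wave.Error` here usually means the file is compressed or not a WAV file at all, in which case converting it to PCM WAV first is a reasonable step before passing it to `sr.AudioFile()`.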

## 6. Troubleshooting and Common Issues

1. **PyAudio Installation Issues**:
   - PyAudio is essential if you’re using a microphone. On Windows, you may encounter errors like `error: Microsoft Visual C++ Build Tools required`.
     - Solution: Download pre-built PyAudio binaries from [this site](https://www.lfd.uci.edu/~gohlke/pythonlibs/) if you're on Windows.

2. **Recognition Errors**:
   - Speech recognition sometimes fails because of poor microphone input or background noise.
     - Solution: Ask the user to speak clearly and close to the microphone, reduce background noise where possible, or adjust the microphone sensitivity.
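
One way to handle noisy input in code is to let the recognizer calibrate itself before listening. The sketch below wraps two `Recognizer` methods from the SpeechRecognition library, `adjust_for_ambient_noise()` and `listen()`; the wrapper name `listen_with_calibration` and its default values are illustrative assumptions, not settings taken from the app above.

```python
def listen_with_calibration(recognizer, source, calibration_seconds=1.0,
                            timeout=5, phrase_time_limit=10):
    """Calibrate for ambient noise, then capture one phrase from `source`.

    `recognizer` is expected to be an sr.Recognizer and `source` an opened
    sr.Microphone (that is, used inside `with sr.Microphone() as source:`).
    """
    # Sample the background briefly so the energy threshold suits the room
    recognizer.adjust_for_ambient_noise(source, duration=calibration_seconds)
    # timeout: seconds to wait for speech to begin
    # phrase_time_limit: maximum length of the captured phrase in seconds
    return recognizer.listen(source, timeout=timeout,
                             phrase_time_limit=phrase_time_limit)


# Usage sketch (needs SpeechRecognition, PyAudio, and a working microphone):
#
#   import speech_recognition as sr
#   r = sr.Recognizer()
#   with sr.Microphone() as source:
#       audio = listen_with_calibration(r, source)
#   print(r.recognize_google(audio))
```

With a `timeout` set, `listen()` raises `sr.WaitTimeoutError` if no speech starts in time, so a caller may want to catch that and prompt the user to try again.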