In [1]:
pip install SpeechRecognition transformers pyttsx3

Defaulting to user installation because normal site-packages is not writeable
Collecting SpeechRecognition
  Downloading SpeechRecognition-3.13.0-py3-none-any.whl.metadata (30 kB)
Collecting pyttsx3
  Downloading pyttsx3-2.98-py3-none-any.whl.metadata (3.8 kB)
Collecting comtypes (from pyttsx3)
  Downloading comtypes-1.4.9-py3-none-any.whl.metadata (7.1 kB)
Collecting pypiwin32 (from pyttsx3)
  Downloading pypiwin32-223-py3-none-any.whl.metadata (236 bytes)
Downloading SpeechRecognition-3.13.0-py3-none-any.whl (32.8 MB)

In [2]:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://download.pytorch.org/whl/cpu
Collecting torchvision
  Downloading https://download.pytorch.org/whl/cpu/torchvision-0.20.1%2Bcpu-cp312-cp312-win_amd64.whl (1.6 MB)
Collecting torch
  Downloading https://download.pytorch.org/whl/cpu/torch-2.5.1%2Bcpu-cp312-cp312-win_amd64.whl (205.

In [3]:
pip install tensorflow

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [6]:
import speech_recognition as sr
from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
import pyttsx3

In [None]:
# Step 1: take audio input from the user and transcribe it into text for the model to use

In [2]:
def speech_to_text():  # Converts spoken audio into text using the speech_recognition library

    recognizer = sr.Recognizer()  # Provides methods for capturing and recognizing speech

    # List available microphones
    print("Available microphones:")
    for index, microphone in enumerate(sr.Microphone.list_microphone_names()):
        print(f"{index}: {microphone}")

    # Choose the microphone you want (replace 1 with the appropriate index)
    mic_index = 1  # Adjust this based on your microphone index

    with sr.Microphone(device_index=mic_index) as source:  # Open the chosen microphone as the real-time audio source
        print("Listening... Speak now.")  # Tell the user the program is listening
        try:
            # Wait up to 5 seconds for speech to start, then record the phrase as an AudioData object
            audio = recognizer.listen(source, timeout=5)
            print("Processing...")  # Tell the user the audio was captured and is being processed
            text = recognizer.recognize_google(audio)  # Transcribe the recording via the Google Web Speech API

            print(f"Transcribed Text: {text}")  # Print the transcription so the user can verify it
            return text  # Output of speech_to_text

        except sr.WaitTimeoutError:  # No speech started within the 5-second timeout
            print("No speech detected before the timeout.")
            return None

        except sr.UnknownValueError:  # The audio could not be transcribed, e.g. because of noise
            print("Could not understand the audio.")
            return None

        except sr.RequestError as e:  # The API was unreachable or the request failed
            print(f"Request error: {e}")
            return None
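
The hard-coded microphone index above depends on the machine's device order. As a minimal sketch, assuming a name match is good enough to pick a device, the helper below (`find_microphone_index` is a hypothetical name, not part of `speech_recognition`) searches the list returned by `sr.Microphone.list_microphone_names()` for a keyword. The sample names are shortened from the device listing in this notebook.

```python
def find_microphone_index(names, keyword="microphone"):
    """Return the index of the first device whose name contains keyword, else None."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        # Some backends may report a missing name, so guard against None.
        if name and keyword in name.lower():
            return index
    return None


# Hypothetical sample; in the notebook this would be
# sr.Microphone.list_microphone_names().
sample_names = [
    "Microsoft Sound Mapper - Input",
    "Microphone Array (Intel Smart",
    "Speakers (Realtek(R) Audio)",
]
print(find_microphone_index(sample_names))
```

The returned value could then be passed as `device_index` to `sr.Microphone`, falling back to the default device when the result is `None`.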
        

In [None]:
## Step 2: process the transcribed text and return the sentiment of the user's speech

In [3]:
def analyze_text(text):  # Analyzes the sentiment of the text
    print("Analyzing sentiment...")

    # Load the pre-trained sentiment analysis model
    sentiment_pipeline = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

    result = sentiment_pipeline(text)  # List with one dict per input, each holding a 'label' and a 'score'
    sentiment = result[0]['label']  # Predicted sentiment label
    score = result[0]['score']  # Confidence of the prediction

    print(f"Sentiment: {sentiment}, Confidence: {score:.2f}")
    return sentiment

In [None]:
## Convert the response text into speech for output

In [4]:
def text_to_speech(response):  # response: the text to be spoken aloud
    engine = pyttsx3.init()  # Start the pyttsx3 speech synthesis engine
    engine.setProperty("rate", 150)  # Set speed
    engine.setProperty("volume", 1.0)  # Set volume

    print(f"Speaking: {response}")  # Print the text that is about to be spoken
    engine.say(response)  # Queue the response text in the engine's speech queue
    engine.runAndWait()  # Process the queued speech commands and wait until done

In [None]:
## main program

In [13]:
if __name__ == "__main__":

    user_input = speech_to_text() ## takes the user input

    if user_input:
        sentiment=analyze_text(user_input)  ## sentiment analysis

        # Formulate Response
        if sentiment == "POSITIVE":
            response = "Thank you for your positive feedback! I appreciate it."
        elif sentiment == "NEGATIVE":
            response = "I'm sorry to hear that. How can I help you further?"
        else:
            response = "Thank you for sharing your thoughts."

        
        text_to_speech(response) ## read aloud the response to user

Available microphones:
0: Microsoft Sound Mapper - Input
1: Microphone Array (Intel® Smart 
2: Microsoft Sound Mapper - Output
3: Speakers (Realtek(R) Audio)
4: Primary Sound Capture Driver
5: Microphone Array (Intel® Smart Sound Technology for Digital Microphones)
6: Primary Sound Driver
7: Speakers (Realtek(R) Audio)
8: Speakers (Realtek(R) Audio)
9: Microphone Array (Intel® Smart Sound Technology for Digital Microphones)
10: Microphone Array 1 ()
11: Microphone Array 2 ()
12: Microphone Array 3 ()
13: Headphones ()
14: Speakers 1 (Realtek HD Audio output with SST)
15: Speakers 2 (Realtek HD Audio output with SST)
16: PC Speaker (Realtek HD Audio output with SST)
17: Stereo Mix (Realtek HD Audio Stereo input)
18: Headphones 1 (Realtek HD Audio 2nd output with SST)
19: Headphones 2 (Realtek HD Audio 2nd output with SST)
20: PC Speaker (Realtek HD Audio 2nd output with SST)
21: Microphone (Realtek HD Audio Mic input)
22: Headphones ()
23: Headset (@System32\drivers\bthhfenum.sys,#2;

Device set to use cpu


Sentiment: POSITIVE, Confidence: 1.00
Speaking: Thank you for your positive feedback! I appreciate it.
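
The SST-2 model used above only emits `POSITIVE` or `NEGATIVE`, so the fallback response in the main program is reached only for an unexpected label. As a rough sketch, assuming the confidence score is also made available (the `analyze_text` function above returns only the label), a threshold can route low-confidence predictions to the neutral reply. The name `choose_response` and the cutoff of 0.75 are illustrative assumptions, not values taken from the notebook or the model.

```python
def choose_response(label, score, threshold=0.75):
    """Map a sentiment label and confidence score to a spoken reply."""
    neutral = "Thank you for sharing your thoughts."
    # Treat low-confidence predictions as neutral (threshold is an assumed value).
    if score < threshold:
        return neutral
    if label == "POSITIVE":
        return "Thank you for your positive feedback! I appreciate it."
    if label == "NEGATIVE":
        return "I'm sorry to hear that. How can I help you further?"
    return neutral


print(choose_response("POSITIVE", 0.99))
print(choose_response("NEGATIVE", 0.55))
```

To use it in the main program, `analyze_text` would need to return both `sentiment` and `score`.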


In [10]:
import pyaudio

# List all audio devices
audio = pyaudio.PyAudio()
for i in range(audio.get_device_count()):
    info = audio.get_device_info_by_index(i)
    print(f"Device {i}: {info['name']}")

# Specify a device index
device_index = int(input("Enter the device index to use: "))

audio.terminate()  # Release the PortAudio resources once the listing is done

Device 0: Microsoft Sound Mapper - Input
Device 1: Microphone Array (Intel® Smart 
Device 2: Microsoft Sound Mapper - Output
Device 3: Speakers (Realtek(R) Audio)
Device 4: Primary Sound Capture Driver
Device 5: Microphone Array (Intel® Smart Sound Technology for Digital Microphones)
Device 6: Primary Sound Driver
Device 7: Speakers (Realtek(R) Audio)
Device 8: Speakers (Realtek(R) Audio)
Device 9: Microphone Array (Intel® Smart Sound Technology for Digital Microphones)
Device 10: Microphone Array 1 ()
Device 11: Microphone Array 2 ()
Device 12: Microphone Array 3 ()
Device 13: Headphones ()
Device 14: Speakers 1 (Realtek HD Audio output with SST)
Device 15: Speakers 2 (Realtek HD Audio output with SST)
Device 16: PC Speaker (Realtek HD Audio output with SST)
Device 17: Stereo Mix (Realtek HD Audio Stereo input)
Device 18: Headphones 1 (Realtek HD Audio 2nd output with SST)
Device 19: Headphones 2 (Realtek HD Audio 2nd output with SST)
Device 20: PC Speaker (Realtek HD Audio 2nd out

Enter the device index to use:  1
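
The listing above mixes microphones with speakers and headphones. One possible way to narrow it down, sketched here under the assumption that only capture devices are of interest, is to keep entries whose `maxInputChannels` field is greater than zero. That key is part of the dict returned by `get_device_info_by_index`; the helper name `input_devices` and the channel counts in the sample are made up for illustration.

```python
def input_devices(device_infos):
    """Return (index, name) pairs for devices that can capture audio."""
    return [
        (info["index"], info["name"])
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


# Hypothetical sample shaped like PyAudio device-info dicts; with PyAudio this
# would be [audio.get_device_info_by_index(i) for i in range(audio.get_device_count())].
sample_infos = [
    {"index": 0, "name": "Microsoft Sound Mapper - Input", "maxInputChannels": 2},
    {"index": 3, "name": "Speakers (Realtek(R) Audio)", "maxInputChannels": 0},
    {"index": 21, "name": "Microphone (Realtek HD Audio Mic input)", "maxInputChannels": 2},
]
for index, name in input_devices(sample_infos):
    print(f"Device {index}: {name}")
```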


In [12]:
!sudo apt-get install portaudio19-dev

Sudo is disabled on this machine. To enable it, go to the Developer Settings page in the Settings app


In [8]:
!pip install pyaudio

Defaulting to user installation because normal site-packages is not writeable
