### Project - Voice-enabled Chatbot
This project builds a voice-based chatbot that uses speech recognition and text-to-speech synthesis to interact with users. It combines the following technologies:

#### Speech Recognition (ASR - Automatic Speech Recognition):

- Converts spoken input from the user into text.

- The chatbot listens to the user's voice and transcribes it, so that later stages can interpret the command or query.

#### Word2Vec (Word Embedding Model):

- Represents each word as a numeric vector, so the bot can measure how similar two words are in meaning.

- Using word vectors lets the chatbot go beyond exact keyword matching and recognize related words, enabling more context-aware conversations.
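
To make "similarity between words" concrete, here is a minimal sketch of cosine similarity, the measure that gensim's `model.wv.similarity` computes between two word vectors. The three-dimensional vectors below are hand-picked, hypothetical values for illustration only; a trained model learns its own, higher-dimensional vectors.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, chosen by hand (not produced by a trained model).
toy_vectors = {
    "hello": [0.9, 0.1, 0.0],
    "hi":    [0.8, 0.2, 0.1],
    "clock": [0.0, 0.2, 0.9],
}

sim_greeting = cosine_similarity(toy_vectors["hello"], toy_vectors["hi"])
sim_unrelated = cosine_similarity(toy_vectors["hello"], toy_vectors["clock"])

# Vectors pointing in similar directions score close to 1,
# while unrelated directions score close to 0.
print(f"hello vs hi:    {sim_greeting:.2f}")
print(f"hello vs clock: {sim_unrelated:.2f}")
```

Words whose vectors point in nearly the same direction are treated as related, which is what lets the bot match an input word to an intent keyword it was not written to match exactly.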

#### Natural Language Processing (NLP):

- Analyzes the text input to understand the intent behind the user's request (e.g., asking for the time or requesting a joke).

- NLP helps the chatbot map user input to a specific intent, allowing it to choose an appropriate response.
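
As a rough illustration of mapping text to an intent, the sketch below counts keyword hits per intent and returns the best match. The `INTENT_KEYWORDS` table and the `match_intent` helper are hypothetical names introduced for this example; they are a simplified stand-in, not the project's actual matching logic.

```python
import re

# Hypothetical intent table, for illustration only.
INTENT_KEYWORDS = {
    "time": {"time", "clock"},
    "joke": {"joke", "funny"},
}

def match_intent(text):
    """Return the intent with the most keyword hits, or None if nothing matches."""
    # Lowercase the text and keep only alphabetic tokens, dropping punctuation.
    words = set(re.findall(r"[a-z']+", text.lower()))
    best_intent, best_hits = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        hits = len(words & keywords)  # number of keywords present in the input
        if hits > best_hits:
            best_intent, best_hits = intent, hits
    return best_intent

print(match_intent("What time is it?"))
print(match_intent("Play some music"))
```

Returning `None` when nothing matches gives the caller a clear signal to use a fallback response.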

#### Text-to-Speech (TTS):

- Converts the text response from the chatbot back into speech, so the user can hear the response instead of reading it.



A voice chatbot is a type of conversational AI that uses speech recognition (to understand spoken input) and text-to-speech (TTS) synthesis (to generate spoken output) to interact with users in a voice-based format. Unlike text chatbots, which rely on typed text input and output, a voice chatbot allows users to speak to the bot and hear its responses, making the interaction more natural and hands-free.

### Key Components of a Voice Chatbot:
#### Speech Recognition (ASR):

- Converts spoken words (from the user) into text.

- This step involves recognizing the speech input (audio) and translating it into written language.

- Example: The user says "What's the weather today?" → the system converts this to text.

#### Natural Language Processing (NLP):

- The core of the chatbot that processes the text input to understand the user's intent and context.

- The chatbot determines what the user wants based on keywords or word meanings in the transcribed text.

- This includes identifying intents (e.g., asking for the weather or requesting a joke).

#### Text Generation:

- The chatbot generates an appropriate text response based on the user's intent.

#### Text-to-Speech (TTS):

- Converts the text response into speech so the user can hear the chatbot's reply.

- The TTS engine plays the chatbot's response through the speakers or headphones.



### How Voice Chatbots Work:
1. The user speaks to the chatbot.

2. The speech recognition system converts the user's spoken words into text.

3. The NLP model processes the text to understand the intent (e.g., "Tell me a joke" or "What's the time?").

4. The chatbot responds with a generated text reply (e.g., "Here's a joke: Why don't scientists trust atoms?").

5. The text-to-speech system reads the chatbot's response out loud to the user.
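
The steps above can be sketched as a single function that passes data from one stage to the next. This is a minimal sketch under stated assumptions: `run_turn` and its four parameters are hypothetical names, and simple lambdas stand in for the real microphone, NLP model, and TTS engine so the flow can be run without audio hardware.

```python
def run_turn(transcribe, detect_intent, generate_reply, speak):
    """Run one user turn through the pipeline and return the reply text."""
    text = transcribe()             # speech -> text (speech recognition)
    intent = detect_intent(text)    # text -> intent (NLP)
    reply = generate_reply(intent)  # intent -> reply text (text generation)
    speak(reply)                    # reply text -> speech (TTS)
    return reply

# Stand-ins for the real components (assumptions for illustration).
spoken = []  # collects what the "speaker" was asked to say
reply = run_turn(
    transcribe=lambda: "tell me a joke",
    detect_intent=lambda text: "joke" if "joke" in text else "unknown",
    generate_reply=lambda intent: (
        "Why don't scientists trust atoms?" if intent == "joke"
        else "Sorry, I don't understand."
    ),
    speak=spoken.append,
)
print(reply)
```

Passing each stage in as a function keeps the stages independent, so any one of them can be swapped (for example, a different speech recognizer) without touching the others.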

In [5]:
!pip install pyttsx3

# Use: pyttsx3 is a Text-to-Speech (TTS) library in Python. It is used to convert text 
# (which the chatbot generates as a response) into speech (audio). The library uses various speech engines based
# on your system (SAPI5 on Windows, NSSpeechSynthesizer on macOS, and espeak on Linux).






In [6]:
!pip install SpeechRecognition

# Use: SpeechRecognition is a library that helps convert spoken language (audio input from the user) into text. 
# It supports different speech recognition engines like Google Web Speech API, CMU Sphinx, etc. This is used 
# to listen to what the user says and convert that audio into text, which is then processed by the chatbot to determine the user's intent.



In [7]:
!pip install pyaudio

# PyAudio is a cross-platform audio input/output library that provides easy-to-use methods 
# to interact with microphones and speakers. It is particularly useful 
# when building applications that involve audio input or output, such as voice chatbots or voice assistants.

# Audio Input (Microphone):

# PyAudio allows your Python program to capture audio from a microphone. This is essential for voice 
# chatbots or voice-controlled applications that need to listen to and recognize the user's speech.

# In the context of a voice chatbot, PyAudio is used by the SpeechRecognition library to listen to the user's spoken input.

# Audio Output (Speakers):

# PyAudio can also play sound through speakers, though it is rarely used for output in 
# chatbot applications. For text-to-speech, this project uses pyttsx3, which drives the platform's speech engine directly and does not depend on PyAudio.

# Real-time Audio Processing:

# PyAudio can be used for applications that require real-time audio processing, such as live voice communication or audio recording.






In [8]:
import datetime  # Importing datetime module to get current time
import random  # Importing random module to randomly select jokes
from gensim.models import Word2Vec  # Importing Word2Vec from gensim library for word embeddings
import pyttsx3  # Importing pyttsx3 for text-to-speech conversion
import speech_recognition as sr  # Importing speech recognition for converting speech to text

In [9]:
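# Sample training data for Word2Vec. Each intent keyword used in find_intent
# must appear here, because words outside the vocabulary are skipped.
training_sentences = [
    ["hello", "hi", "hey", "greetings"],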
    ["time", "clock", "hours"],
    ["joke", "funny", "laugh"],
    ["name", "identity", "who"],
    ["bye", "goodbye", "see", "later"]
]


In [10]:
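# Train Word2Vec model on the sample sentences.
# Note: with a corpus this small the learned vectors carry little real meaning,
# so exact keyword matches (similarity 1.0) dominate the intent scores.
model = Word2Vec(sentences=training_sentences, vector_size=50, window=5, min_count=1, workers=1)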

In [11]:
def get_time():
    now = datetime.datetime.now()  # Get the current datetime
    return now.strftime("%H:%M")  # Return time in HH:MM format

def tell_joke():
    jokes = [
        "Why don't scientists trust atoms? Because they make up everything!",
        "Why did the math book look sad? Because it had too many problems.",
        "I'm reading a book about anti-gravity. It's impossible to put down!"
    ]
    return random.choice(jokes)  # Randomly return one joke

def find_intent(user_input):
    user_input = user_input.lower()  # Convert input to lowercase for consistency
    words = user_input.split()  # Split the input into individual words

    # Define intents and their associated keywords
    intents = {
        "greeting": ["hello", "hi", "hey"],
        "time": ["time", "clock", "hours"],
        "joke": ["joke", "funny", "laugh"],
        "name": ["name", "identity", "who"],
        "bye": ["bye", "goodbye", "see", "later"]
    }

    scores = {intent: 0 for intent in intents}  # Initialize intent scores

    for word in words:
        if word not in model.wv:  # Skip if word not in vocabulary
            continue
        for intent, keywords in intents.items():  # Loop through each intent and its keywords
            for keyword in keywords:
                if keyword in model.wv:  # Ensure keyword exists in vocabulary
                    similarity = model.wv.similarity(word, keyword)  # Calculate similarity
                    scores[intent] += similarity  # Accumulate similarity score for intent

    best_intent = max(scores, key=scores.get)  # Get intent with highest score
    if scores[best_intent] > 0.3:  # Threshold to decide if intent is valid
        return best_intent
    else:
        return None  # Return None if no intent is confidently matched

def simple_chatbot(user_input):
    intent = find_intent(user_input)  # Find the intent based on user input

    if intent == "greeting":
        return "Hello! How can I help you today?"  # Respond to greetings
    elif intent == "time":
        return f"The current time is {get_time()}"  # Respond with current time
    elif intent == "joke":
        return tell_joke()  # Respond with a joke
    elif intent == "name":
        return "I'm your friendly chatbot!"  # Respond with bot's name
    elif intent == "bye":
        return "Goodbye! Have a great day!"  # Respond to goodbyes
    else:
        return "I'm sorry, I don't understand that yet. Can you try asking something else?"  # Fallback response

# Function for voice output (Text to Speech)
def speak(text):
    engine = pyttsx3.init()  # Initialize TTS engine
    engine.say(text)  # Convert text to speech
    engine.runAndWait()  # Wait for speech to finish

# Function to listen to user input (Speech to Text)
def listen():
    recognizer = sr.Recognizer()  # Initialize recognizer
    with sr.Microphone() as source:  # Use the microphone as the audio source
        print("Listening...")  # Prompt for user to speak
        audio = recognizer.listen(source)  # Capture the user's speech
        try:
            print("Recognizing...")  # Inform user that we are processing the speech
            user_input = recognizer.recognize_google(audio)  # Convert speech to text using Google API
            print(f"You said: {user_input}")  # Display recognized text
            return user_input
        except sr.UnknownValueError:
            print("Sorry, I could not understand that.")  # If speech is not clear
            return ""
        except sr.RequestError:
            print("Sorry, the speech recognition service is unavailable.")  # API unreachable or request failed (e.g., no internet)
            return ""
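
The scoring in `find_intent` sums similarities over every word and keyword pair, so an intent with more keywords (such as "bye", which has four) can accumulate a higher total than one with fewer. One possible alternative, sketched below, scores each intent by its single best pair instead. The function name `best_intent_by_max`, the 0.5 threshold, and `toy_similarity` (a stand-in for `model.wv.similarity`) are all assumptions made so the sketch runs without a trained model.

```python
def best_intent_by_max(words, intents, similarity, threshold=0.5):
    """Score each intent by its highest word/keyword similarity."""
    scores = {}
    for intent, keywords in intents.items():
        pair_scores = [similarity(w, k) for w in words for k in keywords]
        scores[intent] = max(pair_scores, default=0.0)
    best = max(scores, key=scores.get, default=None)
    if best is not None and scores[best] >= threshold:
        return best
    return None  # no intent is confidently matched

def toy_similarity(a, b):
    """Hypothetical stand-in: 1.0 for identical words, 0.2 for any other pair."""
    return 1.0 if a == b else 0.2

intents = {
    "time": ["time", "clock", "hours"],
    "bye": ["bye", "goodbye", "see", "later"],
}

print(best_intent_by_max(["what", "time"], intents, toy_similarity))
print(best_intent_by_max(["weather", "today"], intents, toy_similarity))
```

With this toy similarity, summing would give "bye" the larger total for the second input simply because it has more keywords, while taking the maximum leaves both intents below the threshold and returns `None`.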



In [12]:
# --- Main Loop ---
if __name__ == "__main__":
    print("Chatbot is running! Speak 'bye' to exit.")  # Initial message when bot starts
    speak("Chatbot is running! Speak 'bye' to exit.")  # Voice output
    while True:
        user_input = listen()  # Listen for user's voice input
        if "bye" in user_input.lower():  # Exit loop if user says 'bye'
            speak("Goodbye! Have a great day!")  # Voice output for goodbye
            break
        response = simple_chatbot(user_input)  # Generate chatbot response
        print(f"Bot: {response}")  # Print the response
        speak(response)  # Convert response to speech and play it


Chatbot is running! Speak 'bye' to exit.
Listening...
Recognizing...
Sorry, I could not understand that.
Bot: I'm sorry, I don't understand that yet. Can you try asking something else?
Listening...
Recognizing...
You said: hello
Bot: Hello! How can I help you today?
Listening...
Recognizing...
Sorry, I could not understand that.
Bot: I'm sorry, I don't understand that yet. Can you try asking something else?
Listening...
Recognizing...
You said: bye-bye
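
In the sample run above, a failed recognition still triggers the fallback reply, because `listen()` returns an empty string that is then passed to `simple_chatbot`. A possible refinement, sketched here under stated assumptions, asks the user to repeat instead of treating silence as a query. The `chat_loop` function, its `max_turns` limit, and the scripted stand-ins for the microphone and speaker are hypothetical and exist only to make the sketch runnable without audio hardware.

```python
def chat_loop(listen, respond, speak, max_turns=10):
    """Run the conversation loop, skipping turns where nothing was recognized."""
    for _ in range(max_turns):  # bounded so the sketch always terminates
        user_input = listen()
        if not user_input.strip():
            # Nothing was recognized: ask again instead of guessing an intent.
            speak("Sorry, I didn't catch that. Could you repeat it?")
            continue
        if "bye" in user_input.lower():
            speak("Goodbye! Have a great day!")
            break
        speak(respond(user_input))

# Scripted stand-ins replace the microphone and the speaker.
scripted_inputs = iter(["", "hello", "bye-bye"])
spoken = []
chat_loop(
    listen=lambda: next(scripted_inputs),
    respond=lambda text: f"You said {text}",
    speak=spoken.append,
)
for line in spoken:
    print(f"Bot: {line}")
```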
