## Commands to create the conda environment
```bash
conda create -n my_env python=3.12
conda activate my_env
```
## Commands to install NLTK and scikit-learn
```bash
conda install nltk
conda install scikit-learn
```

In [1]:
# Import libraries
import nltk
from nltk.chat.util import Chat, reflections
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
%matplotlib inline

In [2]:
# Download the punkt and wordnet packages
nltk.download('punkt')
nltk.download('wordnet')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\DELL\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\DELL\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

In [3]:
# Create the pairs of questions and answers
pairs = [
    ['hello', ['Hello! How can I help you today?']],
    ['I forgot my password', ['You can click on "Forgot Password" to reset it.']],
    ['thank you', ['You are welcome!']],
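]

# Build the rule-based chatbot from the pairs and NLTK's default reflections
chatbot = Chat(pairs, reflections)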

## The `nltk.chat.util` module

### Description
The `nltk.chat.util` module is part of **NLTK (Natural Language Toolkit)** and is designed for building simple rule-based chatbots. It provides the `Chat` class, which answers user input by matching it against predefined patterns and replying with predefined responses. This is enough for a basic conversational agent and requires no machine learning model.

### Parameters
- **pairs**: A list of `[pattern, responses]` pairs. Each pattern is a regular expression matched case-insensitively against the start of the user input, and `responses` is a list of candidate replies from which the chatbot picks one at random.
- **reflections**: A dictionary that maps first-person words to their second-person counterparts and vice versa (for example, "my" to "your"). It is applied to text captured from the user's input before that text is inserted into a reply, so the reply reads naturally.

### Example
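As a minimal sketch of how `pairs` and `reflections` work together, the snippet below builds a tiny `Chat` instance. The names `demo_pairs` and `demo_bot` and the second pattern are invented for illustration and are not part of this notebook. In a reply, `%1` stands for the first captured group after it has been passed through `reflections`.

```python
from nltk.chat.util import Chat, reflections

# Hypothetical pairs for illustration: [regex pattern, list of candidate replies]
demo_pairs = [
    [r"hello|hi", ["Hello! How can I help you today?"]],
    [r"i need (.*)", ["Why do you need %1?"]],
]

demo_bot = Chat(demo_pairs, reflections)

print(demo_bot.respond("hello"))
print(demo_bot.respond("I need my password"))  # "my" is reflected to "your"
print(demo_bot.respond("what time is it"))     # no pattern matches, so this is None
```

Because `respond` returns `None` when no pattern matches, a real chatbot would usually end its pairs with a catch-all pattern such as `r"(.*)"`.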


In [4]:
import ipywidgets as widgets
from IPython.display import display
%matplotlib inline

In [5]:
# Create input and output widgets
# continuous_update=False makes the value change fire only on Enter (or when the box loses focus)
input_box = widgets.Text(description="You:", continuous_update=False)
output_box = widgets.Output()

# Function to interact with the chatbot
def chat_with_bot(change):
    user_input = change["new"]
    if not user_input:
        return  # Ignore the change event fired when the box is cleared
    with output_box:
        if user_input.lower() == 'quit':
            print("Goodbye!")
        else:
            response = chatbot.respond(user_input)
            print(f"You: {user_input}")
            print(f"Bot: {response}")
    input_box.value = ''  # Clear the input box

# Link the function to the input box (on_submit is deprecated in recent ipywidgets)
input_box.observe(chat_with_bot, names="value")

# Display the widgets
display(input_box, output_box)


Text(value='', description='You:')

Output()

## The scikit-learn imports shown earlier

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
```
`CountVectorizer` (a class) and `cosine_similarity` (a function) both come from scikit-learn. Together they provide the text vectorization and similarity scoring that a chatbot needs to match user input against known sentences.

In [10]:
# CountVectorizer is a scikit-learn class used to convert text into a numerical representation based on word frequency (BoW).

# Define the corpus
corpus = [
    'Hello! How can I help you today?',
    'You can click on "Forgot Password" to reset it.',
    'You are welcome!'
]

# Create the CountVectorizer object
vectorizer = CountVectorizer()

# Fit and transform the data
X = vectorizer.fit_transform(corpus)

# Print the feature names
print(vectorizer.get_feature_names_out())
# Print the BoW matrix
X = X.toarray()
print(X)

# Use in chatbots:
# CountVectorizer allows you to transform the user and chatbot's sentences into a numerical representation. 
# This is the first step to analyze similarities or classify intentions.

['are' 'can' 'click' 'forgot' 'hello' 'help' 'how' 'it' 'on' 'password'
 'reset' 'to' 'today' 'welcome' 'you']
[[0 1 0 0 1 1 1 0 0 0 0 0 1 0 1]
 [0 1 1 1 0 0 0 1 1 1 1 1 0 0 1]
 [1 0 0 0 0 0 0 0 0 0 0 0 0 1 1]]
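To make the bag-of-words matrix less of a black box, here is a rough sketch that rebuilds it with the standard library only. It assumes `CountVectorizer`'s default behavior (lowercasing, and keeping only tokens of two or more word characters, which is why "I" does not appear in the vocabulary). The helper name `tokenize` is made up for this example.

```python
import re
from collections import Counter

corpus = [
    'Hello! How can I help you today?',
    'You can click on "Forgot Password" to reset it.',
    'You are welcome!',
]

def tokenize(text):
    # Approximates CountVectorizer's defaults: lowercase, tokens of 2+ word characters
    return re.findall(r"\b\w\w+\b", text.lower())

# Vocabulary: every distinct token, sorted alphabetically
vocabulary = sorted({token for doc in corpus for token in tokenize(doc)})

# One row per document, one column per vocabulary word, each cell a word count
bow_matrix = []
for doc in corpus:
    counts = Counter(tokenize(doc))
    bow_matrix.append([counts[word] for word in vocabulary])

print(vocabulary)
for row in bow_matrix:
    print(row)
```

The vocabulary and rows it prints should line up with the feature names and matrix shown in the cell output above.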


In [11]:
# cosine_similarity is a scikit-learn function that calculates the similarity between two numerical vectors using cosine similarity.

similarity = cosine_similarity(X)

# Print the similarity matrix
print(similarity)

[[1.         0.27216553 0.23570226]
 [0.27216553 1.         0.19245009]
 [0.23570226 0.19245009 1.        ]]
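The off-diagonal values can be checked by hand. Cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below applies that formula to the first two rows of the BoW matrix; the function name `cosine` and the row variable names are illustrative only.

```python
import math

def cosine(u, v):
    # cos(u, v) = (u . v) / (||u|| * ||v||)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# First two rows of the BoW matrix from the CountVectorizer cell
hello_row    = [0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1]
password_row = [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1]

# The rows share two words ("can" and "you"), so the dot product is 2.
# Their lengths are sqrt(6) and sqrt(9), giving 2 / (sqrt(6) * 3).
print(round(cosine(hello_row, password_row), 8))  # 0.27216553
```

A value of 1 means the two sentences use the same words in the same proportions, and 0 means they share no words at all.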


In [12]:
# Define the responses and user input
responses = [
    'Hello! How can I help you today?',
    'You can click on "Forgot Password" to reset it.',
    'You are welcome!'
]

user_input = 'I forgot my password'

# Combine the user input and the responses into a single corpus
corpus = [user_input] + responses

# Create a new CountVectorizer object
vectorizer2 = CountVectorizer()

# Fit the vectorizer to the corpus and transform the text data into a numerical representation
X2 = vectorizer2.fit_transform(corpus)

# Calculate the cosine similarity between the user input and the responses
similarity2 = cosine_similarity(X2)

# Print the similarity matrix
print("Similarity Matrix:")
print(similarity2)

# Find the index of the best response based on the highest similarity score
# We skip the first row (user input) and find the best match among the responses
best_response_index = similarity2[0, 1:].argmax() + 1

# Print the best response index
print(f"Best response index: {best_response_index}")


Similarity Matrix:
[[1.         0.         0.38490018 0.        ]
 [0.         1.         0.27216553 0.23570226]
 [0.38490018 0.27216553 1.         0.19245009]
 [0.         0.23570226 0.19245009 1.        ]]
Best response index: 2
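One way to turn this matching step into something reusable is sketched below. It is not the notebook's own approach: the function `best_response`, its `threshold` and `fallback` parameters, and the fallback message are assumptions made for illustration. It also fits the vectorizer on the responses only and then calls `transform` on the user input, rather than fitting on a combined corpus.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def best_response(user_input, responses, threshold=0.0,
                  fallback="Sorry, I did not understand that."):
    """Return the response most similar to user_input, or fallback if nothing overlaps."""
    vectorizer = CountVectorizer()
    # Learn the vocabulary from the responses only
    response_vectors = vectorizer.fit_transform(responses)
    # Words that are not in the vocabulary are ignored by transform()
    user_vector = vectorizer.transform([user_input])
    # One similarity score per response
    scores = cosine_similarity(user_vector, response_vectors)[0]
    best_index = scores.argmax()
    if scores[best_index] <= threshold:
        return fallback
    return responses[best_index]

responses = [
    'Hello! How can I help you today?',
    'You can click on "Forgot Password" to reset it.',
    'You are welcome!',
]

print(best_response('I forgot my password', responses))
```

Returning the response text rather than an index avoids the off-by-one bookkeeping needed when the user input is stored in the same corpus as the responses.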
