# Problem Statement 2: Intent Recognition

### Description: Develop an intent recognition model that categorizes user inputs into predefined intents.
### Solution: Train a machine learning model using a dataset of labeled user inputs and corresponding intents.

### i) Data Collection
* Gather a dataset of labeled user inputs and corresponding intents. The dataset should include a variety of user queries and their associated intents.

In [1]:
dataset = [
    {"input": "What's the weather like today?", "intent": "Weather"},
    {"input": "Tell me a joke.", "intent": "Entertainment"},
    {"input": "Set an alarm for 8 AM.", "intent": "Reminders"},
    # Add more examples...
]
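
Before training, it can help to check how many examples each intent has, since an intent with a single example cannot appear in both a training and a test split. The sketch below repeats the three examples above as a stand-in dataset; the threshold of two examples per intent is an assumption chosen only for illustration.

```python
from collections import Counter

# Stand-in copy of the three examples above (assumed, for a self-contained sketch).
dataset = [
    {"input": "What's the weather like today?", "intent": "Weather"},
    {"input": "Tell me a joke.", "intent": "Entertainment"},
    {"input": "Set an alarm for 8 AM.", "intent": "Reminders"},
]

# Count how many labeled examples each intent has.
intent_counts = Counter(example["intent"] for example in dataset)
for intent, count in sorted(intent_counts.items()):
    print(f"{intent}: {count}")

# Flag intents with fewer than 2 examples (illustrative threshold, not a rule).
rare_intents = sorted(
    intent for intent, count in intent_counts.items() if count < 2
)
print("Intents needing more examples:", rare_intents)
```

With only the three examples above, every intent is flagged, which is a sign that more data is needed before evaluation is meaningful.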


### ii) Data Preprocessing
* Preprocess the text data to convert it into a suitable format for training. Common preprocessing steps include tokenization, lowercasing, and removing punctuation. The cell below applies tokenization and lowercasing only.

In [2]:
import nltk
from nltk.tokenize import word_tokenize

nltk.download('punkt')

# Tokenize and lowercase each input (punctuation marks are kept as separate tokens)
for example in dataset:
    example["input_tokens"] = word_tokenize(example["input"].lower())


[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\HP\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping tokenizers\punkt.zip.
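
The description above also mentions removing punctuation, which the cell does not do. A minimal sketch of that step using only the standard library follows; the helper name `normalize` is hypothetical, and it uses plain whitespace splitting rather than `word_tokenize`, so its tokens will differ slightly from NLTK's.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text):
    """Lowercase, strip ASCII punctuation, and split on whitespace.

    Hypothetical helper for illustration; not part of NLTK or scikit-learn.
    """
    return text.lower().translate(_PUNCT_TABLE).split()


tokens = normalize("What's the weather like today?")
print(tokens)
```

Note that deleting the apostrophe merges "what's" into "whats"; whether that is acceptable depends on the data.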


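### iii) Feature Extraction
* Convert the text data into numerical features that the machine learning model can work with. One common approach is to use TF-IDF (Term Frequency-Inverse Document Frequency) vectorization. The cell below passes the raw input strings to `TfidfVectorizer`, which lowercases and tokenizes them itself by default, so the `input_tokens` from the previous step are not used here.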

In [3]:
from sklearn.feature_extraction.text import TfidfVectorizer

# Create a TF-IDF vectorizer
vectorizer = TfidfVectorizer()

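# Fit and transform the raw input strings.
# Note: fitting on the full dataset before the train/test split lets test vocabulary leak into the features.
X = vectorizer.fit_transform([example["input"] for example in dataset])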
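
To make the weighting concrete, here is a small pure-Python sketch of TF-IDF. It assumes the smoothed formula `idf = ln((1 + n) / (1 + df)) + 1` followed by L2 normalization, which matches scikit-learn's documented defaults, but it uses a simplified whitespace tokenizer, so it is an illustration rather than a reimplementation of `TfidfVectorizer`. The function name `tfidf` and the sample sentences are made up for this example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (illustrative sketch)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    # Smoothed inverse document frequency.
    idf = {term: math.log((1 + n_docs) / (1 + count)) + 1.0
           for term, count in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights = {term: tf[term] * idf[term] for term in tf}
        # L2-normalize so every non-empty document vector has length 1.
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({term: w / norm for term, w in weights.items()})
    return vectors


docs = ["set an alarm", "set a timer", "tell me a joke"]
vectors = tfidf(docs)
print(vectors[0])
```

In the first document, "alarm" occurs in one document and "set" in two, so "alarm" receives the larger weight.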


### iv) Label Encoding
* Encode the intents into numerical labels. You can use scikit-learn's LabelEncoder for this.

In [4]:
from sklearn.preprocessing import LabelEncoder

# Encode intents
label_encoder = LabelEncoder()
y = label_encoder.fit_transform([example["intent"] for example in dataset])
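
Conceptually, label encoding assigns each distinct intent an integer index over the sorted class names, and decoding reverses that lookup. The pure-Python sketch below illustrates the idea; the variable names are chosen for this example and are not scikit-learn attributes.

```python
# Intents in the same order as the three examples in the dataset.
intents = ["Weather", "Entertainment", "Reminders"]

# Sorted unique class names define the integer codes.
classes = sorted(set(intents))
class_to_id = {name: index for index, name in enumerate(classes)}

# Encode: intent name -> integer label.
encoded = [class_to_id[name] for name in intents]

# Decode: integer label -> intent name.
decoded = [classes[index] for index in encoded]

print(classes)
print(encoded)
```

The decode step plays the role that `inverse_transform` plays at inference time.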


### v) Model Selection and Training
* Choose a machine learning model for intent recognition. A popular choice is the Support Vector Machine (SVM). Train the model using the preprocessed data.

In [5]:
from sklearn.svm import SVC

# Create an SVM classifier
clf = SVC(kernel='linear')

# Train the classifier
clf.fit(X, y)


SVC(kernel='linear')

### vi) Testing
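* Evaluate the model's performance on a separate test dataset. You can split your dataset into training and testing sets to do this. With only the three examples defined above, a 20% test split holds out one example whose intent never appears in the training data, so the accuracy of 0.0 shown below is expected; a meaningful evaluation needs several examples per intent.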

In [6]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train the classifier on the training data
clf.fit(X_train, y_train)

# Predict intents on the test data
y_pred = clf.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)


Accuracy: 0.0
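
One way to get a more meaningful evaluation is sketched below, under two assumptions: the dataset has several examples per intent (the twelve sentences here are invented for illustration), and the vectorizer is fitted only on the training split by wrapping it in a scikit-learn pipeline. The split is stratified so that every intent appears in both sets. Intent strings are passed directly as labels, which scikit-learn classifiers accept.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical examples, four per intent, invented for this sketch.
examples = [
    ("What's the weather like today?", "Weather"),
    ("Will it rain tomorrow?", "Weather"),
    ("Is it going to snow this weekend?", "Weather"),
    ("How hot is it outside?", "Weather"),
    ("Tell me a joke.", "Entertainment"),
    ("Play some music.", "Entertainment"),
    ("Recommend a movie for tonight.", "Entertainment"),
    ("Sing me a song.", "Entertainment"),
    ("Set an alarm for 8 AM.", "Reminders"),
    ("Remind me to call mom at noon.", "Reminders"),
    ("Wake me up at 6 tomorrow.", "Reminders"),
    ("Add a reminder for my meeting.", "Reminders"),
]
texts = [text for text, _ in examples]
intents = [intent for _, intent in examples]

# Split the raw text first, keeping one test example per intent.
texts_train, texts_test, y_train, y_test = train_test_split(
    texts, intents, test_size=0.25, random_state=42, stratify=intents
)

# The pipeline fits TF-IDF on the training text only, then trains the SVM.
model = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
model.fit(texts_train, y_train)

y_pred = model.predict(texts_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
```

A dataset this small still gives a noisy estimate; the point of the sketch is the order of operations (split, then fit), not the resulting number.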


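### vii) Inference
* Once your model is trained and performs well, you can use it to predict intents for new user inputs. Note that the classifier was last fitted on only the training split from the testing step, so predictions from this three-example dataset are not reliable.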

In [7]:
# Tokenize a new user input (shown for parity with preprocessing; the vectorizer below takes the raw string)
new_input = "What's the weather like tomorrow?"
new_input_tokens = word_tokenize(new_input.lower())

# Transform the new input using the same TF-IDF vectorizer
new_input_vectorized = vectorizer.transform([new_input])

# Predict the intent
predicted_intent = clf.predict(new_input_vectorized)

# Decode the predicted intent
predicted_intent = label_encoder.inverse_transform(predicted_intent)
print("Predicted Intent:", predicted_intent[0])


Predicted Intent: Reminders
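
The inference steps above can be wrapped in a small helper so that the vectorizer, classifier, and label encoder are always applied in the same order. The function name `predict_intent` and the six training sentences are hypothetical; the sketch refits its own small model so that it runs on its own.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC


def predict_intent(text, vectorizer, clf, label_encoder):
    """Return the predicted intent name for a single input string.

    Hypothetical helper; assumes all three objects are already fitted.
    """
    features = vectorizer.transform([text])
    label_id = clf.predict(features)
    return label_encoder.inverse_transform(label_id)[0]


# Small invented training set, two examples per intent.
train_texts = [
    "What's the weather like today?",
    "Will it rain tomorrow?",
    "Tell me a joke.",
    "Play some music.",
    "Set an alarm for 8 AM.",
    "Remind me to call mom at noon.",
]
train_intents = [
    "Weather", "Weather",
    "Entertainment", "Entertainment",
    "Reminders", "Reminders",
]

vectorizer = TfidfVectorizer()
label_encoder = LabelEncoder()
clf = SVC(kernel="linear")
clf.fit(vectorizer.fit_transform(train_texts),
        label_encoder.fit_transform(train_intents))

result = predict_intent("What's the weather like tomorrow?",
                        vectorizer, clf, label_encoder)
print("Predicted Intent:", result)
```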
