# Problem Statement: Multi-Intent Classification for Customer Support Chatbot

## Data Preparation

Because we do not have enough data to train our model, let's use data augmentation techniques to create more training data.

# step-1:
Gather possible queries for each intent, for example:<br>

<code>"Product Inquiry": [
        "What are the features of this laptop?",
        "Is this phone available?",
        "What is the price of the new headphones?"]</code>
# step-2:
Let's replace some words with synonyms.
Example:<br>
<code>How do I get a refund?</code>
        <br>After applying synonyms, the above query would become<br>
        <code>How do I get a {repayment}?</code><br>
        <code>How do I get a {money back}?</code>
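
The substitution above can be sketched in a few lines of Python. This is a minimal illustration rather than the notebook's final implementation: the helper name `replace_with_synonyms` and the small `synonyms` table are assumptions made for this example, and it returns every variant instead of sampling one at random.

```python
import re

# Illustrative synonym table (a small subset chosen for this example).
synonyms = {
    "refund": ["repayment", "money back"],
    "order": ["purchase", "shipment"],
}

def replace_with_synonyms(query, table):
    """Return one variant of `query` per synonym of each table word found in it."""
    variants = []
    for word, alternatives in table.items():
        # \b keeps "order" from matching inside a longer word such as "reorder".
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags=re.IGNORECASE)
        if pattern.search(query):
            for alt in alternatives:
                variants.append(pattern.sub(alt, query))
    return variants

print(replace_with_synonyms("How do I get a refund?", synonyms))
# ['How do I get a repayment?', 'How do I get a money back?']
```

Matching on whole words is a design choice for this sketch; plain substring matching would also rewrite words that merely contain a key.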
        
# step-3:
Let's use some templates to rephrase the intent queries.
Using these templates makes our queries more diverse and coherent.
For example, applying the template <code>Can you tell me {}?</code> to <code>What are the features of this laptop?</code> <br>would change the existing query into<br> <code>Can you tell me {What are the features of this laptop?}?</code>
<br>
Contractions are also expanded here, for example <code>"isn't": "is not"</code>
        <code>"aren't": "are not"</code>
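
As a rough sketch of this step, the snippet below fills a template with a query and expands contractions with a single regular expression. The names `apply_template` and `expand`, and the two-entry tables, are hypothetical and exist only for this illustration.

```python
import re

# Assumed subsets of the templates and contractions, for illustration only.
templates = ["Can you tell me {}?", "Need help with {}."]
contractions = {"isn't": "is not", "aren't": "are not"}

def apply_template(template, query):
    """Insert `query` into the template's {} placeholder, unchanged."""
    return template.format(query)

def expand(text):
    """Replace each known contraction with its expanded form (case-insensitive)."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(c) for c in contractions) + r")\b",
        flags=re.IGNORECASE,
    )
    return pattern.sub(lambda m: contractions[m.group().lower()], text)

print(apply_template(templates[0], "What are the features of this laptop?"))
# Can you tell me What are the features of this laptop??
print(expand("This isn't what I ordered"))
# This is not what I ordered
```

Because the query keeps its own question mark, the templated result ends in `??`.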
    
    
# step-4:
Now, using the above three steps, we augment the data by applying synonyms and templates with random sampling. Queries from two or more intents are then joined into a single multi-intent query, which generates many diverse combinations.
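
To get a feel for how many distinct label sets such sampling can produce, the sketch below enumerates every unordered group of intents. It assumes the notebook's four intent names and group sizes of 2 to 3; `intent_combinations` is a hypothetical helper, not part of the notebook.

```python
from itertools import combinations

# Assumed to mirror the four intents used in this notebook.
intent_names = ["Product Inquiry", "Order Tracking", "Refund Request", "Store Policy"]

def intent_combinations(names, min_size=2, max_size=3):
    """List every unordered group of intents whose size is in [min_size, max_size]."""
    groups = []
    for k in range(min_size, max_size + 1):
        groups.extend(combinations(names, k))
    return groups

groups = intent_combinations(intent_names)
print(len(groups))  # C(4, 2) + C(4, 3) = 6 + 4 = 10
```

With only 10 possible label sets, most of the variety in the generated data has to come from the phrases, synonyms, and templates rather than from the labels.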

In [1]:
import re
import random
import math
import csv
import pandas as pd
from datetime import datetime
from itertools import combinations

In [2]:


# Original intents dictionary
intents = {
    "Product Inquiry": [
        "What are the features of this laptop?",
        "Is this phone available?",
        "What is the price of the new headphones?",
        "Do you have this product in stock?",
        "Expected availability date for the product",
        "What are the different color options that are available for the product?",
        "Help me with the products that have discounts",
    ],
    "Order Tracking": [
        "Where is my order?",
        "How long will delivery take?",
        "Can you provide the tracking details?",
        "I want to check the status of my shipment.",
        "There is delay in the order delivery, can you please let me know the reason",
        "System shows that order is delivered but I have not received any order",
        "I've been waiting for the order long time"
    ],
    "Refund Request": [
        "How do I get a refund?",
        "Can I return my order?",
        "What is the process for a refund?",
        "Can I cancel my order and get a refund?",
        "It's been long time since I have raised the refund, but amount is not credited",
        "When can I expect the refund to be processed",
        "I don't want this product anymore",
        "Product I received is different from the one that I placed order, need help with refund",
    ],
    "Store Policy": [
        "What is your return policy?",
        "Do you offer free shipping?",
        "Can you explain your warranty terms?",
        "What are the delivery charges?",
        "What are the options for free delivery",
    ],
}

# Simple paraphrase templates
paraphrase_templates = [
    "Can you tell me {}?",
    "I need info on {}.",
    "Could you explain {}?",
    "I'm looking to know {}.",
    "Please help me understand {}.",
    "Would like to get details about {}.",
    "{} please?",
    "Need help with {}.",
]

# Synonym dictionary
synonyms = {
    "product": ["item", "goods"],
    "refund": ["repayment", "money back"],
    "order": ["purchase", "shipment"],
    "delivery": ["shipping", "dispatch"],
    "available": ["in stock", "in store"],
    "features": ["specs", "specifications"],
    "price": ["cost", "rate"]
}

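def augment_phrase(phrase):
    """Swap known words for a random synonym, then wrap the phrase in a random template."""
    modified = phrase
    for key, syns in synonyms.items():
        if key in modified.lower():
            replacement = random.choice(syns)
            # Case-insensitive, so capitalized words such as "Product" are replaced too
            modified = re.sub(re.escape(key), replacement, modified, flags=re.IGNORECASE)
    template = random.choice(paraphrase_templates)
    return template.format(modified.lower().capitalize())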

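def generate_multi_intent_data(intents_dict, total_samples=100, max_combination_size=2):
    """Build (query, intent_list) pairs by joining phrases from two or more random intents."""
    intent_names = list(intents_dict.keys())
    # Never ask for more intents than exist; assumes at least two intents are defined
    max_k = max(2, min(max_combination_size, len(intent_names)))
    synthetic_data = []

    while len(synthetic_data) < total_samples:
        # Randomly select how many intents to combine
        k = random.randint(2, max_k)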
        intent_combo = random.sample(intent_names, k)

        queries = []
        for intent in intent_combo:
            phrase = random.choice(intents_dict[intent])
            if random.random() < 0.7:
                phrase = augment_phrase(phrase)
            queries.append(phrase)

        full_query = " ".join(queries)
        synthetic_data.append((full_query.strip(), intent_combo))

    return synthetic_data

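def save_to_csv(data, filename="multi_intent.csv"):
    """Write preprocessed (text, intents) rows to a CSV file with a header row."""
    with open(filename, mode='w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(["text", "intents"])
        # `labels` avoids shadowing the global `intents` dictionary
        for text, labels in data:
            text = preprocess_query(text)
            # The label list is written in its string form, e.g. "['A', 'B']"
            writer.writerow([text, labels])
    print(f"Data saved to {filename}")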

def expand_contractions(text):
    """Expand common English contractions, matching case-insensitively."""
    # Only word-like keys belong here: the \b anchors used below do not match
    # around a punctuation-only key such as "??" at the end of a sentence,
    # so repeated question marks pass through unchanged.
    contractions_dict = {
        "don't": "do not",
        "can't": "cannot",
        "won't": "will not",
        "i'm": "i am",
        "you're": "you are",
        "it's": "it is",
        "i've": "i have",
        "we've": "we have",
        "they've": "they have",
        "i'll": "i will",
        "you'll": "you will",
        "they'll": "they will",
        "isn't": "is not",
        "aren't": "are not",
        "wasn't": "was not",
        "weren't": "were not",
        "couldn't": "could not",
        "shouldn't": "should not",
        "wouldn't": "would not",
        "doesn't": "does not",
        "didn't": "did not",
        "haven't": "have not",
        "hasn't": "has not",
        "hadn't": "had not",
        "that'll": "that will",
    }

    # Escape all keys before joining them into one alternation
    pattern = re.compile(
        r'\b(' + '|'.join(re.escape(k) for k in contractions_dict) + r')\b',
        flags=re.IGNORECASE,
    )
    # Look up the lowercased match, so the expansion is always lowercase
    return pattern.sub(lambda x: contractions_dict[x.group().lower()], text)



def preprocess_query(query):
    query = expand_contractions(query)
    query = query.lower()
    
    return query
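
Generated queries can end up with doubled end punctuation such as `??` or `?.` when a template is wrapped around a query that already ends in a mark. One possible cleanup, shown here as a sketch, is to collapse such runs in a separate pass. `collapse_punctuation` is a hypothetical helper and is not called anywhere in this notebook.

```python
import re

def collapse_punctuation(text):
    """Collapse runs of sentence-ending marks, keeping '?' if the run contains one."""
    def pick(match):
        run = match.group()
        return "?" if "?" in run else run[0]
    return re.sub(r"[?.!]{2,}", pick, text)

print(collapse_punctuation("could you explain what is your return policy?? i need info on this item?."))
# could you explain what is your return policy? i need info on this item?
```

Preferring `?` over `.` is an assumption that suits this dataset, where most source phrases are questions.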

# Applying all augmentation techniques to generate 100 samples of data

In [3]:
multi_intent_dataset = generate_multi_intent_data(intents, total_samples=100, max_combination_size=3)

save_to_csv(multi_intent_dataset, "prepared_data.csv")
# Preview
for sample in multi_intent_dataset[:5]:
    print(sample)


Data saved to prepared_data.csv
('Could you explain What is your return policy?? I need info on Do you have this goods in stock?.', ['Store Policy', 'Product Inquiry'])
("What are the different color options that are in stock for the item? please? Would like to get details about What is your return policy?. Can you tell me I've been waiting for the shipment long time?", ['Product Inquiry', 'Store Policy', 'Order Tracking'])
("Can you explain your warranty terms? Would like to get details about I don't want this goods anymore.", ['Store Policy', 'Refund Request'])
("Would like to get details about Can you provide the tracking details?. Need help with Do you offer free shipping?. Could you explain I don't want this item anymore?", ['Order Tracking', 'Store Policy', 'Refund Request'])
("Could you explain Do you have this goods in stock?? It's been long time since I have raised the refund, but amount is not credited Could you explain How long will shipping take??", ['Product Inquiry', 'Ref

In [6]:
data = pd.read_csv('prepared_data.csv')
data.head()

| | text | intents |
|---|---|---|
| 0 | could you explain what is your return policy??... | ['Store Policy', 'Product Inquiry'] |
| 1 | what are the different color options that are ... | ['Product Inquiry', 'Store Policy', 'Order Tra... |
| 2 | can you explain your warranty terms? would lik... | ['Store Policy', 'Refund Request'] |
| 3 | would like to get details about can you provid... | ['Order Tracking', 'Store Policy', 'Refund Req... |
| 4 | could you explain do you have this goods in st... | ['Product Inquiry', 'Refund Request', 'Order T... |
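
Note that the `intents` column comes back from the CSV as a string that looks like a list, not as an actual list. A minimal sketch of turning it back into labels and a multi-hot vector follows. `parse_intents`, `to_multi_hot`, and the `ALL_INTENTS` ordering are assumptions introduced for this example.

```python
import ast

# Assumed fixed ordering of the four intents; any consistent order would work.
ALL_INTENTS = ["Product Inquiry", "Order Tracking", "Refund Request", "Store Policy"]

def parse_intents(cell):
    """Turn a stringified list such as "['A', 'B']" back into a Python list."""
    return ast.literal_eval(cell)

def to_multi_hot(labels, all_intents=ALL_INTENTS):
    """Encode a list of intent names as a 0/1 vector in `all_intents` order."""
    return [1 if name in labels else 0 for name in all_intents]

cell = "['Store Policy', 'Refund Request']"  # as stored in the intents column
labels = parse_intents(cell)
print(labels)                # ['Store Policy', 'Refund Request']
print(to_multi_hot(labels))  # [0, 0, 1, 1]
```

With pandas, the same parsing could be applied to the whole column via `data["intents"].apply(ast.literal_eval)`.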


From here on, we will use this data for model training.