<h1>Automated Response Generation for Customer Support</h1>

**Step-by-Step Guide**
1. Import Required Libraries
2. Load and Preprocess Data
3. Tokenize the Data
4. Create the Seq2Seq Model
5. Train the Model
6. Test the Model

<h3>Import Required Libraries</h3>

In [1]:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences


<h3>Load and Preprocess Data</h3>

In [2]:
# Load dataset
data = pd.read_csv(r"C:\Users\shiva\Downloads\Customer-Support.csv")
data.head()

Unnamed: 0,query,response
0,My order hasn't arrived yet.,We apologize for the inconvenience. Can you pl...
1,I received a damaged product.,We apologize for the inconvenience. Can you pl...
2,I need to return an item.,Certainly. Please provide your order number an...
3,I want to change my shipping address.,No problem. Can you please provide your order ...
4,I have a question about my bill.,We'd be happy to help. Can you please provide ...


In [3]:
data.count()

query       74
response    74
dtype: int64

In [4]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 74 entries, 0 to 73
Data columns (total 2 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   query     74 non-null     object
 1   response  74 non-null     object
dtypes: object(2)
memory usage: 1.3+ KB


In [5]:
# Extract queries and responses
queries = data['query'].values
responses = data['response'].values

In [6]:
# Print the first five queries
for query in queries[:5]:
    print(query)

My order hasn't arrived yet.
I received a damaged product.
I need to return an item.
I want to change my shipping address.
I have a question about my bill.


In [7]:
# Print the first five responses
for response in responses[:5]:
    print(response)

We apologize for the inconvenience. Can you please provide your order number so we can investigate?
We apologize for the inconvenience. Can you please provide a photo of the damaged product so we can assist you further?
Certainly. Please provide your order number and reason for return, and we will provide you with instructions on how to proceed.
No problem. Can you please provide your order number and the new shipping address you'd like to use?
We'd be happy to help. Can you please provide your account number and a brief description of your question?


In [8]:
# Add start and end tokens to responses
responses = ['<start> ' + response + ' <end>' for response in responses]

In [9]:
# Print the first five responses with start and end tokens
for response in responses[:5]:
    print(response)

<start> We apologize for the inconvenience. Can you please provide your order number so we can investigate? <end>
<start> We apologize for the inconvenience. Can you please provide a photo of the damaged product so we can assist you further? <end>
<start> Certainly. Please provide your order number and reason for return, and we will provide you with instructions on how to proceed. <end>
<start> No problem. Can you please provide your order number and the new shipping address you'd like to use? <end>
<start> We'd be happy to help. Can you please provide your account number and a brief description of your question? <end>


<h3>Tokenize the Data</h3>

In [10]:
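# Define tokenizers. filters='' turns off punctuation stripping, so the
# '<start>' and '<end>' markers survive as tokens (punctuation also stays attached to words).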
query_tokenizer = Tokenizer(filters='')
response_tokenizer = Tokenizer(filters='')

In [11]:
# Fit tokenizers on the data
query_tokenizer.fit_on_texts(queries)
response_tokenizer.fit_on_texts(responses)

# With filters='' the '<start>' and '<end>' markers are kept during fitting,
# so both must already be in the vocabulary; fail early if they are not.
assert '<start>' in response_tokenizer.word_index
assert '<end>' in response_tokenizer.word_index

In [12]:
# Convert text to sequences
query_sequences = query_tokenizer.texts_to_sequences(queries)
response_sequences = response_tokenizer.texts_to_sequences(responses)
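
As a rough, dependency-free illustration of what the tokenizer cells above do with `filters=''`, the sketch below lowercases each text, splits it on whitespace, and assigns integer ids by frequency. `build_word_index` and `to_sequences` are hypothetical helpers written for this example, not Keras functions, and the real `Tokenizer` has more options than shown here.

```python
from collections import Counter

def build_word_index(texts):
    """Map each lowercase, whitespace-separated token to an id (1 = most frequent)."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    # Id 0 is left unused so it can serve as the padding value later on.
    ordered = sorted(counts.items(), key=lambda item: -item[1])
    return {word: idx for idx, (word, _) in enumerate(ordered, start=1)}

def to_sequences(texts, word_index):
    """Replace every known token with its id; unknown tokens are dropped."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]

sample = ["<start> We apologize for the inconvenience. <end>",
          "<start> No problem. <end>"]
word_index = build_word_index(sample)
sequences = to_sequences(sample, word_index)

print(word_index)
print(sequences)
```

Because nothing is filtered, `inconvenience.` and `problem.` keep their trailing period and count as distinct vocabulary entries, which is one reason the vocabulary grows quickly on a small dataset.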

In [13]:
# Pad sequences
max_query_len = max(len(seq) for seq in query_sequences)
max_response_len = max(len(seq) for seq in response_sequences)

query_padded = pad_sequences(query_sequences, maxlen=max_query_len, padding='post')
response_padded = pad_sequences(response_sequences, maxlen=max_response_len, padding='post')
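
The effect of `padding='post'` with `maxlen` set to the longest sequence can be sketched in plain Python. `pad_post` is a hypothetical helper for illustration only; it covers just the case used here, where no sequence is longer than `maxlen` and nothing needs to be truncated.

```python
def pad_post(sequences, maxlen, value=0):
    """Right-pad each list of token ids with `value` until it has `maxlen` entries."""
    return [seq + [value] * (maxlen - len(seq)) for seq in sequences]

# Made-up token ids of different lengths.
sequences = [[3, 7, 2], [5, 1], [4, 9, 8, 6]]
maxlen = max(len(seq) for seq in sequences)
padded = pad_post(sequences, maxlen)

for row in padded:
    print(row)
```

Every row ends up with the same length, which is what lets the batch be stored as a single rectangular array.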

In [14]:
# Teacher forcing: the decoder input is the response without its last token,
# and the target is the same response shifted one position to the left.
decoder_input_data = response_padded[:, :-1]
decoder_target_data = response_padded[:, 1:]
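
To make the one-step shift concrete, here is a small sketch on a single padded row. The token ids are made up for illustration; only the slicing mirrors the cell above.

```python
# Made-up ids for "<start> no problem. <end>" followed by two padding zeros.
START, END, PAD = 1, 2, 0
response_row = [START, 8, 9, END, PAD, PAD]

decoder_input = response_row[:-1]   # what the decoder is fed at each step
decoder_target = response_row[1:]   # what it should predict at that step

for step, (seen, expected) in enumerate(zip(decoder_input, decoder_target)):
    print(f"step {step}: input={seen} -> target={expected}")
```

At step 0 the decoder sees `<start>` and must predict the first real word; at the step where it sees the last word it must predict `<end>`. Since the `Embedding` layers in this notebook are created without `mask_zero=True`, the padded steps also contribute to the loss.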

<h3>Create the Seq2Seq Model</h3>

In [15]:
# Define model parameters (+1 on the vocabulary sizes because index 0 is reserved for padding)
vocab_size_query = len(query_tokenizer.word_index) + 1
vocab_size_response = len(response_tokenizer.word_index) + 1
embedding_dim = 256
units = 512

In [16]:
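# Encoder: embed the query tokens and keep only the final LSTM states,
# which summarize the query for the decoder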
encoder_inputs = tf.keras.layers.Input(shape=(max_query_len,))
encoder_embedding = tf.keras.layers.Embedding(vocab_size_query, embedding_dim)(encoder_inputs)
encoder_outputs, state_h, state_c = tf.keras.layers.LSTM(units, return_state=True)(encoder_embedding)
encoder_states = [state_h, state_c]

In [17]:
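# Decoder: starts from the encoder states and predicts a distribution
# over the response vocabulary at every time step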
decoder_inputs = tf.keras.layers.Input(shape=(max_response_len - 1,))
decoder_embedding = tf.keras.layers.Embedding(vocab_size_response, embedding_dim)(decoder_inputs)
decoder_lstm = tf.keras.layers.LSTM(units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_embedding, initial_state=encoder_states)
decoder_dense = tf.keras.layers.Dense(vocab_size_response, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

In [18]:
# Define the model
model = tf.keras.models.Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.summary()

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, 11)]         0           []                               
                                                                                                  
 input_2 (InputLayer)           [(None, 33)]         0           []                               
                                                                                                  
 embedding (Embedding)          (None, 11, 256)      52992       ['input_1[0][0]']                
                                                                                                  
 embedding_1 (Embedding)        (None, 33, 256)      83200       ['input_2[0][0]']                
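
The summary shown here is cut off after the embedding rows, but the parameter counts can be checked by hand. The sketch below infers the two vocabulary sizes from the `Param #` column (an inference from the printed numbers, not values reported by the notebook) and applies the standard parameter formulas for `Embedding`, `LSTM`, and `Dense` layers with biases. The helper functions are hypothetical names used only for this example.

```python
embedding_dim = 256
units = 512

# Vocabulary sizes inferred from the Embedding rows: Param # / embedding_dim.
vocab_size_query = 52992 // embedding_dim
vocab_size_response = 83200 // embedding_dim

def embedding_params(vocab_size, dim):
    # One dim-sized vector per vocabulary entry.
    return vocab_size * dim

def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias.
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, output_dim):
    # Weight matrix plus one bias per output unit.
    return input_dim * output_dim + output_dim

print("query vocabulary:", vocab_size_query)
print("response vocabulary:", vocab_size_response)
print("each LSTM:", lstm_params(embedding_dim, units))
print("output Dense:", dense_params(units, vocab_size_response))
```

Under these assumptions each LSTM holds far more parameters than either embedding, which is a lot of capacity for a dataset of 74 query and response pairs.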
                                                                                              

<h3>Train the Model</h3>

In [19]:
# Train the model
batch_size = 64
epochs = 100

model.fit([query_padded, decoder_input_data], 
          decoder_target_data, 
          batch_size=batch_size, 
          epochs=epochs, 
          validation_split=0.2)

Epoch 1/100
...

<keras.callbacks.History at 0x1ef91ec79d0>
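
With only 74 samples, it helps to see what `validation_split=0.2` and `batch_size=64` amount to. Keras documents that the validation data is taken from the last samples, before shuffling; the sketch below mimics that in plain Python. `split_last_fraction` is a hypothetical helper, and rounding the split point down is an assumption of this sketch.

```python
import math

def split_last_fraction(num_samples, validation_split):
    """Return (train_indices, val_indices), with validation taken from the end."""
    split_at = int(math.floor(num_samples * (1.0 - validation_split)))
    indices = list(range(num_samples))
    return indices[:split_at], indices[split_at:]

train_idx, val_idx = split_last_fraction(74, 0.2)
steps_per_epoch = math.ceil(len(train_idx) / 64)

print("training samples:", len(train_idx))
print("validation samples:", len(val_idx))
print("gradient steps per epoch:", steps_per_epoch)
```

Under these assumptions each epoch is a single gradient step on 59 pairs, and the 15 held-out pairs are always the last rows of the CSV, so the validation loss is only a rough signal.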

In [20]:
# Define the encoder model for inference: it maps a padded query to the LSTM states [state_h, state_c]
encoder_model = tf.keras.models.Model(encoder_inputs, encoder_states)

In [21]:
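# Define the decoder model for inference. It reuses the trained decoder layers but takes
# the previous LSTM states as explicit inputs, so decoding can proceed one token at a time.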
decoder_state_input_h = tf.keras.layers.Input(shape=(units,))
decoder_state_input_c = tf.keras.layers.Input(shape=(units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

decoder_outputs, state_h, state_c = decoder_lstm(decoder_embedding, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)

decoder_model = tf.keras.models.Model(
    [decoder_inputs] + decoder_states_inputs,
    [decoder_outputs] + decoder_states)

In [22]:
# Decode function
def decode_sequence(input_seq):
    # Encode the input as state vectors.
    states_value = encoder_model.predict(input_seq)

    # Generate empty target sequence of length 1.
    target_seq = np.zeros((1, 1))
    target_seq[0, 0] = response_tokenizer.word_index['<start>']

    # Greedy decoding loop for a single sequence (batch size 1)
    stop_condition = False
    decoded_sentence = ''
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict(
            [target_seq] + states_value)

        # Pick the most likely next token (greedy arg-max, no random sampling)
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        sampled_word = response_tokenizer.index_word[sampled_token_index]
        decoded_sentence += ' ' + sampled_word

        # Exit condition: stop on the '<end>' token, or once the decoded string is
        # longer than 2 * max_response_len. Note that len(decoded_sentence) counts
        # characters, not tokens.
        if (sampled_word == '<end>' or
           len(decoded_sentence) > 2 * max_response_len):
            stop_condition = True

        # Update the target sequence (of length 1).
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = sampled_token_index

        # Update states
        states_value = [h, c]

    return decoded_sentence
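
The length check in `decode_sequence` compares a character count against a limit derived from a token count, which is a likely reason the generated responses in the test output stop mid-sentence. The sketch below shows one possible alternative that caps the number of generated tokens instead. It is not the notebook's implementation: `greedy_decode`, `toy_step`, and the scripted vocabulary are hypothetical stand-ins so the loop can run without a trained model.

```python
def greedy_decode(step_fn, initial_state, start_id, end_id, index_word, max_tokens):
    """Greedy decoding that stops on the end token or after `max_tokens` tokens."""
    token_id, state = start_id, initial_state
    words = []
    for _ in range(max_tokens):
        scores, state = step_fn(token_id, state)
        # Choose the highest-scoring token id (arg-max over the vocabulary).
        token_id = max(range(len(scores)), key=scores.__getitem__)
        # Stop on '<end>' or on an id with no word (for example padding id 0).
        if token_id == end_id or token_id not in index_word:
            break
        words.append(index_word[token_id])
    return ' '.join(words)

# Toy stand-in for the decoder: it replays a fixed script of token ids.
index_word = {1: '<start>', 2: '<end>', 3: 'no', 4: 'problem.'}
script = [3, 4, 2]

def toy_step(token_id, state):
    # `state` is the position in the script; `scores` is a one-hot list.
    next_id = script[state]
    scores = [1.0 if i == next_id else 0.0 for i in range(5)]
    return scores, state + 1

result = greedy_decode(toy_step, 0, start_id=1, end_id=2,
                       index_word=index_word, max_tokens=10)
print(result)
```

In the notebook, `step_fn` would wrap the call to `decoder_model.predict([target_seq] + states_value)` and `max_tokens` could be set to `max_response_len`. This version also keeps `<end>` out of the returned text and avoids a `KeyError` when the arg-max lands on the padding id.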

<h3>Test the Model</h3>

In [23]:
for seq_index in range(10):
    input_seq = query_padded[seq_index: seq_index + 1]
    decoded_sentence = decode_sequence(input_seq)
    print('-')
    print('Query:', queries[seq_index])
    print('Response:', decoded_sentence)

-
Query: My order hasn't arrived yet.
Response:  we apologize for the inconvenience. can you please provide your account email
-
Query: I received a damaged product.
Response:  we apologize for the inconvenience. can you please provide your account email
-
Query: I need to return an item.
Response:  certainly. can you please provide the product and and the and the you're you'd
-
Query: I want to change my shipping address.
Response:  certainly. can you please provide your order number and the and the the you'd
-
Query: I have a question about my bill.
Response:  we apologize for the inconvenience. can you please provide your account email
-
Query: How do I cancel my subscription?
Response:  we apologize for the can you please provide your account email so we can can
-
Query: Can I get a refund for my purchase?
Response:  certainly. can you please provide your your order number and the the the you'd
-
Query: I'd like to track my order.
Response:  certainly. can you please provide your o

**END**