# Problem Synopsis

In this notebook, we will examine how to combine Natural Language Processing and Neural Networks for intent classification and slot filling.



*   **Intent Classification:** The task of identifying the user's goal or purpose behind a piece of text or speech. It's a core component in conversational AI systems like chatbots.

*   **Slot Filling:** A task that involves identifying and extracting specific pieces of information (or "slots") from a user's query or sentence. These slots represent important parameters or attributes related to the user's intent.

We will be using the ATIS dataset, which is a standard benchmark dataset widely used to build models for intent classification and slot filling tasks.
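To make the two definitions concrete, here is a minimal sketch that pairs each token of a query with its BIO slot label and groups the labelled tokens into (slot, value) pairs. The helper name `extract_slots` is hypothetical and not part of this notebook; the query and labels are copied from a row of the training data displayed further down.

```python
def extract_slots(query, slot_labels):
    """Group BIO-labelled tokens into (slot_name, value) pairs."""
    tokens = query.split()
    labels = slot_labels.split()
    if len(tokens) != len(labels):
        raise ValueError("query and slot labels must have the same number of tokens")

    slots = []
    current_name, current_tokens = None, []

    def flush():
        # Store the slot collected so far, if any
        if current_name is not None:
            slots.append((current_name, " ".join(current_tokens)))

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A "B-" label always starts a new slot
            flush()
            current_name, current_tokens = label[2:], [token]
        elif label.startswith("I-") and current_name == label[2:]:
            # An "I-" label continues the slot that is currently open
            current_tokens.append(token)
        else:
            # "O" (or an "I-" label with no matching open slot) closes the slot
            flush()
            current_name, current_tokens = None, []
    flush()
    return slots


intent = "airfare"  # the intent label of this row
pairs = extract_slots(
    "cheapest airfare from tacoma to orlando",
    "B-cost_relative O O B-fromloc.city_name O B-toloc.city_name",
)
print(intent, pairs)
# airfare [('cost_relative', 'cheapest'), ('fromloc.city_name', 'tacoma'), ('toloc.city_name', 'orlando')]
```

The intent is a single label for the whole query, while slot filling assigns one label to every token, so the second task is a sequence labelling problem.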





# Library Imports

In [None]:
!pip install --upgrade keras-nlp-nightly



In [None]:
import pandas as pd
import numpy as np
#import pickle
#import os
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow import keras
import keras_nlp



keras.utils.set_random_seed(42)
pd.set_option('display.max_colwidth', None)

# Data Imports

In [None]:
from google.colab import drive
drive.mount('/content/drive')


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
df_train = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Code Portfolio/NN - NLP Intent classification/atis_train_data.csv')
df_test = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Code Portfolio/NN - NLP Intent classification/atis_test_data.csv')

In [None]:
df_train.head()

Unnamed: 0.1,Unnamed: 0,query,intent,slot filling
0,0,i want to fly from boston at 838 am and arrive in denver at 1110 in the morning,flight,O O O O O B-fromloc.city_name O B-depart_time.time I-depart_time.time O O O B-toloc.city_name O B-arrive_time.time O O B-arrive_time.period_of_day
1,1,what flights are available from pittsburgh to baltimore on thursday morning,flight,O O O O O B-fromloc.city_name O B-toloc.city_name O B-depart_date.day_name B-depart_time.period_of_day
2,2,what is the arrival time in san francisco for the 755 am flight leaving washington,flight_time,O O O B-flight_time I-flight_time O B-fromloc.city_name I-fromloc.city_name O O B-depart_time.time I-depart_time.time O O B-fromloc.city_name
3,3,cheapest airfare from tacoma to orlando,airfare,B-cost_relative O O B-fromloc.city_name O B-toloc.city_name
4,4,round trip fares from pittsburgh to philadelphia under 1000 dollars,airfare,B-round_trip I-round_trip O O B-fromloc.city_name O B-toloc.city_name B-cost_relative B-fare_amount I-fare_amount


In [None]:
df_test.head()

Unnamed: 0.1,Unnamed: 0,query,intent,slot filling
0,0,i would like to find a flight from charlotte to las vegas that makes a stop in st. louis,flight,O O O O O O O O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name O O O O O B-stoploc.city_name I-stoploc.city_name
1,1,on april first i need a ticket from tacoma to san jose departing before 7 am,airfare,O B-depart_date.month_name B-depart_date.day_number O O O O O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name O B-depart_time.time_relative B-depart_time.time I-depart_time.time
2,2,on april first i need a flight going from phoenix to san diego,flight,O B-depart_date.month_name B-depart_date.day_number O O O O O O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name
3,3,i would like a flight traveling one way from phoenix to san diego on april first,flight,O O O O O O B-round_trip I-round_trip O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name O B-depart_date.month_name B-depart_date.day_number
4,4,i would like a flight from orlando to salt lake city for april first on delta airlines,flight,O O O O O O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name I-toloc.city_name O B-depart_date.month_name B-depart_date.day_number O B-airline_name I-airline_name


In [None]:
#drop the extra index column from the train and test dataframes
df_train = df_train.drop('Unnamed: 0', axis=1)
df_test = df_test.drop('Unnamed: 0', axis=1)

In [None]:
df_train.head()

Unnamed: 0,query,intent,slot filling
0,i want to fly from boston at 838 am and arrive in denver at 1110 in the morning,flight,O O O O O B-fromloc.city_name O B-depart_time.time I-depart_time.time O O O B-toloc.city_name O B-arrive_time.time O O B-arrive_time.period_of_day
1,what flights are available from pittsburgh to baltimore on thursday morning,flight,O O O O O B-fromloc.city_name O B-toloc.city_name O B-depart_date.day_name B-depart_time.period_of_day
2,what is the arrival time in san francisco for the 755 am flight leaving washington,flight_time,O O O B-flight_time I-flight_time O B-fromloc.city_name I-fromloc.city_name O O B-depart_time.time I-depart_time.time O O B-fromloc.city_name
3,cheapest airfare from tacoma to orlando,airfare,B-cost_relative O O B-fromloc.city_name O B-toloc.city_name
4,round trip fares from pittsburgh to philadelphia under 1000 dollars,airfare,B-round_trip I-round_trip O O B-fromloc.city_name O B-toloc.city_name B-cost_relative B-fare_amount I-fare_amount


In [None]:
df_test.head()

Unnamed: 0,query,intent,slot filling
0,i would like to find a flight from charlotte to las vegas that makes a stop in st. louis,flight,O O O O O O O O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name O O O O O B-stoploc.city_name I-stoploc.city_name
1,on april first i need a ticket from tacoma to san jose departing before 7 am,airfare,O B-depart_date.month_name B-depart_date.day_number O O O O O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name O B-depart_time.time_relative B-depart_time.time I-depart_time.time
2,on april first i need a flight going from phoenix to san diego,flight,O B-depart_date.month_name B-depart_date.day_number O O O O O O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name
3,i would like a flight traveling one way from phoenix to san diego on april first,flight,O O O O O O B-round_trip I-round_trip O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name O B-depart_date.month_name B-depart_date.day_number
4,i would like a flight from orlando to salt lake city for april first on delta airlines,flight,O O O O O O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name I-toloc.city_name O B-depart_date.month_name B-depart_date.day_number O B-airline_name I-airline_name


In [None]:
#Examine the data

#Create a new smaller dataframe
df_small = pd.DataFrame(columns=['query','intent','slot filling'])

#Get the first row of each intent class
j = 0
for i in df_train['intent'].unique():
  df_small.loc[j] = df_train[df_train['intent']==i].iloc[0,:]
  j = j+1

df_small

Unnamed: 0,query,intent,slot filling
0,i want to fly from boston at 838 am and arrive in denver at 1110 in the morning,flight,O O O O O B-fromloc.city_name O B-depart_time.time I-depart_time.time O O O B-toloc.city_name O B-arrive_time.time O O B-arrive_time.period_of_day
1,what is the arrival time in san francisco for the 755 am flight leaving washington,flight_time,O O O B-flight_time I-flight_time O B-fromloc.city_name I-fromloc.city_name O O B-depart_time.time I-depart_time.time O O B-fromloc.city_name
2,cheapest airfare from tacoma to orlando,airfare,B-cost_relative O O B-fromloc.city_name O B-toloc.city_name
3,what kind of aircraft is used on a flight from cleveland to dallas,aircraft,O O O O O O O O O O B-fromloc.city_name O B-toloc.city_name
4,what kind of ground transportation is available in denver,ground_service,O O O O O O O O B-city_name
5,what 's the airport at orlando,airport,O O O O O B-city_name
6,which airline serves denver pittsburgh and atlanta,airline,O O O B-fromloc.city_name B-fromloc.city_name O B-fromloc.city_name
7,how far is it from orlando airport to orlando,distance,O O O O O B-fromloc.airport_name I-fromloc.airport_name O B-toloc.city_name
8,what is fare code h,abbreviation,O O O O B-fare_basis_code
9,how much does the limousine service cost within pittsburgh,ground_fare,O O O O B-transport_type O O O B-city_name


Looking at the data, we can see that the first column of the DataFrame above contains the actual query that was submitted. The second column indicates the intent (`flight`, `flight_time`, etc.), and the last column contains the slot labels, one per token of the query.

In [None]:
#Counting how many training queries fall under each intent value
df_train['intent'].value_counts()

intent,count
flight,3666
airfare,423
ground_service,255
airline,157
abbreviation,147
aircraft,81
flight_time,54
quantity,51
flight+airfare,21
airport,20


In [None]:
# Isolating queries, intents, and slots into separate datasets, then converting them into NumPy arrays before processing them in Keras
query_data_train = df_train['query'].values
intent_data_train = df_train['intent'].values
slot_data_train = df_train['slot filling'].values

query_data_test = df_test['query'].values
intent_data_test = df_test['intent'].values
slot_data_test = df_test['slot filling'].values

In [None]:
#Determining the number of unique slots in the slot dataset

unique_slots = set()

for s in slot_data_train:
  unique_slots = unique_slots.union(set(s.split()))
unique_slots

{'B-aircraft_code',
 'B-airline_code',
 'B-airline_name',
 'B-airport_code',
 'B-airport_name',
 'B-arrive_date.date_relative',
 'B-arrive_date.day_name',
 'B-arrive_date.day_number',
 'B-arrive_date.month_name',
 'B-arrive_date.today_relative',
 'B-arrive_time.end_time',
 'B-arrive_time.period_mod',
 'B-arrive_time.period_of_day',
 'B-arrive_time.start_time',
 'B-arrive_time.time',
 'B-arrive_time.time_relative',
 'B-city_name',
 'B-class_type',
 'B-connect',
 'B-cost_relative',
 'B-day_name',
 'B-day_number',
 'B-days_code',
 'B-depart_date.date_relative',
 'B-depart_date.day_name',
 'B-depart_date.day_number',
 'B-depart_date.month_name',
 'B-depart_date.today_relative',
 'B-depart_date.year',
 'B-depart_time.end_time',
 'B-depart_time.period_mod',
 'B-depart_time.period_of_day',
 'B-depart_time.start_time',
 'B-depart_time.time',
 'B-depart_time.time_relative',
 'B-economy',
 'B-fare_amount',
 'B-fare_basis_code',
 'B-flight_days',
 'B-flight_mod',
 'B-flight_number',
 'B-flight_st

In [None]:
len(unique_slots)

123

# Helper Functions

In [None]:
# Define a function to calculate the slot filling accuracy
def slot_filling_accuracy(actual, predicted, only_slots=False):
  '''
   Calculate the slot filling accuracy of the trained model on the test data.
   With only_slots=False, accuracy is computed over all non-padding tokens;
   with only_slots=True, it is computed over the actual "slot" tokens only
   (padding and 'O' tokens are both excluded).

   Returns the accuracy score
  '''

  #Creating a mask to ignore padding tokens, which have an index of 0 after vectorization
  not_padding = np.not_equal(actual, 0)

  #Determine the correct predictions on slot tokens only (excluding padding and 'O' tokens)
  if only_slots:
    non_slot_token = text_vectorization_slots(['O']).numpy()[0, 0]
    slots = np.not_equal(actual, non_slot_token)
    correct_predictions = np.equal(actual, predicted)[not_padding & slots]

  #Determine the correct predictions on all non-padding tokens
  else:
    correct_predictions = np.equal(actual, predicted)[not_padding]

  sample_length = len(correct_predictions)

  weights = np.ones(sample_length)

  #Calculates and returns the prediction accuracy
  return np.dot(correct_predictions, weights) / sample_length
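The masking logic of `slot_filling_accuracy` can be checked by hand on a toy example. The sketch below is self-contained and uses made-up label ids; it assumes, as in the vectorized targets shown later in this notebook, that padding is index 0 and `'O'` is index 2.

```python
import numpy as np

# Assumed toy ids: 0 = padding, 2 = 'O'; every other id is a real slot label
PAD_ID, O_ID = 0, 2

actual    = np.array([2, 4, 2, 3, 0, 0])   # true labels, padded to length 6
predicted = np.array([2, 4, 2, 5, 2, 2])   # model output for the same positions

not_padding = actual != PAD_ID             # positions that hold a real token
is_slot     = actual != O_ID               # positions whose true label is not 'O'
correct     = actual == predicted          # element-wise hit or miss

# Accuracy over all non-padding tokens: 3 of 4 correct
acc_all = correct[not_padding].mean()

# Accuracy over slot tokens only: 1 of 2 correct
acc_slots = correct[not_padding & is_slot].mean()

print(acc_all, acc_slots)  # 0.75 0.5
```

Because most tokens in a query are labelled `'O'`, the slots-only figure is usually the stricter of the two.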

In [None]:
#Define a function to generate slot labels based on text input
def predict_slots_query(query):
  '''
  Takes a raw text query as input and uses the trained model to
  predict the corresponding slot labels.

  Returns the predicted slot label string
  '''

  #Vectorizing the query using the previously adapted vectorizer
  sentence = text_vectorization_query([query])

  #Making the predictions - take the most likely slot label index for each token position
  prediction = np.argmax(model.predict(sentence), axis=-1)[0]

  #Creating a look up dictionary to translate predicted integers into strings
  inverse_vocab = dict(enumerate(text_vectorization_slots.get_vocabulary()))

  #Translating the predicted integers into the slot label strings
  decoded_prediction = " ".join(inverse_vocab[int(i)] for i in prediction)

  return decoded_prediction
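The decoding step in `predict_slots_query` can be tried without a trained model. The following sketch uses a small hypothetical vocabulary (the real one comes from `text_vectorization_slots.get_vocabulary()`); `decode_slots` and its `drop_padding` flag are illustrative names, not part of the notebook.

```python
# Hypothetical slice of a slot vocabulary. Index 0 is the padding token ('')
# and index 1 the out-of-vocabulary token, as in Keras' TextVectorization.
vocab = ["", "[UNK]", "O", "B-toloc.city_name", "B-fromloc.city_name", "I-toloc.city_name"]
inverse_vocab = dict(enumerate(vocab))


def decode_slots(prediction, drop_padding=True):
    """Translate predicted integer ids into a space-separated label string."""
    labels = [inverse_vocab[int(i)] for i in prediction]
    if drop_padding:
        # Padding decodes to the empty string, so filter those entries out
        labels = [label for label in labels if label != ""]
    return " ".join(labels)


prediction = [2, 4, 2, 3, 5, 0, 0, 0]   # made-up ids padded with zeros
with_padding = decode_slots(prediction, drop_padding=False)
without_padding = decode_slots(prediction)
print(repr(with_padding))
print(repr(without_padding))
```

Joining the empty padding strings leaves trailing spaces in the decoded string. Filtering them out, or cutting the prediction to the number of tokens in the query, gives a cleaner result.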

## Data Preprocessing


### Encoding



In [None]:
#examining the data structure in the query dataset
query_data_train[:5]

array([' i want to fly from boston at 838 am and arrive in denver at 1110 in the morning ',
       ' what flights are available from pittsburgh to baltimore on thursday morning ',
       ' what is the arrival time in san francisco for the 755 am flight leaving washington ',
       ' cheapest airfare from tacoma to orlando ',
       ' round trip fares from pittsburgh to philadelphia under 1000 dollars '],
      dtype=object)

In [None]:
#examining the data structure in the slot dataset
slot_data_train[:5]

array([' O O O O O B-fromloc.city_name O B-depart_time.time I-depart_time.time O O O B-toloc.city_name O B-arrive_time.time O O B-arrive_time.period_of_day ',
       ' O O O O O B-fromloc.city_name O B-toloc.city_name O B-depart_date.day_name B-depart_time.period_of_day ',
       ' O O O B-flight_time I-flight_time O B-fromloc.city_name I-fromloc.city_name O O B-depart_time.time I-depart_time.time O O B-fromloc.city_name ',
       ' B-cost_relative O O B-fromloc.city_name O B-toloc.city_name ',
       ' B-round_trip I-round_trip O O B-fromloc.city_name O B-toloc.city_name B-cost_relative B-fare_amount I-fare_amount '],
      dtype=object)

In [None]:
#Setting the size of the vectorization output tensor
max_query_length = 30



# Setting up the vectorization layer for slots
text_vectorization_slots = keras.layers.TextVectorization(
    output_sequence_length=max_query_length,
    standardize=None)
'''
standardize=None: This is crucial. It tells the layer *not* to apply
any standardization (like lowercasing or punctuation removal) to the input text.
This is because the slot labels ('O', 'B-fromloc.city_name', etc.) are specific
tokens that should be kept as they are.
'''

#Building the specific vocabulary and vectorization mapping for the slot data
text_vectorization_slots.adapt(slot_data_train)

#Getting the vocabulary size after the adaptation step (unique slot labels plus the padding and out-of-vocabulary tokens)
slot_vocab_size = text_vectorization_slots.vocabulary_size()

#Vectorizing the train and test slot sets
target_train = text_vectorization_slots(slot_data_train)
target_test = text_vectorization_slots(slot_data_test)
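As a rough mental model of what `adapt` and the subsequent call do for the slot labels, here is a pure-Python sketch. It is a simplified stand-in, not the Keras implementation: `build_vocab` and `vectorize` are hypothetical helpers, and the alphabetical tie-break between equally frequent labels is an assumption made only to keep the example deterministic.

```python
from collections import Counter


def build_vocab(texts):
    """Order tokens by descending frequency; reserve 0 for padding and 1 for unknown tokens."""
    counts = Counter(token for text in texts for token in text.split())
    # Assumption: ties are broken alphabetically to keep the result deterministic
    ordered = sorted(counts, key=lambda token: (-counts[token], token))
    return ["", "[UNK]"] + ordered


def vectorize(text, vocab, length):
    """Map whitespace-separated tokens to ids, then truncate or zero-pad to a fixed length."""
    index = {token: i for i, token in enumerate(vocab)}
    ids = [index.get(token, 1) for token in text.split()][:length]
    return ids + [0] * (length - len(ids))


train_labels = [
    "O O B-fromloc.city_name O B-toloc.city_name",
    "B-cost_relative O O B-fromloc.city_name O B-toloc.city_name",
]
vocab = build_vocab(train_labels)
print(vocab)
# ['', '[UNK]', 'O', 'B-fromloc.city_name', 'B-toloc.city_name', 'B-cost_relative']

# 'I-toloc.city_name' was never seen during "adapt", so it maps to the unknown id 1
ids = vectorize("O B-toloc.city_name I-toloc.city_name", vocab, length=6)
print(ids)
# [2, 4, 1, 0, 0, 0]
```

The same idea explains why `'O'`, the most frequent label, receives index 2 in the vectorized targets: indices 0 and 1 are reserved.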

In [None]:
# Setting up the vectorization layer for queries
text_vectorization_query = keras.layers.TextVectorization(
    output_sequence_length=max_query_length)

'''
Unlike the slot vectorization, standardize is not set to None here. This means
the default standardization will be applied, which typically includes
lowercasing the text and removing punctuation.
This is usually desirable for input text queries.
'''

#Building the specific vocabulary and vectorization mapping for the query data
text_vectorization_query.adapt(query_data_train)

#Getting the vocabulary size after the adaptation step (unique query words plus the padding and out-of-vocabulary tokens)
query_vocab_size = text_vectorization_query.vocabulary_size()

#Vectorizing the train and test query sets
source_train = text_vectorization_query(query_data_train)
source_test = text_vectorization_query(query_data_test)
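The default standardization applied to the queries can be approximated in plain Python. This is a sketch under the assumption that the default behaves like lowercasing followed by removal of ASCII punctuation; `standardize` is a hypothetical helper, not a Keras function.

```python
import string


def standardize(text):
    """Approximate the default standardization: lowercase, then strip ASCII punctuation."""
    lowered = text.lower()
    return lowered.translate(str.maketrans("", "", string.punctuation))


cleaned = standardize("What 's the airport at St. Louis?")
print(cleaned)
# what s the airport at st louis

# The token count is preserved here, so the query still lines up with its slot labels
query = "what 's the airport at orlando"
slots = "O O O O O B-city_name"
same_length = len(standardize(query).split()) == len(slots.split())
print(same_length)  # True
```

One caveat worth keeping in mind: a token that consists only of punctuation would disappear entirely, which would shift the query out of alignment with its slot labels. The rows shown in this notebook do not contain such a token, but it may be worth checking on the full dataset.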

In [None]:
#examining and comparing the representation of the same information before and after vectorization
print(f'original query: {query_data_train[0]}')
print(f'\nvectorized query: {source_train[0]}')

original query:  i want to fly from boston at 838 am and arrive in denver at 1110 in the morning 

vectorized query: [ 13  72   2  40   3  10  69 433  87  18  80  17  14  69 626  17   5  37
   0   0   0   0   0   0   0   0   0   0   0   0]


# Modeling

In [None]:
# Params
embedding_dim = 512
encoder_units = 1024 # 2 * the embedding dimension
units = 128
num_heads = 6

#Creating the NN Architecture

model = keras.Sequential()
model.add(keras.layers.Input(shape=(max_query_length,), name='Input'))

# Creating the embedding layer
model.add(keras_nlp.layers.TokenAndPositionEmbedding(vocabulary_size=query_vocab_size, sequence_length=max_query_length, embedding_dim=embedding_dim))

# Creating the Transformer Encoder Layer
model.add(keras_nlp.layers.TransformerEncoder(intermediate_dim=encoder_units, num_heads=num_heads, name='Transformer'))

# Creating the hidden and output layers
model.add(keras.layers.Dense(units=units, activation='relu', name='Hidden'))
model.add(keras.layers.Dropout(0.5))
model.add(keras.layers.Dense(slot_vocab_size, activation='softmax', name='Output'))

model.summary()
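The tensor shapes flowing through the final layers can be traced with a small NumPy sketch. It assumes toy sizes and random weights and skips the embedding and Transformer layers entirely; the point is only that a `Dense` layer acts on the last axis, so the output holds one probability distribution per token position.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions): the notebook uses 30 positions and a much larger label set
batch, seq_len, hidden, n_labels = 2, 30, 8, 5

# Stand-in for the encoder output: one hidden vector per token position
encoded = rng.normal(size=(batch, seq_len, hidden))

# A dense layer is a matrix product over the last axis, applied to every position
weights = rng.normal(size=(hidden, n_labels))
bias = np.zeros(n_labels)
logits = encoded @ weights + bias                     # (batch, seq_len, n_labels)

# Softmax over the label axis gives one distribution per token
shifted = logits - logits.max(axis=-1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)

# argmax over the label axis picks one label id per token
predicted_ids = probs.argmax(axis=-1)                 # (batch, seq_len)

print(probs.shape, predicted_ids.shape)  # (2, 30, 5) (2, 30)
```

This is why the later evaluation cells take `np.argmax(..., axis=-1)` of the model output before comparing it with the vectorized targets.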

In [None]:
#Configuring the model's learning parameters
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["sparse_categorical_accuracy"])
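The "sparse" variant of categorical cross-entropy takes integer label ids rather than one-hot vectors, which matches the vectorized slot targets. A minimal NumPy sketch of the per-token computation follows; the function below is an illustration, not the Keras implementation (which, among other things, clips probabilities for numerical stability).

```python
import numpy as np


def sparse_categorical_crossentropy(labels, probs):
    """Per-token loss: the negative log of the probability assigned to the true label id."""
    labels = np.asarray(labels)
    probs = np.asarray(probs, dtype=float)
    # Pick, for every position, the probability of its true label
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    return -np.log(picked)


# One toy sequence of three positions over four label ids (made-up numbers)
probs = np.array([[[0.10, 0.10, 0.70, 0.10],
                   [0.10, 0.10, 0.20, 0.60],
                   [0.90, 0.05, 0.03, 0.02]]])
labels = np.array([[2, 3, 0]])   # the last position is padding (id 0)

losses = sparse_categorical_crossentropy(labels, probs)
mean_loss = losses.mean()
print(losses.round(3), round(float(mean_loss), 3))
```

Since no mask or sample weights are passed to `fit` in this notebook, padded positions appear to count toward the loss and the Keras accuracy metric just like real tokens. That would explain why the training accuracy reported by Keras is higher than the custom padding-aware accuracy computed on the test set.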


In [None]:
#Setting Training parameters
BATCH_SIZE = 64
epochs = 10

# Running the model to train on the training data
history = model.fit(source_train, target_train,
                 batch_size=BATCH_SIZE,
                 epochs=epochs)

Epoch 1/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 30s 163ms/step - loss: 0.9102 - sparse_categorical_accuracy: 0.8335
Epoch 2/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.1512 - sparse_categorical_accuracy: 0.9562
Epoch 3/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.1148 - sparse_categorical_accuracy: 0.9625
Epoch 4/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0919 - sparse_categorical_accuracy: 0.9695
Epoch 5/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0738 - sparse_categorical_accuracy: 0.9764
Epoch 6/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0587 - sparse_categorical_accuracy: 0.9811
Epoch 7/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0460 - sparse_categorical_accuracy: 0.9854
Epoch 8/10
78/78 ━━━━━━━━

In [None]:
target_test[:2]

<tf.Tensor: shape=(2, 30), dtype=int64, numpy=
array([[ 2,  2,  2,  2,  2,  2,  2,  2,  4,  2,  3,  5,  2,  2,  2,  2,
         2, 20, 50,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 2, 12, 11,  2,  2,  2,  2,  2,  4,  2,  3,  5,  2, 18, 13, 19,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]])>

# Testing and Evaluation

In [None]:
#Evaluating the output of the model

#Create sample query inputs
examples = [
            'from los angeles',
            'to los angeles',
            'from boston',
            'to boston',
            'cheapest flight from boston to los angeles tomorrow',
            'what is the airport at orlando',
            'what are the air restrictions on flights from pittsburgh to atlanta for the airfare of 416 dollars',
            'flight from boston to santiago',
            'flight boston to santiago']

#Generate slot fillings and compare to original query
for e in examples:
  print(e)
  print(predict_slots_query(e))
  print()

from los angeles
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
O B-fromloc.city_name I-fromloc.city_name

to los angeles
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 33ms/step
O B-fromloc.city_name I-toloc.city_name

from boston
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 33ms/step
O B-fromloc.city_name

to boston
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 33ms/step
O B-fromloc.city_name

cheapest flight from boston to los angeles tomorrow
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 36ms/step
B-cost_relative O O B-fromloc.city_name O B-toloc.city_name I-toloc.city_name B-depart_date.today_relative

what is the airport at orlando
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 33ms/step
O O O O O B-city_name

what are the air

In [None]:
#Calculating the prediction accuracy

#Generating predictions: take the most likely slot label (integer index) for each token position, then flatten into a 1-D array
predicted = np.argmax(model.predict(source_test), axis=-1).reshape(-1)

#Flattening the actual slot label indices into a 1-D array
actual = target_test.numpy().reshape(-1)

#Calculating the prediction accuracy over all non-padding tokens ('O' tokens included)
acc = slot_filling_accuracy(actual, predicted, only_slots=False)

#Calculating the prediction accuracy over slot tokens only (padding and 'O' tokens excluded)
acc_slots = slot_filling_accuracy(actual, predicted, only_slots=True)

#Printing accuracy scores
print(f'Accuracy = {acc:.3f}')
print(f'Accuracy on slots = {acc_slots:.3f}')

28/28 ━━━━━━━━━━━━━━━━━━━━ 10s 187ms/step
Accuracy = 0.956
Accuracy on slots = 0.895


With 89.5% accuracy on slot tokens and 95.6% accuracy on all non-padding tokens, the model is performing quite well.

Let's see if we can tweak the architecture to improve its performance.

After several attempts, we find that reducing the **encoder units to 64** and the **attention heads to 5** yields a better result. Let's test it out.

In [None]:
# Params
embedding_dim = 512
encoder_units = 64
units = 128
num_heads = 5

#Creating the NN Architecture

model2 = keras.Sequential()
model2.add(keras.layers.Input(shape=(max_query_length,), name='Input'))

# Creating the embedding layer
model2.add(keras_nlp.layers.TokenAndPositionEmbedding(vocabulary_size=query_vocab_size, sequence_length=max_query_length, embedding_dim=embedding_dim))

# Creating the Transformer Encoder Layer
model2.add(keras_nlp.layers.TransformerEncoder(intermediate_dim=encoder_units, num_heads=num_heads, name='Transformer'))

# Creating the hidden and output layers
model2.add(keras.layers.Dense(units=units, activation='relu', name='Hidden'))
model2.add(keras.layers.Dropout(0.5))
model2.add(keras.layers.Dense(slot_vocab_size, activation='softmax', name='Output'))

model2.summary()

In [None]:
#Configuring the model's learning parameters
model2.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["sparse_categorical_accuracy"])

In [None]:
#Setting Training parameters
BATCH_SIZE = 64
epochs = 10

# Running the model to train on the training data
history2 = model2.fit(source_train, target_train,
                 batch_size=BATCH_SIZE,
                 epochs=epochs)

Epoch 1/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 24s 131ms/step - loss: 0.9109 - sparse_categorical_accuracy: 0.8322
Epoch 2/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.1541 - sparse_categorical_accuracy: 0.9557
Epoch 3/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.1138 - sparse_categorical_accuracy: 0.9634
Epoch 4/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0928 - sparse_categorical_accuracy: 0.9695
Epoch 5/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0771 - sparse_categorical_accuracy: 0.9754
Epoch 6/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0616 - sparse_categorical_accuracy: 0.9812
Epoch 7/10
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0504 - sparse_categorical_accuracy: 0.9851
Epoch 8/10
78/78 ━━━━━━━━

In [None]:
#Calculating the prediction accuracy

#Generating predictions: take the most likely slot label (integer index) for each token position, then flatten into a 1-D array
predicted = np.argmax(model2.predict(source_test), axis=-1).reshape(-1)

#Flattening the actual slot label indices into a 1-D array
actual = target_test.numpy().reshape(-1)

#Calculating the prediction accuracy over all non-padding tokens ('O' tokens included)
acc = slot_filling_accuracy(actual, predicted, only_slots=False)

#Calculating the prediction accuracy over slot tokens only (padding and 'O' tokens excluded)
acc_slots = slot_filling_accuracy(actual, predicted, only_slots=True)

#Printing accuracy scores
print(f'Accuracy = {acc:.3f}')
print(f'Accuracy on slots = {acc_slots:.3f}')

28/28 ━━━━━━━━━━━━━━━━━━━━ 6s 110ms/step
Accuracy = 0.960
Accuracy on slots = 0.903


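At 90.3% accuracy on slot tokens and 96.0% accuracy on all non-padding tokens, the new structure is a slight improvement on the original structure.


Let's see if we can tweak the training parameters to improve the performance even further.

Here, we are **increasing the number of epochs to 20**. Note that `fit` is called on the already-trained `model2`, so these 20 epochs continue from the 10 epochs above (30 in total) rather than restarting from scratch; the log below accordingly starts at a loss of 0.0244.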

In [None]:
#Setting Training parameters
BATCH_SIZE = 64
epochs = 20

# Running the model to train on the training data
history3 = model2.fit(source_train, target_train,
                 batch_size=BATCH_SIZE,
                 epochs=epochs)

Epoch 1/20
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 0.0244 - sparse_categorical_accuracy: 0.9922
Epoch 2/20
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0211 - sparse_categorical_accuracy: 0.9933
Epoch 3/20
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0197 - sparse_categorical_accuracy: 0.9940
Epoch 4/20
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0187 - sparse_categorical_accuracy: 0.9943
Epoch 5/20
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0159 - sparse_categorical_accuracy: 0.9948
Epoch 6/20
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0149 - sparse_categorical_accuracy: 0.9953
Epoch 7/20
78/78 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 0.0127 - sparse_categorical_accuracy: 0.9960
Epoch 8/20
78/78 ━━━━━━━━━━━

In [None]:
#Calculating the prediction accuracy

#Generating predictions: take the most likely slot label (integer index) for each token position, then flatten into a 1-D array
predicted = np.argmax(model2.predict(source_test), axis=-1).reshape(-1)

#Flattening the actual slot label indices into a 1-D array
actual = target_test.numpy().reshape(-1)

#Calculating the prediction accuracy over all non-padding tokens ('O' tokens included)
acc = slot_filling_accuracy(actual, predicted, only_slots=False)

#Calculating the prediction accuracy over slot tokens only (padding and 'O' tokens excluded)
acc_slots = slot_filling_accuracy(actual, predicted, only_slots=True)

#Printing accuracy scores
print(f'Accuracy = {acc:.3f}')
print(f'Accuracy on slots = {acc_slots:.3f}')

28/28 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Accuracy = 0.967
Accuracy on slots = 0.925


At 92.5% accuracy on slot tokens and 96.7% accuracy on all non-padding tokens, the additional epochs show a decent improvement on the previous models.