<p id="eda" style="font-size:30px; text-align:center; font-weight:bold">Encoder-Decoder LSTM (Long Short-Term Memory)</p>

<div style="width:100%;height:1px; background-color:black"></div>

<p style="font-size: 20px">LSTM (Long Short-Term Memory) is a type of recurrent neural network (RNN) designed to learn patterns in sequential data. Typical RNNs tend to forget items from far back in a sequence, whereas LSTMs use a memory cell controlled by three gates (input gate, forget gate, and output gate) that decide what to store, what to discard, and what to output. This makes LSTMs well suited to tasks such as language translation and speech recognition.</p>

<p style="font-size: 20px">The Encoder-Decoder LSTM architecture is made up of two primary parts. The encoder processes each item in the input sequence and generates a "context" vector that summarises the salient features of the sequence. The decoder then uses the context vector to construct an output sequence, which in language tasks is generated word by word or character by character.</p>

<p style="font-size: 20px; font-weight:bold"><u>Justification (Why used for medical chatbot (doctor-patient dialogues dataset))</u></p>

<p style="font-size: 20px">The Encoder-Decoder LSTM is suitable for medical chatbots because it maps a variable-length patient query to a variable-length doctor response, so it can handle both short and long doctor-patient conversations. Its gated memory cell helps it retain the context of a health conversation, which matters when an early detail such as a symptom affects the later answer. The architecture is versatile and can be applied to a variety of sequence-to-sequence tasks, and it can provide reasonable responses even with limited data. In addition, it can be extended with an attention mechanism that emphasises important sections of a dialogue, which is particularly helpful when some medical terms or phrases are more crucial than others.</p>

<div style="width:100%;height:3px; background-color:black"></div>

<p id="lib" style="font-size:30px; text-align:center; font-weight:bold">Required libraries or packages</p> <a href="#top">Back To Top</a>

<div style="width:100%;height:3px; background-color:black"></div>

In [1]:
import pandas as pd # dataframe library
import numpy as np # numerical operations
import json # load JSON data
from sklearn.model_selection import train_test_split # scikit-learn helper for dataset splitting
import tensorflow as tf # TensorFlow for building deep learning models
from tensorflow.keras.models import Model # Model class from Keras
from tensorflow.keras.layers import Input, LSTM, Dense, Embedding, Layer, Concatenate # layers used to build the encoder-decoder model
from tensorflow.keras.callbacks import EarlyStopping # callback that stops training when the model has stopped improving
from nltk.translate.bleu_score import sentence_bleu # BLEU metric from the NLTK library
from rouge import Rouge # ROUGE metric from the rouge library
from tensorflow.keras.models import load_model # to load the saved model




<div style="width:100%;height:1px; background-color:black"></div>

<p ><center><u style="font-size: 28px; margin-top: 10px; font-weight: bold">Pre-processing</u></center></p>

<p style="font-size: 20px">Before the data is fed into the model, the text must be converted into a numerical form that a deep learning model can process. Tokenization and padding are the steps that perform this conversion.</p>

<p style="font-size: 23px; margin-top: 10px; font-weight: bold">Load Dataset</p>

In [2]:
with open("cleaned_medical_dialogues_dataset.json", "r") as f: # load the JSON content of the file into the data variable
    data = json.load(f)

data = data[::4] # keep every 4th entry (a quarter of the dataset) to manage time and computational resources

print(len(data))  # print the number of samples in the sampled data

61776


<div style="width:100%;height:1px; background-color:black"></div>

<p style="font-size: 23px; margin-top: 10px; font-weight: bold">Data Formatting</p>

<p style="font-size: 20px">During formatting, start and end tokens are added to both the patient and the doctor dialogues. These tokens help the model recognise where a sequence begins and ends.</p>
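<p style="font-size: 20px">A minimal sketch of the wrapping step on one made-up dialogue entry. The <code>sample</code> data and both helper functions are illustrative assumptions, not part of the notebook. The second helper mimics, in plain Python, the default cleaning of the Keras <code>Tokenizer</code>, which lowercases text and removes punctuation such as <code>&lt;</code> and <code>&gt;</code>, so with default settings the markers reach the vocabulary as <code>start</code> and <code>end</code>.</p>

```python
# Default `filters` string of tf.keras.preprocessing.text.Tokenizer,
# reproduced here so the sketch runs without TensorFlow.
KERAS_DEFAULT_FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'

def wrap_with_markers(text):
    """Add start and end markers around one utterance."""
    return "<start> " + text + " <end>"

def strip_default_filters(text):
    """Approximate the default cleaning: lowercase, turn filtered characters into spaces, split."""
    table = str.maketrans({ch: " " for ch in KERAS_DEFAULT_FILTERS})
    return text.lower().translate(table).split()

# Hypothetical entry with the same keys as the dataset.
sample = [{"Patient": "I have a headache.", "Doctor": "Drink water, and rest."}]
wrapped_questions = [wrap_with_markers(entry["Patient"]) for entry in sample]

print(wrapped_questions[0])
print(strip_default_filters(wrapped_questions[0]))
```

<p style="font-size: 20px">If the angle brackets need to survive tokenization, one option is to pass a custom <code>filters</code> string to the tokenizer that leaves them out.</p>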

In [3]:
questions = ["<start> " + entry['Patient'] + " <end>" for entry in data] # start and end tokens in patient dialogues 
answers = ["<start> " + entry['Doctor'] + " <end>" for entry in data] # start and end tokens in doctor dialogues


<div style="width:100%;height:1px; background-color:black"></div>

<p style="font-size: 23px; margin-top: 10px; font-weight: bold">Tokenization</p>

<p style="font-size: 20px">A neural network cannot process raw text, so the text is converted into sequences of integers, where each integer identifies a word in the vocabulary.</p>

In [4]:
tokenizer = tf.keras.preprocessing.text.Tokenizer() # initialize tokenizer (the default filters strip punctuation, including '<' and '>')
tokenizer.fit_on_texts(questions + answers) # build the vocabulary from all questions and answers


In [5]:
questions_seq = tokenizer.texts_to_sequences(questions) # conversion of questions into sequences of integers
answers_seq = tokenizer.texts_to_sequences(answers) # conversion of answers into sequences of integers

<p style="font-size: 20px">For a medical chatbot based on a Seq2Seq model, this phase is especially important because the input (questions) and output (answers) must be in a format the model can process. The encoder and the decoder are the two primary parts of Seq2Seq. The encoder takes a query as input and creates a context vector from it. The decoder then generates an answer using this context vector. Both require sequences of numbers as input.</p>

<p style="font-size: 20px">Here, 'questions' are the user/patient queries and 'answers' are the doctors' responses. Once these dialogues are converted into sequences of integers, the model can be trained to recognise patterns in how doctors respond to particular questions or symptoms, and eventually to generate similar responses to new questions.</p>
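<p style="font-size: 20px">The conversion from words to integers can be sketched in plain Python as follows. This is a simplified stand-in for what a tokenizer does, not the Keras implementation: the function names and the two-sentence corpus are assumptions for illustration. Ids are assigned by descending word frequency, and id 0 is left free for padding.</p>

```python
from collections import Counter

def fit_word_index(texts):
    """Assign integer ids by descending word frequency; id 0 stays reserved for padding."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

def texts_to_ids(texts, word_index):
    """Map each text to a list of ids, skipping words that are not in the vocabulary."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]

# Hypothetical mini corpus, already wrapped with start/end words.
corpus = ["start i have a fever end", "start rest and drink water end"]
toy_word_index = fit_word_index(corpus)
toy_sequences = texts_to_ids(corpus, toy_word_index)

print(toy_word_index)
print(toy_sequences)
```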

<div style="width:100%;height:1px; background-color:black"></div>

<p style="font-size: 23px; margin-top: 10px; font-weight: bold">Padding Sequences</p>

<p style="font-size: 20px">Sequences are padded to an equal length because the neural network requires fixed-length input.</p>

In [6]:
# maxlen = max([len(seq) for seq in questions_seq + answers_seq])

In [7]:
maxlen = 256  # truncating sequences to a maximum length for computational efficiency

In [8]:
questions_seq = tf.keras.preprocessing.sequence.pad_sequences(questions_seq, maxlen=maxlen, truncating='post') # pad (at the front, the Keras default) or truncate (at the end) each question to maxlen
answers_seq = tf.keras.preprocessing.sequence.pad_sequences(answers_seq, maxlen=maxlen, truncating='post') # same padding and truncation for the answers

<p style="font-size: 20px">
    Neural networks expect input data to have a consistent shape, but sentences and other text sequences vary in length. For example, "How are you today?" consists of four words, whereas "I am fine." has three. A neural network cannot be fed sequences of different lengths directly. Padding every sequence to a given length keeps the input in a constant shape, which is required for the matrix operations inside the network.
In some cases, such as these doctor-patient dialogues, truncating is not ideal because it can cut off contextual information. However, I truncated to a maxlen of 256 because of limited computational resources.</p>
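<p style="font-size: 20px">A minimal sketch of padding and truncation in plain Python, assuming the same conventions as the cells above: zeros are added at the front (the Keras default, <code>padding='pre'</code>) and long sequences are cut at the end (<code>truncating='post'</code>). The helper name and the toy batch are illustrative.</p>

```python
def pad_or_truncate(seq, maxlen, pad_id=0):
    """Pre-pad with pad_id and post-truncate so every sequence has length maxlen."""
    seq = list(seq)[:maxlen]                      # keep only the first maxlen tokens
    return [pad_id] * (maxlen - len(seq)) + seq   # left-pad shorter sequences

toy_batch = [[5, 8, 2], [7, 1, 4, 9, 3, 6]]
toy_padded = [pad_or_truncate(s, maxlen=4) for s in toy_batch]
print(toy_padded)
```

<p style="font-size: 20px">Whichever padding side is chosen, it is worth using the same one at training time and at inference time so the model sees inputs laid out the same way.</p>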

<div style="width:100%;height:1px; background-color:black"></div>

<p style="font-size: 23px; margin-top: 10px; font-weight: bold">Split Dataset</p>

<p style="font-size: 20px">
  In machine learning and deep learning, the available dataset is typically separated into a training set and a validation/test set.
This separation helps in evaluating the model's performance on unseen data. The main goal is to detect and prevent overfitting to the training data. If a model performs exceptionally well on the training data but poorly on the validation data, it is overfitted.</p>

In [9]:
questions_train, questions_val, answers_train, answers_val = train_test_split(questions_seq, answers_seq, test_size=0.2) # scikit-learn's train_test_split splits the dataset into training and validation sets

<p style="font-size: 20px">The parameter <b>test_size=0.2</b> indicates that 20% of the dataset will be used as the validation set, while the remaining 80% will be used for training.</p>
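<p style="font-size: 20px">The idea behind the split can be sketched in plain Python: shuffle the question/answer pairs together, then hold out a fraction for validation. The function below is an illustrative assumption, not scikit-learn's implementation; it uses a fixed seed so the same split is produced on every run.</p>

```python
import random

def split_pairs(questions, answers, val_fraction=0.2, seed=42):
    """Shuffle question/answer pairs together and hold out a validation fraction."""
    indices = list(range(len(questions)))
    random.Random(seed).shuffle(indices)          # seeded, so the split is reproducible
    n_val = int(round(len(indices) * val_fraction))
    val_idx, train_idx = indices[:n_val], indices[n_val:]

    def pick(items, idx):
        return [items[i] for i in idx]

    return (pick(questions, train_idx), pick(questions, val_idx),
            pick(answers, train_idx), pick(answers, val_idx))

toy_q = [f"q{i}" for i in range(10)]
toy_a = [f"a{i}" for i in range(10)]
q_tr, q_va, a_tr, a_va = split_pairs(toy_q, toy_a)
print(len(q_tr), len(q_va))
```

<p style="font-size: 20px">In scikit-learn, passing <code>random_state</code> to <code>train_test_split</code> fixes the shuffle in the same way.</p>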

<div style="width:100%;height:1px; background-color:black"></div>

<p ><center><u style="font-size: 28px; margin-top: 10px; font-weight: bold">Model Training</u></center></p>

<div style="width:100%;height:1px; background-color:black"></div>

<p style="font-size: 18px; margin-top: 10px; font-weight: bold">Constants</p>

In [10]:
#Hyperparameters for training
BATCH_SIZE = 4
EPOCHS = 6
LSTM_UNITS = 128
EMBEDDING_DIM = 100

<p style="font-size: 18px; margin-top: 10px; font-weight: bold">BahdanauAttention Class</p>

In [11]:
class BahdanauAttention(Layer):
    def __init__(self, units, **kwargs): # constructor for the attention layer; three dense layers are used to produce the attention score
        super(BahdanauAttention, self).__init__(**kwargs)
        self.W1 = Dense(units)
        self.W2 = Dense(units)
        self.V = Dense(1)

    def call(self, query, values): # defines the logic for generating the attention scores
        query_with_time_axis = tf.expand_dims(query, 1) # (batch, units) -> (batch, 1, units) so it broadcasts over the time steps
        score = self.V(tf.nn.tanh(self.W1(query_with_time_axis) + self.W2(values))) # Bahdanau's additive formula for the attention scores
        attention_weights = tf.nn.softmax(score, axis=1) # normalise the scores over the time axis
        context_vector = attention_weights * values # weight each encoder time step; the time axis is kept (no sum), so the shape stays (batch, time, units)
        return context_vector, attention_weights

    def get_config(self): # method to support model saving and loading with custom objects
        config = super().get_config().copy()
        config.update({
            'units': self.W1.units
        })
        return config

<p style="font-size: 20px">
    The <b>BahdanauAttention</b> class defines an additive attention mechanism based on the formula introduced by Dzmitry Bahdanau and colleagues for neural machine translation. Attention lets the decoder focus on different parts of the input sequence as it decodes, rather than relying on the encoder's final state alone.
</p>
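<p style="font-size: 20px">To make the formula concrete, here is a small plain-Python sketch of additive attention in its textbook form: a score <code>v · tanh(W1·q + W2·h)</code> for each encoder output <code>h</code>, a softmax over the scores, and a weighted sum of the encoder outputs. All weights and inputs below are made-up numbers, and the final summation over time steps is the textbook formulation rather than a copy of the class above.</p>

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def additive_attention(query, values, w_query, w_values, v):
    """Return (context, weights) for one query over a list of encoder outputs."""
    q_proj = matvec(w_query, query)
    scores = []
    for h in values:
        h_proj = matvec(w_values, h)
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, h_proj)]
        scores.append(sum(a * b for a, b in zip(v, hidden)))
    # numerically stable softmax over the time steps
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # weighted sum of the encoder outputs
    dim = len(values[0])
    context = [sum(w * h[i] for w, h in zip(weights, values)) for i in range(dim)]
    return context, weights

toy_values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three encoder time steps
toy_query = [0.5, -0.5]                             # decoder-side query
toy_w_query = [[0.1, 0.2], [0.3, 0.4]]
toy_w_values = [[0.5, 0.6], [0.7, 0.8]]
toy_v = [1.0, -1.0]

toy_context, toy_weights = additive_attention(toy_query, toy_values, toy_w_query, toy_w_values, toy_v)
print(toy_weights)
print(toy_context)
```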

<p style="font-size: 18px; margin-top: 10px; font-weight: bold">Answer query function</p>

In [12]:
def answer_query(query, loaded_model, tokenizer, max_seq_length):
    # tokenize the query and pad it to the length the model expects
    # note: padding='post' here, whereas the training cells use the default 'pre' padding
    tokenized_query = tokenizer.texts_to_sequences([query])
    padded_query = tf.keras.preprocessing.sequence.pad_sequences(tokenized_query, maxlen=max_seq_length, padding='post')

    # the padded query is passed as both the encoder input and the decoder input
    decoded_seq = loaded_model.predict([padded_query, padded_query])
    answer_tokens = [np.argmax(token) for token in decoded_seq[0]] # most likely token id at each position

    # make sure the start and end markers have ids in the vocabulary
    if '<start>' not in tokenizer.word_index:
        tokenizer.word_index['<start>'] = max(tokenizer.word_index.values()) + 1

    if '<end>' not in tokenizer.word_index:
        tokenizer.word_index['<end>'] = max(tokenizer.word_index.values()) + 2

    # check for the end token and truncate the sequence there
    if tokenizer.word_index['<end>'] in answer_tokens:
        answer_tokens = answer_tokens[:answer_tokens.index(tokenizer.word_index['<end>'])]

    # convert token IDs back to words, excluding padding and the start and end tokens
    answer = ' '.join(tokenizer.index_word[token_id] for token_id in answer_tokens if token_id > 0 and token_id not in [tokenizer.word_index['<start>'], tokenizer.word_index['<end>']])

    return answer


<p style="font-size: 20px">
    The <b>answer_query</b> function takes a user's query, tokenizes and pads it, and then generates an answer using a trained sequence-to-sequence model.
</p>
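<p style="font-size: 20px">A common way to generate text from a sequence-to-sequence model at inference time is greedy decoding: start from the start marker, ask the model for the most likely next token, append it, and repeat until the end marker or a length limit is reached. The sketch below illustrates that loop with a toy scoring function standing in for a trained model. <code>greedy_decode</code>, <code>step_fn</code>, and <code>toy_step</code> are hypothetical names, and this is not the notebook's <code>answer_query</code> implementation.</p>

```python
def greedy_decode(step_fn, start_id, end_id, max_len):
    """Generate token ids one at a time until end_id is predicted or max_len is reached.

    step_fn is a hypothetical callable: given the ids generated so far,
    it returns a list of scores over the vocabulary for the next token.
    """
    generated = [start_id]
    for _ in range(max_len):
        scores = step_fn(generated)
        next_id = max(range(len(scores)), key=scores.__getitem__)  # argmax
        if next_id == end_id:
            break
        generated.append(next_id)
    return generated[1:]  # drop the start marker

def toy_step(prefix, vocab_size=6):
    """Toy stand-in for a trained model: always prefers the id after the last one."""
    scores = [0.0] * vocab_size
    scores[(prefix[-1] + 1) % vocab_size] = 1.0
    return scores

print(greedy_decode(toy_step, start_id=1, end_id=5, max_len=10))
```

<p style="font-size: 20px">The length limit matters: without an end marker the model can predict, a loop like this would otherwise keep producing tokens indefinitely.</p>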

<div style="width:100%;height:1px; background-color:black"></div>

<p style="font-size: 23px; margin-top: 10px; font-weight: bold"><u>Hyperparameters Tuning</u></p>

<p style="font-size: 20px">
    The validation set can be used to compare different hyperparameter configurations and find the best one for the doctor-patient dialogues. I chose Bayesian optimization, which uses the results of earlier trials to pick promising hyperparameters and is typically faster than random or grid search. However, even with a small subset of the validation data it took a long time. Consequently, given the time constraints and the computational resources available, I skipped the automated hyperparameter tuning step. Instead, I tried different hyperparameters and combinations of them manually to get the best possible results. Comprehensive hyperparameter tuning is left for future work.
</p>

<p style="font-size: 23px; margin-top: 10px; font-weight: bold"><u>Training</u></p>

In [13]:
VOCAB_SIZE = len(tokenizer.word_index) + 1 # calculate the size of the vocabulary we have in our dictionary


<p style="font-size: 20px; margin-top: 10px; font-weight: bold">Encoder</p>

In [14]:
encoder_inputs = Input(shape=(None,)) # input layer for the encoder
encoder_embedding_layer = Embedding(input_dim=VOCAB_SIZE, output_dim=EMBEDDING_DIM, mask_zero=True) # embedding layer: converts integer token indices into dense vectors of size EMBEDDING_DIM (100)
encoder_embedding = encoder_embedding_layer(encoder_inputs)

encoder_lstm_1 = LSTM(LSTM_UNITS, return_sequences=True, return_state=True, dropout=0.3, recurrent_dropout=0.3) # first LSTM layer, returns sequences to use as input for the next layer; dropout of 0.3 is used for regularization
encoder_output1, state_h1, state_c1 = encoder_lstm_1(encoder_embedding)

encoder_lstm_2 = LSTM(LSTM_UNITS, return_sequences=True, return_state=True, dropout=0.3, recurrent_dropout=0.3) # second LSTM layer, also returns sequences and uses the same dropout
encoder_outputs_2, state_h2, state_c2 = encoder_lstm_2(encoder_output1)

encoder_states = [state_h1, state_c1, state_h2, state_c2] # final hidden and cell states of both encoder layers




<p style="font-size: 20px; margin-top: 10px; font-weight: bold">Decoder</p>

In [15]:
decoder_inputs = Input(shape=(None,)) # input layer for the decoder
decoder_embedding_layer = Embedding(input_dim=VOCAB_SIZE, output_dim=EMBEDDING_DIM, mask_zero=True) # embedding layer for the decoder tokens
decoder_embedding = decoder_embedding_layer(decoder_inputs)

attention_layer = BahdanauAttention(LSTM_UNITS) # attention layer: an instance of the BahdanauAttention class with LSTM_UNITS units
context_vector, attention_weights = attention_layer(state_h2, encoder_outputs_2) # the encoder's final hidden state is the query over all encoder outputs
decoder_concat_input = Concatenate(axis=-1)([context_vector, decoder_embedding]) # join the attention output with the decoder embeddings along the feature axis

decoder_lstm_1 = LSTM(LSTM_UNITS, return_sequences=True, return_state=True, dropout=0.3, recurrent_dropout=0.3) # first LSTM layer for the decoder
decoder_outputs_1, state_dh1, state_dc1 = decoder_lstm_1(decoder_concat_input)

decoder_lstm_2 = LSTM(LSTM_UNITS, return_sequences=True, return_state=True, dropout=0.3, recurrent_dropout=0.3) # second LSTM layer for the decoder
decoder_outputs_2, _, _ = decoder_lstm_2(decoder_outputs_1)

decoder_dense = Dense(VOCAB_SIZE, activation='softmax') # dense output layer with softmax activation over the vocabulary
decoder_outputs = decoder_dense(decoder_outputs_2)




<p style="font-size: 20px">
    The above configuration has two main parts: an encoder and a decoder. The encoder takes in the input sequences and uses two LSTM layers to process the content, producing a context for the decoder. The decoder then uses attention to focus on certain parts of the encoder's output, passes the result through two more LSTM layers, and finally uses a dense softmax layer to predict the next word in the answer.
</p>
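<p style="font-size: 20px">In the usual teacher-forcing setup for encoder-decoder training, the decoder input is the answer without its last token and the target is the answer without its first token, so that at every position the decoder learns to predict the token that follows the one it was given. A minimal sketch under that assumption is shown below. The function name and token ids are illustrative, and the notebook's own training cell may arrange its inputs differently.</p>

```python
def make_teacher_forcing_pair(answer_ids):
    """Split one answer into a decoder input and a next-token target.

    The decoder sees tokens 0..n-2 and is trained to predict tokens 1..n-1.
    """
    decoder_input = answer_ids[:-1]
    decoder_target = answer_ids[1:]
    return decoder_input, decoder_target

START_ID, END_ID = 1, 2                      # hypothetical marker ids
toy_answer = [START_ID, 14, 27, 9, END_ID]   # hypothetical tokenized answer
dec_in, dec_target = make_teacher_forcing_pair(toy_answer)

print(dec_in)
print(dec_target)
```

<p style="font-size: 20px">Without such a shift, input and target are identical at every position, which makes very high token accuracy easy to reach and hard to interpret.</p>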

<p style="font-size: 20px; margin-top: 10px; font-weight: bold">Model Compilation</p>

In [16]:
model = Model([encoder_inputs, decoder_inputs], decoder_outputs) # define the Keras model from its inputs and output
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy']) # compile the model with optimizer, loss, and metric
model.summary() # print a summary of the model's architecture

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, None)]       0           []                               
                                                                                                  
 embedding (Embedding)          (None, None, 100)    13555900    ['input_1[0][0]']                
                                                                                                  
 lstm (LSTM)                    [(None, None, 128),  117248      ['embedding[0][0]']              
                                 (None, 128),                                                     
                                 (None, 128)]                                                     
                                                                                              

In [17]:
early_stopping = EarlyStopping( # Keras callback that halts training when the model stops improving
    monitor='val_loss',       # monitor validation loss
    patience=4,             # number of epochs with no improvement after which training will be stopped.
    verbose=1,               # To display logs
    restore_best_weights=True # best weights (lowest validation loss) will be restored into the model.
)
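
<p style="font-size: 20px">The patience logic behind early stopping can be sketched in a few lines of plain Python. This is a simplified illustration, not the Keras implementation: it tracks the best validation loss seen so far and stops once <code>patience</code> consecutive epochs pass without improvement. The loss values are made up.</p>

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    stop_epoch is None when training runs through all epochs.
    """
    best_loss, best_epoch, waited = float("inf"), None, 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0   # improvement resets the counter
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return None, best_epoch

print(early_stopping_epoch([0.9, 0.5, 0.6, 0.55, 0.7, 0.8], patience=4))
print(early_stopping_epoch([0.9, 0.5, 0.4, 0.45, 0.3, 0.35], patience=4))
```

<p style="font-size: 20px">Under this simplified logic, with 6 epochs and a patience of 4, stopping can only trigger if the validation loss stops improving within the first two epochs.</p>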


In [18]:
# train the model on the training data, evaluate it on the validation data after each epoch, and use the early_stopping callback defined above to halt training if the model is not improving
# note: the same answer sequences are passed as the decoder input and as the target (no one-step shift)
model.fit([questions_train, answers_train], np.expand_dims(answers_train, -1),
          batch_size=BATCH_SIZE, epochs=EPOCHS,
          validation_data=([questions_val, answers_val], np.expand_dims(answers_val, -1)),
          callbacks=[early_stopping])


Epoch 1/6


2023-09-04 16:41:17.511852: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:630] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-09-04 16:41:18.039074: I tensorflow/compiler/xla/service/service.cc:173] XLA service 0x7f1eac23f9b0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2023-09-04 16:41:18.039088: I tensorflow/compiler/xla/service/service.cc:181]   StreamExecutor device (0): NVIDIA GeForce RTX 3090, Compute Capability 8.6
2023-09-04 16:41:18.049829: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2023-09-04 16:41:18.201105: I tensorflow/compiler/jit/xla_compilation_cache.cc:477] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.


Epoch 2/6
Epoch 3/6
Epoch 4/6
Epoch 5/6
Epoch 6/6


<keras.callbacks.History at 0x7f266786bd50>

In [19]:
model.save('medical_encoder_LSTM.h5') # save the trained model 

<div style="width:100%;height:1px; background-color:black"></div>

<p ><center><u style="font-size: 28px; margin-top: 10px; font-weight: bold">Model Evaluation</u></center></p>

<p style="font-size: 20px">
Model evaluation is an important part of creating machine learning models. It tests how well the model performs on data it has not seen during training, usually called test or validation data. We do this to check whether the model's answers are right and to understand any mistakes it makes.
</p>

<div style="width:100%;height:1px; background-color:black"></div>

In [17]:
loaded_model = load_model('medical_encoder_LSTM.h5', custom_objects={'BahdanauAttention': BahdanauAttention}) # load the trained model from the directory with the attention layer



<p style="font-size: 20px; margin-top: 10px; font-weight: bold">Accuracy and Loss</p>

In [18]:
loss, accuracy = loaded_model.evaluate([questions_val, answers_val], np.expand_dims(answers_val, -1)) #evaluate the model using validation dataset
print('Accuracy:', accuracy)
print('Loss:', loss)

2023-09-06 02:11:39.992760: I tensorflow/compiler/xla/stream_executor/cuda/cuda_blas.cc:630] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.


Accuracy: 0.9905073642730713
Loss: 0.07438204437494278


<p style="font-size: 20px">
    In a medical chatbot using doctor-patient dialogues, checking accuracy alone might not be enough. Medical conversations can have many right answers, and missing an important detail can be risky. Also, if the bot often gets rare diseases wrong but common ones right, its accuracy can still look high. So other ways of checking how good the bot is, such as how often it gets rare cases right, might be more informative.
</p>

<p style="font-size: 20px">
    In the context of a medical chatbot trained on doctor-patient dialogues, overlap metrics such as BLEU and ROUGE can be helpful. They evaluate how closely the bot's answers correspond to those given by experts. ROUGE examines the overlap of n-grams while taking both precision and recall into account, whereas BLEU focuses on word and phrase matching. Both scores indicate how well the chatbot's responses correspond with what a medical practitioner might say. This notebook reports ROUGE, together with perplexity.
</p>

<p style="font-size: 20px; margin-top: 10px; font-weight: bold">Perplexity</p>

<p style="font-size: 20px">
    Perplexity measures how well a model predicts the next word. For medical chatbots, a lower value means the model is less uncertain about the next word, which tends to go with smoother, more coherent responses.
</p>
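<p style="font-size: 20px">Perplexity is the exponential of the mean per-token cross-entropy, and the base of the exponential has to match the logarithm used by the loss. Keras cross-entropy losses use the natural logarithm, so for such a loss the matching base is e. The sketch below uses an illustrative loss value (not a result from this notebook) to show that the two conventions agree once the units are converted.</p>

```python
import math

def perplexity_from_loss(mean_cross_entropy, log_base=math.e):
    """Perplexity is the log base raised to the mean per-token cross-entropy."""
    return log_base ** mean_cross_entropy

loss_nats = 0.5                        # illustrative loss in natural-log units (nats)
loss_bits = loss_nats / math.log(2)    # the same loss expressed in bits

ppl_from_nats = perplexity_from_loss(loss_nats)              # base e for nats
ppl_from_bits = perplexity_from_loss(loss_bits, log_base=2)  # base 2 for bits

print(ppl_from_nats)
print(ppl_from_bits)
```

<p style="font-size: 20px">As a sanity check, a model that is uniformly unsure between 4 tokens has a cross-entropy of ln 4 nats and a perplexity of 4.</p>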

In [19]:
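perplexity = 2**(loss) # calculate perplexity with base 2; note: the Keras loss is in natural-log units, for which np.exp(loss) is the matching form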
print('Perplexity:', perplexity)

Perplexity: 1.0529099420153887


<p style="font-size: 20px; margin-top: 10px; font-weight: bold">ROUGE score</p>

<p style="font-size: 20px">
    The ROUGE score looks at how much the predicted text matches the reference text using different measures like precision, recall, and F1-score. It's particularly useful for tasks like summarization to see how much key information the model includes in its output. In the context of medical chatbot, ROUGE can help determine how closely the generated response matches a desired or reference answer, indicating the system's ability to provide accurate and relevant information.
</p>
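<p style="font-size: 20px">As a rough illustration of what ROUGE-1 measures, the sketch below computes unigram-overlap recall, precision, and F1 between a predicted and a reference sentence in plain Python. The function and the example sentences are assumptions for illustration; the <code>rouge</code> package may tokenize and count differently, so its numbers need not match this simplified version exactly.</p>

```python
from collections import Counter

def rouge_1(prediction, reference):
    """Unigram-overlap recall, precision and F1 between two whitespace-tokenized strings."""
    pred_counts = Counter(prediction.split())
    ref_counts = Counter(reference.split())
    overlap = sum((pred_counts & ref_counts).values())   # matches, clipped by reference counts
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(pred_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"r": recall, "p": precision, "f": f1}

toy_scores = rouge_1("take rest and drink water",
                     "please take rest and drink plenty of water")
print(toy_scores)
```

<p style="font-size: 20px">Here every predicted word appears in the reference (precision 1.0), but the prediction covers only 5 of the 8 reference words (recall 0.625).</p>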

In [20]:
questions_val_subset = questions_val[:30]
answers_val_subset = answers_val[:30]

In [21]:
predictions = loaded_model.predict([questions_val_subset, answers_val_subset], batch_size=1) # generate predictions on the validation subset; note that the reference answers are also passed as the decoder input
predicted_texts = [tokenizer.sequences_to_texts([pred]) for pred in predictions.argmax(axis=-1)] # convert the most likely token ids back to text
references = [tokenizer.sequences_to_texts([ans]) for ans in answers_val_subset] # convert the reference answers to text




2023-09-06 02:14:02.504344: W tensorflow/tsl/framework/cpu_allocator_impl.cc:82] Allocation of 4164372480 exceeds 10% of free system memory.


In [22]:
predicted_texts = [" ".join(text) if isinstance(text, list) else text for text in predicted_texts] # flatten each prediction from a list of strings into a single string
references = [" ".join(text) if isinstance(text, list) else text for text in references] # flatten each reference text (actual answer) into a single string
rouge = Rouge() # initialize the ROUGE scorer
scores = rouge.get_scores(predicted_texts, references, avg=True) # calculate average ROUGE scores by comparing predicted texts to reference texts
print('ROUGE score:', scores)

ROUGE score: {'rouge-1': {'r': 0.9851534188882684, 'p': 0.967791314984092, 'f': 0.9763772069305389}, 'rouge-2': {'r': 0.9759466361049649, 'p': 0.9464615367846199, 'f': 0.9609346073301777}, 'rouge-l': {'r': 0.9851534188882684, 'p': 0.967791314984092, 'f': 0.9763772069305389}}


<p style="font-size: 23px; margin-top: 10px; font-weight: bold"><u>Saving Results to the dataframe</u></p>

<p style="font-size: 20px">
  The results of each model are saved to a dataframe one by one so that they can be compared in a separate notebook (medical_chatbot_eval_metrics.ipynb).
</p>

In [23]:

eval_metrics_results = {
    'model_name': 'Encoder-Decoder LSTM',
    'loss': loss,
    'perplexity': perplexity,
    'accuracy': accuracy,
    'rouge-1_r': scores['rouge-1']['r'],
    'rouge-1_p': scores['rouge-1']['p'],
    'rouge-1_f': scores['rouge-1']['f'],
    'rouge-2_r': scores['rouge-2']['r'],
    'rouge-2_p': scores['rouge-2']['p'],
    'rouge-2_f': scores['rouge-2']['f'],
    'rouge-l_r': scores['rouge-l']['r'],
    'rouge-l_p': scores['rouge-l']['p'],
    'rouge-l_f': scores['rouge-l']['f']
}

eval_metrics_results_dataframe = pd.DataFrame([eval_metrics_results])

eval_metrics_results_dataframe.to_csv('eval_metrics_results_dataframe.csv', index=False) # save the dataframe

eval_metrics_results_dataframe

Unnamed: 0,model_name,loss,perplexity,accuracy,rouge-1_r,rouge-1_p,rouge-1_f,rouge-2_r,rouge-2_p,rouge-2_f,rouge-l_r,rouge-l_p,rouge-l_f
0,Encoder-Decoder LSTM,0.074382,1.05291,0.990507,0.985153,0.967791,0.976377,0.975947,0.946462,0.960935,0.985153,0.967791,0.976377


<p style="font-size: 23px; margin-top: 10px; font-weight: bold"><u>Answer to user queries by using the model</u></p>

In [26]:
user_query = input("Enter your medical query: ") # read a question from the user
response = answer_query(user_query, loaded_model, tokenizer, 256) # 256 matches the maxlen used for padding
print('Medical assistant', response)

Enter your medical query:  what is flu?


Medical assitant flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a virus flu is a viru
