# Load text

This notebook demonstrates the classification of EEG text reports from the Temple University Hospital EEG Corpus. The basic code structure is based on Example 1 in [this demo](https://www.tensorflow.org/tutorials/load_data/text).

First, let's install and import some useful libraries.

In [None]:
# Be sure you're using the stable versions of both tf and tf-text, for binary compatibility.
!pip install -q -U tensorflow
!pip install -q -U tensorflow-text


In [None]:
import collections
import pathlib
import re
import string

import tensorflow as tf

from tensorflow.keras import layers
from tensorflow.keras import losses
from tensorflow.keras import preprocessing
from tensorflow.keras import utils
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

import tensorflow_datasets as tfds
import tensorflow_text as tf_text

# Download and explore the dataset

First we'll use a handy tool called `gdown` to download the dataset (just the text reports) from where your team have stored them on Google Drive.

In [None]:
!gdown --id 1C1ViakYhUU39AyIJhBxDIZ5M1eVVdbwa

Downloading...
From: https://drive.google.com/uc?id=1C1ViakYhUU39AyIJhBxDIZ5M1eVVdbwa
To: /content/TUABtxt.tar
9.30MB [00:00, 56.6MB/s]


The dataset is compressed inside the archive file TUABtxt.tar, so let's extract it (like unzipping a zip file).

In [None]:
import tarfile
# The `with` block closes the archive automatically once extraction finishes.
with tarfile.open("TUABtxt.tar") as tar:
  tar.extractall()

Now we've extracted a folder called TUABtxt. Let's use the `pathlib` library to explore this directory.

In [None]:
dataset_dir = pathlib.Path('TUABtxt')
list(dataset_dir.iterdir())

[PosixPath('TUABtxt/abnormal'), PosixPath('TUABtxt/normal')]

The output above should show that we have a 'normal' and 'abnormal' subfolder. Let's see what's inside the 'abnormal' subfolder.

In [None]:
abnormal_dir = dataset_dir/'abnormal'
list(abnormal_dir.iterdir())

[PosixPath('TUABtxt/abnormal/051'),
 PosixPath('TUABtxt/abnormal/027'),
 PosixPath('TUABtxt/abnormal/000'),
 PosixPath('TUABtxt/abnormal/032'),
 PosixPath('TUABtxt/abnormal/086'),
 PosixPath('TUABtxt/abnormal/062'),
 PosixPath('TUABtxt/abnormal/006'),
 PosixPath('TUABtxt/abnormal/021'),
 PosixPath('TUABtxt/abnormal/064'),
 PosixPath('TUABtxt/abnormal/106'),
 PosixPath('TUABtxt/abnormal/050'),
 PosixPath('TUABtxt/abnormal/020'),
 PosixPath('TUABtxt/abnormal/028'),
 PosixPath('TUABtxt/abnormal/045'),
 PosixPath('TUABtxt/abnormal/025'),
 PosixPath('TUABtxt/abnormal/080'),
 PosixPath('TUABtxt/abnormal/007'),
 PosixPath('TUABtxt/abnormal/005'),
 PosixPath('TUABtxt/abnormal/044'),
 PosixPath('TUABtxt/abnormal/094'),
 PosixPath('TUABtxt/abnormal/078'),
 PosixPath('TUABtxt/abnormal/096'),
 PosixPath('TUABtxt/abnormal/010'),
 PosixPath('TUABtxt/abnormal/049'),
 PosixPath('TUABtxt/abnormal/076'),
 PosixPath('TUABtxt/abnormal/108'),
 PosixPath('TUABtxt/abnormal/030'),
 PosixPath('TUABtxt/abnormal

We see from the above output that the data is stored across many subfolders. The documentation for the TUAB set explains this folder structure. Below each of the arbitrary subfolders listed above is a further hierarchy of folders for individual subjects and recording sessions. You don't need to understand this structure in detail, because we'll use a function to automatically extract the txt data. But let's just take a look inside one of the txt files.

In [None]:
sample_file = abnormal_dir/'035/00003523/s003_2012_03_12/00003523_s003.txt'
with open(sample_file) as f:
  print(f.read())

CLINICAL HISTORY:  54 year old right handed female with recurrent seizures, 2 in February.  Three seizures per week.  Lost her insurance and was not able to go back to the Neurology Clinic.  Past history of stroke with left-sided weakness.
MEDICATIONS:  Topamax, Zocor, Celexa, Iron, Aggrenox, ASA, Valium
INTRODUCTION:  Digital video EEG was performed in lab using standard 10-20 system of electrode placement with 1 channel of EKG.   Hyperventilation and photic stimulation are performed.
DESCRIPTION OF THE RECORD:  In wakefulness, there is a 9-Hz alpha rhythm.  There is a small amount of subtle theta and a very subtle asymmetry in the left temporal region relative to the right temporal region.  Hyperventilation does not activate the record.  Features of drowsiness include anterior spread of the alpha rhythm.  Photic stimulation elicits a very subtle bilateral driving response at faster frequencies.
HR:    90 bpm
IMPRESSION:  Mildly abnormal for an adult of this age due to:
Very subtle un
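To get a feel for the nested folder structure without opening every level by hand, a recursive search can count the reports under each class folder. The following is a minimal sketch using only `pathlib`; the helper name `count_reports` is made up for illustration, and it assumes every report is a `.txt` file somewhere below its class folder.

```python
import pathlib

def count_reports(root):
    """Count the .txt files found anywhere below each immediate subfolder of root."""
    root = pathlib.Path(root)
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # rglob searches recursively, so intermediate subject/session folders are skipped over.
        counts[class_dir.name] = sum(1 for _ in class_dir.rglob("*.txt"))
    return counts
```

Calling `count_reports(dataset_dir)` in this notebook should return a dictionary with one entry for 'abnormal' and one for 'normal'.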

### Load the dataset

Next, we will load the data off disk and prepare it into a format suitable for training. The `text_dataset_from_directory` utility makes this easy, and creates a `tf.data.Dataset` object with labels ('normal' and 'abnormal') automatically recognised from the folder structure. (`tf.data` is a collection of tools for building input pipelines for machine learning.)

In [None]:
full_ds = preprocessing.text_dataset_from_directory(dataset_dir, batch_size=32)

Found 2993 files belonging to 2 classes.


When running a machine learning experiment, it is a best practice to divide your dataset into three splits: [train](https://developers.google.com/machine-learning/glossary#training_set), [validation](https://developers.google.com/machine-learning/glossary#validation_set), and [test](https://developers.google.com/machine-learning/glossary#test-set). There are no strict rules, but usually it's best to put most of your data in the training set (so that there's plenty to learn from). A 70-15-15 percent split is fairly common, as implemented below.

In [None]:
# Set the size of each subset of data:
n = len(list(full_ds)) # Number of batches in original dataset
n_train = int(0.7*n)   # Use about 70% as training data ...
n_val = int(0.15*n)    # ... 15% as validation data ...
n_test = n-n_train-n_val # ... and the rest as test data.
print(f"We have {n} batches in the full dataset.")
print(f"We'll use {n_train} batches in the training set, {n_val} in the validation set, and {n_test} in the test set.")

We have 94 batches in the full dataset.
We'll use 65 batches in the training set, 14 in the validation set, and 15 in the test set.


Now we're ready to actually make the split.

In [None]:
# Split the data into training, validation, and test sets:
raw_train_ds = full_ds.take(n_train)
raw_val_ds = full_ds.skip(n_train).take(n_val)
raw_test_ds = full_ds.skip(n_train+n_val)

assert(len(list(raw_test_ds))==n_test) # This assertion statement checks our code, to make sure the test dataset size is what we expect.
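The arithmetic behind the split can be checked without TensorFlow. The sketch below stands in a plain Python list for the batched dataset, with list slicing playing the role of `take` and `skip`. It uses the 2993 files and batch size of 32 reported above, and it assumes the batches come out in the same order on every pass, which is an assumption of this sketch rather than something the notebook verifies.

```python
import math

def split_sizes(n_batches, train_frac=0.7, val_frac=0.15):
    """Return (n_train, n_val, n_test) using the same rounding as the cell above."""
    n_train = int(train_frac * n_batches)
    n_val = int(val_frac * n_batches)
    n_test = n_batches - n_train - n_val
    return n_train, n_val, n_test

# 2993 files in batches of 32: the last batch is only partly filled.
n_batches = math.ceil(2993 / 32)
batches = list(range(n_batches))          # stand-in for the batched dataset

n_train, n_val, n_test = split_sizes(n_batches)
train = batches[:n_train]                 # like full_ds.take(n_train)
val = batches[n_train:n_train + n_val]    # like full_ds.skip(n_train).take(n_val)
test = batches[n_train + n_val:]          # like full_ds.skip(n_train + n_val)

print(n_batches, len(train), len(val), len(test))
```

Because the three slices are contiguous and non-overlapping, every batch lands in exactly one subset.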

Let's print out a few examples, to get more of a feel for the data.

In [None]:
for text_batch, label_batch in raw_train_ds.take(1):   # Take a single batch from the dataset.
  for i in range(10):                                  # Iterate through the first 10 examples in that batch.
    print("Report: ", text_batch.numpy()[i])
    print("Label:", label_batch.numpy()[i])

Report:  b'REASON FOR STUDY:  Seizures.\nCLINICAL HISTORY:  A 48-year-old right-handed female who presents with a history of seizures as a child at age 8 to 12 and then became seizure free.  Recently with breast cancer in 2008 and started treatment in 2009 radiation and chemo.  Had a seizure during sleep with tongue biting.  Seizures are now more frequent about 4 in 1 week.\nMEDICATIONS:  Dilantin.\nINTRODUCTION:  A routine EEG was performed using the standard 10-20 electrode placement system with the addition of anterior temporal and single lead EKG electrode.  The patient was recorded during wakefulness and drowsiness.  Activating procedures included photic stimulation and hyperventilation.\nTECHNICAL DIFFICULTIES:  Some muscle artifact.\nDESCRIPTION OF THE RECORD:  The record opens to a low amplitude posterior dominant rhythm that reaches 10 Hz which appears to react to eye opening.  There is some frontocentral beta.  Activating procedures including hyperventilation and photic stimu

The labels are `0` or `1`. To see which of these correspond to which string label, you can check the `class_names` property on the dataset, as below.


In [None]:
for i, label in enumerate(full_ds.class_names):
  print("Label", i, "corresponds to", label)

Label 0 corresponds to abnormal
Label 1 corresponds to normal
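The reason 'abnormal' gets label `0` is that, when labels are inferred from folder names, the class names are taken in alphanumerical order. A minimal sketch of that mapping is shown below; the helper name `infer_class_indices` is hypothetical and is not part of Keras.

```python
def infer_class_indices(subfolder_names):
    """Map each class folder name to an integer label, in sorted order."""
    class_names = sorted(subfolder_names)
    return {name: index for index, name in enumerate(class_names)}

print(infer_class_indices(["normal", "abnormal"]))
# {'abnormal': 0, 'normal': 1}
```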


### Prepare the dataset for training

Next, you will standardize, tokenize, and vectorize the data using the `preprocessing.TextVectorization` layer.
* Standardization refers to preprocessing the text, typically to remove punctuation or HTML elements to simplify the dataset.

* Tokenization refers to splitting strings into tokens (for example, splitting a sentence into individual words by splitting on whitespace).

* Vectorization refers to converting tokens into numbers so they can be fed into a neural network.

All of these tasks can be accomplished with this layer. You can learn more about each of these in the [API doc](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/TextVectorization).

* The default standardization converts text to lowercase and removes punctuation.

* The default tokenizer splits on whitespace.

* The default vectorization mode is `int`. This outputs integer indices (one per token). This mode can be used to build models that take word order into account. You can also use other modes, like `binary`, to build bag-of-word models.


Here we will use the `binary` mode to build a bag-of-words model (essentially one-hot encoding of whether each word in the vocabulary appears in the report). Then we will use the `int` mode (integer encoding of each word in the report, with order preserved) with a 1D ConvNet.
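To make the three steps concrete, here is a toy version written in plain Python. It is only a sketch of the idea, not how the Keras layer is implemented: the function names are made up, and reserving index 0 for padding and index 1 for unknown tokens is an assumption chosen for illustration.

```python
import collections
import string

def standardize(text):
    """Lowercase the text and strip ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))

def tokenize(text):
    """Split the standardized text on whitespace."""
    return standardize(text).split()

def build_vocab(texts, max_tokens):
    """Keep the most frequent tokens; index 0 is padding, index 1 is 'unknown'."""
    counts = collections.Counter(tok for t in texts for tok in tokenize(t))
    most_common = [tok for tok, _ in counts.most_common(max_tokens - 2)]
    return ["", "[UNK]"] + most_common

def int_encode(text, vocab, sequence_length):
    """One integer per token, in order, padded or truncated to sequence_length."""
    index = {tok: i for i, tok in enumerate(vocab)}
    ids = [index.get(tok, 1) for tok in tokenize(text)][:sequence_length]
    return ids + [0] * (sequence_length - len(ids))

def binary_encode(text, vocab):
    """One slot per vocabulary entry: 1.0 if the token occurs at all, else 0.0."""
    index = {tok: i for i, tok in enumerate(vocab)}
    present = {index.get(tok, 1) for tok in tokenize(text)}
    return [1.0 if i in present else 0.0 for i in range(len(vocab))]

train_texts = ["Normal EEG.", "Abnormal EEG, abnormal record."]
vocab = build_vocab(train_texts, max_tokens=10)
print(vocab)
print(int_encode("Abnormal EEG seizure", vocab, sequence_length=5))
print(binary_encode("Abnormal EEG seizure", vocab))
```

The `int` encoding keeps word order but needs a fixed length, while the `binary` encoding has one slot per vocabulary entry and discards order and repetition.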

In [None]:
VOCAB_SIZE = 10000

binary_vectorize_layer = TextVectorization(
    max_tokens=VOCAB_SIZE,
    output_mode='binary')

For `int` mode, in addition to maximum vocabulary size, you need to set an explicit maximum sequence length, which will cause the layer to pad or truncate sequences to exactly `output_sequence_length` values.

In [None]:
MAX_SEQUENCE_LENGTH = 250

int_vectorize_layer = TextVectorization(
    max_tokens=VOCAB_SIZE,
    output_mode='int',
    output_sequence_length=MAX_SEQUENCE_LENGTH)

Next, you will call `adapt` to make each `TextVectorization` layer adjust itself according to the vocabulary in the dataset.

Note: it's important to only use your training data when calling adapt (using the test set would leak information).

In [None]:
# To avoid some errors caused by non-standard characters, we create a function
# that does some additional 'cleaning' of the text.
def clean_text(text, labels):
  cleaned_version_of_text = tf.strings.unicode_transcode(text, "US ASCII", "UTF-8") 
  return cleaned_version_of_text
  
# Now apply our clean_text function to the training dataset only.
train_text = raw_train_ds.map(clean_text) 

# Finally, let the vectorize layers adjust themselves to fit the vocabulary of the dataset.
binary_vectorize_layer.adapt(train_text)
int_vectorize_layer.adapt(train_text)
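For intuition about what the cleaning step does, the following plain-Python sketch approximates it on a single byte string: the bytes are read as ASCII, anything outside the ASCII range is swapped for the Unicode replacement character, and the result is written back out as UTF-8. The helper name `clean_bytes` is made up, and the match with `tf.strings.unicode_transcode` is assumed to be approximate rather than exact.

```python
def clean_bytes(raw_bytes):
    """Read bytes as ASCII, replacing invalid bytes with U+FFFD, then re-encode as UTF-8."""
    return raw_bytes.decode("ascii", errors="replace").encode("utf-8")

# A byte outside the ASCII range (0xE9) is replaced rather than raising an error.
print(clean_bytes(b"EEG report \xe9"))
```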

See the result of using these layers to preprocess data:

In [None]:
def binary_vectorize_text(text, label):
  text = tf.expand_dims(text, -1)
  return binary_vectorize_layer(text), label

In [None]:
def int_vectorize_text(text, label):
  text = tf.expand_dims(text, -1)
  return int_vectorize_layer(text), label

In [None]:
# Retrieve a batch (of 32 reports and labels) from the dataset
text_batch, label_batch = next(iter(raw_train_ds))
first_report, first_label = text_batch[3], label_batch[3]
print("Report", first_report)
print("Label", first_label)

Report tf.Tensor(b'CLINICAL HISTORY:  A 66-year-old woman with history of epilepsy.  The patient repeatedly stops seizure medicine.  Ran out of Keppra and has had multiple spells.  Two to 4 seizures per month.\nMEDICATIONS:  Imitrex, Fosamax, and many others.\nINTRODUCTION:  Digital video EEG was performed in the lab using standard 10-20 system of electrode placement with 1 channel of EKG.  Photic stimulation are completed.\nDESCRIPTION OF THE RECORD:  In wakefulness, there is a 9 Hz alpha rhythm with a generous amount of low voltage, frontocentral beta activity.  Occasional bursts of shifting and flowing are noted from the right or left temporal regions.  Features of drowsiness include an increase in rhythmicity of the background followed by post and subtle vertex waves.\nPhotic stimulation was performed while the patient was drifting in and out of sleep but a driving response was obtained.\nIMPRESSION:  This is an abnormal EEG due to:\nBackground slowing.\nDisorganization.\nShifting 

In [None]:
print("'binary' vectorized report:", 
      binary_vectorize_text(first_report, first_label)[0])

'binary' vectorized report: tf.Tensor([[0. 1. 1. ... 0. 0. 0.]], shape=(1, 6475), dtype=float32)


In [None]:
print("'int' vectorized report:",
      int_vectorize_text(first_report, first_label)[0])

'int' vectorized report: tf.Tensor(
[[  12   18    6 2089  113    7   18    3   77    2   17 1998 1354   99
  2957 2538  178    3  139    4  136  171  208  368  735   10  312   24
   249  569   34 2370 2043    4  570  365   48   60   59    9   11   13
     8    2   82   50   45   46   44    3   29   51    7   55   71    3
    37   21   19   30  206   41    3    2   14    8   36   15    5    6
   162   40   70   31    7    6  247  112    3  117  128  120   61   35
   198  147    3  215    4 3755   30   76  124    2   20   67   27   47
   213   84    3   80  179   25  116    8 1507    3    2   39  225   93
   233    4  262  141   56   21   19   11   13  232    2   17   11  515
     8    4  178    3   23   90    6  103  168   11 1465   43   16    5
    25   42    9   74   10   39   33  430  215    4   33   12   49  173
   170   30  499   10    6  182  104 1820  695   90   15    5    6  403
   193   97   39   33    4  215   33   49    7   12   18    4  555    5
   281   54   16 2014  358  

As you can see above, `binary` mode returns an array denoting which tokens exist at least once in the input, while `int` mode replaces each token by an integer, thus preserving their order. You can look up the token (string) that each integer corresponds to by calling `.get_vocabulary()` on the layer.

In [None]:
print("42 ---> ", int_vectorize_layer.get_vocabulary()[42])
print("44 ---> ", int_vectorize_layer.get_vocabulary()[44])
print("Vocabulary size: {}".format(len(int_vectorize_layer.get_vocabulary())))

42 --->  abnormal
44 --->  system
Vocabulary size: 6472


You are nearly ready to train your model. As a final preprocessing step, you will apply the `TextVectorization` layers you created earlier to the train, validation, and test datasets.

In [None]:
binary_train_ds = raw_train_ds.map(binary_vectorize_text)
binary_val_ds = raw_val_ds.map(binary_vectorize_text)
binary_test_ds = raw_test_ds.map(binary_vectorize_text)

int_train_ds = raw_train_ds.map(int_vectorize_text)
int_val_ds = raw_val_ds.map(int_vectorize_text)
int_test_ds = raw_test_ds.map(int_vectorize_text)

# Rule-Based (non-ML) Approach

Looking through the reports, it seems as though it's usually stated quite clearly when the EEG is abnormal. Rather than attempting any machine learning, why don't we just look for that key word (or related words/phrases) in the text? A very basic version of this approach is implemented below.

In [None]:
'''
TODO:
Check labeling.
'''
# First initialise some counters
n = 0
n_correct = 0
n_failed_decode = 0


# Iterate over all batches, taking the text and labels batch-by-batch.
# N.B. take(-1) has the effect of pulling out all the batches, instead of a specific number, as explained in the docs here: https://www.tensorflow.org/api_docs/python/tf/data/Dataset#take
for text_batch, label_batch in full_ds.take(-1):

  # Iterate over the report examples in the batch:
  for ind,text in enumerate(text_batch):

    # Get rid of any pesky non-standard characters using the function we created previously.
    cleaned_text = clean_text(text,0)
    # Then convert it from a tensorflow Tensor to a python string so that we can 
    # use some standard python text analysis on it.
    cleaned_and_decoded_text = cleaned_text.numpy().decode("UTF-8")

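    # Locate the IMPRESSION section of the report (case-insensitive search).
    find_impression = re.search("impression", cleaned_and_decoded_text.lower(), flags=re.IGNORECASE)
    if find_impression is not None:
      first_char,last_char = find_impression.span()
    else:
      # No IMPRESSION section found: fall back to searching the whole report.
      first_char,last_char = 0, len(cleaned_and_decoded_text)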

    #find_clinical = re.search("clinical correlation:", cleaned_and_decoded_text.lower(), flags=re.IGNORECASE)
    #if find_clinical != None:
    #  first_char_clin, last_char_clin = find_clinical.span()
    #  last_char = first_char_clin
    #else:
    last_char = last_char +50

    searched_text = cleaned_and_decoded_text.lower()[first_char:last_char]


    is_abnormal = re.search('abnormal|absence of normal|outside of the range of normal|not normal', searched_text.lower(), flags=re.IGNORECASE)
    #also_abnormal = re.search("absence of normal", searched_text.lower(), flags=re.IGNORECASE)
    
    # Check if the word 'abnormal' is in the report, and label it accordingly.
    if is_abnormal:
      predicted_label = 0
    else:
      predicted_label = 1
      
    # If we predicted correctly, add one to our count of correct predictions.
    if predicted_label==label_batch[ind]:
      n_correct = n_correct+1
    else:
      # Print the misclassified example so that it can be inspected.
      # To silence this output, replace the four print lines below with `pass`.
      print(f"This example was classified with label {predicted_label} but its actual label is {label_batch[ind].numpy()}.")
      print("---")
      print(cleaned_and_decoded_text)
      print("---------------------")

    # Add one to our count of the total number of examples examined.
    n = n+1


print(f"Accuracy = {round(100*n_correct/n,3)} percent ({n_correct} correct predictions out of {n}). {n - n_correct} misclassified.")


This example was classified with label 0 but its actual label is 1.
---
CLINICAL HISTORY:  44 year old right handed male, childhood epilepsy, refractory epilepsy.  He has had seizures characterized by tonic clonic seizures with loss of consciousness, postictal confusion, and visual impairment. Seizures types include complex partial seizures with epigastric rising.
MEDICATIONS:  Vimpat, phenobarbital.
INTRODUCTION:  Digital video EEG was performed in lab using standard 10-20 system of electrode placement with 1 channel EKG.  Hyperventilation and photic stimulation were performed.
DESCRIPTION OF THE RECORD:  In wakefulness, there is an alpha rhythm of 8.0 Hz.  This is sometimes disrupted on the left compared to the right.  In addition, there is occasional, rhythmic slowing in the left temporal region with rhythmic 3-5 Hz activity.  Rare left temporal sharp waves are observed.  As the patient transitions towards sleep, there is some more sharply contoured slowing from the left.  Sleep is 
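The same rule can be packaged as a small standalone function, which makes it easy to try on hand-written strings without iterating over the dataset. This is a sketch of the logic in the cell above rather than a replacement for it: the name `rule_based_label` and the `window` parameter are made up here, and reports with no "impression" heading are assumed to be searched in full.

```python
import re

# The same phrases searched for in the cell above.
ABNORMAL_PATTERN = re.compile(
    r"abnormal|absence of normal|outside of the range of normal|not normal",
    flags=re.IGNORECASE)

def rule_based_label(report, window=50):
    """Return 0 (abnormal) or 1 (normal) from the text near the IMPRESSION heading."""
    match = re.search("impression", report, flags=re.IGNORECASE)
    if match is None:
        searched = report  # no heading found: search the whole report
    else:
        # Look at the heading itself plus the next `window` characters.
        searched = report[match.start():match.end() + window]
    return 0 if ABNORMAL_PATTERN.search(searched) else 1

print(rule_based_label("IMPRESSION:  This is an abnormal EEG due to:\nBackground slowing."))
print(rule_based_label("IMPRESSION: Normal EEG in wakefulness and drowsiness."))
print(rule_based_label("IMPRESSION: No abnormalities seen."))
```

The third call illustrates a weakness of simple keyword matching: "No abnormalities" still contains the string "abnormal", so a negated finding is labelled as abnormal.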

### Configure the dataset for performance

These are two important methods you should use when loading data to make sure that I/O does not become blocking.

`.cache()` keeps data in memory after it's loaded off disk. This will ensure the dataset does not become a bottleneck while training your model. If your dataset is too large to fit into memory, you can also use this method to create a performant on-disk cache, which is more efficient to read than many small files.

`.prefetch()` overlaps data preprocessing and model execution while training. 

You can learn more about both methods, as well as how to cache data to disk in the [data performance guide](https://www.tensorflow.org/guide/data_performance).

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

def configure_dataset(dataset):
  return dataset.cache().prefetch(buffer_size=AUTOTUNE)

In [None]:
binary_train_ds = configure_dataset(binary_train_ds)
binary_val_ds = configure_dataset(binary_val_ds)
binary_test_ds = configure_dataset(binary_test_ds)

int_train_ds = configure_dataset(int_train_ds)
int_val_ds = configure_dataset(int_val_ds)
int_test_ds = configure_dataset(int_test_ds)

### Train the model
It's time to create our neural network. For the `binary` vectorized data, train a simple bag-of-words linear model:

In [None]:
binary_model = tf.keras.Sequential([layers.Dense(2)])
binary_model.compile(
    loss=losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer='adam',
    metrics=['accuracy'])
history = binary_model.fit(
    binary_train_ds, validation_data=binary_val_ds, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


Next, you will train a 1D ConvNet on the `int` vectorized data.

In [None]:
def create_model(vocab_size, num_labels):
  model = tf.keras.Sequential([
      layers.Embedding(vocab_size, 64, mask_zero=True),
      layers.Conv1D(64, 5, padding="valid", activation="relu", strides=2),
      layers.GlobalMaxPooling1D(),
      layers.Dense(num_labels)
  ])
  return model

In [None]:
# vocab_size is VOCAB_SIZE + 1 since 0 is used additionally for padding.
int_model = create_model(vocab_size=VOCAB_SIZE + 1, num_labels=2)
int_model.compile(
    loss=losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer='adam',
    metrics=['accuracy'])
history = int_model.fit(int_train_ds, validation_data=int_val_ds, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Compare the two models:

In [None]:
print("Linear model on binary vectorized data:")
print(binary_model.summary())

Linear model on binary vectorized data:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 2)                 12952     
Total params: 12,952
Trainable params: 12,952
Non-trainable params: 0
_________________________________________________________________
None


In [None]:
print("ConvNet model on int vectorized data:")
print(int_model.summary())

ConvNet model on int vectorized data:
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, None, 64)          640064    
_________________________________________________________________
conv1d (Conv1D)              (None, None, 64)          20544     
_________________________________________________________________
global_max_pooling1d (Global (None, 64)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 130       
Total params: 660,738
Trainable params: 660,738
Non-trainable params: 0
_________________________________________________________________
None
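The parameter counts in the two summaries can be reproduced by hand, which is a useful check on what each layer contains. The sketch below uses the standard formulas for dense, embedding, and 1D convolution layers; the input width of 6475 is taken from the shape of the `binary` vectorized report printed earlier, and the helper names are made up for illustration.

```python
def dense_params(inputs, units):
    """One weight per input-unit pair, plus one bias per unit."""
    return inputs * units + units

def embedding_params(vocab_size, dim):
    """One vector of length dim per vocabulary index."""
    return vocab_size * dim

def conv1d_params(in_channels, filters, kernel_size):
    """kernel_size * in_channels weights per filter, plus one bias per filter."""
    return kernel_size * in_channels * filters + filters

# Linear bag-of-words model: a single Dense(2) layer on 6475 binary features.
binary_total = dense_params(6475, 2)

# ConvNet: Embedding(10001, 64) -> Conv1D(64 filters, kernel size 5) -> Dense(2).
convnet_total = (embedding_params(10001, 64)
                 + conv1d_params(64, 64, 5)
                 + dense_params(64, 2))

print(binary_total)   # 12952
print(convnet_total)  # 660738
```

Almost all of the ConvNet's parameters sit in the embedding table, which is why it is roughly fifty times larger than the linear model.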


Evaluate both models on the test data:

In [None]:
binary_loss, binary_accuracy = binary_model.evaluate(binary_test_ds)
int_loss, int_accuracy = int_model.evaluate(int_test_ds)

print("Binary model accuracy: {:2.2%}".format(binary_accuracy))
print("Int model accuracy: {:2.2%}".format(int_accuracy))

Binary model accuracy: 97.42%
Int model accuracy: 99.35%


Note: This example dataset represents a rather simple classification problem. More complex datasets and problems bring out subtle but significant differences in preprocessing strategies and model architectures. Be sure to try out different hyperparameters and epochs to compare various approaches.

### Export the model

In the code above, you applied the `TextVectorization` layer to the dataset before feeding text to the model. If you want to make your model capable of processing raw strings (for example, to simplify deploying it), you can include the `TextVectorization` layer inside your model. To do so, you can create a new model using the weights you just trained.

In [None]:
# `binary_model` outputs raw logits (one score per label), so no extra activation
# is appended and the loss uses from_logits=True, matching how it was trained.
export_model = tf.keras.Sequential(
    [binary_vectorize_layer, binary_model])

export_model.compile(
    loss=losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer='adam',
    metrics=['accuracy'])

# Evaluate it on `raw_val_ds`, which yields raw strings
loss, accuracy = export_model.evaluate(raw_val_ds)
print("Accuracy: {:2.2%}".format(accuracy))

Accuracy: 98.21%


Now your model can take raw strings as input and predict a score for each label using `model.predict`. Define a function to find the label with the maximum score:

In [None]:
labels = ['abnormal', 'normal']
def get_string_labels(predicted_scores_batch):
  predicted_int_labels = tf.argmax(predicted_scores_batch, axis=1)
  predicted_labels = []
  for intlab in predicted_int_labels:
    predicted_labels.append(labels[intlab.numpy()])
  # predicted_labels = tf.gather(['raw_train_ds.class_names'], predicted_int_labels)
  return predicted_labels

In [None]:
print(round(0.5454, 2))

0.55


### Run inference on new data

Now we can create a few custom inputs to explore the model's behaviour.

In [None]:
inputs = [
    "This EEG is totally normal",  # normal
    "This recording is markedly abnormal",  # abnormal
    "This shows no abnormalities",  # normal
    "Some ever so slight abnormalities, but then again, who can say what normal really means",  # abnormal
    "They seem fine.",  # normal?
    "They are fine.", # normal
    "This person is fine.",  # normal
    "This person is very unwell.",  # abnormal
    "IMPRESSION: abnormal", # abnormal
    "IMPRESSION: markedly abnormal", # abnormal
    "IMPRESSION: This recording is markedly abnormal", # abnormal
    "IMPRESSION: Outside of the range of normal", # abnormal
]
predicted_scores = export_model.predict(inputs)
print(predicted_scores)
predicted_labels = get_string_labels(predicted_scores)
for input, label, scores in zip(inputs, predicted_labels, predicted_scores):
  print("-----------------------------------")
  print("Question: ", input)
  print("Predicted label: ", label)
  print("Confidence scores: abnormal vs normal")
  print(f"        {round(scores[0], 2)} vs {round(scores[1], 2)}")

[[-1.7600393e-01  3.1118581e-01]
 [ 3.6772490e-01 -2.7536553e-01]
 [ 3.0246656e-04  1.1760336e-01]
 [-3.6635518e-02 -5.3300917e-02]
 [ 7.2386578e-02 -2.7701855e-03]
 [ 5.6419563e-02  1.5944829e-02]
 [ 6.2260274e-02  6.6979989e-02]
 [ 3.3484861e-02  4.2877514e-03]
 [ 2.2388773e-01 -2.0371127e-01]
 [ 2.8156894e-01 -2.6778537e-01]
 [ 3.5900766e-01 -2.7848703e-01]
 [-2.5846583e-01  3.1936619e-01]]
-----------------------------------
Question:  This EEG is totally normal
Predicted label:  normal
Confidence scores: abnormal vs normal
        -0.18000000715255737 vs 0.3100000023841858
-----------------------------------
Question:  This recording is markedly abnormal
Predicted label:  abnormal
Confidence scores: abnormal vs normal
        0.3700000047683716 vs -0.2800000011920929
-----------------------------------
Question:  This shows no abnormalities
Predicted label:  normal
Confidence scores: abnormal vs normal
        0.0 vs 0.11999999731779099
-----------------------------------
Question
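The scores printed above include negative values, so they are not probabilities. Assuming they are raw logits, a softmax turns each pair into two numbers between 0 and 1 that sum to 1, which is easier to read as a confidence. The sketch below is plain Python and uses the first pair of scores from the output above; the function name `softmax` is defined here and is not a call into TensorFlow.

```python
import math

def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    # Subtracting the maximum first keeps the exponentials numerically stable.
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Scores for "This EEG is totally normal": [abnormal, normal]
probs = softmax([-0.17600393, 0.31118581])
print([round(p, 2) for p in probs])
```

Softmax does not change which label has the highest score, so the predicted labels stay the same.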

Including the text preprocessing logic inside your model enables you to export a model for production that simplifies deployment and reduces the potential for [training/serving skew](https://developers.google.com/machine-learning/guides/rules-of-ml#training-serving_skew).

There is a performance difference to keep in mind when choosing where to apply your `TextVectorization` layer. Using it outside of your model enables you to do asynchronous CPU processing and buffering of your data when training on GPU. So, if you're training your model on the GPU, you probably want to go with this option to get the best performance while developing your model, then switch to including the TextVectorization layer inside your model when you're ready to prepare for deployment.

Visit this [tutorial](https://www.tensorflow.org/tutorials/keras/save_and_load) to learn more about saving models.