# Code for Group 1 Ai: Neural Network for EEG Classification Sprint 2

This notebook demonstrates how to extract, transform, and load the TUABtxt dataset for use with Tensorflow.

In [1]:
# Be sure you're using the stable versions of both tf and tf-text, for binary compatibility.
!pip install -q -U tensorflow
!pip install -q -U tensorflow-text

[K     |████████████████████████████████| 4.3 MB 5.2 MB/s 
[?25h

In [3]:
import collections
import pathlib
import re
import string

import tensorflow as tf

from tensorflow.keras import layers
from tensorflow.keras import losses
from tensorflow.keras import preprocessing
from tensorflow.keras import utils
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

import tensorflow_datasets as tfds
import tensorflow_text as tf_text

# Download and explore the dataset

First we'll use a handy tool called `gdown` to download the dataset (just the text reports) from where your team have stored them on Google Drive.

In [4]:
!gdown --id 1C1ViakYhUU39AyIJhBxDIZ5M1eVVdbwa

Downloading...
From: https://drive.google.com/uc?id=1C1ViakYhUU39AyIJhBxDIZ5M1eVVdbwa
To: /content/TUABtxt.tar
9.30MB [00:00, 56.8MB/s]


The dataset is compressed inside the archive file TUABtxt.tar, so let's extract it (like unzipping a zip file).

In [5]:
import tarfile
tar = tarfile.open("TUABtxt.tar")
tar.extractall()
tar.close()

Now we've extracted a folder called TUABtxt. Let's use pathlib library to explore this directory.

In [6]:
dataset_dir = pathlib.Path('TUABtxt') # First create a Path variable.
list(dataset_dir.iterdir())           # Then print a list of the folders contained in TUABtxt

[PosixPath('TUABtxt/normal'), PosixPath('TUABtxt/abnormal')]

The output above should show that we have a 'normal' and 'abnormal' subfolder. Let's see what's inside the 'abnormal' subfolder.

In [7]:
abnormal_dir = dataset_dir/'abnormal'
list(abnormal_dir.iterdir())

[PosixPath('TUABtxt/abnormal/005'),
 PosixPath('TUABtxt/abnormal/071'),
 PosixPath('TUABtxt/abnormal/043'),
 PosixPath('TUABtxt/abnormal/039'),
 PosixPath('TUABtxt/abnormal/078'),
 PosixPath('TUABtxt/abnormal/068'),
 PosixPath('TUABtxt/abnormal/013'),
 PosixPath('TUABtxt/abnormal/018'),
 PosixPath('TUABtxt/abnormal/015'),
 PosixPath('TUABtxt/abnormal/028'),
 PosixPath('TUABtxt/abnormal/056'),
 PosixPath('TUABtxt/abnormal/038'),
 PosixPath('TUABtxt/abnormal/045'),
 PosixPath('TUABtxt/abnormal/060'),
 PosixPath('TUABtxt/abnormal/004'),
 PosixPath('TUABtxt/abnormal/012'),
 PosixPath('TUABtxt/abnormal/051'),
 PosixPath('TUABtxt/abnormal/058'),
 PosixPath('TUABtxt/abnormal/049'),
 PosixPath('TUABtxt/abnormal/024'),
 PosixPath('TUABtxt/abnormal/062'),
 PosixPath('TUABtxt/abnormal/032'),
 PosixPath('TUABtxt/abnormal/006'),
 PosixPath('TUABtxt/abnormal/061'),
 PosixPath('TUABtxt/abnormal/066'),
 PosixPath('TUABtxt/abnormal/037'),
 PosixPath('TUABtxt/abnormal/052'),
 PosixPath('TUABtxt/abnormal

We see from the above output that the data is stored across many subfolders. The documentation for the TUAB set explains this folder structure. Below each of the arbitrary subfolders listed above is a further hierarchy a folders for individual subjects and recording sessions. You don't need to understand this structure in detail, because we'll use a function to automatically extract the txt data. But let's just take a look inside one of the txt files.

In [8]:
sample_file = abnormal_dir/'035/00003523/s003_2012_03_12/00003523_s003.txt'
with open(sample_file) as f:
  print(f.read())

CLINICAL HISTORY:  54 year old right handed female with recurrent seizures, 2 in February.  Three seizures per week.  Lost her insurance and was not able to go back to the Neurology Clinic.  Past history of stroke with left-sided weakness.
MEDICATIONS:  Topamax, Zocor, Celexa, Iron, Aggrenox, ASA, Valium
INTRODUCTION:  Digital video EEG was performed in lab using standard 10-20 system of electrode placement with 1 channel of EKG.   Hyperventilation and photic stimulation are performed.
DESCRIPTION OF THE RECORD:  In wakefulness, there is a 9-Hz alpha rhythm.  There is a small amount of subtle theta and a very subtle asymmetry in the left temporal region relative to the right temporal region.  Hyperventilation does not activate the record.  Features of drowsiness include anterior spread of the alpha rhythm.  Photic stimulation elicits a very subtle bilateral driving response at faster frequencies.
HR:    90 bpm
IMPRESSION:  Mildly abnormal for an adult of this age due to:
Very subtle un

### Load the full dataset

Next, we will load the data off disk and prepare it into a format suitable for training. The [text_dataset_from_directory](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/text_dataset_from_directory) utility makes this easy, and creates a `tf.data.Dataset` object with labels ('normal' and 'abnormal') automatically recognised from the folder structure. ([tf.data](https://www.tensorflow.org/guide/data) is a collection of tools for building input pipelines for machine learning).

In [9]:
full_ds = preprocessing.text_dataset_from_directory(dataset_dir, batch_size=32)

Found 2993 files belonging to 2 classes.


When running a machine learning experiment, it is a best practice to divide your dataset into three splits: [train](https://developers.google.com/machine-learning/glossary#training_set), [validation](https://developers.google.com/machine-learning/glossary#validation_set), and [test](https://developers.google.com/machine-learning/glossary#test-set). There are no strict rules, but usually it's best to put most of your data in the training (so that there's plenty to learn from. A 70-15-15 percent split is fairly common, as implemented below.

In [10]:
# Set the size of each subset of data:
n = len(list(full_ds)) # Number of batches in original dataset
n_train = int(0.70*n)   # Use about 70% as training data ... *note* changed to 65
n_val = int(0.15*n)    # ... 15% as validation data ... *note* changed to 20
n_test = n-n_train-n_val # ... and the rest as test data.
print(f"We have {n} batches in the full dataset.")
print(f"We'll use {n_train} batches in the training set, {n_val} in the validation set, and {n_test} in the test set.")

We have 94 batches in the full dataset.
We'll use 65 batches in the training set, 14 in the validation set, and 15 in the test set.


Now we're ready to actually make the split.

In [None]:
# Split the data into training, validation, and test sets:
raw_train_ds = full_ds.take(n_train)
raw_val_ds = full_ds.skip(n_train).take(n_val)
raw_test_ds = full_ds.skip(n_train+n_val)

assert(len(list(raw_test_ds))==n_test) # This assertion statement checks our code, to make sure the test dataset size is what we expect.

Let's print out a few examples, to get more of a feel for the data.

In [None]:
for text_batch, label_batch in raw_train_ds.take(1):   # Take a single batch from the dataset.
  for i in range(5):                                  # Iterate through the first 10 examples in that batch.
    print("Report: ", text_batch.numpy()[i])
    print("Label:", label_batch.numpy()[i])

Report:  b"CLINICAL HISTORY: 28 year old left handed male with a history of seizures since 5\nmonths old. The last seizure was 12128/2010. Prior seizure was in 2001.\nMEDICATIONS: Lamictal.\nDigital video\xc2\xb7EEG was performed at bedside using standard 10-20\nsystem of electrode placement with 1 channel of EKG. The study comprised patient\nwakefulness and drowsiness.\nDESCRIPTION OF THE RECORD: The background EEG demonstrates a well\norganized alpha rhythm. Features of drowsiness demonstrate some temporal region\nsharp waves in T4 in addition to focal slowing. The hyperventilation generates diffuse\nslowing. In addition, other features of drowsiness are stage 2 sleep. The photic\nstimulation generates some vertex waves.\nHR: 60bpm\nIMPRESSION: Abnormal EEG due to:\n1. Background slowing.\n2. Some T 4 sharp waves in the right side.\nCLINICAL These findings may suggest features related to the\npatient's history of seizures however, no epileptiform activity was documented at this\ntime

The labels are `0` or `1`. To see which of these correspond to which string label, you can check the `class_names` property on the dataset, as below.


In [None]:
for i, label in enumerate(full_ds.class_names):
  print("Label", i, "corresponds to", label)

Label 0 corresponds to abnormal
Label 1 corresponds to normal


### Prepare the dataset for training

Next, you will standardize, tokenize, and vectorize the data using the `preprocessing.TextVectorization` layer.
* Standardization refers to preprocessing the text, typically to remove punctuation or HTML elements to simplify the dataset.

* Tokenization refers to splitting strings into tokens (for example, splitting a sentence into individual words by splitting on whitespace).

* Vectorization refers to converting tokens into numbers so they can be fed into a neural network.

All of these tasks can be accomplished with this layer. You can learn more about each of these in the [API doc](https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/TextVectorization).

* The default standardization converts text to lowercase and removes punctuation.

* The default tokenizer splits on whitespace.

* The default vectorization mode is `int`. This outputs integer indices (one per token). This mode can be used to build models that take word order into account. You can also use other modes, like `binary`, to build bag-of-word models.


Here we will use the `binary` mode to build a bag-of-words model (essentially one-hot encoding of whether each word in the vocabulary appears in the report). Then we will use the `int` mode (integer encoding of each word in the report, with order preserved) with a 1D ConvNet.

In [None]:
VOCAB_SIZE = 10000

binary_vectorize_layer = TextVectorization(
    max_tokens=VOCAB_SIZE,
    output_mode='binary')

For `int` mode, in addition to maximum vocabulary size, you need to set an explicit maximum sequence length, which will cause the layer to pad or truncate sequences to exactly sequence_length values.

In [None]:
MAX_SEQUENCE_LENGTH = 250

int_vectorize_layer = TextVectorization(
    max_tokens=VOCAB_SIZE,
    output_mode='int',
    output_sequence_length=MAX_SEQUENCE_LENGTH)

Next, you will call `adapt` to make the VectorizationLayer adjust itself according to the vocabulary in the dataset.

Note: it's important to only use your training data when calling adapt (using the test set would leak information).

In [None]:
# To avoid some errors caused by non-standard characters, we create a function
# that does some additional 'cleaning' of the text.
def clean_text(text, labels):
  cleaned_version_of_text = tf.strings.unicode_transcode(text, "US ASCII", "UTF-8") 
  return cleaned_version_of_text
  
# Now apply our clean_text function to the full dataset.
train_text = raw_train_ds.map(clean_text) 

# Finally, let the vectorize layers adjust themselves to fit the vocabulary of the dataset.
binary_vectorize_layer.adapt(train_text)
int_vectorize_layer.adapt(train_text)

See the result of using these layers to preprocess data:

In [None]:
def binary_vectorize_text(text, label):
  text = tf.expand_dims(text, -1)
  return binary_vectorize_layer(text), label

In [None]:
def int_vectorize_text(text, label):
  text = tf.expand_dims(text, -1)
  return int_vectorize_layer(text), label

In [None]:
# Retrieve a batch (of 32 reports and labels) from the dataset
text_batch, label_batch = next(iter(raw_train_ds))
first_report, first_label = text_batch[0], label_batch[0]
print("Report", first_report)
print("Label", first_label)

Report tf.Tensor(b'CLINICAL HISTORY: 74 year old right handed male post-op day 5 from a CABG who had an episode of unresponsiveness, likely seizure. Past history of seizures or epilepsy.\nMEDICATIONS: Dilantin, Phenobarbital, ASA, Prilosec, Lipitor, Tamsulosin\nINTRODUCTION: Digital video EEG was performed at bedside using standard 10-20 system of electrode placement with 1 channel of EKG. The patient is awake and interactive.\nDESCRIPTION OF THE RECORD: In wakefulness, there is an 8.5 Hz alpha rhythm and a background with excess theta. There is occasional, shifting slowing noted in the temporal regions. Features of drowsiness include hypersynchronous rhythmic slowing. Deeper stages of sleep are not achieved.\nHR: 90 bpm\nIMPRESSION: Abnormal EEG due to:\n1. Background slowing.\n2. Disorganized pattern.\nCLINICAL CORRELATION: No epileptiform features were noted. If epilepsy is an important consideration, a follow-up study is suggested.\n\n\n\n\n', shape=(), dtype=string)
Label tf.Tenso

In [None]:
print("'binary' vectorized report:", 
      binary_vectorize_text(first_report, first_label)[0])

'binary' vectorized report: tf.Tensor([[0. 1. 1. ... 0. 0. 0.]], shape=(1, 6377), dtype=float32)


In [None]:
print("'int' vectorized report:",
      int_vectorize_text(first_report, first_label)[0])

'int' vectorized report: tf.Tensor(
[[  12   18 1756   72   74   21  110  139 4674  525  334  127    6 2080
   209  172   25  244    3  769  470   98  181   18    3   24   63   76
    35  182  603  406  979  830 2119   48   60   57    9   11   13   28
   163   49   45   46   44    3   30   50    7   55   71    3   39    2
    17    5   89    4 1456   41    3    2   14    8   37   15    5   25
   570   40   69   32    4    6   38    7  230   77   15    5  200  215
    31   80    8    2   54  219   78    3   81  173  712  126   31  277
   261    3   23   29   52  836  100  348   86   42   43    9   73   10
    55   38   31   96  411  128   12   47   19   51   78   26   80  148
    76    5   25  264  267    6  329  103    5  318    0    0    0    0
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0    0    0  

As you can see above, `binary` mode returns an array denoting which tokens exist at least once in the input, while `int` mode replaces each token by an integer, thus preserving their order. You can lookup the token (string) that each integer corresponds to by calling `.get_vocabulary()` on the layer.

In [None]:
print("2---> ", int_vectorize_layer.get_vocabulary()[2])
print("18 ---> ", int_vectorize_layer.get_vocabulary()[18])
print("Vocabulary size: {}".format(len(int_vectorize_layer.get_vocabulary())))

2--->  the
18 --->  history
Vocabulary size: 6377


You are nearly ready to train your model. As a final preprocessing step, you will apply the `TextVectorization` layers you created earlier to the train, validation, and test dataset.

In [None]:
binary_train_ds = raw_train_ds.map(binary_vectorize_text)
binary_val_ds = raw_val_ds.map(binary_vectorize_text)
binary_test_ds = raw_test_ds.map(binary_vectorize_text)

int_train_ds = raw_train_ds.map(int_vectorize_text)
int_val_ds = raw_val_ds.map(int_vectorize_text)
int_test_ds = raw_test_ds.map(int_vectorize_text)

**Dataset Performance** - here is the splice with taub and text

Configure the dataset for performance
These are two important methods you should use when loading data to make sure that I/O does not become blocking.

.cache() keeps data in memory after it's loaded off disk. This will ensure the dataset does not become a bottleneck while training your model. If your dataset is too large to fit into memory, you can also use this method to create a performant on-disk cache, which is more efficient to read than many small files.

.prefetch() overlaps data preprocessing and model execution while training.

You can learn more about both methods, as well as how to cache data to disk in the data performance guide.

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

def configure_dataset(dataset):
  return dataset.cache().prefetch(buffer_size=AUTOTUNE)

In [None]:
binary_train_ds = configure_dataset(binary_train_ds)
binary_val_ds = configure_dataset(binary_val_ds)
binary_test_ds = configure_dataset(binary_test_ds)

int_train_ds = configure_dataset(int_train_ds)
int_val_ds = configure_dataset(int_val_ds)
int_test_ds = configure_dataset(int_test_ds)

Train the model
It's time to create our neural network. For the binary vectorized data, train a simple bag-of-words linear model:

In [None]:
binary_model = tf.keras.Sequential([layers.Dense(4), layers.Dense(2)])
binary_model.compile(
    loss=losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer='adam',
    metrics=['accuracy'])
history = binary_model.fit(binary_train_ds, validation_data=binary_val_ds, epochs=8)

Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8


Next, you will use the int vectorized layer to build a 1D ConvNet.

In [None]:
def create_model(vocab_size, num_labels):
  model = tf.keras.Sequential([
      layers.Embedding(vocab_size, 64, mask_zero=True),
      layers.Conv1D(64, 5, padding="valid", activation="relu", strides=2),
      layers.GlobalMaxPooling1D(),
      layers.Dense(num_labels)
  ])
  return model


In [None]:
# vocab_size is VOCAB_SIZE + 1 since 0 is used additionally for padding.
int_model = create_model(vocab_size=VOCAB_SIZE + 1, num_labels=2)
int_model.compile(
    loss=losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer='adam',
    metrics=['accuracy'])
history = int_model.fit(int_train_ds, validation_data=int_val_ds, epochs=8)

Epoch 1/8
Epoch 2/8
Epoch 3/8
Epoch 4/8
Epoch 5/8
Epoch 6/8
Epoch 7/8
Epoch 8/8


Compare the two models:

In [None]:
print("Linear model on binary vectorized data:")
print(binary_model.summary())

Linear model on binary vectorized data:
Model: "sequential_33"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_39 (Dense)             (None, 4)                 25512     
_________________________________________________________________
dense_40 (Dense)             (None, 2)                 10        
Total params: 25,522
Trainable params: 25,522
Non-trainable params: 0
_________________________________________________________________
None


In [None]:
print("ConvNet model on int vectorized data:")
print(int_model.summary())

ConvNet model on int vectorized data:
Model: "sequential_34"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_9 (Embedding)      (None, None, 64)          640064    
_________________________________________________________________
conv1d_9 (Conv1D)            (None, None, 64)          20544     
_________________________________________________________________
global_max_pooling1d_9 (Glob (None, 64)                0         
_________________________________________________________________
dense_41 (Dense)             (None, 2)                 130       
Total params: 660,738
Trainable params: 660,738
Non-trainable params: 0
_________________________________________________________________
None


Evaluate both models on the test data:

In [None]:
binary_loss, binary_accuracy = binary_model.evaluate(binary_test_ds)
int_loss, int_accuracy = int_model.evaluate(int_test_ds)

print("Binary model accuracy: {:2.2%}".format(binary_accuracy))
print("Int model accuracy: {:2.2%}".format(int_accuracy))

Binary model accuracy: 98.49%
Int model accuracy: 98.28%


run 1: Results: B model = 98.28% and i model 97.85% 65% train 20% validation

run 2: Results: 96.34% | 98.71%

run 3: Results: 97.42% | 98.06%

run 4: Results: 96.99% | 97.63%

run 5: Results: 97.63% | 98.28%


Note: This example dataset represents a rather simple classification problem. More complex datasets and problems bring out subtle but significant differences in preprocessing strategies and model architectures. Be sure to try out different hyperparameters and epochs to compare various approaches.

**Export the model**
In the code above, you applied the TextVectorization layer to the dataset before feeding text to the model. If you want to make your model capable of processing raw strings (for example, to simplify deploying it), you can include the TextVectorization layer inside your model. To do so, you can create a new model using the weights you just trained.

In [None]:
export_model = tf.keras.Sequential(
    [int_vectorize_layer, int_model,
     layers.Activation('sigmoid')])

export_model.compile(
    loss=losses.SparseCategoricalCrossentropy(from_logits=False),
    optimizer='adam',
    metrics=['accuracy'])

# Test it with `raw_test_ds`, which yields raw strings
loss, accuracy = export_model.evaluate(raw_test_ds)
print("Accuracy: {:2.2%}".format(binary_accuracy))

Accuracy: 98.49%


Now your model can take raw strings as input and predict a score for each label using model.predict. Define a function to find the label with the maximum score:

In [None]:
def get_string_labels(predicted_scores_batch):
  predicted_int_labels = tf.argmax(predicted_scores_batch, axis=1)
  predicted_labels = tf.gather(raw_train_ds.class_names, predicted_int_labels)
  return predicted_labels

**Run inference on new data**

In [None]:
inputs = [""" 54 year old right handed female with recurrent seizures, 2 in February.  Three seizures per week.  Lost her insurance and was not able to go back to the Neurology Clinic.  Past history of stroke with left-sided weakness.
MEDICATIONS:  Topamax, Zocor, Celexa, Iron, Aggrenox, ASA, Valium
INTRODUCTION:  Digital video EEG was performed in lab using standard 10-20 system of electrode placement with 1 channel of EKG.   Hyperventilation and photic stimulation are performed.
DESCRIPTION OF THE RECORD:  In wakefulness, there is a 9-Hz alpha rhythm.  There is a small amount of subtle theta and a very subtle asymmetry in the left temporal region relative to the right temporal region.  Hyperventilation does not activate the record.  Features of drowsiness include anterior spread of the alpha rhythm.  Photic stimulation elicits a very subtle bilateral driving response at faster frequencies.
HR:    90 bpm
IMPRESSION:  Mildly abnormal for an adult of this age due to:
Very subtle underlying slowing in the left temporal region.
CLINICAL CORRELATION:  No epileptiform features were seen.  Epileptiform activity has not been identified for this individual since 2006.  In an adult of this age, the findings
described above are nonspecific and can be seen in the context of underlying cerebrovascular disease or with a history of lupus cerebritis or other CNS process.

 """]

predicted_scores = export_model.predict(inputs)
print(predicted_scores)
#predicted_labels = get_string_labels(predicted_scores)
# for input, label in zip(inputs, predicted_labels):
#   print("Question: ", input)
#   print("Predicted label: ", label.numpy())

[[0.8809764  0.31902003]]


# Rule-Based (non-ML) Approach

Looking through the reports, it seems as though it's usually stated quite clearly when the EEG is abnormal. Rather than attempting any machine learning, why don't we just look for that key word (or related words/phrases) in the text? This approach is implemented below.

In [None]:
# First initialise some counters
n = 0
n_correct = 0
n_failed_decode = 0

# Iterate over all batches, taking the text and labels batch-by-batch.
# N.B. take(-1) has the effect of pulling out all the batches, instead of a specific number, as explained in the docs here: https://www.tensorflow.org/api_docs/python/tf/data/Dataset#take
for text_batch, label_batch in full_ds.take(-1):

  # Iterate over the report examples in the batch:
  for ind,text in enumerate(text_batch):

    # Get rid of any pesky non-standard characters using the function we created previously.
    cleaned_text = clean_text(text,0)
    # Then convert it from a tensorflow Tensor to a python string so that we can 
    # use some standard python text analysis on it.
    cleaned_and_decoded_text = cleaned_text.numpy().decode("UTF-8")

    # Check if the word 'abnormal' is in the report, and label it accordingly.
    if 'abnormal' in cleaned_and_decoded_text.lower():
      predicted_label = 0
    else:
      predicted_label = 1
      
    # If we predicted correctly, add one to our count of correct predictions.
    if predicted_label==label_batch[ind]:
      n_correct = n_correct+1
    else:
      # Uncomment the lines below if you want to inspect the cases where we were wrong.
      print("--- Wrong example ---")
      print(text.numpy().decode("UTF-8"))
      print()
      print("---------------------")
      print(f"The above example was classified with label {predicted_label} but it's actual label is {label_batch[ind].numpy()}.")
      print("---------------------")
      pass

    # Add one to our count of the total number of examples examined.
    n = n+1

print(f"Accuracy = {100*n_correct/n} percent ({n_correct} correct predictions out of {n}).")

[1;30;43mStreaming output truncated to the last 5000 lines.[0m
CLINICAL HISTORY:  The patient is a 42-year-old woman who presents with episodes concerning for seizures.  Episodes are described as onset of headache on the left side of her head followed by loss of consciousness and left-sided shaking with post-event confusion.
MEDICATIONS:  Current medications: Cardizem, venlafaxine, Xanax, Keppra.
INTRODUCTION:  The recording was performed according to the standard 10/20 system with additional T1-T2 electrodes and a single EKG lead.  Hyperventilation and photic stimulation were performed.
DESCRIPTION OF THE RECORD:  The posterior dominant rhythm consists of low amplitude 10 Hz alpha activity that attenuates with eyes opening.  There is an anterior to posterior frequency amplitude gradient with faster frequencies at lower amplitudes anteriorly.  Diffuse excess beta activity is present.  During sleep prominent POSTS (positive occipital sharp transients of sleep)  symmetrical vertex wave