<a href="https://colab.research.google.com/github/Ekliipce/Machine-Learning-for-Biomedical/blob/main/eeg/EEG_and_alcohol_cnn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Electroencephalogram (EEG) and alcohol


## **EEG**
#### **What is EEG ?**
An electroencephalogram (EEG) is a test that records the brain's electrical activity noninvasively through electrodes placed on the scalp. The procedure involves placing these electrodes that are connected by wires to a computer, which then records and analyzes the electrical impulses in the brain. EEG is used for diagnosing and managing brain-related disorders like epilepsy, monitoring brain activity during surgeries, and conducting neuroscience research.

EEG patterns, consisting of different waves, are analyzed to understand normal or abnormal brain function. The procedure is safe, though preparation is required, and it might be slightly uncomfortable. EEG primarily detects activity in the brain's cortex with limited spatial resolution and can be affected by various factors like age and medication. Unlike MRI and CT scans that visualize brain structure, EEG captures real-time activity, making it a valuable tool in neuroscience and medicine.
<br><br>
#### **What does an EEG help diagnose?**

EEG is used primarily to diagnose conditions that affect brain activity. It’s particularly useful in identifying epilepsy and other seizure disorders by capturing the electrical activity of the brain. Besides, EEG can also help diagnose or manage other conditions like sleep disorders, depth of anesthesia, coma, encephalopathies, brain death, and certain psychiatric disorders. It is often used in conjunction with other diagnostic tools to provide comprehensive insights into brain health and function.
<br><br>

#### **What factors can influence the results of an EEG?**

Various factors can influence EEG results. Medications (such as sedatives, anti-epileptic drugs) can alter electrical activity in the brain, affecting the test's findings. The patient's age and overall brain development can also play a role in the results. The physical and mental state of the patient during the test, like being stressed, relaxed, asleep, or awake, can also influence the brain's electrical activity. External interference from electronic devices and not following preparatory instructions (like washing hair to ensure good electrode contact) can also impact the data quality and test outcomes.
<br><br>
#### **How reliable is EEG in diagnosing various brain disorders?**

EEG is a reliable tool for diagnosing disorders related to abnormal brain activity, like epilepsy. However, its reliability can be influenced by the technician's skill, the patient's cooperation, and the above-mentioned factors that might affect the results. While EEG provides valuable real-time data on brain function, it might not catch intermittent or infrequent abnormalities in brain activity if they don't occur during the test. Therefore, it's often used alongside other diagnostic methods, like MRI or CT scans, to provide a more complete picture of brain health and accurate diagnosis.


## **Brain and Alcohol**
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668890/ searched deeper for Alcoholism and Human Electrophysiology.

Interestingly, the article suggests that the observed electrical abnormalities in the brains of alcoholics might not be a result of alcohol consumption per se, but rather a pre-existing condition, possibly serving as a risk marker for alcoholism. Some of these electrical characteristics, such as increased resting beta power and decreased active theta oscillations during cognitive tasks, have also been identified in individuals at high risk for developing alcoholism, even before any exposure to alcohol. Therefore, the text proposes that an inherent imbalance in CNS excitation and inhibition might predispose individuals to alcoholism. This imbalance is suggested to not only contribute to the risk of developing alcoholism but might also offer insights into the neurobiology of craving and relapse in alcoholism

## Dataset

In [11]:
%%shell
wget https://archive.ics.uci.edu/static/public/121/eeg+database.zip
unzip -q eeg+database.zip
gunzip -k eeg_full/*.gz
for file in eeg_full/*.tar; do tar -xf $file -C eeg_full; done
gunzip -k eeg_full/*/*.gz
rm eeg_full/*.tar.gz eeg_full/*.tar eeg_full/*/*.gz
mkdir train
mkdir test

--2023-10-14 14:25:50--  https://archive.ics.uci.edu/static/public/121/eeg+database.zip
Resolving archive.ics.uci.edu (archive.ics.uci.edu)... 128.195.10.252
Connecting to archive.ics.uci.edu (archive.ics.uci.edu)|128.195.10.252|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified
Saving to: ‘eeg+database.zip.1’

eeg+database.zip.1      [           <=>      ] 762.44M   101MB/s    in 9.0s    

2023-10-14 14:26:00 (84.4 MB/s) - ‘eeg+database.zip.1’ saved [799481741]

replace SMNI_CMI_TEST.tar.gz? [y]es, [n]o, [A]ll, [N]one, [r]ename: gzip: eeg_full/*.gz: No such file or directory
tar: eeg_full/*.tar: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
gzip: eeg_full/*/*.gz: No such file or directory
rm: cannot remove 'eeg_full/*.tar.gz': No such file or directory
rm: cannot remove 'eeg_full/*.tar': No such file or directory
rm: cannot remove 'eeg_full/*/*.gz': No such file or directory
mkdir: cannot create directory ‘train’: 

CalledProcessError: ignored

In [None]:
! echo 'file_name' > eeg_full.csv
! find eeg_full -type f -exec bash -c '[[ $(wc -l < "$1") -gt 4 ]]' _ {} \; -print >> eeg_full.csv

In [None]:
!pip install -q -U mne

In [14]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import re
import mne
import os

from tqdm import tqdm
import torch
from torch.utils.data import Dataset, DataLoader


In [4]:
def extract_raw_data(file):
  with open(file) as f:
    lines = f.readlines()

    data = []
    channel_names = []
    is_Alcolic = lines[0][5] == 'a'
    id_patient = int(lines[0][6:13])


    if(len(lines)<=3):
      return None,None,None

    l3 = lines[3].split()
    obj = l3[1]
    trial = int(lines[4].split()[-1])


    for line in lines[4:]:
      line_split = line.split()

      if (not line.startswith('#')):
        values = line_split[-1]
        data.append(values)
      else :
        ch_name = line_split[1]
        channel_names.append(ch_name)

    data = np.array(data, dtype="float").reshape((64, -1)) * 1e-6
    info = {"his_id" : id_patient,  "is_Alcolic" : is_Alcolic,'id': id_patient,
            "trial": trial, "obj": obj}
    return data, channel_names, info


In [6]:
! cat /content/eeg_full/co2c1000367/co2c1000367.rd.089

# co2c1000367.rd
# 120 trials, 64 chans, 416 samples 368 post_stim samples
# 3.906000 msecs uV


In [8]:
def save_raw_files(file, save=False, train=True):
  file_name = file.split("/")[-1].replace(".", "_") + "_eeg"

  data, ch_names, info_patient = extract_raw_data(file)
  if(data is None):
    return None
  info = mne.create_info(ch_names=ch_names, sfreq=256, ch_types='eeg')

  raw = mne.io.RawArray(data=data, info=info, verbose=False)
  raw.info['subject_info'] = info_patient


  if (save):
    dir = "train" if train else "test"
    raw.save(f"{dir}/{file_name}.fif", overwrite=True, verbose=False)
  return raw



In [9]:
for dir_name, subdirs, files in tqdm(list(os.walk('/content/'))):
    for file_name in files[1:]:
      if ((".rd.") in file_name):
        current_file = os.path.join(dir_name, file_name)
        #train = "TRAIN" in current_file
        save_raw_files(current_file, save=True)


100%|██████████| 131/131 [05:25<00:00,  2.48s/it]


In [17]:
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(DEVICE)

cuda:0


In [43]:
class EEGDataset(Dataset):
  """
    In this version, we want to add the channel object like
      EEG : (64, 256)
      Channel object : (1, 256) 0 or 1 depending of S1 or S2
      Final value : (65, 256)
  """


  def __init__(self, eeg_dir, eeg_files):
    self.eeg_dir = eeg_dir
    self.eeg_files = pd.read_csv(eeg_files)
    self.s_objects = ['S1','S2']
    self.s_object_table = {s_object: i for i,s_object in enumerate(self.s_objects)}
    self.num_objects = len(self.s_objects )

    self.nbr_values = 256
    self.channel = 64


  def __getitem__(self, idx):
    file_name = self.eeg_files['file_name'][idx]
    data, _, info = extract_raw_data(file_name)

    if(data is None or info is None):
      return None

    if ([info['obj']] == self.s_objects[0]):
      object_vector = torch.zeros(shape=(1, self.nbr_values)).to(DEVICE)
    else:
      object_vector = torch.ones((1, self.nbr_values)).to(DEVICE)

    tensor_data = torch.tensor(data).to(DEVICE)
    tensor_data = torch.cat((tensor_data, object_vector), 0)

    alcoholic = torch.zeros(2, dtype=float).to(DEVICE)
    alcoholic[int(info['is_Alcolic'])] = 1


    return tensor_data, alcoholic

  def __len__(self):
    return len(self.eeg_files)


In [44]:
eeg_ds = EEGDataset('eeg_full','eeg_full.csv')
train_size = int(0.8 * len(eeg_ds))
test_size = len(eeg_ds) - train_size
train_ds,test_ds  = torch.utils.data.random_split(eeg_ds,[train_size,test_size])

In [45]:
train_dataloader = DataLoader(train_ds, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_ds, batch_size=64, shuffle=True)


In [63]:
from torch import nn
import torch.nn.functional as F

class EEG_NN(nn.Module):
    def __init__(self):
      super(EEG_NN,self).__init__()

      self.size_out = [65, 32, 16, 8]
      self.conv1 = nn.Conv2d(self.size_out[0], self.size_out[1], 3,dtype=torch.double)
      self.conv2 = nn.Conv2d(self.size_out[1], self.size_out[2], 3,dtype=torch.double)
      self.conv3 = nn.Conv2d(self.size_out[2], self.size_out[3], 3,dtype=torch.double)

      self.relu = nn.ReLU()
      self.maxpool = nn.MaxPool2d((2,2))



      self.flatten = nn.Flatten(1)
      self.lin1 = nn.Linear(256,64,dtype=torch.double)
      self.lin2 = nn.Linear(64,2,dtype=torch.double)
      self.check_cuda()

    def check_cuda(self):
      if(torch.cuda.is_available()):
        print('CUDA seems to be available')
        self.to(DEVICE)

    def normalization(output, output_size, dropout=0.2):
        output = nn.Dropout(dropout)(output)
        output = nn.BatchNorm2d(output_size)(output)
        return output

    def forward(self, x1):
        x1 = self.conv1(x1.unsqueeze(1))
        x1 = normalization(x1, output_size=self.size_out[0])
        x1 = self.relu(x1)
        x1 = self.maxpool(x1)

        x1 = self.conv2(x1)
        x1 = normalization(x1, output_size=self.size_out[1])
        x1 = self.relu(x1)
        x1 = self.maxpool(x1)

        x1 = self.conv3(x1)
        x1 = normalization(x1, output_size=self.size_out[2])
        x1 = self.relu(x1)
        x1 = self.maxpool(x1)

        x1 = self.flatten(x1)
        x1 = self.lin1(x1)
        x1 = self.lin2(x1)
        x1 = nn.Sigmoid()(x1)
        return x1

u = None
for x,y in train_dataloader:
  u = eeg_model(x)
  print(compute_num_correct_pred(u,y))

  break
u.size(), u.is_cuda

RuntimeError: ignored

In [53]:
def train_model(model,train_loader,optimizer,loss_func):
    model.train()

    for x,y in tqdm(train_loader):
        out = model(x)
        loss = loss_func(out,y)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()




In [54]:


def compute_num_correct_pred(y_prob:torch.tensor, y_label:torch.tensor):
  y_pred = (y_prob >= 0.5).float()

  correct_predictions = torch.all(y_pred == y_label,dim=1).sum()
  return int(correct_predictions)



In [55]:

def test(loader,net,verbose=False):
    net.eval()
    correct = 0
    with torch.no_grad():
        for x,y in tqdm(loader):
            out = net(x)
            correct += compute_num_correct_pred(out, y)
    if(verbose):
      print(f'{correct} prediction on {len(loader.dataset)} samples')
    return correct / len(loader.dataset)

In [58]:


def full_train(model,train_loader,test_loader,optimizer,loss_func,n_epochs,verbose=False):
  train_accs = []
  test_accs = []
  for i in range(n_epochs):
    train_model(model,train_loader,optimizer,loss_func)
    train_acc = test(train_loader,model)
    if(verbose):
      print(f'Train accuracy {train_acc:.2f}')

    test_acc =test(test_loader,model)
    if(verbose):
      print(f'Test accuracy {test_acc:.2f}')

    train_accs.append(train_acc)
    test_accs.append(test_acc)
  history_df = pd.DataFrame({'train_accuracy':train_accs,'test_accuracy':test_accs})
  history_df.to_csv(('history.csv'))
  return history_df


eeg_model = EEG_NN()
learning_rate = 0.001
optimizer = torch.optim.Adam(eeg_model.parameters(), lr=learning_rate)
loss_func = nn.BCELoss()
full_train(eeg_model,train_loader=train_dataloader,test_loader=test_dataloader,optimizer=optimizer,loss_func=loss_func,n_epochs=10)

CUDA seems to be available


  0%|          | 0/139 [00:00<?, ?it/s]


RuntimeError: ignored