# ASL - Arterial Spin Labelling
- MRI-derived mean perfusion values for different brain regions

## Summary
### ADNI
* 4459 records, 138 labels
* All 4459 records are missing from the dataset (every label except ID is NaN)

### SHEFFIELD
* 104 records, 138 labels
* 6 rows contain only NaN values (apart from ID)
* After removing these rows, 98 records remain
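
The counts in this summary can be reproduced for either dataset with a small helper. This is a minimal sketch, not code from the notebook: `summarise_missing` is a hypothetical name, the toy frame stands in for the real CSV files, and it assumes `ID` is the only non-measurement column.

```python
import numpy as np
import pandas as pd


def summarise_missing(df, id_col="ID"):
    """Count records, labels, and rows whose measurements are all NaN."""
    # Ignore the identifier so a row counts as empty when only ID is present
    measurements = df.drop(columns=[id_col])
    all_nan = measurements.isnull().all(axis=1)
    return {
        "records": df.shape[0],
        "labels": df.shape[1],
        "all_nan_rows": int(all_nan.sum()),
        "remaining": int((~all_nan).sum()),
    }


# Toy frame standing in for the real data (values are illustrative only)
toy = pd.DataFrame({
    "ID": ["a", "b", "c"],
    "region_1": [1.0, np.nan, 3.0],
    "region_2": [2.0, np.nan, np.nan],
})
summary = summarise_missing(toy)
print(summary)
```

Row `b` is the only one counted as empty: row `c` has a missing value but still carries one measurement, so it is kept.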

In [10]:
import numpy as np
import pandas as pd

In [11]:
"""
ADNI_ASL
"""
df = pd.read_csv('datasets/adni_data/ADNI_ASL.csv')
print(df.shape)
df.isnull().sum().sort_values(ascending=True)

(4459, 138)


ID                                                           0
 Right PCu precuneus                                      4459
 Left PCgG posterior cingulate gyrus                      4459
 Right PCgG posterior cingulate gyrus                     4459
 Left OrIFG orbital part of the inferior frontal gyrus    4459
                                                          ... 
 Left AnG angular gyrus                                   4459
 Right Calc calcarine cortex                              4459
 Left Calc calcarine cortex                               4459
 Left AIns anterior insula                                4459
 Left TTG transverse temporal gyrus                       4459
Length: 138, dtype: int64
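
Every measurement column reports 4459 missing values out of 4459 rows, so the ADNI table carries no usable ASL data. One way to make that check explicit is to look at the fraction of missing entries per column. The sketch below uses a toy frame rather than the real file, and `missing_fraction` is a hypothetical helper that assumes `ID` is the only identifier column.

```python
import numpy as np
import pandas as pd


def missing_fraction(df, id_col="ID"):
    """Fraction of NaN entries per measurement column (0.0 to 1.0)."""
    return df.drop(columns=[id_col]).isnull().mean()


# Toy frame mimicking a table where only the ID column is populated
toy = pd.DataFrame({
    "ID": ["x", "y"],
    "region_1": [np.nan, np.nan],
    "region_2": [np.nan, np.nan],
})
fractions = missing_fraction(toy)

# True only when every measurement column is entirely NaN
dataset_is_empty = bool((fractions == 1.0).all())
print(fractions)
print("No usable measurements:", dataset_is_empty)
```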

In [12]:
"""
SHEF_ASL
"""
df = pd.read_csv('datasets/sheffield_data/SHEF_ASL.csv')
print(df.shape)
df.isnull().sum().sort_values(ascending=False)

(104, 138)


 Left LiG lingual gyrus                                      6
 Right OpIFG opercular part of the inferior frontal gyrus    6
 Right MTG middle temporal gyrus                             6
 Left MTG middle temporal gyrus                              6
 Right OCP occipital pole                                    6
                                                            ..
 Right ACgG anterior cingulate gyrus                         6
 Right Basal Forebrain                                       6
 Left Basal Forebrain                                        6
 Cerebellar Vermal Lobules VIIIX                             6
ID                                                           0
Length: 138, dtype: int64
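
In the output above, every label except `ID` appears to start with a space, which suggests the CSV header has a space after each comma. If so, selecting a column by name needs that leading space. The sketch below shows two ways to normalise the labels. It is an assumption about the file layout, shown on an in-memory toy CSV rather than the real `SHEF_ASL.csv`.

```python
import io

import pandas as pd

# Toy CSV with a space after each comma in the header (assumed layout)
csv_text = "ID, Left LiG lingual gyrus, Right OCP occipital pole\nSH_001,1.5,2.5\n"

# Default parsing keeps the leading space in the column labels
raw = pd.read_csv(io.StringIO(csv_text))
print(list(raw.columns))

# Option 1: strip whitespace from the labels after loading
stripped = raw.rename(columns=str.strip)

# Option 2: ask the parser to skip spaces that follow the delimiter
skipped = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

print(list(stripped.columns))
print(list(skipped.columns))
```

Stripping after loading only touches the labels, whereas `skipinitialspace=True` also affects leading spaces in the data fields.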

In [13]:
"""
There are 6 rows in this dataset that have NaN in every column except ID.
Data imputation cannot be applied to observations that are completely missing,
so these rows are dropped.

null_data is the df of rows that contain at least one NaN
clean_data is the df after removing the rows whose measurements are all NaN

OUTPUT
# Rows before cleanse: 104
Rows containing at least one NAN values: 6
Rows containing all NAN values: 6
Removing these rows...
# Rows after cleanse: 98
"""
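print("# Rows before cleanse: " + str(df.shape[0]))

# Boolean mask: True for rows with at least one NaN
null_rows = df.isnull().any(axis=1)
total_null = null_rows.sum()
print("Rows containing at least one NAN values: " + str(total_null))

# DF of all rows with at least one NaN
null_data = df[null_rows]

# Measurement columns (everything except the ID)
value_cols = [n for n in df if n != 'ID']

# Boolean mask: True for rows where every measurement column is NaN.
# The previous version counted the rows of null_data, which repeats the
# "at least one NaN" count rather than testing for fully empty rows.
all_null_rows = df[value_cols].isnull().all(axis=1)
print("Rows containing all NAN values: " + str(all_null_rows.sum()))

# print(null_data)
print("Removing these rows...")

# Drop only the rows whose measurements are all missing
clean_data = df.dropna(how='all', subset=value_cols)
print("# Rows after cleanse: " + str(clean_data.shape[0]))
print(clean_data)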


# Rows before cleanse: 104
Rows containing at least one NAN values: 6
Rows containing all NAN values: 6
Removing these rows...
# Rows after cleanse: 98
                 ID   Background   3rd Ventricle   4th Ventricle  \
0    SH_DARE_G1_001        6.138          43.285          89.937   
1    SH_DARE_G1_002        4.398          14.967          25.965   
2    SH_DARE_G1_003        5.003          27.397          26.568   
3    SH_DARE_G1_004        4.614          14.119          22.669   
4    SH_DARE_G1_005        5.211          15.907          47.526   
..              ...          ...             ...             ...   
99   SH_DARE_G3_024        4.709          20.046          23.003   
100  SH_DARE_G3_025        4.595          29.583          23.711   
101  SH_DARE_G3_026        5.324          25.033          34.181   
102  SH_DARE_G3_027        2.609          12.967          14.073   
103  SH_DARE_G3_028        5.962          27.285          48.554   

      Right Accumbens Area   Le