URL index: [Home](https://zzz.bwh.harvard.edu/luna-walkthrough/) | [Data](https://zzz.bwh.harvard.edu/luna-walkthrough/data/) | [S1. File QC](https://zzz.bwh.harvard.edu/luna-walkthrough/p1/) | [S2. Signal QC](https://zzz.bwh.harvard.edu/luna-walkthrough/p2) | [S3. Staging](https://zzz.bwh.harvard.edu/luna-walkthrough/p3) | [S4. Artifacts](https://zzz.bwh.harvard.edu/luna-walkthrough/p4) | [S5. Analysis](https://zzz.bwh.harvard.edu/luna-walkthrough/p5)

Notebook index: [Index](../00_index.ipynb) | [S1. File QC](../p1/00_index.ipynb) | [S2. Signal QC](../p2/00_index.ipynb) | [S3. Staging](../p3/00_index.ipynb) | [S4. Artifacts](../p4/00_index.ipynb) | [S5. Analysis](../p5/00_index.ipynb)

---

# 1.6. Reviewing and harmonizing channel labels

Walkthrough URL = [https://zzz.bwh.harvard.edu/luna-walkthrough/p1/chs/](https://zzz.bwh.harvard.edu/luna-walkthrough/p1/chs/)

In [1]:
import lunapi as lp
proj = lp.proj()
proj.sample_list( 's1.lst' )

initiated lunapi v0.1.1 <lunapi.lunapi0.luna object at 0x14a1766f0> 

read 20 individuals from s1.lst


We will re-run `HEADERS` to get a list of all channel information.

In [2]:
res = proj.silent_proc( 'HEADERS' ) 

## Labels

Display a list of channels, along with the sample rate (`SR`) and physical unit (`PDIM`) of each.

In [3]:
# destrat out.db +HEADERS -r CH -v SR PDIM
#import pandas as pd
#pd.set_option('display.max_rows', 20 )
tbl = proj.table( 'HEADERS', 'CH' )[[ 'CH', 'SR', 'PDIM' ]]
tbl

Unnamed: 0,CH,SR,PDIM
0,A1,128.0,uV
1,A1,150.0,uV
2,A1,128.0,mV
3,A1,128.0,uV
4,A1,128.0,uV
...,...,...,...
1172,pz,128.0,uV
1173,t7,128.0,uV
1174,t8,128.0,uV
1175,tp7,128.0,uV


To get a list of all unique channel labels:

In [4]:
tbl['CH'].unique()

array(['A1', 'A1_REF', 'A2', 'A2_REF', 'AF3', 'AF3_REF', 'AF3__M1_M2__2',
       'AF4', 'AF4_REF', 'AF4__M1_M2__2', 'AFZ', 'AFZ_REF',
       'AFZ__M1_M2__2', 'C1', 'C1_REF', 'C1__M1_M2__2', 'C2', 'C2_REF',
       'C2__M1_M2__2', 'C3', 'C3_REF', 'C3__M1_M2__2', 'C4', 'C4_REF',
       'C4__M1_M2__2', 'C5', 'C5_REF', 'C5__M1_M2__2', 'C6', 'C6_REF',
       'C6__M1_M2__2', 'CP1', 'CP1_REF', 'CP1__M1_M2__2', 'CP2',
       'CP2_REF', 'CP2__M1_M2__2', 'CP3', 'CP3_REF', 'CP3__M1_M2__2',
       'CP4', 'CP4_REF', 'CP4__M1_M2__2', 'CP5', 'CP5_REF',
       'CP5__M1_M2__2', 'CP6', 'CP6_REF', 'CP6__M1_M2__2', 'CPZ',
       'CPZ_REF', 'CPZ__M1_M2__2', 'CZ', 'CZ_REF', 'CZ__M1_M2__2',
       'EEG_A1', 'EEG_A2', 'EEG_AF3', 'EEG_AF4', 'EEG_AFZ', 'EEG_C1',
       'EEG_C2', 'EEG_C3', 'EEG_C4', 'EEG_C5', 'EEG_C6', 'EEG_CP1',
       'EEG_CP2', 'EEG_CP3', 'EEG_CP4', 'EEG_CP5', 'EEG_CP6', 'EEG_CPZ',
       'EEG_CZ', 'EEG_F1', 'EEG_F2', 'EEG_F3', 'EEG_F4', 'EEG_F5',
       'EEG_F6', 'EEG_F7', 'EEG_F8', 'EEG_FC1'

To count how often each label occurs and pretty-print the result:

In [5]:
tbl.groupby(['CH']).size().reset_index(name='Count')

Unnamed: 0,CH,Count
0,A1,16
1,A1_REF,1
2,A2,16
3,A2_REF,1
4,AF3,16
...,...,...
291,pz,1
292,t7,1
293,t8,1
294,tp7,1


One Python/pandas way to print the full table:

In [6]:
import pandas as pd
with pd.option_context('display.max_rows', None):
    display( tbl.groupby(['CH']).size().reset_index(name='Count') )

Unnamed: 0,CH,Count
0,A1,16
1,A1_REF,1
2,A2,16
3,A2_REF,1
4,AF3,16
5,AF3_REF,1
6,AF3__M1_M2__2,1
7,AF4,16
8,AF4_REF,1
9,AF4__M1_M2__2,1
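
Labels that occur only once are natural candidates for aliasing. As a minimal sketch (the toy `CH` values below are illustrative stand-ins, not the real `HEADERS` table), the same `groupby` count can be filtered to list them:

```python
import pandas as pd

# Toy stand-in for the CH column of the HEADERS table (illustrative values only)
toy = pd.DataFrame({"CH": ["A1"] * 3 + ["A1_REF"] + ["AF3"] * 3 + ["AF3__M1_M2__2"]})

# Same counting idiom as above, then keep labels seen exactly once
counts = toy.groupby("CH").size().reset_index(name="Count")
rare = counts.loc[counts["Count"] == 1, "CH"].tolist()
print(rare)
```

With the real table, replacing `toy` by `tbl` should give the list of one-off labels to review.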


## Sanitized versus original labels

To turn off Luna's default _sanitization_ of channel labels (and keep spaces in labels):

In [7]:
proj.var( 'sanitize' , 'F' )
proj.var( 'keep-spaces' , 'T' )

If we re-run `HEADERS`, labels will now be printed _as is_.

In [8]:
res = proj.silent_proc( 'HEADERS signals' ) 

Get a list of all unique labels:

In [9]:
tbl = proj.table( 'HEADERS', 'CH' )[[ 'CH', 'SR', 'PDIM' ]]
tbl['CH'].unique()

array(['A1', 'A1 REF', 'A2', 'A2 REF', 'AF3', 'AF3 REF', 'AF3-(M1+M2)/2',
       'AF4', 'AF4 REF', 'AF4-(M1+M2)/2', 'AFZ', 'AFZ REF',
       'AFZ-(M1+M2)/2', 'C1', 'C1 REF', 'C1-(M1+M2)/2', 'C2', 'C2 REF',
       'C2-(M1+M2)/2', 'C3', 'C3 REF', 'C3-(M1+M2)/2', 'C4', 'C4 REF',
       'C4-(M1+M2)/2', 'C5', 'C5 REF', 'C5-(M1+M2)/2', 'C6', 'C6 REF',
       'C6-(M1+M2)/2', 'CP1', 'CP1 REF', 'CP1-(M1+M2)/2', 'CP2',
       'CP2 REF', 'CP2-(M1+M2)/2', 'CP3', 'CP3 REF', 'CP3-(M1+M2)/2',
       'CP4', 'CP4 REF', 'CP4-(M1+M2)/2', 'CP5', 'CP5 REF',
       'CP5-(M1+M2)/2', 'CP6', 'CP6 REF', 'CP6-(M1+M2)/2', 'CPZ',
       'CPZ REF', 'CPZ-(M1+M2)/2', 'CZ', 'CZ REF', 'CZ-(M1+M2)/2',
       'EEG-A1', 'EEG-A2', 'EEG-AF3', 'EEG-AF4', 'EEG-AFZ', 'EEG-C1',
       'EEG-C2', 'EEG-C3', 'EEG-C4', 'EEG-C5', 'EEG-C6', 'EEG-CP1',
       'EEG-CP2', 'EEG-CP3', 'EEG-CP4', 'EEG-CP5', 'EEG-CP6', 'EEG-CPZ',
       'EEG-CZ', 'EEG-F1', 'EEG-F2', 'EEG-F3', 'EEG-F4', 'EEG-F5',
       'EEG-F6', 'EEG-F7', 'EEG-F8', 'EEG-FC1'

We can extract the full combinations of signals _per individual_ from the `SIGNALS` variable.

In [10]:
tbl = proj.table( 'HEADERS' )[[ 'ID', 'SIGNALS' ]]
tbl

Unnamed: 0,ID,SIGNALS
0,F01,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."
1,F02,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."
2,F03,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."
3,F04,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."
4,F05,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."
5,F06,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."
6,F07,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."
7,F08,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."
8,F09,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."
9,F10,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC..."


As above, we can change pandas display defaults (temporarily) to list the full labels.

In [11]:
with pd.option_context('display.max_colwidth', None):
    display(tbl)

Unnamed: 0,ID,SIGNALS
0,F01,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"
1,F02,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"
2,F03,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"
3,F04,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"
4,F05,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"
5,F06,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"
6,F07,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"
7,F08,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"
8,F09,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"
9,F10,"Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C3,C1,C2,C4,C6,T8,TP7,CP5,CP3,CP1,CP2,CP4,CP6,TP8,P7,P5,P3,P1,P2,P4,P6,P8,PO3,PO4,O1,O2,AFZ,FZ,FCZ,CZ,CPZ,PZ,POz,OZ,A1,A2,FPZ"


We can count the number of people with each _unique configuration_.

In [12]:
#destrat out.db +HEADERS -v SIGNALS | awk ' NR != 1 ' | cut -f2- | sort | uniq -c | sort -nr
with pd.option_context('display.max_colwidth', 80):
    print(tbl.groupby('SIGNALS').count())

                                                                                  ID
SIGNALS                                                                             
EEG-Fp1,EEG-Fp2,EEG-AF3,EEG-AF4,EEG-F7,EEG-F5,EEG-F3,EEG-F1,EEG-F2,EEG-F4,EEG...   1
FP1,FP2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5...   1
Fp1 REF,Fp2 REF,AF3 REF,AF4 REF,F7 REF,F5 REF,F3 REF,F1 REF,F2 REF,F4 REF,F6 ...   1
Fp1,Fp2,AF3,AF4,F7,F5,F1,F2,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5,C1,C2...   1
Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5...   1
Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,FC4,FC6,FT8,T7,C5...  13
Fp1-(M1+M2)/2,Fp2-(M1+M2)/2,AF3-(M1+M2)/2,AF4-(M1+M2)/2,F7-(M1+M2)/2,F5-(M1+M...   1
fp1,fp2,af3,af4,f7,f5,f3,f1,f2,f4,f6,f8,ft7,fc5,fc3,fc1,fc2,fc4,fc6,ft8,t7,c5...   1


Above we see that 13 people have the same montage (and the same channel ordering in the EDF), whereas 20 - 13 = 7 people have something different. Note that the above enumeration treats differently ordered lists as distinct. If we force a consistent order (set `order-signals` to `T`), then we see that 14 people have the same ("standard") montage/labels, as seen in the main walkthrough.

In [13]:
proj.var( 'order-signals','T' )
res = proj.silent_proc( 'HEADERS signals' ) 
tbl = proj.table( 'HEADERS' )[[ 'ID', 'SIGNALS' ]]
with pd.option_context('display.max_colwidth', 80):
    print(tbl.groupby('SIGNALS').count())

                                                                                  ID
SIGNALS                                                                             
A1 REF,A2 REF,AF3 REF,AF4 REF,AFZ REF,C1 REF,C2 REF,C3 REF,C4 REF,C5 REF,C6 R...   1
A1,A2,AF3,AF4,AFZ,C1,C2,C3,C4,C5,C6,CP1,CP2,CP3,CP4,CP5,CP6,CPZ,CZ,F1,F2,F3,F...   1
A1,A2,AF3,AF4,AFZ,C1,C2,C3,C4,C5,C6,CP1,CP2,CP3,CP4,CP5,CP6,CPZ,CZ,F1,F2,F3,F...  14
A1,A2,AF3,AF4,AFZ,C1,C2,C3,C4,C5,C6,CP1,CP2,CP3,CP4,CP5,CP6,CPZ,F1,F2,F3,F4,F...   1
AF3-(M1+M2)/2,AF4-(M1+M2)/2,AFZ-(M1+M2)/2,C1-(M1+M2)/2,C2-(M1+M2)/2,C3-(M1+M2...   1
EEG-A1,EEG-A2,EEG-AF3,EEG-AF4,EEG-AFZ,EEG-C1,EEG-C2,EEG-C3,EEG-C4,EEG-C5,EEG-...   1
a1,a2,af3,af4,afz,c1,c2,c3,c4,c5,c6,cp1,cp2,cp3,cp4,cp5,cp6,cpz,cz,f1,f2,f3,f...   1
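
To see why ordering changes the counts, here is a minimal sketch in plain Python. The IDs and shortened `SIGNALS` strings are hypothetical, and `montage_key` is an illustrative helper, not part of lunapi. Sorting the labels before comparing merges montages that differ only in channel order, while differences in case or naming remain distinct:

```python
from collections import Counter

def montage_key(signals: str, ordered: bool = False) -> str:
    """Return a comparison key for a comma-delimited SIGNALS string."""
    labels = signals.split(",")
    return ",".join(sorted(labels)) if ordered else ",".join(labels)

# Hypothetical, shortened SIGNALS strings for four recordings
signals_by_id = {
    "X01": "Fp1,Fp2,C3,C4",
    "X02": "Fp1,Fp2,C3,C4",
    "X03": "C3,C4,Fp1,Fp2",  # same labels, different order
    "X04": "fp1,fp2,c3,c4",  # different case, so still distinct
}

as_is = Counter(montage_key(s) for s in signals_by_id.values())
sorted_first = Counter(montage_key(s, ordered=True) for s in signals_by_id.values())

# Number of distinct montages and size of the largest group, before and after sorting
print(len(as_is), max(as_is.values()))
print(len(sorted_first), max(sorted_first.values()))
```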


### Applying channel aliases

We'll focus on one individual (`M09`) to illustrate how aliases/sanitization are applied in the lunapi context.

First, because we turned off sanitization in a cell above, we expect that attaching this individual will print the original labels (i.e. those in the actual EDF).

In [14]:
#M09
p = proj.inst( 'M09' )

___________________________________________________________________
Processing: M09 | ../work/data/edfs//M09.edf
 duration 08.08.32, 29312s | time 22.00.00 - 06.08.32 | date 01.01.85

 signals: 57 (of 57) selected in a standard EDF file
  Fp1-(M1+M2)/2 | Fp2-(M1+M2)/2 | AF3-(M1+M2)/2 | AF4-(M1+M2)/2 | F7-(M1+M2)/2 | F5-(M1+M2)/2 | F3-(M1+M2)/2 | F1-(M1+M2)/2
  F2-(M1+M2)/2 | F4-(M1+M2)/2 | F6-(M1+M2)/2 | F8-(M1+M2)/2 | FT7-(M1+M2)/2 | FC5-(M1+M2)/2 | FC3-(M1+M2)/2 | FC1-(M1+M2)/2
  FC2-(M1+M2)/2 | FC4-(M1+M2)/2 | FC6-(M1+M2)/2 | FT8-(M1+M2)/2 | T7-(M1+M2)/2 | C5-(M1+M2)/2 | C3-(M1+M2)/2 | C1-(M1+M2)/2
  C2-(M1+M2)/2 | C4-(M1+M2)/2 | C6-(M1+M2)/2 | T8-(M1+M2)/2 | TP7-(M1+M2)/2 | CP5-(M1+M2)/2 | CP3-(M1+M2)/2 | CP1-(M1+M2)/2
  CP2-(M1+M2)/2 | CP4-(M1+M2)/2 | CP6-(M1+M2)/2 | TP8-(M1+M2)/2 | P7-(M1+M2)/2 | P5-(M1+M2)/2 | P3-(M1+M2)/2 | P1-(M1+M2)/2
  P2-(M1+M2)/2 | P4-(M1+M2)/2 | P6-(M1+M2)/2 | P8-(M1+M2)/2 | PO3-(M1+M2)/2 | PO4-(M1+M2)/2 | O1-(M1+M2)/2 | O2-(M1+M2)/2
  AFZ-(M1+M2)/2 | FZ-

If we turn _sanitization_ back on (set to `T`), and repeat the same command, we see the special characters have been replaced.  See the walkthrough for a description of the logic of this.

In [15]:
proj.var( 'sanitize' , 'T' )
p = proj.inst( 'M09' )

___________________________________________________________________
Processing: M09 | ../work/data/edfs//M09.edf
 duration 08.08.32, 29312s | time 22.00.00 - 06.08.32 | date 01.01.85

 signals: 57 (of 57) selected in a standard EDF file
  Fp1__M1_M2__2 | Fp2__M1_M2__2 | AF3__M1_M2__2 | AF4__M1_M2__2 | F7__M1_M2__2 | F5__M1_M2__2 | F3__M1_M2__2 | F1__M1_M2__2
  F2__M1_M2__2 | F4__M1_M2__2 | F6__M1_M2__2 | F8__M1_M2__2 | FT7__M1_M2__2 | FC5__M1_M2__2 | FC3__M1_M2__2 | FC1__M1_M2__2
  FC2__M1_M2__2 | FC4__M1_M2__2 | FC6__M1_M2__2 | FT8__M1_M2__2 | T7__M1_M2__2 | C5__M1_M2__2 | C3__M1_M2__2 | C1__M1_M2__2
  C2__M1_M2__2 | C4__M1_M2__2 | C6__M1_M2__2 | T8__M1_M2__2 | TP7__M1_M2__2 | CP5__M1_M2__2 | CP3__M1_M2__2 | CP1__M1_M2__2
  CP2__M1_M2__2 | CP4__M1_M2__2 | CP6__M1_M2__2 | TP8__M1_M2__2 | P7__M1_M2__2 | P5__M1_M2__2 | P3__M1_M2__2 | P1__M1_M2__2
  P2__M1_M2__2 | P4__M1_M2__2 | P6__M1_M2__2 | P8__M1_M2__2 | PO3__M1_M2__2 | PO4__M1_M2__2 | O1__M1_M2__2 | O2__M1_M2__2
  AFZ__M1_M2__2 | FZ_
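
The replacement pattern can be approximated in a few lines of Python. This is a sketch inferred from the labels printed in this notebook, not Luna's actual implementation, and `sanitize_label` is a hypothetical helper: every character that is not a letter or digit becomes an underscore, and runs of underscores are not collapsed.

```python
import re

def sanitize_label(label: str) -> str:
    """Approximate the sanitization seen above: each non-alphanumeric
    character is replaced by an underscore (runs are not collapsed)."""
    return re.sub(r"[^A-Za-z0-9]", "_", label)

# Original labels taken from the outputs shown in this notebook
originals = ["AF3-(M1+M2)/2", "A1 REF", "EEG-A1", "C3"]
for original in originals:
    print(f"{original!r:>18} -> {sanitize_label(original)!r}")
```

Letter case is left untouched here, which matches the lower-case labels (e.g. `pz`) that survive in the sanitized table earlier.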

Finally, we can specify an `@include` file that lists all aliases. We have the file `cmaps` with these mappings.

In [16]:
%%sh
head ../work/data/aux/cmaps 

alias	Fp1|Fp1-(M1+M2)/2|EEG-Fp1|"Fp1 REF"
alias	Fp2|Fp2-(M1+M2)/2|EEG-Fp2|"Fp2 REF"
alias	AF3|AF3-(M1+M2)/2|EEG-AF3|"AF3 REF"
alias	AF4|AF4-(M1+M2)/2|EEG-AF4|"AF4 REF"
alias	F7|F7-(M1+M2)/2|EEG-F7|"F7 REF"
alias	F5|F5-(M1+M2)/2|EEG-F5|"F5 REF"
alias	F3|F3-(M1+M2)/2|EEG-F3|"F3 REF"
alias	F1|F1-(M1+M2)/2|EEG-F1|"F1 REF"
alias	F2|F2-(M1+M2)/2|EEG-F2|"F2 REF"
alias	F4|F4-(M1+M2)/2|EEG-F4|"F4 REF"


We can `@include` them:  this is similar to running 

```
luna s1.lst @../work/data/aux/cmaps ...
```
(except that these special variables _remain set_ in the interactive lunapi environment unless explicitly cleared, i.e. they will continue to apply to any subsequent `inst()` or `attach_edf()` calls that "attach" new signal data).

In [17]:
proj.include( '../work/data/aux/cmaps' )

  setting alias = Fp1|Fp1-(M1+M2)/2|EEG-Fp1|"Fp1 REF"
  setting alias = Fp2|Fp2-(M1+M2)/2|EEG-Fp2|"Fp2 REF"
  setting alias = AF3|AF3-(M1+M2)/2|EEG-AF3|"AF3 REF"
  setting alias = AF4|AF4-(M1+M2)/2|EEG-AF4|"AF4 REF"
  setting alias = F7|F7-(M1+M2)/2|EEG-F7|"F7 REF"
  setting alias = F5|F5-(M1+M2)/2|EEG-F5|"F5 REF"
  setting alias = F3|F3-(M1+M2)/2|EEG-F3|"F3 REF"
  setting alias = F1|F1-(M1+M2)/2|EEG-F1|"F1 REF"
  setting alias = F2|F2-(M1+M2)/2|EEG-F2|"F2 REF"
  setting alias = F4|F4-(M1+M2)/2|EEG-F4|"F4 REF"
  setting alias = F6|F6-(M1+M2)/2|EEG-F6|"F6 REF"
  setting alias = F8|F8-(M1+M2)/2|EEG-F8|"F8 REF"
  setting alias = FT7|FT7-(M1+M2)/2|EEG-FT7|"FT7 REF"
  setting alias = FC5|FC5-(M1+M2)/2|EEG-FC5|"FC5 REF"
  setting alias = FC3|FC3-(M1+M2)/2|EEG-FC3|"FC3 REF"
  setting alias = FC1|FC1-(M1+M2)/2|EEG-FC1|"FC1 REF"
  setting alias = FC2|FC2-(M1+M2)/2|EEG-FC2|"FC2 REF"
  setting alias = FC4|FC4-(M1+M2)/2|EEG-FC4|"FC4 REF"
  setting alias = FC6|FC6-(M1+M2)/2|EEG-FC6|"FC6 REF"
  ...
  setting alias = FPZ|FPZ-(M1+M2)/2|EEG-FPZ|"FPZ REF"
  setting alias = A1|EEG-A1|"A1 REF"
  setting alias = A1|M1|EEG-M1|"M1 REF"
  setting alias = A2|EEG-A2|"A2 REF"
  setting alias = A2|M2|EEG-M2|"M2 REF"


If we now attach `M09` again, we should see the _harmonized_ labels.

In [18]:
p = proj.inst( 'M09' )

___________________________________________________________________
Processing: M09 | ../work/data/edfs//M09.edf
 duration 08.08.32, 29312s | time 22.00.00 - 06.08.32 | date 01.01.85

 signals: 57 (of 57) selected in a standard EDF file
  Fp1 | Fp2 | AF3 | AF4 | F7 | F5 | F3 | F1
  F2 | F4 | F6 | F8 | FT7 | FC5 | FC3 | FC1
  FC2 | FC4 | FC6 | FT8 | T7 | C5 | C3 | C1
  C2 | C4 | C6 | T8 | TP7 | CP5 | CP3 | CP1
  CP2 | CP4 | CP6 | TP8 | P7 | P5 | P3 | P1
  P2 | P4 | P6 | P8 | PO3 | PO4 | O1 | O2
  AFZ | FZ | FCZ | CZ | CPZ | PZ | POz | OZ
  FPZ
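
Conceptually, each `alias` line maps one or more alternative labels onto a preferred label. The sketch below mimics that lookup in plain Python. It assumes the file is tab-delimited with `|`-separated names and optional double quotes, and that matching is case-insensitive; `parse_alias_lines` and `resolve` are hypothetical helpers, not lunapi functions.

```python
def parse_alias_lines(lines):
    """Build an upper-cased lookup from lines of the form
    'alias<TAB>Preferred|alt1|alt2|"alt with spaces"'."""
    lookup = {}
    for line in lines:
        keyword, _, value = line.rstrip("\n").partition("\t")
        if keyword != "alias" or not value:
            continue  # ignore anything that is not an alias definition
        names = [name.strip('"') for name in value.split("|")]
        preferred = names[0]
        for name in names:
            # The preferred label is included so that e.g. FP1 maps to Fp1
            lookup[name.upper()] = preferred
    return lookup

def resolve(label, lookup):
    """Return the preferred label, or the input unchanged if nothing matches."""
    return lookup.get(label.upper(), label)

# Two lines copied from the cmaps file shown above
cmaps_lines = [
    'alias\tFp1|Fp1-(M1+M2)/2|EEG-Fp1|"Fp1 REF"',
    'alias\tF3|F3-(M1+M2)/2|EEG-F3|"F3 REF"',
]
lookup = parse_alias_lines(cmaps_lines)
for raw in ["Fp1-(M1+M2)/2", "fp1 ref", "EEG-F3", "EMG"]:
    print(raw, "->", resolve(raw, lookup))
```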


Having set _aliases_ (or other types of mappings, e.g. for annotations) it can be useful to make lunapi tell us what the current "conversion table" looks like.  We can do this with project-level `aliases()`.

In [19]:
#luna s1.lst @work/data/aux/cmaps -o out.db -s HEADERS
# w/ aliases attached
proj.aliases()

Unnamed: 0,Type,Preferred,"Case-insensitive, sanitized alias"
1,CH,A1,A1 REF
2,CH,A1,A1__M1_M2__2
3,CH,A2,A2 REF
4,CH,A2,A2__M1_M2__2
5,CH,AF3,AF3 REF
6,CH,AF3,AF3__M1_M2__2
7,CH,AF4,AF4 REF
8,CH,AF4,AF4__M1_M2__2
9,CH,AFZ,AFZ REF
10,CH,AFZ,AFZ__M1_M2__2


Note the above also includes some default stage/annotation mappings that are hard-coded in Luna (these can be turned off).

## Sample rates and units

Next we'll look at sample rates and units for the channels.  We'll re-run `HEADERS` as we now have the aliases defined, and so the output will be more consistent across channel labels.

In [20]:
res = proj.silent_proc( 'HEADERS' )

In [21]:
tbl = proj.table( 'HEADERS' , 'CH' ) 

In [22]:
tbl['CH'].unique()

array(['A1', 'A2', 'AF3', 'AF4', 'AFZ', 'C1', 'C2', 'C3', 'C4', 'C5',
       'C6', 'CP1', 'CP2', 'CP3', 'CP4', 'CP5', 'CP6', 'CPZ', 'CZ', 'F1',
       'F2', 'F3', 'F4', 'F5', 'F6', 'F7', 'F8', 'FC1', 'FC2', 'FC3',
       'FC4', 'FC5', 'FC6', 'FCZ', 'FPZ', 'FT7', 'FT8', 'FZ', 'Fp1',
       'Fp2', 'O1', 'O2', 'OZ', 'P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7',
       'P8', 'PO3', 'PO4', 'POz', 'PZ', 'T7', 'T8', 'TP7', 'TP8'],
      dtype=object)

That is, above we now only see a smaller set of unique labels, as expected.
We can get a table/count of channel labels (N = number of individuals).

In [23]:
tbl = tbl[[ 'ID', 'CH', 'SR', 'PDIM' ]]
tbl.groupby('CH').size().reset_index(name='Count')

Unnamed: 0,CH,Count
0,A1,19
1,A2,19
2,AF3,20
3,AF4,20
4,AFZ,20
5,C1,20
6,C2,20
7,C3,20
8,C4,20
9,C5,20


A table of counts for each unique channel / sample-rate / unit combination.

In [24]:
tbl.groupby([ 'CH', 'SR' , 'PDIM' ]).size().reset_index(name='Count')

Unnamed: 0,CH,SR,PDIM,Count
0,A1,128.0,mV,1
1,A1,128.0,uV,17
2,A1,150.0,uV,1
3,A2,128.0,mV,1
4,A2,128.0,uV,17
...,...,...,...,...
172,TP7,128.0,uV,18
173,TP7,150.0,uV,1
174,TP8,128.0,mV,1
175,TP8,128.0,uV,18


A table of counts for each unique sample-rate / unit combination (pooling across channels).

In [25]:
tbl.groupby([ 'SR' , 'PDIM' ]).size().reset_index(name='Count')

Unnamed: 0,SR,PDIM,Count
0,128.0,mV,59
1,128.0,uV,1059
2,150.0,uV,59


Listing the unit and sample rate per individual for C3 only.

In [26]:
# destrat out.db +HEADERS -r CH/C3 -v SR PDIM
tbl.loc[ tbl.CH == 'C3' ] 

Unnamed: 0,ID,CH,SR,PDIM
138,F01,C3,128.0,uV
139,F02,C3,150.0,uV
140,F03,C3,128.0,mV
141,F04,C3,128.0,uV
142,F05,C3,128.0,uV
143,F06,C3,128.0,uV
144,F07,C3,128.0,uV
145,F08,C3,128.0,uV
146,F09,C3,128.0,uV
147,F10,C3,128.0,uV
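
Rather than scanning such a listing by eye, the outliers can be flagged programmatically. A minimal sketch in plain Python, using only the ten C3 rows shown above (with the real data one would iterate over `tbl` instead):

```python
from collections import Counter

# (ID, SR, PDIM) for C3, copied from the ten rows shown above
c3_rows = [
    ("F01", 128.0, "uV"), ("F02", 150.0, "uV"), ("F03", 128.0, "mV"),
    ("F04", 128.0, "uV"), ("F05", 128.0, "uV"), ("F06", 128.0, "uV"),
    ("F07", 128.0, "uV"), ("F08", 128.0, "uV"), ("F09", 128.0, "uV"),
    ("F10", 128.0, "uV"),
]

# Most common (sample rate, unit) pair across recordings
pair_counts = Counter((sr, unit) for _, sr, unit in c3_rows)
(modal_sr, modal_unit), _ = pair_counts.most_common(1)[0]

# Recordings whose sample rate or unit differs from the modal pair
deviating = [row for row in c3_rows if (row[1], row[2]) != (modal_sr, modal_unit)]
print(modal_sr, modal_unit)
print(deviating)
```

In these rows, `F02` (150 Hz) and `F03` (mV) are the ones that need resampling or unit conversion.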


## Generating new EDFs

Given the channel mapping above, we can create a new set of EDFs with mapped labels. We'll also harmonize sample rates and units, and apply linked-mastoid referencing while we're at it.

Because `M09` already has linked-mastoid referencing, we'll `skip` this person in the run below.
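
For reference, linked-mastoid referencing amounts to subtracting the average of the two mastoid channels from each EEG channel, which is what the `-(M1+M2)/2` suffix in the original `M09` labels denotes. The sketch below illustrates the arithmetic on made-up samples. It assumes that a two-channel reference is the sample-wise mean of `A1` and `A2`, and `linked_mastoid_reference` is a hypothetical helper, not a Luna or lunapi function.

```python
def linked_mastoid_reference(channels, ref_labels=("A1", "A2")):
    """Subtract the sample-wise mean of the reference channels from every
    other channel; `channels` maps label -> list of samples."""
    refs = [channels[label] for label in ref_labels]
    ref_mean = [sum(samples) / len(samples) for samples in zip(*refs)]
    # Reference channels themselves are left out of the result
    return {
        label: [x - r for x, r in zip(samples, ref_mean)]
        for label, samples in channels.items()
        if label not in ref_labels
    }

# Made-up samples (in uV) for two EEG channels and the two mastoids
toy = {
    "C3": [10.0, 12.0, 14.0],
    "C4": [20.0, 18.0, 16.0],
    "A1": [2.0, 4.0, 6.0],
    "A2": [4.0, 4.0, 2.0],
}
rereferenced = linked_mastoid_reference(toy)
print(rereferenced)
```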

In [27]:
proj.var( 'skip' , 'M09' )
proj.vars()

{'alias': 'A2|M2|EEG-M2|"M2 REF"',
 'anterio-frontal': 'AF7,AF3,AFZ,AF8,AF4',
 'anterior': 'FP1,AF7,AF3,F1,F3,F5,F7,FPZ,AFZ,FZ,FP2,AF8,AF4,F2,F4,F6,F8',
 'central': 'FT7,FC5,FC3,FC1,C1,C3,C5,T7,TP7,CP5,CP3,CP1,CPZ,FCZ,CZ,FT8,FC6,FC4,FC2,C2,C4,C6,T8,TP8,CP6,CP4,CP2',
 'centro-parietal': 'CP5,CP3,CP1,CPZ,CP6,CP4,CP2',
 'frontal': 'F1,F3,F5,F7,FZ,F2,F4,F6,F8',
 'fronto-central': 'FC5,FC3,FC1,FCZ,FC6,FC4,FC2',
 'keep-spaces': 'T',
 'left': 'FP1,AF7,AF3,F1,F3,F5,F7,FT7,FC5,FC3,FC1,C1,C3,C5,T7,TP7,CP5,CP3,CP1,P1,P3,P5,P7,P9,PO7,PO3,O1',
 'mid-central': 'C1,C3,C5,CZ,C2,C4,C6',
 'midline': 'IZ,OZ,POZ,PZ,CPZ,FPZ,AFZ,FZ,FCZ,CZ',
 'occiptital': 'O1,IZ,OZ,O2',
 'order-signals': 'T',
 'parietal': 'P1,P3,P5,P7,P9,PZ,P2,P4,P6,P8,P10',
 'parieto-occipital': 'PO7,PO3,POZ,PO8,PO4',
 'posterior': 'P1,P3,P5,P7,P9,PO7,PO3,O1,IZ,OZ,POZ,PZ,P2,P4,P6,P8,P10,PO8,PO4,O2',
 'pre-frontal': 'FP1,FPZ,FP2',
 'right': 'FP2,AF8,AF4,F2,F4,F6,F8,FT8,FC6,FC4,FC2,C2,C4,C6,T8,TP8,CP6,CP4,CP2,P2,P4,P6,P8,P10,PO8,PO4,O2',
 's

This does the actual processing - __it may take a couple of minutes to finish__

In [28]:
res = proj.proc( """ RESAMPLE sr=128
                     uV 
                     REFERENCE sig=${eeg} ref=A1,A2
                     SIGNALS drop=A1,A2
                     WRITE edf-dir=../work/harm1 """ )

___________________________________________________________________
Processing: F01 | ../work/data/edfs//F01.edf
 duration 03.00.30, 10830s | time 22.00.00 - 04.58.00 | date 01.01.85

 signals: 60 (of 60) selected in an EDF+D file
  Fp1 | Fp2 | AF3 | AF4 | F7 | F5 | F3 | F1
  F2 | F4 | F6 | F8 | FT7 | FC5 | FC3 | FC1
  FC2 | FC4 | FC6 | FT8 | T7 | C5 | C3 | C1
  C2 | C4 | C6 | T8 | TP7 | CP5 | CP3 | CP1
  CP2 | CP4 | CP6 | TP8 | P7 | P5 | P3 | P1
  P2 | P4 | P6 | P8 | PO3 | PO4 | O1 | O2
  AFZ | FZ | FCZ | CZ | CPZ | PZ | POz | OZ
  A1 | A2 | FPZ | EDF Annotations
  extracting 'EDF Annotations' track from EDF+
 ..................................................................
 CMD #1: RESAMPLE
   options: sig=* sr=128
 ..................................................................
 CMD #2: uV
   options: sig=*
 ..................................................................
 CMD #3: REFERENCE
   options: ref=A1,A2 sig=Fp1,Fp2,AF3,AF4,F7,F5,F3,F1,F2,F4,F6,F8,FT7,FC5,FC3,FC1,FC2,

Assuming the run completes, it should have added 19 EDFs to `../work/harm1/`.

In [33]:
%%sh
ls -lR ../work/harm1

total 15320144
-rw-r--r--@ 1 smp37  staff  158371364 Sep 11 10:59 F01.edf
-rw-r--r--@ 1 smp37  staff  376969984 Sep 11 11:00 F02.edf
-rw-r--r--@ 1 smp37  staff  381216256 Sep 11 11:00 F03.edf
-rw-r--r--@ 1 smp37  staff  384105472 Sep 11 11:00 F04.edf
-rw-r--r--@ 1 smp37  staff  392627200 Sep 11 11:00 F05.edf
-rw-r--r--@ 1 smp37  staff  357664768 Sep 11 11:01 F06.edf
-rw-r--r--@ 1 smp37  staff  413347840 Sep 11 11:01 F07.edf
-rw-r--r--@ 1 smp37  staff  368973568 Sep 11 11:01 F08.edf
-rw-r--r--@ 1 smp37  staff  341365504 Sep 11 11:01 F09.edf
-rw-r--r--@ 1 smp37  staff  404254916 Sep 11 11:01 F10.edf
-rw-r--r--@ 1 smp37  staff  383565568 Sep 11 11:01 M01.edf
-rw-r--r--@ 1 smp37  staff  387622144 Sep 11 11:01 M02.edf
-rw-r--r--@ 1 smp37  staff  410633728 Sep 11 11:02 M03.edf
-rw-r--r--@ 1 smp37  staff  409049344 Sep 11 11:02 M04.edf
-rw-r--r--@ 1 smp37  staff  408663808 Sep 11 11:02 M05.edf
-rw-r--r--@ 1 smp37  staff  399894016 Sep 11 11:02 M06.edf
-rw-r--r--@ 1 smp37  staff  437191168 Sep

Now we will add the final EDF for `M09` which we are handling differently.

In [30]:
# now handle M09 : first clear the `skip` variable
proj.clear_vars( 'skip' )
#luna s1.lst @work/data/aux/cmaps id=M09   -s WRITE edf-dir=work/harm1
p = proj.inst( 'M09' )
p.eval( 'WRITE edf-dir=../work/harm1' )

___________________________________________________________________
Processing: M09 | ../work/data/edfs//M09.edf
 duration 08.08.32, 29312s | time 22.00.00 - 06.08.32 | date 01.01.85

 signals: 57 (of 57) selected in a standard EDF file
  Fp1 | Fp2 | AF3 | AF4 | F7 | F5 | F3 | F1
  F2 | F4 | F6 | F8 | FT7 | FC5 | FC3 | FC1
  FC2 | FC4 | FC6 | FT8 | T7 | C5 | C3 | C1
  C2 | C4 | C6 | T8 | TP7 | CP5 | CP3 | CP1
  CP2 | CP4 | CP6 | TP8 | P7 | P5 | P3 | P1
  P2 | P4 | P6 | P8 | PO3 | PO4 | O1 | O2
  AFZ | FZ | FCZ | CZ | CPZ | PZ | POz | OZ
  FPZ
 ..................................................................
 CMD #1: WRITE
   options: edf-dir=../work/harm1 sig=*
  no epoch mask set, no restructuring needed
  data are not truly discontinuous
  writing as a standard EDF
  writing 57 channels
  saved new EDF, ../work/harm1/M09.edf


Unnamed: 0,Command,Strata
0,WRITE,BL


Now we should have all 20 EDFs present.

In [34]:
%%sh
ls ../work/harm1

F01.edf
F02.edf
F03.edf
F04.edf
F05.edf
F06.edf
F07.edf
F08.edf
F09.edf
F10.edf
M01.edf
M02.edf
M03.edf
M04.edf
M05.edf
M06.edf
M07.edf
M08.edf
M09.edf
M10.edf
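
A check like this can also be scripted instead of read off the listing. The sketch below is plain Python; `missing_edfs` is a hypothetical helper, and the expected IDs are assumed to be `F01`..`F10` and `M01`..`M10`, as in the listing above. The simulated listing leaves out `M09` to mimic the state before the final `WRITE`.

```python
from pathlib import Path

def missing_edfs(expected_ids, filenames):
    """Return the IDs that have no '<ID>.edf' entry among `filenames`."""
    present = {Path(name).stem for name in filenames if name.lower().endswith(".edf")}
    return [rec_id for rec_id in expected_ids if rec_id not in present]

# IDs as they appear in the listing above: F01..F10 and M01..M10
expected = [f"{prefix}{i:02d}" for prefix in ("F", "M") for i in range(1, 11)]

# Simulated directory listing in which M09 has not yet been written
listing = [f"{rec_id}.edf" for rec_id in expected if rec_id != "M09"]
print(missing_edfs(expected, listing))
```

In practice, `listing` could come from `os.listdir('../work/harm1')`, and an empty result would mean that all 20 EDFs are present.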


Next, we'll look at [the annotations](06_anns.ipynb).