# LunAPI : advanced notes

Links to notebooks in this repository: [Index](./00_overview.ipynb) | [Individuals](./01_indivs.ipynb) | [Projects](./02_projects.ipynb) | [Staging](./03_staging.ipynb) | [Models](./04_models.ipynb) | [Advanced](./98_advanced.ipynb) | [Reference](./99_reference.ipynb)

---

___This notebook is under development - further notes and FAQs will be added over time.___  

Also see the [reference](./99_reference.ipynb) notebook for a comprehensive list of all high-level lunapi functions.

In [1]:
import lunapi as lp
proj = lp.proj()

initiated lunapi v0.0.5 <lunapi.lunapi0.luna object at 0x10ff01cf0> 



We'll attach one example individual to demonstrate features below.

In [2]:
proj.var( 'path' , '/tutorial/' )
proj.sample_list( '/tutorial/s.lst' )
p = proj.inst( 0 )

read 3 individuals from ~/tutorial/s.lst


___________________________________________________________________
Processing: nsrr01 | /Users/smp37/tutorial/edfs/learn-nsrr01.edf
 duration 11.22.00, 40920s | time 21.58.17 - 09.20.17 | date 01.01.85

 signals: 14 (of 14) selected in a standard EDF file
  SaO2 | PR | EEG_sec | ECG | EMG | EOG_L | EOG_R | EEG
  AIRFLOW | THOR_RES | ABDO_RES | POSITION | LIGHT | OX_STAT


## Package structure

The `lunapi` package contains two levels of interface: a high-level (`lunapi1`) set of Python functions ([lunapi/lunapi1.py](https://github.com/remnrem/luna-api/blob/c592e23fe3ecec03774ed245062cab797aee200c/src/lunapi/lunapi1.py)), which are wrappers around lower-level Python bindings to the core Luna C/C++ API (`lunapi0`), based on the C/C++ code in [lunapi/lunapi0.cpp](https://github.com/remnrem/luna-api/blob/c592e23fe3ecec03774ed245062cab797aee200c/src/lunapi/lunapi0.cpp). Users will generally want to work with the higher-level functions. 

---
## Luna to Python mapping

### Command-line

A standard command-line Luna run might look as follows. Given a sample-list `s.lst` and a command script `cmd.txt`, we set two _project-wide_ variables (`x=1` and `y=2`, presumably used in `cmd.txt`) and also pass a _special variable_ that defines some channel _aliases_ (here mapping both `EKG` and `ECG1` to `ECG`):

```
luna s.lst x=1 y=2 alias="ECG|EKG,ECG1" -o out.db < cmd.txt
```

This would generate an output database `out.db`, which might be queried with `destrat`.  We can first check the contents of the database:
```
destrat out.db
```
If `cmd.txt` contained `PSD` commands (and assuming the corresponding outputs were listed by the command above), we might write something like:
```
destrat out.db +PSD -r B CH
```
to access the channel-by-band result strata.

### Python

Within `lunapi`, the above steps would be represented as follows: we first initiate the project:

```
import lunapi as lp
proj = lp.proj()
```
We then attach a sample-list:
```
proj.sample_list( 's.lst' )
```
and set the project-wide variables, along with the special variable `alias`:
```
proj.varmap( { 'x':1 , 'y':2 , 'alias': 'ECG|EKG,ECG1' } )
```
and finally run the script in `cmd.txt`:
```
res = proj.eval( lp.cmdfile( 'cmd.txt' ) )
```
To see the results, we use `strata()`:
```
proj.strata()
```
We could then access the output using `table()`:
```
proj.table( 'PSD' , 'B_CH' )
```
for the same channel-by-band result strata as above.   

The results should be similar, as exactly the same underlying C/C++ library is used in both cases.
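
When porting an existing command line, the `key=value` arguments have to be turned into the dictionary passed to `varmap()`. The sketch below shows one way to do that in plain Python. The helper name `parse_luna_vars` is hypothetical and not part of `lunapi`. It keeps every value as a string, whereas the example above passes integers; whether `varmap()` treats the two identically is not confirmed here.

```python
import shlex

def parse_luna_vars(argline):
    """Turn 'x=1 y=2 alias="ECG|EKG,ECG1"' into a dict of strings.

    Hypothetical helper for illustration only; not part of lunapi.
    """
    pairs = {}
    for token in shlex.split(argline):      # shlex handles the shell-style quotes
        if "=" not in token:
            continue                        # skip anything that is not key=value
        key, value = token.split("=", 1)    # split on the first '=' only
        pairs[key] = value
    return pairs

opts = parse_luna_vars('x=1 y=2 alias="ECG|EKG,ECG1"')
print(opts)
```

The resulting dictionary could then be handed to `proj.varmap( opts )`.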


## Instance referencing

When creating an individual instance, __the returned object is only a reference to the instance, not the instance itself__.  This distinction matters when working with multiple instances or trying to copy them.  In brief, suppose `p` refers to a Luna instance `id1` (as returned by `proj.inst()`) and we then assign `q = p`.

At this point, the following statements hold:

 - `q` points to the same object as `p`
 - therefore, if `p` is changed, so is `q` (and vice versa)
 - if `p` is deleted (`del p`), `q` will nonetheless continue to point to the same Luna instance `id1`
 - if both `p` and `q` are deleted, then Python's garbage collector will release any resources used to store `id1` (i.e. based on Python's _reference-counting_ approach to memory management - meaning that you normally shouldn't have to worry about explicitly freeing resources)
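
The statements above are ordinary Python reference semantics, so they can be checked without Luna at all. The sketch below uses a stand-in class (`FakeInstance`, an assumption made purely for illustration and not part of `lunapi`) and a weak reference to observe when the object is released. The immediate release after the last `del` relies on CPython's reference counting.

```python
import weakref

class FakeInstance:
    """Stand-in for a Luna instance; NOT part of lunapi."""
    def __init__(self, ident):
        self.ident = ident
        self.masked = False

p = FakeInstance("id1")
q = p                                    # q is a second name for the same object
same_object = q is p

p.masked = True                          # a change made through p ...
seen_via_q = q.masked                    # ... is visible through q

probe = weakref.ref(p)                   # observes the object without keeping it alive
del p
alive_after_del_p = probe() is not None  # q still holds a reference
del q
alive_after_del_q = probe() is not None  # no names remain, so CPython frees the object

print(same_object, seen_via_q, alive_after_del_p, alive_after_del_q)
```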

The figure below shows how two Luna instances (`id1` and `id2`) might be created and referenced by different Python objects (`p`, `q` and `r`), which should make the above distinctions clear.

<img src="img/py-refs.png" width=40% height=40%>

This raises the question: if you _do_ want to make a copy of an instance, how is that done?  Currently, the only way is to repeat the steps that load the same data, assigning the results to different references (e.g. `p` and `q`, each from its own call to `inst()`).  Both objects can then be processed separately, as changes to `p` will no longer affect `q` and vice versa.

## Controlling inputs

### Multi-line Luna scripts

You can use Python triple-quotes to write multi-line scripts; for longer scripts, it is better to read from a file (see below).

In [4]:
p.eval( """ MASK ifnot=N2 & RE
            RE
            PSD sig=EEG spectrum dB """ )

 ..................................................................
 CMD #1: MASK
   options: ifnot=N2 sig=*
  set masking mode to 'force'
  annots: N2
  applied annotation mask for 1 annotation(s)
  523 epochs match; 841 newly masked, 0 unmasked, 523 unchanged
  total of 523 of 1364 retained
 ..................................................................
 CMD #2: RE
   options: sig=*
  restructuring as an EDF+:  retaining 15690 of 40920 records
  of 682 minutes, dropping 420.5, retaining 261.5
  resetting mask
  clearing any cached values and recording options
  retaining 523 epochs
 ..................................................................
 CMD #3: RE
   options: sig=*
  no epoch mask set, no restructuring needed
 ..................................................................
 CMD #4: PSD
   options: dB sig=EEG spectrum
  calculating PSD from 0.5 to 25 for 1 signals


|   | Command | Stata |
|---|---------|-------|
| 0 | MASK | EMASK |
| 1 | PSD | B_CH |
| 2 | PSD | CH |
| 3 | PSD | CH_F |
| 4 | RE | BL |
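
The log above lists four commands (`MASK`, `RE`, `RE`, `PSD`) for a three-line script, which suggests that both a newline and `&` act as command separators. The sketch below reproduces that splitting in plain Python. `split_commands` is a hypothetical helper written for illustration; it is not Luna's actual parser and ignores details such as comments or quoting.

```python
import re

def split_commands(script):
    """Split a Luna-style script into commands on newlines and '&'.

    Hypothetical helper for illustration only; not part of lunapi.
    """
    parts = re.split(r"[&\n]", script)
    # collapse runs of whitespace and drop empty pieces
    return [" ".join(p.split()) for p in parts if p.strip()]

script = """ MASK ifnot=N2 & RE
            RE
            PSD sig=EEG spectrum dB """

for i, cmd in enumerate(split_commands(script), start=1):
    print(f"CMD #{i}: {cmd}")
```

Run on the script used above, this yields `MASK ifnot=N2`, `RE`, `RE` and `PSD sig=EEG spectrum dB`, in the same order as the log. It also makes clear why `RE` is reported twice: it appears both after the `&` on the first line and on a line of its own.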


### Running command files

The `lp.cmdfile()` utility reads in a Luna script from a file and formats it (e.g. removing comments) so that it can be used by `eval()` or `proc()`:

In [6]:
lp.cmdfile( '/tutorial/cmd/first.txt')

'EPOCH len=30 & TAG tag=STAGE/${stage} & MASK ifnot=${stage} & RESTRUCTURE & STATS sig=EEG'

In [7]:
proj.var( 'stage' , 'N2' )
p.eval( lp.cmdfile( '/tutorial/cmd/first.txt') )

 ..................................................................
 CMD #1: EPOCH
   options: len=30 sig=*
  set epochs, length 30 (step 30, offset 0), 523 epochs
 ..................................................................
 CMD #2: TAG
   options: sig=* tag=STAGE/N2
  setting analysis tag to [STAGE/N2]
 ..................................................................
 CMD #3: MASK
   options: ifnot=N2 sig=*
  set masking mode to 'force'
  annots: N2
  applied annotation mask for 1 annotation(s)
  523 epochs match; 0 newly masked, 0 unmasked, 523 unchanged
  total of 523 of 523 retained
 ..................................................................
 CMD #4: RESTRUCTURE
   options: sig=*
 ..................................................................
 CMD #5: STATS
   options: sig=EEG
 processing EEG ...

  ***           but currently an epoch mask set has been set;
  ***           for this operation to skip masked epochs,
  ***           you need to run RE (RESTRUCTU

|   | Command | Stata |
|---|---------|-------|
| 0 | EPOCH | BL |
| 1 | MASK | EMASK_STAGE |
| 2 | RESTRUCTURE | STAGE |
| 3 | STATS | CH_STAGE |
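
In the run above, the script returned by `lp.cmdfile()` still contains `${stage}`, while the log shows `tag=STAGE/N2` and `ifnot=N2` after `proj.var( 'stage' , 'N2' )` was set. Luna performs this substitution itself. The sketch below only illustrates the effect using Python's `string.Template`, which happens to share the `${name}` syntax; it is an assumption that this mirrors Luna's behaviour for simple cases like this one, and Luna's own variable rules may differ.

```python
from string import Template

# Script text copied from the lp.cmdfile() output shown above.
script = ("EPOCH len=30 & TAG tag=STAGE/${stage} & MASK ifnot=${stage} "
          "& RESTRUCTURE & STATS sig=EEG")

# Illustration only: Luna expands ${stage} internally when the script is evaluated.
expanded = Template(script).substitute(stage="N2")
print(expanded)
```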


### Import external databases

The `import_db()` command populates the project-level result cache with the contents of a previously-generated Luna output database. (Note: this example will not run unless you've previously created the file `out.db` via Luna command line in this folder.)

In [10]:
proj.import_db( 'out.db' ) 

  read data on 3 individuals from out.db


['nsrr01', 'nsrr02', 'nsrr03']

In [11]:
proj.commands()

|   | Command |
|---|---------|
| 0 | STATS |
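
Because the import only works when `out.db` already exists, a notebook that should run from top to bottom may want to guard the call. The sketch below is one way to do that. `import_if_present` is a hypothetical helper, and `FakeProj` is a stand-in used only so the example runs without Luna; with a real project, the helper would simply forward to `proj.import_db()` as used above.

```python
import tempfile
from pathlib import Path

def import_if_present(proj, db_path):
    """Call proj.import_db() only if the database file exists.

    Hypothetical helper for illustration only; not part of lunapi.
    """
    path = Path(db_path)
    if not path.is_file():
        print(f"skipping import: {path} not found")
        return None
    return proj.import_db(str(path))

class FakeProj:
    """Stand-in project; NOT part of lunapi."""
    def import_db(self, path):
        return path          # a real project would load the database here

with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "out.db"
    present.touch()                                   # empty placeholder file
    found = import_if_present(FakeProj(), present)    # file exists: call is forwarded

missing = import_if_present(FakeProj(), "no_such_dir/out.db")  # file absent: skipped
```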


## Controlling output

### Silencing log output

In [12]:
# turn off logging
proj.silence()

# turn it back on
proj.silence( False )
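
If logging should be off only for one block of work, the two calls above can be wrapped in a context manager so that logging is switched back on even if an error occurs. This is a sketch: `quiet` is a hypothetical helper, `FakeProj` is a stand-in used so the example runs without Luna, and the code assumes logging was on beforehand (it always re-enables it on exit).

```python
from contextlib import contextmanager

@contextmanager
def quiet(proj):
    """Silence Luna logging inside a with-block, then turn it back on.

    Hypothetical helper for illustration only; not part of lunapi.
    """
    proj.silence()               # turn logging off
    try:
        yield proj
    finally:
        proj.silence(False)      # turn it back on, even if the block raised

class FakeProj:
    """Stand-in project that records silence() calls; NOT part of lunapi."""
    def __init__(self):
        self.calls = []
    def silence(self, flag=True):
        self.calls.append(flag)

demo = FakeProj()
with quiet(demo):
    pass                         # e.g. run a long eval() here without log output
print(demo.calls)                # [True, False]
```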

### Writing to text-tables

(to be added: to allow generating text-table outputs directly via `lunapi`, i.e. as may be useful for very large jobs)

# Advanced topics

This section briefly covers a few more advanced topics to help orient you when working with `lunapi`.

## `lunapi1` and `lunapi0` functions

In the example below, `p` is the level-1 object (i.e. wrapper plus instance), whereas `raw` is the level-0 object (i.e. a vanilla instance).   Most functions are similar; only the output format differs (e.g. a pandas dataframe versus built-in Python objects such as lists or dictionaries).

In [13]:
p

<lunapi-instance id:nsrr01 edf:/Users/smp37/tutorial/edfs/learn-nsrr01.edf annot:~/tutorial/edfs/learn-nsrr01-profusion.xml>

Here we get the raw, level-0 instance (the `edf` member of the level-1 instance):

In [15]:
raw = p.edf

In [17]:
raw

<lunapi-instance id:nsrr01 edf:/Users/smp37/tutorial/edfs/learn-nsrr01.edf annot:~/tutorial/edfs/learn-nsrr01-profusion.xml>

We can compare the output of various functions: e.g. channel listings:

In [14]:
p.channels()

|   | Channels |
|---|----------|
| 0 | SaO2 |
| 1 | PR |
| 2 | EEG_sec |
| 3 | ECG |
| 4 | EMG |
| 5 | EOG_L |
| 6 | EOG_R |
| 7 | EEG |
| 8 | AIRFLOW |
| 9 | THOR_RES |


In [18]:
raw.channels()

['SaO2',
 'PR',
 'EEG_sec',
 'ECG',
 'EMG',
 'EOG_L',
 'EOG_R',
 'EEG',
 'AIRFLOW',
 'THOR_RES',
 'ABDO_RES',
 'POSITION',
 'LIGHT',
 'OX_STAT']

As a second example, using `stat()`:

In [19]:
p.stat()

|   | Value |
|---|-------|
| annotation_files | ~/tutorial/edfs/learn-nsrr01-profusion.xml |
| duration | 04.21.30.000 |
| edf_file | /Users/smp37/tutorial/edfs/learn-nsrr01.edf |
| elen | 30.0 |
| id | nsrr01 |
| na | 10 |
| ne | 523 |
| nem | 0 |
| ns | 14 |
| nt | 14 |


In [20]:
raw.stat()

{'annotation_files': '~/tutorial/edfs/learn-nsrr01-profusion.xml',
 'duration': '04.21.30.000',
 'edf_file': '/Users/smp37/tutorial/edfs/learn-nsrr01.edf',
 'elen': 30.0,
 'id': 'nsrr01',
 'na': 10,
 'ne': 523,
 'nem': 0,
 'ns': 14,
 'nt': 14,
 'state': 1}

When a new instance is created, this happens _internally_ within the Luna C/C++ library.  The lunapi Python objects simply point to that single underlying instance.  This means that if you assign an instance to a second variable, both variables point to the same underlying data: if you change one, you'll see that the other has changed too.

We can also access multiple individuals at the same time:

In [None]:
a = proj.inst(1)
b = proj.inst(2)

In [None]:
a

In [None]:
b

Note that `a` and `b` are simply references to the underlying data, which are stored within the core Luna C/C++ library.  

In [None]:
a = proj.inst(0)
a

In [None]:
b = a
b

In [None]:
a.stat()

In [None]:
b.stat()

In [None]:
a.eval( 'MASK ifnot=N2 & RE' )
a.stat()

In [None]:
b.stat()

However, if you create two instances with separate calls to `inst()`, they are independent and can differ:

In [None]:
p1 = proj.inst(0)
p2 = proj.inst(0)

In [None]:
p1

In [None]:
p2

In [None]:
p1.stat()

In [None]:
p2.stat()

In [None]:
p1.eval('MASK ifnot=N2 & RE')

In [None]:
p1.stat()

In [None]:
p2.stat()

That is, unique underlying instances are created by the `inst()` function: one invocation of `inst()` yields one instance.

There may be some programmatic cases where you want to avoid the overhead of pandas dataframes, in which case it may be more convenient to extract the simpler built-in Python objects.  For most purposes, the higher-level functions will be preferable.
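
As a small illustration of working with the built-in objects directly, the sketch below uses values copied from the `raw.channels()` and `raw.stat()` outputs shown earlier, so it runs without Luna. The reading of `ne` as the number of epochs and `elen` as the epoch length in seconds is an assumption based on the field names and the logs above, as is the `hh.mm.ss.mmm` layout of `duration`.

```python
# Values copied from the raw.channels() and raw.stat() outputs shown above.
channels = ['SaO2', 'PR', 'EEG_sec', 'ECG', 'EMG', 'EOG_L', 'EOG_R', 'EEG',
            'AIRFLOW', 'THOR_RES', 'ABDO_RES', 'POSITION', 'LIGHT', 'OX_STAT']
stat = {'duration': '04.21.30.000', 'elen': 30.0, 'id': 'nsrr01',
        'na': 10, 'ne': 523, 'nem': 0, 'ns': 14, 'nt': 14, 'state': 1}

# Plain list operations: no dataframe needed to select channels by name.
eeg_like = [ch for ch in channels if ch.startswith('EEG')]

# Assumed meaning: ne = number of epochs, elen = epoch length in seconds.
epoch_seconds = stat['ne'] * stat['elen']

# Assumed layout of 'duration': hours.minutes.seconds.milliseconds
hh, mm, ss, _ms = stat['duration'].split('.')
duration_seconds = int(hh) * 3600 + int(mm) * 60 + int(ss)

print(eeg_like, epoch_seconds, duration_seconds)
```

Under these assumptions the two totals agree (523 epochs of 30 seconds is 15690 seconds, the same as 4 h 21 min 30 s), which is a quick consistency check that needs nothing beyond a dictionary lookup.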