# Setup

In [1]:
from typing import *
from pathlib import Path

import pandas as pd
import numpy as np

from constraints import (
  Constraint, filter_df, ensure_constraint,
  FIELD, NOT, OR, AND, EQ, RANGE
)

## Set the paths
We have a data directory `DATA_DIR`

In [2]:
DATA_DIR: Final[Path] = Path('./data/')
DATA_DIR.mkdir(parents=True, exist_ok=True)
DATA_DIR

PosixPath('data')

and the `manifest.json` inside it

In [3]:
MANIFEST_PATH: Final[Path] = DATA_DIR / 'manifest.json'
MANIFEST_PATH

PosixPath('data/manifest.json')

## Get the cache 

Import the cache and session types from `allensdk`.

In [4]:
from allensdk.brain_observatory.ecephys.ecephys_project_cache import EcephysProjectCache
from allensdk.brain_observatory.ecephys.ecephys_session import EcephysSession

and create type aliases for them

In [5]:
Cache: TypeAlias = EcephysProjectCache
Session: TypeAlias = EcephysSession

Create the global cache `CACHE`

In [6]:
CACHE: Final[Cache] = EcephysProjectCache.from_warehouse(manifest=MANIFEST_PATH, timeout=30*60)

and load the session table into `SESSIONS_TABLE`

In [None]:
SESSIONS_TABLE = CACHE.get_session_table()

# Constraints Intro

The module `constraints` is a custom module defined for this project
in `constraints.py`.  It provides a way to abstract away the process
of filtering data.

`constraints` defines the abstract `Constraint` type, which exposes
two methods: `Constraint.__contains__` and `Constraint.mask`.

The method `c.__contains__(x)`, that is `x in c`, returns
`True` if `x` satisfies the constraint `c`.

The function `c.mask(df)` returns a mask for the pandas dataframe or
series `df` which, when applied (as `df[c.mask(df)]`), returns the
rows of `df` which satisfy `c`.
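
As a rough illustration of this interface, the sketch below defines a
stand-in abstract class with the two methods.  The names
`SketchConstraint` and `LessThan` are hypothetical and are not taken
from `constraints.py`; the element-wise default for `mask` is an
assumption, and the real implementation may differ.

```python
from abc import ABC, abstractmethod

import pandas as pd


class SketchConstraint(ABC):
  """Hypothetical stand-in for the interface described above."""

  @abstractmethod
  def __contains__(self, x) -> bool:
    """Return True if the single object `x` satisfies the constraint."""

  def mask(self, values: pd.Series) -> pd.Series:
    """Return a boolean mask aligned with `values` (element-wise test)."""
    return values.map(lambda x: x in self).astype(bool)


class LessThan(SketchConstraint):
  """Hypothetical example constraint: matches objects below `bound`."""

  def __init__(self, bound):
    self.bound = bound

  def __contains__(self, x) -> bool:
    return x < self.bound


c = LessThan(3)
s = pd.Series([1, 2, 3, 4])
selected = s[c.mask(s)]  # keeps only the entries that satisfy `c`
```

Here `2 in c` is true and `3 in c` is false, so `selected` holds the
first two entries of `s`.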

Because `df[c.mask(df)]` is such a common thing to do, the function 
`filter_df` is defined as a shorthand:

``` python
def filter_df(df, constraint):
  return df[ensure_constraint(constraint).mask(df)]
```

The function `ensure_constraint` converts non-`Constraint` objects
into `Constraint` objects, so that `__contains__` and `mask` are
always available.

## Writing Constraints
The following `Constraint` classes are defined:

### Literal Constraint
`number`  
`string`  
`EQ(obj)`
  
A literal constraint will only match objects equal to `obj`.  That is,
`x in EQ(obj)` iff `x == obj`.

Numbers, strings, and other literal objects (excluding iterable
objects and mappings; see `OR` and `FIELD`) are treated as `EQ`
constraints when passed through `ensure_constraint`.  Thus,
`filter_df(df, 5)` would return rows of `df` equal to `5`.
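
The coercion rules described in this document (literals become `EQ`,
sequences become `OR`, dicts become `FIELD`) can be sketched as a
simple type dispatch.  The helper `sketch_kind` below is hypothetical
and only names the constraint kind an object would map to; the order
of the checks, and the handling of strings as literals despite being
iterable, are assumptions about how `ensure_constraint` behaves.

```python
from collections.abc import Iterable, Mapping


def sketch_kind(obj) -> str:
  """Hypothetical helper: name the constraint kind `obj` would become."""
  if isinstance(obj, str):
    # Strings are iterable, but the text treats them as literals.
    return "EQ"
  if isinstance(obj, Mapping):
    # Checked before Iterable, because mappings are iterable too.
    return "FIELD"
  if isinstance(obj, Iterable):
    return "OR"
  return "EQ"


# Illustrative inputs only; the values are made up.
kinds = [sketch_kind(o) for o in (5, "probeA", [1, 2], {"length": 5})]
```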

### Range Constraint
`RANGE(lb, ub, lb_strict=False, ub_strict=True)`

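A range constraint matches objects which are between `lb` and `ub`.
A bound is strict (exclusive) when its flag, `lb_strict` or
`ub_strict`, is `True`.  With the defaults, the lower bound is
inclusive and the upper bound is exclusive.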

If `lb` or `ub` is `None`, then the constraint has no lower or upper
bounds respectively.
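
To make the bound semantics concrete, here is a minimal sketch of the
membership test as a plain function.  The name `in_range` is
hypothetical; it only mirrors the signature documented above and is
not the code from `constraints.py`.

```python
def in_range(x, lb=None, ub=None, lb_strict=False, ub_strict=True) -> bool:
  """Hypothetical helper mirroring the documented RANGE semantics."""
  if lb is not None:
    # Below the bound, or exactly on a strict bound: no match.
    if x < lb or (lb_strict and x == lb):
      return False
  if ub is not None:
    if x > ub or (ub_strict and x == ub):
      return False
  return True


checks = [
  in_range(0, 0, 1),                   # lower bound is inclusive by default
  in_range(1, 0, 1),                   # upper bound is exclusive by default
  in_range(1, 0, 1, ub_strict=False),  # inclusive upper bound
  in_range(-5, None, 1),               # no lower bound
]
```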

### Or Constraint
`[Constraint]`  
`OR(Iterable[Constraint])`  
`OR(Constraint...)`

The logical-or of the supplied constraints.  An object matches the
`OR` constraint if it matches any of its sub-constraints.

A sequence of constraints `[constraint...]` is equivalent to
`OR(constraint...)`.

### And Constraint
`AND(Iterable[Constraint])`  
`AND(Constraint...)`

The logical-and of the supplied constraints.  An object matches the
`AND` constraint if it matches all of its sub-constraints.

### Not Constraint
`NOT(Constraint)`

The negation of the supplied constraint.  An object matches the
`NOT` constraint only if it _doesn't_ match its sub-constraint.
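
In terms of masks, the three combinators correspond to element-wise
boolean operations.  The sketch below uses plain pandas rather than
the `constraints` module, with two made-up masks standing in for
sub-constraints; whether the module combines masks exactly this way
is an assumption.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5, 6])

is_small = s < 3      # stand-in for one sub-constraint's mask
is_even = s % 2 == 0  # stand-in for another sub-constraint's mask

either = s[is_small | is_even]  # OR: matches any sub-constraint
both = s[is_small & is_even]    # AND: matches all sub-constraints
not_small = s[~is_small]        # NOT: does not match the sub-constraint
```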

### Field Constraint
`{'field': constraint...}`  
`FIELD(field=constraint...)`

This constraint matches a field of an object against the
corresponding constraint.

There are two ways to write this constraint.  Given a pandas dataframe
`df`, if you wanted to add a constraint `constraint` to the column
named `length`, you would use the constraint
``` python 
FIELD(length=constraint)
```
and if you further wanted to add a constraint `constraint2` to the 
column named `item_id`, you would do
``` python
FIELD(length=constraint, item_id=constraint2)
```

Alternatively, you can use a dict.  The above constraint
is equivalent to
``` python
{'length': constraint, 'item_id': constraint2}
```

There is nothing special about the `FIELD` constraint, so it can be
passed to `AND`, `OR`, and `NOT` just the way you would pass other
constraints.  This allows the writing of a rich set of constraints.
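
As an illustration of what a field constraint selects, the sketch
below applies one test per column of a toy dataframe and keeps the
rows that pass all of them.  The helper `sketch_field_mask`, the
column values, and the use of plain callables as tests are all
hypothetical; in particular, combining several fields with a
logical-and is an assumption about how `FIELD` behaves.

```python
import pandas as pd

df = pd.DataFrame({
  "length": [10, 25, 40],
  "item_id": [1, 2, 2],
})


def sketch_field_mask(frame, **column_tests):
  """Hypothetical helper: AND together one boolean test per named column."""
  mask = pd.Series(True, index=frame.index)
  for column, test in column_tests.items():
    mask &= frame[column].map(test).astype(bool)
  return mask


# Rows whose `length` is below 30 and whose `item_id` equals 2.
rows = df[sketch_field_mask(df,
                            length=lambda v: v < 30,
                            item_id=lambda v: v == 2)]
```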

### Examples
If you want to get the rows in `CURRENT_SESSION.units` where
`isi_violations` is less than 0.7, you would do

In [None]:
filter_df(CURRENT_SESSION.units, FIELD(isi_violations=RANGE(None, 0.7)))

# Org Babel Setup

``` python
def assStr (name, literal=True):
  f = repr if literal else str
  return name + " = " + f(eval(name))
```

``` elisp
(message "%s" v)
```

``` elisp
(when (eq 'hline (cadr table))
  (setq table (cddr table)))
(cl-flet ((ellide-list (list length &optional (ellision "..."))
              (if (and length (length> list length))
                  (append (seq-subseq list 0 (/ length 2))
                          (list ellision)
                          (seq-subseq list (- (length list) (/ length 2))))
                list)))
    (mapcar (lambda (row) (if (listp row) (ellide-list row ncols) row))
            (ellide-list table nrows (make-list (length (car table)) "..."))))
```

``` elisp
(cl-mapcar #'list (number-sequence start (+ start (length list))) list)
```

``` elisp
(cl-loop with i = (1- start)
         for startp = t then nil
         for tail on table
         collect
         (cond ((atom (car tail)) (car tail))
               ((and startp (eq 'hline (cadr tail)))
                (cons index-name (car tail)))
               (t (cons (cl-incf i) (car tail)))))
```

# Setup

``` python
from typing import *
import pandas as pd
```

# Download Sessions

Set `data_dir` and ensure it exists.

``` python
from pathlib import Path
data_dir: Final[Path] = Path('./data/')
data_dir.mkdir(parents=True, exist_ok=True)
```

Set `manifest_path`.

``` python
manifest_path: Final[Path] = data_dir / 'manifest.json'
```

Create and initialize the project cache object

``` python
from allensdk.brain_observatory.ecephys.ecephys_project_cache import EcephysProjectCache
cache: Final = EcephysProjectCache.from_warehouse(manifest=manifest_path, timeout=30*60)
```

and obtain the sessions

``` python
sessions = cache.get_session_table()
```

Extract the session ids from `sessions` into `session_ids`

``` python
session_ids: Final[Sequence[int]] = list(sessions.index)
```

Extract a single session into `session`:

``` python
assert session_id in session_ids
session = cache.get_session_data(session_id) 
session.metadata
```

``` python
session.units.head()
```

``` python
session.stimulus_conditions.head()
```

``` python
session.stimulus_conditions['stimulus_name'].value_counts()
```

``` python
grouped_stimulus_presentations = session.stimulus_presentations.groupby('stimulus_name')
```

``` python
total_durations = grouped_stimulus_presentations['duration'].sum()
```

``` python
def presentation_type_spike_times(expr = None, mask = None, unit_names = session.units.index):
  presentations = session.stimulus_presentations
  if mask is not None:
    presentations = presentations[mask]
  if expr is not None:
    presentations = presentations.query(expr)
  # presentationwise_spike_times expects presentation ids (the index), not condition ids
  return session.presentationwise_spike_times(presentations.index.values, unit_names)

presentation_type_spike_times(expr="`stimulus_name` == 'drifting_gratings'").head(10)
```

``` python
spike_times_with_presentations_condition_names = \
  pd.merge(
    session.presentationwise_spike_times(
      session.stimulus_presentations.index.values,
      session.units.index).reset_index(),
    session.stimulus_presentations['stimulus_name'],
    how='left',
    on='stimulus_presentation_id'
  )
```

``` python
def get_spike_times_df(unit_ids: Optional[Sequence[int]]=None):
  # One row per spike: the unit it belongs to, then the spike time.
  return pd.concat((
    pd.DataFrame({'unit_id': unit_id, 'spike_time': unit_spike_times})
    for (unit_id, unit_spike_times) in session.spike_times.items()
    if unit_ids is None or unit_id in unit_ids
  ), ignore_index=True, copy=False)

spike_times_df = get_spike_times_df()
```
