# PEEP 001: Memory Optimized Signals

In [7]:
import numpy as np

## Background

Real-life signals exist in a measurement-space defined by:

* time, 
* frequency, 
* and various parametric sweeps. 
    
Propagating where a signal lies in the measurement space is useful for:

* Plotting against the correct independent axis.
* Plotting multi-dimensional data on a 2D axis using a legend.
* Transforming data between domains (Time, Frequency, Time Envelope, Frequency Spectrum, Power)

While tracking independent data offers context and convenience, it increases the computational cost of time-critical processes. Therefore, there must always be a way to opt out of signal tracking and fall back to a plain NumPy ndarray.

## NumPy Multi-Dimensional Arrays

NumPy arrays use C-style (row-major) memory order by default. A contiguous N-dimensional array is organized as follows:
* Axis order:                 axis0, axis1, ..., axisN-1
* Memory order:               largest memory step, ..., smallest memory step.
* Multi-dimensional indexing: arr[index0, index1, ..., indexN-1].
* Array of Array indexing:    arr[index0][index1, ..., indexN-1].
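
As a rough illustration of the ordering above, the strides of a C-contiguous array shrink from the first axis to the last. The shape and dtype below are arbitrary choices for the example.

```python
import numpy as np

arr = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

# Bytes to step along each axis: largest step first, smallest step last
strides = arr.strides            # (96, 32, 8) for 8-byte floats

# The last axis is contiguous in memory, so it varies fastest when flattened
flat = arr.ravel()

# Multi-dimensional and array-of-array indexing reach the same element
same = arr[1, 2, 3] == arr[1][2, 3]
```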

## NumPy Indexing Behaviour

NumPy supports a wide variety of multi-dimensional indexing behaviour that must be studied. Although different static functions exist for different indexing operations (e.g. take, split, mask, etc.), all indexing through the arr[...] syntax is handled inside the __getitem__ magic method of the np.ndarray class. The types of indexing operations supported by __getitem__ are distinguished as:
* Basic Indexing and Slicing - Passes a scalar or tuple to __getitem__ and returns a view of the data.
* Advanced Indexing and Slicing - Passes an ndarray of booleans or integers to __getitem__ and returns a copy of the data.

The goal of this document is to apply memory optimized array indexing techniques to signal data, hence any advanced indexing behaviour will not be implemented in the Signal data type.
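
The view-versus-copy distinction can be checked with np.shares_memory. This is a minimal sketch; the array shape and index values are arbitrary.

```python
import numpy as np

arr = np.zeros((3, 4))

basic = arr[1:, ::2]                  # slices only: basic indexing
advanced = arr[np.array([1, 2])]      # integer ndarray: advanced indexing

basic_is_view = np.shares_memory(arr, basic)        # view of arr
advanced_is_view = np.shares_memory(arr, advanced)  # independent copy

# Writing through the view changes the original; writing to the copy does not
basic[...] = 1.0
advanced[...] = 5.0
```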

### Basic Indexing and Slicing Inputs and Outputs

The basic indexing and slicing inputs are described below, and the output of each operation is demonstrated with examples. The input passed to the __getitem__ method can be any of the following:
* int
* slice
* Ellipsis
* newaxis
* A tuple representing a multi-dimensional combination of the above data types. (Older NumPy releases also accepted a list here; current releases treat a list as advanced indexing.)

#### Numeric Indexing
<table style="width:100%">
  <tr>
    <th>Inputs</th>
    <th>Outputs</th> 
  </tr>
  <tr>
    <td>int</td>
    <td>scalar or array of dim - 1</td>
  </tr>
  <tr>
    <td>sequence of ints (dim(seq) <= dim(array))</td> 
    <td>scalar or array of dim - dim(seq)</td> 
  </tr>
</table>
Positive (0-based) and/or negative (-1-based) indexing along multiple axes results in a scalar when all dimensions have been indexed; otherwise the remaining unindexed axes are used to construct a sub-array. **NOTE: The number of dimensions of the sub-array is reduced by the number of indices.**

In [18]:
arr = np.ones((1, 2, 3, 4))

# Scalar output
sub_arr = arr[0, 1, 2, 3]
sub_arr.shape, sub_arr

((), 1.0)

In [19]:
# Axis 0 index
sub_arr = arr[0]
sub_arr.shape

(2, 3, 4)

#### Slice Indexing
<table style="width:100%">
  <tr>
    <th>Inputs</th>
    <th>Outputs</th> 
  </tr>
  <tr>
    <td>slice</td>
    <td>array</td>
  </tr>
  <tr>
    <td>sequence of slices (dim(seq) <= dim(array))</td> 
    <td>array of dim(array)</td> 
  </tr>
</table>

Positive (0-based) and/or negative (-1-based) slicing along multiple axes results in a sub-array with the same number of dimensions. **NOTE: The number of dimensions of the sub-array is the same as that of the input array.**

In [20]:
arr = np.ones((1, 2, 3, 4))

# Single Value Slice (Preserves the dimensions)
sub_arr = arr[0:1, 1:2, 2:3, 3:4]
sub_arr.shape, sub_arr

((1, 1, 1, 1), array([[[[ 1.]]]]))

In [21]:
# Axis 0 slice
sub_arr = arr[0:1]
sub_arr.shape

(1, 2, 3, 4)

#### Ellipsis Indexing
<table style="width:100%">
  <tr>
    <th>Inputs</th>
    <th>Outputs</th> 
  </tr>
  <tr>
    <td>Ellipsis</td>
    <td>All non-indexed axes are replaced with the : slice</td> 
  </tr>
</table>

High-dimensional arrays can be handled efficiently using Ellipsis indexing to select all values along multiple dimensions. **NOTE: Only one Ellipsis is allowed in an indexing operation.**
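
A short sketch of Ellipsis expansion, reusing the (1, 2, 3, 4) array shape from the earlier cells. The variable names are only illustrative.

```python
import numpy as np

arr = np.ones((1, 2, 3, 4))

# Ellipsis expands to as many ':' slices as needed
last = arr[..., 0]         # same as arr[:, :, :, 0]
first = arr[0, ...]        # same as arr[0, :, :, :]
ends = arr[0, ..., 0]      # same as arr[0, :, :, 0]

shapes = (last.shape, first.shape, ends.shape)
# ((1, 2, 3), (2, 3, 4), (2, 3))

# A second Ellipsis in the same index is rejected with an IndexError
try:
    arr[..., 0, ...]
    two_ellipses_ok = True
except IndexError:
    two_ellipses_ok = False
```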

#### NewAxis Indexing
<table style="width:100%">
  <tr>
    <th>Inputs</th>
    <th>Outputs</th> 
  </tr>
  <tr>
    <td>newaxis</td>
    <td>A new dimension is added in the indexed dimension</td> 
  </tr>
</table>

A new axis of length 1 is inserted into the resulting sub-array at the position where newaxis appears in the index.
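
A minimal sketch of where the new axis lands for different positions of newaxis. The (2, 3) shape is an arbitrary choice.

```python
import numpy as np

arr = np.ones((2, 3))

front = arr[np.newaxis]         # shape (1, 2, 3)
middle = arr[:, np.newaxis]     # shape (2, 1, 3)
back = arr[..., np.newaxis]     # shape (2, 3, 1)

# np.newaxis is an alias for None, and the result is still a view
is_none = np.newaxis is None
is_view = np.shares_memory(arr, back)
```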

#### Combining Indexation Types
Any combination of integer, slice, Ellipsis, or newaxis indexing can be combined into a tuple that is passed to __getitem__. Other non-ndarray sequence types were also accepted by older NumPy releases, but current releases interpret a list as an advanced index. An ndarray input will trigger advanced indexing operations, which are not backwards compatible with the basic indexing operations described above.
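
The combination rule can be sketched as follows. The key values are arbitrary; the point is that a tuple of basic index types stays a view, while introducing an ndarray switches to a copy.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# int + slice + newaxis + Ellipsis in one tuple: still basic indexing
key = (0, slice(1, 3), np.newaxis, Ellipsis)
sub = arr[key]                       # same as arr[0, 1:3, np.newaxis, ...]
# int drops axis 0, the slice keeps 2 of 3, newaxis adds 1, Ellipsis keeps 4
# -> shape (2, 1, 4)

# Replacing the int with an ndarray switches to advanced indexing (a copy)
adv = arr[np.array([0]), 1:3]        # shape (1, 2, 4)

sub_is_view = np.shares_memory(arr, sub)
adv_is_view = np.shares_memory(arr, adv)
```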

## Proposed Solution

An N-D signal will subclass ndarray so that it also carries named independent-variable information inside the signal variable. This ensures that indexing of the dependent data is reflected in the independent data.

### Memory Savings
1. In a single measurement sweep, all derived dependent variables hold a reference to a single in-memory copy of the independent sweep information. (Single Independent Variable memory for Multiple Dependent Variables)
2. Uncoupled measurement sweeps contain a 1-D slice of information, compared to the N-D representation of the dependent variable. (1-D Independent Variable memory for N-D Dependent Variables)
3. Coupled independent sweeps still occupy only a small fraction of the memory allocated to the multi-dimensional dependent variables.
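
To put rough numbers on the first two points, the sketch below compares the memory of two uncoupled sweeps with that of one dependent variable. The sweep names, sizes, and the plain dict are assumptions made for the example only.

```python
import numpy as np

n_freq, n_time = 50, 1000
dep_a = np.zeros((n_freq, n_time))          # dependent variable
dep_b = np.zeros((n_freq, n_time))          # second dependent variable

# Uncoupled sweeps are stored once, each along its own axis
time = np.linspace(0.0, 1.0, n_time).reshape(1, n_time)
freq = np.linspace(1e9, 2e9, n_freq).reshape(n_freq, 1)
indep = {"freq": freq, "time": time}        # one copy, shared by dep_a and dep_b

indep_bytes = freq.nbytes + time.nbytes     # (50 + 1000) * 8 bytes
dep_bytes = dep_a.nbytes                    # 50 * 1000 * 8 bytes

# Expanding a sweep to the full shape is a zero-stride view, not a copy
time_full = np.broadcast_to(time, dep_a.shape)
```

Here the sweeps take 8,400 bytes against 400,000 bytes for each dependent variable, and the broadcast view adds no data of its own.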

### Considerations
0. Advanced indexing operations should raise either a custom exception (such as AdvancedIndexingError) or NotImplementedError.
1. Sweeps should be **prepended** to the signal array from right to left (fastest sweep to slowest sweep). This is opposite to the left-to-right axis numbering of NumPy arrays.
2. Integer (i) indexing should be replaced by slice(i, i+1, 1) indexing to maintain a constant number of dimensions in the signal. For i = -1 the stop must be None, because slice(-1, 0, 1) is empty.
    * A squeeze method would provide a controlled way of eliminating singular axes.
3. NewAxis indexing should not be allowed because the number of sweeps (dims) should remain constant.
4. Indexing should be from left-to-right to avoid confusion with NumPy arrays.
    * The Ellipsis operation can be used to perform calculations across frequency/time (signal[..., freq, time]).
5. Independent sweeps should be able to broadcast with dependent variables at all times. They can be specified in two formats:
    - large shape: (sweep_ind, n, ..., m, ...) where "..." represents singular dimensions.
    - small shape: (sweep_ind, n, m), where "n" and "m" represent coupled sweep dimensions.
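
Considerations 2 and 5 can be sketched with plain ndarrays. The helper int_to_slice is hypothetical and not part of NumPy, and the shapes are arbitrary.

```python
import numpy as np

arr = np.ones((2, 3, 4))

def int_to_slice(i):
    """Hypothetical helper: a length-1 slice selecting the same element as i."""
    # 'i + 1 or None' keeps i == -1 working, since slice(-1, 0) would be empty
    return slice(i, i + 1 or None, 1)

kept = arr[int_to_slice(1), int_to_slice(-1)]   # shape (1, 1, 4)
dropped = kept.squeeze()                        # shape (4,)

# Small shape -> large shape for a sweep on axis 0 that is coupled to axis 2
small = np.zeros((2, 4))            # (sweep_ind, m)
large = small.reshape(2, 1, 4)      # (sweep_ind, 1, m)
broadcasts = np.broadcast(arr, large).shape == arr.shape
```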

## Signal Class

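In [None]:
from collections import OrderedDict

indep_lock = ("freq", "time", "row", "col")


class AdvancedIndexingError(IndexError):
    """Raised when an advanced (ndarray or list) index is passed to a Signal."""


class NewAxisIndexingError(IndexError):
    """Raised when newaxis is passed to a Signal."""


class Signal(np.ndarray):
    def __new__(cls, arr, indep_map=None, indep_lock=()):
        obj = np.asarray(arr).view(cls)
        obj.indep_map = OrderedDict() if indep_map is None else indep_map
        obj.indep_lock = indep_lock  # stored only; not used by this sketch
        # Check broadcasting: every sweep must broadcast to the signal shape
        for name, v in obj.indep_map.items():
            if np.broadcast_shapes(obj.shape, v.shape) != obj.shape:
                raise ValueError("sweep %r does not broadcast with the signal" % name)
        return obj

    def __array_finalize__(self, obj):
        # Called for views and slices: inherit the sweep information
        self.indep_map = getattr(obj, "indep_map", OrderedDict())
        self.indep_lock = getattr(obj, "indep_lock", ())

    def __getitem__(self, keys):
        # Convert from scalar to tuple
        if not isinstance(keys, tuple):
            keys = (keys,)
        new_keys = ()
        for k in keys:
            if k is np.newaxis:
                raise NewAxisIndexingError("newaxis would change the number of sweeps")
            elif k is Ellipsis:
                # Expand to one full slice per axis not named in keys
                new_keys += (slice(None),) * (self.ndim - len(keys) + 1)
            elif isinstance(k, (int, np.integer)):
                # Convert int indexing to slice; "or None" handles k == -1
                new_keys += (slice(k, k + 1 or None, 1),)
            elif isinstance(k, slice):
                new_keys += (k,)
            else:
                raise AdvancedIndexingError("advanced indexing is not supported")
        # Trailing axes that were not indexed are kept whole
        new_keys += (slice(None),) * (self.ndim - len(new_keys))
        arr = super().__getitem__(new_keys)
        # Slice each sweep along its non-singular axes (right-aligned with the signal)
        indep_map = OrderedDict()
        for name, v in self.indep_map.items():
            sweep_keys = new_keys[self.ndim - v.ndim:]
            indep_keys = tuple(key if n > 1 else slice(None) for key, n in zip(sweep_keys, v.shape))
            indep_map[name] = v[indep_keys]
        arr.indep_map = indep_map
        return arr
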

## IndepDict Class

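from collections import OrderedDict


class IndepDict(OrderedDict):
    def add(self, key, value, coupled=()):
        # Convert small to large shape: the new sweep's own axis first, then one
        # axis per existing sweep (its length if coupled to it, otherwise 1)
        shape = (-1,) + tuple(v.shape[0] if k in coupled else 1 for k, v in self.items())
        # Prepend: bypass the guarded __setitem__, then move the key to the front
        super().__setitem__(key, np.asarray(value).reshape(shape))
        self.move_to_end(key, last=False)

    def __setitem__(self, key, value):
        if key not in self:
            raise KeyError("indep_sweep must be added first")
        super().__setitem__(key, value)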
        