# Designing the `lookup` function

There are a few ways we could design a function for looking up
NMOS/PMOS device data from our lookup tables. Let's take a tour through the
options and see what works and what doesn't.

## Understanding the data

First, let's take a minute to understand what the device data looks like.

We have 4 independent variables:

- L
- Vgs
- Vds
- Vsb

```note
We'd typically think of $W$ as a variable we'd sweep as well, but when possible
we'll just use a representative value for $W$ and de-normalize the things
that scale with $W$ after sizing the device.
```

When we sweep those variables, we measure a whole bunch of things, like $I_d$, $g_m$, $g_{ds}$, $C_{gg}$, etc.
Each of those measurements is captured by a column in the device's DataFrame.

So, we end up with something like this:

``` python
# import my helper functions
import sys
sys.path.append('../helpers')
from xtor_data_helpers import load_mat_data

# load up device data
nch_data_df = load_mat_data("../../Book-on-gm-ID-design-main/starter_kit/180nch.mat")

# display top 5 rows of data, with a reasonable subset of columns
cols = ['W', 'L', 'VGS', 'VDS', 'VSB', 'ID', 'GM', 'GM_ID', 'GDS', 'CGG']
display(nch_data_df[cols].head())
```

Loading data from ../../Book-on-gm-ID-design-main/starter_kit/180nch.mat
Found the following columns: ['ID', 'VT', 'GM', 'GMB', 'GDS', 'CGG', 'CGS', 'CGD', 'CGB', 'CDD', 'CSS', 'STH', 'SFL', 'INFO', 'CORNER', 'TEMP', 'VGS', 'VDS', 'VSB', 'L', 'W', 'NFING']



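|   | W | L | VGS | VDS | VSB | ID | GM | GM_ID | GDS | CGG |
|---|---|---|-----|-----|-----|----|----|-------|-----|-----|
| 0 | 5 | 0.18 | 0.0 | 0.0 | 0.0 | -0.0 | 0.0 | NaN | 5.54795e-10 | 6.802937e-15 |
| 1 | 5 | 0.18 | 0.0 | 0.0 | 0.1 | 1e-13 | 0.0 | 0.0 | 2.285584e-10 | 6.707651e-15 |
| 2 | 5 | 0.18 | 0.0 | 0.0 | 0.2 | 2e-13 | 0.0 | 0.0 | 9.710877e-11 | 6.62527e-15 |
| 3 | 5 | 0.18 | 0.0 | 0.0 | 0.3 | 3e-13 | 0.0 | 0.0 | 4.240448e-11 | 6.553162e-15 |
| 4 | 5 | 0.18 | 0.0 | 0.0 | 0.4 | 4e-13 | 0.0 | 0.0 | 1.897669e-11 | 6.489379e-15 |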


<br>

So, now that we know what our data looks like, let's talk about how to access it.

## Murmann's original Matlab function

The `lookup` function is used all over *Systematic Design of Analog CMOS Circuits*. Professor
Murmann has helpfully put the Matlab code for it on [GitHub](https://github.com/bmurmann/Book-on-gm-ID-design/blob/main/starter_kit/look_up.m),
and instead of trying to explain it myself, I'll just copy his explanation:

``` matlab
% The function "look_up" extracts a desired subset from the 4-dimensional simulation data
% The function interpolates when the requested points lie off the simulation grid
%
% There are three usage modes:
% (1) Simple lookup of parameters at some given (L, VGS, VDS, VSB)
% (2) Lookup of arbitrary ratios of parameters, e.g. GM_ID, GM_CGG at given (L, VGS, VDS, VSB)
% (3) Cross-lookup of one ratio against another, e.g. GM_CGG for some GM_ID
%
% In usage modes (1) and (2) the input parameters (L, VGS, VDS, VSB) can be 
% listed in any order and default to the following values when not specified:
%
% L = min(data.L); (minimum length used in simulation)
% VGS = data.VGS; (VGS vector used during simulation)
% VDS = max(data.VDS)/2; (VDD/2)
% VSB = 0;
%
% When more than one parameter is passed to the function as a vector, the output
% becomes multidimensional. This behavior is inherited from the Matlab function 
% “interpn”, which is at the core of the lookup function. The following example
% produces an 11x11 matrix as the output:
%
% look_up(nch,'ID', 'VGS', 0:0.1:1, 'VDS', 0:0.1:1)
%
% The dimensions of the output array are ordered such that the largest dimension
% comes first. For example, one dimensional output data is an (n x 1) column vector.
% For two dimensions, the output is (m x n) and m > n.
```

The first thing we'll change is the return type: instead of multidimensional arrays, we'll
return plain DataFrames with one row per combination of the swept input variables. In other
words, we return the unpivoted (long-format) version of what Murmann's Matlab function
would return.
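
To make "unpivoted" concrete, here's a minimal sketch with made-up numbers (the `ID` values below are invented, not real device data). The long DataFrame has one row per `(VGS, VDS)` pair, and pivoting it gives back the matrix shape that the Matlab function would return:

``` python
import numpy as np
import pandas as pd

# made-up sweep values and a made-up ID, purely for illustration
vgs = np.array([0.0, 0.5, 1.0])
vds = np.array([0.0, 1.0])

# long ("unpivoted") form: one row per (VGS, VDS) combination
rows = [(g, d, 1e-4 * g + 2e-5 * d) for g in vgs for d in vds]
long_df = pd.DataFrame(rows, columns=["VGS", "VDS", "ID"])

# pivoting recovers the matrix form: VGS down the rows, VDS across the columns
wide_df = long_df.pivot(index="VGS", columns="VDS", values="ID")

print(long_df)
print(wide_df)
```

The long form keeps every other column (`GM`, `GDS`, ...) on the same row, which is what makes it convenient to return whole rows rather than a single parameter.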

Another change relative to the original Matlab function concerns usage mode (1). I don't
want to return just the parameter the user asks for; it seems reasonable to return the
whole row(s) of device data that match the inputs (or are interpolated from the inputs).
So in the example above:

``` matlab
look_up(nch, 'ID', 'VGS', 0:0.1:1, 'VDS', 0:0.1:1)
```

I wouldn't just return `ID`; I'd return every column of the dataset at `VGS` values of `0:0.1:1`
and `VDS` values of `0:0.1:1`.

The other change we'll make is to combine usage modes (1) and (2). We'll do that by pre-computing
the ratios, e.g. `GM_ID` or `GM_GDS`, so that usage mode (2) looks exactly like (1): the ratios
are just another column in the table.
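
As a rough sketch of what pre-computing might look like, assuming the ratio columns are plain element-wise divisions (and that `JD` means `ID / W`, which is my assumption here), it's just a few column assignments. The numbers are made up:

``` python
import numpy as np
import pandas as pd

# tiny made-up table standing in for the real device data
df = pd.DataFrame({
    "W":   [5.0, 5.0, 5.0],
    "ID":  [0.0, 1e-6, 1e-4],
    "GM":  [0.0, 2.5e-5, 1e-3],
    "GDS": [1e-9, 1e-7, 1e-5],
    "CGG": [7e-15, 8e-15, 1e-14],
})

# pre-compute the ratios once, so they behave like any other column
df["GM_ID"] = df["GM"] / df["ID"]    # 0/0 at the first row gives NaN
df["GM_GDS"] = df["GM"] / df["GDS"]
df["GM_CGG"] = df["GM"] / df["CGG"]
df["JD"] = df["ID"] / df["W"]        # assumed definition: current per unit width

print(df[["GM_ID", "GM_GDS", "GM_CGG", "JD"]])
```

Note that rows where `ID` is zero end up with `NaN` in `GM_ID`, so any interpolation over that column would need to deal with missing values.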

Lastly, once the ratios are pre-computed, it seems like we can treat usage mode (3)
a lot like (1), as long as the function accepts a dependent variable (e.g. `GM_ID`) among its inputs.
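
One way to picture mode (3) with pre-computed ratios: take a slice of the table at fixed `L`, `VDS` and `VSB`, then interpolate one ratio column against another. This is only a sketch with invented numbers, and it assumes `GM_ID` is monotonic over the slice so that `numpy.interp` is valid:

``` python
import numpy as np
import pandas as pd

# made-up slice of device data at one fixed (L, VDS, VSB)
slice_df = pd.DataFrame({
    "VGS":    [0.4, 0.6, 0.8, 1.0],
    "GM_ID":  [25.0, 15.0, 8.0, 4.0],
    "GM_CGG": [1e9, 5e9, 9e9, 1.1e10],
})

# np.interp needs increasing x values, and GM_ID falls as VGS rises,
# so sort the slice by GM_ID first
ordered = slice_df.sort_values("GM_ID")

# cross-lookup: GM_CGG at GM_ID = 10
gm_cgg_at_10 = np.interp(
    10.0,
    ordered["GM_ID"].to_numpy(),
    ordered["GM_CGG"].to_numpy(),
)
print(gm_cgg_at_10)
```

If `GM_ID` were not monotonic over the slice, the sort would scramble the curve and the result would be meaningless, so a real implementation would need to check for that.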

So our function should take the following inputs:

- `df`: A DataFrame with PMOS/NMOS device data
- `fixed_vars`: a dictionary of variables and values for them that we want to use in the lookup. This
  could be either just independent sweep variables or independent sweep variables plus one dependent
  variable.

The function should return a DataFrame of device data interpolated at the values of the `fixed_vars`
provided.
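
To pin down the interface, here's a hypothetical stand-in that only handles the easy case where every requested value is already on the grid. The name `lookup_exact` and the tolerance-based matching are my own placeholders, and it does no interpolation at all:

``` python
import numpy as np
import pandas as pd

def lookup_exact(df, fixed_vars):
    """Return the rows of df whose columns match fixed_vars (no interpolation)."""
    mask = np.ones(len(df), dtype=bool)
    for col, wanted in fixed_vars.items():
        wanted = np.atleast_1d(wanted).astype(float)
        col_vals = df[col].to_numpy(dtype=float)
        # compare with a tolerance so float rounding doesn't hide a grid point
        mask &= np.isclose(col_vals[:, None], wanted[None, :]).any(axis=1)
    return df[mask].reset_index(drop=True)

# made-up device table
demo_df = pd.DataFrame({
    "L":   [0.18, 0.18, 0.36, 0.36],
    "VGS": [0.0, 0.1, 0.0, 0.1],
    "ID":  [0.0, 1e-6, 0.0, 5e-7],
})

# scalars and vectors are both accepted as values in fixed_vars
hits = lookup_exact(demo_df, {"L": 0.18, "VGS": [0.0, 0.1]})
print(hits)
```

The real `lookup` would keep the same inputs and return type, but fill in rows for values that fall between grid points.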

## Method one: using `numpy.interp`

I think I could use `numpy.interp`, but it only interpolates along one dimension, so things get
tricky if we have to interpolate along several variables.

For example, suppose the user provides vectors for both `VGS` and `VDS` that aren't in the original
dataset. I think we'd have to first interpolate for `VGS` using the original dataset, then interpolate
for `VDS` on the dataset returned by the first interpolation. Maybe this is less tricky than I'm
making it out to be?
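
Here's a small sketch of that two-pass idea, using a made-up 2-D grid of `ID` values instead of the real DataFrame. The grid, the new sweep vectors and the linear `ID` formula are all invented for illustration:

``` python
import numpy as np

# made-up simulation grid
vgs_grid = np.linspace(0.0, 1.8, 7)
vds_grid = np.linspace(0.0, 1.8, 4)

# made-up ID on the grid, shape (len(vgs_grid), len(vds_grid))
id_grid = 1e-4 * vgs_grid[:, None] + 2e-5 * vds_grid[None, :]

# off-grid values requested by the user
vgs_new = np.array([0.45, 0.75])
vds_new = np.array([0.5, 0.9, 1.3])

# pass 1: interpolate along VGS, one VDS grid column at a time
step1 = np.column_stack([
    np.interp(vgs_new, vgs_grid, id_grid[:, j])
    for j in range(len(vds_grid))
])  # shape (len(vgs_new), len(vds_grid))

# pass 2: interpolate along VDS, one new-VGS row at a time
id_new = np.vstack([
    np.interp(vds_new, vds_grid, row)
    for row in step1
])  # shape (len(vgs_new), len(vds_new))

print(id_new)
```

It works, but every extra swept variable adds another pass and another level of bookkeeping, and the same loops would have to run for every dependent column.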

## Method two: using `scipy.interpolate.RegularGridInterpolator`

This seems like the most elegant solution, and would require the least code, but I'm not 100%
sure how to use it. From the SciPy [docs](https://docs.scipy.org/doc/scipy/tutorial/interpolate/ND_regular_grid.html)
I think we could do something like this:

<br>

``` python
all_cols = list(nch_data_df.columns)
indy_cols = ['L', 'VGS', 'VDS', 'VSB']
depy_cols = [col for col in all_cols if col not in indy_cols]
display(all_cols)
display(indy_cols)
display(depy_cols)
```

['ID',
 'VT',
 'GM',
 'GMB',
 'GDS',
 'CGG',
 'CGS',
 'CGD',
 'CGB',
 'CDD',
 'CSS',
 'STH',
 'SFL',
 'INFO',
 'CORNER',
 'TEMP',
 'VGS',
 'VDS',
 'VSB',
 'L',
 'W',
 'NFING',
 'GM_ID',
 'JD',
 'GM_GDS',
 'GM_CGG']

['L', 'VGS', 'VDS', 'VSB']

['ID',
 'VT',
 'GM',
 'GMB',
 'GDS',
 'CGG',
 'CGS',
 'CGD',
 'CGB',
 'CDD',
 'CSS',
 'STH',
 'SFL',
 'INFO',
 'CORNER',
 'TEMP',
 'W',
 'NFING',
 'GM_ID',
 'JD',
 'GM_GDS',
 'GM_CGG']

``` python
from scipy.interpolate import RegularGridInterpolator

# first, we need to remove the 'INFO' and 'CORNER' columns,
# since they're not numeric and they mess up the interpolator
nch_data_df = nch_data_df.drop(columns=['INFO', 'CORNER'])

# simple example: let's assume user gave us fixed values for all
# of the independent swept variables, so we'll create the Interpolator
# using them as the 'points'
ind_cols = ['L', 'VGS', 'VDS', 'VSB']

# we'll use all of the other columns as the 'values' for the Interpolator
dep_cols = [col for col in nch_data_df.columns if col not in ind_cols]

# create a tuple of the unique grid values in each column in ind_cols
points = tuple(nch_data_df[col].unique() for col in ind_cols)

# create the array of values of the dependent variables;
# note that this is 2-D: (number of rows, number of dependent columns)
values = nch_data_df[dep_cols].values

# create a RegularGridInterpolator from our NMOS data
interp = RegularGridInterpolator(points=points, values=values, method='linear')

# this fails with an exception about dimensions: there are 4 point arrays, so
# `values` needs one leading axis per independent variable, not a flat list of rows
```



ValueError: There are 4 point arrays, but values has 2 dimensions
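
The `ValueError` suggests that `values` needs one axis per point array, i.e. a shape of `(len(L), len(VGS), len(VDS), len(VSB), n_dep)` instead of `(n_rows, n_dep)`. Here's a sketch of one possible fix on a small synthetic table (not the real `180nch.mat` data). It assumes the sweep is a complete grid with no missing rows, and it sorts the rows so they line up with the reshaped axes:

``` python
import numpy as np
import pandas as pd
from scipy.interpolate import RegularGridInterpolator

ind_cols = ['L', 'VGS', 'VDS', 'VSB']

# synthetic stand-in for the device data: a small, complete 4-D sweep
axes = {
    'L':   np.array([0.18, 0.36, 0.54]),
    'VGS': np.linspace(0.0, 1.8, 7),
    'VDS': np.linspace(0.0, 1.8, 4),
    'VSB': np.array([0.0, 0.5]),
}
mesh = np.meshgrid(*axes.values(), indexing='ij')
df = pd.DataFrame({name: m.ravel() for name, m in zip(axes, mesh)})

# made-up dependent columns (linear, so the interpolated result is easy to check)
df['ID'] = 1e-4 * df['VGS'] + 2e-5 * df['VDS']
df['GM'] = 1e-3 * df['VGS'] - 1e-4 * df['VSB']

# shuffle to mimic a table whose row order we can't rely on
df = df.sample(frac=1, random_state=0)

# sort so that L varies slowest and VSB fastest, matching a C-order reshape
df = df.sort_values(ind_cols)
points = tuple(np.sort(df[col].unique()) for col in ind_cols)
shape = tuple(len(p) for p in points)

dep_cols = [col for col in df.columns if col not in ind_cols]
values = df[dep_cols].to_numpy().reshape(*shape, len(dep_cols))

interp = RegularGridInterpolator(points, values, method='linear')

# one off-grid query point, in the order (L, VGS, VDS, VSB)
query = np.array([[0.27, 0.45, 0.9, 0.25]])
result = pd.DataFrame(interp(query), columns=dep_cols)
print(result)
```

Because the trailing axis of `values` holds all the dependent columns, a single interpolator call returns a whole interpolated row, which fits the plan of returning full rows of device data.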