# Try a function on for size

In [1]:
# interactive figures, requires ipympl!
%matplotlib widget
#%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy
import xarray as xa

In [2]:
# not sure how else to get the helpers on the path!
import sys
sys.path.append('../scripts')

In [3]:
from data_gen import get_data, fit
d = get_data(25)

## Plot more than one curve

In the previous lesson we got as far as making a plot with a single vibration curve in it:

In [5]:
fig, ax = plt.subplots()
m = d[6]
ax.plot(m.time, m, label=f'control = {float(m.control):.1f}')
ax.set_xlabel('time (ms)')
ax.set_ylabel('displacement (mm)')
ax.legend();


But we know that the scientifically interesting effect we want to see is how these curves change as a function of *control*, so we really want to be able to see more than one curve at the same time.  Via copy-paste-edit we can get three curves on the axes:

In [6]:
fig, ax = plt.subplots()
m = d[6]
ax.plot(m.time, m, label=f'control = {float(m.control):.1f}')
m = d[0]
ax.plot(m.time, m, label=f'control = {float(m.control):.1f}')
m = d[-1]
ax.plot(m.time, m, label=f'control = {float(m.control):.1f}')
ax.set_xlabel('time (ms)')
ax.set_ylabel('displacement (mm)')
ax.legend();


### Add an offset

While this is better than the earlier "plot everything" approach, it is still a bit too busy to be readily understood.  One technique we can use is to add an offset to the data before plotting to separate the curves visually:

In [8]:
fig, ax = plt.subplots()
m = d[6]
ax.plot(m.time, m + 0, label=f'control = {float(m.control):.1f}')
m = d[0]
ax.plot(m.time, m + 4, label=f'control = {float(m.control):.1f}')
m = d[-1]
ax.plot(m.time, m + 8, label=f'control = {float(m.control):.1f}')
ax.set_xlabel('time (ms)')
ax.set_ylabel('displacement [offset] (mm)')
ax.legend();


### Refactor to a loop

Looking at this cell there is a fair amount of (nearly identical) duplicated code.  This suggests that we should try using a loop to reduce the duplication.  This will make the code easier to read (as it will be clear what is different each pass through the loop), make it easier to make future updates (as the change only has to be made once), and make it easier to change the number of curves plotted (by changing the loop):

In [9]:
fig, ax = plt.subplots()
for j, indx in enumerate([6, 0, -1]):
    m = d[indx]
    ax.plot(m.time, m + j * 4, label=f'control = {float(m.control):.1f}')
ax.set_xlabel('time (ms)')
ax.set_ylabel('displacement [offset] (mm)')
ax.legend();

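To see why the loop makes the number of curves easy to change, it helps to look at just the bookkeeping it does.  The sketch below does no plotting at all; `indices` and `spacing` are names introduced here for illustration (the cell above hard-codes the list and the factor of 4).

```python
# Hypothetical names for illustration: the notebook cell hard-codes these values.
indices = [6, 0, -1]   # which curves to pull out of the data set
spacing = 4            # vertical gap between consecutive curves (mm)

# enumerate() pairs each curve index with its position in the list,
# and that position sets how far the curve is shifted.
pairs = [(indx, j * spacing) for j, indx in enumerate(indices)]

for indx, offset in pairs:
    print(f'curve d[{indx}] is plotted with offset {offset}')
```

Adding a fourth curve is then a one-element change to `indices`; every offset follows automatically.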

## With a little help from my ~friends~ function

Looking at the body of that loop we have a section of code that does a well-scoped task: "Given a curve, plot it (with an offset), making sure it has a good label".  We want to pull this out into a function (it is only two lines now, but it will grow!) so that we can re-use this logic.  However, we are now faced with a design choice: what should the signature of our function be?!  We could mechanically lift the loop body out and make all of the variables inputs:


```python
def plot_one(ax: Axes, d: FullDataSet, indx: int, j: int):
    ...
```

This would allow us to copy-paste our loop body into the function and go on our way (and is also what some IDEs might offer to do for you!), but this is not the best design.  It tells the function both too much and not enough.  Because we are passing in the whole data set and an index, we are offering the function more information than it needs to do its job: it only needs the curve it cares about!  Further, because the function itself indexes into the data set, if we ended up with just one curve and wanted to use this function we would have to re-wrap the curve in something that the function could then index:

```python
plot_one(ax, [single_curve], 0, 0)  # why do this?
```

The first change we should make to the signature is to take in a single curve rather than the full data set and an index:

```python
def plot_one(ax: Axes, experiment: OneExperiment, j: int):
    ...
```
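The difference between the two signatures can be sketched without any plotting.  Here `FakeExperiment` is a made-up stand-in for one curve (the real data are xarray objects), and both label functions are hypothetical; only the calling convention matters.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class FakeExperiment:
    """Made-up stand-in for one curve; it only carries the control value."""
    control: float


def label_from_dataset(d: Sequence[FakeExperiment], indx: int) -> str:
    # Needs the whole data set plus an index just to reach one curve.
    return f'control = {d[indx].control:.1f}'


def label_from_curve(experiment: FakeExperiment) -> str:
    # Receives exactly the curve it works on.
    return f'control = {experiment.control:.1f}'


single_curve = FakeExperiment(control=1.5)

# With the first signature a lone curve has to be wrapped in a list.
wrapped = label_from_dataset([single_curve], 0)
# With the second signature it can be passed straight in.
direct = label_from_curve(single_curve)
print(wrapped, '|', direct)
```

A caller that does have the full data set loses nothing: it indexes first and passes `d[indx]`.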

Now we should look at *j*, which is passing too little information into the function!  In the loop we had a hard-coded factor of `4` in the offset computation.  As currently proposed we would only be able to offset the curves in multiples of 4!  We could relax the API a bit to allow *j* to also be a float, but then when using this function you would have to know about the magic number 4 and do:

```python
plot_one(ax, single_curve, the_offset_I_want / 4)  # why do this?
```

Hence, we want to change our proposed signature to be:

```python
def plot_one(ax: Axes, experiment: OneExperiment, offset: float=0):
    ...
```

where we also set a default value for the offset.
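A small sketch of the two choices, using plain lists in place of curves.  Both helper functions are hypothetical and exist only to show what the caller has to write in each case.

```python
from typing import List


def shift_by_index(values: List[float], j: float) -> List[float]:
    # The factor of 4 is buried inside the function, as in the loop body.
    return [v + j * 4 for v in values]


def shift_by_offset(values: List[float], offset: float = 0) -> List[float]:
    # The caller states the shift directly; 0 means "leave the data alone".
    return [v + offset for v in values]


values = [1.0, 2.0]

# To shift by 2.5 the caller must know about, and undo, the hidden factor.
via_index = shift_by_index(values, 2.5 / 4)
# Here the intent is written down as-is.
via_offset = shift_by_offset(values, 2.5)
# Thanks to the default, an un-shifted call needs no extra argument.
unshifted = shift_by_offset(values)
print(via_index, via_offset, unshifted)
```

Both calls produce the same shifted values, but only the second one reads the way the caller thinks about it.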

Adapting the function body to match this signature we write

In [16]:
def plot_one(ax: 'Axes', experiment: 'OneExperiment', offset: float=0) -> 'List[Line2D]':
    """Given a curve plot it (with an offset) and format a label for a legend.

    Parameters
    ----------
    ax : mpl.Axes
        The axes to add the plot to

    experiment : OneExperiment
        An xarray DataArray with a vector 'time' and scalar 'control' coordinates.

    offset : float, optional
        A vertical offset to apply to the data before plotting

    Returns
    -------
    curves : list of Line2D
        The Line2D objects created by `ax.plot` (one entry for a single curve)
    """
    # ax.plot always returns a list of lines, even when it draws only one
    return ax.plot(
        experiment.time,
        experiment + offset,
        label=f'control = {float(experiment.control):.1f}'
    )

The docstring (which is indeed currently longer than the function body!) follows the [numpydoc](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard) docstring standard.  While it is not the only docstring convention in use, you will see a lot of docstrings in this format because it is followed by many of the core projects (numpy, scipy, Matplotlib, scikit-learn, ...) of the scientific Python ecosystem.
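One practical payoff of writing the docstring is that Python keeps it attached to the function, so `help()` and tools built on `inspect` can show it.  A minimal sketch with a hypothetical function `shift` (not part of this lesson's helpers) documented in the same style:

```python
import inspect


def shift(values, offset=0):
    """Add a constant to every value.

    Parameters
    ----------
    values : list of float
        The numbers to shift

    offset : float, optional
        The amount to add to each value

    Returns
    -------
    shifted : list of float
        A new list with the offset applied
    """
    return [v + offset for v in values]


# inspect.getdoc returns the docstring with the indentation cleaned up;
# help(shift) would print the same text along with the signature.
doc = inspect.getdoc(shift)
print(doc.splitlines()[0])
```

In a notebook, typing `shift?` in a cell shows the same text.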

In [17]:
type(d[6])

xarray.core.dataarray.DataArray

In [18]:
fig, ax = plt.subplots()
for j, indx in enumerate([6, 0, -1]):
    plot_one(ax, d[indx], j*4)
ax.set_xlabel('time (ms)')
ax.set_ylabel('displacement [offset] (mm)')
ax.legend();


We now ask "was it worth it?" for creating this function.  Currently we only have one statement in the function body and in our calling cell we only saved ourselves one local variable! In the next couple of lessons we are going to expand this function, but even if we stopped here, I think this function is worth having.  What it is expressing, in addition to the `ax.plot` call, is that the data in **this** use-case is a 1D vector which carries with it associated *time* and *control* attributes.  This may seem trivial, but by using xarray (or pandas, awkward array, or a dictionary of numpy arrays) we structurally encode the important relationships between the parts of our data, and this function knows how to use that structure to "do the right thing".
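To make "structurally encode" concrete, here is a minimal sketch of a curve built by hand.  The numbers are made up for illustration (the lesson's real data come from `get_data`); what matters is that the *time* and *control* coordinates stay attached through arithmetic.

```python
import numpy as np
import xarray as xa

# Made-up values for illustration only.
time = np.linspace(0, 10, 5)
curve = xa.DataArray(
    np.sin(time),
    dims=['time'],
    coords={'time': time, 'control': 1.5},  # vector 'time', scalar 'control'
)

# Adding an offset returns a new DataArray that still knows its coordinates.
shifted = curve + 4

print(float(shifted.control))
print(shifted.time.values)
```

This is why `plot_one` can take a single argument for the curve: the x values and the label text travel with the data instead of being passed separately.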

## Style the curves

Our function, despite its upsides, has in fact cost us some functionality.  In addition to *label*, `ax.plot` can take a wide range of keyword arguments to control the styling of the line.  To get this back we can either pass all extra keyword arguments through to the `ax.plot` call like

```python
def plot_one(..., **kwargs_for_plot):
    return ax.plot(..., **kwargs_for_plot)
```

which is a very common pattern when wrapping APIs.  However, in Python a function can have exactly one "collect all the extra keywords" parameter.  This 