<a href="https://csdms.colorado.edu"><img style="float: center; width: 75%" src="https://raw.githubusercontent.com/csdms/ivy/main/media/logo.png"></a>

# Functions

In the [diffusion](8_diffusion.ipynb) and [advection](9_advection.ipynb) notebooks,
we wrote code
to solve the one-dimensional diffusion and advection equations numerically,
evolve the solutions with time,
and visualize the results.

However, the code in these notebooks is long and complicated and frequently repetitive.
What if we wanted to use the code again,
with different parameters or perhaps even in a different notebook?
Cutting and pasting is tedious, and it can easily lead to errors.

We'd like a way to organize our code so that it's easier to reuse.
Python provides for this by letting us define *functions*.
A function groups code into a named block that can be called as a unit.

Before we start,
we'll need NumPy and a NumPy print setting for the code in this notebook.

In [None]:
import numpy as np

np.set_printoptions(precision=1, floatmode="fixed")

## Definition

In the diffusion notebook,
we defined a timestep based on a stability criterion
for our numerical solution to the diffusion equation.
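For reference, the usual stability limit for an explicit (forward-time, centered-space) solution of the one-dimensional diffusion equation is written out here, assuming that is the scheme used in the diffusion notebook. The symbols $\Delta x$ (grid spacing), $D$ (diffusivity), and $\Delta t$ (time step) are notation chosen for this sketch. Dividing by 2.1 rather than 2, as the function in this section does, keeps the time step slightly under the limit.

$$
\Delta t \le \frac{\Delta x^2}{2 D},
\qquad
\Delta t = \frac{\Delta x^2}{2.1\, D}
$$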

Let's group this code into a function.

In [None]:
def calculate_time_step(grid_spacing, diffusivity):
    return grid_spacing ** 2 / diffusivity / 2.1

A function definition begins with the keyword `def`,
followed by the name of the function,
followed by a comma-delimited list of arguments (also known as parameters) in parentheses,
and ending with a colon `:`.
The code in the body of the function--run when the function is called--must be indented.

We've named our function `calculate_time_step` (naming functions is often an art).
It takes two arguments,
the grid spacing of the model and the diffusivity.
The variables `grid_spacing` and `diffusivity` are *local* to the function--they don't exist outside of the body of the function.
In the body of the function,
the time step is calculated from the stability criterion
and returned to the caller.
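A short check of what *local* means, offered as a sketch: after the call, trying to use `grid_spacing` outside the function raises a `NameError`. The function is repeated so the block runs on its own, and it assumes no variable named `grid_spacing` has been defined elsewhere in the notebook.

```python
def calculate_time_step(grid_spacing, diffusivity):
    return grid_spacing**2 / diffusivity / 2.1


calculate_time_step(10.0, 0.1)

# The argument names exist only while the function body is running.
try:
    grid_spacing
except NameError as error:
    message = str(error)

print(message)
```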

## Execution

Call the `calculate_time_step` function with a grid spacing `dx` of $10.0~m$ and a diffusivity `D` of $0.1~m^2 s^{-1}$.

In [None]:
dx = 10.0
D = 0.1
dt = calculate_time_step(dx, D)

Note that we passed the arguments to the function in the order it expects:
first the grid spacing, then the diffusivity.
Calling a function we define is no different than calling any other Python function.

Print the result.

In [None]:
print(f"Time step = {dt:.2f} s")
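Argument order matters when arguments are passed by position. As a small illustration, swapping the two values produces no error, just a very different (and wrong) time step. The function and values are repeated so the block runs on its own; `dt_correct` and `dt_swapped` are names introduced for this sketch.

```python
def calculate_time_step(grid_spacing, diffusivity):
    return grid_spacing**2 / diffusivity / 2.1


dx = 10.0  # grid spacing, m
D = 0.1  # diffusivity, m^2/s

dt_correct = calculate_time_step(dx, D)
dt_swapped = calculate_time_step(D, dx)  # arguments in the wrong order

print(f"Correct order: {dt_correct:.2f} s")
print(f"Swapped order: {dt_swapped:.6f} s")
```

Python can't tell that the second call is a mistake, because both arguments are plain numbers.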

In Python,
we can also pass arguments by name.

In [None]:
dt1 = calculate_time_step(grid_spacing=dx, diffusivity=D)
dt1 == dt

This is useful because it makes the function call more readable.

Further,
when passing arguments by name,
we can change the order of the arguments.

In [None]:
dt2 = calculate_time_step(diffusivity=D, grid_spacing=dx)
dt2 == dt

This is useful because it makes the function easier to call: you don't have to remember the argument order.

These techniques can be used with any Python function, whether it's made by us or by someone else.
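As an example with a function written by someone else, NumPy's `linspace` can be called by position or by name. This sketch uses its documented `start`, `stop`, and `num` arguments; the variable names are chosen for illustration.

```python
import numpy as np

# Positional call: start, stop, then the number of points.
x_positional = np.linspace(0.0, 1.0, 5)

# The same call with arguments passed by name, in a different order.
x_named = np.linspace(num=5, stop=1.0, start=0.0)

print(np.array_equal(x_positional, x_named))
```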

## Extension

Python functions have many interesting features,
more than we can address here.
We'll focus on a few,
and provide a list of additional resources in the summary. 

### Default arguments

It's often useful to define default values for the arguments in a function.

Let's create another function from a piece of repeated code in the diffusion notebook.
This one sets the initial profile of the diffused quantity
(e.g., temperature, aerosol concentration, sediment, etc.).

In [None]:
def set_initial_profile(domain_size=100, boundary_left=500, boundary_right=0):
    concentration = np.empty(domain_size)
    concentration[: int(domain_size / 2)] = boundary_left
    concentration[int(domain_size / 2) :] = boundary_right
    return concentration

Note that each of the arguments is assigned a default value.
If any argument is omitted from a call to this function,
its default value is used instead.

Call `set_initial_profile` with a domain size `Lx` of $10~m$.

In [None]:
Lx = 10
C = set_initial_profile(Lx)

Although we omitted the left and right boundary condition values,
the function call didn't produce an error.

Check the result by printing the returned concentration `C`.

In [None]:
print(C)

The default values for the left and right boundary conditions were applied.

Using default values makes calling a function easier.
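Defaults can also be overridden selectively. The sketch below repeats `set_initial_profile` so it runs on its own, then calls it once with every default and once with only the right boundary value changed by name. `C_default` and `C_custom` are names introduced here.

```python
import numpy as np


def set_initial_profile(domain_size=100, boundary_left=500, boundary_right=0):
    concentration = np.empty(domain_size)
    concentration[: int(domain_size / 2)] = boundary_left
    concentration[int(domain_size / 2) :] = boundary_right
    return concentration


# Use every default: 100 nodes, 500 on the left half, 0 on the right half.
C_default = set_initial_profile()

# Override only the right boundary value; the left keeps its default.
C_custom = set_initial_profile(10, boundary_right=100)

print(C_default.size)
print(C_custom)
```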

### Type hints

Let's group some more repeated code in the diffusion notebook into a function.
This is the solver we used for the one-dimensional diffusion equation.

In [None]:
def solve1d(concentration, grid_spacing=1.0, time_step=1.0, diffusivity=1.0):
    flux = -diffusivity * np.diff(concentration) / grid_spacing
    concentration[1:-1] -= time_step * np.diff(flux) / grid_spacing

The arguments for the grid spacing, time step, and diffusivity take default values,
but the `concentration` argument does not.

**Question:** Without looking at the body of the function,
can you tell what sort of variable goes into the `concentration` argument?
A float? A string? A NumPy array?



Python is dynamically typed:
a variable can hold a value of any type,
so a function definition alone doesn't say what type each argument should be.
It's hard to tell whether `concentration` should be an integer, a float, a string, or a NumPy array.

This is where *type hints* can be handy.
A type hint annotates an argument, or the return value, with the type the function expects.

Rewrite the function with type hints.

In [None]:
def solve1d(concentration: np.ndarray, grid_spacing: float = 1.0, time_step: float = 1.0, diffusivity: float = 1.0) -> None:
    flux = -diffusivity * np.diff(concentration) / grid_spacing
    concentration[1:-1] -= time_step * np.diff(flux) / grid_spacing

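Type hints are optional, and Python doesn't enforce them when the code runs.
They're still useful:
they tell readers, and tools such as type checkers and editors, what a function expects.
Here, the `-> None` hint also shows that `solve1d` returns nothing;
it updates the `concentration` array in place.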
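A minimal sketch of what "not enforced" means, using a hinted version of `calculate_time_step` (the hints are added here for illustration). The hints are stored on the function, but passing a NumPy array where a float is hinted raises no error.

```python
import numpy as np


def calculate_time_step(grid_spacing: float, diffusivity: float) -> float:
    return grid_spacing**2 / diffusivity / 2.1


# The hints are stored on the function, where tools can read them.
hints = calculate_time_step.__annotations__
print(hints)

# Passing an array where a float is hinted raises no error;
# the arithmetic is simply applied element by element.
dt = calculate_time_step(np.array([1.0, 2.0]), 0.1)
print(type(dt).__name__)
```

Catching this kind of mismatch before the code runs is the job of a separate type checker, not of Python itself.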

### Docstrings

Add a documentation string (docstring) to `solve1d`.

In [None]:
def solve1d(concentration: np.ndarray, grid_spacing: float = 1.0, time_step: float = 1.0, diffusivity: float = 1.0) -> None:
    """Solve the one-dimensional diffusion equation with fixed boundary conditions.

    Parameters
    ----------
    concentration : ndarray
        The quantity being diffused.
    grid_spacing : float (optional)
        Distance between grid nodes.
    time_step : float (optional)
        Time step.
    diffusivity : float (optional)
        Diffusivity.

    Returns
    -------
    None
        The `concentration` array is updated in place.

    Examples
    --------
    >>> import numpy as np
    >>> from solver import solve1d
    >>> np.set_printoptions(precision=1, floatmode="fixed")
    >>> z = np.zeros(5)
    >>> z[2] = 5
    >>> solve1d(z, diffusivity=0.25)
    >>> z
    array([0.0, 1.2, 2.5, 1.2, 0.0])
    """
    flux = -diffusivity * np.diff(concentration) / grid_spacing
    concentration[1:-1] -= time_step * np.diff(flux) / grid_spacing

Use the built-in `help` function, or the `?` shortcut in Jupyter, to view the docstring.

In [None]:
help(solve1d)

In [None]:
?solve1d

Docstrings aren't required, but they're good practice.

Documentation systems such as Sphinx use information from docstrings to produce documentation.
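Tools read docstrings through the function object itself. As a sketch, the block below attaches a docstring to `calculate_time_step` (the docstring text is written for this example, not taken from the notebooks) and reads it back with the standard library's `inspect.getdoc`.

```python
import inspect


def calculate_time_step(grid_spacing, diffusivity):
    """Calculate a stable time step for the diffusion solver.

    Parameters
    ----------
    grid_spacing : float
        Distance between grid nodes.
    diffusivity : float
        Diffusivity.

    Returns
    -------
    float
        The time step.
    """
    return grid_spacing**2 / diffusivity / 2.1


# The raw docstring is stored in the function's __doc__ attribute.
raw = calculate_time_step.__doc__

# inspect.getdoc returns the same text with the indentation cleaned up.
doc = inspect.getdoc(calculate_time_step)
summary = doc.splitlines()[0]
print(summary)
```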

## Composition

Let's put the functions together.
Building a larger function out of smaller ones is called *composition*.

In [None]:
def example():
    """An example of running `solve1d`."""
    print(example.__doc__)
    D = 100  # diffusivity
    Lx = 10  # domain length
    dx = 0.5  # grid spacing

    dt = calculate_time_step(dx, D)
    C = set_initial_profile(Lx)

    print("Time = 0\n", C)
    for t in range(1, 5):
        solve1d(C, dx, dt, D)
        print(f"Time = {t*dt:.4f}\n", C)

This is our first taste of how larger programs are built:
we define basic operations,
then combine them in ever-larger chunks to get the effect we want.
Real-life functions will usually be larger than the ones shown here (typically half a dozen to a few dozen lines),
but they shouldn't ever be much longer than that,
or the next person who reads them won't be able to understand what's going on.

Run the `example` function.

In [None]:
example()
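The loop in `example` works because `solve1d` changes the array it is given rather than returning a new one. The sketch below repeats the solver so it runs on its own and shows both effects; `z`, `z_before`, and `result` are names introduced here.

```python
import numpy as np


def solve1d(concentration, grid_spacing=1.0, time_step=1.0, diffusivity=1.0):
    flux = -diffusivity * np.diff(concentration) / grid_spacing
    concentration[1:-1] -= time_step * np.diff(flux) / grid_spacing


z = np.zeros(5)
z[2] = 5.0
z_before = z.copy()  # keep a copy, because the call changes z itself

result = solve1d(z, diffusivity=0.25)

print(result)  # None: the function has no return statement
print(np.array_equal(z, z_before))  # False: the interior values changed
```

If the original values are needed later, pass a copy (`z.copy()`) to the solver instead.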

## Exercises

1.  "Adding" two strings produces their concatenation: `'a' + 'b'` is `'ab'`. Write a function called `fence` that takes two parameters, `original` and `wrapper`, and returns a new string that has the wrapper character at the beginning and end of the original.

1. Write a function, `normalize`, that takes an array as input and returns a corresponding array of values scaled to the range $[0,1]$. (Hint: Look at NumPy functions such as `arange` and `linspace` to see how their arguments are defined.)

1.  Rewrite your `normalize` function so that it scales data to $[0,1]$ by default, but allows a user to optionally specify the lower and upper bounds.

## Summary

Functions let us group code into named units that are easy to reuse.
The process of building larger programs from smaller functions, called composition, is a key element of scientific programming.

If your function doesn't fit on a screen, it's too long.
Break it up.

This notebook didn't cover the distinction between formal and actual parameters, or global variables.
For more information on functions, see the Python documentation on
[defining functions](https://docs.python.org/3/tutorial/controlflow.html#defining-functions)
and [more on defining functions](https://docs.python.org/3/tutorial/controlflow.html#more-on-defining-functions),
which includes default arguments.

How do we know a function is working as we expect?
Checking this is the job of *unit testing*, covered later.
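As a small preview, and only a sketch rather than a full test, a function can be checked against a value worked out by hand. The expected value and the use of `math.isclose` are choices made for this example.

```python
import math


def calculate_time_step(grid_spacing, diffusivity):
    return grid_spacing**2 / diffusivity / 2.1


# Expected value worked out by hand: 10**2 / 0.1 / 2.1 = 1000 / 2.1
expected = 1000.0 / 2.1
actual = calculate_time_step(10.0, 0.1)

# Compare floats with a tolerance rather than with ==.
assert math.isclose(actual, expected)
print("calculate_time_step passed")
```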