# Introduction to the array API

This notebook is a brief introduction to the problems you face when trying to write code that is
agnostic to the type of the input array. That is, code that works with NumPy, PyTorch, CuPy, etc.
arrays as input.

As an example, let's write a function that can normalize an array.

We will compute the mean and standard deviation to center and normalize
the array. This isn't groundbreaking stuff, but it already illustrates the hurdles that you
face when trying to write code that works with several different types of input arrays.

In [None]:
%matplotlib inline

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(42)


def normalize(arr):
    # Center on the mean, then scale by the standard deviation.
    mean = np.mean(arr)
    std = np.std(arr)
    normalized_arr = (arr - mean) / std

    return normalized_arr

Generate some random values to play with.

In [None]:
mu = -3  # Mean
sigma = 3.141  # Standard deviation
x = np.random.normal(mu, sigma, size=(100,))

plt.hist(x, range=(-15, 15), bins=30);

This looks reasonable. The center is somewhere left of zero.

In [None]:
x_norm = normalize(x)

plt.hist(x_norm, range=(-15, 15), bins=30);

Hooray! It looks like it works. The data is now centered on zero and the spread is much smaller.
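A histogram is only a visual check. As a quick numerical sanity check, the sketch below re-creates the data and the NumPy-only `normalize` from above and confirms that the result has a mean of roughly zero and a standard deviation of roughly one. The tolerance passed to `np.isclose` is an arbitrary choice for this illustration.

```python
import numpy as np

np.random.seed(42)


def normalize(arr):
    # Same NumPy-only function as above.
    mean = np.mean(arr)
    std = np.std(arr)
    return (arr - mean) / std


x = np.random.normal(-3, 3.141, size=(100,))
x_norm = normalize(x)

# After normalizing, the mean should be ~0 and the standard deviation ~1.
mean_ok = bool(np.isclose(np.mean(x_norm), 0.0, atol=1e-9))
std_ok = bool(np.isclose(np.std(x_norm), 1.0, atol=1e-9))
print(mean_ok, std_ok)
```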

How about using the same function for a CuPy array? It would be cool to use the same code
no matter what the input array type is.

In [None]:
# Skip these cells if you don't have a CUDA device/GPU
#import cupy as cp

#x_cp = cp.asarray(x)

In [None]:
#normalize(x_cp)

Looks good!

How about PyTorch?

In [None]:
import torch

x_torch = torch.asarray(x, device="cpu")

In [None]:
normalize(x_torch)

Well, that would have been too good to be true.

We could probably fix this if we used `torch.mean` instead of `np.mean` in our function.

To do this we have to somehow ask the array what its module is.

Maybe the `inspect` module has something to help.

In [None]:
import inspect

In [None]:
inspect.getmodule(x_torch)

In [None]:
xp = inspect.getmodule(x_torch)

In [None]:
xp.mean(x_torch)

It looks like we could modify our `normalize` function to first use `inspect.getmodule` to get the module
of the input array, and then use the functions from it to perform the normalisation.

In [None]:
def normalize(arr):
    xp = inspect.getmodule(arr)
    
    mean = xp.mean(arr)
    std = xp.std(arr)
    normalized_arr = (arr - mean) / std

    return normalized_arr

In [None]:
normalize(x_torch)

Excellent!

Let's just check with a NumPy array to make sure.

In [None]:
normalize(x)

## The array API standard

The basic idea of the array API standard is to provide a standardised way to obtain the namespace associated with an array
and for the contents of that namespace to be standardised as well.

This means you can write functions that work with any kind of input array (if that array complies with the array API standard).

No need for `if isinstance(x, torch.Tensor)` or `inspect.getmodule` trickery.

You can find out more about what is part of the standard at https://data-apis.org/array-api/latest/index.html
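To make the idea concrete, here is a minimal sketch of both sides of the protocol using only the Python standard library. `ToyArray`, `toy_xp` and `toy_normalize` are hypothetical names invented for this illustration; they are not part of any real array library, and the toy takes shortcuts (for example, its `mean` returns a plain float, whereas the standard specifies a zero-dimensional array).

```python
import math
import types


class ToyArray:
    """A deliberately tiny, hypothetical 1-D array type (not a real library)."""

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def __array_namespace__(self, api_version=None):
        # The hook defined by the standard: return the namespace of array functions.
        return toy_xp

    def __sub__(self, other):
        return ToyArray(v - float(other) for v in self.values)

    def __truediv__(self, other):
        return ToyArray(v / float(other) for v in self.values)


def _mean(arr):
    return sum(arr.values) / len(arr.values)


def _std(arr):
    m = _mean(arr)
    return math.sqrt(sum((v - m) ** 2 for v in arr.values) / len(arr.values))


# The "namespace" is just an object exposing functions under the standard names.
toy_xp = types.SimpleNamespace(mean=_mean, std=_std)


def toy_normalize(arr):
    # The consumer never names the library; it asks the array for its namespace.
    xp = arr.__array_namespace__()
    mean = xp.mean(arr)
    std = xp.std(arr)
    return (arr - mean) / std


result = toy_normalize(ToyArray([1, 2, 3]))
print(result.values)
```

The point of the sketch is that `toy_normalize` contains no reference to `ToyArray` or `toy_xp`; any array type that provides the hook and the same function names would work with it unchanged.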

Unfortunately the standard is pretty new, so things aren't perfect yet. For example, NumPy before v2 does not have
the `__array_namespace__` method (the same goes for PyTorch). So for now we will use `array_api_compat`, a small library
that smooths out the remaining differences.
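One way to see whether your installed NumPy already provides the hook is to look for it directly. The helper `namespace_or_none` below is a hypothetical name used only for this sketch; it is not part of NumPy or `array_api_compat`.

```python
import numpy as np


def namespace_or_none(arr):
    # Hypothetical helper: call the standard's hook if present, else return None.
    hook = getattr(arr, "__array_namespace__", None)
    return hook() if hook is not None else None


small = np.asarray([1.0, 2.0, 3.0])
small_xp = namespace_or_none(small)

# With NumPy 2 or later this is expected to show a namespace; with older versions, None.
print(np.__version__, small_xp)
```

Objects without the hook, such as a plain Python list, simply give `None` here, which is the gap that a compatibility layer has to fill.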

In [None]:
!pip install array-api-compat

In [None]:
import array_api_compat

In [None]:
array_api_compat.get_namespace(x_torch)

In [None]:
array_api_compat.get_namespace(x)

## Final version

Let's write a version of `normalize` that uses the array API.

In [None]:
def normalize(arr):
    xp = array_api_compat.get_namespace(arr)
    
    mean = xp.mean(arr)
    std = xp.std(arr)
    normalized_arr = (arr - mean) / std

    return normalized_arr

In [None]:
normalize(x_torch)

In [None]:
normalize(x)

## Exercise

Take this function and convert it to use the array API so that you can pass in a NumPy, PyTorch or CuPy array.