# Python for (open) Neuroscience

_Lecture 1.1_ - More on `numpy`

Luigi Petrucco

Jean-Charles Mariani

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vigji/python-cimec/blob/main/lectures/Lecture1.1_Numpy.ipynb)

## More on `numpy` functions

### `np.mean()` / `np.nanmean()`

Calculate the average of an array, either global or along some axis:

In [None]:
import numpy as np
arr = np.random.randint(0, 255, (5, 6))
print(f"{arr};\nmean: {np.mean(arr)}")

If we want to take the average along a specific dimension, we can pass the axis as a parameter:

In [None]:
import numpy as np
arr = np.random.randint(0, 255, (5, 6))
arr_mean = np.mean(arr, axis=0)  # we specify one axis
print(f"{arr};\nmean: {arr_mean}")
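To make the meaning of `axis` concrete, here is a minimal sketch on a small hand-written matrix (the values are arbitrary and chosen only so the results are easy to check): `axis=0` collapses the rows and gives one mean per column, while `axis=1` collapses the columns and gives one mean per row.

```python
import numpy as np

# A small fixed matrix (illustrative values) so the results can be checked by hand
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

col_means = np.mean(arr, axis=0)  # collapse the rows: one mean per column
row_means = np.mean(arr, axis=1)  # collapse the columns: one mean per row

print(col_means, col_means.shape)  # [2.5 3.5 4.5] (3,)
print(row_means, row_means.shape)  # [2. 5.] (2,)
```

The axis you pass is the one that disappears from the shape of the result.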

If the array contains `nan` values, `np.mean()` returns `nan`; to ignore them we have to use `np.nanmean()`:

In [None]:
import numpy as np
arr = np.random.randint(0, 255, (5, 6)).astype(float)  # we need a float dtype to use nan values!
arr[0, 0] = np.nan
arr_mean = np.mean(arr)  # regular mean
arr_nan_mean = np.nanmean(arr)  # use nanmean
print(f"{arr};\nregular mean: {arr_mean}\nnanmean: {arr_nan_mean}")

Many of the functions we're about to see behave this way, and most of them have a nan-aware equivalent:

 - `np.std()` / `np.nanstd()`
 - `np.percentile()` / `np.nanpercentile()`
 - `np.max()` / `np.nanmax()`
 - ...
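As a quick illustration of the pairs listed above, the sketch below compares regular and nan-aware calls on a small array with one missing value (the array is made up for the example):

```python
import numpy as np

arr = np.array([1.0, 2.0, np.nan, 4.0])  # one missing value

print(np.max(arr))     # nan: the regular version propagates the missing value
print(np.nanmax(arr))  # 4.0: the nan-aware version ignores it

print(np.nanstd(arr))             # standard deviation of the valid values only
print(np.nanpercentile(arr, 50))  # 2.0: median of the valid values [1, 2, 4]
```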

### `np.std()` / `np.nanstd()`

Calculate the standard deviation of an array, either global or along some axis:

In [None]:
arr = np.random.normal(0, 3, 1000)
np.std(arr)

### `np.median()` / `np.nanmedian()`

Calculate the median of an array, either global or along some axis:

In [None]:
arr = np.random.randint(0, 100, 1000)
np.median(arr)

### `np.max()` / `np.min()`

Calculate the minimum or maximum of an array, either global or along some axis:

In [None]:
arr = np.random.randint(0, 100, 1000)

np.min(arr), np.max(arr)  # print min and max together

### `np.percentile()`

Calculate a given percentile of an array, either global or along some axis:

In [None]:
arr = np.random.randint(0, 1000, 10000)

np.percentile(arr, 75)  # value below which 75% of the data fall
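`np.percentile()` also accepts a sequence of percentiles, which is handy for summary statistics. A minimal sketch, assuming we want the interquartile range of a simple array of integers (the array is chosen so the answer is easy to verify):

```python
import numpy as np

arr = np.arange(0, 101)  # the integers 0, 1, ..., 100

q25, q75 = np.percentile(arr, [25, 75])  # several percentiles in one call
iqr = q75 - q25  # interquartile range

print(q25, q75, iqr)  # 25.0 75.0 50.0
```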

### `np.unique()`

Return the unique values of an array and, if requested, their counts:

In [None]:
np.unique([1,2,2,3])

In [None]:
# If we ask we can get counts as well
np.unique([1,2,2,3], return_counts=True) 
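With `return_counts=True` the function returns two arrays of matching length, which can be unpacked and combined. One possible use, sketched here under the assumption that we want the most frequent value, relies on `np.argmax()`, which returns the index of the largest element:

```python
import numpy as np

values, counts = np.unique([1, 2, 2, 3], return_counts=True)

print(values)  # [1 2 3]
print(counts)  # [1 2 1]

# The most frequent value is the one whose count is largest
most_common = values[np.argmax(counts)]
print(most_common)  # 2
```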

In [None]:
arr = np.random.normal(0, 1, 1000000)

### `np.diff()` / `np.cumsum()`  

We can compute cumulative sums (the discrete analogue of an integral) of an array with `np.cumsum()`:

In [None]:
my_arr = np.array([1,2,3,4])
np.cumsum(my_arr)

We can compute differences between consecutive elements of an array using `np.diff()`:

In [None]:
my_arr = np.array([1,2,3,4])
np.diff(my_arr)
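The two functions are approximately inverses of each other. The sketch below shows that `np.diff()` returns one element fewer than its input, and that passing `prepend=0` recovers the original array from its cumulative sum:

```python
import numpy as np

my_arr = np.array([1, 2, 3, 4])

cumulative = np.cumsum(my_arr)  # values 1, 3, 6, 10
steps = np.diff(cumulative)     # values 2, 3, 4: one element shorter than the input

# Prepending a 0 before differencing undoes the cumulative sum exactly
recovered = np.diff(cumulative, prepend=0)

print(steps.shape, recovered)
```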

### Write code the `numpy` way

When operating on matrices, you should always aim to write <span style="color:indianred">vectorized</span> code.

Vectorized code: code where `for` loops are replaced by operations over whole array dimensions.

A very simple example: element-wise multiplication of two vectors:

In [None]:
vector_1 = np.random.normal(0, 1, (10000000,))
vector_2 = np.random.normal(0, 1, (10000000,))

In [None]:
%%timeit
product = np.zeros(vector_1.shape)  # initialize empty result vector

# Compute the multiplication in a loop:
for i in range(vector_1.shape[0]):
    product[i] = vector_1[i] * vector_2[i]

In [None]:
%%timeit
product = vector_1 * vector_2

Another example: Z-score the rows of a matrix:

In [None]:
data_matrix = np.random.randint(0, 255, (100000, 100))

In [None]:
%%timeit
normalized_matrix = np.zeros(data_matrix.shape)  # start an empty matrix of matching shape 

# Loop over rows (first dimension), take mean and std, subtract and divide:
for i in range(data_matrix.shape[0]):
    row_mean = np.mean(data_matrix[i, :])
    row_std = np.std(data_matrix[i, :])
    
    normalized_matrix[i, :] = (data_matrix[i, :] - row_mean) / row_std

In [None]:
%%timeit
rows_mean = np.mean(data_matrix, axis=1)  # vectorized mean
rows_std = np.std(data_matrix, axis=1)  # vectorized std

# Write the normalization as a vector operation.
# Note how we use broadcasting to make sure the right dimensions are propagated!

normalized_matrix = (data_matrix - rows_mean[:, np.newaxis]) / rows_std[:, np.newaxis]
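A quick way to check a vectorized Z-score is to verify that every row ends up with mean 0 and standard deviation 1. The sketch below does this on a tiny made-up matrix and uses `keepdims=True`, an alternative to indexing with `np.newaxis` that keeps the reduced axis with length 1 so that broadcasting works directly:

```python
import numpy as np

# Tiny illustrative matrix: 2 rows on very different scales
data = np.array([[1.0, 2.0, 3.0],
                 [10.0, 20.0, 30.0]])

rows_mean = np.mean(data, axis=1, keepdims=True)  # shape (2, 1) instead of (2,)
rows_std = np.std(data, axis=1, keepdims=True)    # shape (2, 1)

# (2, 3) minus (2, 1): each row statistic is broadcast along the columns
z_scored = (data - rows_mean) / rows_std

print(rows_mean.shape)
print(np.allclose(z_scored.mean(axis=1), 0))  # True
print(np.allclose(z_scored.std(axis=1), 1))   # True
```

Note the parentheses around the subtraction: without them the division would be evaluated first.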

## Search indexes

Some functions return the positions (indexes) of particular elements:

### `np.argmin()` / `np.argmax()` 

Find the position of the minimum or the maximum of an array:

In [None]:
arr = np.array([5, 0, 2, 9, 6])

np.argmin(arr)  # give index of smallest element

In [None]:
np.argmax(arr)  # give index of biggest element

For a multi-dimensional array:

In [None]:
arr = np.array([[5, 1, 2], [3, 0, 4]])
print(arr)
np.argmin(arr)

### Index raveling / unraveling

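What is this 4? By default, `np.argmin()` returns the index into the flattened array. `np.unravel_index()` converts a flat index into (row, column) coordinates for a given shape:

In [None]:
np.unravel_index(4, arr.shape)  # flat index -> (row, column)

In [None]:
np.ravel_multi_index((1, 1), arr.shape)  # (row, column) -> flat index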
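The round trip between flat and multi-dimensional indexes can be sketched as follows, reusing the same small 2D array as in the `np.argmin()` example:

```python
import numpy as np

arr = np.array([[5, 1, 2],
                [3, 0, 4]])

flat_idx = np.argmin(arr)  # index into the flattened array [5, 1, 2, 3, 0, 4]
print(flat_idx)  # 4

# Convert the flat index to (row, column) coordinates for this shape
row, col = np.unravel_index(flat_idx, arr.shape)
print(row, col)       # 1 1
print(arr[row, col])  # 0, the minimum

# And convert the coordinates back to the flat index
print(np.ravel_multi_index((row, col), arr.shape))  # 4
```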

### `np.nonzero()` / `np.argwhere()`

In [None]:
arr = np.array([1,2,3,4,5])
print(arr)

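# np.argwhere(arr > 3) would return the same positions as an (N, 1) array of coordinates
match_idxs = np.nonzero(arr > 3)  # tuple of index arrays where the condition is True
arr[match_idxs]  # select the matching elements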
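The two functions return the same positions in different layouts. A minimal sketch of the difference: `np.nonzero()` gives a tuple with one index array per dimension, which can be used directly for indexing, while `np.argwhere()` gives one row of coordinates per match.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
mask = arr > 3  # boolean array, True where the condition holds

idx_tuple = np.nonzero(mask)  # tuple with one index array per dimension
idx_rows = np.argwhere(mask)  # 2D array with one row of coordinates per match

print(idx_tuple[0])    # [3 4]
print(idx_rows.shape)  # (2, 1)
print(arr[idx_tuple])  # [4 5]
```

For simple selection, indexing with the boolean mask itself (`arr[mask]`) gives the same elements.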