# Tutorial **Numpy Functions** - Array Handling

### The `numpy` **module** is used in almost all numerical computation using Python. 

### A **function** is a specialized operation that has been programmed to take one or more *input* variables and return one or more *output* variables. 
### In this lesson, we will learn about a number of useful numpy functions.  I am not going to cover every numpy function.  I'm not even going to cover most of them. 



In [None]:
# Import the numpy module
import numpy as np

### I'm also going to specifically import the random submodule in order to make my code easier to read
### I will also start the random number generator

In [None]:
from numpy import random
myseed = 1967
rng = random.default_rng(seed = myseed)

##  Mathematical Functions in numpy

### Basic Math
### Of course the basic operations will work on numpy arrays, element by element:
* +, addition                                  
* -, subtraction
* \*, multiplication
* /, division 
* \*\*, exponentiation 
* //, floor division or integer division 
* %, remainder 
### **THEY WILL ONLY WORK WITH A CONSTANT OR WITH ARRAYS OF THE SAME SIZE** (more precisely, with arrays whose shapes are compatible under numpy's broadcasting rules)

In [None]:
a = np.array([0.5, 1,2])
b = np.array([4,6,8]) 
c = np.array([-1,1])
print('b ', b)
#operating on a constant 
add = b+2   
print('b+2 ',add)
subtract = b-2
print('b-2 ', subtract)
multiply = b*2
print('b*2 ', multiply)
divide = b/2 
print('b/2 ', divide)
exponent = b**2
print('b**2 ', exponent)
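
### The list above also includes floor division and remainder, which the cell above does not exercise.  Here is a small sketch of both, using the same values as `b` and a divisor of 3 that I picked only for illustration.

```python
import numpy as np

b = np.array([4, 6, 8])

floor_div = b // 3   # how many whole times 3 goes into each element
remainder = b % 3    # what is left over after that whole-number division

print('b//3 ', floor_div)   # values 1, 2, 2
print('b%3 ', remainder)    # values 1, 0, 2
```

### Multiplying `floor_div` by 3 and adding `remainder` gives back `b`.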

In [None]:
print('a ',a)
print('b ',b)
#operating on arrays 
add = b+a   
print('b+a ',add)
subtract = b-a
print('b-a ',subtract)
multiply = b*a
print('b*a ',multiply)
divide = b/a 
print('b/a ', divide)
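
### What happens when the sizes do not match?  The sketch below adds a 3 element array to a 2 element array.  Numpy raises a `ValueError`, which I catch here only so the cell keeps running; the names `total` and `err` are my own.

```python
import numpy as np

b = np.array([4, 6, 8])   # 3 elements
c = np.array([-1, 1])     # 2 elements

try:
    total = b + c         # shapes (3,) and (2,) cannot be combined
except ValueError as err:
    total = None
    print('b+c failed:', err)
```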

## Manipulating Sign and Data Type
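### There are also some basic functions for manipulating the sign and rounding of data.  They all return floating point arrays:
* `abs`, computes the absolute value
* `rint`, rounds to the nearest integer (a value exactly halfway between two integers rounds to the even one)
* `floor`, returns the largest integer less than or equal to the number (rounds toward negative infinity)  
* `ceil`, returns the smallest integer greater than or equal to the number
* `sign`, returns -1 for negative values, 0 for zero, and 1 for positive values 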

In [None]:
c = np.array([-1.25, -0.75,-0.25, 0,0.25, 0.75, 1.25])
print('c ', c)
#let's test the operations above
c_abs = np.abs(c)
print('abs ',c_abs)
c_rint = np.rint(c)
print('rint ',c_rint)
c_floor = np.floor(c)
print('floor ',c_floor)
c_ceil = np.ceil(c)
print('ceil ',c_ceil)
c_sign = np.sign(c)
print('sign ',c_sign)
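
### Two details of rounding are easy to miss, so here is a small sketch with values chosen for illustration.  `rint` sends exact halves to the nearest even integer, and `floor` on a negative number moves away from zero.  I also use `trunc`, which is not covered above, to show what simply dropping the decimal looks like, and `astype(int)` to get an integer array.

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
halves_rint = np.rint(halves)         # 0, 2, 2, 4: halves go to the even integer
print('rint ', halves_rint)

neg = np.array([-1.25, -0.75])
neg_floor = np.floor(neg)             # -2, -1: rounds toward negative infinity
neg_trunc = np.trunc(neg)             # -1, -0: drops the decimal part
print('floor ', neg_floor)
print('trunc ', neg_trunc)

as_int = neg_floor.astype(int)        # convert the float result to integers
print('as_int ', as_int, as_int.dtype)
```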



## Maximum and minimum 
### There are three pairs of functions that handle the maximum and minimum of arrays. 
### 1. to find the maximum or minimum value within an array
* `amax` (the same function as `max`, which I use below)
* `amin` (the same function as `min`)
### 2. to find the *index* of the maximum or minimum element of an array
* `argmax`
* `argmin`
### 3. to compare two equal size arrays element by element, use 
* `maximum`
* `minimum` 

### Let's get the maximum and minimum of an array and the index of the maximum and minimum of an array

In [None]:
v = rng.integers(1,21,15) #15 integers ranging from 1 to 20
maxv = np.max(v) # find the maximum value of v
index_maxv = np.argmax(v) #find the index into v that gives the maximum value.  
print('v = ',v)
print('maxv = ',maxv)
print('index_maxv =', index_maxv)
minv = np.min(v) #find the minimum value of v
index_minv = np.argmin(v) # find the index into v that gives the minimum value 
print('v = ',v)
print('minv = ',minv)
print('index_minv =', index_minv)

### We can compute the minimum or maximum of a matrix.  

In [None]:
M = rng.integers(1,21,(4,8)) # random integers between 1 and 20 in a 4 x 8 matrix
M_min = np.min(M) # find the minimum of a matrix
M_min_index = np.argmin(M) # find the index into a matrix that identifies the minimum 
M_max = np.max(M) #find the maximum of a matrix 
M_max_index = np.argmax(M) #find the index into a matrix that identifies the maximum
print('M')
print(M)
print('M_min')
print(M_min)
print('M_min_index')
print(M_min_index) #Here, the index is a single number that counts through the matrix row by row, starting at 0
print('M_max')
print(M_max)
print('M_max_index')
print(M_max_index) #Here too, the index counts through the matrix row by row.  
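
### The single number returned by `argmax` or `argmin` on a matrix is often less convenient than a row and a column.  One way to convert it, sketched here on a small matrix I made up (`M_demo`), is the numpy function `unravel_index`, which takes the index and the shape of the matrix.

```python
import numpy as np

M_demo = np.array([[7, 3, 9],
                   [4, 12, 1]])   # hypothetical 2 x 3 matrix

flat_index = np.argmax(M_demo)    # counts row by row: 7, 3, 9, 4, 12, ...
row, col = np.unravel_index(flat_index, M_demo.shape)

print('flat_index =', flat_index)        # 4
print('row, col =', row, col)            # 1 1
print('value =', M_demo[row, col])       # 12
```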

### When working with a matrix we often want to compute the maximum or minimum of each row or column.  To do that, we have to specify an *axis*

In [None]:
M = rng.integers(1,21,(7,3)) # random integers between 1 and 20 in a 7 x 3 matrix
maxM_0 = np.max(M,axis = 0)
maxM_1 = np.max(M,axis = 1)
print('M')
print(M)
print('max, axis = 0') # maximum of each column, 
print(maxM_0)
print('max, axis = 1') # maximum of each row, 
print(maxM_1)
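
### The *axis* argument works the same way for `argmax` and `argmin`.  This is a small sketch on a made-up matrix (`M_demo`), so that the answers are easy to check by eye.

```python
import numpy as np

M_demo = np.array([[7, 3, 9],
                   [4, 12, 1]])   # hypothetical 2 x 3 matrix

argmax_0 = np.argmax(M_demo, axis=0)  # for each column, the row holding the maximum
argmax_1 = np.argmax(M_demo, axis=1)  # for each row, the column holding the maximum

print('argmax, axis = 0')
print(argmax_0)   # [0 1 0]
print('argmax, axis = 1')
print(argmax_1)   # [2 1]
```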

### I can also compare two arrays and choose the maximum or minimum element by element 

In [None]:
w = rng.integers(1,7,10)
u = rng.integers(1,7,10)
print('w = ',w)
print('u = ',u)
p = np.maximum(u,w)
q = np.minimum(u,w)
print('p = ',p)
print('q = ',q)
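
### `maximum` and `minimum` also accept a constant in place of the second array, just like the basic math operations.  In this sketch the array `w_demo` and the threshold 3 are values I chose for illustration.

```python
import numpy as np

w_demo = np.array([1, 5, 2, 6, 3])

at_least_3 = np.maximum(w_demo, 3)   # anything below 3 is raised to 3
at_most_3 = np.minimum(w_demo, 3)    # anything above 3 is lowered to 3

print('at_least_3 = ', at_least_3)   # [3 5 3 6 3]
print('at_most_3 = ', at_most_3)     # [1 3 2 3 3]
```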

## Sort functions for `numpy` arrays
### In many operations with data it is useful to be able to sort the data from lowest to highest, or highest to lowest. 
### It is perhaps not surprising that the function that will sort an array is called `sort`

In [None]:
v = rng.integers(1,7,10)  # 10 random numbers between 1 and 6
v_sorted = np.sort(v) #sort v in ascending order 
print('v =', v)
print('v_sorted =',v_sorted)

### The numpy `sort` function always sorts in ascending order from lowest to highest.  What if I wanted to sort from highest to lowest? 

### Numpy has a `flip` function that allows us to reverse the order of the elements in an array. 

In [None]:
v_flipped = np.flip(v)
print('v = ',v)
print('v_flipped = ', v_flipped)

In [None]:
v_sorted = np.sort(v)
v_sorted = np.flip(v_sorted)
print('v_sorted =', v_sorted)

In [None]:
# I could actually do it in one step by nesting my functions like this. 
v_sorted = np.flip(np.sort(v)) # I implicitly take the output of np.sort and enter into np.flip
print('v_sorted =', v_sorted)
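
### Another common way to reverse a one dimensional array is the slice `[::-1]`, which steps through the array backwards.  This is only a sketch of an alternative to `flip`, on a made-up array `v_demo`; both give the same result here.

```python
import numpy as np

v_demo = np.array([3, 1, 6, 2, 5])

v_desc_flip = np.flip(np.sort(v_demo))   # sort, then flip
v_desc_slice = np.sort(v_demo)[::-1]     # sort, then read backwards

print('v_desc_flip =', v_desc_flip)      # [6 5 3 2 1]
print('v_desc_slice =', v_desc_slice)    # [6 5 3 2 1]
```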

## Matrix sort and the `axis`
### What does sorting a matrix (or any array with more than one dimension) do? 

In [None]:
M = rng.integers(1,7,(6,10)) # 6 by 10 matrix containing random numbers between 1 and 6 
M_test = np.sort(M)
print('M')
print(M)
print('M_test')
print(M_test)

### If we compare the two matrices, it looks like it took each row of M and applied a sort to it by *default*.
### When sorting a matrix or higher dimension array, we can make explicit which dimension we want to apply the sort along. 
### Recall that in a matrix (2 dimensional array), the first dimension indexes the rows and the second dimension indexes the columns. 
### In numpy, the first dimension is *axis* 0 (rows) and the second dimension is *axis* 1 (columns).  

In [None]:
M_sorted_0 = np.sort(M, axis = 0) #sort each column 
M_sorted_1 = np.sort(M, axis = 1) #sort each row 
print('M_sorted_0')
print(M_sorted_0)
print('M_sorted_1')
print(M_sorted_1)

### Specifying axis = 0 sorted each column along the rows.
### Specifying axis = 1 sorted each row along the columns. 
### The default behavior of `sort` is to sort along the last axis.
### It's good practice to specify the *axis* along which you want to sort unless it's a simple one dimensional array. 

### If I need to sort one dimension of a matrix in descending order, I can use the `sort` function as above, followed by the `flip` function, specifying the same axis for both. 

In [None]:
M_sorted_1 = np.flip(M_sorted_1,axis =1)
print(M_sorted_1)
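
### The same two steps work for columns.  Here is a sketch on a small made-up matrix (`M_demo`) that sorts each column from highest to lowest by using axis = 0 in both `sort` and `flip`.

```python
import numpy as np

M_demo = np.array([[3, 8],
                   [9, 1],
                   [5, 4]])   # hypothetical 3 x 2 matrix

cols_asc = np.sort(M_demo, axis=0)     # each column lowest to highest
cols_desc = np.flip(cols_asc, axis=0)  # reverse the order of the rows
print(cols_desc)
# [[9 8]
#  [5 4]
#  [3 1]]
```

### If I leave out the axis in `flip`, numpy reverses every axis at once, which is usually not what I want for a matrix.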

## Ordered Indices - `argsort`
### In many (*most?*) circumstances you don't only want a sorted list of items, you also want to know *what order of indices* produces the sorted list.  This may not seem obvious, but I will give some examples here that illustrate why this is important. 

### The `argsort` function tells you the order of indices to sort an array

In [None]:
v = rng.integers(1,7,10)
v_sorted = np.sort(v) #This obtains a sorted list in increasing order. 
sort_order = np.argsort(v) #This obtains a list of ordered indices that you could use to sort v 
v_sorted_byorder = v[sort_order]
print('i = ',np.arange(0,10,1)) # I just wanted to track the index 
print('v = ',v)
print('sort_order = ',sort_order)
print('v_sorted = ',v_sorted)
print('v_sorted_byorder = ', v_sorted_byorder)
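
### When an array has repeated values, the order of the tied indices can matter.  As a sketch, assuming I want ties kept in their original order, I can pass `kind='stable'` to `argsort`.  Flipping the index array then gives a descending order.  The array `v_demo` is made up for illustration.

```python
import numpy as np

v_demo = np.array([3, 1, 3, 1])

order = np.argsort(v_demo, kind='stable')  # tied values keep their original order
print('order = ', order)                   # [1 3 0 2]
print('ascending = ', v_demo[order])       # [1 1 3 3]

desc_order = np.flip(order)                # same indices, read backwards
print('descending = ', v_demo[desc_order]) # [3 3 1 1]
```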

### Why is this useful? 

### Many times, we want to sort data on one variable, *and sort other variables in the same order*

### I provide an example here on the relationship between age and LDL-bad cholesterol. 

In [None]:
age = np.array([55,58,72,46,48,65]) #age in years
LDL = np.array([65,90,120,55,70,100]) #LDL - bad cholesterol 

### I want to quickly look at those numbers and determine if LDL goes up with age.  
### What I'm going to do is sort the data by age and then use that sort order with the LDL data. 

In [None]:
age_order = np.argsort(age)
age_sorted = age[age_order]
LDL_sorted_byage = LDL[age_order] # notice I used the indices that sort age to index into LDL
print('age = ', age_sorted)
print('LDL = ', LDL_sorted_byage)
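
### Instead of only looking at the sorted numbers, I could count how often LDL rises from one age to the next.  This is just a rough sketch using `np.diff`, which I have not covered above; it computes the difference between neighboring elements.  The names `steps` and `n_up` are my own.

```python
import numpy as np

age = np.array([55, 58, 72, 46, 48, 65])    # age in years
LDL = np.array([65, 90, 120, 55, 70, 100])  # LDL cholesterol

age_order = np.argsort(age)
LDL_sorted_byage = LDL[age_order]           # 55, 70, 65, 90, 100, 120

steps = np.diff(LDL_sorted_byage)           # change in LDL between neighboring ages
n_up = np.sum(steps > 0)                    # number of steps where LDL goes up

print('steps = ', steps)                    # 15, -5, 25, 10, 20
print(n_up, 'of', steps.size, 'steps go up')
```

### With only six people this is a quick look at the data, not a statistical test.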