### NumPy - Numerical Python <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/1a/NumPy_logo.svg/1200px-NumPy_logo.svg.png" alt="NumPy logo" width = "100">

NumPy (conventionally imported as `np`) is the fundamental Python package for scientific computing.

https://numpy.org

Its power comes from the <b>N-dimensional array object</b>.

np is a *lower*-level numerical computing library. 

This means that, while you can use it directly, most of its power comes from the packages built on top of np:
* Pandas (data analysis; the name comes from *pan*el *da*ta)
* Scikit-learn (machine learning)
* Scikit-image (image processing)
* OpenCV (computer vision)
* more...

<b>Importing NumPy<br>
Convention: use np alias</b>

In [None]:
import numpy as np

<img src="https://www.oreilly.com/library/view/elegant-scipy/9781491922927/assets/elsp_0105.png" alt="data structures" width="500">

<b>NumPy basics</b>

Arrays are designed to:
* handle vectorized operations, which Python lists do not support
    * an operation or function is applied to every item in the array, rather than to the array object as a whole
* store multiple items <b>of the same data type</b>
* have 0-based indexing

* Missing values can be represented with `np.nan`
    * `np.inf` represents infinity
* The size of an array cannot be changed in place; create a new array instead
* A NumPy array occupies much less memory than the equivalent Python list of lists
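
The properties listed above can be checked directly. The following is a minimal sketch; the variable names (`values`, `mixed`, `with_missing`, `longer`) are made up for illustration and do not appear elsewhere in this notebook.

```python
import numpy as np

# Vectorized: the operation is applied to every element.
values = np.array([1, 2, 3])
doubled = values * 2             # array([2, 4, 6])

# A Python list is repeated instead.
repeated = [1, 2, 3] * 2         # [1, 2, 3, 1, 2, 3]

# One data type per array: mixing ints and floats upcasts to float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)               # float64

# np.nan marks a missing value; np.isnan finds it.
with_missing = np.array([1.0, np.nan, 3.0])
print(np.isnan(with_missing))    # [False  True False]

# Arrays have a fixed size: np.append returns a new array.
longer = np.append(values, 4)
print(values.size, longer.size)  # 3 4
```

Note that `np.nan` is a floating-point value, so an array that holds it is stored as floats.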

<b>Create Array</b><br>
https://docs.scipy.org/doc/numpy-1.13.0/user/basics.creation.html

In [None]:
# Build array from Python list
vector = np.array([1,2,3])
vector

In [None]:
# matrix with zeros (the shape is passed as a tuple)
np.zeros((3,4))

In [None]:
# matrix with 1s
np.ones((3,4), dtype=int)

In [None]:
# matrix with a constant value
np.full((3,4), 10)

In [None]:
# Create a 4x4 identity matrix
np.eye(4)        

In [None]:
# arange - numpy range
np.arange(10, 30, 2)

In [None]:
# evenly spaced numbers over a specified interval
np.linspace(1, 10, 20)

<b>Random data</b><br>
https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.random.html

In [None]:
# Create an array filled with random values
np.random.random((3,4))        

In [None]:
# Create an array filled with random values from the standard normal distribution
# (randn takes the dimensions as separate arguments, not as a tuple)
np.random.randn(3,4)

In [None]:
# Generate the same random numbers every time
# Set seed
np.random.seed(100)

```python
# Create the random state
rs = np.random.RandomState(100)
```
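
A short sketch of what seeding buys you: two generators created with the same seed produce identical numbers. The second half uses `np.random.default_rng`, the interface recommended in NumPy 1.17 and later; it is not used elsewhere in this notebook and is shown only as an alternative. The names `rs_a`, `rs_b`, `rng_a`, and `rng_b` are illustrative.

```python
import numpy as np

# Two RandomState objects with the same seed give the same sequence.
rs_a = np.random.RandomState(100)
rs_b = np.random.RandomState(100)
same_legacy = np.array_equal(rs_a.random_sample(3), rs_b.random_sample(3))
print(same_legacy)  # True

# The newer Generator interface behaves the same way.
rng_a = np.random.default_rng(100)
rng_b = np.random.default_rng(100)
same_new = np.array_equal(rng_a.normal(size=(3, 4)), rng_b.normal(size=(3, 4)))
print(same_new)     # True
```

A dedicated generator object keeps the random stream local to your code, whereas `np.random.seed` changes global state.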

<b>Basic array attributes:</b>
* shape: Size of the array along each dimension
* size: Number of elements in the array
* ndim: Number of array dimensions (len(arr.shape))
* dtype: Data type of the array elements
* T: The transpose of the array
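
As a quick illustration of these attributes, here is a sketch on a small stand-in array (`arr` is a hypothetical name, chosen so the block runs on its own):

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])  # illustrative 2x3 array

print(arr.shape)                   # (2, 3)
print(arr.size)                    # 6
print(arr.ndim)                    # 2
print(len(arr.shape) == arr.ndim)  # True
print(arr.T.shape)                 # (3, 2)
print(arr.dtype)                   # an integer type; the exact width depends on the platform
```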

In [None]:
matrix = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
matrix

In [None]:
# let's check them out 
matrix.shape

<b>Reshaping</b>

In [None]:
matrix

In [None]:
# Reshaping
matrix_reshaped = matrix.reshape(2,6)
matrix_reshaped

<b>Slicing/Indexing</b>

In [None]:
# List-like
matrix_reshaped[1][1]

In [None]:
matrix_reshaped[1,3]

In [None]:
matrix_reshaped[1,:3]

In [None]:
# iterating ... let's print the elements of matrix_reshaped
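
One possible way to iterate, sketched on a stand-in array so the block runs on its own. The name `demo` is illustrative; it holds the same values and shape as `matrix_reshaped`.

```python
import numpy as np

demo = np.arange(1, 13).reshape(2, 6)  # same shape and values as matrix_reshaped

# Iterating over a 2-D array yields one row at a time.
for row in demo:
    print(row)

# .flat iterates over the individual elements in row-major order.
elements = [int(x) for x in demo.flat]
print(elements)

# np.ndenumerate also yields the index of each element.
for (i, j), value in np.ndenumerate(demo):
    print(i, j, value)
```

Explicit Python loops over array elements are slow for large arrays, so they are best kept for inspection rather than computation.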



In [None]:
# Fun arrays
checkers_board = np.zeros((8,8),dtype=int)
checkers_board[1::2,::2] = 1
checkers_board[::2,1::2] = 1
print(checkers_board)

Create a 2d array with 1 on the border and 0 inside

<b>Performance</b>

In [None]:
# build a list and an equivalent array with one million integers
test_list = list(range(int(1e6)))
test_vector = np.array(test_list)

In [None]:
%%timeit
sum(test_list)

In [None]:
%%timeit
np.sum(test_vector)
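
`%%timeit` only works inside IPython or Jupyter. Outside a notebook, a rough equivalent can be sketched with the standard-library `timeit` module. The repetition count of 10 and the explicit `dtype=np.int64` are assumptions made here to keep the run short and the sum free of overflow; absolute timings depend on the machine.

```python
import timeit
import numpy as np

test_list = list(range(int(1e6)))
test_vector = np.array(test_list, dtype=np.int64)

# Both approaches must agree on the result.
list_total = sum(test_list)
numpy_total = int(np.sum(test_vector))
print(list_total == numpy_total)  # True

# Time 10 repetitions of each approach.
t_list = timeit.timeit(lambda: sum(test_list), number=10)
t_numpy = timeit.timeit(lambda: np.sum(test_vector), number=10)
print(f"built-in sum: {t_list:.4f} s, np.sum: {t_numpy:.4f} s")
```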

https://numpy.org/devdocs/user/quickstart.html#universal-functions

<b>Matrix operations</b>

https://www.tutorialspoint.com/matrix-manipulation-in-python<br>
Arithmetic operators on arrays apply elementwise. <br> 
A new array is created and filled with the result.
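
A minimal sketch of elementwise arithmetic, using two small arrays (`a` and `b`) invented for this example:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

total = a + b        # [[11, 22], [33, 44]]
product = a * b      # elementwise, not a matrix product: [[10, 40], [90, 160]]
squared = a ** 2     # [[1, 4], [9, 16]]
roots = np.sqrt(a)   # universal function, applied to every element

# The operands are left unchanged; each result is a new array.
print(total is a, np.shares_memory(total, a))  # False False
```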


<b>Array broadcasting</b><br>

https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html<br>
The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. <br>
Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes.
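
The constraints can be made concrete with a short sketch. Shapes are compared from the last axis backwards, and two axes are compatible when they are equal or one of them is 1. The arrays `grid`, `per_column`, and `per_row` are hypothetical names used only here.

```python
import numpy as np

grid = np.ones((4, 3), dtype=int)     # shape (4, 3)
per_column = np.array([10, 20, 30])   # shape (3,)
per_row = np.array([1, 2, 3, 4])      # shape (4,)

# (4, 3) and (3,) match on the last axis, so the row is repeated 4 times.
print((grid + per_column).shape)      # (4, 3)

# (4, 3) and (4,) do not match on the last axis, so this raises ValueError.
try:
    grid + per_row
except ValueError as err:
    print("cannot broadcast:", err)

# Reshaping to (4, 1) lets the size-1 axis stretch across the 3 columns.
print(grid + per_row.reshape(4, 1))
```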

<img src = "https://www.tutorialspoint.com/numpy/images/array.jpg" height=10/>


https://www.tutorialspoint.com/numpy/numpy_broadcasting.htm

In [None]:
matrix

In [None]:
matrix + np.array([1,2,3,4]).reshape(4,1)

In [None]:
matrix * np.array([1,2,3,4]).reshape(4,1)

In [None]:
matrix2 = np.array([[1,2,3],[5,6,7],[1,1,1],[2,2,2]])

In [None]:
matrix * matrix2

In [None]:
# matrix multiplication: (4,3) times (3,1) gives (4,1)
matrix.dot(np.array([1,2,3]).reshape(3,1))

In [None]:
# matrix multiplication with the @ operator (Python 3.5+)
matrix@(np.array([1,2,3]).reshape(3,1))
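
For a matrix product the inner dimensions must agree. A small sketch of the shape rule, using stand-in names `m` and `col` (with `m` holding the same values as `matrix`):

```python
import numpy as np

m = np.arange(1, 13).reshape(4, 3)       # same values as matrix
col = np.array([1, 2, 3]).reshape(3, 1)

# (4, 3) @ (3, 1) -> (4, 1)
result = m @ col
print(result.shape)    # (4, 1)
print(result.ravel())  # [14 32 50 68]

# For 2-D arrays, @ and .dot give the same answer.
print(np.array_equal(result, m.dot(col)))  # True
```

By contrast, `*` multiplies elementwise and follows the broadcasting rules rather than the matrix-product rule.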

In [None]:
# stacking arrays together
np.vstack((matrix,matrix2))

In [None]:
np.hstack((matrix,matrix2))

In [None]:
# splitting arrays 
np.vsplit(matrix,2)

In [None]:
np.hsplit(matrix,(2,3))

<b>Copy</b>

In [None]:
matrix

In [None]:
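# plain assignment makes no copy: both names refer to the same array object
matrix_copy = matrix
# view() is a shallow copy: a new array object that looks at the same data
matrix_copy1 = matrix.view()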
print(matrix_copy)
print(matrix_copy1)

In [None]:
# deep copy
matrix_copy2 = matrix.copy()
print(matrix_copy2)
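
The difference between the three only shows up once something is modified. A minimal sketch, with illustrative names (`original`, `alias`, `view`, `deep`):

```python
import numpy as np

original = np.array([[1, 2], [3, 4]])

alias = original          # same object, just a second name
view = original.view()    # new array object, same underlying data
deep = original.copy()    # new array object with its own data

view[0, 0] = 99           # writes through to the original

print(alias is original)                 # True
print(view is original)                  # False
print(np.shares_memory(view, original))  # True
print(original[0, 0])                    # 99
print(deep[0, 0])                        # 1
```

Basic slices such as `original[:, 0]` also return views, so writing to a slice changes the source array.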

<b>More matrix computation</b>

In [None]:
# conditional subsetting
matrix[(1 < matrix[:,0])]

In [None]:
matrix[(1 <= matrix[:,0]) & (matrix[:,0] <= 6)
       & (2 <= matrix[:,1]) & (matrix[:,1] <= 7),]
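
Conditional subsetting works by building a boolean mask with one entry per row and using it as an index. A sketch of the intermediate steps, on a stand-in `m` with the same values as `matrix`:

```python
import numpy as np

m = np.arange(1, 13).reshape(4, 3)   # same values as matrix

first_col = m[:, 0]                  # [ 1  4  7 10]
mask = (first_col >= 1) & (first_col <= 6)
print(mask)                          # [ True  True False False]

selected = m[mask]                   # keeps rows 0 and 1
print(selected)
```

The parentheses around each comparison are required because `&` binds more tightly than `<=` and `>=`.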

In [None]:
# mean of each column (axis=0 averages down the rows); use axis=1 for the mean of each row
matrix.mean(axis = 0)

In [None]:
# unique values and counts
matrix = np.random.random((3,4))
uvals, counts = np.unique(matrix, return_counts=True)
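
With random floats nearly every value is unique, so the counts are easier to read on repeated integers. A small sketch with made-up data (`labels` is a hypothetical array):

```python
import numpy as np

labels = np.array([3, 1, 2, 3, 3, 1])

# np.unique returns the sorted distinct values and how often each occurs.
uvals, counts = np.unique(labels, return_counts=True)
print(uvals)    # [1 2 3]
print(counts)   # [2 1 3]

# The most frequent value is the one with the largest count.
most_common = uvals[np.argmax(counts)]
print(most_common)  # 3
```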

https://www.w3resource.com/python-exercises/numpy/index.php


Create a matrix of 5 rows and 3 columns with numbers from 1 to 30.
Add 2 to the odd values of the array.

Normalize the values in the matrix: subtract the mean and divide by the standard deviation.

Create a random array (5 by 3) and compute: 
   * the sum of all elements 
   * the sum of the rows  
   * the sum of the columns

In [None]:
# Given a set of Gene Ontology (GO) terms and the genes associated with these terms,
# find the gene that is associated with the most GO terms

go_terms=np.array(["cellular response to nicotine",
                   "cellular response to hypoxia",
                   "cellular response to lipid"])
genes=np.array(["BAD","KCNJ11","MSX1","CASR","ZFP36L1"])

assoc_matrix = np.array([[1,1,0,1,0],[1,0,0,1,1],[1,0,0,0,0]])

print(assoc_matrix)