##### <img src="../SDSS-Logo.png" style="display:inline; width:500px" />


## Learning Objectives
### Understand Numpy's underlying star, the Ndarray
### Vectorized functions on Ndarrays



### By convention, the Numpy library is usually imported as np

In [1]:
import numpy as np

### The Ndarray and Numpy form the basis for many other important packages in data science such as pandas, scipy, scikit-learn, and tensorflow.
### The primary benefit of Numpy comes from the many numerical operations on Ndarrays that are provided by the Numpy API.
### These numerical operations are highly optimized and backed by fast C implementations; they are also called vectorized operations.
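
As a minimal sketch of what "vectorized" means, the cell below computes the same squares twice: once with an explicit Python loop and once with a single array expression. The variable names are illustrative and not part of the lesson; no timing is measured here, only that both routes agree.

```python
import numpy as np

# Illustrative data: the integers 0..9 as a Python list and as an ndarray.
values_list = list(range(10))
values_array = np.arange(10)

# Pure-Python approach: visit the elements one by one.
squares_loop = [v * v for v in values_list]

# Vectorized approach: one expression applied to the whole array at once.
squares_vectorized = values_array * values_array

print(squares_loop)
print(squares_vectorized)
```

The array expression contains no Python-level loop; the iteration happens inside Numpy's compiled code.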

### Some good references for this lesson are [PLYMI](https://www.pythonlikeyoumeanit.com/module_3.html), [Vanderplas's book](https://jakevdp.github.io/PythonDataScienceHandbook/02.00-introduction-to-numpy.html) and the [Numpy docs](https://numpy.org/doc/stable/reference/arrays.ndarray.html).

In [2]:

# imports needed for setup
import comp116
import pickle
import matplotlib.pyplot as plt
import numpy as np


## The Ndarray

* Fundamentally, the Ndarray is a fixed-size collection of elements of the same type, organized as an N-dimensional array.

* An Ndarray is characterized by two aspects: 
    * a `shape`, which is a tuple of integers giving the size of the array along each of its N dimensions
    * a `dtype`, which is the data type of each of the elements of the array.
    
* Numpy provides many ways of creating Ndarrays

In [3]:
my_array = np.array([[1, 2, 3],[9, 10, 11]], np.float64)
print(my_array)

[[ 1.  2.  3.]
 [ 9. 10. 11.]]


### In the above cell, we started with nested lists of numbers and created a numpy array that has 2 rows and 3 columns.
* The `print` output of the ndarray looks like a list of lists - but notice that there are no `,`  between list elements.
* Below is `print()` applied to a list of lists. Notice the commas in between the numbers.

In [4]:
print([[1, 2, 3],[9, 10, 11]])

[[1, 2, 3], [9, 10, 11]]


### You can check for the array's type, shape and data type

In [5]:
print(f"Type of my_array {type(my_array)}")
print(f"Shape of my_array {my_array.shape}")
print(f"Data type of my_array {my_array.dtype}")

Type of my_array <class 'numpy.ndarray'>
Shape of my_array (2, 3)
Data type of my_array float64
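
The sketch below looks a little further at `dtype` and `shape`. It assumes nothing beyond Numpy itself; the array names are illustrative. Note that the default integer width depends on the platform, so only the float results are stated exactly.

```python
import numpy as np

ints = np.array([1, 2, 3])           # all integers -> an integer dtype (width is platform dependent)
mixed = np.array([1, 2.5, 3])        # a single float makes every element float64
as_float = ints.astype(np.float64)   # astype returns a converted copy

print(ints.dtype, mixed.dtype, as_float.dtype)

# shape is always a tuple, even for a 1-d array; ndim and size follow from it.
cube = np.zeros((2, 3, 4))
print(ints.shape)                        # (3,)
print(cube.shape, cube.ndim, cube.size)  # (2, 3, 4) 3 24
```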


### Numpy provides a number of ways of creating ndarrays; see the [array creation guide](https://numpy.org/doc/stable/user/basics.creation.html).
### The previous example created an ndarray from a Python list, but other Python sequences can be used.
### A few other useful functions for creating ndarrays are shown below.


In [6]:
# Create an array of zeros.
print(np.zeros((2, 3, 2)))
print()
# Create an array of ones.
print(np.ones((3, 3)))
print()
# Create an array of normally distributed values with mean 0 and sd 1
print(np.random.normal(0, 1, (2,2)))

[[[0. 0.]
  [0. 0.]
  [0. 0.]]

 [[0. 0.]
  [0. 0.]
  [0. 0.]]]

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]

[[-0.66791465 -0.65114247]
 [-0.05153187 -0.27694694]]
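
A few more creation routes, sketched under the assumption that only Numpy is available: arrays built from a tuple and from a `range`, plus `np.arange`, `np.linspace`, and `np.full`. The variable names are illustrative.

```python
import numpy as np

# From a tuple and from a range object, rather than from a list.
from_tuple = np.array((1, 2, 3))
from_range = np.array(range(4))

# np.arange(start, stop, step) excludes the stop value.
evens = np.arange(0, 10, 2)

# np.linspace(start, stop, num) includes both endpoints.
grid = np.linspace(0.0, 1.0, 5)

# np.full fills the given shape with a constant value.
sevens = np.full((2, 2), 7)

print(from_tuple, from_range, evens, grid, sevens, sep="\n")
```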


### Reshaping an ndarray is another useful operation

In [7]:
np.reshape(list(range(10)), (2, 5))

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
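
Reshaping can also be written as a method call, and one dimension may be given as `-1` so that Numpy infers it. The sketch below uses illustrative names and also shows what happens when the requested shape cannot hold the data.

```python
import numpy as np

a = np.arange(10)

# The reshape method gives the same result as np.reshape(a, (2, 5)).
b = a.reshape(2, 5)

# -1 asks Numpy to infer that dimension from the total number of elements.
c = a.reshape(5, -1)

print(b.shape)  # (2, 5)
print(c.shape)  # (5, 2)

# The new shape must hold exactly the same number of elements.
try:
    a.reshape(3, 4)
except ValueError as err:
    print(f"reshape failed: {err}")
```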

### Indexing and slicing work much as they do for strings and lists, with one index or slice per dimension separated by commas.

In [8]:
x = np.reshape(list(range(12)), (3, 4))
print(f"{x} \n")
print(f"{x[1,2]} \n")
print(f"{x[1:3, 2:4]}\n")
               

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]] 

6 

[[ 6  7]
 [10 11]]
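
One difference from lists is worth a short sketch: a basic slice of an ndarray is a view onto the same data, not a copy. The example rebuilds the same 3 x 4 array with `np.arange`; the names `block` and `independent` are illustrative.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

print(x[1])        # the second row: 4, 5, 6, 7
print(x[:, 2])     # the third column: 2, 6, 10
print(x[-1, -1])   # negative indices count from the end: 11

# Unlike a list slice, a basic ndarray slice is a view onto the same data.
block = x[1:3, 2:4]
block[0, 0] = 100  # writes through to x[1, 2]
print(x[1, 2])     # 100

# Use .copy() when an independent array is needed.
independent = x[1:3, 2:4].copy()
independent[0, 0] = -1
print(x[1, 2])     # still 100
```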



### Numpy has vectorized arithmetic operations on ndarrays.
### The most common of these can be used with the regular operator symbols, which are overloaded.

In [9]:
x1 = np.random.randint(1, 15, 4)
x2 = np.arange(0, 0.4, 0.1)
print(f"x1 = {x1}\n")
print(f"x2 = {x2}\n")

print(f"{np.add(x1, x2)}\n")
print(f"{x1 + x2}\n")

x1 = [13  5  1 12]

x2 = [0.  0.1 0.2 0.3]

[13.   5.1  1.2 12.3]

[13.   5.1  1.2 12.3]
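
The same pairing of operator and function holds for other arithmetic and comparison operators. A small sketch with fixed, illustrative values so the results are reproducible:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print(a * b)     # same as np.multiply(a, b): 10, 40, 90, 160
print(b - a)     # same as np.subtract(b, a): 9, 18, 27, 36
print(a ** 2)    # same as np.power(a, 2): 1, 4, 9, 16
print(a > 2)     # same as np.greater(a, 2): False, False, True, True
print(a * 0.5)   # a scalar is applied to every element: 0.5, 1.0, 1.5, 2.0
```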



### Numpy has many vectorized operations that compute along one or more axes of the ndarray, reducing each of those axes to a single value. With no axis given, they reduce over all elements.

In [10]:
x3 = np.random.randint(0, 20, (3,4))
print(x3)

[[16  5  6 10]
 [ 0  0 12 16]
 [ 8 15 19 17]]


In [11]:
# Compute the sum of all elements
np.sum(x3)

np.int64(124)

In [12]:
# Find the max across all elements
np.amax(x3)

np.int32(19)

In [13]:
# Find the index of the maximum value in the flattened (1-d) array
np.argmax(x3)

np.int64(10)

In [14]:
# Get the 2-d index instead of the 1-d index
np.unravel_index(np.argmax(x3), x3.shape)

(np.int64(2), np.int64(2))
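
The conversion from the flat index 10 to the position (2, 2) can be checked by hand. The sketch below assumes Numpy's default row-major layout and reuses the shape and flat index from the cells above; `row` and `col` are illustrative names.

```python
import numpy as np

shape = (3, 4)      # the shape of x3
flat_index = 10     # the value returned by np.argmax(x3)

# In row-major order each row holds shape[1] elements, so integer division
# gives the row and the remainder gives the column.
row, col = divmod(flat_index, shape[1])
print(row, col)  # 2 2

# np.unravel_index performs the same conversion for any number of dimensions.
print(np.unravel_index(flat_index, shape))
```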

## Sometimes you want to apply the operation along just one of the dimensions

## 2D numpy array and axis
<img src="Unit-5-5-Fig1.png" width="300" style="float: left" />

## Sum of 2D numpy array along axis = 0
<img src="Unit-5-5-Fig2.png" width="300" style="float: left" />

In [15]:
np.sum(x3, axis = 0)

array([24, 20, 37, 43])

## Sum of 2D numpy array along axis = 1
<img src="Unit-5-5-Fig3.png" width="300" style="float: left" />

In [16]:
np.sum(x3, axis = 1)

array([37, 28, 59])
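
The `axis` argument works the same way for other reductions. The sketch below writes out the values of `x3` printed earlier, so the results are reproducible, and applies `np.max`, `np.argmax`, and `np.mean` along an axis; `row_sums` is an illustrative name.

```python
import numpy as np

# The values of x3 printed earlier, written out explicitly.
x3 = np.array([[16, 5, 6, 10],
               [0, 0, 12, 16],
               [8, 15, 19, 17]])

print(np.max(x3, axis=0))      # largest value in each column: 16, 15, 19, 17
print(np.argmax(x3, axis=1))   # column of the largest value in each row: 0, 3, 2
print(np.mean(x3, axis=1))     # mean of each row: 9.25, 7.0, 14.75

# keepdims=True keeps the reduced axis with length 1 instead of dropping it.
row_sums = np.sum(x3, axis=1, keepdims=True)
print(row_sums.shape)          # (3, 1)
```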

### See the references for many more examples of numpy functions.