# NumPy 

* NumPy is a linear algebra library for Python:
    * it adds support for large, multi-dimensional arrays and matrices,
    * it introduces a large collection of high-level mathematical functions to operate on these arrays.
* NumPy is one of the main building blocks of almost all data-science-related Python libraries.
* NumPy is very performant, since large parts of it are written in C.
* NumPy has many built-in functions and capabilities.
    * We won't cover them all; instead we will focus on some of the most important aspects of NumPy: vectors, arrays, matrices, and random number generation.
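
To make the point about whole-array operations concrete, here is a minimal sketch (the variable names are illustrative and not part of the course material). It computes the same squares once with a Python list comprehension and once with a single NumPy expression; the NumPy version hands the loop to compiled code. No timing figures are claimed here.

```python
import numpy as np

# Illustrative data: the integers 0..999
values = list(range(1000))

# Pure-Python version: an explicit loop over the list elements
squares_list = [v * v for v in values]

# NumPy version: one expression applied to the whole array at once
values_arr = np.array(values)
squares_arr = values_arr * values_arr

# Both approaches produce the same numbers
print(np.array_equal(squares_arr, np.array(squares_list)))
```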

## Installing NumPy

* When using Anaconda, NumPy is already installed in the default conda environment.
* The recommended way to install NumPy via Anaconda:
    `conda install numpy`
* If using the Python Package Index (PyPI):
    `pip install numpy`

## Using NumPy

Once you've installed NumPy you can import it as a library:

In [1]:
import numpy as np

## Numpy Arrays

NumPy arrays are the main way we will use NumPy throughout the course. NumPy arrays essentially come in two flavors: vectors and matrices. Vectors are strictly 1-d arrays and matrices are 2-d (but note that a matrix can still have only one row or one column).

Let's begin our introduction by exploring how to create NumPy arrays.

# Creating NumPy arrays

## Create NumPy arrays from lists

We can create an array by directly converting a list or list of lists:

In [2]:
my_list = [1,2,3]
my_list

[1, 2, 3]

In [3]:
type(np.array(my_list))

numpy.ndarray

In [6]:
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
print(my_matrix)
type(my_matrix)

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]


list

In [11]:
np.array(my_matrix)
print(np.array(my_matrix))
type(np.array(my_matrix))

[[1 2 3]
 [4 5 6]
 [7 8 9]]


numpy.ndarray
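
As a small sketch of the difference between the two conversions above, the `ndim` and `shape` attributes show that a flat list becomes a 1-d array while a list of lists becomes a 2-d array. The names `vector` and `matrix` are chosen here for illustration only.

```python
import numpy as np

vector = np.array([1, 2, 3])                           # from a flat list
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])   # from a list of lists

print(vector.ndim, vector.shape)   # 1 (3,)
print(matrix.ndim, matrix.shape)   # 2 (3, 3)
```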

## Create NumPy arrays using built-in methods

There are many built-in ways to generate arrays.

### `arange` function

Return evenly spaced values within a given interval.

In [12]:
np.arange(0,10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [13]:
np.arange(0,11,2)

array([ 0,  2,  4,  6,  8, 10])

### `zeros` and `ones` functions

Generate arrays of zeros or ones

In [14]:
np.zeros(3)

array([0., 0., 0.])

In [15]:
np.zeros((5,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [16]:
np.ones(3)

array([1., 1., 1.])

In [17]:
np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
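
The outputs above are floating-point by default. As a hedged sketch, the example below assumes you want integers or a fill value other than 0 or 1; it uses the `dtype` argument of `np.zeros` and the separate `np.full` function, neither of which is covered in the text above.

```python
import numpy as np

# Integer zeros instead of the float default
int_zeros = np.zeros((2, 3), dtype=int)

# np.full fills an array of the given shape with an arbitrary value
sevens = np.full((2, 2), 7)

print(int_zeros)
print(sevens)
```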

### `linspace` function
Return evenly spaced numbers over a specified interval.

In [18]:
np.linspace(0,10,3)

array([ 0.,  5., 10.])

In [19]:
np.linspace(0,10,50)

array([ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653,
        1.02040816,  1.2244898 ,  1.42857143,  1.63265306,  1.83673469,
        2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286,
        3.06122449,  3.26530612,  3.46938776,  3.67346939,  3.87755102,
        4.08163265,  4.28571429,  4.48979592,  4.69387755,  4.89795918,
        5.10204082,  5.30612245,  5.51020408,  5.71428571,  5.91836735,
        6.12244898,  6.32653061,  6.53061224,  6.73469388,  6.93877551,
        7.14285714,  7.34693878,  7.55102041,  7.75510204,  7.95918367,
        8.16326531,  8.36734694,  8.57142857,  8.7755102 ,  8.97959184,
        9.18367347,  9.3877551 ,  9.59183673,  9.79591837, 10.        ])
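
`arange` and `linspace` are easy to confuse because both take three positional arguments. The sketch below, with illustrative variable names, shows the difference: for `arange` the third argument is a step and the stop value is excluded, while for `linspace` it is the number of points and the stop value is included.

```python
import numpy as np

# arange: the third argument is the step; the stop value is excluded
by_step = np.arange(0, 10, 5)

# linspace: the third argument is the number of points; the stop value is included
by_count = np.linspace(0, 10, 3)

print(by_step)    # [0 5]
print(by_count)   # [ 0.  5. 10.]
```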

In [115]:
np.linspace(0,1,100).reshape(10,10)

array([[0.        , 0.01010101, 0.02020202, 0.03030303, 0.04040404,
        0.05050505, 0.06060606, 0.07070707, 0.08080808, 0.09090909],
       [0.1010101 , 0.11111111, 0.12121212, 0.13131313, 0.14141414,
        0.15151515, 0.16161616, 0.17171717, 0.18181818, 0.19191919],
       [0.2020202 , 0.21212121, 0.22222222, 0.23232323, 0.24242424,
        0.25252525, 0.26262626, 0.27272727, 0.28282828, 0.29292929],
       [0.3030303 , 0.31313131, 0.32323232, 0.33333333, 0.34343434,
        0.35353535, 0.36363636, 0.37373737, 0.38383838, 0.39393939],
       [0.4040404 , 0.41414141, 0.42424242, 0.43434343, 0.44444444,
        0.45454545, 0.46464646, 0.47474747, 0.48484848, 0.49494949],
       [0.50505051, 0.51515152, 0.52525253, 0.53535354, 0.54545455,
        0.55555556, 0.56565657, 0.57575758, 0.58585859, 0.5959596 ],
       [0.60606061, 0.61616162, 0.62626263, 0.63636364, 0.64646465,
        0.65656566, 0.66666667, 0.67676768, 0.68686869, 0.6969697 ],
       [0.70707071, 0.71717172, 0.7272727

### `eye` function

Creates an identity matrix

In [20]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
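
What makes the identity matrix special is that the matrix product of any square matrix with it returns that matrix unchanged. A minimal sketch, using an illustrative 3x3 matrix `a`:

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)   # illustrative 3x3 matrix
identity = np.eye(3)

# Matrix product (the @ operator), not elementwise multiplication
product = a @ identity

print(np.array_equal(product, a))   # True
```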

## Generating random NumPy arrays

NumPy also has many ways to create arrays of random numbers:

### `rand` function
Create an array of the given shape and populate it with random samples from a uniform distribution over `[0, 1)`.

In [21]:
np.random.rand(2)

array([0.19753774, 0.42526137])

In [102]:
np.random.rand(1)

array([0.96318875])

In [22]:
np.random.rand(5,5)

array([[0.26699749, 0.60998486, 0.61157432, 0.22596143, 0.59354401],
       [0.23131373, 0.24518267, 0.12671221, 0.05234182, 0.40253718],
       [0.19366203, 0.98081963, 0.30682561, 0.85164848, 0.70267006],
       [0.08376473, 0.10980144, 0.60121209, 0.57546465, 0.24226327],
       [0.71075103, 0.35332253, 0.90386422, 0.52430059, 0.53766047]])

### `randn` function

Return a sample (or samples) from the "standard normal" distribution (mean 0, standard deviation 1), unlike `rand`, which samples from a uniform distribution:

In [23]:
np.random.randn(2)


array([ 0.18583303, -2.27223955])

In [24]:
np.random.randn(5,5)

array([[ 1.02983794, -0.17483291, -0.3330976 , -1.40513582,  0.13625993],
       [-0.21571082,  0.53145251,  1.04595048,  0.11112512,  2.05793436],
       [ 1.77662047, -0.2998706 , -1.20246956,  0.68372075,  0.55748481],
       [ 0.43002602, -1.79900273,  0.90646321,  0.12260158, -0.87817465],
       [ 0.84988277,  1.43570742, -0.61916718, -1.40878353, -0.05160212]])

In [103]:
np.random.randn(25)

array([ 1.42186437,  1.237772  ,  0.19032927, -0.18296385, -0.20181046,
       -0.3258099 , -0.03811545,  0.86585414, -0.07763089, -0.23793534,
        0.92754029, -1.40320085,  0.75321691,  0.53446371,  0.74055328,
       -0.91771791,  0.34148737,  0.56046823, -0.05433655, -0.41981239,
        0.10615221,  1.13634252,  1.26923759,  1.11351006,  0.46438877])

### `randint` function
Return random integers from `low` (inclusive) to `high` (exclusive).

In [25]:
np.random.randint(1,100)

60

In [26]:
np.random.randint(1,100,10)

array([66, 85,  9,  8, 73, 21, 51, 26,  6, 70])
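
The random outputs shown above will differ on every run. If you need reproducible numbers, one option is to fix the seed. The sketch below is an assumption about how you might do this, not part of the course material: it shows the legacy global seed and the newer `Generator` interface available from NumPy 1.17. The seed value 42 is arbitrary.

```python
import numpy as np

# Legacy global seed: affects np.random.rand, randn, randint, ...
np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)          # re-seeding restarts the same sequence
second = np.random.rand(3)

print(np.array_equal(first, second))   # True

# Newer Generator interface with its own local state
rng = np.random.default_rng(42)
draws = rng.integers(1, 100, size=10)   # low inclusive, high exclusive
print(draws.shape)                      # (10,)
```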

## Attributes and Methods of NumPy array

NumPy arrays are implemented as the `ndarray` class

```python
class numpy.ndarray(shape, dtype=float, buffer=None, offset=0, strides=None, order=None)
```

which possesses a number of attributes and methods.

Also see the [documentation](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html).

Let's generate two NumPy array objects:

In [27]:
arr = np.arange(25)
ranarr = np.random.randint(0,50,10)

In [28]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

In [29]:
type(arr)

numpy.ndarray

In [30]:
ranarr

array([35,  4, 44, 30, 35, 17, 10, 35, 49, 47])

### `ndarray.reshape` method

`ndarray.reshape` returns an array containing the same data with new shape.

In [31]:
arr.reshape(5,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])
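
Two details of `reshape` are worth a quick sketch: the new shape must account for exactly the same number of elements, and one dimension may be given as `-1` so that NumPy infers it. The example reuses the 25-element `arr` from above.

```python
import numpy as np

arr = np.arange(25)

# -1 asks NumPy to infer that dimension from the array size
inferred = arr.reshape(5, -1)
print(inferred.shape)   # (5, 5)

# 25 elements cannot fill a 4x6 grid, so reshape raises ValueError
try:
    arr.reshape(4, 6)
except ValueError as err:
    print("reshape failed:", err)

# reshape returns a new array object; the original keeps its shape
print(arr.shape)        # (25,)
```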

### `max`, `min`, `argmax`, `argmin` methods

* `ndarray.max` and `ndarray.min` are useful methods for finding max or min values within the array
* corresponding index locations can be found using `ndarray.argmin` and `ndarray.argmax`

In [32]:
ranarr

array([35,  4, 44, 30, 35, 17, 10, 35, 49, 47])

In [33]:
ranarr.max()

49

In [34]:
ranarr.argmax()

8

In [35]:
ranarr.min()

4

In [36]:
ranarr.argmin()

1
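
For a 2-d array, `argmax` without arguments returns an index into the flattened array. The following sketch, using an illustrative array `grid`, shows how `np.unravel_index` converts that flat index back to a (row, column) pair, and how the `axis` argument gives per-column or per-row results.

```python
import numpy as np

grid = np.array([[3, 9, 1],
                 [7, 2, 8]])   # illustrative values

flat_index = grid.argmax()                            # index into the flattened array
row, col = np.unravel_index(flat_index, grid.shape)   # back to (row, column)

print(flat_index)            # 1
print(row, col)              # 0 1
print(grid.max(axis=0))      # column-wise maxima: [7 9 8]
print(grid.argmax(axis=1))   # position of the maximum in each row: [1 2]
```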

### `ndarray.shape` attribute

`ndarray.shape` is an attribute that holds the shape of the array as a tuple.

In [37]:
# Vector
arr.shape

(25,)

In [38]:
# Notice the two sets of brackets
arr.reshape(1,25)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [39]:
arr.reshape(1,25).shape

(1, 25)

In [40]:
arr.reshape(25,1)

array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12],
       [13],
       [14],
       [15],
       [16],
       [17],
       [18],
       [19],
       [20],
       [21],
       [22],
       [23],
       [24]])

In [41]:
arr.reshape(25,1).shape

(25, 1)

### `ndarray.dtype` attribute

You can also grab the data type of the object in the array:

In [42]:
arr.dtype

dtype('int32')
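
The default integer type can differ between platforms, so you may see `int64` instead of `int32`. As a sketch of how to control this, the example below requests a data type explicitly and converts between types with `astype`. The variable names are illustrative.

```python
import numpy as np

ints = np.arange(5, dtype=np.int64)   # request a specific integer type
floats = ints.astype(np.float64)      # astype returns a converted copy

print(ints.dtype)     # int64
print(floats.dtype)   # float64
print(floats)         # [0. 1. 2. 3. 4.]
```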

# NumPy Indexing and Selection

Next we will discuss how to select elements or groups of elements from an array.

Let's create an array to play with

In [43]:
arr = np.arange(0,11)
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

## Bracket Indexing and Selection
The simplest way to pick one or several elements of an array looks very similar to indexing Python lists:

In [44]:
#Get a value at an index
arr[8]

8

In [45]:
#Get values in a range
arr[1:5]

array([1, 2, 3, 4])

In [46]:
#Get values in a range
arr[0:5]

array([0, 1, 2, 3, 4])
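
One difference from Python lists is not visible in the outputs above: a slice of a NumPy array is a view onto the same data, not a copy. The sketch below uses illustrative names (`shared`, `original`) to show that writing through a slice changes the source array, and that `copy()` avoids this.

```python
import numpy as np

shared = np.arange(0, 11)
view = shared[0:5]        # a slice is a view onto the same data
view[:] = 99              # writing through the view changes `shared` too
print(shared)             # [99 99 99 99 99  5  6  7  8  9 10]

original = np.arange(0, 11)
independent = original[0:5].copy()   # copy() gives independent data
independent[:] = 99
print(original)           # [ 0  1  2  3  4  5  6  7  8  9 10]
```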

## Indexing a 2D array (matrices)

The general format is **arr_2d[row][col]** or **arr_2d[row,col]**. I recommend usually using the comma notation for clarity.

Let's create a new 2-dimensional array to play with

In [47]:
arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [48]:
arr_2d.shape

(3, 3)

Indexing a row

In [49]:
arr_2d[1]

array([20, 25, 30])

Getting individual element value

In [50]:
# Format is arr_2d[row][col] or arr_2d[row,col]

# Getting individual element value
arr_2d[1][0]

20

In [51]:
arr_2d[1,0]

20

Slicing follows the same principles as for Python lists

In [52]:
# original array
arr_2d

array([[ 5, 10, 15],
       [20, 25, 30],
       [35, 40, 45]])

In [53]:
# 2D array slicing

#Shape (2,2) from top right corner
arr_2d[:2,1:]

array([[10, 15],
       [25, 30]])

In [54]:
#Bottom row
arr_2d[2]

array([35, 40, 45])

In [55]:
#Last column (index 2)
arr_2d[:,2]

array([15, 30, 45])
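
The two notations `arr_2d[row][col]` and `arr_2d[row,col]` agree for single elements but not for slices, which is one reason to prefer the comma form. A minimal sketch using the same `arr_2d` as above:

```python
import numpy as np

arr_2d = np.array([[5, 10, 15], [20, 25, 30], [35, 40, 45]])

# For a single element the two notations agree
print(arr_2d[1][0] == arr_2d[1, 0])   # True

# With slices they differ: arr_2d[:] is the whole array, so [:][1] is row 1
print(arr_2d[:][1])    # [20 25 30]

# The comma form slices rows and columns in one step: this is column 1
print(arr_2d[:, 1])    # [10 25 40]
```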

### Fancy Indexing

Fancy indexing allows you to select entire rows or columns out of order.

To show this, let's quickly build out a numpy array:

In [58]:
#Set up matrix
arr2d = np.zeros((10,10))
arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [112]:
#Length of array
arr_length = arr2d.shape[1]
arr_length


10

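In [113]:
# The third argument of linspace is the number of samples, not a step size.
# Passing a float such as 0.01 is a mistake: older NumPy versions warn and
# return an empty array, while newer versions raise a TypeError.
np.linspace(0,1,0.01)

array([], dtype=float64)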

In [60]:
#Set up array: fill row i with the value i

for i in range(arr_length):
    arr2d[i] = i

arr2d

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3., 3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [5., 5., 5., 5., 5., 5., 5., 5., 5., 5.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.],
       [9., 9., 9., 9., 9., 9., 9., 9., 9., 9.]])

Let's select some rows from the `arr2d` array in an arbitrary order

In [61]:
arr2d[[2,4,6,8]]

array([[2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [8., 8., 8., 8., 8., 8., 8., 8., 8., 8.]])

In [62]:
#Allows in any order
arr2d[[6,4,2,7]]

array([[6., 6., 6., 6., 6., 6., 6., 6., 6., 6.],
       [4., 4., 4., 4., 4., 4., 4., 4., 4., 4.],
       [2., 2., 2., 2., 2., 2., 2., 2., 2., 2.],
       [7., 7., 7., 7., 7., 7., 7., 7., 7., 7.]])

It is also possible to select every i-th element of an array using `::`

In [63]:
arr2d[2::3, ::2]

array([[2., 2., 2., 2., 2.],
       [5., 5., 5., 5., 5.],
       [8., 8., 8., 8., 8.]])
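
The examples above select rows. As a hedged sketch of the column case, the example below uses a small illustrative array `grid` whose columns differ (unlike `arr2d`, where every row is constant). It also shows that passing two index lists selects individual elements pairwise.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

cols = grid[:, [3, 0]]          # columns 3 and 0, in that order
pairs = grid[[0, 2], [1, 3]]    # elements at (0, 1) and (2, 3)

print(cols)
print(pairs)   # [ 1 11]
```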

## A picture is worth a thousand words...

<img src= './images/numpy_indexing.png' width=500/>

## Boolean indexing

It is also possible to perform selection of `ndarray` elements using *boolean indexing*.

Again, let's create a new array

In [64]:
arr = np.arange(1,11)
arr

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

It is possible to compare the elements of an array to a scalar.

NumPy employs a process called [broadcasting](https://docs.scipy.org/doc/numpy/user/theory.broadcasting.html#array-broadcasting-in-numpy) that allows arithmetic operations on arrays with different shapes (*out of scope of this course*).

In [65]:
arr > 4

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

We can create a boolean array and use it to filter the original array: elements at positions where the boolean array is `False` are dropped, and those where it is `True` are kept.

In [66]:
bool_arr = arr>4

In [67]:
bool_arr

array([False, False, False, False,  True,  True,  True,  True,  True,
        True])

In [68]:
arr[bool_arr]

array([ 5,  6,  7,  8,  9, 10])

Putting it all together:

In [69]:
arr[arr>4]

array([ 5,  6,  7,  8,  9, 10])
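
Boolean masks can also be combined. The sketch below, with illustrative variable names, uses the elementwise operators `&`, `|` and `~`; each comparison needs its own parentheses, and the Python keywords `and` / `or` do not work elementwise on arrays.

```python
import numpy as np

arr = np.arange(1, 11)

middle = arr[(arr > 4) & (arr < 9)]     # both conditions hold
edges = arr[(arr <= 2) | (arr >= 9)]    # at least one condition holds
odd = arr[~(arr % 2 == 0)]              # negation of "is even"

print(middle)   # [5 6 7 8]
print(edges)    # [ 1  2  9 10]
print(odd)      # [1 3 5 7 9]
```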

## Arithmetic operations on NumPy arrays

You can easily perform array with array arithmetic, or scalar with array arithmetic. Let's see some examples:

In [70]:
import numpy as np

arr = np.arange(0,10)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [71]:
arr + arr

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [72]:
arr * arr

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81])

In [73]:
arr - arr

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [74]:
# Division by zero gives a warning, not an error!
# 0/0 is replaced with nan
arr/arr

array([nan,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.])

In [75]:
# Also a warning rather than an error; 1/0 gives infinity
1/arr

array([       inf, 1.        , 0.5       , 0.33333333, 0.25      ,
       0.2       , 0.16666667, 0.14285714, 0.125     , 0.11111111])
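
If the warnings above are unwanted, there are a couple of ways to handle the zero explicitly. This is a sketch under the assumption that you either accept `inf` in the result or want a placeholder value (0.0 here) where the denominator is zero; the variable names are illustrative.

```python
import numpy as np

arr = np.arange(0, 10)

# Option 1: silence the warning locally and accept inf in the result
with np.errstate(divide="ignore"):
    with_inf = 1 / arr

# Option 2: divide only where the denominator is non-zero;
# skipped positions keep the value already stored in `out` (0.0 here)
safe = np.divide(1.0, arr, out=np.zeros(arr.shape), where=arr != 0)

print(np.isinf(with_inf[0]))   # True
print(safe[:3])                # [0.  1.  0.5]
```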

In [76]:
arr**3

array([  0,   1,   8,  27,  64, 125, 216, 343, 512, 729], dtype=int32)

## Universal Array Functions

NumPy comes with many [universal array functions](http://docs.scipy.org/doc/numpy/reference/ufuncs.html), which are mathematical operations applied element by element across an array. Let's show some common ones:

In [77]:
#Taking Square Roots
np.sqrt(arr)

array([0.        , 1.        , 1.41421356, 1.73205081, 2.        ,
       2.23606798, 2.44948974, 2.64575131, 2.82842712, 3.        ])

In [78]:
#Calculating the exponential (e^x)
np.exp(arr)

array([1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01,
       5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03,
       2.98095799e+03, 8.10308393e+03])

In [79]:
np.max(arr) #same as arr.max()

9

In [80]:
np.sin(arr)

array([ 0.        ,  0.84147098,  0.90929743,  0.14112001, -0.7568025 ,
       -0.95892427, -0.2794155 ,  0.6569866 ,  0.98935825,  0.41211849])

In [81]:
np.log(arr)

array([      -inf, 0.        , 0.69314718, 1.09861229, 1.38629436,
       1.60943791, 1.79175947, 1.94591015, 2.07944154, 2.19722458])
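
The functions above work the same way on arrays of any shape. As a minimal sketch with an illustrative 3x3 array `grid`, the example below shows that the shape is preserved, that the `+` operator corresponds to the `np.add` universal function, and that reductions such as `sum` accept an `axis` argument.

```python
import numpy as np

grid = np.arange(0, 9).reshape(3, 3)

roots = np.sqrt(grid)     # applied elementwise; the shape is preserved
print(roots.shape)        # (3, 3)

# Arithmetic operators are shorthand for the corresponding ufuncs
print(np.array_equal(np.add(grid, grid), grid + grid))   # True

# Reductions accept an axis argument
print(grid.sum(axis=0))   # column sums: [ 9 12 15]
print(grid.sum(axis=1))   # row sums: [ 3 12 21]
```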

In [86]:
np.ones(10)*5

array([5., 5., 5., 5., 5., 5., 5., 5., 5., 5.])

In [89]:
np.arange(10,51)

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
       27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
       44, 45, 46, 47, 48, 49, 50])

In [90]:
np.arange(10,51,2)

array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42,
       44, 46, 48, 50])

In [92]:
np.arange(0,9).reshape(3,3)

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])