# Introduction to Python NumPy Arrays

You can watch the video that accompanies these notes [here](https://www.youtube.com/watch?v=fPI2SIKMGOI).

## What is NumPy

NumPy is a library that makes it easier to store and work with numerical data in arrays. We can import the NumPy module with the following:

In [1]:
import numpy as np

## Basic NumPy Commands

We can create NumPy N-D arrays with the following commands:

In [6]:
x = list(range(10))
x = np.array(x)
print(x)
print(type(x)) # This is a 1D NumPy array

print()

x = [[1,2,3],[4,5,6]]
print(x)
x = np.array(x)
print(x) # This is a 2D NumPy array

[0 1 2 3 4 5 6 7 8 9]
<class 'numpy.ndarray'>

[[1, 2, 3], [4, 5, 6]]
[[1 2 3]
 [4 5 6]]


We can get the dimension of our array and the shape of our array (rows x columns) using the following commands:

In [7]:
print(x.ndim)
print(x.shape)

2
(2, 3)


We can also find the type of the values in our NumPy array by reading the `dtype` attribute.

We can also convert a NumPy array to another data type using the `astype` method.

In [8]:
print(x.dtype)

print()

x_floats = x.astype(np.float64)
print(x_floats)
print(x_floats.dtype)

int64

[[1. 2. 3.]
 [4. 5. 6.]]
float64


Note that the data types shown here (such as `int64` and `float64`) are NumPy's own data types, not Python's built-in `int` and `float`.
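
Converting in the other direction is worth a quick look too. The following is a minimal sketch (the variable names `floats` and `ints` are illustrative, not from the notes above) showing that `astype` drops the fractional part when going from floats to integers, and that it returns a new array rather than changing the original:

```python
import numpy as np

floats = np.array([1.7, -2.3, 4.0])

# Float -> integer conversion truncates toward zero; it does not round
ints = floats.astype(np.int64)
print(ints)          # [ 1 -2  4]
print(ints.dtype)    # int64

# astype returns a new array, so the original keeps its dtype
print(floats.dtype)  # float64
```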

With the `arange` function we can create NumPy arrays using arguments similar to those of Python's `range` function (start, stop, step):

In [9]:
x = np.arange(2,12,2)
print(x)

[ 2  4  6  8 10]
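
To make the parallel with `range` concrete, here is a small sketch comparing the two. The variable names are illustrative. One difference worth noting is that `arange` also accepts a non-integer step, which `range` does not:

```python
import numpy as np

# A single argument is treated as the stop value, just like range(5)
first_five = np.arange(5)
print(first_five)             # [0 1 2 3 4]

# A negative step counts down, matching range(10, 0, -2)
countdown = np.arange(10, 0, -2)
print(countdown)              # [10  8  6  4  2]
print(list(range(10, 0, -2))) # [10, 8, 6, 4, 2]

# Unlike range, arange accepts a float step (values 0, 0.25, 0.5, 0.75)
quarters = np.arange(0, 1, 0.25)
print(quarters)
```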


Here are some other commands to create NumPy arrays:

In [10]:
x = np.ones((4,5))
print(x)

x = np.zeros(10)
print(x)

x = np.full(10,5.5)
print(x)

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5 5.5]


## Indexing NumPy Arrays and Vectorization

For 1D NumPy arrays, indexing works the same as for normal lists. For 2D NumPy arrays, you can use typical 2D list notation (`x[row][col]`) or put both indices in a single pair of brackets, separated by a comma (`x[row, col]`).

In [11]:
x = np.ones((4,5))
print(x[0][0], x[0,0])

1.0 1.0
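
Because every element of `np.ones` is identical, the example above cannot show which element is being picked. The sketch below uses a small array with distinct values (the name `grid` is illustrative) so the row and column positions are visible:

```python
import numpy as np

grid = np.array([[0, 1, 2, 3],
                 [4, 5, 6, 7],
                 [8, 9, 10, 11]])

# Both notations select row 1, column 2
print(grid[1][2])    # 6
print(grid[1, 2])    # 6

# Negative indices count from the end, as with lists
print(grid[-1, -1])  # 11

# A single index on a 2D array returns a whole row
print(grid[1])       # [4 5 6 7]
```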


**Vectorization** is applying an operation to all the elements of an array without having to write a loop. For example, say we want to add two lists element by element. Here is the implementation using normal lists:

In [12]:
x = list(range(10))
y = list(range(10,20))
print(x)
print(y)
z = []

for i in range(len(x)):
    z.append(x[i] + y[i])
print(z)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[10, 12, 14, 16, 18, 20, 22, 24, 26, 28]


However, if we use NumPy arrays and vectorization, we get the following:

In [13]:
x = np.arange(10)
y = np.arange(10,20)
print(x)
print(y)
z = x + y
print(z)

[0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[10 12 14 16 18 20 22 24 26 28]


See? No `for` loop needed, which is SUPER handy!

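Other arithmetic operators and relational (comparison) operators can be vectorized in the same way. A comparison produces a boolean array, which can then be used to select elements. Here is an example: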

In [16]:
m_names = np.array(["Mary", "Michael", "Margaret", "Mary", "Marcus", "Molly"])
m_ages = np.array([    28,        72,       12,       34,     40,       68])

marys = m_names == "Mary"
print(marys)

mary_ages = m_ages[marys] # boolean indexing with a boolean array
print(mary_ages)

[ True False False  True False False]
[28 34]
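
The same idea extends to numeric comparisons and to combining conditions. The sketch below reuses the `m_names` and `m_ages` arrays from the cell above; the names `over_30` and `marys_over_30` are illustrative. Note that conditions are combined with `&` (not `and`), and each condition needs its own parentheses:

```python
import numpy as np

m_names = np.array(["Mary", "Michael", "Margaret", "Mary", "Marcus", "Molly"])
m_ages = np.array([28, 72, 12, 34, 40, 68])

# A relational operator applied to the whole array at once
over_30 = m_ages > 30
print(over_30)           # [False  True False  True  True  True]
print(m_names[over_30])  # ['Michael' 'Mary' 'Marcus' 'Molly']

# Combine two boolean arrays element-wise with &
marys_over_30 = (m_names == "Mary") & (m_ages > 30)
print(m_ages[marys_over_30])  # [34]
```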


## Broadcasting

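**Broadcasting** is performing operations over arrays that have *different shapes*. NumPy stretches the smaller array to match the larger one; in the example below, a one-element array multiplies every element of `x`.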

In [18]:
x = np.arange(10)
print(x)

x *= np.array([2])
print(x)

[0 1 2 3 4 5 6 7 8 9]
[ 0  2  4  6  8 10 12 14 16 18]
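
Broadcasting also works between arrays of more than one dimension. As a rough sketch (the names `col`, `row`, and `table` are illustrative), a column of shape `(3, 1)` and a row of shape `(4,)` combine into a `(3, 4)` result, while shapes that cannot be stretched to match raise a `ValueError`:

```python
import numpy as np

col = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([1, 2, 3, 4])       # shape (4,)

# Each row value is added to each column value
table = col + row
print(table.shape)  # (3, 4)
print(table)        # rows: 1..4, 11..14, 21..24

# Lengths 3 and 4 cannot be stretched to match each other
try:
    np.arange(3) + np.arange(4)
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("Cannot broadcast:", err)
```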


## Advanced Assignment

We can update multiple values in a NumPy array with a single assignment operator. Here is an example:

In [21]:
from numpy.random import randn

rand_data = randn(3,4)
print(rand_data)
rand_data[0][2] = 100.0
print(rand_data)

# Boolean array for all negative values
negatives = rand_data < 0
print(negatives)

# Set all negative values to 0
rand_data[negatives] = 0 # Batch assignment!
print(rand_data)

[[-0.8199989   1.2177753  -1.77169472 -0.36193631]
 [-0.68581857  0.76134517 -0.3625586   0.31135462]
 [-0.81977319  1.30463187  0.53548171 -1.48385072]]
[[ -0.8199989    1.2177753  100.          -0.36193631]
 [ -0.68581857   0.76134517  -0.3625586    0.31135462]
 [ -0.81977319   1.30463187   0.53548171  -1.48385072]]
[[ True False False  True]
 [ True False  True False]
 [ True False False  True]]
[[  0.           1.2177753  100.           0.        ]
 [  0.           0.76134517   0.           0.31135462]
 [  0.           1.30463187   0.53548171   0.        ]]


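## Slicing NumPy Arrays

Slicing uses the same `start:stop` notation as lists, with one important difference: slicing a list creates a copy, while slicing a NumPy array creates a view of the original data, so changing the slice also changes the original array.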



In [22]:
x = list(range(10))
print(x)
slice_x = x[3:7] # Copy
print(slice_x)
slice_x[0] = 100
print(slice_x)
print(x)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[3, 4, 5, 6]
[100, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [23]:
x = np.arange(10)
print(x)
slice_x = x[3:7] # View, NOT A COPY
print(slice_x)
slice_x[0] = 100
print(slice_x)
print(x)

[0 1 2 3 4 5 6 7 8 9]
[3 4 5 6]
[100   4   5   6]
[  0   1   2 100   4   5   6   7   8   9]
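
If an independent copy is what you want, one way to get it is to call the array's `copy` method on the slice. The sketch below contrasts the two cases; the names `view` and `independent` are illustrative, and `np.shares_memory` is used only to confirm which slice is tied to the original:

```python
import numpy as np

x = np.arange(10)

view = x[3:7]                # shares data with x
independent = x[3:7].copy()  # has its own data

# Changing the copy leaves x alone
independent[0] = 100
print(x[3])  # 3

# Changing the view changes x
view[0] = -1
print(x[3])  # -1

print(np.shares_memory(x, view))         # True
print(np.shares_memory(x, independent))  # False
```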


Scalars can also be broadcast to slices:

In [24]:
x[2:5] = 500
print(x)

[  0   1 500 500 500   5   6   7   8   9]


## Reshaping

We can use the `reshape` method to change the shape (number of rows and columns) of our NumPy array, as long as the total number of elements stays the same.

In [25]:
x = np.arange(10)
print(x.shape)
x = x.reshape(5,2)
print(x.shape)
print(x)

(10,)
(5, 2)
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
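
Two details of `reshape` are easy to trip over, so here is a short sketch of both. Passing `-1` for one dimension asks NumPy to work that dimension out from the array's size, and asking for a shape that does not match the number of elements raises a `ValueError`. The variable names are illustrative:

```python
import numpy as np

x = np.arange(12)

# -1 means "infer this dimension": 12 elements / 3 rows = 4 columns
three_rows = x.reshape(3, -1)
print(three_rows.shape)  # (3, 4)

# 12 elements cannot fill a 5 x 2 array
try:
    x.reshape(5, 2)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```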


## Transposing

We can take the linear algebra transpose of a NumPy array using the `T` attribute.

In [26]:
x = x.T
print(x)

[[0 2 4 6 8]
 [1 3 5 7 9]]
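
To spell out what the transpose does, the sketch below checks that rows and columns swap places, so element `[i, j]` of the transpose is element `[j, i]` of the original. It also shows that, like a slice, the transpose of a 2D array is a view of the same data. The name `t` is illustrative:

```python
import numpy as np

x = np.arange(10).reshape(5, 2)
t = x.T

# The shape is reversed
print(x.shape, t.shape)    # (5, 2) (2, 5)

# Row and column indices swap
print(t[1, 3] == x[3, 1])  # True

# Writing through the transpose changes the original array
t[0, 0] = 99
print(x[0, 0])             # 99
```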


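## N-D Array Functions

NumPy provides universal functions (ufuncs) that operate element-wise on arrays. Unary ufuncs take one array; binary ufuncs take two.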

In [29]:
###### UNARY UFUNCS #######
x = np.arange(10)
print(x)
print(np.sqrt(x))

# For a full list of ufuncs, check out the NumPy documentation


##### BINARY UFUNCS ########
y = np.arange(10,20)
print(y)
print(np.add(x,y))
print(np.power(x,np.full(10,2.0)))
print(np.power(x,2.0))

[0 1 2 3 4 5 6 7 8 9]
[0.         1.         1.41421356 1.73205081 2.         2.23606798
 2.44948974 2.64575131 2.82842712 3.        ]
[10 11 12 13 14 15 16 17 18 19]
[10 12 14 16 18 20 22 24 26 28]
[ 0.  1.  4.  9. 16. 25. 36. 49. 64. 81.]
[ 0.  1.  4.  9. 16. 25. 36. 49. 64. 81.]
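
A few more ufuncs, as a brief sketch with values small enough to verify by eye. The choice of `np.abs` and `np.maximum` here is just for illustration; the second argument of a binary ufunc can be a scalar, which is broadcast across the array as in the `np.power(x, 2.0)` call above:

```python
import numpy as np

x = np.arange(-3, 4)
print(x)                 # [-3 -2 -1  0  1  2  3]

# Unary ufunc: absolute value of each element
magnitudes = np.abs(x)
print(magnitudes)        # [3 2 1 0 1 2 3]

# Binary ufunc with a scalar: element-wise maximum against 0
non_negative = np.maximum(x, 0)
print(non_negative)      # [0 0 0 0 1 2 3]

# np.add(x, 10) gives the same result as the operator form x + 10
same = np.array_equal(np.add(x, 10), x + 10)
print(same)              # True
```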
