# NumPy Introduction

NumPy is a package for scientific computing that gives Python an extensive, efficient math library. It has many features that make it better suited to mathematical computations than plain Python lists.

For numerical work on large arrays, NumPy can be orders of magnitude faster than Python lists. It is also memory-efficient and optimized for arithmetic, statistical, and linear algebra operations.

NumPy has multidimensional array data structures that represent vectors and matrices used in linear algebra, which is in turn used heavily in machine learning algorithms.

NumPy also has many built-in (and optimized) mathematical functions that let you do complex computations quickly, without writing much code.

Install with `conda install numpy`

Documentation:
- [NumPy Manual](https://docs.scipy.org/doc/numpy-1.13.0/contents.html)
- [NumPy User Guide](https://docs.scipy.org/doc/numpy-1.13.0/user/index.html)
- [NumPy Reference](https://docs.scipy.org/doc/numpy-1.13.0/reference/index.html#reference)
- [Scipy Lectures](http://www.scipy-lectures.org/intro/numpy/index.html)

In [1]:
import numpy as np
import time

In [2]:
x = np.random.random(100000000)

Compare plain Python with NumPy when calculating the mean of a large set of numbers:

In [3]:
start = time.time()

sum(x)/len(x)

print(time.time() - start)

12.945324182510376


In [4]:
start = time.time()

np.mean(x)

print(time.time() - start)

0.06117987632751465
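The exact timings above depend on the machine. The following is a minimal sketch of the same comparison, assuming a smaller array and `time.perf_counter()` for the measurement (neither is taken from the cells above). It also checks that both approaches give the same mean.

```python
import time
import numpy as np

# Smaller array than in the cells above, so the sketch runs quickly
x = np.random.random(1_000_000)

start = time.perf_counter()
python_mean = sum(x) / len(x)       # Python-level loop over the elements
python_seconds = time.perf_counter() - start

start = time.perf_counter()
numpy_mean = np.mean(x)             # single vectorized call
numpy_seconds = time.perf_counter() - start

print(python_seconds, numpy_seconds)
print(np.isclose(python_mean, numpy_mean))
```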


## NumPy ndarrays

__ndarray__ stands for n-dimensional array: a multidimensional array whose elements all have the same type. An ndarray can hold numbers, strings, or other types.

ndarrays are often used in machine learning, e.g. an ndarray could hold the pixel values of an image that is fed into a neural network for image processing/classification.

### Creating

Method 1 for creation:
Create by providing python lists to NumPy's np.array()

In [5]:
x = np.array([1, 2, 3, 4, 5]) # 1-d array or a rank 1 array

print(x)
print(type(x))

[1 2 3 4 5]
<class 'numpy.ndarray'>


In [6]:
# return dtype of elements in the array
x.dtype

dtype('int64')

NumPy supports more data types than built-in Python, for example fixed-width integers and floats of several sizes.
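As a small illustration of that range, the sketch below creates arrays with a few different dtypes and prints how many bytes each element occupies. The variable names are only for illustration.

```python
import numpy as np

small_ints = np.array([1, 2, 3], dtype=np.int8)      # 1 byte per element
single = np.array([1.5, 2.5], dtype=np.float32)      # 4 bytes per element
complex_vals = np.array([1 + 2j, 3 - 1j])            # complex128 by default
flags = np.array([True, False])                      # bool

for arr in (small_ints, single, complex_vals, flags):
    print(arr.dtype, arr.itemsize)
```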

### ndarray methods

In [7]:
Y = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
# this is a 2-d array, or rank 2 array

print(Y)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


#### Shape
`shape` returns the dimensions as a tuple.
The tuple below has two entries, so Y is 2-dimensional (4 rows, 3 columns).

In [8]:
Y.shape

(4, 3)

Shape of a rank 1 array:

In [9]:
x = np.array([1, 2, 3, 4, 5])
x.shape

(5,)

#### Size
Get how many elements are in an array

In [10]:
Y.size

12

ndarrays can only have elements with one data type. If you create one with different datatypes, it will convert them to be the same. This is called upcasting: the elements are converted so precision won't be lost.

In [11]:
x = np.array([1, 2.5, 4])
# Converts integers to floats
print(x, x.dtype)

[1.  2.5 4. ] float64
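Upcasting follows the same idea for other type combinations. The sketch below uses `np.result_type()`, which is not used elsewhere in this notebook, to ask NumPy which dtype it would pick when two types are mixed.

```python
import numpy as np

print(np.array([1, 2.5]).dtype)                # float64
print(np.array([1.5, 2 + 0j]).dtype)           # complex128
print(np.result_type(np.int8, np.int16))       # int16
print(np.result_type(np.int32, np.float32))    # float64
```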


You could also specify the datatype you want:

In [12]:
x = np.array([1, 2.5, 4], dtype = np.int64)
# Converts everything to integers (2.5 is truncated to 2)
print(x, x.dtype)

[1 2 4] int64


Choosing the datatype is useful if you don't want NumPy to accidentally choose the wrong one, or if you don't need that much precision and want to save memory.
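To make the memory saving concrete, this short sketch compares the number of bytes used by the same number of elements stored as `float64` and as `float32`. The array size of 1000 is arbitrary.

```python
import numpy as np

a64 = np.zeros(1000)                       # float64 by default
a32 = np.zeros(1000, dtype=np.float32)     # half the precision, half the memory

print(a64.nbytes, a32.nbytes)              # 8000 4000
```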

#### Save ndarrays into files

This lets you save ndarrays for later use. 


In [13]:
# Saving
x = np.array([1, 2, 3, 4])
np.save('my_array', x)

# x will be saved into a file named my_array.npy (in the current working directory)

# Loading
# Make sure to include the .npy extension in the filename
y = np.load('my_array.npy')
print(y)

[1 2 3 4]
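Saving and loading also preserves the shape and dtype of the array. The sketch below shows a round trip; it writes to a temporary directory so that no file is left behind, and the file name `demo_array.npy` is a made-up example.

```python
import os
import tempfile
import numpy as np

original = np.arange(6, dtype=np.float32).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo_array.npy')   # hypothetical file name
    np.save(path, original)
    restored = np.load(path)

print(restored.shape, restored.dtype)
print(np.array_equal(original, restored))
```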


## Creating ndarrays with Built-in functions

## np.zeros()

In [14]:
# Create an ndarray of zeros
# Takes the shape of the ndarray as a tuple
# Default datatype is float64
X = np.zeros((3, 4))
print(X)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


## np.ones()

In [15]:
# Create an ndarray of ones, similarly to zeros.
# Takes the shape of the ndarray as a tuple
# Default datatype is float64
X = np.ones((4, 5))
print(X)

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]


## np.full()

In [16]:
# Create an ndarray of any constant value
# Takes in the shape and the constant value as args
# Default datatype is the datatype of the given constant (here, int64)
X = np.full((4, 5), 5)
print(X)

[[5 5 5 5 5]
 [5 5 5 5 5]
 [5 5 5 5 5]
 [5 5 5 5 5]]


## np.eye()

In [17]:
# Create an identity matrix
# * All identity matrices are square, so only needs one arg
# Default datatype is float64
X = np.eye(5)
print(X)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


## np.diag()

In [18]:
# Create a diagonal matrix
# Takes in a list of numbers to use as the main diagonal
# and fills in the rest with zeros
X = np.diag([10, 20, 30, 40])
print(X)

[[10  0  0  0]
 [ 0 20  0  0]
 [ 0  0 30  0]
 [ 0  0  0 40]]


## np.arange()

In [19]:
# Create a 1-dim ndarray of evenly spaced values within a given interval
# arange(start, stop, step)
# stop is exclusive

# With 1 arg, used as stop. 
# Generates array from 0 to stop
x = np.arange(10)
print(x)

# With 2 args, first is start, second is stop
# start is inclusive, stop is exclusive
x = np.arange(4, 10)
print(x)

# With 3 args, third arg is the step
# Goes from start to stop, evenly spaced by the step interval
x = np.arange(1, 16, 3)
print(x)

[0 1 2 3 4 5 6 7 8 9]
[4 5 6 7 8 9]
[ 1  4  7 10 13]


## np.linspace()

In [20]:
# np.linspace(start, stop, n)
# returns n evenly spaced numbers from start to stop
# both start and stop are inclusive
# Start, stop required, n defaults to 50
x = np.linspace(0, 25, 10)
print(x)

# Set stop point to be exclusive
x = np.linspace(0, 25, 10, endpoint=False)
print(x)

[ 0.          2.77777778  5.55555556  8.33333333 11.11111111 13.88888889
 16.66666667 19.44444444 22.22222222 25.        ]
[ 0.   2.5  5.   7.5 10.  12.5 15.  17.5 20.  22.5]
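The spacing between the values follows from the arguments: with both endpoints included it is `(stop - start) / (n - 1)`, and with `endpoint=False` it is `(stop - start) / n`. As a sketch, the `retstep=True` option (not used in the cell above) makes `np.linspace()` return that spacing alongside the array.

```python
import numpy as np

values, step = np.linspace(0, 25, 10, retstep=True)
print(step)         # (25 - 0) / 9

values_open, step_open = np.linspace(0, 25, 10, endpoint=False, retstep=True)
print(step_open)    # (25 - 0) / 10
```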


## np.reshape()

In [21]:
# Convert a 1-dim array into a multi-dim array
x = np.arange(1, 21)
print(x)

# Takes in array to convert, and the shape you want to convert to
# Shape needs to be compatible with the original array (same number of elements)
X = np.reshape(x, (4, 5))
print(X)

Y = np.reshape(X, (10, 2))
print(Y)



[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20]
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]]
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]
 [13 14]
 [15 16]
 [17 18]
 [19 20]]


In [22]:
# reshape() can be used as an ndarray method so you can write inline code
X = np.linspace(0, 50, 10, endpoint=False).reshape(5, 2)
print(X)

[[ 0.  5.]
 [10. 15.]
 [20. 25.]
 [30. 35.]
 [40. 45.]]
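Two details about shape compatibility are worth a quick sketch. Passing `-1` for one dimension (a NumPy feature not shown in the cells above) lets NumPy infer it from the number of elements. A shape that does not match the number of elements raises a `ValueError`.

```python
import numpy as np

x = np.arange(1, 21)

# -1 asks NumPy to infer that dimension: 20 elements / 4 rows = 5 columns
X = x.reshape(4, -1)
print(X.shape)

# 20 elements cannot fill a 3 x 7 array, so this raises ValueError
try:
    x.reshape(3, 7)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print('reshape failed:', err)
```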


## Create arrays with random numbers

### np.random.random()

In [23]:
# Random ndarray of given shape, with values between 0 and 1
# 0 is inclusive, 1 is exclusive
X = np.random.random((3, 3))
print(X)

[[0.67830578 0.25089976 0.51684748]
 [0.74867742 0.97734315 0.81980138]
 [0.99879635 0.99203254 0.96124988]]


### np.random.randint()

In [24]:
# Create random array of integers between a specific range, in a given shape
# random.randint(lower-bound, upper-bound, shape)
# lower-bound is inclusive, upper-bound is exclusive
X = np.random.randint(4, 15, (3, 2))
print(X)


[[12 13]
 [ 5  6]
 [ 4  5]]


### np.random.normal()

In some cases, you may need to create ndarrays with random numbers that satisfy certain statistical properties. For example, you may want the random numbers in the ndarray to have an average of 0. NumPy allows you to create random ndarrays with numbers drawn from various probability distributions. The function `np.random.normal(mean, standard deviation, size=shape)`, for example, creates an ndarray with the given shape that contains random numbers picked from a `normal` (Gaussian) distribution with the given `mean` and `standard deviation`. 

In [25]:
# Create ndarray that contains random numbers picked 
# from a normal distribution with a given mean and standard deviation
# np.random.normal(mean, std-dev, size)
X = np.random.normal(0, 0.1, size=(100, 100))

# Print info about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype)
print('The elements in X have a mean of:', X.mean())
print('The maximum value in X is:', X.max())
print('The minimum value in X is:', X.min())
print('X has', (X < 0).sum(), 'negative numbers')
print('X has', (X > 0).sum(), 'positive numbers')

X has dimensions: (100, 100)
X is an object of type: <class 'numpy.ndarray'>
The elements in X are of type: float64
The elements in X have a mean of: 0.0004912116268631499
The maximum value in X is: 0.3837317526655756
The minimum value in X is: -0.34061095867577773
X has 5002 negative numbers
X has 4998 positive numbers


As we can see, the average of the random numbers in the ndarray is close to zero, the maximum and minimum values in `X` are roughly symmetric about zero (the average), and we have about the same number of positive and negative numbers.
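Because the numbers are random, the exact values change on every run. If reproducible output is needed, one option is a seeded generator. The sketch below assumes NumPy's newer `np.random.default_rng()` interface instead of the `np.random.normal()` call used above, and the seed value 0 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=0)       # the seed value is arbitrary
X = rng.normal(0, 0.1, size=(100, 100))

print(X.mean())    # close to 0
print(X.std())     # close to 0.1

# The same seed produces the same numbers again
X_again = np.random.default_rng(seed=0).normal(0, 0.1, size=(100, 100))
print(np.array_equal(X, X_again))
```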

# Accessing, Deleting, and Inserting Elements into ndarrays

ndarrays are mutable, so they can be changed after creation.
They can be sliced and therefore split in different ways. For example, we can get a subset of an ndarray. For Machine Learning, we can use slicing to separate data, e.g. into training, cross validation, and testing sets.
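As a sketch of the data-splitting idea, the example below cuts a small made-up dataset into training, validation, and testing sets by slicing its rows. The dataset and the 60/20/20 proportions are assumptions for illustration only.

```python
import numpy as np

# Hypothetical dataset: 10 samples (rows) with 3 features (columns)
data = np.arange(30).reshape(10, 3)

train = data[:6]         # rows 0-5 (60%)
validation = data[6:8]   # rows 6-7 (20%)
test = data[8:]          # rows 8-9 (20%)

print(train.shape, validation.shape, test.shape)
```

In practice the rows are usually shuffled before splitting, which is left out here to keep the sketch short.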

## Accessing/modifying rank 1 ndarrays

In [26]:
x = np.array([1, 2, 3, 4, 5])
print(x)

# use [] to access indices
print(x[0]) # first index
print(x[1])
print(x[-1]) # last index


[1 2 3 4 5]
1
2
5


In [27]:
x = np.array([1, 2, 3, 4, 5])
print(x)

# use [] = to modify a specific index
x[0] = 20
print(x)

[1 2 3 4 5]
[20  2  3  4  5]


## Accessing/modifying rank 2 ndarrays

We need to provide two indices:
Use `[row, column]` to access

In [32]:
X = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(X)

print(X[0, 0]) # first
print(X[1, 1]) # middle
print(X[2, 2]) # last

[[1 2 3]
 [4 5 6]
 [7 8 9]]
1
5
9


In [33]:
X = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(X)

# modify with [row, column] = 
X[1, 1] = 99
X[2, 2] = 99
print(X)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[ 1  2  3]
 [ 4 99  6]
 [ 7  8 99]]


## Deleting from ndarrays

`np.delete(ndarray, elements, axis)`
For rank 1 ndarrays the `axis` keyword is not required. For rank 2 ndarrays, `axis = 0` is used to select rows, and `axis = 1` is used to select columns.

In [36]:
# rank 1 
x = np.array([1, 2, 3, 4, 5])
print(x)

# delete first and last element
x = np.delete(x, [0, 4]) 

print(x)

[1 2 3 4 5]
[2 3 4]


In [39]:
# rank 2
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(Y)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [40]:
# delete first row
W = np.delete(Y, 0, axis=0)
print(W)

[[4 5 6]
 [7 8 9]]


In [42]:
# delete first and last column
v = np.delete(Y, [0, 2], axis=1)
print(v)

[[2]
 [5]
 [8]]


## Appending values

We can append values to ndarrays using the `np.append(ndarray, elements, axis)` function. This function appends the given list of `elements` to `ndarray` along the specified `axis`. 

In [48]:
# rank 1 ndarray
x = np.array([1, 2, 3, 4, 5])
print(x)

# does not need axis arg
# one element
x = np.append(x, 6)
print(x)

# multiple elements
x = np.append(x, [7,8])
print(x)

[1 2 3 4 5]
[1 2 3 4 5 6]
[1 2 3 4 5 6 7 8]


In [46]:
# rank 2 ndarray
Y = np.array([[1,2,3],[4,5,6]])
print(Y)

[[1 2 3]
 [4 5 6]]


In [51]:
# Append row
V = np.append(Y, [[7, 8, 9]], axis=0)
print(V)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [53]:
# Append column
Q = np.append(Y, [[9], [10]], axis=1)
print(Q)

[[ 1  2  3  9]
 [ 4  5  6 10]]


Note that when appending to rank 2 arrays, the shape must match!
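The sketch below shows what happens when the shapes do not match: `np.append()` raises a `ValueError`. It also shows that when `axis` is omitted, both inputs are flattened first, so the result is a rank 1 ndarray.

```python
import numpy as np

Y = np.array([[1, 2, 3], [4, 5, 6]])

try:
    # A row with 2 elements cannot be appended to a 2 x 3 ndarray
    np.append(Y, [[7, 8]], axis=0)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print('append failed:', err)

# Without axis, both inputs are flattened before appending
flat = np.append(Y, [[7, 8]])
print(flat)    # [1 2 3 4 5 6 7 8]
```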

## Inserting values

We can insert values into ndarrays using the `np.insert(ndarray, index, elements, axis)` function. This function inserts the given list of `elements` into `ndarray` right before the given `index` along the specified `axis`. 

In [55]:
# rank 1 ndarray
x = np.array([1, 2, 5, 6, 7])
print(x)

# insert 3, 4 just before the 3rd element (index = 2)
x = np.insert(x, 2, [3, 4])
print(x)

[1 2 5 6 7]
[1 2 3 4 5 6 7]


In [56]:
# rank 2 ndarray
Y = np.array([[1,2,3],[7,8,9]])
print(Y)

[[1 2 3]
 [7 8 9]]


In [57]:
# insert row between first and last (index 1) row
W = np.insert(Y, 1, [4, 5, 6], axis=0)
print(W)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [58]:
# insert column full of 5s between first and second column
V = np.insert(Y, 1, 5, axis=1)
print(V)

[[1 5 2 3]
 [7 5 8 9]]


In [62]:
# insert column [0, 9] between first and second column
U = np.insert(Y, 1, [0,9], axis=1)
print(U)

[[1 0 2 3]
 [7 9 8 9]]


## Stacking

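We can stack ndarrays on top of each other or side by side:

- `np.vstack()` for vertical stacking
- `np.hstack()` for horizontal stacking

__The shapes must be compatible__: for `np.vstack()` the number of columns must match, and for `np.hstack()` the number of rows must match.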

In [63]:
# rank 1 ndarray 
x = np.array([1,2])
print(x)

[1 2]


In [64]:
# rank 2 ndarray 
Y = np.array([[3,4],[5,6]])
print(Y)

[[3 4]
 [5 6]]


In [68]:
# Stack x on top of Y
V = np.vstack((x, Y))
print(V)

[[1 2]
 [3 4]
 [5 6]]


In [69]:
# stack x on the right of Y. 
# reshape x in order to stack it on the right of Y.
W = np.hstack((Y, x.reshape(2, 1)))
print(W)

[[3 4 1]
 [5 6 2]]
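The following sketch repeats the two stacking calls, prints the resulting shapes, and shows that `np.hstack()` raises a `ValueError` if the rank 1 ndarray is not reshaped first.

```python
import numpy as np

x = np.array([1, 2])
Y = np.array([[3, 4], [5, 6]])

print(np.vstack((x, Y)).shape)                  # (3, 2)
print(np.hstack((Y, x.reshape(2, 1))).shape)    # (2, 3)

try:
    np.hstack((Y, x))    # x is still rank 1 here
    hstack_raised = False
except ValueError as err:
    hstack_raised = True
    print('hstack failed:', err)
```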


## Slicing

For accessing subsets of ndarrays. Slicing is performed by combining indices with `:` inside `[]`.

- `ndarray[start:end]` (between given start and end)
- `ndarray[start:]` (given start all the way to end)
- `ndarray[:end]` (start to given end)

__end is excluded__

Since ndarrays can be multidimensional, when doing slicing you usually have to specify a slice for each dimension of the array.
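For a rank 1 ndarray, the three slice forms listed above look like this (a minimal sketch with a made-up array):

```python
import numpy as np

x = np.arange(10)    # [0 1 2 3 4 5 6 7 8 9]

print(x[2:5])    # [2 3 4]
print(x[5:])     # [5 6 7 8 9]
print(x[:3])     # [0 1 2]
```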

In [70]:
# 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


In [71]:
# Elements in 2nd to 4th rows, and 3rd to 5th columns
Z = X[1:4, 2:5]
print(Z)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [72]:
# Can also be written as
W = X[1:, 2:5]
print(W)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [73]:
# Elements in 1st to 3rd rows, and 3rd to 5th columns
Y = X[:3, 2:5]
print(Y)

[[ 2  3  4]
 [ 7  8  9]
 [12 13 14]]


In [74]:
# All elements in 3rd row
v = X[2, :]
print(v)

[10 11 12 13 14]


In [75]:
# All elements in 3rd column
q = X[:, 2]
print(q)

[ 2  7 12 17]


In [76]:
# All elements in 3rd column, as a rank 2 array
R = X[:, 2:3]
print(R)

[[ 2]
 [ 7]
 [12]
 [17]]


It is important to note that when we perform slices on ndarrays and save them into new variables, as we did above, the data is not copied into the new variable.

Rather, the new variable is just a different view of the same ndarray. If you make changes to the new variable, it will change the original as well.

In [77]:
# 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


In [78]:
# Make a slice
Z = X[1:4,2:5]
print(Z)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [80]:
# Change Z in some way
Z[2, 2] = 555
print(Z)

[[  7   8   9]
 [ 12  13  14]
 [ 17  18 555]]


In [81]:
# X also got changed!
print(X)

[[  0   1   2   3   4]
 [  5   6   7   8   9]
 [ 10  11  12  13  14]
 [ 15  16  17  18 555]]


## Copying

We can create ndarrays that contain copies of the values in the slice by using `np.copy(ndarray)`. This function can also be used as a method, in which case you don't need to provide the ndarray.

In [82]:
# 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


In [83]:
# Make a slice that's a copy
Z = np.copy(X[1:4,2:5])
print(Z)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [84]:
# OR
W = X[1:4,2:5].copy()
print(W)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [85]:
# Change Z and W
Z[2, 2] = 555
W[2, 2] = 444
print(Z, "\n")
print(W)

[[  7   8   9]
 [ 12  13  14]
 [ 17  18 555]] 

[[  7   8   9]
 [ 12  13  14]
 [ 17  18 444]]


In [86]:
# Original is same
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]
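One way to check whether a variable is a view or an independent copy is `np.shares_memory()`, a NumPy function not used in the cells above. A minimal sketch:

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

view = X[1:4, 2:5]             # slice: a view onto X's data
copy = X[1:4, 2:5].copy()      # explicit copy: independent data

print(np.shares_memory(X, view))    # True
print(np.shares_memory(X, copy))    # False
```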


### Using one ndarray to make slices, select, or change elements in another ndarray

In [88]:
# 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# Create a rank 1 ndarray to serve as indices to select elements from X
indices = np.array([1,3])

print(X, "\n")
print(indices)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]] 

[1 3]


In [89]:
# Use the indices ndarray to select the 2nd and 4th row of X
Y = X[indices,:]
print(Y)

[[ 5  6  7  8  9]
 [15 16 17 18 19]]


In [90]:
# Use indices to select 2nd and 4th columns of X
Z = X[:, indices]
print(Z)

[[ 1  3]
 [ 6  8]
 [11 13]
 [16 18]]
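Unlike slicing, selecting with an ndarray of indices returns a copy rather than a view. The sketch below modifies the selection and confirms that the original is unchanged.

```python
import numpy as np

X = np.arange(20).reshape(4, 5)
indices = np.array([1, 3])

Y = X[indices, :]    # integer-array indexing returns a copy
Y[0, 0] = -1

print(X[1, 0])                    # still 5
print(np.shares_memory(X, Y))     # False
```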


## Selecting diagonals

NumPy also offers built-in functions to select specific elements within ndarrays. For example, the `np.diag(ndarray, k=N)` function extracts the elements along the `diagonal` defined by `N`. The default is `k=0`, which refers to the main diagonal. Values of `k > 0` are used to select elements in diagonals above the main diagonal, and values of `k < 0` are used to select elements in diagonals below the main diagonal.

In [92]:
# 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]


In [94]:
# main diagonal
z = np.diag(X)
print(z)

[ 0  6 12 18 24]


In [96]:
# diagonal above main diagonal
y = np.diag(X, k=1)
print(y)

[ 1  7 13 19]


In [97]:
# diagonal below main diagonal
w = np.diag(X, k=-1)
print(w)

[ 5 11 17 23]


### Unique elements

#### np.unique()

In [98]:
# 3 x 3 ndarray with repeated values
X = np.array([[1,2,3],[5,2,8],[1,2,3]])
print(X)


[[1 2 3]
 [5 2 8]
 [1 2 3]]


In [99]:
# Get unique values
y = np.unique(X)
print(y)

[1 2 3 5 8]
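`np.unique()` can also report how often each value occurs. The sketch below uses its `return_counts=True` option, which is not shown in the cell above.

```python
import numpy as np

X = np.array([[1, 2, 3], [5, 2, 8], [1, 2, 3]])

values, counts = np.unique(X, return_counts=True)
print(values)    # [1 2 3 5 8]
print(counts)    # [2 3 2 1 1]
```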


## Boolean Indexing

There are many situations in which we don't know the indices of the elements we want to select. For example, suppose we have a 10,000 x 10,000 ndarray of random integers ranging from 1 to 15,000 and we only want to select those integers that are less than 20. Boolean indexing can help us in these cases, by allowing us to select elements using logical conditions instead of explicit indices.

In [100]:
# 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]


In [103]:
# Get elements greater than 10
y = X[X > 10]
print(y)

# <= 7
y = X[X <= 7]
print(y)


# strictly between 10 and 17
y = X[(X > 10) & (X < 17)]
print(y)

[11 12 13 14 15 16 17 18 19 20 21 22 23 24]
[0 1 2 3 4 5 6 7]
[11 12 13 14 15 16]


In [104]:
# Use boolean indexing to assign
X[(X > 10) & (X < 17)] = -1
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 -1 -1 -1 -1]
 [-1 -1 17 18 19]
 [20 21 22 23 24]]
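The condition inside the brackets is itself an ndarray of booleans, often called a mask. The sketch below builds a scaled-down version of the random-integer example (1,000 x 1,000 instead of 10,000 x 10,000, an assumption to keep it fast) and inspects the mask before using it.

```python
import numpy as np

rng = np.random.default_rng(seed=0)                # the seed value is arbitrary
X = rng.integers(1, 15001, size=(1000, 1000))      # upper bound is exclusive

mask = X < 20        # boolean ndarray with the same shape as X
small = X[mask]      # rank 1 ndarray holding the selected values

print(mask.dtype, mask.shape)
print(small.size == mask.sum())
```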


## Set operations

Use to compare two ndarrays, e.g. finding common elements


In [108]:
# We create a rank 1 ndarray
x = np.array([1,2,3,4,5])
print(x)

# We create a rank 1 ndarray
y = np.array([6,7,2,8,4])
print(y)

[1 2 3 4 5]
[6 7 2 8 4]


### np.intersect1d()

In [109]:
# Get elements that are in both x and y
z = np.intersect1d(x, y)
print(z)

[2 4]


### np.setdiff1d()

In [111]:
# Get elements that are in x but not in y
z = np.setdiff1d(x, y)
print(z)

[1 3 5]


### np.union1d()

In [113]:
# All (unique) elements of x and y
z = np.union1d(x, y)
print(z)

[1 2 3 4 5 6 7 8]


## Sorting

We can also sort ndarrays in NumPy. `np.sort()` handles rank 1 and rank 2 ndarrays differently, as shown below. Sorting is also available as a method, but the two forms behave differently: the function `np.sort(ndarray)` sorts out of place, returning a sorted copy and leaving the original ndarray unchanged, whereas the method `ndarray.sort()` sorts in place, so the original ndarray itself is reordered.

### Sorting rank 1 ndarrays

In [114]:
# unsorted rank 1 ndarray
x = np.random.randint(1,11,size=(10,))
print(x)

[4 4 5 6 7 5 9 5 5 7]


In [115]:
# Non-mutating sort
y = np.sort(x)
print(y)
print(x)

[4 4 5 5 5 5 6 7 7 9]
[4 4 5 6 7 5 9 5 5 7]


In [117]:
# Mutating sort
x.sort()
print(x)

[4 4 5 5 5 5 6 7 7 9]


In [119]:
# Sort and get only unique values by combining with np.unique()
# (np.unique() already returns its values sorted, so np.sort() is redundant here but harmless)
x = np.random.randint(1,11,size=(10,))
print(x)

y = np.sort(np.unique(x))
print(y)

[7 2 5 7 8 4 6 8 2 6]
[2 4 5 6 7 8]


### Sorting rank 2 ndarrays

When sorting rank 2 ndarrays, we need to specify to the `np.sort()` function whether we are sorting by rows or columns. This is done by using the `axis` keyword.

In [120]:
# unsorted rank 2 ndarray
X = np.random.randint(1,11,size=(5,5))
print(X)

[[ 7  1  5  4  5]
 [ 9  8 10  3  2]
 [ 6  4  9  5  8]
 [ 1  2  7  2  2]
 [ 8  5  3  3  8]]


In [121]:
# Sort the values within each column (axis=0)
Y = np.sort(X, axis=0)
print(Y)

[[ 1  1  3  2  2]
 [ 6  2  5  3  2]
 [ 7  4  7  3  5]
 [ 8  5  9  4  8]
 [ 9  8 10  5  8]]


In [122]:
# Sort the values within each row (axis=1)
W = np.sort(X, axis=1)
print(W)

[[ 1  4  5  5  7]
 [ 2  3  8  9 10]
 [ 4  5  6  8  9]
 [ 1  2  2  2  7]
 [ 3  3  5  8  8]]


# Arithmetic operations and Broadcasting

NumPy allows element-wise operations on ndarrays as well as matrix operations. We will only be looking at element-wise operations on ndarrays for now. In order to do element-wise operations, NumPy sometimes uses something called Broadcasting. Broadcasting is the term used to describe how NumPy handles element-wise arithmetic operations with ndarrays of different shapes. For example, broadcasting is used implicitly when doing arithmetic operations between scalars and ndarrays.

## Element-wise addition, subtraction, multiplication, and division, between ndarrays

NumPy offers two equivalent ways to do this: functions such as `np.add()`, or arithmetic operators such as `+`, which read more like mathematical equations. Both perform the same operation; the only difference is that the functions usually accept keyword arguments that let you tweak their behavior. Note that for element-wise operations, the ndarrays being operated on must have the same shape or be broadcastable.
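As one example of such a keyword, the sketch below uses the `out` argument of `np.add()` to write the result into an existing ndarray instead of allocating a new one. The choice of `out` is only an illustration; the functions accept other keywords as well.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([5.5, 6.5, 7.5, 8.5])

result = np.empty(4)          # preallocated float64 buffer
np.add(x, y, out=result)      # writes the sum into result
print(result)
```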

### Rank 1 ndarrays

In [123]:
# rank 1 ndarrays
x = np.array([1,2,3,4])
y = np.array([5.5,6.5,7.5,8.5])
print(x)
print(y)

[1 2 3 4]
[5.5 6.5 7.5 8.5]


In [126]:
# Adding
print(x + y)
print(np.add(x, y))

[ 6.5  8.5 10.5 12.5]
[ 6.5  8.5 10.5 12.5]


In [127]:
# subtracting
print(x - y)
print(np.subtract(x, y))

[-4.5 -4.5 -4.5 -4.5]
[-4.5 -4.5 -4.5 -4.5]


In [128]:
# multiplying
print(x * y)
print(np.multiply(x, y))

[ 5.5 13.  22.5 34. ]
[ 5.5 13.  22.5 34. ]


In [129]:
# dividing
print(x / y)
print(np.divide(x, y))

[0.18181818 0.30769231 0.4        0.47058824]
[0.18181818 0.30769231 0.4        0.47058824]


### Rank 2 ndarrays
* must have the same shape or be broadcastable

In [130]:
# Rank 2 ndarrays
X = np.array([1,2,3,4]).reshape(2,2)
Y = np.array([5.5,6.5,7.5,8.5]).reshape(2,2)
print(X)
print()
print(Y)

[[1 2]
 [3 4]]

[[5.5 6.5]
 [7.5 8.5]]


In [131]:
# Adding
print(X + Y)
print()
print(np.add(X, Y))

[[ 6.5  8.5]
 [10.5 12.5]]

[[ 6.5  8.5]
 [10.5 12.5]]


In [132]:
# subtracting
print(X - Y)
print()
print(np.subtract(X, Y))

[[-4.5 -4.5]
 [-4.5 -4.5]]

[[-4.5 -4.5]
 [-4.5 -4.5]]


In [133]:
# multiplying
print(X * Y)
print()
print(np.multiply(X, Y))

[[ 5.5 13. ]
 [22.5 34. ]]

[[ 5.5 13. ]
 [22.5 34. ]]


In [134]:
# dividing
print(X / Y)
print()
print(np.divide(X, Y))

[[0.18181818 0.30769231]
 [0.4        0.47058824]]

[[0.18181818 0.30769231]
 [0.4        0.47058824]]


### Other mathematical functions
We can also apply mathematical functions, such as sqrt(x), to all elements of an ndarray at once.

#### Square root

In [135]:
x = np.array([1,2,3,4])
print(x)
print(np.sqrt(x))

[1 2 3 4]
[1.         1.41421356 1.73205081 2.        ]


#### Exponential

In [136]:
print(np.exp(x))

[ 2.71828183  7.3890561  20.08553692 54.59815003]


#### Power

In [138]:
# raise all elements to power of 2
print(np.power(x, 2)) 

[ 1  4  9 16]


## Statistical functions


In [139]:
# 2 x 2 ndarray
X = np.array([[1,2], [3,4]])
print(X)

[[1 2]
 [3 4]]


In [142]:
# Mean of all elements of X
print(X.mean())

# Mean of all elements in columns of X
print(X.mean(axis=0))

# Mean of all elements in rows of X
print(X.mean(axis=1))

2.5
[2. 3.]
[1.5 3.5]


In [143]:
# Sum of all elements of X
print(X.sum())

# Sum of all elements in columns of X
print(X.sum(axis=0))

# Sum of all elements in rows of X
print(X.sum(axis=1))

10
[4 6]
[3 7]


In [144]:
# Standard deviation of all elements of X
print(X.std())

# Standard deviation of all elements in columns of X
print(X.std(axis=0))

# Standard deviation of all elements in rows of X
print(X.std(axis=1))

1.118033988749895
[1. 1.]
[0.5 0.5]


In [147]:
# Median of all elements
print(np.median(X))

# Median of all elements in columns of X
print(np.median(X, axis=0))

# Median of all elements in rows of X
print(np.median(X, axis=1))

2.5
[2. 3.]
[1.5 3.5]


In [148]:
# Max of all elements of X
print(X.max())

# Max of all elements in columns of X
print(X.max(axis=0))

# Max of all elements in rows of X
print(X.max(axis=1))

4
[3 4]
[2 4]


In [149]:
# Min of all elements of X
print(X.min())

# Min of all elements in columns of X
print(X.min(axis=0))

# Min of all elements in rows of X
print(X.min(axis=1))

1
[1 2]
[1 3]


## Add, subtract, multiply, divide simple numbers to each element

In the examples below, NumPy is working behind the scenes to broadcast 3 along the ndarray so that they have the same shape. This allows us to add 3 to each element of X with just one line of code.

In [150]:
# We create a 2 x 2 ndarray
X = np.array([[1,2], [3,4]])
print(X)

[[1 2]
 [3 4]]


In [151]:
# adding
print(3 + X)

[[4 5]
 [6 7]]


In [152]:
# subtracting
print(3 - X)

[[ 2  1]
 [ 0 -1]]


In [153]:
# multiplying
print(3 * X)

[[ 3  6]
 [ 9 12]]


In [154]:
# dividing
print(3 / X)

[[3.   1.5 ]
 [1.   0.75]]


Subject to certain constraints, NumPy can do the same for two ndarrays of different shapes, as we can see below.

In [156]:
# Create a rank 1 ndarray
x = np.array([1,2,3])

# Create a 3 x 3 ndarray
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])

# Create a 3 x 1 ndarray
Z = np.array([1,2,3]).reshape(3,1)

print(x)
print()
print(Y)
print()
print(Z)

[1 2 3]

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[1]
 [2]
 [3]]


In [157]:
# add a rank 1 ndarray of shape (3,) to a 3 x 3 ndarray
print(x + Y)

[[ 2  4  6]
 [ 5  7  9]
 [ 8 10 12]]


In [159]:
# add 3 x 1 ndarray to 3 x 3 ndarray
print(Z + Y)

[[ 2  3  4]
 [ 6  7  8]
 [10 11 12]]


Make sure you check out the NumPy Documentation for more information on Broadcasting and its rules:
[Broadcasting](https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html)
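
In short, shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below uses `np.broadcast_to()`, which is not used elsewhere in this notebook, to make the implicit stretching visible, and shows that incompatible shapes raise a `ValueError`.

```python
import numpy as np

x = np.array([1, 2, 3])                 # shape (3,)
Y = np.arange(1, 10).reshape(3, 3)      # shape (3, 3)
Z = x.reshape(3, 1)                     # shape (3, 1)

# x is treated as if it were repeated along the rows
print(np.array_equal(x + Y, np.broadcast_to(x, (3, 3)) + Y))
# Z is treated as if it were repeated along the columns
print(np.array_equal(Z + Y, np.broadcast_to(Z, (3, 3)) + Y))

try:
    np.array([1, 2]) + Y    # shapes (2,) and (3, 3) are incompatible
    broadcast_raised = False
except ValueError as err:
    broadcast_raised = True
    print('broadcast failed:', err)
```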