# NumPy Introduction

NumPy is a package for scientific computing that gives Python an extensive, efficient math library. It has many features that make it better suited to mathematical computation than plain Python lists, including speed! It is memory-efficient and optimized for arithmetic, statistical, and linear algebra operations.

NumPy has multidimensional array data structures that represent vectors and matrices used in linear algebra, which are in turn used heavily in machine learning algorithms.

NumPy also has many built-in (and optimized) mathematical functions that let you do complex computations quickly, without writing tons of code.

Install with `conda install numpy`

Documentation:
- [NumPy Manual](https://docs.scipy.org/doc/numpy-1.13.0/contents.html)
- [NumPy User Guide](https://docs.scipy.org/doc/numpy-1.13.0/user/index.html)
- [NumPy Reference](https://docs.scipy.org/doc/numpy-1.13.0/reference/index.html#reference)
- [Scipy Lectures](http://www.scipy-lectures.org/intro/numpy/index.html)

In [1]:
# importing numpy
import numpy as np

Compare plain Python vs. NumPy when calculating the mean of a huge set of numbers:

In [2]:
import time

x = np.random.random(100000000)

In [3]:
start = time.time()

sum(x)/len(x)

print(time.time() - start)

12.900490045547485


In [4]:
start = time.time()

np.mean(x)

print(time.time() - start)

0.06459307670593262


## NumPy ndarrays

__ndarray__ stands for n-dimensional array and is a multidimensional array of elements all of the same type. They can hold numbers or strings, or other types.

ndarrays are often used in machine learning. For example, an ndarray could be used to hold pixel values of an image to input into a neural network for image processing/classification.

NumPy supports more numeric data types than plain Python, such as fixed-size integers (`int8`, `int64`) and floats (`float32`, `float64`).

### Creating

Method 1 for creation:
Create by providing python lists to NumPy's `np.array()`

In [5]:
x = np.array([1, 2, 3, 4, 5]) # 1-d array or a rank 1 array

print(x)
print(type(x))

[1 2 3 4 5]
<class 'numpy.ndarray'>


In [6]:
# return dtype of elements in the array
x.dtype

dtype('int64')

### ndarray methods

In [7]:
Y = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
# this is a 2-d array, or rank 2 array

print(Y)

[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


#### Shape
See the dimensions, returned as a tuple.
The tuple below has two entries, so Y is 2-dimensional.

In [8]:
Y.shape

(4, 3)

Shape of a rank 1 array:

In [9]:
x = np.array([1, 2, 3, 4, 5])
x.shape

(5,)

#### Size
Get how many elements are in an array

In [10]:
Y.size

12

ndarrays can only have elements with one data type. If you create one with different datatypes, it will convert them to be the same. This is called upcasting: the elements are converted so precision won't be lost.

In [11]:
x = np.array([1, 2.5, 4])
# Converts integers to floats
print(x, x.dtype)

[1.  2.5 4. ] float64


You could also specify the datatype you want:

In [12]:
x = np.array([1, 2.5, 4], dtype=np.int64)
# Converts everything to integers (2.5 is truncated to 2)
print(x, x.dtype)

[1 2 4] int64


Choosing the datatype is useful if you don't want NumPy to accidentally choose the wrong one, or if you don't need that much precision and want to save memory.
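
As a rough illustration of the memory point, the sketch below stores the same number of values as `float64` and as `float32` and compares their sizes using the `itemsize` and `nbytes` attributes. The array length of 1000 is an arbitrary choice for the example.

```python
import numpy as np

# Same number of elements, two different float precisions
a64 = np.zeros(1000, dtype=np.float64)
a32 = np.zeros(1000, dtype=np.float32)

# itemsize = bytes per element, nbytes = total bytes used by the data
print(a64.itemsize, a64.nbytes)  # 8 8000
print(a32.itemsize, a32.nbytes)  # 4 4000
```

Halving the precision halves the memory used by the data, at the cost of fewer significant digits per value.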

#### Save ndarrays into files

This lets you save ndarrays for later use. 


In [13]:
# Saving
x = np.array([1, 2, 3, 4])
np.save('my_array', x)

# x is saved to a file named my_array.npy in the current working directory

# Loading
# Make sure to include the .npy extension in the filename
y = np.load('my_array.npy')
print(y)

[1 2 3 4]


## Creating ndarrays with Built-in functions

## np.zeros()

In [14]:
# Create an ndarray of zeros
# Takes an argument of the shape of the ndarray as a tuple
# Default datatype is float64
X = np.zeros((3, 4))
print(X)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


## np.ones()

In [15]:
# Create an ndarray of ones, similarly to zeros.
# Takes an argument of the shape of the ndarray as a tuple
# Default datatype is float64
X = np.ones((4, 5))
print(X)

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]


## np.full()

In [16]:
# Create an ndarray of any constant value
# Takes in the shape and the constant value as args
# Default datatype is the datatype of the given constant (here, int64)
X = np.full((4, 5), 5)
print(X)

[[5 5 5 5 5]
 [5 5 5 5 5]
 [5 5 5 5 5]
 [5 5 5 5 5]]


## np.eye()

In [17]:
# Create an identity matrix
# * All identity matrices are square, so only needs one arg
# Default datatype is float64
X = np.eye(5)
print(X)

[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


## np.diag()

In [18]:
# Create a diagonal matrix
# Takes in a list of numbers to use as the main diagonal
# and fills in the rest with zeros
X = np.diag([10, 20, 30, 40])
print(X)

[[10  0  0  0]
 [ 0 20  0  0]
 [ 0  0 30  0]
 [ 0  0  0 40]]


## np.arange()

In [19]:
# Create a 1-dimensional ndarray of evenly spaced values within a given interval
# arange(start, stop, step)
# stop is exclusive

# With 1 arg, it is used as stop.
# Generates an array from 0 up to (but not including) stop
x = np.arange(10)
print(x)

# With 2 args, first is start, second is stop
# start is inclusive, stop is exclusive
x = np.arange(4, 10)
print(x)

# With 3 args, third arg is the step
# Goes from start to stop, evenly spaced by the step interval
x = np.arange(1, 16, 3)
print(x)

[0 1 2 3 4 5 6 7 8 9]
[4 5 6 7 8 9]
[ 1  4  7 10 13]


## np.linspace()

In [20]:
# np.linspace(start, stop, n)
# returns n evenly spaced numbers from start to stop
# both start and stop are inclusive
# Start, stop required, n defaults to 50
x = np.linspace(0, 25, 10)
print(x)

# Set stop point to be exclusive
x = np.linspace(0, 25, 10, endpoint=False)
print(x)

[ 0.          2.77777778  5.55555556  8.33333333 11.11111111 13.88888889
 16.66666667 19.44444444 22.22222222 25.        ]
[ 0.   2.5  5.   7.5 10.  12.5 15.  17.5 20.  22.5]


## np.reshape()

Convert ndarrays into different shapes

In [21]:
# Convert a 1-dimensional array into a multi-dimensional array
x = np.arange(1, 21)
print(x, "\n")

# Takes in array to convert, and the shape you want to convert to
# Shape needs to be compatible with the original array (same number of elements)
X = np.reshape(x, (4, 5))
print(X, "\n")

Y = np.reshape(X, (10, 2))
print(Y)



[ 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20] 

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]
 [11 12 13 14 15]
 [16 17 18 19 20]] 

[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]
 [13 14]
 [15 16]
 [17 18]
 [19 20]]


In [22]:
# reshape() can be used as an ndarray method so you can write inline code
X = np.linspace(0, 50, 10, endpoint=False).reshape(5, 2)
print(X)

[[ 0.  5.]
 [10. 15.]
 [20. 25.]
 [30. 35.]
 [40. 45.]]
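
To illustrate what "compatible" means for `reshape`, here is a small sketch of the failure case: asking for a shape whose element count differs from the original raises a `ValueError`. The `(3, 7)` target shape and the `reshape_failed` flag are just choices made for this example.

```python
import numpy as np

x = np.arange(1, 21)  # 20 elements

try:
    # 3 * 7 = 21 elements, which does not match the 20 elements in x
    np.reshape(x, (3, 7))
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```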


## Create arrays with random numbers

### np.random.random()

In [23]:
# Random ndarray of given shape, with values between 0 and 1
# 0 is inclusive, 1 is exclusive
X = np.random.random((3, 3))
print(X)

[[0.61749554 0.64381786 0.22377283]
 [0.8569913  0.34041235 0.08918336]
 [0.78101588 0.35953508 0.81126181]]


### np.random.randint()

In [24]:
# Create a random array of integers within a specific range, in a given shape
# random.randint(lower-bound, upper-bound, shape)
# lower-bound is inclusive, upper-bound is exclusive
X = np.random.randint(4, 15, (3, 2))
print(X)


[[13 13]
 [ 8  9]
 [14 14]]


### np.random.normal()

Sometimes, you may need to create ndarrays with random numbers that satisfy certain statistical properties. For example, you may want the random numbers in the ndarray to have an average of 0.

NumPy lets you create random ndarrays with numbers calculated from different probability distributions.

The function `np.random.normal(mean, standard deviation, size=shape)` creates an ndarray with the given shape that contains random numbers picked from a `normal` (Gaussian) distribution with the given `mean` and `standard deviation`. 

In [25]:
# Create ndarray that contains random numbers picked 
# from a normal distribution with a given mean and standard deviation
# np.random.normal(mean, std-dev, size)
X = np.random.normal(0, 0.1, size=(100, 100))

# Print info about X
print('X has dimensions:', X.shape)
print('X is an object of type:', type(X))
print('The elements in X are of type:', X.dtype)
print('The elements in X have a mean of:', X.mean())
print('The maximum value in X is:', X.max())
print('The minimum value in X is:', X.min())
print('X has', (X < 0).sum(), 'negative numbers')
print('X has', (X > 0).sum(), 'positive numbers')

X has dimensions: (100, 100)
X is an object of type: <class 'numpy.ndarray'>
The elements in X are of type: float64
The elements in X have a mean of: -0.0019445773230990493
The maximum value in X is: 0.385260279128671
The minimum value in X is: -0.4530296725210538
X has 5025 negative numbers
X has 4975 positive numbers


Here, the average of the random numbers in the ndarray is close to zero. The maximum and minimum values in `X` are roughly symmetric around zero (the mean), and there are about as many positive numbers as negative ones.
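
The sample standard deviation can be checked the same way as the mean. The sketch below fixes the random seed with `np.random.seed()` so that repeated runs give the same numbers; the seed value `0` is an arbitrary choice and is not part of the original cell.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only here to make the run repeatable
X = np.random.normal(0, 0.1, size=(100, 100))

# With 10,000 samples, both statistics should land close to the requested values
print("sample mean:", X.mean())  # close to 0
print("sample std: ", X.std())   # close to 0.1
```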

# Accessing, Deleting, and Inserting Elements into ndarrays

ndarrays are mutable, so they can be changed after creation.
They can be sliced and split in different ways, so you can also get a subset of an ndarray. For machine learning, you can use slicing to separate data, e.g. into training, cross validation, and testing sets.
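
As a minimal sketch of that idea, the code below splits a toy array into three parts with slices. The 60/20/20 proportions and the variable names are assumptions made for illustration; real datasets are usually shuffled before being split.

```python
import numpy as np

data = np.arange(10)  # toy stand-in for a dataset with 10 samples

# Hypothetical 60/20/20 split, computed with integer arithmetic
n_train = len(data) * 6 // 10
n_val = len(data) * 2 // 10

train = data[:n_train]
val = data[n_train:n_train + n_val]
test = data[n_train + n_val:]

print(train)  # [0 1 2 3 4 5]
print(val)    # [6 7]
print(test)   # [8 9]
```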

## Accessing/modifying rank 1 ndarrays

In [26]:
x = np.array([1, 2, 3, 4, 5])
print(x)

# use [] to access indices
print(x[0]) # first index
print(x[1])
print(x[-1]) # last index


[1 2 3 4 5]
1
2
5


In [27]:
x = np.array([1, 2, 3, 4, 5])
print(x)

# use [] = to modify a specific index
x[0] = 20
print(x)

[1 2 3 4 5]
[20  2  3  4  5]


## Accessing/modifying rank 2 ndarrays

Need to provide two indices:
Use `[row, column]` to access

In [28]:
X = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(X)

print(X[0, 0]) # first
print(X[1, 1]) # middle
print(X[2, 2]) # last

[[1 2 3]
 [4 5 6]
 [7 8 9]]
1
5
9


In [29]:
X = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(X, "\n")

# modify with [row, column] = value
X[1, 1] = 99
X[2, 2] = 99
print(X)

[[1 2 3]
 [4 5 6]
 [7 8 9]] 

[[ 1  2  3]
 [ 4 99  6]
 [ 7  8 99]]


## Deleting from ndarrays

`np.delete(ndarray, elements, axis)`
For rank 1 ndarrays the `axis` keyword is not required. For rank 2 ndarrays, `axis = 0` is used to select rows, and `axis = 1` is used to select columns.

In [30]:
# rank 1 
x = np.array([1, 2, 3, 4, 5])
print(x)

# delete first and last element
x = np.delete(x, [0, 4]) 

print(x)

[1 2 3 4 5]
[2 3 4]


In [31]:
# rank 2
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(Y)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [32]:
# delete first row
W = np.delete(Y, 0, axis=0)
print(W)

[[4 5 6]
 [7 8 9]]


In [33]:
# delete first and last column
v = np.delete(Y, [0, 2], axis=1)
print(v)

[[2]
 [5]
 [8]]


## Appending values

Append values to ndarrays using the `np.append(ndarray, elements, axis)` function. This appends the given list of `elements` to `ndarray` along the specified `axis`. 

Note that when appending to rank 2 arrays, the shape must match, or you will get an error!
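
For reference, this is roughly what the mismatch looks like. The sketch tries to append a 2-element row to an array whose rows have 3 elements and catches the resulting `ValueError`; the `append_failed` flag exists only for this example.

```python
import numpy as np

Y = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)

try:
    # A row with 2 elements cannot be appended to rows with 3 elements
    np.append(Y, [[7, 8]], axis=0)
    append_failed = False
except ValueError as err:
    append_failed = True
    print("append failed:", err)
```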

In [34]:
# rank 1 ndarray
x = np.array([1, 2, 3, 4, 5])
print(x)

# does not need axis arg
# one element
x = np.append(x, 6)
print(x)

# multiple elements
x = np.append(x, [7,8])
print(x)

[1 2 3 4 5]
[1 2 3 4 5 6]
[1 2 3 4 5 6 7 8]


In [35]:
# rank 2 ndarray
Y = np.array([[1,2,3],[4,5,6]])
print(Y)

[[1 2 3]
 [4 5 6]]


In [36]:
# Append row
V = np.append(Y, [[7, 8, 9]], axis=0)
print(V)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [37]:
# Append column
Q = np.append(Y, [[9], [10]], axis=1)
print(Q)

[[ 1  2  3  9]
 [ 4  5  6 10]]


## Inserting values

Insert values to ndarrays using the `np.insert(ndarray, index, elements, axis)` function. This inserts the given list of `elements` to `ndarray` right before the given `index` along the specified `axis`. 

In [38]:
# rank 1 ndarray
x = np.array([1, 2, 5, 6, 7])
print(x)

# insert 3, 4 just before the 3rd element (index = 2)
x = np.insert(x, 2, [3, 4])
print(x)

[1 2 5 6 7]
[1 2 3 4 5 6 7]


In [39]:
# rank 2 ndarray
Y = np.array([[1,2,3],[7,8,9]])
print(Y)

[[1 2 3]
 [7 8 9]]


In [40]:
# insert a row between the first and last rows (at index 1)
W = np.insert(Y, 1, [4, 5, 6], axis=0)
print(W)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [41]:
# insert column full of 5s between first and second column
V = np.insert(Y, 1, 5, axis=1)
print(V)

[[1 5 2 3]
 [7 5 8 9]]


In [42]:
# insert column [0, 9] between first and second column
U = np.insert(Y, 1, [0,9], axis=1)
print(U)

[[1 0 2 3]
 [7 9 8 9]]


## Stacking

You can stack ndarrays on top of each other or side by side.

- `np.vstack()` for vertical stacking
- `np.hstack()` for horizontal stacking

__The shapes must match__

In [43]:
# rank 1 ndarray 
x = np.array([1,2])
print(x)

[1 2]


In [44]:
# rank 2 ndarray 
Y = np.array([[3,4],[5,6]])
print(Y)

[[3 4]
 [5 6]]


In [45]:
# Stack x on Y
V = np.vstack((x, Y))
print(V)

[[1 2]
 [3 4]
 [5 6]]


In [46]:
# stack x on the right of Y. 
# reshape x in order to stack it on the right of Y.
W = np.hstack((Y, x.reshape(2, 1)))
print(W)

[[3 4 1]
 [5 6 2]]
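
The reshape matters because `x` has shape `(2,)` while `Y` has shape `(2, 2)`. As a sketch of what happens otherwise, the code below tries the horizontal stack without reshaping and catches the `ValueError`, then repeats it with the reshape. The `hstack_failed` flag is only there for the example.

```python
import numpy as np

x = np.array([1, 2])            # shape (2,)
Y = np.array([[3, 4], [5, 6]])  # shape (2, 2)

try:
    # A rank 1 array cannot be placed next to a rank 2 array as-is
    np.hstack((Y, x))
    hstack_failed = False
except ValueError as err:
    hstack_failed = True
    print("hstack failed:", err)

# Turning x into a 2 x 1 column makes the shapes line up
W = np.hstack((Y, x.reshape(2, 1)))
print(W.shape)  # (2, 3)
```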


## Slicing

For accessing subsets of ndarrays. Slicing is performed by combining indices with `:` inside `[]`.

- `ndarray[start:end]` (between given start and end)
- `ndarray[start:]` (given start all the way to end)
- `ndarray[:end]` (start to given end)

__start is inclusive, end is exclusive__

Since ndarrays can be multidimensional, when doing slicing, you usually have to specify a slice for each dimension of the array.

In [47]:
# 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


In [48]:
# Elements in 2nd to 4th rows, and 3rd to 5th columns
Z = X[1:4, 2:5]
print(Z)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [49]:
# Can also be written as
W = X[1:, 2:5]
print(W)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [50]:
# Elements in 1st to 3rd rows, 3rd to 5th columns
Y = X[:3, 2:5]
print(Y)

[[ 2  3  4]
 [ 7  8  9]
 [12 13 14]]


In [51]:
# All elements in 3rd row
v = X[2, :]
print(v)

[10 11 12 13 14]


In [52]:
# All elements in 3rd column
q = X[:, 2]
print(q)

[ 2  7 12 17]


In [53]:
# All elements in 3rd column, as a rank 2 array
R = X[:, 2:3]
print(R)

[[ 2]
 [ 7]
 [12]
 [17]]


Note that when slicing ndarrays and saving them into new variables, the data is not copied into the new variable.

Rather, the new variable is just a different "view" of the same ndarray. So, if you make changes to the new variable, it will change the original as well.

In [54]:
# 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


In [55]:
# Make a slice
Z = X[1:4,2:5]
print(Z)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [56]:
# Change Z in some way
Z[2, 2] = 555
print(Z)

[[  7   8   9]
 [ 12  13  14]
 [ 17  18 555]]


In [57]:
# X also got changed!
print(X)

[[  0   1   2   3   4]
 [  5   6   7   8   9]
 [ 10  11  12  13  14]
 [ 15  16  17  18 555]]


## Copying

To create ndarrays that contain copies of the values in the slice, use `np.copy(ndarray)`. This function can also be used as a method, in which case you don't need to provide the ndarray as an arg.
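
One way to check whether a variable is a view or an independent copy is `np.shares_memory()`, which reports whether two arrays overlap in memory. This is a small sketch; the variable names are chosen for illustration.

```python
import numpy as np

X = np.arange(20).reshape(4, 5)

view = X[1:4, 2:5]             # plain slice: a view onto X's data
copied = X[1:4, 2:5].copy()    # explicit copy: independent data

print(np.shares_memory(X, view))    # True
print(np.shares_memory(X, copied))  # False
```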

In [58]:
# 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


In [59]:
# Make a slice that's a copy
Z = np.copy(X[1:4,2:5])
print(Z)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [60]:
# OR
W = X[1:4,2:5].copy()
print(W)

[[ 7  8  9]
 [12 13 14]
 [17 18 19]]


In [61]:
# Change Z and W
Z[2, 2] = 555
W[2, 2] = 444
print(Z, "\n")
print(W)

[[  7   8   9]
 [ 12  13  14]
 [ 17  18 555]] 

[[  7   8   9]
 [ 12  13  14]
 [ 17  18 444]]


In [62]:
# Original is same
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]


### Using one ndarray to make slices, select, or change elements in another ndarray

In [63]:
# 4 x 5 ndarray that contains integers from 0 to 19
X = np.arange(20).reshape(4, 5)

# Create a rank 1 ndarray to serve as indices to select elements from X
indices = np.array([1,3])

print(X, "\n")
print(indices)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]] 

[1 3]


In [64]:
# Use the indices ndarray to select the 2nd and 4th row of X
Y = X[indices,:]
print(Y)

[[ 5  6  7  8  9]
 [15 16 17 18 19]]


In [65]:
# Use indices to select 2nd and 4th columns of X
Z = X[:, indices]
print(Z)

[[ 1  3]
 [ 6  8]
 [11 13]
 [16 18]]
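
Unlike plain slices, selecting with an index array returns a copy rather than a view, so changing the result does not change the original. A short sketch of that behaviour:

```python
import numpy as np

X = np.arange(20).reshape(4, 5)
indices = np.array([1, 3])

Y = X[indices, :]  # selection with an index array produces a copy
Y[0, 0] = -1       # modify the selection

print(Y[0, 0])  # -1
print(X[1, 0])  # 5, the original is unchanged
```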


## Selecting diagonals

NumPy has built-in functions to select specific elements within ndarrays. For example, the `np.diag(ndarray, k=N)` function gives you the elements along the `diagonal` defined by `N`. The default is `k=0`, which is the main diagonal. Values of `k > 0` are used to select elements in diagonals above the main diagonal, and values of `k < 0` are used to select elements in diagonals below the main diagonal.

In [66]:
# 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]


In [67]:
# main diagonal
z = np.diag(X)
print(z)

[ 0  6 12 18 24]


In [68]:
# diagonal above main diagonal
y = np.diag(X, k=1)
print(y)

[ 1  7 13 19]


In [69]:
# diagonal below main diagonal
w = np.diag(X, k=-1)
print(w)

[ 5 11 17 23]


### Unique elements

Get unique values in an ndarray with `np.unique()`

In [70]:
# 3 x 3 ndarray with repeated values
X = np.array([[1,2,3],[5,2,8],[1,2,3]])
print(X)


[[1 2 3]
 [5 2 8]
 [1 2 3]]


In [71]:
# Get unique values
y = np.unique(X)
print(y)

[1 2 3 5 8]
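
If you also want to know how often each value occurs, `np.unique()` accepts a `return_counts=True` keyword. A brief sketch using the same array:

```python
import numpy as np

X = np.array([[1, 2, 3], [5, 2, 8], [1, 2, 3]])

# values holds the unique elements, counts holds how many times each appears
values, counts = np.unique(X, return_counts=True)
print(values)  # [1 2 3 5 8]
print(counts)  # [2 3 2 1 1]
```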


## Boolean Indexing

Boolean indexing is useful when you don't know the indices of the elements you want to select, or when you want to select elements based on a condition, e.g., selecting only values that are less than 20. It lets you select elements using logical expressions instead of explicit indices.

In [72]:
# 5 x 5 ndarray that contains integers from 0 to 24
X = np.arange(25).reshape(5, 5)
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]
 [20 21 22 23 24]]


In [73]:
# Get elements greater than 10
y = X[X > 10]
print(y)

# <= 7
y = X[X <= 7]
print(y)


# between 10 and 17 (both exclusive)
y = X[(X > 10) & (X < 17)]
print(y)

[11 12 13 14 15 16 17 18 19 20 21 22 23 24]
[0 1 2 3 4 5 6 7]
[11 12 13 14 15 16]


In [74]:
# Use boolean indexing to assign
X[(X > 10) & (X < 17)] = -1
print(X)

[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 -1 -1 -1 -1]
 [-1 -1 17 18 19]
 [20 21 22 23 24]]
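
It can help to look at the condition on its own. The expression inside the brackets is itself an ndarray of booleans with the same shape as `X`, and indexing with it keeps the elements where the mask is `True`. A small sketch, with `mask` as an illustrative name:

```python
import numpy as np

X = np.arange(25).reshape(5, 5)

# The comparison produces a boolean ndarray shaped like X
mask = (X > 10) & (X < 17)
print(mask.dtype, mask.shape)  # bool (5, 5)

# Summing a boolean array counts the True entries
print(mask.sum())  # 6

print(X[mask])  # [11 12 13 14 15 16]
```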


## Set operations

Use these to compare two ndarrays, e.g. to find common elements or differing elements. They can also be used to join ndarrays via union.


In [75]:
# rank 1 ndarray
x = np.array([1,2,3,4,5])
print(x)

# rank 1 ndarray
y = np.array([6,7,2,8,4])
print(y)

[1 2 3 4 5]
[6 7 2 8 4]


### np.intersect1d()

In [76]:
# Get elements both in x and y
z = np.intersect1d(x, y)
print(z)

[2 4]


### np.setdiff1d()

In [77]:
# Get elements that are in x but not in y
z = np.setdiff1d(x, y)
print(z)

[1 3 5]


### np.union1d()

In [78]:
# All (unique) elements of x and y
z = np.union1d(x, y)
print(z)

[1 2 3 4 5 6 7 8]


## Sorting

ndarrays can be sorted in NumPy using `np.sort()`. Note that this function sorts rank 1 and rank 2 ndarrays in different ways! 

The sort function can also be used as a method, but the two forms treat the original data differently. When `np.sort()` is used as a function, it sorts the ndarray out of place and returns a sorted copy, leaving the original unchanged. BUT, when you use `.sort()` as a method (`ndarray.sort()`), it sorts the ndarray in place, so the original is changed.

### Sorting rank 1 ndarrays

In [79]:
# unsorted rank 1 ndarray
x = np.random.randint(1,11,size=(10,))
print(x)

[1 3 3 1 9 9 5 4 3 7]


In [80]:
# Non-mutating sort
y = np.sort(x)
print(y)
print(x)

[1 1 3 3 3 4 5 7 9 9]
[1 3 3 1 9 9 5 4 3 7]


In [81]:
# Mutating sort
x.sort()
print(x)

[1 1 3 3 3 4 5 7 9 9]


In [82]:
# Sort and get only unique values by combining with np.unique()
# (np.unique() already returns its result sorted, so np.sort() is a harmless extra step)
x = np.random.randint(1,11,size=(10,))
print(x)

y = np.sort(np.unique(x))
print(y)

[ 1  9  5 10  7  9  6 10  6  2]
[ 1  2  5  6  7  9 10]


### Sorting rank 2 ndarrays

When sorting rank 2 ndarrays, use the `axis` keyword to specify the direction of the sort: `axis = 0` sorts the values within each column, and `axis = 1` sorts the values within each row.

In [83]:
# unsorted rank 2 ndarray
X = np.random.randint(1,11,size=(5,5))
print(X)

[[ 4  5  7  7  8]
 [ 1 10  5  9  6]
 [ 6  5 10  6  6]
 [ 4  9  1  9  8]
 [ 8  2  2  1  1]]


In [84]:
# Sort by columns
Y = np.sort(X, axis=0)
print(Y)

[[ 1  2  1  1  1]
 [ 4  5  2  6  6]
 [ 4  5  5  7  6]
 [ 6  9  7  9  8]
 [ 8 10 10  9  8]]


In [85]:
# Sort by rows
W = np.sort(X, axis=1)
print(W)

[[ 4  5  7  7  8]
 [ 1  5  6  9 10]
 [ 5  6  6  6 10]
 [ 1  4  8  9  9]
 [ 1  1  2  2  8]]


# Arithmetic operations and Broadcasting

NumPy lets you do both element-wise operations and matrix operations on ndarrays. 

In order to do element-wise operations, NumPy sometimes uses "broadcasting". Broadcasting is how NumPy handles element-wise arithmetic operations with ndarrays of different shapes. For example, broadcasting is used implicitly when doing arithmetic operations between scalars and ndarrays.

## Element-wise addition, subtraction, multiplication, and division, between ndarrays

You can do basic element-wise calculations functionally (e.g. with `np.add()`) or by using arithmetic operators such as `+`. Both forms do the same operation, but with the functional approach you can set options using keyword arguments.

Note that these operations require the shapes of the ndarrays to match or to be broadcastable.
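
As one example of such a keyword argument, the functional form accepts `out=`, which writes the result into an existing array instead of allocating a new one. The sketch below uses a pre-allocated array named `result`, a name chosen for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([5.5, 6.5, 7.5, 8.5])

# Pre-allocated float array that receives the result
result = np.empty(4, dtype=np.float64)
np.add(x, y, out=result)

print(result)  # [ 6.5  8.5 10.5 12.5]
```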

### Rank 1 ndarrays

In [86]:
# rank 1 ndarrays
x = np.array([1,2,3,4])
y = np.array([5.5,6.5,7.5,8.5])
print(x)
print(y)

[1 2 3 4]
[5.5 6.5 7.5 8.5]


In [87]:
# Adding
print(x + y)
print(np.add(x, y))

[ 6.5  8.5 10.5 12.5]
[ 6.5  8.5 10.5 12.5]


In [88]:
# subtracting
print(x - y)
print(np.subtract(x, y))

[-4.5 -4.5 -4.5 -4.5]
[-4.5 -4.5 -4.5 -4.5]


In [89]:
# multiplying
print(x * y)
print(np.multiply(x, y))

[ 5.5 13.  22.5 34. ]
[ 5.5 13.  22.5 34. ]


In [90]:
# dividing
print(x / y)
print(np.divide(x, y))

[0.18181818 0.30769231 0.4        0.47058824]
[0.18181818 0.30769231 0.4        0.47058824]


### Rank 2 ndarrays

**must have the same shape or be broadcastable**

In [91]:
# Rank 2 ndarrays
X = np.array([1,2,3,4]).reshape(2,2)
Y = np.array([5.5,6.5,7.5,8.5]).reshape(2,2)
print(X)
print()
print(Y)

[[1 2]
 [3 4]]

[[5.5 6.5]
 [7.5 8.5]]


In [92]:
# Adding
print(X + Y)
print()
print(np.add(X, Y))

[[ 6.5  8.5]
 [10.5 12.5]]

[[ 6.5  8.5]
 [10.5 12.5]]


In [93]:
# subtracting
print(X - Y)
print()
print(np.subtract(X, Y))

[[-4.5 -4.5]
 [-4.5 -4.5]]

[[-4.5 -4.5]
 [-4.5 -4.5]]


In [94]:
# multiplying
print(X * Y)
print()
print(np.multiply(X, Y))

[[ 5.5 13. ]
 [22.5 34. ]]

[[ 5.5 13. ]
 [22.5 34. ]]


In [95]:
# dividing
print(X / Y)
print()
print(np.divide(X, Y))

[[0.18181818 0.30769231]
 [0.4        0.47058824]]

[[0.18181818 0.30769231]
 [0.4        0.47058824]]


### Other mathematical functions
NumPy has other mathematical functions, such as `np.sqrt(x)`, which operate on all elements of an ndarray at once.

#### Square root

In [96]:
x = np.array([1,2,3,4])
print(x)
print(np.sqrt(x))

[1 2 3 4]
[1.         1.41421356 1.73205081 2.        ]


#### Exponential

In [97]:
print(np.exp(x))

[ 2.71828183  7.3890561  20.08553692 54.59815003]


#### Power

In [98]:
# raise all elements to power of 2
print(np.power(x, 2)) 

[ 1  4  9 16]


## Statistical functions

You can get the mean, min, max, standard deviation, etc. with ndarray methods.


In [99]:
# 2 x 2 ndarray
X = np.array([[1,2], [3,4]])
print(X)

[[1 2]
 [3 4]]


In [100]:
# Mean of all elements of X
print(X.mean())

# Mean of all elements in columns of X
print(X.mean(axis=0))

# Mean of all elements in rows of X
print(X.mean(axis=1))

2.5
[2. 3.]
[1.5 3.5]


In [101]:
# Sum of all elements of X
print(X.sum())

# Sum of all elements in columns of X
print(X.sum(axis=0))

# Sum of all elements in rows of X
print(X.sum(axis=1))

10
[4 6]
[3 7]


In [102]:
# Standard deviation of all elements of X
print(X.std())

# Standard deviation of all elements in columns of X
print(X.std(axis=0))

# Standard deviation of all elements in rows of X
print(X.std(axis=1))

1.118033988749895
[1. 1.]
[0.5 0.5]


In [103]:
# Median of all elements
print(np.median(X))

# Median of all elements in columns of X
print(np.median(X, axis=0))

# Median of all elements in rows of X
print(np.median(X, axis=1))

2.5
[2. 3.]
[1.5 3.5]


In [104]:
# Max of all elements of X
print(X.max())

# Max of all elements in columns of X
print(X.max(axis=0))

# Max of all elements in rows of X
print(X.max(axis=1))

4
[3 4]
[2 4]


In [105]:
# Min of all elements of X
print(X.min())

# Min of all elements in columns of X
print(X.min(axis=0))

# Min of all elements in rows of X
print(X.min(axis=1))

1
[1 2]
[1 3]


## Add, subtract, multiply, divide simple numbers to each element

Broadcasting lets NumPy apply an operation between a single number and an ndarray as if the number were an array of the same shape. For example, that lets us add 3 to each element of a multi-dimensional ndarray `X` with just one line of code.

In [106]:
# We create a 2 x 2 ndarray
X = np.array([[1,2], [3,4]])
print(X)

[[1 2]
 [3 4]]


In [107]:
# adding
print(3 + X)

[[4 5]
 [6 7]]


In [108]:
# subtracting
print(3 - X)

[[ 2  1]
 [ 0 -1]]


In [109]:
# multiplying
print(3 * X)

[[ 3  6]
 [ 9 12]]


In [110]:
# dividing
print(3 / X)

[[3.   1.5 ]
 [1.   0.75]]


NumPy can do the same for two ndarrays of different shapes, as long as the shapes are compatible, as below:

In [111]:
# Create a rank 1 ndarray
x = np.array([1,2,3])

# Create a 3 x 3 ndarray
Y = np.array([[1,2,3],[4,5,6],[7,8,9]])

# Create a 3 x 1 ndarray
Z = np.array([1,2,3]).reshape(3,1)

print(x)
print()
print(Y)
print()
print(Z)

[1 2 3]

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[1]
 [2]
 [3]]


In [112]:
# add a rank 1 ndarray (3 elements) to a 3 x 3 ndarray
print(x + Y)

[[ 2  4  6]
 [ 5  7  9]
 [ 8 10 12]]


In [113]:
# add 3 x 1 ndarray to 3 x 3 ndarray
print(Z + Y)

[[ 2  3  4]
 [ 6  7  8]
 [10 11 12]]
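
In short, NumPy compares shapes starting from the last dimension, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below shows one compatible pair and one incompatible pair; the `broadcast_failed` flag is only part of this example.

```python
import numpy as np

x = np.array([1, 2, 3])                # shape (3,)
Z = np.array([1, 2, 3]).reshape(3, 1)  # shape (3, 1)

# Shapes (3,) and (3, 1) are compatible and broadcast to (3, 3)
S = x + Z
print(S.shape)  # (3, 3)
print(S)

try:
    # Shapes (3,) and (4,) are not compatible: 3 != 4 and neither is 1
    x + np.array([1, 2, 3, 4])
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("broadcast failed:", err)
```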


NumPy Documentation for Broadcasting and its rules:
[Broadcasting](https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html)