Numpy
-----

Numpy is the core library for scientific computing in Python. It
provides a high-performance multidimensional array object, and tools for
working with these arrays.

To use Numpy, we first need to import the `numpy` package. By
convention, we import it using the alias `np`. Then, when we want to use
modules or functions in this library, we preface them with `np.`

In [1]:
import numpy as np

### Arrays and array construction

A numpy array is a grid of values, all of the same type, and is indexed
by a tuple of nonnegative integers. The number of dimensions is the rank
of the array; the shape of an array is a tuple of integers giving the
size of the array along each dimension.

We can create a `numpy` array by passing a Python list to `np.array()`.

In [2]:
a = np.array([1, 2, 3])  # Create a rank 1 array
a

array([1, 2, 3])

This creates the array we can see on the right here:

![](http://jalammar.github.io/images/numpy/create-numpy-array-1.png)

In [3]:
print(type(a), a.shape, a[0], a[1], a[2])
a[0] = 5                 # Change an element of the array
print(a)                  

<class 'numpy.ndarray'> (3,) 1 2 3
[5 2 3]


To create a `numpy` array with more dimensions, we can pass nested
lists, like this:

![](http://jalammar.github.io/images/numpy/numpy-array-create-2d.png)

![](http://jalammar.github.io/images/numpy/numpy-3d-array.png)

In [4]:
b = np.array([[1,2],[3,4]])   # Create a rank 2 array
print(b)

[[1 2]
 [3 4]]


In [5]:
print(b.shape)

(2, 2)
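The same idea extends to three dimensions. The snippet below is a small sketch with made-up values (the name `arr3d` is only for illustration): three levels of nested lists produce a rank 3 array.

```python
import numpy as np

# Illustrative values: two 2x2 blocks nested in an outer list
arr3d = np.array([[[1, 2], [3, 4]],
                  [[5, 6], [7, 8]]])

print(arr3d.ndim)       # number of dimensions (the rank): 3
print(arr3d.shape)      # size along each dimension: (2, 2, 2)
print(arr3d[1, 0, 1])   # one integer index per dimension: 6
```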


There are often cases when we want numpy to initialize the values of the
array for us. numpy provides functions like `ones()`, `zeros()`, and
`random.random()` for these cases. We just pass them the number of
elements we want them to generate:

![](http://jalammar.github.io/images/numpy/create-numpy-array-ones-zeros-random.png)

We can also use these methods to produce multi-dimensional arrays, as
long as we pass them a tuple describing the dimensions of the matrix we
want to create:

![](http://jalammar.github.io/images/numpy/numpy-matrix-ones-zeros-random.png)

![](http://jalammar.github.io/images/numpy/numpy-3d-array-creation.png)
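As a rough sketch of what the figures above show (the variable names here are only illustrative), passing an integer gives a rank 1 array, while passing a tuple gives an array with that shape:

```python
import numpy as np

ones_1d = np.ones(3)                    # three ones in a rank 1 array
zeros_1d = np.zeros(3)                  # three zeros in a rank 1 array
rand_1d = np.random.random(3)           # three random values in [0, 1)

ones_2d = np.ones((3, 2))               # tuple -> rank 2 array of shape (3, 2)
rand_3d = np.random.random((4, 3, 2))   # tuple -> rank 3 array of shape (4, 3, 2)

print(ones_1d, zeros_1d)
print(rand_1d.shape, ones_2d.shape, rand_3d.shape)
```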

Sometimes, we need an array of a specific shape with “placeholder”
values that we plan to fill in with the result of a computation. The
`zeros` or `ones` functions are handy for this:

In [6]:
a = np.zeros((2,2))  # Create an array of all zeros
print(a)

[[0. 0.]
 [0. 0.]]


In [7]:
b = np.ones((1,2))   # Create an array of all ones
print(b)

[[1. 1.]]


In [8]:
c = np.full((2,2), 7) # Create a constant array
print(c)

[[7 7]
 [7 7]]


In [9]:
d = np.eye(2)        # Create a 2x2 identity matrix
print(d)

[[1. 0.]
 [0. 1.]]


In [10]:
e = np.random.random((2,2)) # Create an array filled with random values
print(e)

[[0.80275712 0.65149981]
 [0.32200107 0.65800372]]


Lastly, Numpy has two useful functions for creating sequences of numbers
within a specified range: `arange` and `linspace`. `arange` follows the
same argument conventions as Python's `range`.

The `arange` function accepts three arguments, which define the start
value, stop value of a half-open interval, and step size. (The default
step size, if not explicitly specified, is 1; the default start value,
if not explicitly specified, is 0.)

The `linspace` function is similar, but we can specify the number of
values instead of the step size, and it will create a sequence of evenly
spaced values.

In [11]:
f = np.arange(10,50,5)   # Create an array of values starting at 10 in increments of 5
print(f)

[10 15 20 25 30 35 40 45]


Note this ends on 45, not 50 (does not include the top end of the
interval).

In [12]:
g = np.linspace(0., 1., num=5)
print(g)

[0.   0.25 0.5  0.75 1.  ]
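To illustrate the defaults described above, here is a short sketch (the variable names are only for illustration). It also uses the `endpoint` keyword of `linspace`, which is not covered in the text above, to leave out the stop value:

```python
import numpy as np

only_stop = np.arange(5)         # start defaults to 0, step defaults to 1
start_stop = np.arange(2, 6)     # step defaults to 1; 6 is excluded
no_end = np.linspace(0., 1., num=5, endpoint=False)  # 5 values, 1.0 excluded

print(only_stop)    # values 0, 1, 2, 3, 4
print(start_stop)   # values 2, 3, 4, 5
print(no_end)       # values 0, 0.2, 0.4, 0.6, 0.8
```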


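Sometimes, we may want to construct an array from existing arrays by
“stacking” the existing arrays, either vertically or horizontally. We
can use `vstack()` and `hstack()`, respectively. (`row_stack` is an older
alias of `vstack()`. `column_stack()` is related to `hstack()`, but it
turns rank 1 inputs into columns of a rank 2 array, so the two are not
interchangeable.)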

In [13]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.vstack((a,b))

array([[1, 2, 3],
       [4, 5, 6]])

In [14]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.hstack((a,b))

array([1, 2, 3, 4, 5, 6])
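For rank 1 inputs, `hstack()` and `column_stack()` give different results. The following sketch reuses the same two arrays to show the difference (`h` and `c` are illustrative names):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

h = np.hstack((a, b))         # joins end to end: shape (6,)
c = np.column_stack((a, b))   # each input becomes a column: shape (3, 2)

print(h, h.shape)
print(c, c.shape)
```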

### Array indexing

Numpy offers several ways to index into arrays.

We can index and slice numpy arrays in all the ways we can slice Python
lists:

![](http://jalammar.github.io/images/numpy/numpy-array-slice.png)
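A minimal sketch of list-style indexing and slicing on a rank 1 array (the name `data` and its values are only illustrative):

```python
import numpy as np

data = np.array([1, 2, 3])

print(data[0])     # first element: 1
print(data[0:2])   # elements 0 and 1: [1 2]
print(data[1:])    # from index 1 to the end: [2 3]
print(data[-2:])   # last two elements: [2 3]
```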

And you can index and slice numpy arrays in multiple dimensions. If
slicing an array with more than one dimension, you should specify a
slice for each dimension:

![](http://jalammar.github.io/images/numpy/numpy-matrix-indexing.png)

In [15]:
# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]
print(b)

[[2 3]
 [6 7]]


A slice of an array is a view into the same data, so modifying it will
modify the original array.

In [16]:
print(a[0, 1])
b[0, 0] = 77    # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1]) 

2
77
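If we do not want changes to propagate back to the original array, one option is to take an explicit copy of the slice. The sketch below uses fresh, illustrative names (`m`, `m_view`, `m_copy`) so that it does not interfere with the arrays above:

```python
import numpy as np

m = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

m_view = m[:2, 1:3]          # a view: shares data with m
m_copy = m[:2, 1:3].copy()   # an independent copy of the same region

m_copy[0, 0] = 77            # only the copy changes
print(m[0, 1])               # still 2

print(np.shares_memory(m, m_view))   # True
print(np.shares_memory(m, m_copy))   # False
```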


There are two ways of accessing the data in the middle row of the array.
Mixing integer indexing with slices yields an array of lower rank, while
using only slices yields an array of the same rank as the original array:

In [17]:
a

array([[ 1, 77,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [18]:
row_r1 = a[1, :]    # Rank 1 view of the second row of a
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
print(row_r1, row_r1.shape)
print(row_r2, row_r2.shape)

[5 6 7 8] (4,)
[[5 6 7 8]] (1, 4)


In [19]:
# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)
print()
print(col_r2, col_r2.shape)

[77  6 10] (3,)

[[77]
 [ 6]
 [10]] (3, 1)


Boolean array indexing lets you pick out arbitrary elements of an array.
Frequently this type of indexing is used to select the elements of an
array that satisfy some condition. Here is an example:

In [20]:
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)  # Find the elements of a that are bigger than 2;
                    # this returns a numpy array of Booleans of the same
                    # shape as a, where each slot of bool_idx tells
                    # whether that element of a is > 2.

print(bool_idx)

[[False False]
 [ True  True]
 [ True  True]]
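The Boolean mask can then be used as an index to pull out the matching elements. A short sketch, repeating the same array so that it runs on its own (`selected` is an illustrative name):

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
bool_idx = (a > 2)

# Indexing with the mask returns a rank 1 array of the elements
# where the mask is True
selected = a[bool_idx]
print(selected)      # [3 4 5 6]

# The mask can also be built inline in a single statement
print(a[a > 2])      # [3 4 5 6]
```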


### Array math

What makes working with `numpy` so powerful and convenient is that it
comes with many *vectorized* math functions for computation over
elements of an array. These functions are highly optimized and are
*very* fast - much, much faster than using an explicit `for` loop.

For example, let’s create a large array of random values and then sum it
both ways. We’ll use a `%%time` *cell magic* to time them.

In [21]:
a = np.random.random(100000000)

In [22]:
%%time
x = np.sum(a)

CPU times: user 103 ms, sys: 7 µs, total: 103 ms
Wall time: 105 ms


In [23]:
%%time
x = 0 
for element in a:
  x = x + element

CPU times: user 17.1 s, sys: 55.9 ms, total: 17.2 s
Wall time: 17.3 s


Look at the “Wall time” in the output - note how much faster the
vectorized version of the operation is! This type of fast computation is
a major enabler of machine learning, which requires a *lot* of
computation.

Whenever possible, we will try to use these vectorized operations.

Some mathematical functions are available both as operator overloads and
as functions in the numpy module.

For example, you can perform an elementwise sum on two arrays using
either the + operator or the `add()` function.

![](http://jalammar.github.io/images/numpy/numpy-arrays-adding-1.png)

![](http://jalammar.github.io/images/numpy/numpy-matrix-arithmetic.png)

In [24]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the same array
print(x + y)
print(np.add(x, y))

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]


And this works for other operations as well, not only addition:

![](http://jalammar.github.io/images/numpy/numpy-array-subtract-multiply-divide.png)

In [25]:
# Elementwise difference; both produce the same array
print(x - y)
print(np.subtract(x, y))

[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


In [26]:
# Elementwise product; both produce the same array
print(x * y)
print(np.multiply(x, y))

[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]


In [27]:
# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]


In [28]:
# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

[[1.         1.41421356]
 [1.73205081 2.        ]]


We use the `dot()` function to compute inner
products of vectors, to multiply a vector by a matrix, and to multiply
matrices. `dot()` is available both as a function in the numpy module
and as an instance method of array objects:

![](http://jalammar.github.io/images/numpy/numpy-matrix-dot-product-1.png)

In [29]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

219
219


You can also use the `@` operator, which gives the same result as numpy's
`dot` function for rank 1 and rank 2 arrays.

In [30]:
print(v @ w)

219
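For arrays of rank 3 or higher, `@` (which corresponds to `np.matmul`) and `dot` behave differently. The sketch below uses illustrative arrays `p` and `q` and compares only the result shapes: `@` treats the leading dimension as a batch of matrices, while `dot` combines every matrix in `p` with every matrix in `q`.

```python
import numpy as np

p = np.ones((2, 3, 4))
q = np.ones((2, 4, 5))

matmul_shape = (p @ q).shape     # batched matrix product
dot_shape = np.dot(p, q).shape   # all pairs of matrices

print(matmul_shape)   # (2, 3, 5)
print(dot_shape)      # (2, 3, 2, 5)
```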


In [31]:
# Matrix / vector product; all three produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))
print(x @ v)

[29 67]
[29 67]
[29 67]


In [32]:
# Matrix / matrix product; all three produce the rank 2 array
# [[19 22]
#  [43 50]]
print(x.dot(y))
print(np.dot(x, y))
print(x @ y)

[[19 22]
 [43 50]]
[[19 22]
 [43 50]]
[[19 22]
 [43 50]]


Besides the functions that overload operators, Numpy also provides
many useful functions for performing computations on arrays, such as
`min()`, `max()`, `sum()`, and others:

![](http://jalammar.github.io/images/numpy/numpy-matrix-aggregation-1.png)

In [33]:
x = np.array([[1, 2], [3, 4], [5, 6]])

print(np.max(x)) 
print(np.min(x))  
print(np.sum(x)) 

6
1
21


Not only can we aggregate all the values in a matrix using these
functions, but we can also aggregate across the rows or columns by using
the `axis` parameter:

![](http://jalammar.github.io/images/numpy/numpy-matrix-aggregation-4.png)

In [34]:
x = np.array([[1, 2], [5, 3], [4, 6]])

print(np.max(x, axis=0))  # Compute max of each column; prints "[5 6]"
print(np.max(x, axis=1))  # Compute max of each row; prints "[2 5 6]"

[5 6]
[2 5 6]
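The `axis` parameter works the same way for other aggregation functions. A brief sketch with the same array (the result names are illustrative, and `keepdims` is an extra option not discussed above):

```python
import numpy as np

x = np.array([[1, 2], [5, 3], [4, 6]])

col_sums = np.sum(x, axis=0)   # sum of each column
row_sums = np.sum(x, axis=1)   # sum of each row
row_mins = np.min(x, axis=1)   # min of each row

print(col_sums)   # [10 11]
print(row_sums)   # [ 3  8 10]
print(row_mins)   # [1 3 4]

# keepdims=True keeps the reduced axis with size 1, so the result stays rank 2
print(np.sum(x, axis=1, keepdims=True).shape)   # (3, 1)
```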


You can find the full list of mathematical functions provided by numpy
in the
[documentation](http://docs.scipy.org/doc/numpy/reference/routines.math.html).

Apart from computing mathematical functions using arrays, we frequently
need to reshape or otherwise manipulate data in arrays. The simplest
example of this type of operation is transposing a matrix; to transpose
a matrix, simply use the `T` attribute of an array object.

![](http://jalammar.github.io/images/numpy/numpy-transpose.png)

In [35]:
x = np.array([[1, 2], [3, 4], [5, 6]])

print(x)
print("transpose\n", x.T)

[[1 2]
 [3 4]
 [5 6]]
transpose
 [[1 3 5]
 [2 4 6]]
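Reshaping is another common manipulation. The following is a minimal sketch using the `reshape()` method with illustrative names (`seq`, `grid`); it also shows that taking the transpose of a rank 1 array leaves its shape unchanged.

```python
import numpy as np

seq = np.arange(6)          # rank 1 array with values 0..5
grid = seq.reshape(2, 3)    # same data arranged as 2 rows and 3 columns

print(grid)
print(grid.T.shape)   # (3, 2)
print(seq.T.shape)    # (6,): transposing a rank 1 array does nothing

# -1 asks numpy to infer that dimension from the total number of elements
print(seq.reshape(-1, 1).shape)   # (6, 1)
```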


# Lab Task

In [37]:
# 1. Import the numpy package under the name np


In [20]:
# 2. Create a null vector of size 10


In [21]:
# 3. Create a null vector of size 10, except for the fifth value, which is 1


In [22]:
# 4. Create a vector with values ranging from 10 to 49


In [23]:
# 5. Create a 3x3 matrix with values ranging from 0 to 8


In [24]:
# 6. Create a 10x10 array with random values and find the minimum and maximum values 


In [25]:
# 7. Create a 5x5 matrix with row values ranging from 0 to 4


In [26]:
# 8. Create a vector of size 10 with values ranging from 0 to 1, both excluded 


In [27]:
# 9. Create a 2d array with 1 on the border and 0 inside


In [28]:
# 10. Create a random vector of size 10 and sort it 


In [29]:
# 11. Subtract the mean of each row of an array


In [30]:
# 12. Create a 2d array and swap two of its rows


In [None]:
# 13. Find the number of occurrences of a sequence in an array


In [None]:
# 14. Combine a one-dimensional and a two-dimensional array

In [None]:
# 15. Flatten a 2d array into 1d array

In [None]:
# 16. Determine the occurrence frequency of distinct values within an array

In [None]:
# 17. Compute the determinant of a matrix

In [None]:
# 18. Ways to add row/columns in an array

In [None]:
# 19. Convert a matrix into a list

In [None]:
# 20. Calculate inner, outer, and cross products of matrices and vectors