# Xiaotian Tian's Notes on NumPy Basics

Learned from:
- https://numpy.org/doc/stable/user/quickstart.html
- https://cs231n.github.io/python-numpy-tutorial/#numpy

## Arrays

A **numpy array** is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
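
As a small illustration of the definition above, the sketch below shows that NumPy converts mixed inputs to a single common type, and that an element is addressed by a tuple of integers. The variable names `mixed` and `grid` are made up for this example.

```python
import numpy as np

# Mixing ints and a float: every element is upcast to one common type.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)            # float64
print(mixed)                  # [1.  2.  3.5]

# A 2 x 3 array has rank (ndim) 2 and shape (2, 3).
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid.ndim, grid.shape)  # 2 (2, 3)

# Indexing with a tuple of nonnegative integers; grid[1, 2] is shorthand for this.
print(grid[(1, 2)])           # 6
```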

### Array Basics

In [46]:
%%time

import numpy as np

# initialise
a = np.array([[1, 2, 3],
            [4, 5, 6],
            [1, 2, 3]])
a

CPU times: user 62 µs, sys: 4 µs, total: 66 µs
Wall time: 72 µs


array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3]])

In [47]:
# shape indicates the size of the array in each dimension
a.shape

(3, 3)

In [48]:
# we can index into the array: a[0] is the first row,
# a[0][0] (more idiomatically a[0, 0]) is its first element
a[0]
a[0][0]

1

In [49]:
# ndim is the rank
a.ndim

2

In [50]:
# type of the array object
type(a)

numpy.ndarray

### Array Creation

In [51]:
# create array from list
a = np.array([[1, 2, 3],
            [4, 5, 6],
            [1, 2, 3]])
a

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3]])

In [52]:
# create array of all zeros
b = np.zeros((2,3))
b

array([[0., 0., 0.],
       [0., 0., 0.]])

In [53]:
# create array of all ones
c = np.ones((1,4))
c

array([[1., 1., 1., 1.]])

In [54]:
# create array of a specific constant
d = np.full((3,2), 0.4)
d

array([[0.4, 0.4],
       [0.4, 0.4],
       [0.4, 0.4]])

In [55]:
# create identity matrix
e = np.eye(3)
e

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [56]:
# create random array
f = np.random.random((3,3))
f

array([[0.85091372, 0.12870585, 0.05119397],
       [0.09019419, 0.29338164, 0.63070636],
       [0.77954241, 0.81753751, 0.69119168]])

In [57]:
%reset

Once deleted, variables cannot be recovered. Proceed (y/[n])?  y


### Array Indexing and Slicing

#### Slicing Indexing and Integer Indexing

In [58]:
%%time

# reimport numpy
import numpy as np

# create a sample array
a = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
])

CPU times: user 49 µs, sys: 39 µs, total: 88 µs
Wall time: 94.2 µs


In [59]:
# slicing
# Use slicing to pull out the subarray consisting of
# the first 2 rows and columns 3 and 4 (indices 2 and 3);
# this produces an array of shape (2, 2):
a[:2, 2:4]

array([[3, 4],
       [7, 8]])

Slicing only creates a view of the original array, not a copy, so modifying the slice also modifies the original array:

In [60]:
b = a[:2, :2]
b[0, 0] = 114514
a

array([[114514,      2,      3,      4],
       [     5,      6,      7,      8],
       [     9,     10,     11,     12]])
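
If an independent subarray is needed, one option is to call `.copy()` on the slice. The sketch below contrasts the two cases; `np.shares_memory` is used only to check whether two arrays share memory, and the names `view` and `independent` are made up for this example.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])

view = a[:2, :2]                # slice: shares memory with a
independent = a[:2, :2].copy()  # explicit copy: owns its own data

independent[0, 0] = -1          # does not touch a
print(a[0, 0])                         # 1
print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False
```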

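Indexing with slices alone (**slicing indexing**) keeps the rank of the original array, while each **integer index** drops one dimension and so yields an array of lower rank (in both cases the result is still a view):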

In [61]:
c = np.array([
    [1, 1, 2, 3],
    [1, 234, 21, 2],
    [21, 2, 1, 3]
])

# purely slicing indexing
d = c[:1, :]
print(d)
d.shape

[[1 1 2 3]]


(1, 4)

In [62]:
# any integer indexing yields an array of a lower rank
e = c[0, :]
print(e)
e.shape

[1 1 2 3]


(4,)

The same distinction applies to columns (note that a single column selected with an integer index is returned as a rank 1 array, so it prints like a row; nothing is transposed):

In [63]:
f = c[:, 0]
print(f)
f.shape

[ 1  1 21]


(3,)

In [64]:
g = c[:, :1]
print(g)
g.shape

[[ 1]
 [ 1]
 [21]]


(3, 1)
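
The following sketch puts the two cases side by side and checks that both results are views of `c`, so writing through either one changes the original. The names `row` and `col` are made up for this example, and the `reshape(-1, 1)` call is just one way to get a column-shaped result back from a rank 1 array.

```python
import numpy as np

c = np.array([[1, 1, 2, 3],
              [1, 234, 21, 2],
              [21, 2, 1, 3]])

row = c[0, :]    # integer + slice: rank 1, shape (4,)
col = c[:, :1]   # slices only: rank 2, shape (3, 1)
print(row.shape, col.shape)      # (4,) (3, 1)

# Both are views, so a write through row shows up in c.
row[0] = 100
print(c[0, 0])                   # 100
print(np.shares_memory(c, row))  # True
print(np.shares_memory(c, col))  # True

# A rank 1 column can be given shape (3, 1) again with reshape.
print(c[:, 0].reshape(-1, 1).shape)  # (3, 1)
```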

When you index into numpy arrays using *slicing*, the resulting array view will always be a *subarray of the original array*.
In contrast, *integer array indexing* allows you to construct arbitrary arrays using the data from another array. Here is an example:

In [75]:
h = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8]
])

# an example of integer array indexing instead of slicing
g = h[[0, 2, 3], [0, 1, 0]]
print(g)

# this is equivalent to:
k = np.array([h[0,0], h[2,1], h[3,0]])
print(k)

[1 6 7]
[1 6 7]
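
Unlike slicing, integer array indexing returns a new array rather than a view, and the same element may be picked more than once. A minimal sketch, where the name `picked` is made up for this example:

```python
import numpy as np

h = np.array([[1, 2],
              [3, 4],
              [5, 6],
              [7, 8]])

picked = h[[0, 2, 3], [0, 1, 0]]    # integer array indexing
picked[0] = 99                      # changes only the new array
print(picked)                       # [99  6  7]
print(h[0, 0])                      # 1
print(np.shares_memory(h, picked))  # False

# The same element can be selected repeatedly.
print(h[[0, 0], [1, 1]])            # [2 2]
```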


One useful trick with integer array indexing is selecting or mutating one element from *each row* of a matrix.

In [77]:
l = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
    [10, 11, 12]
])

# indices array
m = np.array([0, 0, 0, 1])

# mutate the original array:
# add 10 to one element from each row of l, using the column indices in m
l[np.arange(4), m] += 10  # np.arange(4) yields array([0, 1, 2, 3])
l

array([[11,  2,  3],
       [14,  5,  6],
       [17,  8,  9],
       [10, 21, 12]])
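
The cell above covers mutation; selection works the same way. The sketch below reads one element from each row without modifying anything. The names `mat`, `cols`, and `rows` and the particular column indices are made up for this example.

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9],
                [10, 11, 12]])

cols = np.array([0, 2, 0, 1])     # column to take from each row
rows = np.arange(mat.shape[0])    # array([0, 1, 2, 3])

# Pairs up (0, 0), (1, 2), (2, 0), (3, 1) and reads those elements.
selected = mat[rows, cols]
print(selected)                   # [ 1  6  7 11]
```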

In [78]:
%reset

Once deleted, variables cannot be recovered. Proceed (y/[n])?  y


#### Boolean Indexing

We can use **boolean indexing** to select the elements of an array that satisfy a condition. A comparison such as `a > 2` first produces a boolean array of the same shape as `a`, and that boolean array can then be used as an index to pick out the matching elements.

In [81]:
import numpy as np

a = np.array([
    [1, 2],
    [3, 4],
    [5, 6]
])

# the comparison yields a boolean array with the same shape as a
bool_idx = (a > 2)

bool_idx

array([[False, False],
       [ True,  True],
       [ True,  True]])

In [82]:
# create a rank 1 array
# which consists of all of the elements
# that satisfy a boolean expression:
print(a[a > 2])

[3 4 5 6]
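
Two common extensions of the idea, sketched under the same setup: conditions can be combined element-wise with `&`, `|`, and `~` (each comparison needs its own parentheses), and assigning through a boolean index modifies the array in place. The name `mask` is made up for this example.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# Elements that are greater than 2 AND even.
mask = (a > 2) & (a % 2 == 0)
print(a[mask])   # [4 6]

# Assignment through a boolean index changes a itself.
a[a > 4] = 0
print(a)
# [[1 2]
#  [3 4]
#  [0 0]]
```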
