# NumPy Basics
---

In [None]:
%conda install numpy

- Array library
- Mainly used for numerical computation on arrays
- Provides a data structure (the N-dimensional array) to represent tensors
- Mostly implemented in C (Python acts as a glue language around the C code)

$$
z = \sum_i x_i w_i = \mathbf{x}^\top \mathbf{w} = x_1 \times w_1 + x_2 \times w_2 + ... + x_n \times w_n
$$

In [None]:
import numpy as np

def numpy_dotproduct_approach(x, w):
    # equivalent to np.dot(x, w) and, for 1D arrays, to x @ w
    return x.dot(w)

a = np.array([1., 2., 3.])
b = np.array([4., 5., 6.])

print(numpy_dotproduct_approach(a, b))

NumPy example of dot product
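
To connect the formula above to the code, here is a minimal sketch that computes the same sum with an explicit Python loop and compares it with NumPy. The helper name `python_dotproduct_approach` is hypothetical and only used for illustration.

```python
import numpy as np

def python_dotproduct_approach(x, w):
    # hypothetical helper: z = sum_i x_i * w_i, written as a plain Python loop
    z = 0.0
    for x_i, w_i in zip(x, w):
        z += x_i * w_i
    return z

a = np.array([1., 2., 3.])
b = np.array([4., 5., 6.])

loop_result = python_dotproduct_approach(a, b)  # 1*4 + 2*5 + 3*6
numpy_result = a.dot(b)                         # same value via NumPy
operator_result = a @ b                         # @ is equivalent for 1D arrays

print(loop_result, numpy_result, operator_result)  # 32.0 32.0 32.0
```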

## N-dimensional Arrays

A NumPy array is similar to a Python list, except that it has a fixed size and all of its elements
must have the same type.

In [None]:
import numpy as np

lst =  [[1, 2, 3], 
        [4, 5, 6]]
ary2d = np.array(lst)
ary2d

The index before the comma selects along the first axis (rows); the index after the comma selects along the second axis (columns).

In [None]:
ary2d[1, 2]

Check the number of dimensions of the array

In [None]:
ary2d.ndim

Shape of the array (a 2x3 matrix)

In [None]:
ary2d.shape

# len(ary2d.shape) == ndim
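
The "same type" restriction can be inspected through the `dtype` attribute. A small sketch, assuming a mixed int/float input list; the variable names `ary_int` and `ary_mixed` are made up for this example:

```python
import numpy as np

ary_int = np.array([[1, 2, 3],
                    [4, 5, 6]])
ary_mixed = np.array([1, 2.5, 3])  # the integers are upcast to float

print(ary_int.dtype)    # an integer type (exact width depends on the platform)
print(ary_mixed.dtype)  # float64
print(ary_int.size)     # 6 elements in total
print(len(ary_int.shape) == ary_int.ndim)  # True
```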

## NumPy Array Construction and Indexing

### Array Construction Routines

In [None]:
import numpy as np

In [None]:
np.ones((3,4))

In [None]:
# "empty" array: the memory is not initialised, so the values are arbitrary

np.empty((3,3))

In [None]:
np.zeros((3,3))

In [None]:
np.eye(3)

In [None]:
# diagonal matrices
np.diag((3,3,3))

The range starts at 4 and stops before 10, so the last value is 9.

In [None]:
# NumPy array from start to end
np.arange(4, 10)

In [None]:
# NumPy array from 0 to 4 (5 is excluded)
np.arange(5)

In [None]:
# array from 1 to 10, going up by two (odd numbers)
np.arange(1,11,2)

In [None]:
# useful when you want to create a particular number of evenly spaced values
# in a specified closed interval (the stop value is included by default)
# num=5 sets how many values we want in the range

np.linspace(0,1,num=5)
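
The difference between the two range functions is easy to mix up, so here is a short side-by-side sketch. The variable names are illustrative; `endpoint=False` is the `np.linspace` option that excludes the stop value.

```python
import numpy as np

half_open = np.arange(4, 10)                         # stop value 10 is excluded
closed = np.linspace(0, 1, num=5)                    # stop value 1 is included
open_end = np.linspace(0, 1, num=5, endpoint=False)  # stop value 1 is excluded

print(half_open)  # [4 5 6 7 8 9]
print(closed)     # [0.   0.25 0.5  0.75 1.  ]
print(open_end)   # [0.  0.2 0.4 0.6 0.8]
```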

### Array Indexing (Basics)

In [None]:
ary = np.array([1,2,3])
ary[0]

In [None]:
# slice (same as python)
ary[:2]

---

**Two-dimensional arrays are indexed with one index per axis**

In [None]:
ary = np.array([[1, 2, 3],
               [4, 5, 6]])

ary[0, 0] # upper left

In [None]:
ary[-1, -1] # lower right

In [None]:
ary[0, 1] # first row, second column

In [None]:
ary[0] # entire first row

In [None]:
ary[:, 0] # entire first column

In [None]:
ary[:, :2] # first two columns

## NumPy Array Math and Universal Functions

### Array Math and Universal Functions

NumPy is efficient and convenient because of vectorisation.

A ufunc (universal function) operates on arrays element by element; NumPy provides more than 60 of them.

In [None]:
# Python example

lst = [[1, 2, 3],
       [4, 5, 6]] #2d array

# inefficient way of doing this
for row_idx, row_val in enumerate(lst):
    for col_idx, col_val in enumerate(row_val):
        lst[row_idx][col_idx] += 1

lst

In [None]:
# list comprehension

lst = [[1, 2, 3],[4, 5, 6]]
[[cell + 1 for cell in row] for row in lst]

NumPy's ufunc for element-wise scalar addition

In [None]:
# NumPy example - much faster than 2x for loops

import numpy as np

ary = np.array([[1, 2, 3],[4, 5, 6]])
ary = np.add(ary, 1) # binary ufunc
ary

NumPy Operator overloading (`+`, `-`, `/`, `*`, `**`)

In [None]:
# this will do Python addition
print(1 + 1)

# this will do NumPy addition - Operator overloading
ary + 1

In [None]:
ary = np.array([[1, 2, 3],
                [4, 5, 6]])

np.add.reduce(ary) # column sums

Compute the sum of the array above, specified with `axis=1`

In [None]:
np.add.reduce(ary, axis=1) # row sums

Spelling out `reduce` can make the operation more explicit. NumPy also provides shorthands for specific reductions such as `prod` and `sum`. For example, `ary.sum(axis=0)` is equivalent to `np.add.reduce(ary, axis=0)`.

In [None]:
ary.sum(axis=0) # column sums, == np.add.reduce(ary)
# axis=1 would instead sum across the columns, giving row sums

In [None]:
ary.sum() # sum of whole array
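
As a quick check of the equivalence between `reduce` and the shorthands, the following sketch compares both forms on the same array. `np.multiply.reduce` is used here as the product counterpart of `np.add.reduce`.

```python
import numpy as np

ary = np.array([[1, 2, 3],
                [4, 5, 6]])

col_sums = np.add.reduce(ary, axis=0)        # [5 7 9]
row_sums = np.add.reduce(ary, axis=1)        # [ 6 15]
col_prods = np.multiply.reduce(ary, axis=0)  # [ 4 10 18]

print(np.array_equal(col_sums, ary.sum(axis=0)))    # True
print(np.array_equal(row_sums, ary.sum(axis=1)))    # True
print(np.array_equal(col_prods, ary.prod(axis=0)))  # True
print(ary.sum())                                    # 21
```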

Other useful functions (not all of them are ufuncs in the strict sense) are:
- `np.mean` (computes the arithmetic mean or average)
- `np.std` (computes the standard deviation)
- `np.var` (computes variance)
- `np.sort` (sorts an array)
- `np.argsort` (returns indices that would sort an array)
- `np.min` (returns the minimum value of an array)
- `np.max` (returns the maximum value of an array)
- `np.argmin` (returns the index of the minimum value)
- `np.argmax` (returns the index of the maximum value)
- `np.array_equal` (takes two arrays and checks if they have the same shape and elements)
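
A few of the functions listed above, applied to a small example array (the values are chosen only for illustration):

```python
import numpy as np

ary = np.array([3, 1, 2])

print(np.sort(ary))     # [1 2 3]
print(np.argsort(ary))  # [1 2 0]
print(np.argmin(ary))   # 1
print(np.argmax(ary))   # 0
print(np.mean(ary))     # 2.0

# indexing with the argsort result gives the sorted array
print(np.array_equal(ary[np.argsort(ary)], np.sort(ary)))  # True
```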

### NumPy Broadcasting

Broadcasting happens in the background in NumPy. It lets us perform vectorised operations even if the two arrays do not have the same shape, as long as their shapes are compatible.

In [None]:
# wouldn't be a valid operation in linear algebra but it works with NumPy
import numpy as np

np.array([1, 2, 3]) + 1

[1][2][3] + [1][1][1] --> [2][3][4]

In [None]:
ary3 = np.array([[4, 5, 6],
                 [7, 8, 9]])

ary1 = np.array([1, 2, 3])

ary3 + ary1
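
To make the stretching explicit, this sketch uses `np.broadcast_to` to show the array that NumPy conceptually works with, and then shows a pair of shapes that cannot be broadcast. The name `ary1_stretched` is made up for this example.

```python
import numpy as np

ary3 = np.array([[4, 5, 6],
                 [7, 8, 9]])   # shape (2, 3)
ary1 = np.array([1, 2, 3])     # shape (3,)

# ary1 is conceptually repeated along the first axis to match shape (2, 3)
ary1_stretched = np.broadcast_to(ary1, ary3.shape)
print(ary1_stretched)
# [[1 2 3]
#  [1 2 3]]

print(ary3 + ary1)
# [[ 5  7  9]
#  [ 8 10 12]]

# shapes (2, 3) and (2,) are not compatible: the trailing sizes 3 and 2 differ
try:
    ary3 + np.array([1, 2])
except ValueError as err:
    print("broadcasting failed:", err)
```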

## Advanced Indexing

### Memory Views and Copies

Basic indexing creates a view of the array: `first_row` does not hold its own copy of the data, it refers to the corresponding part of the original array's memory.

In [None]:
ary = np.array([[1, 2, 3],
               [4, 5, 6]])

first_row = ary[0]

In [None]:
first_row += 99

Modified both `first_row` and also `ary`

In [None]:
first_row

In [None]:
ary

Slicing will also create a memory view

In [None]:
ary = np.array([[1, 2, 3],
               [4, 5, 6]])

first_row = ary[:1]
first_row += 99
ary

In [None]:
ary = np.array([[1, 2, 3],
               [4, 5, 6]])

second_col = ary[:, 1]  # a view of the second column
second_col += 99
ary

To get an independent copy, use `.copy()` (here `first_row` is a copy, so `ary` stays unchanged)

In [None]:
ary = np.array([[1, 2, 3],
               [4, 5, 6]])

first_row = ary[0].copy()
first_row += 99
ary
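
Whether two arrays share data can also be checked directly instead of inferring it from side effects. A minimal sketch using `np.shares_memory` and the `.base` attribute; the names `row_view` and `row_copy` are illustrative:

```python
import numpy as np

ary = np.array([[1, 2, 3],
                [4, 5, 6]])

row_view = ary[0]         # basic indexing returns a view
row_copy = ary[0].copy()  # an explicit copy owns its own data

print(np.shares_memory(ary, row_view))  # True
print(np.shares_memory(ary, row_copy))  # False
print(row_view.base is ary)             # True: the view points back to ary
```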

**Fancy indexing** in NumPy creates copies of arrays, not views

In [None]:
ary = np.array([[1, 2, 3],
                [4, 5, 6]])

# Fancy indexing
ary[:, [0, 2]] # first and last column

In [None]:
this_is_a_copy = ary[:, [0, 2]]
this_is_a_copy += 99
ary

In [None]:
this_is_a_copy
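
One subtlety worth a sketch: the copy only arises when the fancy-indexed result is bound to a name. When the fancy index appears on the left-hand side of an assignment, NumPy writes into the original array. The variable name `selected` is made up for this example.

```python
import numpy as np

ary = np.array([[1, 2, 3],
                [4, 5, 6]])

selected = ary[:, [0, 2]]  # fancy indexing returns a copy
selected += 99             # changes only the copy
print(np.shares_memory(ary, selected))  # False
print(ary[0])                           # [1 2 3]

ary[:, [0, 2]] += 99       # indexing on the left-hand side writes into ary
print(ary)
# [[100   2 102]
#  [103   5 105]]
```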

Fancy indexing allows you to shuffle (re-arrange) the order

In [None]:
ary[:, [2, 0]] # third and first column

Boolean masks can also be used for indexing; a mask is an array of `True` and `False` values

In [None]:
ary = np.array([[1, 2, 3],
                [4, 5, 6]])

greater3_mask = ary > 3
greater3_mask

In [None]:
ary[greater3_mask] # select True values (creates copy)

Can also chain different selection criteria using the logical _and_ operator `&` or the logical _or_ operator `|`

In [None]:
(ary > 3) & (ary % 2 == 0)

In [None]:
ary[(ary > 3) & (ary % 2 == 0)] # 4, 6

## Random Number Generators

In [None]:
import numpy as np

np.random.seed(123) # fix the seed so the random numbers are reproducible
np.random.rand(3)

A `RandomState` object with the same seed reproduces the results that were obtained via `np.random.rand`

In [None]:
rng1 = np.random.RandomState(seed=123)
rng1.rand(3)

In [None]:
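# newer Generator API; it uses a different algorithm than RandomState,
# so the values differ from the two cells above even with the same seed
rng2 = np.random.default_rng(seed=123)
rng2.random(3)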
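
A short sketch of what seeding guarantees: two generators created with the same seed produce the same sequence, while the legacy `RandomState` and the newer `default_rng` generator do not agree with each other. The names `rng_a`, `rng_b`, and `legacy` are illustrative.

```python
import numpy as np

rng_a = np.random.default_rng(seed=123)
rng_b = np.random.default_rng(seed=123)

draws_a = rng_a.random(3)
draws_b = rng_b.random(3)
print(np.array_equal(draws_a, draws_b))  # True: same seed, same sequence

legacy = np.random.RandomState(seed=123).rand(3)
print(np.array_equal(draws_a, legacy))   # False: different underlying algorithm
```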

## Reshaping NumPy Arrays

Reshaping matrices to vectors and vice versa

In [57]:
import numpy as np

ary1d = np.array([1, 2, 3, 4, 5, 6])
ary2d_view = ary1d.reshape(2, 3) # 2x3 matrix (create a view of array)
ary2d_view

array([[1, 2, 3],
       [4, 5, 6]])

In [58]:
np.may_share_memory(ary2d_view, ary1d)

True

`-1` acts as a placeholder in `reshape`: NumPy infers the size of that dimension from the number of elements

In [61]:
ary1d.reshape(2, -1)

array([[1, 2, 3],
       [4, 5, 6]])

In [60]:
ary1d.reshape(-1, 2)

array([[1, 2],
       [3, 4],
       [5, 6]])

`reshape(-1)` can be used to flatten an array

In [62]:
ary = np.array([[1, 2, 3],
                [4, 5, 6]])

ary.reshape(-1)

array([1, 2, 3, 4, 5, 6])

In [65]:
ary.flatten() # will also flatten array (copy)

array([1, 2, 3, 4, 5, 6])
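
The two ways of flattening differ in whether they share memory with the original. A minimal sketch, assuming a freshly created (contiguous) array, for which `reshape` can return a view:

```python
import numpy as np

ary = np.array([[1, 2, 3],
                [4, 5, 6]])

flat_view = ary.reshape(-1)  # a view here, because ary is contiguous
flat_copy = ary.flatten()    # always a copy

print(np.shares_memory(ary, flat_view))  # True
print(np.shares_memory(ary, flat_copy))  # False

flat_view[0] = 100           # writes through to ary
print(ary[0, 0])             # 100
print(flat_copy[0])          # 1
```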

`concatenate` combines several arrays into one

In [66]:
aray = np.array([1, 2, 3])

# stack along the first axis
np.concatenate((aray, aray))

array([1, 2, 3, 1, 2, 3])

In [74]:
ary.shape

(2, 3)

In [80]:
aray = np.array([[1, 2, 3]])

# stack along the first axis (rows)
np.concatenate((aray, aray), axis=0)
# can double check with .shape at the end
# np.concatenate((aray, aray), axis=0).shape

array([[1, 2, 3],
       [1, 2, 3]])

In [79]:
np.concatenate((aray, aray), axis=1)

array([[1, 2, 3, 1, 2, 3]])
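
For `concatenate` to work, the arrays must agree in every dimension except the one being joined. A small sketch with two hypothetical arrays `a` and `b`:

```python
import numpy as np

a = np.array([[1, 2, 3]])    # shape (1, 3)
b = np.array([[4, 5, 6],
              [7, 8, 9]])    # shape (2, 3)

rows = np.concatenate((a, b), axis=0)  # column counts match (3 and 3)
print(rows.shape)  # (3, 3)

try:
    np.concatenate((a, b), axis=1)     # row counts 1 and 2 differ
except ValueError as err:
    print("cannot concatenate:", err)
```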

## NumPy Comparison Operators and Masks

Boolean masks in NumPy are `bool`-type arrays (storing `True` and `False` values)

In [82]:
import numpy as np

ary = np.array([1, 2, 3, 4])
mask = ary > 2
mask

array([False, False,  True,  True])

Once a Boolean mask is created, we can use it to select certain entries from the target array

In [83]:
ary[mask]

array([3, 4])

In [84]:
mask

array([False, False,  True,  True])

A useful function for assigning values to specific elements in an array is `np.where`. In the example below, we assign a 1 to all values in the array that are greater than 2, and a 0 otherwise:

In [85]:
np.where(ary > 2, 1, 0)

array([0, 0, 1, 1])

In [86]:
ary = np.array([1, 2, 3, 4])
mask = ary > 2
ary[mask] = 1
ary[~mask] = 0
ary

array([0, 0, 1, 1])
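
`np.where` also accepts arrays instead of constants, and with a single argument it returns the positions where the condition holds. A brief sketch; the names `clipped` and `indices` are illustrative:

```python
import numpy as np

ary = np.array([1, 2, 3, 4])

# keep the original value where the condition holds, use 0 elsewhere
clipped = np.where(ary > 2, ary, 0)
print(clipped)  # [0 0 3 4]

# with only a condition, np.where returns a tuple of index arrays
indices = np.where(ary > 2)[0]
print(indices)  # [2 3]
```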

The `~` operator is one of the logical operators in NumPy:
- And: `&` or `np.bitwise_and`
- Or: `|` or `np.bitwise_or`
- Xor: `^` or `np.bitwise_xor`
- Not: `~` or `np.bitwise_not`

These logical operators allow us to chain an arbitrary number of conditions to create even more "complex" Boolean masks

In [87]:
ary = np.array([1, 2, 3, 4])

(ary > 3) | (ary < 2)

array([ True, False, False,  True])

Negate the condition with `~`:

In [88]:
~((ary > 3) | (ary < 2))

array([False,  True,  True, False])
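
The parentheses around each condition matter, because `|` and `&` bind more tightly than the comparison operators. The sketch below shows the parenthesised form, the equivalent `np.logical_or` call, and what happens without parentheses:

```python
import numpy as np

ary = np.array([1, 2, 3, 4])

with_parens = (ary > 3) | (ary < 2)
with_function = np.logical_or(ary > 3, ary < 2)
print(with_parens)                                 # [ True False False  True]
print(np.array_equal(with_parens, with_function))  # True

try:
    ary > 3 | ary < 2  # parsed as the chained comparison ary > (3 | ary) < 2
except ValueError as err:
    print("ambiguous expression:", err)
```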

## Linear Algebra with NumPy

A one-dimensional array can be used as a row vector:

In [89]:
import numpy as np

row_vector = np.array([1, 2, 3])
row_vector

array([1, 2, 3])

In [90]:
row_vector = np.array([[1, 2, 3]])
row_vector.shape

(1, 3)

Can use two-dimensional arrays to create column vectors:

In [91]:
column_vector = np.array([1, 2, 3]).reshape(-1, 1)
column_vector

array([[1],
       [2],
       [3]])

Instead of reshaping a one-dimensional array into a two-dimensional one, we can simply add a new axis:

In [93]:
row_vector = np.array([1, 2, 3])

In [94]:
row_vector[:, np.newaxis]

array([[1],
       [2],
       [3]])

In [95]:
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

In [96]:
np.matmul(matrix, column_vector)

array([[14],
       [32]])

Dot-product between two vectors:

In [97]:
np.matmul(row_vector, row_vector)

np.int32(14)

NumPy also has a `dot` function that behaves similarly to `matmul` on pairs of one- or two-dimensional arrays (either `dot` or `matmul` can be faster, depending on the machine)

In [98]:
np.dot(row_vector, row_vector)

np.int32(14)

In [99]:
np.dot(matrix, row_vector)

array([14, 32])

In [100]:
np.dot(matrix, column_vector)

array([[14],
       [32]])
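
For the one- and two-dimensional cases shown above, `np.matmul`, `np.dot`, and the `@` operator give the same results. A minimal sketch that checks this and shows how the shape of the result depends on the shape of the vector:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
row_vector = np.array([1, 2, 3])
column_vector = row_vector.reshape(-1, 1)

via_operator = matrix @ column_vector
print(np.array_equal(via_operator, np.matmul(matrix, column_vector)))  # True
print(np.array_equal(via_operator, np.dot(matrix, column_vector)))     # True

print((matrix @ row_vector).shape)     # (2,)   1D vector in, 1D result out
print((matrix @ column_vector).shape)  # (2, 1) column vector in, column out
```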

NumPy arrays have a handy `transpose` method to transpose matrices if necessary:

In [101]:
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

matrix.transpose()

array([[1, 4],
       [2, 5],
       [3, 6]])

In [103]:
# 2x3 3x2 => 2x2

np.matmul(matrix, matrix.transpose())

array([[14, 32],
       [32, 77]])
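
The `.T` attribute is a common shorthand for `transpose()`. The sketch below checks that both give the same result, that the transpose is a view rather than a copy, and that the product of a matrix with its own transpose is symmetric. The name `gram` is made up for this example.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(np.array_equal(matrix.T, matrix.transpose()))  # True
print(np.shares_memory(matrix, matrix.T))            # True: a view, not a copy

gram = matrix @ matrix.T  # (2x3) @ (3x2) => 2x2
print(gram)
# [[14 32]
#  [32 77]]
print(np.array_equal(gram, gram.T))                  # True: symmetric
```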