# Some Basics for Using NumPy

NumPy is a Python library for fast numerical computation, widely used in scientific and mathematical work. Much of it is implemented in a combination of C and Fortran, so it is **fast**, and it includes many optimizations for basic vector and matrix math.

https://numpy.org/doc/stable/index.html

In [1]:
# By convention, we often import it as `np`. I am honestly not sure why :).
import numpy as np


## Ndarray

Numpy gives us a generic array data structure that is a bit like a python list, but implements fast vector operations.

In [2]:
# A python List
x = [2,4,6,8,10]

# Make it a numpy ndarray
np_x = np.array(x)
np_x

array([ 2,  4,  6,  8, 10])
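
To make the difference from a plain list concrete, here is a minimal sketch. The names `py_list` and `arr` are illustrative and not part of the cells above. It shows that `+` concatenates lists but adds arrays element by element.

```python
import numpy as np

py_list = [2, 4, 6, 8, 10]
arr = np.array(py_list)

# On Python lists, + concatenates.
list_sum = py_list + py_list   # ten elements

# On ndarrays, + works element by element.
arr_sum = arr + arr            # five elements, each doubled

print(len(list_sum))  # 10
print(arr_sum)        # [ 4  8 12 16 20]
```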

In [3]:
# We can loop with the same semantics as python Lists
for i in np_x:
    print(i)

2
4
6
8
10


In [4]:
# How about a matrix
# A python List
x = [[2,4,6,8,10], [1,2,3,4,5]]

# Make it a numpy ndarray
np_x = np.array(x)
np_x

array([[ 2,  4,  6,  8, 10],
       [ 1,  2,  3,  4,  5]])

In [10]:
for i in np_x:
    print(i)

[ 2  4  6  8 10]
[1 2 3 4 5]


### We can append to a numpy array.
Note that `np.append` returns a new array instead of modifying the original (unlike `list.append`)!

In [11]:
np.append(np_x[0], 17)

array([ 2,  4,  6,  8, 10, 17])

In [12]:
# np_x is unaffected
np_x

array([[ 2,  4,  6,  8, 10],
       [ 1,  2,  3,  4,  5]])
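
A short sketch of how `np.append` treats a 2-D array, using an illustrative array `m` with the same values as `np_x`. Without an `axis` argument the inputs are flattened first; with `axis=0` the new values are added as a row and must match the row shape.

```python
import numpy as np

m = np.array([[2, 4, 6, 8, 10], [1, 2, 3, 4, 5]])

# Without axis, np.append flattens both inputs before joining them.
flat = np.append(m, [7, 7, 7, 7, 7])

# With axis=0, the appended values must have shape (1, 5) here.
stacked = np.append(m, [[7, 7, 7, 7, 7]], axis=0)

print(flat.shape)     # (15,)
print(stacked.shape)  # (3, 5)
print(m.shape)        # (2, 5): the original is untouched
```

Because every call copies the data, appending inside a loop gets expensive. Collecting values in a Python list and converting once with `np.array` is usually the cheaper pattern.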

# Types in numpy arrays

In [13]:
type(np_x)  # type of np_x

numpy.ndarray

In [15]:
print(np_x.dtype)  # type of the elements of np_x

int32


In [18]:
np_str = np.array(["hi", "these", "are", "strings"])
print(np_str)
# Unicode type, of up to 7 characters.
print(np_str.dtype)

['hi' 'these' 'are' 'strings']
<U7
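
One consequence of the fixed-width `<U7` dtype is worth a quick sketch: assigning a longer string to an existing element truncates it without raising an error. The arrays `words` and `wide` below are illustrative names.

```python
import numpy as np

words = np.array(["hi", "these", "are", "strings"])  # dtype <U7

# "numerical" has 9 characters, so only the first 7 are stored.
words[0] = "numerical"
print(words[0])  # numeric

# Requesting a wider dtype up front leaves room for longer values.
wide = np.array(["hi", "these"], dtype="<U20")
wide[0] = "numerical"
print(wide[0])   # numerical
```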


# Numpy arrays are python objects, so types are still assigned dynamically

In [19]:
# np_x has an integer dtype (int32 in the output above), but true division returns floats.
np_x / 3
(np_x / 3).dtype

dtype('float64')
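
The result dtype depends on the operation, which the sketch below illustrates. It sets `dtype=np.int64` explicitly (an assumption made here so the output is the same on every platform) and compares true division, floor division, and an explicit conversion with `astype`.

```python
import numpy as np

ints = np.array([[2, 4, 6, 8, 10], [1, 2, 3, 4, 5]], dtype=np.int64)

true_div = ints / 3                  # true division always produces floats
floor_div = ints // 3                # floor division keeps the integer dtype
as_float32 = ints.astype(np.float32) # explicit conversion to a chosen dtype

print(true_div.dtype)    # float64
print(floor_div.dtype)   # int64
print(as_float32.dtype)  # float32
```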

# Numpy implements python operators
Typically these operate element by element over the entire vector or matrix

In [22]:
print(np_x)
print("Multiplication")
print(np_x*2)  # these operations return new arrays; np_x itself is unchanged (see the cell below)
print("Powers")
print(np_x**2)

[[ 2  4  6  8 10]
 [ 1  2  3  4  5]]
Multiplication
[[ 4  8 12 16 20]
 [ 2  4  6  8 10]]
Powers
[[  4  16  36  64 100]
 [  1   4   9  16  25]]


In [23]:
np_x

array([[ 2,  4,  6,  8, 10],
       [ 1,  2,  3,  4,  5]])
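
If you do want to modify an array, the augmented operators work in place. A minimal sketch with an illustrative array `a`:

```python
import numpy as np

a = np.array([[2, 4, 6, 8, 10], [1, 2, 3, 4, 5]])

doubled = a * 2   # builds a new array; a is unchanged at this point
before = a.copy() # keep a copy of the original values for comparison

a *= 2            # in-place: a itself now holds the doubled values

print(before[0])  # [ 2  4  6  8 10]
print(a[0])       # [ 4  8 12 16 20]
```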

## This even applies to boolean operations

In [24]:
x = np.array([1,2,3,4])
y = np.array([1,2,3,4])

x == y

array([ True,  True,  True,  True])

In [25]:
# Note, we could use the `all` method to get a single bool
(x == y).all()

True
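
Boolean arrays are also useful as masks, and NumPy has helpers for whole-array comparison. The sketch below uses illustrative names (`evens`, `same`, `exact`, `close`) and only standard NumPy functions.

```python
import numpy as np

x = np.array([1, 2, 3, 4])

# A boolean array can select the elements where the condition holds.
evens = x[x % 2 == 0]
print(evens)  # [2 4]

# array_equal compares shape and values in one call.
same = np.array_equal(x, np.array([1, 2, 3, 4]))
print(same)   # True

# For floats, exact equality is fragile; allclose compares within a tolerance.
exact = (np.array([0.1 + 0.2]) == np.array([0.3])).all()
close = np.allclose(np.array([0.1 + 0.2]), np.array([0.3]))
print(exact, close)  # False True
```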

### We can use the element-wise division operator to get a probability distribution

In [26]:
numbers = np.array([3,7,8,9,14,16])
numbers / numbers.sum()

array([0.05263158, 0.12280702, 0.14035088, 0.15789474, 0.24561404,
       0.28070175])
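
The same idea extends to a matrix when each row should sum to 1. This is a sketch with a made-up `counts` array; `keepdims=True` keeps the row sums as a column so the division lines up row by row.

```python
import numpy as np

counts = np.array([[3, 7], [8, 2]])

# Row sums with shape (2, 1), so each row is divided by its own total.
row_totals = counts.sum(axis=1, keepdims=True)
row_dist = counts / row_totals

print(row_totals.shape)  # (2, 1)
print(row_dist)
```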

# We also have methods for lots of typical mathematical operations

In [27]:
# Take the mean, max, and min
print(numbers.mean())
print(numbers.max())
print(numbers.min())

9.5
16
3
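
On a matrix, these methods accept an `axis` argument. A brief sketch with an illustrative array `m` holding the same values as `np_x`:

```python
import numpy as np

m = np.array([[2, 4, 6, 8, 10], [1, 2, 3, 4, 5]])

overall = m.mean()         # mean over every element
col_means = m.mean(axis=0) # one value per column
row_means = m.mean(axis=1) # one value per row

print(overall)    # 4.5
print(col_means)  # [1.5 3.  4.5 6.  7.5]
print(row_means)  # [6. 3.]
```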


# What if we want the vector norm?

In [28]:
# Take the l2 norm
np.sqrt(np.sum(numbers**2))

25.592967784139454

In [29]:
# Or just use the linalg module!
np.linalg.norm(numbers)

25.592967784139454
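
`np.linalg.norm` defaults to the L2 norm for vectors, and its `ord` parameter selects others. A small sketch on the same `numbers` values:

```python
import numpy as np

numbers = np.array([3, 7, 8, 9, 14, 16])

l1 = np.linalg.norm(numbers, ord=1)         # sum of absolute values
linf = np.linalg.norm(numbers, ord=np.inf)  # largest absolute value

print(l1)    # 57.0
print(linf)  # 16.0
```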

# Let's compute the Euclidean distance between 2 vectors

In [30]:
x = np.array([2,4,6,8])
y = np.array([1,2,3,4])

np.sqrt(np.sum((x-y) ** 2))

5.477225575051661

In [31]:
# Or just use linalg.norm!
np.linalg.norm(x-y)

5.477225575051661

# Dot product in numpy

In [32]:
x.dot(y)

60
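
For 1-D arrays the dot product is just the sum of the element-wise products, and there are several equivalent spellings. A quick sketch:

```python
import numpy as np

x = np.array([2, 4, 6, 8])
y = np.array([1, 2, 3, 4])

by_hand = np.sum(x * y)  # multiply element by element, then sum
method = x.dot(y)        # ndarray method
function = np.dot(x, y)  # module-level function
operator = x @ y         # the @ operator

print(by_hand, method, function, operator)  # 60 60 60 60
```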

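# Let's implement the cosine similarity of two vectors

In [33]:
# The parentheses matter: divide by the product of the two norms.
# (Cosine distance is 1 minus this value.)
x.dot(y) / (np.linalg.norm(x) * np.linalg.norm(y))

1.0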
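
As a sketch, the calculation can be wrapped in two small helpers. The function names `cosine_similarity` and `cosine_distance` are hypothetical, chosen here for illustration. The divisor is the product of both norms, grouped in parentheses, and the sketch assumes neither vector is all zeros.

```python
import numpy as np

def cosine_similarity(u, v):
    """Dot product divided by the product of the two L2 norms."""
    return u.dot(v) / (np.linalg.norm(u) * np.linalg.norm(v))

def cosine_distance(u, v):
    """One minus the cosine similarity."""
    return 1.0 - cosine_similarity(u, v)

x = np.array([2, 4, 6, 8])
y = np.array([1, 2, 3, 4])

# x is a multiple of y, so they point the same way.
parallel = cosine_similarity(x, y)

# Perpendicular vectors have similarity 0 and distance 1.
perpendicular = cosine_similarity(np.array([1, 0]), np.array([0, 1]))

print(parallel, perpendicular)
```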

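# Let's look back at our matrix!

- *Note* numpy does have a specific matrix type.
- The matrix type `np.matrix` inherits from ndarray, but has slightly different notation for matrix operations, and implements some operators differently (for example, `*` is matrix multiplication rather than element-wise multiplication). The NumPy documentation no longer recommends using it.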

Here we will just focus on using a higher dimensional ndarray.

In [34]:
np_x


array([[ 2,  4,  6,  8, 10],
       [ 1,  2,  3,  4,  5]])

# Transpose

In [35]:
np_x.T

array([[ 2,  1],
       [ 4,  2],
       [ 6,  3],
       [ 8,  4],
       [10,  5]])
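
One detail worth knowing about `.T`: it returns a view that shares memory with the original array, not a copy. A minimal sketch with illustrative names `m` and `t`:

```python
import numpy as np

m = np.array([[2, 4, 6, 8, 10], [1, 2, 3, 4, 5]])
t = m.T

print(m.shape, t.shape)  # (2, 5) (5, 2)

# The transpose shares memory with the original array.
shared = np.shares_memory(m, t)
print(shared)            # True

# Writing through the view changes the original: t[0, 1] is m[1, 0].
t[0, 1] = 100
print(m[1, 0])           # 100
```

Call `m.T.copy()` if an independent transposed array is needed.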

In [36]:
np_x.dot(np_x.T)

array([[220, 110],
       [110,  55]])

## Notice the above does matrix multiplication.

But once we get to rank 3, there is a different operator, `@` (`np.matmul`), which behaves differently from `dot`.

In [37]:
rank_3_x = np.array(
    [
        [[1,2], [4,5]],
        [[2,4], [8,10]],
    ]
)
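# dot sums over the last axis of the first array and the second-to-last axis of the second,
# pairing every row vector with every column of every matrix, so the result has rank 4.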
rank_3_x.dot(rank_3_x.T)

array([[[[  9,  18],
         [ 12,  24]],

        [[ 24,  48],
         [ 33,  66]]],


       [[[ 18,  36],
         [ 24,  48]],

        [[ 48,  96],
         [ 66, 132]]]])

In [38]:
# matmul treats each array as a stack of matrices and multiplies them pair by pair.
rank_3_x @ rank_3_x.T

array([[[  9,  18],
        [ 24,  48]],

       [[ 24,  48],
        [ 66, 132]]])
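
The difference is easiest to see in the result shapes. The sketch below rebuilds the same rank 3 array under the illustrative names `a` and `b`, then checks how the two results relate.

```python
import numpy as np

a = np.array([[[1, 2], [4, 5]],
              [[2, 4], [8, 10]]])
b = a.T

d = a.dot(b)  # every matrix in a against every matrix in b
mm = a @ b    # i-th matrix in a against i-th matrix in b

print(d.shape)   # (2, 2, 2, 2)
print(mm.shape)  # (2, 2, 2)

# matmul result i is the ordinary 2-D product of the i-th matrices.
stack_ok = all(np.array_equal(mm[i], a[i].dot(b[i])) for i in range(2))

# The matmul result is the "diagonal" slice of the dot result.
slice_ok = all(np.array_equal(mm[i], d[i, :, i, :]) for i in range(2))

print(stack_ok, slice_ok)  # True True
```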

# Other useful methods for matrices

In [39]:
# A 5 x 5 identity matrix
np.identity(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [40]:
# Easy way to get a larger value along the diagonal
np.identity(5) * 3

array([[3., 0., 0., 0., 0.],
       [0., 3., 0., 0., 0.],
       [0., 0., 3., 0., 0.],
       [0., 0., 0., 3., 0.],
       [0., 0., 0., 0., 3.]])

In [41]:
# Extracting the diagonal values from a matrix
x = np.identity(5) * 3
np.diag(x)

array([3., 3., 3., 3., 3.])
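
`np.diag` works in both directions, and `np.eye` is a more general relative of `np.identity`. A short sketch with illustrative names:

```python
import numpy as np

# Given a 1-D input, np.diag builds a square matrix with it on the diagonal.
built = np.diag([1, 2, 3])
print(built)

# Given a 2-D input, it extracts the diagonal instead.
extracted = np.diag(built)
print(extracted)  # [1 2 3]

# np.eye also allows non-square shapes.
rect = np.eye(2, 3)
print(rect.shape)  # (2, 3)
```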
