# Numpy
The numpy package provides fast mathematical operations and basic statistics. Its key feature is the *ndarray* datatype: an n-dimensional array whose elements all share a single type.

Start by importing the numpy package:

In [2]:
import numpy as np # convention

There are multiple ways of creating an array. Arrays are commonly created from lists; alternatively, they can be generated from a range or filled with random values.

In [3]:
# Create a one dimensional array using the np.arange function
one_d_array = np.arange(5)
print(type(one_d_array))
one_d_array

<class 'numpy.ndarray'>


array([0, 1, 2, 3, 4])

In [3]:
# Create a numpy array from the following list
some_list = [10, 30, 25, 76, 11]
from_list_array = np.array(some_list)
print(type(from_list_array))

<class 'numpy.ndarray'>


#### Basic vector operations
Perform basic vector operations on the following arrays:

In [8]:
a = np.random.randint(100, size=50)  # 50 random integers in [0, 100)
b = np.random.randint(10, size=50)   # 50 random integers in [0, 10)
a, b

(array([83, 66, 33, 51, 84, 89, 25, 41, 64, 54, 17, 28, 95,  7, 46, 13, 56,
        84, 69, 66, 20, 69, 96, 60, 33, 51, 38, 36, 39,  3, 35, 23, 88, 36,
        56, 39, 78, 24, 15,  5, 10,  9, 41, 82, 82, 29, 44, 13, 60, 48]),
 array([5, 6, 5, 1, 0, 7, 2, 8, 7, 8, 6, 5, 8, 9, 2, 9, 6, 1, 6, 5, 7, 0,
        4, 7, 5, 7, 6, 0, 5, 2, 6, 1, 8, 4, 7, 7, 8, 9, 3, 6, 0, 2, 5, 5,
        6, 6, 9, 6, 2, 9]))
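
Because these arrays are random, their values change on every run, so printed results are only reproducible with a fixed seed. A minimal sketch, assuming NumPy 1.17 or later (where `np.random.default_rng` is available); the seed `42` and the `*_seeded` names are illustrative choices, not part of the original notebook:

```python
import numpy as np

# A seeded generator yields the same "random" values on every run.
rng = np.random.default_rng(42)            # 42 is an arbitrary seed
a_seeded = rng.integers(0, 100, size=50)   # integers in [0, 100)
b_seeded = rng.integers(0, 10, size=50)    # integers in [0, 10)

# A second generator with the same seed reproduces the same draw.
rng_again = np.random.default_rng(42)
a_again = rng_again.integers(0, 100, size=50)
print(np.array_equal(a_seeded, a_again))
```

Note that a seeded generator will not reproduce the specific values printed above, which came from an unseeded `np.random.randint` call.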

Add the two arrays (element-wise)

In [5]:
a + b

array([ 88,  72,  38,  52,  84,  96,  27,  49,  71,  62,  23,  33, 103,
        16,  48,  22,  62,  85,  75,  71,  27,  69, 100,  67,  38,  58,
        44,  36,  44,   5,  41,  24,  96,  40,  63,  46,  86,  33,  18,
        11,  10,  11,  46,  87,  88,  35,  53,  19,  62,  57])

Compute the dot product of the arrays

In [9]:
np.dot(a, b)

12227
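
The dot product of two one-dimensional arrays is the sum of their element-wise products. A small sketch with hand-checkable values (the arrays `x` and `y` are illustrative, not from the notebook):

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

elementwise = x * y              # [4, 10, 18]
manual_dot = elementwise.sum()   # 4 + 10 + 18 = 32
builtin_dot = np.dot(x, y)       # same result
operator_dot = x @ y             # the @ operator is equivalent for 1-D arrays

print(manual_dot, builtin_dot, operator_dot)
```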

Compute the minimum, maximum, median, and mean of array a

In [12]:
print(a.min())
print(a.max())
print(np.median(a))
print(a.mean())

3
96
42.5
46.66


#### Matrices

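Random matrices can be generated in the same way as the random vectors above. Create a random 5x5 matrix of integers from 0 to 99 (the upper bound of 100 passed to `np.random.randint` is exclusive). Note that this reassigns the name `a`.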

In [14]:
a = np.random.randint(100, size=(5, 5))
a

array([[68, 51, 69, 39, 30],
       [ 4, 50, 75, 96, 88],
       [38, 80, 20, 75, 16],
       [63, 30, 82, 19, 49],
       [94, 44, 44, 23, 53]])

Compute the eigenvalues and eigenvectors of a

In [15]:
eigenvalues, eigenvectors = np.linalg.eig(a)
eigenvalues

array([259.37911058 +0.j        , -51.83694637 +0.j        ,
         8.30860988+31.12190681j,   8.30860988-31.12190681j,
       -14.15938398 +0.j        ])
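
`np.linalg.eig` returns the eigenvectors as the columns of its second result, and a real matrix that is not symmetric can have complex eigenvalues, as seen above. One way to sanity-check the result is to verify that each pair satisfies A v = λ v. A sketch using the matrix printed above, copied into an illustrative variable named `matrix`:

```python
import numpy as np

matrix = np.array([[68, 51, 69, 39, 30],
                   [ 4, 50, 75, 96, 88],
                   [38, 80, 20, 75, 16],
                   [63, 30, 82, 19, 49],
                   [94, 44, 44, 23, 53]])

values, vectors = np.linalg.eig(matrix)

# Column i of `vectors` pairs with values[i]; broadcasting scales each column.
residual_ok = np.allclose(matrix @ vectors, vectors * values)

# The eigenvalues of a square matrix sum to its trace.
trace_ok = np.isclose(values.sum(), np.trace(matrix))

print(residual_ok, trace_ok)
```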

Compute the transpose of a

In [16]:
a.transpose()

array([[68,  4, 38, 63, 94],
       [51, 50, 80, 30, 44],
       [69, 75, 20, 82, 44],
       [39, 96, 75, 19, 23],
       [30, 88, 16, 49, 53]])

Compute the covariance matrix of a. By default, `np.cov` treats each row as a variable and each column as an observation.

In [17]:
np.cov(a)

array([[ 299.3 , -426.8 ,  -80.4 ,  298.2 ,  224.45],
       [-426.8 , 1376.8 ,  -18.85, -293.45, -850.7 ],
       [ -80.4 ,  -18.85,  909.2 , -621.6 , -317.85],
       [ 298.2 , -293.45, -621.6 ,  636.3 ,  336.3 ],
       [ 224.45, -850.7 , -317.85,  336.3 ,  683.3 ]])
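
Whether rows or columns are treated as variables is controlled by the `rowvar` parameter of `np.cov`. A small sketch with a non-square array (the `data` values are illustrative) makes the difference visible in the output shapes:

```python
import numpy as np

# 2 rows, 3 columns
data = np.array([[1, 2, 3],
                 [2, 4, 7]])

by_rows = np.cov(data)                # rows are variables: 2x2 result
by_cols = np.cov(data, rowvar=False)  # columns are variables: 3x3 result

# Diagonal entries are sample variances (normalised by N - 1).
row0_variance = data[0].var(ddof=1)

print(by_rows.shape, by_cols.shape, row0_variance)
```

Passing `rowvar=False` is equivalent to calling `np.cov` on the transposed array.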

Compute the minimum values for the entire matrix, for each column (`axis=0`), and for each row (`axis=1`)

In [18]:
print(np.min(a))
print(np.min(a, axis=0))
print(np.min(a, axis=1))

4
[ 4 30 20 19 16]
[30  4 16 19 23]
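
The `axis` argument names the dimension that is collapsed by the reduction: `axis=0` reduces down the rows and leaves one value per column, while `axis=1` reduces across the columns and leaves one value per row. A sketch with a small non-square array (the `grid` values are illustrative):

```python
import numpy as np

grid = np.array([[3, 1, 2],
                 [0, 5, 4]])   # shape (2, 3)

overall_min = grid.min()           # single smallest value
column_mins = grid.min(axis=0)     # one value per column, shape (3,)
row_mins = grid.min(axis=1)        # one value per row, shape (2,)

print(overall_min, column_mins, row_mins)
```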


#### Subsetting

In [27]:
ages = np.array([26, 31, 28, 56, 45, 19, 18, 59])

juniors = ages[ages < 35]
seniors = ages[ages >= 35]

mean_age_juniors = juniors.mean()
mean_age_seniors = seniors.mean()
print(mean_age_juniors, mean_age_seniors)

24.4 53.333333333333336
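
The comparison `ages < 35` itself produces a boolean array (a mask), and indexing with that mask keeps only the elements where it is `True`. A sketch that makes the intermediate mask explicit; the names `is_junior` and `labels` are illustrative:

```python
import numpy as np

ages = np.array([26, 31, 28, 56, 45, 19, 18, 59])

is_junior = ages < 35          # boolean array, same shape as ages
juniors = ages[is_junior]      # elements where the mask is True
seniors = ages[~is_junior]     # ~ inverts the mask

# np.where picks a value per element depending on the mask.
labels = np.where(is_junior, "junior", "senior")

print(is_junior)
print(juniors, seniors)
print(labels)
```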


## Revisit variable assignments

In [14]:
a = np.arange(5)
b = a
a, b

(array([0, 1, 2, 3, 4]), array([0, 1, 2, 3, 4]))

In [15]:
a[0] = 100
a

array([100,   1,   2,   3,   4])

In [16]:
b

array([100,   1,   2,   3,   4])

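With slicing and index arrays: a slice such as `a[:2]` is a view onto the original data, while indexing with a list such as `a[[0, 1]]` returns a copy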

In [17]:
c = a[:2]
a[0] = 1
c[0]

1

In [18]:
d = a[[0, 1]]
a[0] = 100
d[0]

1

In [19]:
print(c.flags.owndata)
print(d.flags.owndata)

False
True


In [20]:
c = a[:2].copy()
a[0] = 1
print(c.flags.owndata)
c[0]

True


100
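
Besides inspecting `flags.owndata`, `np.shares_memory` reports directly whether two arrays overlap in memory. A short sketch with illustrative variable names:

```python
import numpy as np

base = np.arange(5)

sliced = base[:2]          # basic slicing: a view
picked = base[[0, 1]]      # index array: a copy
copied = base[:2].copy()   # explicit copy of a slice

slice_shares = np.shares_memory(base, sliced)
picked_shares = np.shares_memory(base, picked)
copied_shares = np.shares_memory(base, copied)

print(slice_shares, picked_shares, copied_shares)
```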

Test for understanding

In [21]:
an_array = np.arange(10)
an_array

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [22]:
an_array[3:4][0] = 0
print(an_array)

[0 1 2 0 4 5 6 7 8 9]


In [23]:
an_array[[3]][0] = 3
print(an_array)

[0 1 2 0 4 5 6 7 8 9]
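
The second assignment has no effect because `an_array[[3]]` first builds a temporary copy, and `[0] = 3` then writes into that copy, which is discarded. Assigning through the index array directly, without the chained `[0]`, does modify the original. A sketch of all three cases, using an illustrative array named `arr`:

```python
import numpy as np

arr = np.arange(10)

arr[3:4][0] = 0      # slice is a view: writes through to arr
after_slice = arr[3]

arr[[3]][0] = 3      # index array makes a temporary copy: arr is unchanged
after_chained = arr[3]

arr[[3]] = 3         # direct assignment via the index array: arr is updated
after_direct = arr[3]

print(after_slice, after_chained, after_direct)
```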
