# Introduction to NumPy

NumPy is the fundamental package for scientific computing
in Python. It provides a multidimensional array object, `ndarray`, together
with routines that operate on it. In this course, we will use NumPy for linear algebra.

If you are interested in learning more about NumPy, you can find the user
guide and reference at https://docs.scipy.org/doc/numpy/index.html

Let's first import the NumPy package:

In [1]:
import numpy as np # we commonly use the np abbreviation when referring to numpy

## Creating NumPy Arrays

New arrays can be made in several ways. We can take an existing list and convert it to a numpy array:

In [2]:
a = np.array([1,2,3])
a

array([1, 2, 3])
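
Once an array exists, it can help to inspect its basic attributes. The sketch below is illustrative and not part of the original notebook; the variable `f` is a hypothetical name, and the exact integer type reported for `a` depends on the platform.

```python
import numpy as np

a = np.array([1, 2, 3])
print(a.shape)  # (3,)
print(a.ndim)   # 1
print(a.dtype)  # an integer type; the exact width depends on the platform

# The element type can be requested explicitly when converting a list.
f = np.array([1, 2, 3], dtype=float)
print(f.dtype)  # float64
```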

There are also functions for creating arrays filled with zeros or ones. The argument is the shape of the array:

In [3]:
np.zeros((2,2))

array([[0., 0.],
       [0., 0.]])

In [4]:
np.ones((3,2))

array([[1., 1.],
       [1., 1.],
       [1., 1.]])
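
A few other constructors follow the same pattern. This is a small sketch of some that may be handy, not an exhaustive list; the variable names are made up for illustration.

```python
import numpy as np

# Evenly spaced integers, similar to Python's range
r = np.arange(5)
print(r)       # [0 1 2 3 4]

# A fixed number of evenly spaced points, both endpoints included
pts = np.linspace(0.0, 1.0, 5)
print(pts)     # [0.   0.25 0.5  0.75 1.  ]

# An array of a given shape filled with one constant value
sevens = np.full((2, 3), 7)
print(sevens.shape)  # (2, 3)
```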

## Accessing NumPy Arrays
You can use the common square bracket syntax for accessing elements
of a NumPy array:

In [5]:
A = np.arange(9).reshape(3,3)
print(A)

[[0 1 2]
 [3 4 5]
 [6 7 8]]


In [6]:
print(A[0]) # Access the first row of A
print(A[0, 1]) # Access the second item of the first row
print(A[:, 1]) # Access the second column

[0 1 2]
1
[1 4 7]
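
Square brackets also accept slices, negative indices and boolean masks. The following sketch reuses the same 3x3 array; the names `sub`, `last_row` and `big` are chosen here for illustration only.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)

sub = A[:2, 1:]    # rows 0 and 1, columns 1 and 2
print(sub)         # [[1 2]
                   #  [4 5]]

last_row = A[-1]   # negative indices count from the end
print(last_row)    # [6 7 8]

big = A[A > 4]     # a boolean mask selects matching elements as a 1D array
print(big)         # [5 6 7 8]
```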


## Operations on NumPy Arrays
You can use the operators `*`, `**`, `/`, `+` and `-` on NumPy arrays; they all operate elementwise.

In [7]:
a = np.array([[1,2], 
              [2,3]])
b = np.array([[4,5],
              [6,7]])

In [8]:
print(a + b)

[[ 5  7]
 [ 8 10]]


In [9]:
print(a - b)

[[-3 -3]
 [-4 -4]]


In [10]:
print(a * b)

[[ 4 10]
 [12 21]]


In [11]:
print(a / b)

[[0.25       0.4       ]
 [0.33333333 0.42857143]]


In [12]:
print(a**2)

[[1 4]
 [4 9]]
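
The same elementwise operators also work when one operand is a scalar or a smaller array whose shape is compatible, which NumPy calls broadcasting. A minimal sketch, with the `scale` vector chosen arbitrarily for illustration:

```python
import numpy as np

a = np.array([[1, 2],
              [2, 3]])

# A scalar is applied to every element.
print(a + 10)      # [[11 12]
                   #  [12 13]]

# A 1D array of length 2 is applied to every row.
scale = np.array([1, 10])
print(a * scale)   # [[ 1 20]
                   #  [ 2 30]]
```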


There are also some commonly used functions.
For example, you can sum up all elements of an array:

In [13]:
print(a)
print(np.sum(a))

[[1 2]
 [2 3]]
8


Or sum along the first dimension (`axis=0`), which adds up the rows and gives one total per column:

In [14]:
np.sum(a, axis=0)

array([3, 5])
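
Because `a` is symmetric, summing along either axis gives the same numbers, which can hide what `axis` does. The sketch below uses a non-square array `m` (a name introduced here for illustration) so the two directions differ.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # [[0 1 2]
                                 #  [3 4 5]]

print(np.sum(m, axis=0))   # [3 5 7]   one total per column
print(np.sum(m, axis=1))   # [ 3 12]   one total per row

# keepdims=True keeps the summed axis with length 1.
row_totals = np.sum(m, axis=1, keepdims=True)
print(row_totals.shape)    # (2, 1)
```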

There are many other functions in numpy, and some of them **will be useful**
for your programming assignments. As an exercise, check out the documentation
for these routines at https://docs.scipy.org/doc/numpy/reference/routines.html
and see if you can find the documentation for `np.sum` and `np.reshape`.

## Linear Algebra

In this course, we use NumPy arrays for linear algebra.
We usually use 1D arrays to represent vectors and 2D arrays to represent
matrices.

In [15]:
A = np.array([[2,4], 
             [6,8]])

You can take transposes of matrices with `A.T`

In [16]:
print('A\n', A)
print('A.T\n', A.T)

A
 [[2 4]
 [6 8]]
A.T
 [[2 6]
 [4 8]]


Note that taking the transpose of a 1D array has **NO** effect.

In [17]:
a = np.ones(3)
print(a)
print(a.shape)
print(a.T)
print(a.T.shape)


[1. 1. 1.]
(3,)
[1. 1. 1.]
(3,)


But it does work if you have a 2D array of shape (3,1)


In [18]:
a = np.ones((3,1))
print(a)
print(a.shape)
print(a.T)
print(a.T.shape)

[[1.]
 [1.]
 [1.]]
(3, 1)
[[1. 1. 1.]]
(1, 3)
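
If you start from a 1D array and need an explicit column or row vector, you can add the missing dimension yourself. A short sketch, with variable names chosen for illustration:

```python
import numpy as np

v = np.array([1., 2., 3.])    # shape (3,)

col = v.reshape(-1, 1)        # column vector, shape (3, 1)
col_alt = v[:, np.newaxis]    # equivalent, using np.newaxis
row = col.T                   # row vector, shape (1, 3)

print(col.shape, col_alt.shape, row.shape)   # (3, 1) (3, 1) (1, 3)
```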


### Dot product

We can compute the dot product between two vectors with `np.dot`:

In [19]:
x = np.array([1,2,3])
y = np.array([4,5,6])
np.dot(x, y)

32
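
The result 32 is the sum of the elementwise products, 1*4 + 2*5 + 3*6. The sketch below checks that the three ways of writing it agree.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

by_dot = np.dot(x, y)       # dot product
by_sum = np.sum(x * y)      # multiply elementwise, then add up
by_matmul = x @ y           # the @ operator on two 1D arrays

print(by_dot, by_sum, by_matmul)   # 32 32 32
```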

We can also compute matrix-matrix and matrix-vector products. In Python 3.5 and later, this is conveniently expressed with the `@` operator:

In [20]:
A = np.eye(3) # You can create an identity matrix with np.eye
B = np.random.randn(3,3)
x = np.array([1,2,3])

In [21]:
# Matrix-Matrix product
A @ B

array([[-0.53941337,  0.94506466, -0.13152373],
       [-1.25591243, -0.87715192,  0.57727525],
       [ 0.40673838,  0.8737607 , -0.23865083]])

In [22]:
# Matrix-vector product
A @ x

array([1., 2., 3.])
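
It is easy to confuse `@` with `*`, which multiplies elementwise. The sketch below uses two small fixed matrices, `M` and `P`, invented for illustration, to show the difference.

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
P = np.array([[0, 1],
              [1, 0]])   # swaps columns when multiplied on the right
v = np.array([1, 1])

print(M * P)   # elementwise:    [[0 2]
               #                  [3 0]]
print(M @ P)   # matrix product: [[2 1]
               #                  [4 3]]
print(M @ v)   # matrix-vector product: [3 7]
```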

Sometimes we want to compute certain properties of a matrix, such as its determinant or its eigenvalues and eigenvectors. NumPy ships with the `numpy.linalg` package for
these computations on 2D arrays (matrices).

In [23]:
from numpy import linalg

In [24]:
# This computes the determinant
linalg.det(A)

1.0

In [25]:
# This computes the eigenvalues and eigenvectors
eigenvalues, eigenvectors = linalg.eig(A)
print("The eigenvalues are\n", eigenvalues)
print("The eigenvectors are\n", eigenvectors)

The eigenvalues are
 [1. 1. 1.]
The eigenvectors are
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
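
The identity matrix makes for a trivial example, so here is a sketch with a small symmetric matrix `S` (chosen for illustration). It checks that each column of `eigenvectors` satisfies S v = λ v and that the determinant equals the product of the eigenvalues. The order in which `linalg.eig` returns the eigenvalues is not guaranteed, so the check does not rely on it.

```python
import numpy as np
from numpy import linalg

S = np.array([[2., 1.],
              [1., 2.]])   # eigenvalues are 1 and 3

eigenvalues, eigenvectors = linalg.eig(S)

# Column i of `eigenvectors` pairs with eigenvalues[i].
lhs = S @ eigenvectors
rhs = eigenvectors * eigenvalues   # broadcasting scales column i by eigenvalues[i]
print(np.allclose(lhs, rhs))       # True

# The determinant equals the product of the eigenvalues.
print(np.isclose(linalg.det(S), np.prod(eigenvalues)))   # True
```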


## Miscellaneous

### Time your code
A useful tip is to time a statement with the IPython magic command `%time`:

In [26]:
%time np.abs(A)

CPU times: user 5 µs, sys: 3 µs, total: 8 µs
Wall time: 10.7 µs


array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
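
A single `%time` run of such a fast call is noisy, and magic commands only work inside IPython or Jupyter. As a rough sketch of an alternative, the standard-library `timeit` module averages over many runs and also works in plain Python scripts; the run count `n_runs` is an arbitrary choice. Inside a notebook, the `%timeit` magic does similar averaging.

```python
import timeit

import numpy as np

A = np.eye(3)
n_runs = 10_000   # arbitrary number of repetitions

# Total wall-clock time for n_runs calls, in seconds.
total = timeit.timeit(lambda: np.abs(A), number=n_runs)
per_call_us = total / n_runs * 1e6
print(f"{per_call_us:.2f} microseconds per call (averaged over {n_runs} runs)")
```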

In [27]:
ar=np.zeros(4)
ar

array([0., 0., 0., 0.])

In [28]:
ar[0]=2
ar

array([2., 0., 0., 0.])

In [29]:
X = np.array([[0., 1., 1.], 
              [1., 2., 1.]])

In [30]:
mean = np.zeros((X.shape[1],))
mean

array([0., 0., 0.])

In [31]:
for d in range(X.shape[1]):
    mean[d]=np.mean(X[:, d])

In [32]:
mean

array([0.5, 1.5, 1. ])

In [33]:
def mean_naive(X):
    """Compute the sample mean for a dataset by iterating over the dataset.
    
    Args:
        X: `ndarray` of shape (N, D) representing the dataset. N 
        is the size of the dataset and D is the dimensionality of the dataset.
    Returns:
        mean: `ndarray` of shape (D, ), the sample mean of the dataset `X`.
    """
    N, D = X.shape
    mean = np.zeros((D,))
    # Accumulate each feature (column) over all N data points, then divide by N.
    for d in range(D):
        for n in range(N):
            mean[d] += X[n, d]
    mean /= N
    return mean

In [34]:
X = np.array([[0., 1., 1.], 
              [1., 2., 1.]])

In [35]:
mean_naive(X)

array([0.5, 1.5, 1. ])
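
The explicit loops are useful for understanding, but the same result is available in one vectorized call. The sketch below compares `X.mean(axis=0)` with a compact loop over the rows; `mean_loop` and `mean_vectorized` are names introduced here for illustration.

```python
import numpy as np

X = np.array([[0., 1., 1.],
              [1., 2., 1.]])
N, D = X.shape

# Loop version: add up the rows one by one, then divide by N.
mean_loop = np.zeros(D)
for n in range(N):
    mean_loop += X[n]
mean_loop /= N

# Vectorized version: average over axis 0 (the data points).
mean_vectorized = X.mean(axis=0)

print(mean_vectorized)                          # [0.5 1.5 1. ]
print(np.allclose(mean_loop, mean_vectorized))  # True
```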

In [45]:
def cov_naive(X):
    """Compute the sample covariance for a dataset by iterating over the dataset.
    
    Args:
        X: `ndarray` of shape (N, D) representing the dataset. N 
        is the size of the dataset and D is the dimensionality of the dataset.
    Returns:
        ndarray: ndarray with shape (D, D), the sample covariance of the dataset `X`.
    """
    N, D = X.shape
    means = mean_naive(X)
    covariance = np.zeros((D, D))
    # Entry (j, k) averages the products of deviations over all N data points.
    for j in range(D):
        for k in range(D):
            total = 0.0
            for i in range(N):
                total += (X[i, j] - means[j]) * (X[i, k] - means[k])
            covariance[j, k] = total / N  # normalize by N, not N - 1
    return covariance

In [46]:
X = np.array([[0., 1.],
              [1., 2.],
              [0., 1.],
              [1., 2.]])

cov_naive(X)

array([[0.25, 0.25],
       [0.25, 0.25]])

In [38]:
expected_cov = np.array(
    [[0.25, 0.25],
    [0.25, 0.25]])

In [41]:
X = np.array([[0., 1.], 
              [2., 3.]])

In [42]:
cov_naive(X)

array([[1., 1.],
       [1., 1.]])

In [43]:
expected_cov = np.array(
    [[1., 1.],
    [1., 1.]])

In [44]:
# Test covariance is zero
X = np.array([[0., 1.], 
              [0., 1.],
              [0., 1.]])

cov_naive(X)

array([[0., 0.],
       [0., 0.]])
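
The covariance results above can be cross-checked without loops. The sketch below assumes the same convention as the expected outputs, namely dividing by N rather than N - 1. It centers the data, forms the (D, D) matrix with a single matrix product, and compares it with `np.cov`, where `rowvar=False` treats columns as variables and `bias=True` selects normalization by N.

```python
import numpy as np

X = np.array([[0., 1.],
              [1., 2.],
              [0., 1.],
              [1., 2.]])
N = X.shape[0]

# Subtract the per-column mean, then average the outer products of the rows.
X_centered = X - X.mean(axis=0)
cov_vectorized = X_centered.T @ X_centered / N

# NumPy's built-in covariance with the same normalization.
cov_builtin = np.cov(X, rowvar=False, bias=True)

print(cov_vectorized)                            # [[0.25 0.25]
                                                 #  [0.25 0.25]]
print(np.allclose(cov_vectorized, cov_builtin))  # True
```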