<h1>01 Numpy</h1>
$\newcommand{\Set}[1]{\{#1\}}$ 
$\newcommand{\Tuple}[1]{\langle#1\rangle}$ 
$\newcommand{\v}[1]{\pmb{#1}}$ 
$\newcommand{\cv}[1]{\begin{bmatrix}#1\end{bmatrix}}$ 
$\newcommand{\rv}[1]{[#1]}$ 
$\DeclareMathOperator{\argmax}{arg\,max}$ 
$\DeclareMathOperator{\argmin}{arg\,min}$ 
$\DeclareMathOperator{\dist}{dist}$
$\DeclareMathOperator{\abs}{abs}$

<h2>Preliminaries</h2>
<p>
    One of my first code cells always looks like this:
</p>

In [16]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


<p>
    The first two lines load the <code>autoreload</code> extension and set it to mode 2, which reloads
    imported modules before each cell is executed. So if I have my own module and I change it in an editor,
    I can rerun my code without worrying about how to reload the changed module: it's done automatically.
</p>
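<p>
    For comparison, here is a minimal sketch of what reloading by hand looks like without the extension.
    The standard-library <code>json</code> module is only a stand-in for your own module, and the
    variable name <code>reloaded</code> is illustrative.
</p>

```python
import importlib
import json  # stand-in for your own module; any already-imported module works

# Without autoreload, a changed module has to be reloaded by hand.
# importlib.reload re-executes the module's code and returns the module object.
reloaded = importlib.reload(json)
print(reloaded.__name__)
```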
<p>
    The third line says that when we draw graphs, they will appear in the notebook itself, not in a separate
    window.
</p>

<p>
    My next cell usually contains these three imports:
</p>

In [4]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

<p>
    My third code cell also contains <code>import</code> statements, ones that are specific to this notebook. 
    Here's an example:
</p>

In [5]:
from math import sqrt

<p>
    So now we can compute square roots:
</p>

In [8]:
sqrt(81)

9.0

<h2>Numpy</h2>
<p>
    Numpy is short for Numerical Python. It offers <code>ndarray</code>, which is a fast and space-efficient
    multidimensional array providing vectorized arithmetic operations, among other things. Pandas is built on
    top of numpy, and is somewhat more high-level; scikit-learn uses numpy ndarrays as its main data structure;
    matplotlib also works with numpy arrays.
</p>
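<p>
    As a small sketch of what "vectorized arithmetic" means, the cell below doubles the same values once
    as a plain Python list and once as an <code>ndarray</code>. The values and variable names are made up
    for illustration.
</p>

```python
import numpy as np

prices = [1.5, 2.0, 4.25]        # plain Python list (illustrative values)
prices_arr = np.array(prices)    # the same values as an ndarray

# With a list, arithmetic needs an explicit loop or comprehension.
doubled_list = [2 * p for p in prices]

# With an ndarray, the operation is applied to every element at once.
doubled_arr = 2 * prices_arr

print(doubled_list)
print(doubled_arr)

# All elements share one dtype and are stored in a single contiguous buffer.
print(prices_arr.dtype, prices_arr.nbytes)
```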

<h2>Exercises</h2>
<ol>
    <li>
        Let
        $$\v{u} = \cv{2\\-7\\1}\,\,\,
          \v{v} = \cv{-3\\0\\4}$$
        and
        $$\v{A} =  \begin{bmatrix}
                      1 &  2 & 0 \\
                      3 & -1 & 4
                  \end{bmatrix}\,\,\,
           \v{B} = \begin{bmatrix}
                       2 & -1 \\
                       1 &  0 \\
                      -3 & 4
                \end{bmatrix}$$
        Use numpy to compute:
        <ol>
            <li>$\v{u} + \v{v}$</li>
            <li>$-3\v{u}$</li>
            <li>$\v{u}\v{v}$ (Strictly, we should write $\v{u}^T\v{v}$. Why? But it is common to write it without the transpose.)</li>
            <li>$\v{u}\v{u}$</li>
            <li>$\sqrt{\v{u}\v{u}}$</li>
            <li>$\v{u} * \v{v}$</li>
            <li>$\v{A} + \v{A}$</li>
            <li>$\v{A} + \v{u}$</li>
            <li>$10\v{A}$</li>
            <li>$\v{A}\v{v}$</li>
            <li>$\v{A}\v{B}$</li>
            <li>$\v{A}^T$</li>
            <li>$\v{A}\v{A}^T$</li>
            <li>$\v{A}^T\v{A}$</li>
            <li>the smallest element in $\v{u}$</li>
            <li>the index of the smallest element in $\v{u}$</li>
            <li>the mean of the values in $\v{u}$</li>
        </ol>
    </li>
    <li>Play with the <code>cumsum</code> method on 1-dimensional numpy arrays. Then define a Python function
        that does the same thing for regular Python lists. Then compare how long they take to run on an 
        array/list that contains all the integers from 1 to 1000 inclusive.
    </li>
</ol>

In [20]:
u = np.array([2,-7,1])
v = np.array([-3,0,4])
A = np.array([[1,2,0], [3,-1,4]])
B = np.array([[2,-1],[1,0],[-3,4]])

<h1> A </h1>

In [16]:
u + v

array([-1, -7,  5])

<h1> B </h1>

In [17]:
-3 * u

array([-6, 21, -3])

<h1> C </h1>

In [22]:
u.dot(v)

-2
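
<p>
    On the "Why?" in the exercise: one way to see it, sketched below, is to store the vectors as
    3&times;1 column matrices instead of 1-dimensional arrays. The names <code>u_col</code> and
    <code>v_col</code> are illustrative.
</p>

```python
import numpy as np

u = np.array([2, -7, 1])
v = np.array([-3, 0, 4])

# 1-D arrays have no row/column orientation, so dot() gives a scalar directly.
inner_1d = u.dot(v)

# As 3x1 column matrices the transpose becomes necessary:
# (3,1) times (3,1) is undefined, but (1,3) times (3,1) is a 1x1 matrix.
u_col = u.reshape(3, 1)
v_col = v.reshape(3, 1)
inner_2d = u_col.T @ v_col

print(inner_1d)
print(inner_2d.shape)
```

<p>
    The 1-dimensional version is what lets us get away with writing the product without the transpose.
</p>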

<h1> D </h1>

In [23]:
u.dot(u)

54

# E

In [25]:
np.sqrt(u.dot(u))

7.3484692283495345
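
<p>
    This quantity is the Euclidean length of the vector. As a cross-check, <code>np.linalg.norm</code>
    should give the same value; the variable names below are illustrative.
</p>

```python
import numpy as np

u = np.array([2, -7, 1])

length_from_dot = np.sqrt(u.dot(u))
length_from_norm = np.linalg.norm(u)   # Euclidean (L2) norm by default

print(np.isclose(length_from_dot, length_from_norm))
```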

# F

In [26]:
u * v

array([-6,  0,  4])

# G

In [27]:
A + A

array([[ 2,  4,  0],
       [ 6, -2,  8]])

# H

In [28]:
A + u

array([[ 3, -5,  1],
       [ 5, -8,  5]])
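
<p>
    Mathematically, adding a 2&times;3 matrix and a 3-vector is not defined; numpy accepts it because of
    broadcasting. The sketch below compares the broadcast result with an explicit version in which the
    vector is stacked once per row (<code>np.tile</code> is used only to make the repetition visible).
</p>

```python
import numpy as np

u = np.array([2, -7, 1])
A = np.array([[1, 2, 0], [3, -1, 4]])

# A has shape (2, 3) and u has shape (3,): the trailing dimensions match,
# so u is treated as if it were repeated once per row of A.
broadcast_sum = A + u
explicit_sum = A + np.tile(u, (2, 1))   # same sum with u stacked explicitly

print(np.array_equal(broadcast_sum, explicit_sum))
```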

# I

In [29]:
10 * A

array([[ 10,  20,   0],
       [ 30, -10,  40]])

# J

In [30]:
A.dot(v)

array([-3,  7])

# K

In [31]:
A.dot(B)

array([[ 4, -1],
       [-7, 13]])

# L

In [32]:
A.T

array([[ 1,  3],
       [ 2, -1],
       [ 0,  4]])

# M

In [33]:
A.dot(A.T)

array([[ 5,  1],
       [ 1, 26]])

# N

In [38]:
A.T.dot(A)

array([[10, -1, 12],
       [-1,  5, -4],
       [12, -4, 16]])

# O

In [98]:
u.min()

-7

# P

In [37]:
np.argmin(u)

1

# Q

In [39]:
np.mean(u)

-1.3333333333333333

# 2

In [43]:
np.cumsum(u)

array([ 2, -5, -4])

In [64]:
def pythonCumSum(toSum):
    # Element i of the result is the sum of the first i+1 items of toSum
    return [sum(toSum[:x]) for x in range(1, len(toSum)+1)]

        

In [65]:
pythonCumSum([1,2,3])

[1, 3, 6]

In [96]:
pythonTestList = list(range(1, 1001))

In [97]:
%timeit pythonCumSum(pythonTestList)

2.45 ms ± 34.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [80]:
testList = np.arange(1, 1001)

In [94]:
%timeit np.cumsum(testList)

3.52 µs ± 37.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
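
<p>
    Part of the gap comes from the algorithm rather than from Python itself: <code>pythonCumSum</code>
    re-sums a slice for every position, so its work grows quadratically with the length of the list.
    A single-pass version, sketched below under the hypothetical name <code>running_total</code>, would
    make for a fairer comparison; <code>itertools.accumulate</code> from the standard library does the
    same job.
</p>

```python
from itertools import accumulate

def running_total(values):
    """Cumulative sum of a list in a single pass (hypothetical helper name)."""
    totals = []
    total = 0
    for value in values:
        total += value          # carry the running sum forward
        totals.append(total)
    return totals

print(running_total([1, 2, 3]))
print(list(accumulate([1, 2, 3])))   # standard-library equivalent
```

<p>
    Timing it with <code>%timeit</code> on the same list shows how much of the difference remains once
    both versions do a linear amount of work.
</p>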
