![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)
<hr style="margin-bottom: 40px;">

<img src="https://user-images.githubusercontent.com/7065401/39118381-910eb0c2-46e9-11e8-81f1-a5b897401c23.jpeg"
    style="width:300px; float: right; margin: 0 40px 40px 40px;"></img>

# NumPy: Numeric computing library

NumPy (Numerical Python) is one of the core packages for numerical computing in Python. Pandas, Matplotlib, Statsmodels and many other scientific libraries rely on NumPy.

NumPy's major contributions are:

* Efficient numeric computation with C primitives
* Efficient collections with vectorized operations
* An integrated and natural Linear Algebra API
* A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.

Let's expand on efficiency. In Python, **everything is an object**, which means that even simple ints are objects, with all the machinery required to make objects work. We call them "Boxed Ints". In contrast, NumPy uses primitive numeric types (floats, ints), which makes both storage and computation efficient.

<img src="https://docs.google.com/drawings/d/e/2PACX-1vTkDtKYMUVdpfVb3TTpr_8rrVtpal2dOknUUEOu85wJ1RitzHHf5nsJqz1O0SnTt8BwgJjxXMYXyIqs/pub?w=726&h=396" />
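
To make the boxed-versus-primitive distinction concrete, here is a minimal sketch. The variable names are illustrative, and the dtype is fixed to `int64` as an assumption so the byte counts do not depend on the platform.

```python
import numpy as np

# A Python list stores references to full int objects ("boxed" ints).
boxed = [1, 2, 3, 4]

# A NumPy array stores the raw values side by side in one buffer.
# The dtype is set explicitly (an assumption of this sketch) so the
# sizes below are the same on every platform.
packed = np.array(boxed, dtype=np.int64)

print(type(boxed[0]))    # <class 'int'>
print(packed.dtype)      # int64
print(packed.itemsize)   # 8 bytes per element
print(packed.nbytes)     # 32 bytes of element data
```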


![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Hands on! 

In [426]:
import sys
import numpy as np

## Basic NumPy Arrays

In [427]:
np.array([1, 2, 3, 4])

array([1, 2, 3, 4])

In [428]:
b = np.array([0, .5, 1, 1.5, 2])

In [429]:
b[0], b[1]

(0.0, 0.5)

In [430]:
a = np.array([1, 2, 3, 4])

In [431]:
a[1:]

array([2, 3, 4])

In [432]:
a[1:3]  # the upper limit (index 3) is excluded

array([2, 3])

In [433]:
a[1:-1]

array([2, 3])

In [434]:
a[::2] #stepsize of 2

array([1, 3])

In [435]:
print(b)

[0.  0.5 1.  1.5 2. ]


In [436]:
b[0], b[2], b[-1]

(0.0, 1.0, 2.0)

In [437]:
b[[0, 2, -1]]

array([0., 1., 2.])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Array Types

In [438]:
a

array([1, 2, 3, 4])

In [439]:
a.dtype

dtype('int64')

In [440]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [441]:
b.dtype

dtype('float64')

In [442]:
np.array([1, 2, 3, 4], dtype=float)  # the np.float alias was removed from recent NumPy releases

array([1., 2., 3., 4.])

In [443]:
np.array([1, 2, 3, 4], dtype=np.int8)  # stores each element as an 8-bit (1-byte) integer

array([1, 2, 3, 4], dtype=int8)

In [444]:
c = np.array(['a', 'b', 'c'])

In [445]:
c.dtype

dtype('<U1')

In [446]:
d = np.array([{'a': 1}, sys])

In [447]:
d.dtype

dtype('O')
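
As a small sketch of how dtypes can be inspected and changed after creation, the example below uses `astype` and `np.iinfo`. The variable names are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3, 4])

# astype returns a new array with the requested dtype.
floats = ints.astype(np.float64)
small = ints.astype(np.int8)

# np.iinfo reports the representable range of an integer dtype.
int8_info = np.iinfo(np.int8)

print(floats.dtype, small.dtype)     # float64 int8
print(int8_info.min, int8_info.max)  # -128 127
print(small.itemsize)                # 1
```

Values outside the reported range do not fit in an `int8` array, so a narrow dtype is only a good choice when the data is known to be small.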

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Dimensions and shapes

In [448]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

In [449]:
A.shape

(2, 3)

In [450]:
A.ndim      # number of dimensions (axes) of the array

2

In [451]:
A.size

6

In [452]:
B = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4],
        [3, 2, 1]
    ]
])

In [453]:
B

array([[[12, 11, 10],
        [ 9,  8,  7]],

       [[ 6,  5,  4],
        [ 3,  2,  1]]])

In [454]:
B.shape

(2, 2, 3)

In [455]:
B.ndim

3

In [456]:
B.size

12

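If the shape isn't consistent, NumPy can't build a regular numeric array. With `dtype=object` it falls back to an array of regular Python objects (recent NumPy versions raise a `ValueError` when `dtype=object` is omitted):

In [457]:
C = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4]
    ]
], dtype=object)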

In [458]:
C.dtype

dtype('O')

In [459]:
C.shape

(2,)

In [460]:
C.size

2

In [461]:
type(C[0])

list
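
A minimal sketch of two ways to handle ragged input. Padding with `0` is only an illustrative choice, not something the notebook prescribes, and the names are hypothetical.

```python
import numpy as np

ragged = [[1, 2, 3], [4, 5]]

# Explicit dtype=object: a 1-D array whose elements are Python lists.
as_objects = np.array(ragged, dtype=object)

# Illustrative alternative: pad short rows (here with 0) so that a
# regular numeric array can be built.
width = max(len(row) for row in ragged)
padded = np.array([row + [0] * (width - len(row)) for row in ragged])

print(as_objects.shape, as_objects.dtype)  # (2,) object
print(padded.shape)                        # (2, 3)
```

Object arrays lose the vectorized speed of numeric dtypes, so padding (or masking) is often preferable when the data is numeric.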

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Indexing and Slicing of Matrices

In [462]:
# Square matrix
A = np.array([
#    0  1  2
    [1, 2, 3], # 0
    [4, 5, 6], # 1
    [7, 8, 9]  # 2
])

In [463]:
A[1]    # index 1 is the second row

array([4, 5, 6])

In [464]:
A[1][0]     #row col

4

In [465]:
# A[d1, d2, d3, d4]

In [466]:
A[1, 0]     #same as the one before

4

In [467]:
A[0:2]      #gets you the first 2 rows not including the upper limit

array([[1, 2, 3],
       [4, 5, 6]])

In [468]:
A[:, :2]    #get all rows & the first 2 cols

array([[1, 2],
       [4, 5],
       [7, 8]])

In [469]:
A[:2, :2]   #separator is the comma, as this example gets the first 2 cols & rows

array([[1, 2],
       [4, 5]])

In [470]:
A[:2, 2:]   # get the first 2 rows and the last col

array([[3],
       [6]])

In [471]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [472]:
A[1] = np.array([10, 10, 10])       #change the 2nd row to the one specified in the arr

In [473]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [ 7,  8,  9]])

In [474]:
A[2] = 99

In [475]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [99, 99, 99]])
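
One detail worth seeing in code: a slice keeps the axis it is applied to, while an integer index drops it. A small sketch with illustrative names:

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

col_slice = A[:2, 2:]   # slice on both axes: the result stays 2-D
col_index = A[:2, 2]    # integer on the column axis: that axis is dropped

print(col_slice.shape)  # (2, 1)
print(col_index.shape)  # (2,)
print(col_index)        # [3 6]
```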

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Summary statistics

In [476]:
a = np.array([1, 2, 3, 4])

In [477]:
a.sum()

10

In [478]:
a.mean()

2.5

In [479]:
a.std()

1.118033988749895

In [480]:
a.var()

1.25

In [481]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [482]:
A.sum()

45

In [483]:
A.mean()

5.0

In [484]:
A.std()

2.581988897471611

In [485]:
A.sum(axis=0)   # axis 0 collapses the rows: one sum per column

array([12, 15, 18])

In [486]:
A.sum(axis=1)   # axis 1 collapses the columns: one sum per row

array([ 6, 15, 24])

In [487]:
A.mean(axis=0)

array([4., 5., 6.])

In [488]:
A.mean(axis=1)

array([2., 5., 8.])

In [489]:
A.std(axis=0)

array([2.44948974, 2.44948974, 2.44948974])

In [490]:
A.std(axis=1)

array([0.81649658, 0.81649658, 0.81649658])

And [many more](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html#array-methods)...
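
By default, `std()` and `var()` divide by `n` (the population formula). The sketch below checks that by hand and shows the `ddof=1` option for the sample version. The variable names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# Population standard deviation, computed by hand.
manual_std = np.sqrt(((a - a.mean()) ** 2).mean())

# ddof=1 divides by n - 1 instead of n (sample standard deviation).
sample_std = a.std(ddof=1)

print(np.isclose(manual_std, a.std()))  # True
print(sample_std > a.std())             # True
```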

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Broadcasting and Vectorized operations

In [491]:
a = np.arange(4)    # arange(n) builds a vector of the n integers 0 .. n-1

In [492]:
a

array([0, 1, 2, 3])

In [493]:
a + 10

array([10, 11, 12, 13])

In [494]:
a * 10

array([ 0, 10, 20, 30])

In [495]:
a

array([0, 1, 2, 3])

In [496]:
a += 100

In [497]:
a

array([100, 101, 102, 103])

In [498]:
l = [0, 1, 2, 3]

In [499]:
[i * 10 for i in l]

[0, 10, 20, 30]

In [500]:
a = np.arange(4)

In [501]:
a

array([0, 1, 2, 3])

In [502]:
b = np.array([10, 10, 10, 10])

In [503]:
b

array([10, 10, 10, 10])

In [504]:
a + b

array([10, 11, 12, 13])

In [505]:
a * b

array([ 0, 10, 20, 30])
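
The cells above combine an array with a scalar or with an array of the same shape. Broadcasting also covers arrays of different but compatible shapes. A minimal sketch, with shapes chosen only for illustration:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)      # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
col = np.array([[100], [200]])      # shape (2, 1)

# row is stretched across both rows of A; col across all three columns.
print(A + row)
# [[10 21 32]
#  [13 24 35]]
print(A + col)
# [[100 101 102]
#  [203 204 205]]
```

Shapes are compared from the last axis backwards, and two axes are compatible when they are equal or one of them is 1.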

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Boolean arrays
_(Also called masks)_

In [506]:
a = np.arange(4)

In [507]:
a

array([0, 1, 2, 3])

In [508]:
a[0], a[-1]

(0, 3)

In [509]:
a[[0, -1]]

array([0, 3])

In [510]:
a[[True, False, False, True]]

array([0, 3])

In [511]:
a

array([0, 1, 2, 3])

In [512]:
a >= 2      #this is the mask

array([False, False,  True,  True])

In [513]:
a[a >= 2]

array([2, 3])

In [514]:
a.mean()

1.5

In [515]:
a[a > a.mean()]

array([2, 3])

In [516]:
a[~(a > a.mean())]      # ~ is not

array([0, 1])

In [517]:
a[(a == 0) | (a == 1)]

array([0, 1])

In [518]:
a[(a <= 2) & (a % 2 == 0)]

array([0, 2])

In [519]:
A = np.random.randint(100, size=(3, 3))

In [520]:
A

array([[34, 99, 36],
       [70, 60, 69],
       [62, 55, 16]])

In [521]:
A[np.array([
    [True, False, True],
    [False, True, False],
    [True, False, True]
])]

array([34, 36, 60, 62, 16])

In [522]:
A > 30

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True, False]])

In [523]:
A[A > 30]

array([34, 99, 36, 70, 60, 69, 62, 55])
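
Masks are useful for more than selection. As a sketch (names are illustrative), they can also be counted and used as the target of an assignment:

```python
import numpy as np

a = np.arange(6)          # [0 1 2 3 4 5]
mask = a % 2 == 0         # True where the value is even

print(mask.sum())         # 3  (each True counts as 1)

clipped = a.copy()        # copy so the original stays unchanged
clipped[clipped > 3] = 3  # assign through a mask
print(clipped)            # [0 1 2 3 3 3]
```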

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Linear Algebra

In [524]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [525]:
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])

In [526]:
A.dot(B)

array([[20, 14],
       [56, 41],
       [92, 68]])

In [527]:
A @ B   #matrix multiplication

array([[20, 14],
       [56, 41],
       [92, 68]])

In [528]:
B.T         # transpose

array([[6, 4, 2],
       [5, 3, 1]])

In [529]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [530]:
B.T @ A

array([[36, 48, 60],
       [24, 33, 42]])
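
A common source of confusion is that `*` is element-wise while `@` is the matrix product. The small matrices below are chosen only for illustration:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A * B)   # element-wise product
# [[0 2]
#  [3 0]]
print(A @ B)   # matrix product
# [[2 1]
#  [4 3]]

# For @, the inner dimensions must match: (3, 3) @ (3, 2) -> (3, 2).
product_shape = (np.ones((3, 3)) @ np.ones((3, 2))).shape
print(product_shape)   # (3, 2)
```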

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Size of objects in Memory

### Int, floats

In [531]:
# A Python int takes more than 24 bytes
sys.getsizeof(1)

28

In [532]:
# Large integers take even more
sys.getsizeof(10**100)

72

In [533]:
# Numpy size is much smaller
np.dtype(int).itemsize

8

In [534]:
# An 8-bit NumPy integer takes a single byte
np.dtype(np.int8).itemsize

1

In [535]:
np.dtype(float).itemsize

8

### Lists are even larger

In [536]:
# A one-element list
sys.getsizeof([1])

64

In [537]:
# An array of one element in numpy
np.array([1]).nbytes

8
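
`sys.getsizeof` on a list counts only the list object itself (its table of references), not the ints it points to. The sketch below adds the two together for a rough CPython estimate. It overcounts slightly where small ints are shared, and the names and the size `n` are illustrative.

```python
import sys
import numpy as np

n = 1000
values = list(range(n))
arr = np.arange(n, dtype=np.int64)

# getsizeof(list) covers only the list's reference table, so the int
# objects are added separately. This is a rough estimate for CPython.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

print(arr.nbytes)               # 8000
print(list_bytes > arr.nbytes)  # True
```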

### And performance is also important

In [538]:
l = list(range(100000))

In [539]:
a = np.arange(100000)

In [540]:
%time np.sum(a ** 2)

CPU times: user 1.02 ms, sys: 327 µs, total: 1.35 ms
Wall time: 574 µs


333328333350000

In [541]:
%time sum([x ** 2 for x in l])

CPU times: user 22.8 ms, sys: 0 ns, total: 22.8 ms
Wall time: 22.3 ms


333328333350000
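
`%time` measures a single run, which can be noisy. As a sketch of a more repeatable comparison outside IPython, the standard-library `timeit` module can average several runs. The run count is an arbitrary choice and the absolute numbers will vary by machine.

```python
import timeit
import numpy as np

n = 100_000
l = list(range(n))
a = np.arange(n, dtype=np.int64)

# Both approaches must agree before their speed is compared.
numpy_result = int(np.sum(a ** 2))
python_result = sum([x ** 2 for x in l])

runs = 5  # arbitrary; more runs give a steadier average
numpy_time = timeit.timeit(lambda: np.sum(a ** 2), number=runs) / runs
python_time = timeit.timeit(lambda: sum([x ** 2 for x in l]), number=runs) / runs

print(f"NumPy:  {numpy_time * 1e3:.3f} ms per run")
print(f"Python: {python_time * 1e3:.3f} ms per run")
```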

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Useful NumPy functions

### `random` 

In [542]:
np.random.random(size=2)

array([0.14615502, 0.65163495])

In [543]:
np.random.normal(size=2)

array([-1.26991515, -1.936757  ])

In [544]:
np.random.rand(2, 4)

array([[0.88819424, 0.15604106, 0.19590417, 0.86690251],
       [0.59362574, 0.34175965, 0.8029215 , 0.33370924]])
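
The outputs above change on every run. When reproducible numbers are needed, one option is a seeded `Generator` from `np.random.default_rng` (available in NumPy 1.17 and later). The seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same stream.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

sample_a = rng_a.random(size=3)
sample_b = rng_b.random(size=3)
print(np.array_equal(sample_a, sample_b))   # True

# Generator counterparts of calls used in this notebook.
normal_draws = rng_a.normal(size=2)
int_matrix = rng_a.integers(100, size=(3, 3))
print(normal_draws.shape)                   # (2,)
print(int_matrix.shape)                     # (3, 3)
```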

---
### `arange`

In [545]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [546]:
np.arange(5, 10)    # start (inclusive) and stop (exclusive)

array([5, 6, 7, 8, 9])

In [547]:
np.arange(0, 1, .1)     # the third argument is the step size

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

---
### `reshape`

In [548]:
np.arange(10).reshape(2, 5)     # reshape the vector into a 2x5 matrix

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [549]:
np.arange(10).reshape(5, 2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
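
Two `reshape` details in a short sketch: `-1` lets NumPy infer one dimension, and the total number of elements has to stay the same. The flag name is illustrative.

```python
import numpy as np

v = np.arange(10)

# -1 asks NumPy to infer that dimension from the total size.
print(v.reshape(2, -1).shape)   # (2, 5)
print(v.reshape(-1, 2).shape)   # (5, 2)

# The total number of elements must be preserved: 10 != 3 * 4.
reshape_failed = False
try:
    v.reshape(3, 4)
except ValueError as err:
    reshape_failed = True
    print("cannot reshape:", err)
```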

---
### `linspace`

In [550]:
np.linspace(0, 1, 5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [551]:
np.linspace(0, 1, 20)

array([0.        , 0.05263158, 0.10526316, 0.15789474, 0.21052632,
       0.26315789, 0.31578947, 0.36842105, 0.42105263, 0.47368421,
       0.52631579, 0.57894737, 0.63157895, 0.68421053, 0.73684211,
       0.78947368, 0.84210526, 0.89473684, 0.94736842, 1.        ])

In [552]:
np.linspace(0, 1, 20, endpoint=False)   # exclude the stop value

array([0.  , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 ,
       0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95])
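
The difference between including and excluding the endpoint comes down to the spacing. A sketch using `retstep=True`, which makes `linspace` return that spacing as well (names are illustrative):

```python
import numpy as np

# retstep=True also returns the spacing between consecutive samples.
closed, closed_step = np.linspace(0, 1, 5, retstep=True)
half_open, open_step = np.linspace(0, 1, 5, endpoint=False, retstep=True)

print(closed_step)      # 0.25 -> (stop - start) / (num - 1)
print(closed[-1])       # 1.0  -> stop is included
print(len(half_open))   # 5    -> same count, but stop is excluded
```

With `endpoint=False` the spacing becomes `(stop - start) / num`, so the last sample falls one step short of `stop`.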

---
### `zeros`, `ones`, `empty`

In [553]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [554]:
np.zeros((3, 3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [555]:
np.zeros((3, 3), dtype=int)  # the np.int alias was removed from recent NumPy releases

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [556]:
np.ones(5)

array([1., 1., 1., 1., 1.])

In [557]:
np.ones((3, 3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [558]:
np.empty(5)  # allocates without initializing; the values shown are whatever was in memory

array([1., 1., 1., 1., 1.])

In [559]:
np.empty((2, 2))  # again uninitialized, so the output is arbitrary

array([[0.25, 0.5 ],
       [0.75, 1.  ]])
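
Because `np.empty` leaves memory uninitialized, it should only be read after every element has been written. A minimal sketch, with `np.full` shown as the one-step alternative (the fill value 7.0 is arbitrary):

```python
import numpy as np

# np.empty only allocates; write every element before reading any.
buf = np.empty(5)
buf[:] = 7.0
print(buf)                  # [7. 7. 7. 7. 7.]

# np.full allocates and fills in one step.
filled = np.full((2, 2), 7.0)
print(filled)
# [[7. 7.]
#  [7. 7.]]
```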

---
### `identity` and `eye`

In [560]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [561]:
np.eye(3, 3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [562]:
np.eye(8, 4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [567]:
np.eye(8, 4, k=1)       # ones shifted one column to the right of the main diagonal

array([[0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [564]:
np.eye(8, 4, k=-3)

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.]])
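
A compact sketch of what the `k` offset does, using a square matrix so the pattern is easy to see (names are illustrative):

```python
import numpy as np

# k > 0 moves the ones above the main diagonal, k < 0 below it.
upper = np.eye(4, k=1)
lower = np.eye(4, k=-1)

print(upper)
# [[0. 1. 0. 0.]
#  [0. 0. 1. 0.]
#  [0. 0. 0. 1.]
#  [0. 0. 0. 0.]]

# The two are transposes of each other, and identity(n) matches eye(n).
print(np.array_equal(upper.T, lower))              # True
print(np.array_equal(np.identity(4), np.eye(4)))   # True
```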

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)