![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)
<hr style="margin-bottom: 40px;">

<img src="https://user-images.githubusercontent.com/7065401/39118381-910eb0c2-46e9-11e8-81f1-a5b897401c23.jpeg"
    style="width:300px; float: right; margin: 0 40px 40px 40px;"></img>

# Numpy: Numeric computing library

NumPy (Numerical Python) is one of the core packages for numerical computing in Python. Pandas, Matplotlib, Statsmodels and many other scientific libraries rely on NumPy.

NumPy's major contributions are:

* Efficient numeric computation with C primitives
* Efficient collections with vectorized operations
* An integrated and natural Linear Algebra API
* A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.

Let's expand on efficiency. In Python, **everything is an object**, which means that even simple ints are objects, with all the machinery required to make objects work. We call them "Boxed Ints". In contrast, NumPy uses primitive numeric types (floats, ints), which makes both storage and computation efficient.

<img src="https://docs.google.com/drawings/d/e/2PACX-1vTkDtKYMUVdpfVb3TTpr_8rrVtpal2dOknUUEOu85wJ1RitzHHf5nsJqz1O0SnTt8BwgJjxXMYXyIqs/pub?w=726&h=396" />


![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Hands on! 

In [5]:
import sys
import numpy as np

## Basic Numpy Arrays

In [2]:
np.array([1, 2, 3, 4])

array([1, 2, 3, 4])

In [13]:
a = np.array([1, 2, 3, 4])

In [5]:
b = np.array([0, .5, 1, 1.5, 2])

In [7]:
print(a.dtype)
print(b.dtype)

int64
float64


In [8]:
a[0], a[1] # Select elements 0 and 1 of the array

(1, 2)

In [9]:
a[0:] # Select a slice of the array, from position 0 to the end

array([1, 2, 3, 4])

In [11]:
a[1:] # Select from the element at position 1 to the end

array([2, 3, 4])

In [None]:
a[1:3] # Select the elements at positions 1 and 2 (position 3 is not included)

array([2, 3])

In [12]:
a[1:-1] # Select from position 1 through position -2, counting from the end (the stop index -1 is not included)

array([2, 3])

In [13]:
a[::2] # Select every second element, starting at position 0

array([1, 3])

In [6]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [8]:
b.dtype

dtype('float64')

In [7]:
b[0], b[2], b[-1] # Select the elements one by one

(0.0, 1.0, 2.0)

In [9]:
b[[0, 2, -1]] # Indexing with a list of positions creates a new array with the elements at positions 0, 2 and the last one

array([0., 1., 2.])
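
One practical difference between the two selection styles above, shown as a minimal sketch: a basic slice returns a view onto the same memory, while indexing with a list of positions returns an independent copy. The names `view` and `copy` are illustrative only.

```python
import numpy as np

b = np.array([0, .5, 1, 1.5, 2])

view = b[1:3]          # basic slice: shares memory with b
copy = b[[0, 2, -1]]   # list of positions: a new, independent array

view[0] = 99.0         # writes through to b[1]
copy[0] = -1.0         # does not touch b

print(b)
print(np.shares_memory(b, view))
print(np.shares_memory(b, copy))
```

If a slice must not affect the original array, calling `.copy()` on it is the usual way to detach it.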

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Array Types

In [14]:
a

array([1, 2, 3, 4])

In [15]:
a.dtype

dtype('int64')

In [16]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [17]:
b.dtype

dtype('float64')

In [18]:
np.array([1, 2, 3, 4], dtype=np.float64) # We can choose the array's data type, for example float (the old np.float alias was removed in NumPy 1.24)

array([1., 2., 3., 4.])

In [19]:
np.array([1, 2, 3, 4], dtype=np.int8) # Or integers, but 8-bit ones

array([1, 2, 3, 4], dtype=int8)

In [20]:
c = np.array(['a', 'b', 'c']) # Arrays of strings also exist, although they are not used often

In [21]:
c.dtype # Strings get their own dtype; '<U1' means Unicode strings of at most 1 character

dtype('<U1')

In [22]:
d = np.array([{'a': 1}, sys])

In [23]:
d.dtype

dtype('O')
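
The dtype can also be changed after an array exists. The following is a small sketch, assuming the `a` array from above, that converts with `astype` and checks how many bytes each element takes; `as_float` and `as_small` are names chosen here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

as_float = a.astype(np.float64)  # returns a new array; a is unchanged
as_small = a.astype(np.int8)     # 1 byte per element instead of 8

print(as_float.dtype, as_float.itemsize)
print(as_small.dtype, as_small.itemsize)

# Small integer types have a limited range
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)
```

Smaller dtypes save memory, but values outside the range reported by `np.iinfo` cannot be represented, so the conversion is only safe when the data fits.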

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Dimensions and shapes

In [24]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
]) # A two-dimensional array; its axes are the sides of the square (or of the cube, in 3-D)

A one-dimensional array has a single axis, axis 0. In a two-dimensional array, axis 0 indexes the rows (it runs down the array) and axis 1 indexes the columns (it runs across), so reducing along axis 0 gives one result per column. In a three-dimensional array such as `B` below, axis 0 selects the outer block (the depth), axis 1 the rows within each block, and axis 2 the columns.

`shape` gives the length of each axis; for a 2-D array that is the number of rows followed by the number of columns.

In [None]:
A.shape # Number of rows and columns

(2, 3)

In [None]:
A.ndim # Number of dimensions

2

In [25]:
A.size # Total number of elements

6

In [26]:
B = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4],
        [3, 2, 1]
    ]
])

In [27]:
B

array([[[12, 11, 10],
        [ 9,  8,  7]],

       [[ 6,  5,  4],
        [ 3,  2,  1]]])

In [28]:
B.shape

(2, 2, 3)

In [29]:
B.ndim

3

In [30]:
B.size

12
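
To make the relationship between `shape` and axes concrete, here is a minimal sketch that reduces along each axis and checks the resulting shapes. It rebuilds the same values as `A` and `B` above; the variable names for the results are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
rows, cols = A.shape

col_totals = A.sum(axis=0)  # collapses axis 0 (rows): one total per column
row_totals = A.sum(axis=1)  # collapses axis 1 (columns): one total per row

# Same values as B above: 12 down to 1, arranged as 2 blocks of 2x3
B = np.arange(12, 0, -1).reshape(2, 2, 3)
collapsed = B.sum(axis=0)   # axis 0 disappears, leaving shape (2, 3)

print(rows, cols)
print(col_totals, row_totals)
print(collapsed.shape)
```

A useful rule of thumb: the axis passed to a reduction is the one that disappears from the result's shape.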

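If the nested lists don't have a consistent shape, NumPy can't build a regular N-dimensional array. Older versions fell back to an array of Python objects and printed a deprecation warning; since NumPy 1.24 this raises a `ValueError` unless you request `dtype=object` explicitly:

In [31]:
C = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4]
    ]
], dtype=object)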


In [32]:
C.dtype

dtype('O')

In [33]:
C.shape

(2,)

In [34]:
C.size

2

In [None]:
type(C[0])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Indexing and Slicing of Matrices

In [35]:
# Square matrix
A = np.array([
#    0  1  2
    [1, 2, 3], # 0
    [4, 5, 6], # 1
    [7, 8, 9]  # 2
])

In [36]:
A[1]

array([4, 5, 6])

In [37]:
A[1][0]

4

In [38]:
# A[d1, d2, d3, d4]

In [None]:
A[1, 0]

4

In [39]:
A[0:2]

array([[1, 2, 3],
       [4, 5, 6]])

In [40]:
A[:, :2]

array([[1, 2],
       [4, 5],
       [7, 8]])

In [41]:
A[:2, :2]

array([[1, 2],
       [4, 5]])

In [42]:
A[:2, 2:]

array([[3],
       [6]])

In [43]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [44]:
A[1] = np.array([10, 10, 10])

In [45]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [ 7,  8,  9]])

In [46]:
A[2] = 99

In [None]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [99, 99, 99]])
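
The same indexing rules apply to columns. As a short sketch under the same 3x3 setup (variable names are illustrative), selecting with an integer drops that axis, selecting with a slice keeps it, and assignments broadcast a scalar or accept a sequence of matching length.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

second_col = A[:, 1]        # integer index drops the axis: shape (3,)
second_col_2d = A[:, 1:2]   # slice keeps the axis: shape (3, 1)
print(second_col.shape, second_col_2d.shape)

A[:, 0] = 0                 # a scalar is broadcast down the first column
A[2, 1:] = [80, 90]         # a sequence must match the selected shape
print(A)
```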

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Summary statistics

In [47]:
a = np.array([1, 2, 3, 4])

In [48]:
a.sum()

10

In [49]:
a.mean()

2.5

In [50]:
a.std()

1.118033988749895

In [51]:
a.var()

1.25

In [52]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [53]:
A.sum()

45

In [54]:
A.mean()

5.0

In [55]:
A.std()

2.581988897471611

In [56]:
A.sum(axis=0)

array([12, 15, 18])

In [57]:
A.sum(axis=1)

array([ 6, 15, 24])

In [58]:
A.mean(axis=0)

array([4., 5., 6.])

In [59]:
A.mean(axis=1)

array([2., 5., 8.])

In [60]:
A.std(axis=0)

array([2.44948974, 2.44948974, 2.44948974])

In [61]:
A.std(axis=1)

array([0.81649658, 0.81649658, 0.81649658])

And [many more](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html#array-methods)...
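
One detail worth knowing about `std` and `var`: by default they compute the population statistic (dividing by N). The sketch below, using the `a` array from this section, compares that default with the sample version obtained through the `ddof` parameter and recomputes both by hand; the `by_hand_*` names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

population_std = a.std()      # divides by N (ddof=0, the default)
sample_std = a.std(ddof=1)    # divides by N - 1

# The same values computed from the definition
deviations = a - a.mean()
by_hand_population = np.sqrt((deviations ** 2).sum() / a.size)
by_hand_sample = np.sqrt((deviations ** 2).sum() / (a.size - 1))

print(population_std, sample_std)
```

Note that pandas uses `ddof=1` by default, so the same data can give slightly different results in the two libraries.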

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Broadcasting and Vectorized operations

In [62]:
a = np.arange(4)

In [63]:
a

array([0, 1, 2, 3])

In [64]:
a + 10 # Add 10 to every element of the array, without modifying the original array

array([10, 11, 12, 13])

In [65]:
a * 10

array([ 0, 10, 20, 30])

In [66]:
a 

array([0, 1, 2, 3])

In [67]:
a += 100 # But += does modify the original array in place

In [68]:
a

array([100, 101, 102, 103])

In [69]:
l = [0, 1, 2, 3]

In [70]:
[i * 10 for i in l]

[0, 10, 20, 30]

In [71]:
l

[0, 1, 2, 3]

In [72]:
l = [i * 10 for i in l]

In [73]:
l

[0, 10, 20, 30]

In [74]:
a = np.arange(4)

In [75]:
a

array([0, 1, 2, 3])

In [76]:
b = np.array([10, 10, 10, 10])

In [77]:
a + b

array([10, 11, 12, 13])

In [78]:
a * b

array([ 0, 10, 20, 30])
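
Broadcasting also works between arrays of different shapes, as long as each pair of trailing dimensions is equal or one of them is 1. The following is a minimal sketch with made-up arrays (`row`, `col` and the result names are illustrative) showing a row and a column being stretched across a 2x3 array, plus an incompatible case.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])     # shape (3,)
col = np.array([[100], [200]])   # shape (2, 1)

plus_row = A + row   # row is reused for each of the 2 rows
plus_col = A + col   # col is reused across the 3 columns
print(plus_row)
print(plus_col)

# Shapes (2, 3) and (2,) do not line up: trailing dimensions are 3 and 2
try:
    A + np.array([1, 2])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)
```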

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Boolean arrays
_(Also called masks)_

In [79]:
a = np.arange(4)

In [80]:
a

array([0, 1, 2, 3])

In [81]:
a[0], a[-1]

(0, 3)

In [82]:
a[[0, -1]]

array([0, 3])

In [86]:
a[[True, False, False, True]] # Boolean selection

array([0, 3])

In [87]:
a >= 2

array([False, False,  True,  True])

In [88]:
a[a >= 2] # Select, or FILTER, the elements of the array that satisfy a condition (here, greater than or equal to 2)

array([2, 3])

In [89]:
a 

array([0, 1, 2, 3])

In [90]:
a.mean()

1.5

In [91]:
a[a > a.mean()]

array([2, 3])

In [92]:
a[~(a > a.mean())]

array([0, 1])

In [93]:
a[(a == 0) | (a == 1)]

array([0, 1])

In [94]:
a[(a <= 2) & (a % 2 == 0)]

array([0, 2])

In [95]:
A = np.random.randint(100, size=(3, 3))

In [96]:
A

array([[82, 77, 72],
       [40, 17, 65],
       [77, 37, 77]])

In [97]:
A[np.array([
    [True, False, True],
    [False, True, False],
    [True, False, True]
])]

array([82, 72, 17, 77, 77])

In [98]:
A > 30

array([[ True,  True,  True],
       [ True, False,  True],
       [ True,  True,  True]])

In [99]:
A[A > 30]

array([82, 77, 72, 40, 65, 77, 37, 77])

In [101]:
g = np.arange(5)
print(g <= 3)

[ True  True  True  True False]
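
Masks are useful for more than filtering. As a small sketch on a made-up array (the names `mask`, `count` and `capped` are illustrative), a mask can be counted, used to build a new array with `np.where`, or used as an assignment target.

```python
import numpy as np

a = np.arange(10)
mask = (a % 2 == 0) & (a > 2)      # even and greater than 2

count = mask.sum()                 # each True counts as 1
capped = np.where(a > 5, 5, a)     # new array; a itself is unchanged here

a[mask] = -1                       # in-place assignment through the mask
print(count)
print(capped)
print(a)
```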


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Linear Algebra

In [102]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [103]:
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])

In [104]:
A.dot(B)

array([[20, 14],
       [56, 41],
       [92, 68]])

In [105]:
A @ B

array([[20, 14],
       [56, 41],
       [92, 68]])

In [106]:
B.T

array([[6, 4, 2],
       [5, 3, 1]])

In [107]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [108]:
B.T @ A

array([[36, 48, 60],
       [24, 33, 42]])
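
A common source of confusion is that `*` is element-wise while `@` is the matrix product. Here is a brief sketch with small made-up matrices that contrasts the two, checks the transpose rule for products, and solves a linear system with `np.linalg.solve`; the right-hand side `rhs` is chosen only for illustration.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

elementwise = A * B      # multiplies matching positions
matmul = A @ B           # rows of A times columns of B
print(elementwise)
print(matmul)

# Transposing a product reverses the order of the factors
transpose_rule_holds = np.array_equal((A @ B).T, B.T @ A.T)

# Solve A x = rhs for x
rhs = np.array([5, 11])
x = np.linalg.solve(A, rhs)
print(x)
```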

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Size of objects in Memory

### Int, floats

In [109]:
# A Python int takes more than 24 bytes
sys.getsizeof(1)

28

In [110]:
# Longs are even larger
sys.getsizeof(10**100)

72

In [111]:
# Numpy size is much smaller
np.dtype(int).itemsize

8

In [112]:
np.dtype(float).itemsize

8

### Lists are even larger

In [115]:
# A one-element list
sys.getsizeof([1])

80

In [116]:
# An array of one element in numpy
np.array([1]).nbytes

8
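
The gap grows with the number of elements. The sketch below is a rough, CPython-specific estimate: `sys.getsizeof` on a list counts only the list object and its pointer table, so the size of each int is added separately. The exact byte counts depend on the Python version and platform, so treat them as indicative.

```python
import sys
import numpy as np

n = 1000
values = list(range(n))
arr = np.arange(n, dtype=np.int64)

# List overhead plus every boxed int it points to (approximate)
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

# Raw data buffer only: 8 bytes per int64 element
array_bytes = arr.nbytes

print(list_bytes, array_bytes)
```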

### And performance is also important

In [117]:
l = list(range(1000))

In [118]:
a = np.arange(1000)

In [119]:
%time np.sum(a ** 2)

CPU times: user 86 µs, sys: 13 µs, total: 99 µs
Wall time: 105 µs


332833500

In [120]:
%time sum([x ** 2 for x in l])

CPU times: user 260 µs, sys: 39 µs, total: 299 µs
Wall time: 302 µs


332833500
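
A single `%time` run is noisy, so differences this small are easier to trust when the statement is repeated many times. As a sketch outside the notebook, the standard-library `timeit` module can do the repetition; the repetition count of 1000 is an arbitrary choice, and the actual timings will vary by machine.

```python
import timeit
import numpy as np

l = list(range(1000))
a = np.arange(1000)

# Both approaches compute the same sum of squares
numpy_result = int(np.sum(a ** 2))
python_result = sum([x ** 2 for x in l])

# Total seconds for 1000 repetitions of each statement
numpy_time = timeit.timeit(lambda: np.sum(a ** 2), number=1000)
python_time = timeit.timeit(lambda: sum([x ** 2 for x in l]), number=1000)

print(numpy_result, python_result)
print(f"numpy: {numpy_time:.4f}s, pure Python: {python_time:.4f}s")
```

Inside a notebook, the `%timeit` magic performs the same kind of repeated measurement.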

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Useful Numpy functions

### `random` 

In [121]:
np.random.random(size=2)

array([0.17575622, 0.09948374])

In [122]:
np.random.normal(size=2)

array([2.14178793, 0.70699827])

In [123]:
np.random.rand(2, 4)

array([[0.41720585, 0.41946296, 0.66978185, 0.01831138],
       [0.25412758, 0.71774402, 0.2447797 , 0.43410497]])
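
The outputs above change on every run. When reproducible numbers are needed, one option is NumPy's newer `Generator` interface, created with `np.random.default_rng` and a seed. This is a minimal sketch; the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)          # seeded generator
uniform = rng.random(size=2)             # floats in [0, 1)
normal = rng.normal(size=2)              # standard normal samples
ints = rng.integers(0, 100, size=(3, 3)) # integers in [0, 100)

# A second generator with the same seed yields the same sequence
rng_again = np.random.default_rng(42)
same_uniform = rng_again.random(size=2)

print(uniform, same_uniform)
```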

---
### `arange`

In [124]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [125]:
np.arange(5, 10)

array([5, 6, 7, 8, 9])

In [126]:
np.arange(0, 1, .1)

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

---
### `reshape`

In [127]:
np.arange(10).reshape(2, 5)

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [128]:
np.arange(10).reshape(5, 2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
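
Two more details about `reshape`, shown as a short sketch: one dimension can be given as `-1` and NumPy infers it from the total size, and a shape whose product does not match the number of elements raises a `ValueError`. The variable names are illustrative.

```python
import numpy as np

flat = np.arange(10)

inferred = flat.reshape(2, -1)   # -1 is inferred as 5
back = inferred.reshape(-1)      # flatten back to one dimension
print(inferred.shape, back.shape)

# 10 elements cannot fill a 3x4 array
try:
    flat.reshape(3, 4)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)
```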

---
### `linspace`

In [6]:
np.linspace(0, 1, 5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [7]:
np.linspace(0, 1, 20)

array([0.        , 0.05263158, 0.10526316, 0.15789474, 0.21052632,
       0.26315789, 0.31578947, 0.36842105, 0.42105263, 0.47368421,
       0.52631579, 0.57894737, 0.63157895, 0.68421053, 0.73684211,
       0.78947368, 0.84210526, 0.89473684, 0.94736842, 1.        ])

In [8]:
np.linspace(0, 1, 20, False)

array([0.  , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 ,
       0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95])
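
The positional `False` in the last call is the `endpoint` argument; passing it by keyword is easier to read. The sketch below also uses `retstep=True`, which makes `linspace` return the spacing it used, to show how excluding the endpoint changes the step.

```python
import numpy as np

# Endpoint included: 5 points, 4 intervals
closed, step_closed = np.linspace(0, 1, 5, retstep=True)

# Endpoint excluded: 20 points, 20 intervals
half_open, step_open = np.linspace(0, 1, 20, endpoint=False, retstep=True)

print(step_closed, closed[-1])
print(step_open, half_open[-1])
```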

---
### `zeros`, `ones`, `empty`

In [9]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

In [10]:
np.zeros((3, 3))

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [11]:
np.zeros((3, 3), dtype=int)

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [12]:
np.ones(5)

array([1., 1., 1., 1., 1.])

In [13]:
np.ones((3, 3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [14]:
np.empty(5)

array([1., 1., 1., 1., 1.])

In [15]:
np.empty((2, 2))

array([[0.25, 0.5 ],
       [0.75, 1.  ]])
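
Note that `np.empty` does not initialize memory, which is why the outputs above show leftover values rather than zeros. A brief sketch of related constructors follows: `np.full` for an arbitrary fill value, `np.zeros_like` to match an existing array, and `np.empty` used as a buffer that is filled before being read. The `template` array is made up for illustration.

```python
import numpy as np

template = np.array([[1, 2, 3],
                     [4, 5, 6]])

sevens = np.full((2, 2), 7)           # every element set to 7
like_zeros = np.zeros_like(template)  # same shape and dtype as template

buffer = np.empty(3)                  # contents are arbitrary at this point
buffer[:] = [1.5, 2.5, 3.5]           # fill it before reading

print(sevens)
print(like_zeros)
print(buffer)
```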

---
### `identity` and `eye`

In [16]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [17]:
np.eye(3, 3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [18]:
np.eye(8, 4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [19]:
np.eye(8, 4, k=1)

array([[0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [20]:
np.eye(8, 4, k=-3)

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.]])
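
To tie these back to linear algebra, here is a small sketch: multiplying by the identity matrix leaves a matrix unchanged, and the `k` argument of `np.eye` selects which diagonal holds the ones, which can be read back with `np.diag`. The matrix `A` is made up for illustration.

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)
identity = np.identity(3)

unchanged = A @ identity            # the identity is neutral for @
print(np.array_equal(unchanged, A))

shifted = np.eye(3, k=1)            # ones on the first diagonal above the main one
main_diagonal = np.diag(shifted)    # main diagonal is all zeros
upper_diagonal = np.diag(shifted, k=1)
print(main_diagonal, upper_diagonal)
```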

In [21]:
"Hello World"[6]

'W'

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)