
# NumPy Tutorial


"NumPy" is short for "Numerical Python".

NumPy provides highly efficient operations on multi-dimensional arrays.

NumPy lets us work with vectors and matrices.

The key features of NumPy are:
- ndarrays: n-dimensional arrays whose elements all share the same data type, optimized for fast and efficient computation.
- Broadcasting: a mechanism that makes it easy to operate on arrays of different shapes.
- Vectorization: arithmetic operations apply to whole ndarrays at once, without explicit Python loops.
- Input/Output: simplifies reading data from and writing data to files.

Other NumPy resources:
- Reference manual: https://docs.scipy.org/doc/numpy-1.13.0/reference/
- Python for Data Analysis, Wes McKinney
- Python Data Science Handbook, Jake VanderPlas


# Introduction

**ndarrays** are multi-dimensional arrays optimized for fast and efficient computation.


## Creating a Vector or 1-Dimensional Array

In [1]:
import numpy as np
np.__version__

'1.14.2'

In [2]:
# Create a 1-dimensional array
an_array = np.array([3, 33, 333])

print(type(an_array))

<class 'numpy.ndarray'>


In [3]:
# Check the shape of the array
print(an_array.shape)

(3,)


In [4]:
# Access elements with a single index, since the array has 1 dimension
print(an_array[0], an_array[1], an_array[2]) 

3 33 333


In [5]:
# ndarrays are mutable: we can modify the elements of the array
an_array[0] = 888

print(an_array)

[888  33 333]


## Creating a Matrix or 2-Dimensional Array

A 2-dimensional array represents a matrix and supports the usual operations of matrix arithmetic.

In [6]:
# Create a matrix
another = np.array([[11,12,13],[21,22,23]])

print(another)

print("The shape of the matrix is (rows, columns):", another.shape)

print("Accessing elements [0,0], [0,1] and [1,0]: ", another[0,0], ", ", another[0,1], ", ", another[1,0])

[[11 12 13]
 [21 22 23]]
The shape of the matrix is (rows, columns): (2, 3)
Accessing elements [0,0], [0,1] and [1,0]:  11 ,  12 ,  21
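
Besides `shape`, an ndarray exposes a few other descriptive attributes. A minimal sketch, rebuilding the same matrix under the illustrative name `m`:

```python
import numpy as np

m = np.array([[11, 12, 13], [21, 22, 23]])

print(m.ndim)   # number of dimensions: 2
print(m.shape)  # (rows, columns): (2, 3)
print(m.size)   # total number of elements: 6
```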



## Creating Arrays

We can create arrays with several NumPy functions.

In [7]:
import numpy as np

# Create a 2x2 array of zeros
ex1 = np.zeros((2,2))      
print(ex1)                              

[[0. 0.]
 [0. 0.]]


In [8]:
# Create a 2x2 array filled with 9.0
ex2 = np.full((2,2), 9.0)  
print(ex2)   

[[9. 9.]
 [9. 9.]]


In [9]:
# Create a 2x2 identity matrix (ones on the main diagonal, zeros elsewhere)
ex3 = np.eye(2,2)
print(ex3)  

[[1. 0.]
 [0. 1.]]


In [10]:
# Create a 1x2 array of ones
ex4 = np.ones((1,2))
print(ex4)    

[[1. 1.]]


In [11]:
# The array ex4 we just created has shape 1x2
print(ex4.shape)

# So we have to access it with 2 indices
print()
print(ex4[0,1])

(1, 2)

1.0


In [12]:
# Create a 2x2 array of random numbers between 0 and 1
ex5 = np.random.random((2,2))
print(ex5)    

[[0.74930351 0.38728933]
 [0.53446221 0.85097372]]
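
Two more creation functions are often useful for evenly spaced values. The sketch below is illustrative only; the variable names `evens` and `grid` are not part of the tutorial:

```python
import numpy as np

# Integers from 0 up to (but not including) 10, in steps of 2
evens = np.arange(0, 10, 2)
print(evens)  # [0 2 4 6 8]

# 5 evenly spaced values from 0 to 1, both endpoints included
grid = np.linspace(0, 1, 5)
print(grid)   # [0.   0.25 0.5  0.75 1.  ]
```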


## Array Indexing

Indexing lets us extract sub-arrays from an array.

In [13]:
import numpy as np

# Create a 3 x 4 matrix
an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])
print(an_array)

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]


In [14]:
# Get a 2 x 2 sub-array
a_slice = an_array[:2, 1:3]
print(a_slice)

[[12 13]
 [22 23]]


In [16]:
# Modifying a sub-array also modifies the original array
print("Before:", an_array[0, 1])
a_slice[0, 0] = 1000
print("After:", an_array[0, 1])
print(an_array)

Before: 12
After: 1000
[[  11 1000   13   14]
 [  21   22   23   24]
 [  31   32   33   34]]


In [17]:
# Set the first row to 12, then create a new array by copying a slice
an_array[0:1] = 12
a_slice = np.array(an_array[:2, 1:3])
print(a_slice)

[[12 12]
 [22 23]]


In [19]:
# Modifying a copy does not modify the original array
print("Before:", an_array[0, 1]) 
a_slice[0, 0] = 1000
print("After:", an_array[0, 1])
print(an_array)

Before: 12
After: 12
[[12 12 12 12]
 [21 22 23 24]
 [31 32 33 34]]
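
The difference between a view and a copy can also be checked directly. This is a small sketch, assuming `np.shares_memory` is available in the installed NumPy; the names `base`, `view` and `copy` are illustrative:

```python
import numpy as np

base = np.array([[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]])

view = base[:2, 1:3]          # basic slicing returns a view
copy = base[:2, 1:3].copy()   # .copy() allocates independent storage

print(np.shares_memory(base, view))  # True
print(np.shares_memory(base, copy))  # False

view[0, 0] = 1000
print(base[0, 1])  # 1000: the write went through the view
print(copy[0, 0])  # 12: the copy is unaffected
```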


Indexing also lets us extract rows or columns from an array.

In [20]:
# Create a 3 x 4 matrix
an_array = np.array([[11,12,13,14], [21,22,23,24], [31,32,33,34]])
print(an_array)

[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]


In [21]:
# Get a row of the array as a vector of length 4
row_rank1 = an_array[1, :]

print(row_rank1, row_rank1.shape)
print(row_rank1[2])

[21 22 23 24] (4,)
23


In [22]:
# Get a row of the array as a 1 x 4 matrix
row_rank2 = an_array[1:2, :]

print(row_rank2, row_rank2.shape)
print(row_rank2[0,2])

[[21 22 23 24]] (1, 4)
23


In [23]:
# Get a column of the array, as a vector and as a 3 x 1 matrix
print()
col_rank1 = an_array[:, 1]
col_rank2 = an_array[:, 1:2]

print(col_rank1, col_rank1.shape)
print()
print(col_rank2, col_rank2.shape)


[12 22 32] (3,)

[[12]
 [22]
 [32]] (3, 1)
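
A 1-dimensional result can be turned into a column or row matrix after the fact. The following sketch uses `np.newaxis` and `ravel`; the variable names are illustrative:

```python
import numpy as np

an_array = np.array([[11, 12, 13, 14], [21, 22, 23, 24], [31, 32, 33, 34]])

col = an_array[:, 1]            # shape (3,)
as_column = col[:, np.newaxis]  # shape (3, 1)
as_row = col[np.newaxis, :]     # shape (1, 3)
flat_again = as_column.ravel()  # back to shape (3,)

print(col.shape, as_column.shape, as_row.shape, flat_again.shape)
# (3,) (3, 1) (1, 3) (3,)
```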


## Advanced Indexing with Index Arrays

Index arrays let us access or modify the elements of an array selected by arrays of indices.

In [24]:
# Create a new array
an_array = np.array([[11,12,13], [21,22,23], [31,32,33], [41,42,43]])

print('Array:')
print(an_array)

Array:
[[11 12 13]
 [21 22 23]
 [31 32 33]
 [41 42 43]]


In [25]:
# Create the index arrays
col_indices = np.array([0, 1, 2, 0])
print('\nColumn indices: ', col_indices)

row_indices = np.arange(4)
print('\nRow indices : ', row_indices)


Column indices:  [0 1 2 0]

Row indices :  [0 1 2 3]


In [26]:
# Pair each row index with its column index
for row,col in zip(row_indices,col_indices):
    print(row, ", ",col)

0 ,  0
1 ,  1
2 ,  2
3 ,  0


In [27]:
# Read the selected values of the array
print('Selected values of the array: ',an_array[row_indices, col_indices])

Selected values of the array:  [11 22 33 41]


In [31]:
# Modify the selected values of the array
an_array[row_indices, col_indices] += 100000

print('\nArray:')
print(an_array)


Array:
[[100011     12     13]
 [    21 100022     23]
 [    31     32 100033]
 [100041     42     43]]
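
Indexing with a pair of index arrays picks one element per (row, column) pair. As a rough illustration, the sketch below compares it with the equivalent explicit loop; the names `a`, `rows`, `cols`, `picked` and `looped` are illustrative:

```python
import numpy as np

a = np.array([[11, 12, 13], [21, 22, 23], [31, 32, 33], [41, 42, 43]])
rows = np.arange(4)
cols = np.array([0, 1, 2, 0])

# One element per (row, col) pair, gathered in a single call
picked = a[rows, cols]

# The same selection written as an explicit Python loop
looped = np.array([a[r, c] for r, c in zip(rows, cols)])

print(picked)  # [11 22 33 41]
print(looped)  # [11 22 33 41]
```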


## Advanced Indexing with Boolean Arrays

Boolean arrays let us access or modify the elements of an array selected by a mask of True/False values.

In [32]:
# Create a 3x2 array
an_array = np.array([[11,12], [21, 22], [31, 32]])
print(an_array)

[[11 12]
 [21 22]
 [31 32]]


In [33]:
# Create a filter, that is, an array of Booleans
filter = (an_array > 15)
filter

array([[False, False],
       [ True,  True],
       [ True,  True]])

In [34]:
# Select the values where the filter is True
print(an_array[filter])

[21 22 31 32]


In [35]:
# Create a filter that combines two conditions
filter = ((an_array > 20) & (an_array < 30))
filter

array([[False, False],
       [ True,  True],
       [False, False]])

In [37]:
# Select the values where the filter is True
print(an_array[filter])

[21 22]


In [38]:
# Select the even values, with the condition written inline
an_array[(an_array % 2 == 0)]

array([12, 22, 32])

In [39]:
# Modify the even values of the array
an_array[an_array % 2 == 0] +=100
print(an_array)

[[ 11 112]
 [ 21 122]
 [ 31 132]]
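
Masks are combined with the element-wise operators `&`, `|` and `~`; Python's `and`, `or` and `not` do not work element by element on arrays. Each comparison needs its own parentheses because these operators bind more tightly than comparisons. A short sketch with illustrative names:

```python
import numpy as np

a = np.array([[11, 12], [21, 22], [31, 32]])

low_or_high = (a < 12) | (a > 30)   # element-wise OR
not_even = ~(a % 2 == 0)            # element-wise NOT

print(a[low_or_high])  # [11 31 32]
print(a[not_even])     # [11 21 31]
```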


# Data Types and Array Operations

## Data Types

An **ndarray** has a single data type shared by all of its elements.

In [40]:
ex1 = np.array([11, 12])
print(ex1.dtype)

int32


In [41]:
ex2 = np.array([11.0, 12.0])
print(ex2.dtype)

float64


In [42]:
# Set the data type explicitly
ex3 = np.array([11, 21], dtype=np.int64)
print(ex3.dtype)

int64


In [43]:
# Set the data type explicitly, converting floats to integers (the fractional part is dropped)
ex4 = np.array([11.1,12.7], dtype=np.int64)
print(ex4.dtype)
print()
print(ex4)

int64

[11 12]


In [44]:
# Set the data type explicitly, converting integers to floats
ex5 = np.array([11, 21], dtype=np.float64)
print(ex5.dtype)
print()
print(ex5)

float64

[11. 21.]
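
An existing array can be converted with `astype`. Converting floats to integers drops the fractional part, so the sketch below also shows rounding first with `np.rint`. The sample values and names are illustrative:

```python
import numpy as np

values = np.array([11.1, 12.7, -1.5])

truncated = values.astype(np.int64)          # drops the fractional part
rounded = np.rint(values).astype(np.int64)   # rounds to the nearest integer first

print(truncated)  # [11 12 -1]
print(rounded)    # [11 13 -2]
```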


# Arithmetic Operations with Arrays

In [49]:
# Create 2 arrays (np.int was removed in NumPy 1.24, so an explicit integer dtype is used)
x = np.array([[111,112],[121,122]], dtype=np.int64)
y = np.array([[211.1,212.1],[221.1,222.1]], dtype=np.float64)

print(x)
print()
print(y)

[[111 112]
 [121 122]]

[[211.1 212.1]
 [221.1 222.1]]


In [50]:
# Addition
print(x + y)
print()
print(np.add(x, y))

[[322.1 324.1]
 [342.1 344.1]]

[[322.1 324.1]
 [342.1 344.1]]


In [51]:
# Subtraction
print(x - y)
print()
print(np.subtract(x, y))

[[-100.1 -100.1]
 [-100.1 -100.1]]

[[-100.1 -100.1]
 [-100.1 -100.1]]


In [52]:
# Element-wise multiplication
print(x * y)
print()
print(np.multiply(x, y))

[[23432.1 23755.2]
 [26753.1 27096.2]]

[[23432.1 23755.2]
 [26753.1 27096.2]]


In [53]:
# Element-wise division
print(x / y)
print()
print(np.divide(x, y))

[[0.52581715 0.52805281]
 [0.54726368 0.54930212]]

[[0.52581715 0.52805281]
 [0.54726368 0.54930212]]


In [54]:
# Square root
print(np.sqrt(x))

[[10.53565375 10.58300524]
 [11.         11.04536102]]


In [55]:
# Exponential (e ** x)
print(np.exp(x))

[[1.60948707e+48 4.37503945e+48]
 [3.54513118e+52 9.63666567e+52]]
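
The operations above mix an integer array with a float array, and the results come out as floats. A small sketch of how the result type follows from the operands (array names are illustrative):

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int64)
floats = np.array([0.5, 0.5, 0.5], dtype=np.float64)

total = ints + floats
print(total.dtype)        # float64: int64 combined with float64 is promoted
print((ints * 2).dtype)   # int64
print((ints / 2).dtype)   # float64: true division yields floats
print((ints // 2).dtype)  # int64: floor division keeps the integer type
```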


# Statistical Operations with Arrays

In [56]:
# Create a random 2 x 5 matrix
arr = 10 * np.random.randn(2,5)
print(arr)

[[  1.50680271  -3.08395944  -2.78047072 -12.82987482 -17.39867806]
 [ 17.74475698  18.43020237   4.98328475   2.03219823   6.33114154]]


In [57]:
# Compute the mean of all elements
print(arr.mean())

1.493540353807532


In [58]:
# Compute the mean of each row
print(arr.mean(axis = 1))

[-6.91723607  9.90431677]


In [59]:
# Compute the mean of each column
print(arr.mean(axis = 0))

[ 9.62577985  7.67312146  1.10140701 -5.39883829 -5.53376826]


In [60]:
# Compute the sum of all elements
print(arr.sum())

14.93540353807532


In [61]:
# Compute the median of each row
print(np.median(arr, axis = 1))

[-3.08395944  6.33114154]
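
Because the array above is random, the effect of `axis` is easier to verify on fixed data. A minimal sketch with a hand-written 2 x 3 array (the name `data` is illustrative):

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

col_means = data.mean(axis=0)  # collapses the rows: one value per column
row_means = data.mean(axis=1)  # collapses the columns: one value per row
row_max = data.max(axis=1)

print(col_means)  # [2.5 3.5 4.5]
print(row_means)  # [2. 5.]
print(row_max)    # [3. 6.]
```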


# Sorting Arrays

In [62]:
# Create an array of 10 elements
unsorted = np.random.randn(10)

print(unsorted)

[-1.57002245 -0.75386872  0.88867619 -0.9122537   1.04876552 -0.58017708
 -1.15891518 -1.41069226  1.20629906  0.47568618]


In [63]:
# Create a copy and sort it; the original stays unsorted
sorted = np.array(unsorted)
sorted.sort()

print(sorted)
print()
print(unsorted)

[-1.57002245 -1.41069226 -1.15891518 -0.9122537  -0.75386872 -0.58017708
  0.47568618  0.88867619  1.04876552  1.20629906]

[-1.57002245 -0.75386872  0.88867619 -0.9122537   1.04876552 -0.58017708
 -1.15891518 -1.41069226  1.20629906  0.47568618]


In [64]:
# Sort in place
unsorted.sort() 

print(unsorted)

[-1.57002245 -1.41069226 -1.15891518 -0.9122537  -0.75386872 -0.58017708
  0.47568618  0.88867619  1.04876552  1.20629906]
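
As an alternative to copying and then sorting in place, `np.sort` returns a sorted copy and `np.argsort` returns the indices that would sort the array. A short sketch with illustrative names and fixed values:

```python
import numpy as np

values = np.array([3.0, 1.0, 2.0])

ordered = np.sort(values)    # new sorted array
order = np.argsort(values)   # indices that would sort `values`

print(ordered)        # [1. 2. 3.]
print(order)          # [1 2 0]
print(values)         # [3. 1. 2.]: the original is unchanged
print(values[order])  # [1. 2. 3.]
```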


# Finding Unique Values in an Array

In [65]:
array = np.array([1,2,1,4,2,1,4,2])

print(np.unique(array))

[1 2 4]
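
`np.unique` can also report how often each value occurs. A small sketch using its `return_counts` argument on the same data (the names `values` and `counts` are illustrative):

```python
import numpy as np

array = np.array([1, 2, 1, 4, 2, 1, 4, 2])

values, counts = np.unique(array, return_counts=True)

print(values)  # [1 2 4]
print(counts)  # [3 3 2]
```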


# Set Operations with Arrays

In [66]:
s1 = np.array(['desk','chair','bulb'])
s2 = np.array(['lamp','bulb','chair'])
print(s1, s2)

['desk' 'chair' 'bulb'] ['lamp' 'bulb' 'chair']


In [67]:
print( np.intersect1d(s1, s2) ) 

['bulb' 'chair']


In [68]:
print( np.union1d(s1, s2) )

['bulb' 'chair' 'desk' 'lamp']


In [69]:
# Elements of set s1 that are not in set s2
print( np.setdiff1d(s1, s2) )

['desk']


In [70]:
# Which elements of s1 are in s2?
print( np.in1d(s1, s2) )

[False  True  True]


# *Broadcasting*: expanding or adapting arrays

Reference manual: https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html

In [71]:
import numpy as np

start = np.zeros((4,3))
print(start)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [72]:
# Create a row with 3 values
add_rows = np.array([1, 0, 2])
print(add_rows)

[1 0 2]


In [73]:
# Add the row to every row of the matrix
y = start + add_rows
print(y)

[[1. 0. 2.]
 [1. 0. 2.]
 [1. 0. 2.]
 [1. 0. 2.]]


In [84]:
# Create a column with 4 values
add_cols = np.array([[0,1,2,3]])
# Take the transpose to turn the 1 x 4 row into a 4 x 1 column
print(add_cols)
add_cols = add_cols.T
print(add_cols)

[[0 1 2 3]]
[[0]
 [1]
 [2]
 [3]]


In [80]:
# Add the column to every column of the matrix
y = start + add_cols 
print(y)

[[0. 0. 0.]
 [1. 1. 1.]
 [2. 2. 2.]
 [3. 3. 3.]]


In [76]:
# Add a scalar (wrapped here in a 1-element array)
add_scalar = np.array([1])  
print(start+add_scalar)

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
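
Broadcasting compares shapes from the rightmost dimension: two dimensions are compatible when they are equal or when one of them is 1. The sketch below, with illustrative names, shows compatible shapes and one that is rejected:

```python
import numpy as np

start = np.zeros((4, 3))
row = np.array([1, 0, 2])             # shape (3,)
col = np.array([[0], [1], [2], [3]])  # shape (4, 1)

print((start + row).shape)  # (4, 3)
print((start + col).shape)  # (4, 3)

both = row + col            # (3,) and (4, 1) are both stretched to (4, 3)
print(both)

# Shapes (4, 3) and (4,) do not line up from the right, so this fails
try:
    start + np.array([0, 1, 2, 3])
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)
```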


# Speed Test: ndarrays vs lists

In [85]:
# Import Modules
from numpy import arange

# Define parameters
size    = 1000000

In [86]:
# Create an array with the values 0,1,2,...,size-1
nd_array = arange(size)
print( type(nd_array) )

<class 'numpy.ndarray'>


In [87]:
# Sum the elements of an array
%timeit nd_array.sum()

543 µs ± 3.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


In [88]:
# Create a list with the values 0,1,2,...,size-1
a_list = list(range(size))
print (type(a_list) )

<class 'list'>


In [89]:
# Sum the elements of a list
%timeit sum(a_list)

32.6 ms ± 331 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
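
`%timeit` is an IPython magic, so it only works inside IPython or Jupyter. In a plain Python script a similar comparison can be sketched with the standard `timeit` module. The smaller `size` and `number=20` below are arbitrary choices to keep the run short, and the measured times depend on the machine:

```python
import timeit
import numpy as np

size = 100000
nd_array = np.arange(size, dtype=np.int64)  # int64 avoids overflow in the sum
a_list = list(range(size))

# Time 20 repetitions of each summation
t_array = timeit.timeit(lambda: nd_array.sum(), number=20)
t_list = timeit.timeit(lambda: sum(a_list), number=20)

print("ndarray.sum():", t_array, "seconds")
print("sum(list):    ", t_list, "seconds")
```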


# Reading and Writing Data in Files

## Binary Format

In [90]:
x = np.array([ 23.23, 24.24] )

In [91]:
np.save('an_array', x)

In [92]:
np.load('an_array.npy')

array([23.23, 24.24])

## Text Format

In [93]:
np.savetxt('array.txt', X=x, delimiter=',')

In [94]:
!more array.txt

2.323000000000000043e+01
2.423999999999999844e+01


In [95]:
np.loadtxt('array.txt', delimiter=',')

array([23.23, 24.24])
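
In the text example above `x` is 1-dimensional, so `np.savetxt` writes one value per line and the delimiter never appears in the file. With a 2-dimensional array the delimiter separates the columns. A minimal sketch, writing to a temporary directory; the file name `table.csv` and the `fmt` choice are illustrative assumptions:

```python
import os
import tempfile
import numpy as np

table = np.array([[1.5, 2.5], [3.5, 4.5]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "table.csv")  # hypothetical file name
    # fmt controls how each number is written; "%.2f" keeps 2 decimals
    np.savetxt(path, table, delimiter=",", fmt="%.2f")
    with open(path) as fh:
        text = fh.read()
    restored = np.loadtxt(path, delimiter=",")

print(text)
print(restored)
```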

# Other Array Operations

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Dot Product on Matrices and Inner Product on Vectors:

</p>

In [96]:
# determine the dot product of two matrices
x2d = np.array([[1,1],[1,1]])
y2d = np.array([[2,2],[2,2]])

print(x2d.dot(y2d))
print()
print(np.dot(x2d, y2d))

[[4 4]
 [4 4]]

[[4 4]
 [4 4]]


In [97]:
# determine the inner product of two vectors
a1d = np.array([9 , 9 ])
b1d = np.array([10, 10])

print(a1d.dot(b1d))
print()
print(np.dot(a1d, b1d))

180

180


In [98]:
# dot product of a matrix and a vector
print(x2d.dot(a1d))
print()
print(np.dot(x2d, a1d))

[18 18]

[18 18]
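
On Python 3.5 and later the `@` operator performs the same matrix product for these 1-D and 2-D arrays, while `*` stays element-wise. A brief sketch reusing the arrays above:

```python
import numpy as np

x2d = np.array([[1, 1], [1, 1]])
y2d = np.array([[2, 2], [2, 2]])
a1d = np.array([9, 9])

mat_product = x2d @ y2d   # matrix product, same result as np.dot(x2d, y2d)
vec_product = x2d @ a1d   # matrix-vector product
elementwise = x2d * y2d   # element-wise product, not a matrix product

print(mat_product)
print(vec_product)  # [18 18]
print(elementwise)
```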


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Sum:
</p>

In [99]:
# sum elements in the array
ex1 = np.array([[11,12],[21,22]])

print(np.sum(ex1))          # add all members

66


In [100]:
print(np.sum(ex1, axis=0))  # columnwise sum

[32 34]


In [101]:
print(np.sum(ex1, axis=1))  # rowwise sum

[23 43]


<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Element-wise Functions: </p>

For example, let's compare the values of two arrays to get the element-wise maximum.

In [102]:
# random array
x = np.random.randn(8)
x

array([ 0.60723309, -1.35490103, -0.1950453 , -1.86437296,  0.09986453,
       -0.13116881,  0.12942507, -2.16345661])

In [103]:
# another random array
y = np.random.randn(8)
y

array([ 1.42467667,  1.10539663,  1.41129685,  0.4645668 , -0.86487682,
       -0.6284248 , -1.34944244, -0.09095865])

In [104]:
# returns element wise maximum between two arrays

np.maximum(x, y)

array([ 1.42467667,  1.10539663,  1.41129685,  0.4645668 ,  0.09986453,
       -0.13116881,  0.12942507, -0.09095865])

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Reshaping array:
</p>

In [105]:
# grab values from 0 through 19 in an array
arr = np.arange(20)
print(arr)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]


In [106]:
# reshape to be a 4 x 5 matrix
arr.reshape(4,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
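
`reshape` returns a new array object and leaves `arr` with its original shape. One dimension may be given as `-1`, in which case NumPy infers it from the total number of elements. A short sketch with illustrative names:

```python
import numpy as np

arr = np.arange(20)

m = arr.reshape(4, -1)   # -1 lets NumPy infer the 5
flat = m.reshape(-1)     # back to one dimension

print(m.shape)     # (4, 5)
print(arr.shape)   # (20,): arr itself is unchanged
print(flat.shape)  # (20,)
```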

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Transpose:

</p>

In [107]:
# transpose
ex1 = np.array([[11,12],[21,22]])

ex1.T

array([[11, 21],
       [12, 22]])

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Indexing using where():</p>

In [108]:
x_1 = np.array([1,2,3,4,5])

y_1 = np.array([11,22,33,44,55])

filter = np.array([True, False, True, False, True])

In [124]:
out = np.where(filter, x_1, y_1)
print(out)

[ 1 22  3 44  5]


In [110]:
mat = np.random.rand(5,5)
mat

array([[6.14545039e-01, 3.82405409e-01, 2.58493275e-01, 8.30656115e-01,
        6.48008676e-01],
       [8.50867250e-02, 1.05606286e-01, 3.98169225e-01, 7.26340748e-01,
        8.68952474e-01],
       [2.55671541e-04, 4.91524631e-01, 3.58913666e-01, 1.57380295e-01,
        8.30824609e-01],
       [7.91129371e-02, 5.44668518e-01, 5.37503879e-01, 9.58025910e-01,
        6.34321345e-01],
       [6.44521665e-01, 3.86899465e-01, 4.88299779e-01, 8.38419202e-01,
        4.60844525e-01]])

In [None]:
np.where( mat > 0.5, 1000, -1)
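
Since `mat` above is random, here is the same call on a small fixed matrix, together with the one-argument form of `np.where`, which returns the indices of the True entries. The values are chosen only for illustration:

```python
import numpy as np

mat = np.array([[0.2, 0.7], [0.9, 0.1]])

replaced = np.where(mat > 0.5, 1000, -1)
print(replaced)
# [[  -1 1000]
#  [1000   -1]]

rows, cols = np.where(mat > 0.5)  # indices of the entries greater than 0.5
print(rows, cols)  # [0 1] [1 0]
```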

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

"any" or "all" conditionals:</p>

In [111]:
arr_bools = np.array([ True, False, True, True, False ])

In [125]:
print(arr_bools.any())

True


In [113]:
arr_bools.all()

False

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Random Number Generation:
</p>

In [114]:
Y = np.random.normal(size = (1,5))[0]
print(Y)

[-1.9622262   0.69792492  0.4825188   0.63138604 -1.83357443]


In [115]:
Z = np.random.randint(low=2,high=50,size=4)
print(Z)

[18 39 33 15]


In [116]:
np.random.permutation(Z) #return a new ordering of elements in Z

array([18, 33, 15, 39])

In [117]:
np.random.uniform(size=4) #uniform distribution

array([0.0113996 , 0.23363676, 0.13656596, 0.17155169])

In [118]:
np.random.normal(size=4) #normal distribution

array([-0.28284029,  3.34292783,  2.08901371, -0.6857033 ])
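
The outputs above change on every run. When reproducible numbers are needed, a seeded generator can be used. The sketch below uses `np.random.RandomState`; the seed 42 is an arbitrary choice:

```python
import numpy as np

rng_a = np.random.RandomState(42)  # 42 is an arbitrary seed
rng_b = np.random.RandomState(42)

draw_a = rng_a.normal(size=4)
draw_b = rng_b.normal(size=4)

# Same seed, same sequence of draws
print(np.array_equal(draw_a, draw_b))  # True
```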

<p style="font-family: Arial; font-size:1.75em;color:#2462C0; font-style:bold"><br>

Merging data sets:
</p>

In [119]:
K = np.random.randint(low=2,high=50,size=(2,2))
print(K)

print()
M = np.random.randint(low=2,high=50,size=(2,2))
print(M)

[[18 27]
 [37 45]]

[[35 22]
 [24 13]]


In [120]:
np.vstack((K,M))

array([[18, 27],
       [37, 45],
       [35, 22],
       [24, 13]])

In [121]:
np.hstack((K,M))

array([[18, 27, 35, 22],
       [37, 45, 24, 13]])

In [122]:
np.concatenate([K, M], axis = 0)

array([[18, 27],
       [37, 45],
       [35, 22],
       [24, 13]])

In [123]:
np.concatenate([K, M.T], axis = 1)

array([[18, 27, 35, 24],
       [37, 45, 22, 13]])