# Multiprocessing

**Pros**

- Separate memory space
- Code is usually straightforward
- Takes advantage of multiple CPUs & cores
- Avoids GIL limitations in CPython
- Removes most of the need for synchronization primitives unless you use shared memory (processes communicate through IPC instead)
- Child processes are interruptible/killable
- Python's `multiprocessing` module includes useful abstractions with an interface much like `threading.Thread`
- A must with CPython for CPU-bound processing

**Cons**

- IPC is a little more complicated and has more overhead (communication model vs. shared memory/objects)
- Larger memory footprint

Source: https://stackoverflow.com/questions/3044580/multiprocessing-vs-threading-python
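The "separate memory space" point can be illustrated with a minimal sketch. The names `counter`, `bump`, and `run_demo` are hypothetical, and the code assumes a POSIX system where the `fork` start method is available: the child increments its own copy of a module-level variable, and the parent only learns the value because the child sends it back through a queue.

```python
import multiprocessing as mp

counter = {"value": 0}  # module-level state; each child works on its own copy


def bump(queue):
    # Modifies the child's copy of `counter`, never the parent's
    counter["value"] += 1
    queue.put(counter["value"])  # report the value back through IPC


def run_demo():
    ctx = mp.get_context("fork")  # assumption: POSIX system with fork available
    queue = ctx.Queue()
    child = ctx.Process(target=bump, args=(queue,))
    child.start()
    seen_by_child = queue.get()  # read before join() so the child can flush its data
    child.join()
    return seen_by_child, counter["value"]


child_value, parent_value = run_demo()
print("child saw:", child_value, "| parent still has:", parent_value)
```

The parent's `counter` is unchanged after the child exits, which is why results have to travel through a `Queue`, a `Pipe`, or explicit shared memory.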

In [11]:
import multiprocessing
import numpy as np
multiprocessing.cpu_count()

8

`cpu_count()` reports logical processors, not physical cores. This machine has 4 physical cores but 8 logical processors, because the CPU uses hyper-threading.
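The logical count is not always the number of CPUs a process may actually use. The sketch below uses a hypothetical helper, `usable_cpus`, that prefers the affinity mask where the platform exposes it. The standard library has no call for the physical core count; a third-party option such as `psutil.cpu_count(logical=False)` would be needed for that.

```python
import os


def usable_cpus():
    """Return the number of CPUs this process may run on (at least 1)."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: respects CPU affinity masks (taskset, some container setups)
        return len(os.sched_getaffinity(0))
    # Fallback: logical CPU count; os.cpu_count() may return None
    return os.cpu_count() or 1


print("logical CPUs:", os.cpu_count())
print("usable CPUs: ", usable_cpus())
```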

In [2]:
%%time
# Sequential code
from time import sleep

resultado = []

for i in range(8):
    sleep(1)
    resultado.append(i+1)

CPU times: user 921 µs, sys: 1.36 ms, total: 2.28 ms
Wall time: 8.02 s


In [3]:
%%time

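from concurrent.futures import ProcessPoolExecutor
from time import sleep

def incrementar(x):
    sleep(1)
    return x + 1

# By default the pool starts one worker per logical CPU, so the 8 sleeps overlap
with ProcessPoolExecutor() as e:
    resultado = list(e.map(incrementar, range(8)))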

CPU times: user 20.9 ms, sys: 27.5 ms, total: 48.4 ms
Wall time: 1.05 s
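The drop from about 8 s to about 1 s comes from running the eight one-second tasks in eight workers at once. A rough sketch of how wall time depends on the pool size follows; `timed_map` is a hypothetical helper, the sleep is shortened to 0.1 s, and the `fork` start method is assumed to be available.

```python
import multiprocessing as mp
import time
from concurrent.futures import ProcessPoolExecutor


def incrementar(x):
    time.sleep(0.1)  # stand-in for a slow task
    return x + 1


def timed_map(n_workers, n_tasks=8):
    """Run the tasks on n_workers processes; return (results, elapsed seconds)."""
    ctx = mp.get_context("fork")  # assumption: POSIX system with fork available
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=n_workers, mp_context=ctx) as pool:
        resultado = list(pool.map(incrementar, range(n_tasks)))
    return resultado, time.perf_counter() - start


for n in (1, 2, 8):
    resultado, elapsed = timed_map(n)
    print(f"{n} workers: {elapsed:.2f} s")
```

With one worker the tasks run back to back; with eight workers they overlap, so the elapsed time should approach the duration of a single task plus process start-up cost.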


### Example 1: Communication between processes through a pipe
----


In [5]:
import os, random
from multiprocessing import Process, Pipe

In [6]:
def creador_valores(conn):
    # Send one random integer through the pipe, then close this end
    value = random.randint(1, 10)
    conn.send(value)
    print('Value [%d] sent by process PID [%d]' % (value, os.getpid()))
    conn.close()

In [7]:
def consumidor_valores(conn):
    # recv() blocks until the producer sends a value
    print('Value [%d] received by process PID [%d]' % (conn.recv(), os.getpid()))

In [8]:
# Pipe() returns the two ends of a duplex channel, one for each process
producer_conn, consumer_conn = Pipe()
consumer = Process(target=consumidor_valores, args=(consumer_conn,))
producer = Process(target=creador_valores, args=(producer_conn,))

In [9]:
consumer.start()
producer.start()
consumer.join()
producer.join()

Value [6] sent by process PID [13258]
Value [6] received by process PID [13257]
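The same channel can carry traffic in both directions, because `Pipe()` is duplex by default. As a minimal sketch (the names `worker` and `sum_in_child` are hypothetical, and the `fork` start method is assumed), the parent streams several values to a child, marks the end with a `None` sentinel, and reads the total back on the same connection.

```python
import multiprocessing as mp


def worker(conn):
    total = 0
    while True:
        item = conn.recv()   # blocks until the parent sends something
        if item is None:     # sentinel: no more values
            break
        total += item
    conn.send(total)         # duplex pipe: reply on the same connection
    conn.close()


def sum_in_child(values):
    ctx = mp.get_context("fork")  # assumption: POSIX system with fork available
    parent_conn, child_conn = ctx.Pipe()
    child = ctx.Process(target=worker, args=(child_conn,))
    child.start()
    for v in values:
        parent_conn.send(v)
    parent_conn.send(None)        # tell the child to stop reading
    result = parent_conn.recv()
    child.join()
    return result


print(sum_in_child([1, 2, 3, 4]))  # 10
```

A sentinel keeps the protocol explicit: the child does not have to guess how many values to expect.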


### Exercise: Write a parallel function that creates 100 CSV files using processes
----

Serial version that solves the problem:

In [13]:
%%time
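import os
os.makedirs('data_serial', exist_ok=True)  # np.savetxt does not create the directory

np.random.seed(123)
x = np.random.poisson(100, (1000, 1000))
# The same matrix is written to each of the 100 files
for i_ in range(0, 100):
    np.savetxt('data_serial/x%06d.csv' % i_, x, delimiter=',', fmt='%d')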

CPU times: user 14.5 s, sys: 561 ms, total: 15.1 s
Wall time: 15.1 s


### Parallel function that solves the problem

In [14]:
%%time

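import os

def crear_csv(i):
    # Seed each task from its index: forked workers inherit the parent's
    # global RNG state, so a single np.random.seed() call in the parent
    # may produce identical matrices in different workers
    rng = np.random.default_rng(123 + i)
    x = rng.poisson(100, (1000, 1000))
    np.savetxt('data_paralelo/x%06d.csv' % i, x, delimiter=',', fmt='%d')

os.makedirs('data_paralelo', exist_ok=True)  # np.savetxt does not create the directory
with ProcessPoolExecutor() as pool:
    # Consume the iterator so an exception raised in a worker is re-raised here
    list(pool.map(crear_csv, range(100)))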

CPU times: user 41.2 ms, sys: 34.5 ms, total: 75.7 ms
Wall time: 1.35 s
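A very short wall time is only meaningful if the tasks actually succeeded. `Executor.map` returns a lazy iterator, and an exception raised in a worker surfaces only when the corresponding result is retrieved. The sketch below uses hypothetical names (`may_fail`, `run`) and assumes the `fork` start method; it shows the same failing task passing silently when the results are discarded and raising when they are consumed.

```python
import multiprocessing as mp
from concurrent.futures import ProcessPoolExecutor


def may_fail(i):
    if i == 3:
        raise ValueError(f"task {i} failed")
    return i


def run(consume):
    ctx = mp.get_context("fork")  # assumption: POSIX system with fork available
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as pool:
        results = pool.map(may_fail, range(5))
        if consume:
            return list(results)  # re-raises the worker's ValueError here
    return None                   # results discarded: the failure goes unnoticed


print(run(consume=False))  # None
try:
    run(consume=True)
except ValueError as exc:
    print("caught:", exc)
```

Wrapping the call in `list(...)` or iterating over the results is a cheap way to make sure a missing output directory or similar error does not go unnoticed.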


### References
---

https://github.com/rsnemmen/parallel-python-tutorial/blob/master/Parallel%20Computing%20with%20Python%20public.ipynb

https://github.com/rsnemmen/parallel-python-tutorial

https://github.com/dask/dask-tutorial

https://docs.python.org/3/library/concurrent.futures.html