## Multiprocessing 

In [1]:
import multiprocessing as mp
print("number of cpu :", mp.cpu_count())

number of cpu : 8


In [2]:
def printing(data = 'asia'):
    print(data)

### Creating a single process

1. Call `start()` to start the process.
2. Call `join()` to wait for the process to finish. Without it, a process may be left running and have to be killed from the task manager.

In [3]:
proc = mp.Process(target = printing)
proc.start()
proc.join()

asia


### Creating multiple processes, each printing a different name

1. The following implementation does not run in parallel: each process is joined right after it is started, so every loop iteration waits for its process to end before starting the next one.

In [4]:
names = ['asia', 'africa', 'india', 'antarctica']
for name in names:
    proc = mp.Process(target = printing, args=(name,))
    proc.start()
    proc.join()

asia
africa
india
antarctica


2. The following implementation can run in parallel, because the processes are joined only after all of them have been started.

In [5]:
names = ['asia', 'africa', 'india', 'antarctica']
prolist = []
for name in names:
    proc = mp.Process(target = printing, args=(name,))
    proc.start()
    prolist.append(proc)
    
for proc in prolist:
    proc.join()

asia
africa
india
antarctica


## For better demonstration
1. The functions below iterate several times, which makes the difference between the two approaches described above easier to see.

In [6]:
def printing(fixed = 'my', data = 5):
    for i in range(data): print(fixed, i)

In [7]:
for i in range(4):
    proc = mp.Process(target = printing, args = ('data'+str(i),5,))
    proc.start()
    proc.join()


data0 0
data0 1
data0 2
data0 3
data0 4
data1 0
data1 1
data1 2
data1 3
data1 4
data2 0
data2 1
data2 2
data2 3
data2 4
data3 0
data3 1
data3 2
data3 3
data3 4


2. Below is the correct implementation, in which the processes run concurrently.

In [8]:
prolist = []
for i in range(4):
    proc = mp.Process(target = printing, args = ('data'+str(i),5,))
    prolist.append(proc)
    proc.start()
    
for proc in prolist:
    proc.join()

data0 0
data0 1
data0 2
data1 0
data0 3
data1 1
data0 4
data2 0
data1 2
data2 1
data3 0
data2 2
data3 1
data1 3
data2 3
data1 4
data3 2
data2 4
data3 3
data3 4
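
The interleaved output above suggests concurrency, and timing makes the difference measurable. The following is a minimal sketch, not part of the original notebook: `nap`, `run_sequential` and `run_concurrent` are hypothetical helpers, and the `fork` start method is assumed to be available (a Unix system, as the `ForkProcess` names later in this notebook suggest).

```python
import multiprocessing as mp
import time

# Assumption: the "fork" start method exists on this platform (Unix).
ctx = mp.get_context("fork")


def nap(seconds):
    """Hypothetical stand-in for real work: just sleep."""
    time.sleep(seconds)


def run_sequential(n, seconds):
    """Start and immediately join each process, so they run one at a time."""
    begin = time.perf_counter()
    for _ in range(n):
        proc = ctx.Process(target=nap, args=(seconds,))
        proc.start()
        proc.join()
    return time.perf_counter() - begin


def run_concurrent(n, seconds):
    """Start every process first, then join them all."""
    begin = time.perf_counter()
    procs = [ctx.Process(target=nap, args=(seconds,)) for _ in range(n)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return time.perf_counter() - begin


if __name__ == "__main__":
    print("sequential:", run_sequential(4, 0.3))
    print("concurrent:", run_concurrent(4, 0.3))
```

The sequential version should take roughly `n * seconds`, while the concurrent version should stay close to a single `seconds` plus process start-up overhead.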


## Queue in Multiprocessing

-> The multiprocessing library provides its own `Queue`. By default, `get()` blocks until an item has been `put()` (the standard `queue.Queue.get()` blocks in the same way).

To get an immediate "queue is empty" signal instead of waiting, call `get_nowait()`, which raises `queue.Empty` when nothing is available.
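
Between blocking forever and not waiting at all, `get()` also accepts a `timeout`. The sketch below is an illustration only: `drain` is a hypothetical helper that collects items until none arrives within the timeout.

```python
import multiprocessing as mp
import queue


def drain(q, timeout=0.5):
    """Collect items until no item arrives within `timeout` seconds."""
    items = []
    while True:
        try:
            # Blocks for at most `timeout` seconds, then raises queue.Empty.
            items.append(q.get(timeout=timeout))
        except queue.Empty:
            return items


if __name__ == "__main__":
    q = mp.Queue()
    for i in range(5):
        q.put(i)
    print(drain(q))
```

A short timeout is more forgiving than `get_nowait()` directly after `put()`, because items put on an `mp.Queue` are handed over by a background thread and may not be visible immediately.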

In [9]:
import queue as Q
queue = mp.Queue()
for i in range(5):
    queue.put(i)
while not queue.empty():  # note: empty() is only approximate for an mp.Queue
    print(queue.get())
while True:
    try:
        print(queue.get_nowait())
    except Q.Empty:  # raised when the queue has nothing left
        break

### Lock

--> If multiple processes use the same section of code, a lock ensures that only the process that has acquired the lock can run that section.

--> The remaining processes can enter it only after the lock has been released.

The two methods are `acquire()` and `release()`.
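
As an illustration of `acquire()` and `release()`, here is a minimal sketch in which several processes increment one shared counter. The helper names `add_many` and `locked_count` are hypothetical, and the `fork` start method is assumed to be available (Unix). The `with lock:` form acquires the lock on entry and releases it on exit.

```python
import multiprocessing as mp

# Assumption: the "fork" start method exists on this platform (Unix).
ctx = mp.get_context("fork")


def add_many(counter, lock, times):
    """Increment the shared counter `times` times, one locked step at a time."""
    for _ in range(times):
        with lock:  # same as lock.acquire() ... lock.release()
            counter.value += 1


def locked_count(n_procs=4, times=1000):
    # Raw shared integer without a built-in lock; the explicit lock guards it.
    counter = ctx.Value("i", 0, lock=False)
    lock = ctx.Lock()
    procs = [
        ctx.Process(target=add_many, args=(counter, lock, times))
        for _ in range(n_procs)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return counter.value


if __name__ == "__main__":
    print(locked_count())
```

Without the lock, the read-modify-write in `counter.value += 1` can interleave between processes and some increments may be lost.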

In [10]:
try:
    print("nation")
except KeyboardInterrupt:
    print("Interrupt")
else:
    print(" Else Statement Executed ")   # the else block runs only when
    # the try block raised no exception

nation
 Else Statement Executed 


##### Each task is handled by a different child process when the `time.sleep` call in `task_scheduler` is included: while one process sleeps, the system moves on to another process, which picks up the next task

In [23]:
import queue as Q
import time
def task_scheduler(done_process, to_be_done_process):
    while True:
        # runs until the task queue is empty
        try:
            task = to_be_done_process.get_nowait()
        except Q.Empty:
            break
        else:
            # executed only if no exception was raised
            print(task, mp.current_process())
            done_process.put(task)
            # while this process sleeps, another process can take the next task
            time.sleep(0.5)

In [22]:
done_process = mp.Queue()
to_be_done_process = mp.Queue()

for i in range(10):
    to_be_done_process.put(i)
    
processes = []
for i in range(10):
    proc = mp.Process(target= task_scheduler, args=(done_process, to_be_done_process))
    processes.append(proc)
    proc.start()
    
for proc in processes:
    proc.join()

while True:
    try:
        print(" executed process {}".format(done_process.get_nowait()))
    except Q.Empty:
        break

0 <Process(Process-58, started)>
1 <Process(Process-59, started)>
2 <Process(Process-60, started)>
3 <Process(Process-61, started)>
4 <Process(Process-62, started)>
5 <Process(Process-63, started)>
6 <Process(Process-64, started)>
7 <Process(Process-65, started)>
8 <Process(Process-66, started)>
9 <Process(Process-67, started)>
 executed process 0
 executed process 1
 executed process 2
 executed process 3
 executed process 4
 executed process 5
 executed process 6
 executed process 7
 executed process 8
 executed process 9


## Pooling Process
A pool divides the work among a fixed number of worker processes --> load balancing
* The pool of parallel processes keeps accepting arguments until the list is exhausted.

In [207]:
work = [['A',2],['B',3],['C',1],['D',2],['F',1],['G',1]]

def workdone(working_data):
    print('working for {} for timming {}'.format(working_data[0],working_data[1]))
    print(mp.current_process())
    time.sleep(working_data[1])
    print(' Process finished {} with current id {}'.format(working_data[0], mp.current_process()))
    return mp.current_process().pid

In [208]:
def pool_handler():
    # 3 worker processes execute tasks in parallel
    with mp.Pool(3) as p:
        h = p.map(workdone, work)  # map the workdone function over the items of work
    print(h)
pool_handler()

working for B for timming 3
working for A for timming 2
working for C for timming 1
<ForkProcess(ForkPoolWorker-1174, started daemon)>
<ForkProcess(ForkPoolWorker-1176, started daemon)>
<ForkProcess(ForkPoolWorker-1175, started daemon)>
 Process finished C with current id <ForkProcess(ForkPoolWorker-1176, started daemon)>
working for D for timming 2
<ForkProcess(ForkPoolWorker-1176, started daemon)>
 Process finished A with current id <ForkProcess(ForkPoolWorker-1174, started daemon)>
working for F for timming 1
<ForkProcess(ForkPoolWorker-1174, started daemon)>
 Process finished B with current id <ForkProcess(ForkPoolWorker-1175, started daemon)>
working for G for timming 1
<ForkProcess(ForkPoolWorker-1175, started daemon)>
 Process finished F with current id <ForkProcess(ForkPoolWorker-1174, started daemon)>
 Process finished D with current id <ForkProcess(ForkPoolWorker-1176, started daemon)>
 Process finished G with current id <ForkProcess(ForkPoolWorker-1175, started daemon)>
[265


### Testing whether the same approach works with gym

In [192]:
import gym
gym.logger.set_level(40)
from collections import namedtuple  # required by worker() below
import multiprocessing as mp

data_storage = mp.Queue()
# exp = namedtuple('Experience',['process_id','next_state','reward'])


In [193]:
def worker(data_af):
    data = []
    exp = namedtuple('Experience',['process_id'])
    print("Process started for pid ",mp.current_process().pid)
    env = gym.make('CartPole-v0')
    state = env.reset()
    while True:
        action = env.action_space.sample()
        next_state, reward, done, _ = env.step(action)
#         data.append(exp(mp.current_process(), next_state, reward))
        data.append([mp.current_process().pid,next_state,reward])
        state = next_state
        if done:
            break
    data_af.put(data) 

In [194]:
processes = []
nos = 20
for i in range(nos):
    proc = mp.Process(target=worker, args=(data_storage,))
    processes.append(proc)
    proc.start()
    
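# Take one result per worker off the queue before joining: a child that has
# put data on a queue may not exit until that data has been consumed.
for _ in range(nos):
    g = data_storage.get()
    print(len(g))
    print()
    print(g)
    print()

for proc in processes:
    proc.join()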


Process started for pid  26255
Process started for pid  26258
Process started for pid  26259
Process started for pid  26264
Process started for pid  26267
Process started for pid  26270
Process started for pid  26277
Process started for pid  26273
Process started for pid  26283
Process started for pid  26286
Process started for pid  26289
Process started for pid  26298
Process started for pid  26292
Process started for pid  26294
Process started for pid  26306
Process started for pid  26307
Process started for pid  26316
Process started for pid  26311
Process started for pid  26314
Process started for pid  26324
8

[[26255, array([ 0.02186186, -0.23642184,  0.04666768,  0.34097024]), 1.0], [26255, array([ 0.01713342, -0.43217564,  0.05348708,  0.64799657]), 1.0], [26255, array([ 0.00848991, -0.62800036,  0.06644701,  0.95703128]), 1.0], [26255, array([-0.0040701 , -0.82394991,  0.08558764,  1.26982803]), 1.0], [26255, array([-0.0205491 , -1.020054  ,  0.1109842 ,  1.58803949]), 1.0], [

## Rendering multiple environments using multiple processes

In [2]:
import gym
gym.logger.set_level(40)
# from collections import namedtuple
import multiprocessing as mp
# exp = namedtuple('Experience',['process_id','next_state','reward'])

In [13]:
data_storage = mp.Queue()
def worker(data_af):
    data = []
    print("Process started for pid ",mp.current_process().pid)
    env = gym.make('CartPole-v0')
    for i in range(10):
        state = env.reset()
        while True:
            env.render()
            action = env.action_space.sample()
            next_state, reward, done, _ = env.step(action)
    #         data.append(exp(mp.current_process(), next_state, reward))
            data.append([mp.current_process().pid,next_state,reward])
            state = next_state
            if done:
                break
        print('\n {} length \n'.format(len(data)))
    env.close()
    data_af.put(data)

In [14]:
processes = []
nos = 1
for i in range(nos):
    proc = mp.Process(target=worker, args=(data_storage,))
    processes.append(proc)
    proc.start()
    
while not data_storage.empty():
    g  = data_storage.get_nowait()
    print(len(g))
    print()
    print(g)
    print()
    
for proc in processes:
    proc.join()

Process started for pid  30468

 13 length 


 30 length 


 41 length 


 53 length 


 79 length 


 130 length 


 172 length 


 198 length 


 213 length 


 229 length 



* No data is printed by the cell above: the parent checks the queue right after starting the worker, when the queue is still empty, because the worker only puts its data once all episodes have finished.
* A queue also has limited buffering (it is backed by a pipe), and a process that has put data on a queue may not exit until that data has been read. The data should therefore be removed with `get()` before calling `join()`.
* Global data is not shared between the parent and its children: a modification made by a child process is not visible to the parent process.
* To exchange data, we can use a queue, a pipe, or a manager.
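
The points above can be sketched as a "get before join" pattern. This is a minimal illustration under stated assumptions, not the notebook's gym code: `produce` and `collect` are hypothetical helpers, the payload is a plain list instead of environment transitions, and the `fork` start method is assumed to be available (Unix).

```python
import multiprocessing as mp
import os

# Assumption: the "fork" start method exists on this platform (Unix).
ctx = mp.get_context("fork")


def produce(out_queue, n_items):
    """Hypothetical worker: build a result list and put it on the queue once."""
    pid = os.getpid()
    out_queue.put([[pid, i] for i in range(n_items)])


def collect(n_workers=3, n_items=50_000):
    out_queue = ctx.Queue()
    procs = [
        ctx.Process(target=produce, args=(out_queue, n_items))
        for _ in range(n_workers)
    ]
    for proc in procs:
        proc.start()
    # One blocking get() per worker: waits until each result has arrived.
    results = [out_queue.get() for _ in procs]
    # Joining only after the queue has been drained avoids blocking on a
    # worker that is still flushing its data.
    for proc in procs:
        proc.join()
    return results


if __name__ == "__main__":
    for result in collect():
        print(result[0][0], len(result))
```

Because `get()` blocks, the parent does not need to poll `empty()`, and the results arrive in whatever order the workers finish.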

## Communication using Manager 

* In `multiprocessing`, calling `mp.Manager()` starts a separate server process, the manager process.
* The parent and the child processes work with the manager's objects (for example `manager.list()`) through proxies that communicate with that process.
* The data stored and maintained by the manager is accessible to all the other processes.
* A manager is not the most efficient way to communicate, since every access goes through the manager process.
* For plain message passing, `Queue` and `Pipe` are usually the more efficient choice.

In [128]:
## testing manager
import multiprocessing as mp

def printing(data):
    print(' Printing {} by child process {}'.format(data,mp.current_process().pid))
    
def insert_record(record, records):
    records.append(record)
    
with mp.Manager() as manager:
    record = ['general',11]
    records = manager.list([('Sam', 10), ('Adam', 9), ('Kevin',9)]) 
    p1 = mp.Process(target = insert_record, args=(record, records,))
    p2 = mp.Process(target = printing, args=(record,))
    p1.start()
    p1.join()
    p2.start()
    p2.join()
    print(records)
    

 Printing ['general', 11] by child process 5409
[('Sam', 10), ('Adam', 9), ('Kevin', 9), ['general', 11]]


### Creating multiple instances of a gym environment in parallel, storing the data in the manager process, and making it available to every process

In [1]:
import multiprocessing as mp
import gym  # used by worker() below
from collections import namedtuple

def worker(data_af):
    print("Process started for pid ",mp.current_process().pid)
    env = gym.make('CartPole-v0')
    for i in range(10):
        state = env.reset()
        while True:
            env.render()
            action = env.action_space.sample()
            next_state, reward, done, _ = env.step(action)
    #         data.append(exp(mp.current_process(), next_state, reward))
            data_af.append([mp.current_process().pid,state,next_state,reward,action])
            state = next_state
            if done:
                break
    env.close()

In [3]:
nos = 10
process = []
data = [] 
exp = namedtuple('Experience',['pid','state','next_state','reward','action'])
with mp.Manager() as manager:
    records = manager.list([]) 
    for i in range(nos):
        proc = mp.Process(target = worker, args=(records,))
        process.append(proc)
        proc.start()
    
    for proc in process:
        proc.join()

    print(len(records))
    data = [exp(*x) for x in records]  # unpack each record into an Experience

Process started for pid  4544
Process started for pid  4546
Process started for pid  4558
Process started for pid  4550
Process started for pid  4554
Process started for pid  4566
Process started for pid  4562
Process started for pid  4578
Process started for pid  4570
Process started for pid  4574
2076


## Pipes 

In [4]:
import multiprocessing as mp

In [22]:
import multiprocessing as mp
from collections import namedtuple
import gym
gym.logger.set_level(40)

for_nos = 10

def worker(env, child):
    # for_nos is only read here, so no global declaration is needed
    print("Process started for pid ", mp.current_process().pid)
#     exp = namedtuple('experience',['state','next_state','reward','action','done'])
    for i in range(for_nos):
        state = env.reset()
        while True:
            action = env.action_space.sample()
            next_state, reward, done, _ = env.step(action)
            child.send([state, next_state, reward, action, done])
            state = next_state
            if done:
                break
    env.close()  # close the environment once, after all episodes

In [23]:
parent,child = mp.Pipe()
env = gym.make('CartPole-v0')
proc = mp.Process(target = worker, args = (env,child,))
proc.start()
for i in range(for_nos):  # one inner loop per episode played by the worker
    while True:
        data = parent.recv()
        print(data)
        if data[-1]:  # the last field is the done flag
            break
proc.join()

Process started for pid  5980
[array([-0.04655143,  0.03988379, -0.0288813 ,  0.00829133]), array([-0.04575376,  0.23540778, -0.02871547, -0.29336226]), 1.0, 1, False]
[array([-0.04575376,  0.23540778, -0.02871547, -0.29336226]), array([-0.0410456 ,  0.43092712, -0.03458272, -0.59496156]), 1.0, 1, False]
[array([-0.0410456 ,  0.43092712, -0.03458272, -0.59496156]), array([-0.03242706,  0.6265156 , -0.04648195, -0.89833414]), 1.0, 1, False]
[array([-0.03242706,  0.6265156 , -0.04648195, -0.89833414]), array([-0.01989675,  0.82223571, -0.06444863, -1.20525798]), 1.0, 1, False]
[array([-0.01989675,  0.82223571, -0.06444863, -1.20525798]), array([-0.00345203,  0.62800316, -0.08855379, -0.93344882]), 1.0, 0, False]
[array([-0.00345203,  0.62800316, -0.08855379, -0.93344882]), array([ 0.00910803,  0.82420101, -0.10722277, -1.25259324]), 1.0, 1, False]
[array([ 0.00910803,  0.82420101, -0.10722277, -1.25259324]), array([ 0.02559205,  0.63060346, -0.13227463, -0.99532774]), 1.0, 0, False]
[arr

### Running multiple processes and collecting their data in one place

* The following code first aggregates the data of a single iteration from multiple processes and then inserts it into a single list.
* The same pattern can be used to collect experience for an A2C reinforcement learning policy network.

In [62]:
import multiprocessing as mp
from collections import namedtuple
import gym
gym.logger.set_level(40)

for_nos = 10

def worker(env, child):
    # for_nos is only read here, so no global declaration is needed
    print("Process started for pid ", mp.current_process().pid)
    for i in range(for_nos):
        state = env.reset()
        while True:
            action = env.action_space.sample()
            next_state, reward, done, _ = env.step(action)
            child.send([state, next_state, reward, action, done])
            state = next_state
            if done:
                break
    env.close()  # close the environment once, after all episodes

In [65]:
nos = 50
process = []
parent_collection = []
main_data = [] 
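for i in range(nos):
    parenti, child = mp.Pipe()
    env = gym.make('CartPole-v0')
    proc = mp.Process(target=worker, args=(env, child))
    proc.daemon = True
    proc.start()
    # The parent does not use this end; closing its copy lets recv()
    # raise EOFError once the worker has exited.
    child.close()
    process.append(proc)
    parent_collection.append(parenti)

# Each row of main_data holds one transition from every worker.
# The loop ends as soon as any worker has finished and its pipe reports EOF.
while True:
    try:
        main_data.append([conn.recv() for conn in parent_collection])
    except EOFError:
        break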


Process started for pid  7402
Process started for pid  7405
Process started for pid  7408
Process started for pid  7411
Process started for pid  7414
Process started for pid  7420
Process started for pid  7425
Process started for pid  7428
Process started for pid  7431
Process started for pid  7418
Process started for pid  7433
Process started for pid  7438
Process started for pid  7441
Process started for pid  7417
Process started for pid  7444
Process started for pid  7445
Process started for pid  7450
Process started for pid  7453
Process started for pid  7456
Process started for pid  7459
Process started for pid  7462
Process started for pid  7465
Process started for pid  7468
Process started for pid  7471
Process started for pid  7474
Process started for pid  7477
Process started for pid  7480
Process started for pid  7483
Process started for pid  7486
Process started for pid  7489
Process started for pid  7495
Process started for pid  7496
Process started for pid  7501
Process st

In [67]:
len(main_data[0])

50
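
The collection loop above stops on `EOFError`, which `recv()` raises only when nothing is left to read and the other end of the pipe has been closed everywhere. The following minimal sketch isolates that behaviour with plain integers instead of gym transitions. `send_numbers` and `receive_all` are hypothetical helpers, and the `fork` start method is assumed to be available (Unix).

```python
import multiprocessing as mp

# Assumption: the "fork" start method exists on this platform (Unix).
ctx = mp.get_context("fork")


def send_numbers(conn, count):
    """Hypothetical worker: send a few integers, then close its end."""
    for i in range(count):
        conn.send(i)
    conn.close()


def receive_all(count=5):
    parent_end, child_end = ctx.Pipe()
    proc = ctx.Process(target=send_numbers, args=(child_end, count))
    proc.start()
    # Close the parent's copy of the child end; while it stays open,
    # recv() would wait forever instead of raising EOFError.
    child_end.close()
    received = []
    while True:
        try:
            received.append(parent_end.recv())
        except EOFError:  # the worker closed its end and nothing is left
            break
    proc.join()
    return received


if __name__ == "__main__":
    print(receive_all())
```

An explicit sentinel value sent by the worker is an alternative way to signal the end of the stream when closing the connection is not convenient.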