In [1]:
#!pip install --upgrade celery

In [2]:
!sudo rabbitmqctl add_user myguest myguestpwd
!sudo rabbitmqctl set_permissions -p / myguest ".*" ".*" ".*"

Creating user "myguest" ...
Error: user_already_exists: myguest
Setting permissions for user "myguest" in vhost "/" ...


In [3]:
!sudo rabbitmqctl list_users

Listing users ...
guest	[administrator]
myguest	[]


This code will run on the **server** machine. It will **ask its worker machines** to complete some sorting tasks and send the results back to the server.
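The `mergesort_worker` module imported in the next cell is not shown in this notebook. As a rough idea of what it could contain, here is a minimal pure-Python sketch of `sort` and `merge`. The function bodies are an assumption, and the Celery wiring is given only as comments with a placeholder broker host; the real module may differ.

```python
# Hypothetical sketch of mergesort_worker.py (the real module is not shown).
# In the real module these functions would presumably be Celery tasks, e.g.:
#   from celery import Celery
#   app = Celery('mergesort_worker',
#                broker='amqp://myguest:myguestpwd@<server-ip>//',  # placeholder host
#                backend='rpc://')
#   @app.task
#   def sort(xs): ...

def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def sort(xs):
    """Top-down merge sort that returns a new sorted list."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    return merge(sort(xs[:mid]), sort(xs[mid:]))


print(sort([5, 2, 4, 1, 3]))
```

A Celery task can also be called like an ordinary function, in which case it runs in the current process. This is what the notebook relies on later when it calls `merge` and `sort` locally.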

In [4]:
import random
import time
from celery import group
from mergesort_worker import sort, merge


In [5]:
# Create a list of some elements in random order.
sequence = list(range(800000))
random.shuffle(sequence)


In [6]:
# Split the sequence in a number of chunks and process those independently.
n = 4
l = len(sequence) // n
subseqs = [sequence[i * l:(i + 1) * l] for i in range(n - 1)]
subseqs.append(sequence[(n - 1) * l:])

In [7]:
len(subseqs)

4

In [8]:
for i in range(len(subseqs)):
    print('Length of sequence {}: {}'.format(i, len(subseqs[i])))

Length of sequence 0: 200000
Length of sequence 1: 200000
Length of sequence 2: 200000
Length of sequence 3: 200000
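Here 800000 divides evenly by 4, so all chunks have the same length. When it does not, the splitting logic above puts the remainder into the last chunk. The sketch below wraps the same logic in a helper (the name `split_into_chunks` is made up for illustration) and shows this on a small list.

```python
def split_into_chunks(seq, n):
    """Split seq into n contiguous chunks; the last chunk takes any remainder."""
    size = len(seq) // n
    chunks = [seq[i * size:(i + 1) * size] for i in range(n - 1)]
    chunks.append(seq[(n - 1) * size:])
    return chunks


chunks = split_into_chunks(list(range(10)), 4)
print([len(c) for c in chunks])  # → [2, 2, 2, 4]
```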


**Before you run the next cell, start a Celery worker on each of your worker machines (use three or four workers) with**

**"celery -A mergesort_worker worker --loglevel=info"** <br>
or with a concurrency number, e.g. **"celery -A mergesort_worker worker --loglevel=info --concurrency=3"**

That machine then becomes a worker and can run the app's tasks, i.e. the sort function, whenever it receives one from the broker.

**The results below will depend on your instance types and on the network speed between the machines and the RabbitMQ server.**



In [9]:

t0 = time.time()

# Ask the Celery workers to sort each sub-sequence.
# Use a group to run the individual independent tasks as a unit of work.

# celery.group creates a group of tasks to be executed in parallel.
# 'sort.s(seq)' creates a signature of the sort task: a description of the call that can be sent to the worker machines.
# Calling group(...)() sends all the signatures to the workers, which run them in parallel.

lazy_partials = group(sort.s(seq) for seq in subseqs)()  # call remote workers to run the sort tasks in parallel
t1 = time.time()-t0

# We will wait till we get back all results from the workers
partials = lazy_partials.get() # will wait for the tasks to return

t2 = time.time()-t0

# Merge all the individual sorted sub-lists into our final result.
result = partials[0]
for partial in partials[1:]:
    result = merge(result, partial)  # merge the workers' results locally

t3 = time.time() - t0

print('Tasks sent to workers in %.02fs' % (t1))
print('Results from all the workers came back in %.02fs' % (t2))
print('Distributed mergesort took %.02fs' % (t3))



Tasks sent to workers in 0.98s
Results from all the workers came back in 6.60s
Distributed mergesort took 7.06s
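The final step above merges the partial results pairwise, so the growing `result` list is walked again on every merge. As an alternative (not what this notebook does), the standard library's `heapq.merge` performs a k-way merge of all sorted inputs in one pass. The small lists below are stand-ins for the workers' results.

```python
import heapq

# Small stand-ins for the sorted sub-lists returned by the workers.
partials = [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

# heapq.merge lazily yields the items of all sorted inputs in sorted order.
result = list(heapq.merge(*partials))
print(result)
```

Whether this is faster than the pairwise loop for four large lists would need to be measured; it mainly keeps the merge step short.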


In [10]:
# Do the same thing locally and compare the times.
t0 = time.time()

# Here we call 'sort' directly, without its signature 'sort.s', so that it runs locally.
truth = sort(sequence)
dt = time.time() - t0
print('Local mergesort took %.02fs' % (dt))



Local mergesort took 18.08s


**In this case the local sort took longer than the parallel sort using the workers!**
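To put numbers on that comparison, the sketch below redoes the arithmetic on the timings printed in this notebook. The variable names are illustrative, and the figures apply only to this particular run.

```python
# Timings printed earlier in this notebook (seconds, measured from t0).
t_dispatch = 0.98      # tasks sent to workers
t_results = 6.60       # all results back from the workers
t_distributed = 7.06   # results back and merged locally
t_local = 18.08        # local mergesort

wait_time = t_results - t_dispatch       # waiting for the workers to finish
merge_time = t_distributed - t_results   # final local merge
speedup = t_local / t_distributed

print('Waiting for workers: %.02fs' % wait_time)
print('Final local merge:   %.02fs' % merge_time)
print('Speedup:             %.02fx' % speedup)
```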

In [11]:
# Final sanity checks.
assert result == truth
assert result == sorted(sequence)

# Yayyy sorting was successful

**Let us see some more tests**

In [12]:
# The line below only sends the tasks to the workers and asks them to run the tasks in parallel on the worker machines.
# The group command makes it possible to run these tasks in parallel; it returns a handle to the results, stored in 'lazy_partials'.
lazy_partials = group(sort.s(seq) for seq in subseqs)()  # call remote workers to run the sort tasks in parallel

print(len(lazy_partials))
for it, val in enumerate(lazy_partials): 
    print(f'iter {it}: {val}')

# We get the results back lazily: these are only task IDs, and the results may not have been calculated yet!

4
iter 0: d518d820-2415-427d-876b-45c21caeaba2
iter 1: bf1bab34-b5e1-45c3-a602-f8a531c125ed
iter 2: 817f306c-c25a-4223-b897-79dcb6f35f56
iter 3: d21df6f3-34ae-4804-a263-2dcbff5fa2da
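The same submit-then-wait pattern can be tried without a broker. The sketch below is a local analogy using the standard library's `concurrent.futures`, not Celery itself: `submit()` returns a handle immediately, and `result()` blocks until the value is ready, much like `.get()` on the group result.

```python
from concurrent.futures import ThreadPoolExecutor

chunks = [[3, 1, 2], [9, 7, 8], [6, 4, 5]]

with ThreadPoolExecutor(max_workers=3) as pool:
    # submit() returns a Future right away; the work may still be running.
    futures = [pool.submit(sorted, chunk) for chunk in chunks]
    # result() blocks until each task has finished.
    partials = [future.result() for future in futures]

print(partials)
```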


In [13]:
# We broke our dataset into 4 parts, which is why we see 4 distributed tasks.
print(len(subseqs))

4


In [14]:
# We need to call the .get() function to get the final results from all the workers:
partials = lazy_partials.get()   

print(len(partials))
for i in range(len(partials)): 
    print(f'length of chunk {i}: {len(partials[i])}')

4
length of chunk 0: 200000
length of chunk 1: 200000
length of chunk 2: 200000
length of chunk 3: 200000


**Let us check the running time again!**

In [17]:
t0 = time.time()
lazy_partials = group(sort.s(seq) for seq in subseqs)() # call remote workers to run the sort task 
dt = time.time() - t0
print(' took %.02fs' % (dt))


 took 0.72s


It took this much time to communicate with the workers, but the results have not been calculated yet.
The calculation continues in the background...

In [18]:
t0 = time.time()
partials = lazy_partials.get() # will wait for the tasks to return
dt = time.time() - t0
print('took %.02fs to get the results back' % (dt))


took 4.45s to get the results back


We needed this much additional time to get all the calculated results back from the workers.
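The cells above repeat the same `t0 = time.time()` pattern for every measurement. One possible way to tidy this up is a small context manager; the helper name `timed` is made up here, and it uses `time.perf_counter()`, which is intended for measuring intervals.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the elapsed wall-clock time of the with-block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed('sleep', timings):
    time.sleep(0.05)  # stand-in for dispatching tasks or calling .get()

print('sleep took %.02fs' % timings['sleep'])
```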