![alt text](data/img/celery_512.png "Celery")

### What is Celery?
* an open source asynchronous task queue/job queue based on distributed message passing
* "a task queue with batteries included"
* execution units called *tasks* are executed concurrently on one or more worker nodes
* tasks can execute asynchronously (in the background) or synchronously (wait until ready)

### Choosing a Broker
* Celery requires a solution to send and receive messages
  * typically this is performed via a separate service called a *message broker*
* there are several choices of broker available
  * `RabbitMQ`: feature-complete, stable, durable and easy to install
    * http://www.rabbitmq.com/install-standalone-mac.html
  * `Redis`: also feature-complete, but more susceptible to data loss in the event of abrupt termination or power failures
    * http://docs.celeryproject.org/en/latest/getting-started/brokers/redis.html#broker-redis
  * Databases: not recommended, but can be sufficient for very small installations and testing purposes:
    * `SQLAlchemy`: http://docs.celeryproject.org/en/latest/getting-started/brokers/sqlalchemy.html#broker-sqlalchemy
    * `Django`: http://docs.celeryproject.org/en/latest/getting-started/brokers/django.html#broker-django
* we'll use `RabbitMQ`, since it is widely used, stable, and durable
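
Celery selects a broker by URL, so it may help to see how the URLs for the two recommended brokers are put together. The sketch below uses only the standard library. The credentials, hosts, and ports are the conventional local defaults and are assumptions, not values taken from this setup.

```python
from urllib.parse import urlsplit

# Assumed local defaults; adjust user, password, host, and port as needed.
BROKER_URLS = {
    'rabbitmq': 'amqp://guest:guest@localhost:5672//',  # the second '/' names the default vhost
    'redis': 'redis://localhost:6379/0',                # '/0' selects database number 0
}

for name, url in BROKER_URLS.items():
    parts = urlsplit(url)
    # The scheme is what tells Celery which transport to use.
    print(name, parts.scheme, parts.hostname, parts.port, parts.path)
```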
 

### Installing Celery
* __`pip3 install celery`__
* to test, try __`from celery import Celery`__
* install __`RabbitMQ`__
  * complete installation instructions at http://www.rabbitmq.com/download.html
  * on a Mac, it's easiest to use `Homebrew` to perform the installation http://www.rabbitmq.com/install-standalone-mac.html
  * start server __`/usr/local/sbin/rabbitmq-server`__

If you're using Docker, you can also do the following:

```
$ docker pull rabbitmq
$ docker run --rm -p 5672:5672 rabbitmq
```

In [1]:
!pip install -U celery

Looking in links: /home/rick446/src/wheelhouse
Collecting celery
  Downloading celery-5.1.2-py3-none-any.whl (401 kB)
Collecting billiard<4.0,>=3.6.4.0
  Downloading billiard-3.6.4.0-py3-none-any.whl (89 kB)
Collecting kombu<6.0,>=5.1.0
  Downloading kombu-5.2.0-py3-none-any.whl (188 kB)
Collecting amqp<6.0.0,>=5.0.6
  Downloading amqp-5.0.6-py3-none-any.whl (53 kB)
Installing collected packages: billiard, amqp, kombu, celery
  Attempting uninstall: billiard
    Found existing installation: billiard 3.6.3.0
    Uninstalling billiard-3.6.3.0:
      Successfully uninstalled billiard-3.6.3.0
  Attempting uninstall: amqp
    Found existing installation: amqp 5.0.1
    Uninstalling amqp-5.0.1:
      Successfull

In [2]:
import celery

### The Celery Application
* first we need a Celery *instance*, i.e., the Celery application, or "app"
* this instance is used as the entry-point for everything you want to do in Celery, e.g., creating tasks and managing workers
  * therefore it must be possible for other modules to import it
  * for now, we'll put everything in a single module called `tasks.py`

In [3]:
%%file data/celery_examples/tasks.py
from celery import Celery

app = Celery(__name__, broker='amqp://guest@localhost//')

@app.task
def add(x, y):
    return x + y

Overwriting data/celery_examples/tasks.py


* the first argument to Celery is the name of the current module (here `__name__`, which is `celery_examples.tasks` when the file is imported from the package); it is only needed so that task names can be generated automatically when tasks are defined in the `__main__` module
* the `broker` keyword argument specifies the URL of the message broker you want to use

### Next: Running the Celery Worker Server
* in a production system, the worker would be run in the background, e.g., as a daemon (see http://docs.celeryproject.org/en/latest/tutorials/daemonizing.html)
  * for demonstration purposes, we'll just run it by hand in a separate window
    * __`celery -A celery_examples.tasks worker --loglevel=info`__
    * make sure you run it from the directory which contains `celery_examples`
* for more information, try
  * __`celery worker --help`__
  * __`celery --help`__

### Calling the task
* we will use the [__`delay()`__](http://docs.celeryproject.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.delay) method to invoke the task
* it's a shortcut for the fully-featured [__`apply_async()`__](http://docs.celeryproject.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.apply_async) method

In [4]:
import sys
sys.path.append('data')

In [5]:
from importlib import reload
from celery_examples import tasks
tasks = reload(tasks)

In [7]:
tasks.add(4, 12)

16

In [8]:
r = tasks.add.delay(4, 12)

In [9]:
print(r), type(r)

17de79ca-ee1e-4754-84bc-ad76b67e9f41


(None, celery.result.AsyncResult)

In [10]:
r.get()

NotImplementedError: No result backend is configured.
Please see the documentation for more information.

In [11]:
cat data/celery_examples/tasks.py

from celery import Celery

app = Celery(__name__, broker='amqp://guest@localhost//')

@app.task
def add(x, y):
    return x + y


* as above, calling __`delay()`__ returns an `AsyncResult` instance
  * can be used to check the state of the task, wait for the task to finish or get its return value (or if the task failed, the exception and traceback)
  * however, this isn’t enabled by default
    * you must configure Celery to use a *result backend*

### Keeping Results
* if you want to keep track of the tasks’ states, Celery needs to store or send the states somewhere
* there are several built-in result backends to choose from, including `SQLAlchemy/Django ORM`, `Memcached`, `Redis`, `RPC (RabbitMQ/AMQP)`, and `MongoDB`; you can also define your own
* we will use the `rpc` result backend, which sends states back as transient messages (but does not store them)
* the backend is specified via the `backend` argument to Celery
* so we'll update `tasks.py` to specify the backend and then try again to see the result...

In [12]:
%%file data/celery_examples/tasks.py
from celery import Celery

app = Celery('tasks', backend='rpc://', broker='amqp://guest@localhost')

@app.task
def add(x, y):
    return x + y

Overwriting data/celery_examples/tasks.py


In [16]:
# before we execute this, we must restart the Celery worker
tasks = reload(tasks)
from celery_examples.tasks import add
result = add.delay(3.94, 5.27)
result

<AsyncResult: 2e41c5a3-c18c-42a5-a5b8-c71dd7365e1b>

In [17]:
result.ready()

False

In [18]:
result.get(timeout=10)

9.209999999999999
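
The `ready()` / `get(timeout=...)` pair above amounts to polling until the worker has reported back. The sketch below spells that loop out by hand. `wait_for` and `FakeResult` are hypothetical names introduced for illustration; `FakeResult` merely imitates the two `AsyncResult` methods used here so the sketch runs without a broker. In real code, `result.get(timeout=10)` already does the waiting for you.

```python
import time

def wait_for(result, timeout=10.0, poll_interval=0.5):
    """Poll result.ready() until it is True, then return result.get()."""
    deadline = time.monotonic() + timeout
    while not result.ready():
        if time.monotonic() >= deadline:
            raise TimeoutError(f'no result after {timeout} seconds')
        time.sleep(poll_interval)
    return result.get()

class FakeResult:
    """Hypothetical stand-in for AsyncResult: becomes ready after `delay` seconds."""
    def __init__(self, value, delay):
        self._value = value
        self._ready_at = time.monotonic() + delay

    def ready(self):
        return time.monotonic() >= self._ready_at

    def get(self):
        return self._value

pending = FakeResult(3.94 + 5.27, delay=0.2)
print(pending.ready())                        # not ready immediately after creation
print(wait_for(pending, poll_interval=0.05))  # blocks briefly, then returns the value
```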

### What if the task raises an exception?
* __`get()`__ will re-raise the exception by default; pass `propagate=False` to have it return the exception instance instead
* let's try an example...

In [19]:
%%file data/celery_examples/tasks.py
from celery import Celery

app = Celery('tasks', backend='rpc://', broker='amqp://guest@localhost')

@app.task
def add(x, y):
    return x + y

@app.task
def exc():
    raise ValueError

Overwriting data/celery_examples/tasks.py


In [20]:
from importlib import reload
tasks = reload(tasks)
# celery_examples.tasks = reload(celery_examples.tasks)
from celery_examples.tasks import exc

result = exc.delay()

In [21]:
result.get()

ValueError: 

In [29]:
%%time
from celery_examples.tasks import exc

try:
    result = exc.delay()
    result.get()
except Exception as err:
    print('Task threw an exception: %r' % err)

Task threw an exception: ValueError()
CPU times: user 0 ns, sys: 3.34 ms, total: 3.34 ms
Wall time: 6.28 ms


In [37]:
%%time
from celery_examples.tasks import exc, add

try:
    result = add.delay(1,2)
    result.get()
except Exception as err:
    print('Task threw an exception: %r' % err)

CPU times: user 2.77 ms, sys: 0 ns, total: 2.77 ms
Wall time: 5.65 ms


# Lab

Open the [Celery Lab][celery-lab]

[celery-lab]: ./celery-lab.ipynb