![alt text](data/img/celery_512.png "Celery")

### What is Celery?
* an open source asynchronous task queue/job queue based on distributed message passing
* "a task queue with batteries included"
* execution units called *tasks* are executed concurrently on one or more worker nodes
* tasks can execute asynchronously (in the background) or synchronously (wait until ready)
* used by Instagram to process millions of tasks every day

# Python 3.7 warning!

Celery 4.2 (the version installed below) does not run on Python 3.7: it ships a module named `async`, and `async` became a reserved keyword in Python 3.7. We therefore need to run our examples on Python 3.6 or earlier, or use a later Celery release (4.3 added Python 3.7 support).
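
A quick way to see whether the current interpreter is affected is to ask Python whether `async` is a keyword. This is only a sketch using the standard library; the helper name `celery_42_supported` is made up for this notebook, and the 3.7 cutoff is taken from the warning above.

```python
import keyword
import sys


def celery_42_supported(version_info=sys.version_info):
    """Return True if this Python predates 3.7, where `async` became reserved.

    Hypothetical helper for this notebook, not part of Celery.
    """
    return tuple(version_info[:2]) < (3, 7)


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    print("`async` is a keyword here:", keyword.iskeyword("async"))
    print("OK for the Celery 4.2 examples:", celery_42_supported())
```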

### Choosing a Broker
* Celery requires a solution to send and receive messages
  * typically this is performed via a separate service called a *message broker*
* there are several choices of broker available
  * `RabbitMQ`: feature-complete, stable, durable and easy to install
    * http://www.rabbitmq.com/install-standalone-mac.html
  * `Redis`: also feature-complete, but more susceptible to data loss in the event of abrupt termination or power failure
    * http://docs.celeryproject.org/en/latest/getting-started/brokers/redis.html#broker-redis
  * Databases: not recommended, but can be sufficient for very small installations and testing purposes:
    * `SQLAlchemy`: http://docs.celeryproject.org/en/latest/getting-started/brokers/sqlalchemy.html#broker-sqlalchemy
    * `Django`: http://docs.celeryproject.org/en/latest/getting-started/brokers/django.html#broker-django
* we'll use `RabbitMQ` because it is widely used and, as noted above, stable and durable
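
To make the role of the broker concrete, here is a toy, in-process stand-in built from the standard library. It is not how Celery or RabbitMQ are implemented: a real broker is a separate service and messages are serialized before being sent. The names `broker`, `results`, and `run_demo` are invented for this sketch.

```python
import queue
import threading

broker = queue.Queue()   # stands in for the message broker
results = {}             # stands in for a place to keep results


def worker():
    """Consume messages from the broker until a None sentinel arrives."""
    while True:
        message = broker.get()
        if message is None:
            break
        task_id, func, args = message
        results[task_id] = func(*args)


def add(x, y):
    return x + y


def run_demo():
    """Send one 'add' message through the toy broker and return its result."""
    t = threading.Thread(target=worker)
    t.start()
    broker.put(("task-1", add, (4, 12)))  # the client only sends a message
    broker.put(None)                      # tell the worker to stop
    t.join()
    return results["task-1"]
```

The point of the sketch is the separation of roles: the caller never runs `add` itself, it only puts a message on the queue, and the worker picks it up.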
 

### Installing Celery
* __`pip3 install celery`__
* to test, try __`from celery import Celery`__
* install __`RabbitMQ`__
  * complete installation instructions at http://www.rabbitmq.com/download.html
  * on a Mac, it's easiest to use `Homebrew` to perform the installation http://www.rabbitmq.com/install-standalone-mac.html
  * start server __`/usr/local/sbin/rabbitmq-server`__

If you're using Docker, you can also do the following:

```
$ docker pull rabbitmq
$ docker run --rm -p 5672:5672 rabbitmq
```
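
Before starting a worker it can help to confirm that something is listening on the broker port. The sketch below assumes RabbitMQ is on `localhost` at port 5672 (the port published by the Docker command above); `broker_reachable` is a hypothetical helper, and it only tests that a TCP connection opens, not that the service actually speaks AMQP.

```python
import socket


def broker_reachable(host="localhost", port=5672, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("RabbitMQ port open:", broker_reachable())
```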

In [1]:
!pip3 install celery

Collecting celery
  Using cached https://files.pythonhosted.org/packages/e8/58/2a0b1067ab2c12131b5c089dfc579467c76402475c5231095e36a43b749c/celery-4.2.1-py2.py3-none-any.whl
Collecting pytz>dev (from celery)
  Using cached https://files.pythonhosted.org/packages/61/28/1d3920e4d1d50b19bc5d24398a7cd85cc7b9a75a490570d5a30c57622d34/pytz-2018.9-py2.py3-none-any.whl
Collecting billiard<3.6.0,>=3.5.0.2 (from celery)
  Downloading https://files.pythonhosted.org/packages/8b/b7/c2fe04f2522bb02d044347734eeda3ff5c7a632fa7d0401530a371ba73db/billiard-3.5.0.5.tar.gz (150kB)
Collecting kombu<5.0,>=4.2.0 (from celery)
  Downloading https://files.pythonhosted.org/packages/29/48/c709a54c8533ed46fd868e593782c6743da33614f8134b82bc0955455031/kombu-4.3.0-py2.py3-none-any.whl (183kB)
Collecting amqp<3.0,>=2.4.0 (from kombu<5.0,>=4.2.0->celery)


In [2]:
import celery

### The Celery Application
* first we need a Celery *instance*, i.e., the Celery application, or "app"
* this instance is used as the entry-point for everything you want to do in Celery, e.g., creating tasks and managing workers
  * therefore it must be possible for other modules to import it
  * for now, we'll put everything in a single module called `tasks.py`

In [4]:
cd data

/Users/rick446/src/arborian-classes/data


In [3]:
%%file celery_examples/tasks.py
from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//')

@app.task
def add(x, y):
    return x + y

Overwriting celery_examples/tasks.py


* the first argument to `Celery` is the name of the current module (__`tasks`__); Celery needs it so that task names can be generated automatically
* the second argument, the `broker` keyword argument, specifies the URL of the message broker you want to use
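
The broker URL packs several settings into one string. The sketch below pulls it apart with the standard library's `urlsplit`; this is not Celery's own parser, and `describe_broker_url` is a hypothetical helper. It shows that the trailing `//` leaves `/` as the path component, which is the name of RabbitMQ's default virtual host.

```python
from urllib.parse import urlsplit


def describe_broker_url(url):
    """Split a broker URL into the pieces a reader usually cares about."""
    parts = urlsplit(url)
    return {
        "transport": parts.scheme,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        # Drop the single "/" that separates the host from the path.
        "virtual_host": parts.path[1:],
    }


if __name__ == "__main__":
    print(describe_broker_url("amqp://guest@localhost//"))
```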

### Next: Running the Celery Worker Server
* in a production system, the worker would be run in the background, e.g., as a daemon (see http://docs.celeryproject.org/en/latest/tutorials/daemonizing.html)
  * for demonstration purposes, we'll just run it by hand in a separate window
    * __`celery -A celery_examples.tasks worker --loglevel=info`__
    * make sure you run it from the directory that contains the `celery_examples` package (here, `data`), so that `celery_examples.tasks` can be imported
* for more information, try
  * __`celery worker --help`__
  * __`celery --help`__

### Calling the task
* we will use the [__`delay()`__](http://docs.celeryproject.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.delay) method to invoke the task
* it's a shortcut for the fully-featured [__`apply_async()`__](http://docs.celeryproject.org/en/latest/reference/celery.app.task.html#celery.app.task.Task.apply_async) method
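
The relationship between the two methods can be sketched without a broker. `ToyTask` below is a hypothetical stand-in, not Celery's `Task` class: it runs the function inline and returns the value directly, whereas a real task publishes a message and returns an `AsyncResult`. What it does mirror is the calling convention: `delay()` takes star-arguments and forwards them to `apply_async()`, which takes `args` and `kwargs` explicitly and leaves room for execution options such as `countdown`.

```python
class ToyTask:
    """Hypothetical stand-in for a Celery task; runs the function inline."""

    def __init__(self, func):
        self.func = func

    def apply_async(self, args=(), kwargs=None, **options):
        # A real task would publish a message here. `options` stands in for
        # execution options (e.g. countdown) and is ignored in this toy.
        return self.func(*args, **(kwargs or {}))

    def delay(self, *args, **kwargs):
        # The shortcut: star-arguments in, one apply_async call out.
        return self.apply_async(args, kwargs)


def _add(x, y):
    return x + y


add = ToyTask(_add)

if __name__ == "__main__":
    print(add.delay(4, 12))
    print(add.apply_async(args=(4, 12), countdown=10))
```

With a real Celery task, `add.delay(4, 12)` and `add.apply_async((4, 12))` send the same message; only `apply_async()` lets you attach options.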

In [5]:
from importlib import reload
from celery_examples import tasks
tasks = reload(tasks)

In [6]:
r = tasks.add.delay(4, 12)


In [7]:
print(r), type(r)

d4ed937e-8f1c-450e-9601-ffc43f3ddf9f


(None, celery.result.AsyncResult)

In [8]:
r.get()

NotImplementedError: No result backend is configured.
Please see the documentation for more information.

In [9]:
cat celery_examples/tasks.py

from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//')

@app.task
def add(x, y):
    return x + y


* as above, calling __`delay()`__ returns an `AsyncResult` instance
  * can be used to check the state of the task, wait for the task to finish or get its return value (or if the task failed, the exception and traceback)
  * however, none of this works by default, as the `NotImplementedError` above shows
    * you must first configure Celery to use a *result backend*

### Keeping Results
* if you want to keep track of the tasks’ states, Celery needs to store or send the states somewhere
* there are several built-in result backends to choose from: `SQLAlchemy/Django ORM`, `Memcached`, `Redis`, `AMQP (RabbitMQ)`, and `MongoDB`; you can also define your own
* we will use the `rpc` result backend, which sends states back as transient messages (but does not store them)
* the backend is specified via the `backend` argument to Celery
  * (or via the `CELERY_RESULT_BACKEND` setting, called `result_backend` in Celery 4's lowercase setting names, if you choose to use a configuration module)
* so we'll update `tasks.py` to specify the backend and then try again to see the result...
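
For reference, the configuration-module alternative might look like the sketch below. The module name `celeryconfig` is an assumption, and the sketch uses the lowercase setting names from Celery 4 (`result_backend` rather than `CELERY_RESULT_BACKEND`). The module itself is plain Python, so nothing in it depends on Celery being importable.

```python
# celeryconfig.py (assumed module name)

# Same broker and backend as the keyword arguments used in tasks.py.
broker_url = "amqp://guest@localhost//"
result_backend = "rpc://"

# In tasks.py the app would then load these settings with:
#     app = Celery('tasks')
#     app.config_from_object('celeryconfig')
```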

In [10]:
%%file celery_examples/tasks.py
from celery import Celery

app = Celery('tasks', backend='rpc://', broker='amqp://guest@localhost')

@app.task
def add(x, y):
    return x + y

Overwriting celery_examples/tasks.py


In [11]:
# before we execute this, we must restart the Celery worker
tasks = reload(tasks)
from celery_examples.tasks import add
result = add.delay(3.94, 5.27)
result

SyntaxError: invalid syntax (rpc.py, line 20)

This `SyntaxError` is the Python 3.7 incompatibility described in the warning near the top: Celery 4.2's `rpc.py` imports from a module named `async`. The cells that follow were therefore not executed in this session.

In [None]:
result.ready()

In [None]:
result.get(timeout=1)
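
If you have no worker running, the same "check, then wait with a timeout" pattern can be tried with the standard library's `Future`, which is used here purely as an analogy: `done()` plays the role of `ready()` and `result(timeout=...)` the role of `get(timeout=...)`. The names `slow_add` and `demo` are invented for this sketch, and no Celery code is involved.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def slow_add(x, y, delay=0.5):
    """Add two numbers after a short pause, to simulate a slow task."""
    time.sleep(delay)
    return x + y


def demo():
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_add, 3.94, 5.27)
        ready_at_once = future.done()        # analogous to result.ready()
        try:
            future.result(timeout=0.05)      # analogous to result.get(timeout=...)
            timed_out = False
        except TimeoutError:
            timed_out = True
        value = future.result()              # block until the task finishes
    return ready_at_once, timed_out, value


if __name__ == "__main__":
    print(demo())
```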

### What if the task raises an exception?
* __`get()`__ will re-raise the exception (unless you pass `propagate=False`)
* let's try an example...

In [None]:
%%file celery_examples/tasks.py
from celery import Celery

app = Celery('tasks', backend='rpc://', broker='amqp://guest@localhost')

@app.task
def add(x, y):
    return x + y

@app.task
def exc():
    raise ValueError

In [None]:
tasks = reload(tasks)
from celery_examples.tasks import exc

result = exc.delay()
result.get()


In [None]:
from celery_examples.tasks import exc

try:
    result = exc.delay()
    result.get()
except Exception as err:
    print('Task threw an exception: %r' % err)
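
The choice between re-raising an error and merely inspecting it also exists on the standard library's `Future`, which can serve as a broker-free analogy: `result()` re-raises the task's exception, much as `get()` does by default, while `exception()` hands it back without raising. This sketch does not use Celery, and `demo` is an invented name.

```python
from concurrent.futures import ThreadPoolExecutor


def exc():
    raise ValueError("boom")


def demo():
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(exc)
        captured = future.exception()   # waits, then returns the exception
        try:
            future.result()             # re-raises the stored exception
            reraised = False
        except ValueError:
            reraised = True
    return captured, reraised


if __name__ == "__main__":
    captured, reraised = demo()
    print('Task threw an exception: %r' % captured)
    print('result() re-raised it:', reraised)
```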

# Lab

Open the [Celery Lab][celery-lab]

[celery-lab]: ./celery-lab.ipynb