Example using the Eventlet Pool
This is a Celery application containing two example tasks.
First install Eventlet. Installing the dnspython module is also recommended, because it makes all name lookups asynchronous:
    $ pip install eventlet
    $ pip install dnspython
Before you run any of the example tasks you need to start celeryd:
    $ cd examples/eventlet
    $ celeryd -l info --concurrency=500 --pool=eventlet
As usual, RabbitMQ must be running. See the Celery getting started guide if you haven't installed it yet.
The urlopen task requests a URL and returns the size of the response body:
    $ cd examples/eventlet
    $ python
    >>> from tasks import urlopen
    >>> urlopen.delay("http://www.google.com/").get()
    9980
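The tasks.py module itself is not reproduced in this README. As a rough idea of what a task of this kind does, here is a minimal sketch of the "fetch and measure" step as a plain function. The name body_size and the timeout parameter are hypothetical, the Celery task decorator is deliberately left out, and the standard library opener stands in for whatever HTTP client the real task uses:

```python
from urllib.request import urlopen as open_url  # stdlib opener, not the Celery task


def body_size(url, timeout=10.0):
    """Fetch `url` and return the length of the response body in bytes.

    Hypothetical stand-in for the work done inside the urlopen task;
    the real task in tasks.py may differ.
    """
    with open_url(url, timeout=timeout) as response:
        return len(response.read())
```

Calling body_size("data:text/plain,hello") returns 5 without touching the network, which is a convenient way to try the function before pointing it at a real site.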
To open several URLs at once you can do:
    $ cd examples/eventlet
    $ python
    >>> from tasks import urlopen
    >>> from celery.task.sets import TaskSet
    >>> result = TaskSet(urlopen.subtask((url, ))
    ...                     for url in LIST_OF_URLS).apply_async()
    >>> for incoming_result in result.iter_native():
    ...     print(incoming_result, )
The second task, in webcrawler.py, is a simple recursive web crawler. It only crawls URLs on the same host name as the starting URL. See the comments in the webcrawler.py file for details.
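The same-host restriction can be illustrated with a small helper. This is only a sketch under assumptions: the function name same_host_links is hypothetical, it takes link targets that have already been extracted from a page, and webcrawler.py may implement the check differently:

```python
from urllib.parse import urljoin, urlparse


def same_host_links(page_url, hrefs):
    """Return absolute http(s) URLs from `hrefs` on the same host as `page_url`."""
    host = urlparse(page_url).hostname
    links = []
    for href in hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links against the page
        parsed = urlparse(absolute)
        # Keep only web links whose host name matches the page being crawled.
        if parsed.scheme in ("http", "https") and parsed.hostname == host:
            links.append(absolute)
    return links
```

A crawler would call a helper like this on each fetched page and schedule a new task only for the URLs it returns, so links to other sites and non-web schemes such as mailto: are dropped.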