GitHub - heizelnut/hawkloon: 🦅 Python framework to build synchronous workers.

Hawkloon (/'hɔːkluːn/) is a Python framework to build synchronous workers running on scalable infrastructures.

How it works

To understand how it works, you first need two concepts: Worker and Job.

Worker

A Worker is a piece of code that runs and completes a task, kinda like a school kid doing his homework.

Job

A Job is a resource, a piece of data given to the Worker. This is the homework from the analogy above.

The procedure

You write a list of Jobs, and the same list is given to every Worker.
A Worker can split the Jobs among several threads.
When a Job is finished, the thread that handled it tells the others, so they can skip it.

Hawkloon uses Redis to keep track of the jobs' states across threads and workers (so you can start workers on two different machines).
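The skip-when-done behaviour described above can be sketched without a Redis server. This is only an illustration of the idea, not Hawkloon's actual implementation: `JobBoard`, `run_worker` and `record` are hypothetical names, and an in-memory set guarded by a lock stands in for the shared state that Hawkloon keeps in Redis. As a simplification, the sketch marks a job as taken before processing it rather than when it ends.

```python
import threading


class JobBoard:
    """Hypothetical stand-in for the shared job state (Redis in Hawkloon)."""

    def __init__(self):
        self._claimed = set()
        self._lock = threading.Lock()

    def claim(self, job):
        """Return True only for the first caller that claims this job."""
        with self._lock:
            if job in self._claimed:
                return False
            self._claimed.add(job)
            return True


def run_worker(jobs, board, consume, threads=2):
    """Every thread walks the full job list and skips jobs already claimed."""
    def loop():
        for job in jobs:
            if board.claim(job):
                consume(job)

    pool = [threading.Thread(target=loop) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()


board = JobBoard()
done = []
done_lock = threading.Lock()


def record(job):
    # Stand-in for real work: just remember which jobs were consumed.
    with done_lock:
        done.append(job)


demo_jobs = ("a", "b", "c", "d")
# Two "workers" share one board, like two machines sharing one Redis server.
run_worker(demo_jobs, board, record, threads=3)
run_worker(demo_jobs, board, record, threads=3)
```

Although six threads in total walk the same list, each job ends up in `done` exactly once, because only the first claim on a job succeeds.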

Installation

To install it, clone the repo and use the setup.py file.

git clone https://github.com/heizelnut/hawkloon
cd hawkloon/src
python3 setup.py install

Example

Let's write a dead simple program that downloads cat images from the internet.

First, write a list of links (jobs) the Worker will have to consume.

jobs = (
    "https://i.imgur.com/uvFEcJN.jpg",
    "https://i.imgur.com/6qL2HSN.jpg",
    "https://i.imgur.com/dRxnay8.jpg",
    "https://i.imgur.com/aAuTHLe.jpg",
    "https://i.imgur.com/SpCbHBI.jpg"
)

Now you can choose two ways to declare workers: with Class Inheritance or Decorators.

Class Inheritance

Inherit from the Worker class and choose the number of threads.

from hawk import Worker

class CatWorker(Worker):
    THREADS = 3 # Default is 2
    pass

Alright, now override the Worker.consume method so that it actually downloads some images.

from hawk import Worker
import random, requests

class CatWorker(Worker):
    THREADS = 3
    
    # Requests the image and saves it in binary form
    def consume(self, job):
        img = requests.get(job, allow_redirects=True).content

        with open(f"{random.randint(0, 10000)}.jpg", "wb") as f:
            f.write(img)
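Random file names like the one above can collide, in which case a later download overwrites an earlier one. One possible alternative, sketched here with only the standard library, derives the file name from the URL itself. `filename_for` is a hypothetical helper for illustration, not part of Hawkloon.

```python
import posixpath
from urllib.parse import urlparse


def filename_for(url, default="download.jpg"):
    """Return the last path segment of the URL, or a default if it has none."""
    name = posixpath.basename(urlparse(url).path)
    return name or default


print(filename_for("https://i.imgur.com/uvFEcJN.jpg"))  # uvFEcJN.jpg
```

Inside `consume`, the call `open(filename_for(job), "wb")` could then replace the random name.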

Then, instantiate the worker and connect it to a Redis server.

worker = CatWorker(jobs)

worker.connect("redis://localhost:6379")

After that, start it!

worker.start()

Decorator Style

Import the Worker class and instantiate it.

from hawk import Worker
import random, requests

cat_worker = Worker(jobs)

Then, create a function that downloads cat images, and decorate it with the newly instantiated object, passing the number of threads to use.

@cat_worker(threads=3)
def download(job):
    img = requests.get(job, allow_redirects=True).content

    with open(f"{random.randint(0, 10000)}.jpg", "wb") as f:
        f.write(img)

The decorated function must take exactly one argument: the job.
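To see why a single argument is required, here is a minimal sketch of how a callable object can act as a parameterized decorator. `TinyWorker` and `shout` are hypothetical and written only for this illustration; the real Worker class may be implemented quite differently, and this version runs on a single thread for brevity.

```python
class TinyWorker:
    """Hypothetical miniature of a worker that registers a consumer via a decorator."""

    def __init__(self, jobs):
        self.jobs = jobs
        self.threads = 2
        self.consumer = None

    def __call__(self, threads=2):
        # Calling the instance returns the actual decorator.
        def decorator(func):
            self.threads = threads
            self.consumer = func
            return func
        return decorator

    def start(self):
        # Each job is passed to the consumer as its only argument.
        return [self.consumer(job) for job in self.jobs]


tiny = TinyWorker(["a", "b"])


@tiny(threads=3)
def shout(job):
    return job.upper()


results = tiny.start()
```

Because the worker calls the registered function once per job and passes nothing else, a function with more required parameters would fail at call time.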

After that, connect and start crunching data.

cat_worker.connect("redis://localhost:6379")

cat_worker.start()

Contributing

If you'd like to contribute, you're free to do so! Fork the project and open a pull request.
