Updated CHANGELOG, README

codepr committed Apr 27, 2019
1 parent 628bb87 commit 312c43c
Showing 3 changed files with 99 additions and 103 deletions.
89 changes: 37 additions & 52 deletions CHANGES.md
## 0.10.2 - Apr 26, 2019

- Refactored CLI commands and logging format
- Added factory methods to supervisors
- Fixed bug in rabbitmq backend module

## 0.10.1 - Apr 26, 2019

- Moved factory logic for client creation to a `from_url` method on the client
  module
- Added a `TasqFuture` wrapper for client results, to return more structured
  results with additional information about execution

## 0.10.0 - Apr 22, 2019

- Added a `TasqQueue` class for more convenient use
- Fixed some bugs

(Apr 22, 2019)

- Renamed `master` -> `supervisor`
- Added RabbitMQ to supported backends, still working on a common interface
- Refactored some parts on connection

## 0.9.0 - Mar 23, 2019

- Refactored log system
- Started backend broker support for job queues and persistence
- Added a Redis client

## 0.8.0 - Jul 15, 2018

- Added repeated jobs capabilities to process/thread queue workers too
  (previously only the Actor worker could achieve that)
- Fixed some bugs; renamed `ProcessWorker` -> `QueueWorker` and
  `ProcessMaster` -> `QueueMaster`

## 0.7.0 - Jul 14, 2018

- Added the possibility to choose the type of workers for each master process:
  either a pool of actors or a pool of processes, based on the nature of the
  majority of the jobs to be executed. A majority of I/O bound operations
  should stick to `ActorMaster` type workers; for CPU bound tasks
  `QueueMaster` should give better results.

## 0.6.1 - May 18, 2018

- Decoupled connection handling from `tasq.remote.master` and
  `tasq.remote.client` into a dedicated module `tasq.remote.connection`

## 0.6.0 - May 17, 2018

- Simple implementation of digitally signed data sent through sockets; this
  way sender and receiver have a basic security layer to check the integrity
  and legitimacy of received data

## 0.5.0 - May 14, 2018

- Added a `ClientPool` implementation to schedule jobs to different workers by
  using router capabilities

## 0.4.0 - May 6, 2018

- Refactored client code; it now uses a Future system to handle results and
  returns a future even while scheduling a job in a non-blocking manner
- Improved logging
- Improved the string representation of a Job

## 0.3.0 - May 5, 2018

- Added first implementation of delayed jobs
- Added first implementation of interval-scheduled jobs
- Added a basic ActorSystem-like structure and a context for actors
- Refactored some parts, removed Singleton and Configuration classes from
  `__init__.py`

## 0.2.1 - May 1, 2018

- Fixed minor bug in initialization of multiple workers on the same node
- Added support for pending tasks on the client side

## 0.2.0 - Apr 30, 2018

- Renamed some modules
- Added basic logging to modules
- Defined a client supporting sync and async way of scheduling jobs
- Added routing logic for worker actors
- Refactored code

## 0.1.2 - Apr 29, 2018

- Added asynchronous way of handling communication on ZMQ sockets

## 0.1.1 - Apr 28, 2018

- Switched to the PUSH/PULL pattern offered by ZMQ
- Subclassed ZMQ sockets in order to handle cloudpickle serialization

## 0.1.0 - Apr 26, 2018

- First unfinished version, WIP
103 changes: 56 additions & 47 deletions README.md
Tasq
====

Very simple distributed task queue that allows the scheduling of job functions
to be executed on local or remote workers. It can be seen as a proof of concept
leveraging ZMQ sockets and cloudpickle serialization capabilities, as well as a
very basic actor system, to handle different loads of work from connecting
clients. Originally it was meant to be just a brokerless job queue; recently I
dove deeper into the topic and decided to add support for job persistence and
extensions for Redis/RabbitMQ middlewares as well.

Currently Tasq supports a brokerless approach through ZMQ sockets, or
Redis/RabbitMQ as backends. The main advantage of using a brokerless task
queue, besides lower latencies, is the lower level of complexity of the system.
Additionally, Tasq offers the possibility of launching and forgetting some
workers on a network and scheduling jobs to them without them knowing anything
about the code they will run, allowing tasks to be defined dynamically without
stopping the workers. Obviously this approach opens up more risks of malicious
code being injected into the workers; currently the only security measure is
to sign serialized data passed to workers, but the entire system is meant to
be used in a safe environment.

**NOTE:** The project is still in a development stage and it's not advisable
to try it in production environments.



In a Python shell
**Using a queue object**

```
Python 3.7.3 (default, Apr 26 2019, 21:43:19)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.4.0 -- An enhanced Interactive Python. Type '?' for help.
Warning: disable autoreload in ipython_config.py to improve performance.
```

Scheduling a task to be executed continuously at a defined interval:

```
In [15] tq.put(fib, 5, name='8_seconds_interval_fib', eta='8s')
In [16] tq.put(fib, 5, name='2_hours_interval_fib', eta='2h')
```

Delayed and interval tasks are supported even in a blocking scheduling manner.

Tasq also supports an optional static configuration file: the
`tasq.settings.py` module defines a configuration class with some default
fields. By setting the environment variable `TASQ_CONF` it is possible to
configure the location of the JSON configuration file on the filesystem.
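As a sketch of the lookup described above (the function name and the fallback
behavior here are assumptions, not Tasq's actual code in `tasq.settings.py`):

```python
import json
import os

# Hypothetical sketch of a TASQ_CONF-style lookup; the real configuration
# class and its default fields live in tasq.settings.py and may differ.
def load_conf(default_path="conf.json"):
    path = os.environ.get("TASQ_CONF", default_path)
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        # Fall back to built-in defaults when no file is found
        return {}
```

The environment variable only overrides *where* the file is looked up; absent
or unreadable files fall back to the defaults.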

By setting the `-c` flag it is also possible to point to a configuration file
at a custom location on the filesystem:

```
$ tq worker -c path/to/conf/conf.json
```

A worker can be started by specifying the type of subworker we want:
```
$ tq rabbitmq-worker --worker-type process
```
Using a `process` type subworker it is possible to use a distributed queue for
parallel execution, useful when the majority of the jobs are CPU bound instead
of I/O bound (actors are preferable in that case).
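The CPU-bound case is essentially the standard pool-of-processes pattern; a
minimal illustration (not Tasq's actual code) of why processes pay off for
CPU-bound jobs:

```python
from multiprocessing import Pool

def fib(n):
    # Deliberately naive, CPU bound recursive Fibonacci
    return n if n < 2 else fib(n - 1) + fib(n - 2)

if __name__ == "__main__":
    # A pool of processes executes CPU bound jobs in parallel, sidestepping
    # the GIL; a pool of actors on threads would serialize this work.
    with Pool(4) as pool:
        print(pool.map(fib, [25, 26, 27, 28]))
```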

If jobs are scheduled for execution on a disconnected client, or remote
workers are not up at the time of the scheduling, all jobs will be enqueued
for later execution. This means that there's no need to actually start workers
before job scheduling; as soon as the first worker comes up, all jobs will be
sent and executed.

### Security

Currently tasq gives the option to send pickled functions digitally signed, in
order to prevent manipulation of the sent payloads. Being dependency-free, it
uses `hmac` and `hashlib` to generate digests and to verify the integrity of
payloads; the plan is to move to a better implementation, probably using
`pynacl` or something similar.
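Tasq's actual signing code isn't reproduced here, but the `hmac`/`hashlib`
scheme it describes amounts to something like the following sketch (the key
and the digest-prefix layout are assumptions for illustration):

```python
import hashlib
import hmac
import pickle

SHARED_KEY = b"secret-signkey"  # assumed shared between client and worker

def sign(payload: bytes) -> bytes:
    # Prepend an HMAC-SHA256 digest (32 bytes) to the serialized payload
    digest = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    return digest + payload

def verify(signed: bytes) -> bytes:
    digest, payload = signed[:32], signed[32:]
    expected = hmac.new(SHARED_KEY, payload, hashlib.sha256).digest()
    # compare_digest avoids timing side channels on the comparison
    if not hmac.compare_digest(digest, expected):
        raise ValueError("tampered or unsigned payload")
    return payload

blob = pickle.dumps((sum, ((1, 2, 3),)))
assert verify(sign(blob)) == blob
```

Any bit flipped in transit makes `verify` raise, so a worker never unpickles
an unauthenticated payload.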

## Behind the scenes

Essentially it is possible to start workers across the nodes of a network
without forming a cluster, and every single node can host multiple workers by
setting different ports for the communication. Each worker, once started,
supports multiple connections from clients and is ready to accept tasks.

Once a worker receives a job from a client, it delegates its execution to a
dedicated actor or process, usually selected from a pool according to a
defined routing strategy in the case of actors (e.g. round robin, random
routing or smallest mailbox, which should give a trivial indication of the
workload of each actor and select the one with the fewest pending tasks to
execute) or using a simple distributed queue across a pool of processes in a
producer-consumer fashion.
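As an illustration (class and function names are made up, not Tasq's
internals), smallest-mailbox routing reduces to picking the actor with the
fewest pending jobs:

```python
from collections import deque

class Actor:
    """Toy stand-in for a worker actor holding a mailbox of pending jobs."""
    def __init__(self, name):
        self.name = name
        self.mailbox = deque()

    def send(self, job):
        self.mailbox.append(job)

def smallest_mailbox(actors):
    # Route the next job to the actor with the fewest pending tasks
    return min(actors, key=lambda a: len(a.mailbox))

actors = [Actor("a"), Actor("b"), Actor("c")]
actors[0].send("job1")
actors[0].send("job2")
actors[1].send("job3")
smallest_mailbox(actors).send("job4")  # "c" has the emptiest mailbox
```

Round robin and random routing are the same idea with a cycling index or
`random.choice` in place of `min`.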

![Tasq master-workers arch](static/worker_model_2.png)

Another (pool of) actor(s) is dedicated to answering the clients with the
result once it is ready; this way it is possible to make the listening part of
the worker non-blocking and as fast as possible.

The reception of jobs from clients is handled by a `ZMQ.PULL` socket, while
the response transmission handled by the `ResponseActor` is served by a
`ZMQ.PUSH` socket, effectively forming a dual channel of communication and
separating incoming from outgoing traffic.
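A ZMQ-free sketch of that dual-channel idea, with two in-process queues
standing in for the `ZMQ.PULL` and `ZMQ.PUSH` sockets (the real implementation
uses ZMQ sockets over the network):

```python
import queue
import threading

jobs = queue.Queue()     # stands in for the ZMQ.PULL side (incoming jobs)
results = queue.Queue()  # stands in for the ZMQ.PUSH side (outgoing results)

def worker():
    # Receiving and responding never share a channel, so a slow consumer
    # of results cannot block the intake of new jobs.
    while True:
        fn, args = jobs.get()
        if fn is None:
            break
        results.put(fn(*args))

t = threading.Thread(target=worker)
t.start()
jobs.put((sum, ([1, 2, 3],)))
jobs.put((None, None))  # poison pill to stop the worker
t.join()
out = results.get()
print(out)  # → 6
```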

## Installation

Being a didactic project it is not released on PyPI yet; just clone the
repository and install it locally, or play with it using `python -i` or
`ipython`.

```
$ git clone https://github.com/codepr/tasq.git
```
10 changes: 6 additions & 4 deletions tasq/remote/client.py
class BaseTasqClient(metaclass=ABCMeta):
:type port: int
:param port: The port associated with the host param
:type signkey: str or None
:param signkey: String representing a sign key, used to sign bytes passing
    through sockets
"""

def connect(self):
    self._client.connect()
    self._is_connected = True
    # Start the gathering thread if it's not already running
    if not self._gatherer.is_alive():
        self._gatherer.start()
    # Check for pending requests and, if any, empty the queue
    while self._pending:
        job = self._pending.pop()
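The `is_alive()` guard added in this commit matters because Python threads can
only be started once: a client that reconnects and calls `connect()` again
would otherwise call `start()` on the already-started gatherer thread and
crash. A minimal demonstration of the failure mode:

```python
import threading
import time

t = threading.Thread(target=lambda: time.sleep(0.01))
t.start()
t.join()

# A second start() on the same Thread object raises RuntimeError,
# hence the is_alive() check before starting the gatherer again.
try:
    t.start()
    restarted = True
except RuntimeError:
    restarted = False
print(restarted)  # → False
```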
