Commit

Apply suggestions from code review
Co-authored-by: Rene Cordier <rene.cordier@gmail.com>
chibenwa and Arsnael committed Jan 10, 2022
1 parent 376628f commit 7299094
Showing 1 changed file with 9 additions and 9 deletions.
18 changes: 9 additions & 9 deletions src/adr/0051-pulsar-mailqueue.md
@@ -29,7 +29,7 @@ delays, purging the queue, etc.

### Existing distributed MailQueue

-Distributed James currently ship a distributed MailQueue composing the following software with the following
+Distributed James currently ships a distributed MailQueue composing the following software with the following
responsibilities:

- **RabbitMQ** for messaging. A rabbitMQ consumer will trigger dequeue operations.
@@ -47,7 +47,7 @@ This implementation suffers from the following pitfall:
often performed. The driver is not cluster aware and would operate connected to a single host.
- The driver reliability is questionable: we experienced some crashed consumers that are never restarted.
- Throughput and scalability of RabbitMQ is questionable.
-- The current implementation do not support priorities, delays.
+- The current implementation does not support priorities, delays.
- The current implementation is known to be complex, hard to maintain, with some non-obvious tradeoffs.

### A few words about Apache Pulsar
@@ -70,7 +70,7 @@ Pulsar is however complex to deploy and relies on the following components:

This would make it suitable for large to very-large deployments or PaaS.

-The Pulsar SDK is handy and handle natively reactive calls, retries, dead lettering, making implementation less
+The Pulsar SDK is handy and handles natively reactive calls, retries, dead lettering, making implementation less
boiler plate.

## Decision
@@ -82,7 +82,7 @@ Package this mail queue in a simple artifact dedicated to distributed mail proce

## Consequences

-We expect an easier to operate, cheaper, more reliable MailQueue.
+We expect an easier way to operate a cheaper and more reliable MailQueue.

We expect delays being supported as well.

@@ -92,7 +92,7 @@ Pulsar technology would benefit from a broader adoption in James, eventually bec
backing Apache James messaging capabilities.

To reach this status the following work needs to be under-taken:
-- The Pulsar MailQueue need to work on top of a deduplicated blob store. To do this we need to be able to list blobs
+- The Pulsar MailQueue needs to work on top of a deduplicated blob store. To do this we need to be able to list blobs
referenced by the Pulsar MailQueue, see [JIRA-XXXX](TODO).
- The event bus (described in [ADR 37](0037-eventbus.md)) would benefit from a Pulsar implementation, replacing the
existing RabbitMQ one (described in [ADR-38](0038-distributed-eventbus.md)). See [JIRA-XXXX](TODO).
@@ -121,7 +121,7 @@ This work could be continued, for instance under the form of a Google Summer of

The MailQueue relies on the following topology:

-- out topic : contains the mail that are ready to be dequeued.
+- out topic : contains the mails that are ready to be dequeued.
- scheduled topic: emails that are delayed are first enqueued there.
- filter topic: Deletions (name, sender, recipients) prior a given sequence are synchronized between nodes using this topic.

@@ -132,7 +132,7 @@ Scheduled messages have their `deliveredAt` property set to the desired value. W
expired, the message will be consumed and thus moved to the out topic. Flushes simply copy content of the scheduled
topic to the out topic then reset the offset of the scheduled queue, atomically. Expired filters are removed.

-note that in current versions of pulsar there is a scheduled job that handles scheduled messages, the accuracy of scheduling is limited by the frequency at which this job runs.
+Note that in current versions of pulsar there is a scheduled job that handles scheduled messages, the accuracy of scheduling is limited by the frequency at which this job runs.


The size of the mail queue can be simply computed from the out and scheduled topics.
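
The scheduled/out mechanics described in this hunk can be sketched with a small in-memory model. This is only an illustration under stated assumptions, not the James implementation and not the Pulsar client API: `ScheduledTopicSketch`, `tick`, `flush` and the other names are hypothetical, the two topics are modelled as plain Java collections, and `tick` stands in for one run of the periodic job mentioned in the note above.

```java
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Queue;

// Hypothetical in-memory model of the scheduled and out topics (not the Pulsar API).
class ScheduledTopicSketch {

    // A delayed mail together with the instant at which it becomes deliverable.
    static final class Scheduled {
        final String mailName;
        final long deliverAtMillis;

        Scheduled(String mailName, long deliverAtMillis) {
            this.mailName = mailName;
            this.deliverAtMillis = deliverAtMillis;
        }
    }

    // "scheduled topic": delayed mails, ordered by delivery instant.
    private final PriorityQueue<Scheduled> scheduled =
        new PriorityQueue<>(Comparator.comparingLong((Scheduled s) -> s.deliverAtMillis));
    // "out topic": mails that are ready to be dequeued.
    private final Queue<String> out = new ArrayDeque<>();

    // Mails without delay go straight to the out topic, delayed ones to the scheduled topic.
    void enqueue(String mailName, long nowMillis, long delayMillis) {
        if (delayMillis <= 0) {
            out.add(mailName);
        } else {
            scheduled.add(new Scheduled(mailName, nowMillis + delayMillis));
        }
    }

    // One run of the periodic job: expired mails move from scheduled to out.
    // A mail is never released before the job runs, hence the limited accuracy.
    void tick(long nowMillis) {
        while (!scheduled.isEmpty() && scheduled.peek().deliverAtMillis <= nowMillis) {
            out.add(scheduled.poll().mailName);
        }
    }

    // Flush: everything still scheduled becomes immediately available.
    void flush() {
        while (!scheduled.isEmpty()) {
            out.add(scheduled.poll().mailName);
        }
    }

    // Queue size is derived from both topics.
    int size() {
        return out.size() + scheduled.size();
    }

    // Returns the next ready mail, or null when the out topic is empty.
    String dequeue() {
        return out.poll();
    }
}
```

In this model a mail enqueued with a one second delay stays invisible to `dequeue` until a `tick` at or after its delivery instant, or a `flush`, moves it to the out queue. The real implementation performs the flush atomically by resetting the subscription offset, which this sketch does not attempt to reproduce.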
@@ -144,7 +144,7 @@ set of all deletions ever performed.
Upon dequeues, messages of the out topic are filtered using that in-memory data structure, then exposed as a reactive
publisher.

-Upon browsing, both the out and scheduled topic are read from the consumption offset and filtering is applied.
+Upon browsing, both the out and scheduled topics are read from the consumption offset and filtering is applied.

Upon clear, the out topic is deleted.
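
As a rough illustration of the dequeue-time filtering described in this hunk, the sketch below applies an in-memory set of deletion filters to mails read from the out topic. It is a minimal model under assumptions, not the code of the James module: the `Mail` and `DeletionFilter` types and the `dequeue` helper are hypothetical, a plain `List` stands in for the reactive publisher, and "prior a given sequence" is read as "strictly lower sequence number".

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical model of deletion filters applied upon dequeue (not the James API).
class DequeueFilterSketch {

    // Metadata of a mail as it would transit through the out topic.
    static final class Mail {
        final long sequence;
        final String name;
        final String sender;
        final List<String> recipients;

        Mail(long sequence, String name, String sender, List<String> recipients) {
            this.sequence = sequence;
            this.name = name;
            this.sender = sender;
            this.recipients = recipients;
        }
    }

    // A deletion request: removes matching mails enqueued before `beforeSequence`.
    static final class DeletionFilter {
        final long beforeSequence;
        final Predicate<Mail> criterion;

        DeletionFilter(long beforeSequence, Predicate<Mail> criterion) {
            this.beforeSequence = beforeSequence;
            this.criterion = criterion;
        }

        static DeletionFilter byName(long beforeSequence, String name) {
            return new DeletionFilter(beforeSequence, mail -> mail.name.equals(name));
        }

        static DeletionFilter bySender(long beforeSequence, String sender) {
            return new DeletionFilter(beforeSequence, mail -> mail.sender.equals(sender));
        }

        static DeletionFilter byRecipient(long beforeSequence, String recipient) {
            return new DeletionFilter(beforeSequence, mail -> mail.recipients.contains(recipient));
        }

        // Assumption: only mails with a strictly lower sequence are affected.
        boolean deletes(Mail mail) {
            return mail.sequence < beforeSequence && criterion.test(mail);
        }
    }

    // Keeps only the mails that no known deletion filter removes.
    static List<Mail> dequeue(List<Mail> outTopic, List<DeletionFilter> filters) {
        return outTopic.stream()
            .filter(mail -> filters.stream().noneMatch(filter -> filter.deletes(mail)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Mail> outTopic = Arrays.asList(
            new Mail(1L, "mail-1", "alice@example.com", Arrays.asList("bob@example.com")),
            new Mail(2L, "mail-2", "alice@example.com", Arrays.asList("carol@example.com")),
            new Mail(3L, "mail-3", "dave@example.com", Arrays.asList("bob@example.com")));
        // Delete everything sent by alice that was enqueued before sequence 2.
        List<DeletionFilter> filters = Arrays.asList(DeletionFilter.bySender(2L, "alice@example.com"));
        for (Mail mail : dequeue(outTopic, filters)) {
            System.out.println(mail.name); // prints mail-2, then mail-3
        }
    }
}
```

Attaching a sequence bound to each filter means a deletion only affects mails that were already enqueued when it was issued, so a later mail from the same sender is still delivered. Browsing would apply the same predicate to both the out and scheduled topics.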

@@ -153,7 +153,7 @@ Miscellaneous remarks:

- The pulsar admin client is used to list existing queues and to move the current offset of scheduled message subscription upon flushes.
- Priorities are not yet supported.
-- Only metadata transit through Pulsar. The general purpose James blobStore, backed by a S3 compatible API, is used to
+- Only metadata transit through Pulsar. The general purpose of James blobStore, backed by a S3 compatible API, is used to
store the underlying email content. Saves on top of object-storage is latency prone and exposed to end SMTP clients.

## References