From 72990940f50f3dcce2440e6280f6940159b0800b Mon Sep 17 00:00:00 2001
From: Benoit TELLIER
Date: Mon, 10 Jan 2022 14:26:40 +0700
Subject: [PATCH] Apply suggestions from code review

Co-authored-by: Rene Cordier
---
 src/adr/0051-pulsar-mailqueue.md | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/src/adr/0051-pulsar-mailqueue.md b/src/adr/0051-pulsar-mailqueue.md
index 370f5733e04..87d400e033e 100644
--- a/src/adr/0051-pulsar-mailqueue.md
+++ b/src/adr/0051-pulsar-mailqueue.md
@@ -29,7 +29,7 @@ delays, purging the queue, etc.
 
 ### Existing distributed MailQueue
 
-Distributed James currently ship a distributed MailQueue composing the following software with the following
+Distributed James currently ships a distributed MailQueue composing the following software with the following
 responsibilities:
 
  - **RabbitMQ** for messaging. A rabbitMQ consumer will trigger dequeue operations.
@@ -47,7 +47,7 @@ This implementation suffers from the following pitfall:
    often performed. The driver is not cluster aware and would operate connected to a single host.
  - The driver reliability is questionable: we experienced some crashed consumers that are never restarted.
  - Throughput and scalability of RabbitMQ is questionable.
- - The current implementation do not support priorities, delays.
+ - The current implementation does not support priorities, delays.
  - The current implementation is known to be complex, hard to maintain, with some non-obvious tradeoffs.
 
 ### A few words about Apache Pulsar
@@ -70,7 +70,7 @@ Pulsar is however complex to deploy and relies on the following components:
 
 This would make it suitable for large to very-large deployments or PaaS.
 
-The Pulsar SDK is handy and handle natively reactive calls, retries, dead lettering, making implementation less
+The Pulsar SDK is handy and handles natively reactive calls, retries, dead lettering, making implementation less
 boiler plate.
 
 ## Decision
@@ -82,7 +82,7 @@ Package this mail queue in a simple artifact dedicated to distributed mail proce
 
 ## Consequences
 
-We expect an easier to operate, cheaper, more reliable MailQueue.
+We expect an easier way to operate a cheaper and more reliable MailQueue.
 
 We expect delays being supported as well.
 
@@ -92,7 +92,7 @@ Pulsar technology would benefit from a broader adoption in James, eventually bec
 backing Apache James messaging capabilities.
 
 To reach this status the following work needs to be under-taken:
- - The Pulsar MailQueue need to work on top of a deduplicated blob store. To do this we need to be able to list blobs
+ - The Pulsar MailQueue needs to work on top of a deduplicated blob store. To do this we need to be able to list blobs
    referenced by the Pulsar MailQueue, see [JIRA-XXXX](TODO).
  - The event bus (described in [ADR 37](0037-eventbus.md)) would benefit from a Pulsar implementation, replacing the
    existing RabbitMQ one (described in [ADR-38](0038-distributed-eventbus.md)). See [JIRA-XXXX](TODO).
@@ -121,7 +121,7 @@ This work could be continued, for instance under the form of a Google Summer of
 
 The MailQueue relies on the following topology:
 
- - out topic : contains the mail that are ready to be dequeued.
+ - out topic : contains the mails that are ready to be dequeued.
  - scheduled topic: emails that are delayed are first enqueued there.
  - filter topic: Deletions (name, sender, recipients) prior a given sequence are synchronized between nodes using 
    this topic.
@@ -132,7 +132,7 @@ Scheduled messages have their `deliveredAt` property set to the desired value. W
 expired, the message will be consumed and thus moved to the out topic. Flushes simply copy content of the scheduled
 topic to the out topic then reset the offset of the scheduled queue, atomically. Expired filters are removed.
 
-note that in current versions of pulsar there is a scheduled job that handles scheduled messages, the accuracy of scheduling is limited by the frequency at which this job runs.
+Note that in current versions of pulsar there is a scheduled job that handles scheduled messages, the accuracy of scheduling is limited by the frequency at which this job runs.
 
 The size of the mail queue can be simply computed from the out and scheduled topics.
 
@@ -144,7 +144,7 @@ set of all deletions ever performed.
 Upon dequeues, messages of the out topic are filtered using that in-memory data structure, then exposed as a reactive
 publisher.
 
-Upon browsing, both the out and scheduled topic are read from the consumption offset and filtering is applied.
+Upon browsing, both the out and scheduled topics are read from the consumption offset and filtering is applied.
 
 Upon clear, the out topic is deleted.
 
@@ -153,7 +153,7 @@ Miscellaneous remarks:
  - The pulsar admin client is used to list existing queues and to move the current offset of scheduled message
    subscription upon flushes.
  - Priorities are not yet supported.
- - Only metadata transit through Pulsar. The general purpose James blobStore, backed by a S3 compatible API, is used to
+ - Only metadata transit through Pulsar. The general purpose of James blobStore, backed by a S3 compatible API, is used to
    store the underlying email content. Saves on top of object-storage is latency prone and exposed to end SMTP clients.
 
 ## References