
making publish asynchronous with a batching mechanism #32

Merged: 3 commits merged into qweeze:master from implement_async_publish on Mar 7, 2023

Conversation

@DanielePalaia (Collaborator) commented Feb 15, 2023

This PR makes it possible to run publish() asynchronously through a batching mechanism, similar to what we do in our Go/.NET clients.
publish() now takes an additional input parameter, "send_batch_enabled". When it is set to True, _get_or_create_publisher creates an asyncio task that sends the buffered messages in batches via publish_batch after a given interval of time. publish() in turn puts messages into a buffer instead of sending them directly (again, only when send_batch_enabled is set to True).
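For illustration only, a usage sketch (the flag is the one proposed in this PR; the rest mirrors the existing publish call):

for msg in messages:
    # buffer the message instead of sending it immediately;
    # a background asyncio task flushes the buffer via publish_batch
    await producer.publish(stream, msg, send_batch_enabled=True)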

@qweeze (Owner) commented Feb 16, 2023

Thanks @DanielePalaia
I like the idea of adding this buffering mechanism, but I have a couple of thoughts about the implementation.

My main concern is that the new flag send_batch_enabled completely changes the publish method's behavior and makes it considerably more complex.
We can't return a publishing_id with send_batch_enabled, which somewhat breaks the semantics of this method. Returning either an id or 0 depending on whether the flag is enabled feels a little confusing.
There's also possible confusion between publish_batch(...) and publish(..., send_batch_enabled=True).

I think a better way might be to add a separate method for publishing with buffering.
Keeping this async batch-publishing mechanism separate from regular publishing should let us provide a clearer API and a simpler internal implementation. For example (just an idea):

async with producer.background_sender(stream="mystream", interval=0.2) as sender:
    # ...
    for msg in messages:
        sender.send(msg)  # no need for await here

This way the Producer class won't need all the extra attributes (_buffered_messages, task, ...), and the context manager makes managing the background task more explicit. Also, if we pass stream and publisher_name at initialization time, we won't have to call _get_or_create_publisher and acquire a lock on each message, so the send method can just do self.buffer.append(message)

Or, going further, we could even make a separate public class that wraps the producer:

sender = BackgroundSender(producer, stream)
await sender.start()
async for msg in messages:
    sender.send(msg)
await sender.stop()
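A very rough sketch of such a wrapper, only to illustrate the shape of it (all names and details are placeholders; publish_batch stands for the existing batch call, and async-with support could be added via __aenter__/__aexit__):

import asyncio

class BackgroundSender:
    def __init__(self, producer, stream: str, interval: float = 0.2):
        self._producer = producer
        self._stream = stream
        self._interval = interval
        self._buffer: list = []
        self._task: asyncio.Task | None = None

    async def start(self) -> None:
        # start the periodic flush task
        self._task = asyncio.create_task(self._flush_loop())

    def send(self, message) -> None:
        # no await needed: just buffer the message
        self._buffer.append(message)

    async def stop(self) -> None:
        if self._task is not None:
            self._task.cancel()
        await self._flush()  # flush whatever is left

    async def _flush_loop(self) -> None:
        while True:
            await asyncio.sleep(self._interval)
            await self._flush()

    async def _flush(self) -> None:
        if self._buffer:
            batch, self._buffer = self._buffer, []
            await self._producer.publish_batch(self._stream, batch)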

@DanielePalaia (Collaborator, Author)

Hi @qweeze, yes, I agree that it is a good idea to separate the two versions, but the main idea is to have publish() run asynchronously by default and leave only batch_send() running synchronously, in line with the other clients.

For the moment I just left the send_batch_enabled parameter in order to be able to do some tests/performance profiling comparing the older and newer versions of publish(), but I was planning to remove the synchronous part eventually.

Another part of the task will be not to let publish_batch block on the confirmation, but to handle the confirmation on a separate thread on the client side.

Also, after discussing with @Gsantomaggio: while it is a good idea to rely on a context manager to handle a few things, we prefer the API to be similar to the other clients (having both background_sender and send here may be too different from the other clients' APIs).

@qweeze (Owner) commented Feb 25, 2023

> have publish() run asynchronously by default and leave only batch_send() running synchronously, in line with the other clients

Oh I see. As a reference, aiokafka provides two methods - send and send_and_wait, maybe we can do something similar?
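For reference, aiokafka usage looks roughly like this (just to illustrate the split between fire-and-forget and waiting for the acknowledgement):

from aiokafka import AIOKafkaProducer

async def example() -> None:
    producer = AIOKafkaProducer(bootstrap_servers="localhost:9092")
    await producer.start()
    try:
        # fire-and-forget: returns a future, delivery happens in the background
        await producer.send("my-topic", b"message 1")
        # waits until the broker acknowledges the record
        record_metadata = await producer.send_and_wait("my-topic", b"message 2")
        print(record_metadata.offset)
    finally:
        await producer.stop()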

> not to let publish_batch block on the confirmation, but to handle the confirmation on a separate thread on the client side

I'm not sure I fully understand this, could you please explain a bit more? Or maybe provide a code snippet

@Gsantomaggio (Collaborator)

> I'm not sure I fully understand this, could you please explain a bit more? Or maybe provide a code snippet

By protocol, the publish confirmation is asynchronous; see for example the DotNet client, where the confirmation is handled by a separate thread. This improves performance because producing messages is not blocked waiting for confirmations.

DotNet (like the other clients) implements a backpressure pattern to avoid flooding the server; see the MaxInFlight parameter in:
https://rabbitmq.github.io/rabbitmq-stream-dotnet-client/stable/htmlsingle/index.html#_creating_a_producer

We'd like to implement this client in the same way, avoiding blocking the publisher while waiting for confirmations.

So this client would have:

  • send: asynchronous; the client automatically buffers and sends the messages

  • batch_send: synchronous; the user buffers the messages and sends them

  • publish_confirm: a callback where the user receives the confirmation ids on a separate thread

In this way the Python client will have the same behaviour as the other clients.
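Roughly, from the user's point of view it could look like this (only a sketch; method and parameter names are illustrative, not a final API):

confirmed: list[int] = []

def on_publish_confirm(publishing_ids: list[int]) -> None:
    # invoked by the client, separately from the send path, when the server confirms
    confirmed.extend(publishing_ids)

async def produce(producer, stream: str) -> None:
    # send: asynchronous, the client buffers and flushes in the background
    for i in range(1000):
        await producer.send(stream, f"msg {i}".encode(), on_publish_confirm=on_publish_confirm)

    # batch_send: synchronous, the user builds the batch explicitly
    batch = [f"batch {i}".encode() for i in range(100)]
    await producer.batch_send(stream, batch, on_publish_confirm=on_publish_confirm)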

Let us know, so we can go ahead.

thank you

@qweeze (Owner) commented Feb 27, 2023

Thanks for clarifying @Gsantomaggio
I'm totally fine with your plan, just a few comments to make sure there's no misunderstanding:

> the publish confirmation is asynchronous and is handled by a separate thread

Currently we also have asynchronous confirmation handling, done in a separate coroutine (Client._listener), so the difference is that you want users to be able to pass their own handler function, right?

> This improves performance because producing messages is not blocked waiting for confirmations

We currently have a sync flag; with sync=False, publish() doesn't wait for confirmation:

for msg in messages:
    await producer.publish(stream, msg, sync=False)

(But in terms of performance it still waits for each message to be sent; that's why batching is faster than consecutive calls to publish().)

1. With automatic buffering, do you want to support the publisher_name parameter? If yes, how will the buffering work in such a case?

2. Another thing I just want to point out: if we allow handling confirmations only in a callback, then the scenario where a user wants to publish a message and make sure it is delivered becomes much harder to implement. So maybe we can have both sync and async methods, like send and send_and_wait / send(wait=True)?

@Gsantomaggio (Collaborator)

> With automatic buffering, do you want to support the publisher_name parameter? If yes, how will the buffering work in such a case?

publisher_name (a.k.a. Reference) is needed only for deduplication; by default it should not be used.

Java and DotNet handle it in two different ways. For example, in DotNet there is a specific class for that, where you explicitly pass the id. The idea would be to have something similar here.

> send and send_and_wait / send(wait=True)?

OK, so:

1. send is asynchronous: the client buffers and sends the messages.
2. send_batch is synchronous.
3. send_wait is synchronous (it must wait for the confirmation).

1 and 2 use the user callback for confirmation.

1. Easy to use and fast for throughput.
2. Useful for latency; there is an interesting thread about that (Go client: batch vs send batch).
3. Covers use cases where the user doesn't need performance but wants more control over confirmation. But here we'd need to introduce some logic, for example:

  • What happens if the client does not receive the confirmation within X seconds? Otherwise the wait is not helpful and is unpredictable (one possible approach is sketched below).
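For instance, a send_wait with a timeout could be built around asyncio.wait_for; this is only a sketch and assumes a hypothetical call that returns a future resolved by the confirmation handler (all names are illustrative):

import asyncio

async def send_wait(producer, stream: str, message: bytes, timeout: float = 5.0) -> int:
    # hypothetical: returns an asyncio.Future resolved with the publishing id on confirmation
    confirmation = await producer.publish_with_future(stream, message)
    try:
        # wait for the server confirmation, but no longer than `timeout` seconds
        return await asyncio.wait_for(confirmation, timeout=timeout)
    except asyncio.TimeoutError:
        raise TimeoutError(f"no confirmation received within {timeout} seconds")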

@Gsantomaggio mentioned this pull request on Feb 28, 2023
@DanielePalaia force-pushed the implement_async_publish branch 6 times, most recently from e2c4845 to d979f15 on February 28, 2023 14:36
@DanielePalaia (Collaborator, Author)

Hi @Gsantomaggio and @qweeze, I updated the PR with the API we discussed! Feel free to have a look at it. For the moment I updated the tests to use send_wait and just added one test with send (asynchronous). I will add new ones afterwards.

@Gsantomaggio (Collaborator)

OK, great @DanielePalaia, this way we have the same send API as the other clients. You should also update the README with a basic example.

We will write the documentation covering all the other methods.

@Gsantomaggio marked this pull request as ready for review on March 1, 2023 08:18
@Gsantomaggio (Collaborator) left a comment

@DanielePalaia Can you please update the README?

@DanielePalaia force-pushed the implement_async_publish branch 5 times, most recently from c1a4127 to 6b9b2bc on March 1, 2023 08:44
@DanielePalaia (Collaborator, Author)

@Gsantomaggio Done!

@DanielePalaia force-pushed the implement_async_publish branch 2 times, most recently from 1e12e18 to d86b7fe on March 1, 2023 13:35
@DanielePalaia (Collaborator, Author)

Added a few more tests for send() as well.

@qweeze (Owner) left a comment

@DanielePalaia I added some comments, mostly minor issues (code style, etc.)

rstream/producer.py: several review threads (outdated, resolved).
async with self._buffered_messages_lock:
    self._buffered_messages[stream].append(message)

await asyncio.sleep(0)
@qweeze (Owner):

Hmm, do we really need it here? This statement only makes sense if there are no other await statements in a function and we want to force a context switch anyway.

@DanielePalaia (Collaborator, Author) commented Mar 2, 2023

@qweeze, I would like to discuss this point because I think it's the most important one. I agree with you, but in the tests I made it appears that if I don't force the context switch there, all the CPU is taken by the calling coroutine, which is continuously calling send(), and the background task never even gets activated. I'm not really an asyncio expert, but I think it is related to what is explained in this article: https://towardsdatascience.com/asyncio-is-not-parallelism-70bfed470489.
Also, from what I have seen, using asyncio.sleep(0) is not considered a bad practice with this library after all (https://superfastpython.com/asyncio-sleep/).
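As a standalone illustration of the behaviour I am seeing (not our actual code, just a minimal example):

import asyncio

messages_buffered = 0

async def background_flusher() -> None:
    # stands in for the periodic batch-sending task
    while True:
        print(f"flushing, {messages_buffered} messages buffered so far")
        await asyncio.sleep(0.1)

async def busy_sender() -> None:
    global messages_buffered
    for _ in range(100_000):
        messages_buffered += 1   # the buffer append itself is synchronous
        await asyncio.sleep(0)   # remove this line and the flusher never runs

async def main() -> None:
    task = asyncio.create_task(background_flusher())
    await busy_sender()
    task.cancel()

asyncio.run(main())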

@DanielePalaia (Collaborator, Author):

I was wondering whether starting the task before we start looping over send() might improve the situation without forcing us to use the sleep. I will do a few tests.

@DanielePalaia (Collaborator, Author):

No, I see the same behaviour even when starting the task before looping over send(); it seems like the sleep() is necessary.

@qweeze (Owner):

Yes, that's the expected behavior, because asyncio's concurrency model is different from threading (cooperative vs preemptive multitasking): the event loop can only switch to another coroutine when the current coroutine is suspended waiting for a future result. That's why any long-running sync code effectively blocks the event loop and is considered a bad thing when writing asyncio apps.

In general, as library authors we can't prevent users from blocking the event loop; for example, someone can just put time.sleep(5) in their code and it will block our background _timer task. But a programmer is expected to be aware that they should avoid blocking code.

In this specific case with the send method, I think the problem is that the method is async but it actually awaits I/O only on the first call, when self._publishers and self._clients are not yet initialized. On subsequent calls all the operations are sync, which makes the call effectively blocking. So I agree that we can use asyncio.sleep(0) here for consistency.

Alternatively, we could use asyncio.Queue instead of a list for buffering messages; that way we would have a "natural" context switch and also a way to limit the number of messages in the buffer (and _buffered_messages_lock would become unnecessary).
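Roughly something like this (just a sketch, not tied to the actual Producer internals):

import asyncio

class MessageBuffer:
    def __init__(self, max_size: int = 10_000):
        # bounded queue: put() suspends when the buffer is full,
        # which gives us backpressure for free
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=max_size)

    async def put(self, message: bytes) -> None:
        # awaiting put() yields to the event loop,
        # so the background sender task gets a chance to run
        await self._queue.put(message)

    async def get_batch(self, max_batch: int = 300) -> list[bytes]:
        # wait for at least one message, then drain whatever else is available
        batch = [await self._queue.get()]
        while len(batch) < max_batch and not self._queue.empty():
            batch.append(self._queue.get_nowait())
        return batch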

@Gsantomaggio (Collaborator):

Thanks for the feedback.
OK, great, we will try with asyncio.Queue. If there are problems with asyncio.Queue we will merge the current version, provided there is no other feedback.

@DanielePalaia (Collaborator, Author):

I made a few tests trying to use asyncio.Queue, but it seems like performance deteriorates even after removing the old lock we used for the list. Also, send internally uses send_batch, which takes a normal list as input, so some sort of conversion is needed. I think for the moment we can keep the sleep(0) and open a new issue to investigate the usage of asyncio.Queue further.

rstream/producer.py: two more review threads (outdated, resolved).
@DanielePalaia force-pushed the implement_async_publish branch 2 times, most recently from 6c28751 to 9bc4bce on March 1, 2023 17:07
@DanielePalaia (Collaborator, Author)

Hi @qweeze, thanks a lot for your feedback. Yes, I agree that we should definitely go with PEP 8. I have already fixed a few straightforward suggestions and will mark them as resolved. I will come back to you tomorrow on the ones that may require some discussion.

@DanielePalaia force-pushed the implement_async_publish branch 2 times, most recently from 9284c39 to 0e1741b on March 2, 2023 13:33
@DanielePalaia (Collaborator, Author)

Hi @qweeze, I revised the PR based on the input you provided, except for the two open points that may require more discussion!

@DanielePalaia (Collaborator, Author)

Hi, I made the last fixes as suggested. Sending 1 million messages to a stream with the old send (non-blocking) was taking around a minute, while now with the buffering mechanism we are at around 38 seconds. From here I will open new issues regarding the on_publish_confirmation callback we discussed earlier, the usage of asyncio.Queue, and possibly improving the locking mechanisms.

@DanielePalaia merged commit 1896383 into qweeze:master on Mar 7, 2023
@DanielePalaia deleted the implement_async_publish branch on March 7, 2023 08:16