
Support - Batching when producing #7

Open
blankensteiner opened this issue Oct 1, 2019 · 7 comments
Labels
enhancement New feature or request

Comments

@blankensteiner
Contributor

As described here:

@blankensteiner blankensteiner added the enhancement New feature or request label Oct 1, 2019
@blankensteiner blankensteiner changed the title Support - Batching Support - Batching when producing Oct 14, 2019
@blankensteiner
Contributor Author

I was a bit surprised when I realized how batching is implemented.
When sending a batch to the server, the command is relatively straightforward:

  • Frame size
  • Command size
  • Command (Send)
  • Magic number
  • Checksum
  • MessageMetadata size
  • MessageMetadata
  • Payload is a sequence of:
    • SingleMessageMetadata size
    • SingleMessageMetadata
    • Payload
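
For illustration, the per-message framing in the payload could be assembled like this (a hypothetical helper, not DotPulsar's actual implementation; it assumes each entry arrives as already-serialized `SingleMessageMetadata` bytes plus the raw payload, and that sizes are 4-byte big-endian integers as elsewhere in the Pulsar binary protocol):

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

public static class BatchPayload
{
    // Concatenates [size][SingleMessageMetadata][payload] for each entry,
    // matching the payload layout listed above.
    public static byte[] Assemble((byte[] Metadata, byte[] Payload)[] entries)
    {
        using var stream = new MemoryStream();
        Span<byte> size = stackalloc byte[4];
        foreach (var (metadata, payload) in entries)
        {
            BinaryPrimitives.WriteInt32BigEndian(size, metadata.Length);
            stream.Write(size);     // SingleMessageMetadata size
            stream.Write(metadata); // SingleMessageMetadata
            stream.Write(payload);  // Payload
        }
        return stream.ToArray();
    }
}
```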

I was expecting the server to unwrap the command and store each message individually in BookKeeper, but the batch is actually stored as a single entry and is therefore also delivered to the reader/consumer as one message. This means that:

  • All readers and consumers MUST be able to read batched messages.
  • The consumer's cursor is only moved forward when the consumer has acknowledged all the messages in a batch.
  • Messages with delayed delivery (DeliverAt) cannot be batched.
  • Since PartitionKey and OrderingKey can be set per SingleMessageMetadata, a single batch can contain a mix of these. We need to examine how this is handled.

We have implemented support for reading and consuming batched messages (from version 0.6.0), but producing batched messages is currently on hold.

@eaba

eaba commented Jan 10, 2020

Hey @blankensteiner happy new year to you!
Any ETA on this?

@blankensteiner
Contributor Author

Hi @eaba
Thanks and a happy new year to you too! :-D
Currently, all my time is spent on implementing OpenShift, and after that we will be looking into implementing Pulsar here at Danske Commodities. That will mean more development time and a lot of developers using DotPulsar and being able to contribute (we have close to 30 developers here).
So the status right now is that it is not being worked on and we have no ETA. I doubt this feature will be requested within Danske Commodities, so I hope someone in the community will consider implementing it.

@eaba

eaba commented Jan 10, 2020

Thanks for the response - just seeing this after posting a new issue!

@blankensteiner
Contributor Author

@RobertIndie We need to add these methods to IProducerBuilder

/// <summary>
/// Set the maximum number of messages permitted in a batch. The default is 1000.
/// </summary>
IProducerBuilder BatchingMaxMessagesPerBatch(int maxMessagesPerBatch);

/// <summary>
/// Set the time period within which the messages sent will be batched. The default is 1 ms.
/// </summary>
IProducerBuilder BatchingMaxPublishDelay(TimeSpan maxPublishDelay);

/// <summary>
/// Control whether automatic batching of messages is enabled for the producer. The default is 'false'.
/// </summary>
IProducerBuilder BatchingEnabled(bool batchingEnabled);
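
A hypothetical usage sketch of these methods, assuming the existing `NewProducer()`/`Topic(...)`/`Create()` builder flow in DotPulsar (topic name and values are examples only):

```csharp
// Configure a producer with the proposed batching options.
var producer = client.NewProducer()
    .Topic("persistent://public/default/mytopic")
    .BatchingEnabled(true)
    .BatchingMaxMessagesPerBatch(500)
    .BatchingMaxPublishDelay(TimeSpan.FromMilliseconds(5))
    .Create();
```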

This will require us to add these properties to ProducerOptions

/// <summary>
/// Set the maximum number of messages permitted in a batch. The default is 1000.
/// </summary>
public int BatchingMaxMessagesPerBatch { get; set; }

/// <summary>
/// Set the time period within which the messages sent will be batched. The default is 1 ms.
/// </summary>
public TimeSpan BatchingMaxPublishDelay { get; set; }

/// <summary>
/// Control whether automatic batching of messages is enabled for the producer. The default is 'false'.
/// </summary>
public bool BatchingEnabled { get; set; }
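
The defaults stated in the XML docs could then be set in the options type itself (a sketch; whether ProducerOptions uses a constructor or property initializers for defaults is an assumption):

```csharp
public ProducerOptions()
{
    BatchingEnabled = false;                                // batching off by default
    BatchingMaxMessagesPerBatch = 1000;                     // at most 1000 messages per batch
    BatchingMaxPublishDelay = TimeSpan.FromMilliseconds(1); // flush after 1 ms
}
```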

@blankensteiner
Contributor Author

@RobertIndie We need to get 'max_message_size' from 'CommandConnected' to ensure we don't create batches that are too big.
The field is optional, so I guess we need to have a fallback of 5,242,880 bytes (5 MiB)?
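
A sketch of how that fallback could work, assuming CommandConnected exposes a MaxMessageSize property and that 0 means the broker did not provide it (both assumptions):

```csharp
// 5 MiB fallback when the broker does not report max_message_size.
private const int DefaultMaxMessageSize = 5_242_880;
private int _maxMessageSize = DefaultMaxMessageSize;

private void HandleConnected(CommandConnected connected)
{
    // max_message_size is optional; keep the fallback when it is absent (0).
    if (connected.MaxMessageSize > 0)
        _maxMessageSize = connected.MaxMessageSize;
}

// The batch builder then refuses entries that would push a batch past the limit.
private bool Fits(int currentBatchSize, int nextEntrySize)
    => currentBatchSize + nextEntrySize <= _maxMessageSize;
```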

@RobertIndie
Member

> @RobertIndie We need to get 'max_message_size' from 'CommandConnected' to ensure we don't create batches that are too big.
> The field is optional, so I guess we need to have a fallback of 5,242,880 bytes (5 MiB)?

Ok, I agree.
