Message exceeds the maximum size when using Compression #150

Closed
wkuranowski opened this issue Aug 28, 2014 · 11 comments

@wkuranowski

Hi,

I am using an asynchronous producer to send compressed messages to a Kafka cluster, but it looks like those messages exceed the configured maximum size. I get this error on the Kafka broker:

kafka.common.MessageSizeTooLargeException: Message size is 3213102 bytes which exceeds the maximum configured message size of 1000012.

This is my producer config:

producerConfig := sarama.NewProducerConfig()
producerConfig.RequiredAcks = sarama.NoResponse
producerConfig.MaxBufferedBytes = 1000000
producerConfig.MaxBufferTime = 30 * time.Second
producerConfig.Compression = sarama.CompressionSnappy

Everything works great with compression disabled.


eapache commented Aug 28, 2014

Which revision of Sarama are you using? Specifically, does the revision you run include 396afc5 ? If it does, you may want to set producerConfig.BackPressureThresholdBytes to match the broker's configured message.max.bytes (which it looks like you've decreased significantly from the default - is there a reason for that?).
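
(For reference, a minimal sketch of the suggested work-around, extending the config from the original report; the 1000000 value is an assumption matching the default message.max.bytes:)

producerConfig.BackPressureThresholdBytes = 1000000 // assumption: mirrors the broker's message.max.bytes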

@wkuranowski

I am using the latest Sarama.
message.max.bytes is set to 1000000 by default; it's not decreased.

@wkuranowski

Setting producerConfig.BackPressureThresholdBytes to 1000000 (the default value for message.max.bytes) solves the problem. I can even increase this value to about 15000000 and it still works, which is strange and needs more testing.

I wonder about two things:

  1. What is the difference between producerConfig.MaxBufferedBytes and producerConfig.BackPressureThresholdBytes, especially when using the asynchronous producer (QueueMessage)?
  2. How does compression work? Does it compress each message individually, or all messages in a buffer at once? It's strange that I only see this problem with Snappy or GZip enabled.


eapache commented Aug 29, 2014

message.max.bytes is set to 1000000 by default, it's not decreased.

Ah, sorry, I was confusing socket.request.max.bytes with message.max.bytes.

  1. MaxBufferedBytes is how many bytes have to accumulate in order for the producer to send the current batch to the broker. BackPressureThresholdBytes is how many bytes have to accumulate in order for the producer to stop accepting new messages entirely (until it can clear its queue). Both are only used for async (QueueMessage).
  2. It compresses all messages at once, then has to send that as a single "message" because Kafka's protocol is kind of weird. @wvanbergen should we reduce the default back-pressure threshold to 1MB to fix this case?
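
(Illustrative sketch of how the two settings relate under compression; the values are assumptions rather than recommendations, and the field names are the old async-producer API discussed above:)

producerConfig := sarama.NewProducerConfig()
producerConfig.Compression = sarama.CompressionSnappy
// Flush trigger: once this many bytes have accumulated, send the current batch.
producerConfig.MaxBufferedBytes = 500000
// Hard cap on the buffer: stop accepting new messages beyond this point.
// With compression, the whole buffered batch is wrapped into one Kafka
// "message", so keeping this at or below the broker's message.max.bytes
// avoids MessageSizeTooLargeException.
producerConfig.BackPressureThresholdBytes = 1000000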


eapache commented Aug 29, 2014

CC also @graemej would this affect us at all?

@wkuranowski

@eapache thanks for the clarification. If I understand correctly, MaxBufferedBytes should be the maximum size of a batch to send, and BackPressureThresholdBytes is the buffer for all messages waiting to be sent?

If so, then I am not sure reducing BackPressureThresholdBytes is a good solution. I think the back-pressure threshold should be much bigger than the size of a single batch.


eapache commented Aug 29, 2014

MaxBufferedBytes is just a trigger saying "when I have this many bytes, send another message". BackPressureThresholdBytes is, effectively, the maximum size of the buffer.

Perhaps adding a MaxMessageBytes which defaults to 1MB to match Kafka would be a more appropriate, though also more complex, fix.
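
(A hypothetical sketch of what such a check might look like inside the producer; MaxMessageBytes, encodedBatch, and the error handling here illustrate the proposal and are not actual Sarama code:)

// Before flushing, verify the encoded (possibly compressed) batch fits.
if len(encodedBatch) > producerConfig.MaxMessageBytes {
    // assumption: fail locally instead of letting the broker reject the batch
    return fmt.Errorf("batch of %d bytes exceeds MaxMessageBytes (%d)",
        len(encodedBatch), producerConfig.MaxMessageBytes)
}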

@wkuranowski

MaxMessageBytes sounds good. I think it is the only way not to exceed the limits defined on the Kafka broker.

@wkuranowski

Any progress on that issue?


eapache commented Sep 3, 2014

Unfortunately not; we've all been quite busy with other things, and this isn't a terribly high priority for us since setting BackPressureThresholdBytes is a functional work-around. We're happy to accept pull requests if you want to do it yourself.

eapache added a commit that referenced this issue Oct 15, 2014
Should address #150 for producer-ng at least.

eapache commented Nov 13, 2014

Just merged the new producer, which should fix this issue among many others.
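
(For anyone landing here later, a minimal sketch using the newer producer API; the field names below match later Sarama releases and are an assumption about the exact version merged here, and the broker address and topic are placeholders:)

config := sarama.NewConfig()
config.Producer.Compression = sarama.CompressionSnappy
config.Producer.MaxMessageBytes = 1000000 // keep at or below the broker's message.max.bytes

producer, err := sarama.NewAsyncProducer([]string{"localhost:9092"}, config)
if err != nil {
    log.Fatal(err)
}
defer producer.Close()
producer.Input() <- &sarama.ProducerMessage{
    Topic: "events", // hypothetical topic
    Value: sarama.StringEncoder("hello"),
}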

eapache closed this as completed Nov 13, 2014