Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Construct produce requests incrementally #433

Closed
eapache opened this issue Apr 25, 2015 · 2 comments
Closed

Construct produce requests incrementally #433

eapache opened this issue Apr 25, 2015 · 2 comments

Comments

@eapache
Copy link
Contributor

eapache commented Apr 25, 2015

Between the messageAggregator and the flusher we actually re-shuffle all messages to produce three times: once into a single []*ProducerMessage for batching purposes, once into a map[string]map[int32][]*ProducerMessage to aid internal state transitions, and finally into the actual ProduceRequest structure. This whole process is rather silly and over-complicated.

Once #300 lands, we should be able to simplify this substantially. With the appropriate re-organization of state, the messageAggregator should be able to put messages directly into a ProduceRequest as they arrive, getting rid of both existing re-shuffling passes.


This change will also enable one other subtle optimization. Since compressed messages are wrapped together and sent as the payload of a single "message" in the protocol, the total size of compressed messages sent is not just limited by the MaxRequestSize, but also by MaxMessageBytes which is typically much smaller.

However, while MaxRequestSize is per-request, MaxMessageSize (as it applies to compressed message sets masquerading as single messages) is per partition. Unfortunately, the current messageAggregator enforces this limit per request because it doesn't have the state to calculate it per-partition. This has the effect of artificially limiting throughput when compression is enabled and multiple topics are being produced to the same broker (you end up with each request limited to 1MB instead of each partition of each request being limited to 1MB).

eapache added a commit that referenced this issue Aug 13, 2015
Put them in a map right up front in the aggregator, it only requires tracking
one exta piece of metadata (total messages in the map) and it means we don't
have to shuffle them into this form before constructing the request anyways.

One piece of #433.
@eapache
Copy link
Contributor Author

eapache commented Sep 25, 2015

#449 found a problem in this area which #538 fixed in a rather short-term hacky way. However this ends up re-organized must solve those problems as well.

@eapache
Copy link
Contributor Author

eapache commented Oct 16, 2015

Closed by #551

@eapache eapache closed this as completed Oct 16, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant