
KAFKA-5032: Update the docs for message size configs across the board #3374

Closed
wants to merge 2 commits

Conversation

@apurvam (Contributor) commented Jun 19, 2017

Before 0.11, we used to have limits for maximum message size on the producer, broker, and consumer side.

From 0.11 onward, these limits apply to record batches as a whole. This patch updates the documentation of the configs to make this explicit.

A separate patch will have more extensive upgrade notes to tie all the changes together in one narrative.
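
For readers without the surrounding context, here is a minimal sketch of the configs whose docs this patch touches. The config names are the real Kafka names discussed in this PR; the numeric values are assumed defaults of the 0.11 era and purely illustrative, not taken from the patch:

import java.util.Properties;

// Sketch only: one size-related config per side, with assumed default values.
public class MessageSizeConfigs {
    public static void main(String[] args) {
        Properties producer = new Properties();
        // Producer: caps the total request size, and hence the size of a
        // single record batch the producer will send.
        producer.put("max.request.size", "1048576");

        Properties broker = new Properties();
        // Broker (server.properties): the largest record batch accepted on
        // write; can be overridden per topic via max.message.bytes.
        broker.put("message.max.bytes", "1000012");

        Properties consumer = new Properties();
        // Consumer: caps on fetch responses. Not absolute: an oversized first
        // batch is still returned so the consumer can make progress.
        consumer.put("fetch.max.bytes", "52428800");
        consumer.put("max.partition.fetch.bytes", "1048576");

        System.out.println(producer + "\n" + broker + "\n" + consumer);
    }
}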

@apurvam (Contributor, Author) commented Jun 19, 2017

cc @hachikuji @ijuma

I chose not to mention explicit versions in the changes, because the documentation page is itself versioned.

@asfgit commented Jun 19, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk7-scala2.11/5481/
Test PASSed (JDK 7 and Scala 2.11).

@asfgit commented Jun 19, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/5466/
Test PASSed (JDK 8 and Scala 2.12).

@hachikuji (Contributor) left a comment

Thanks for the patch. Left a few comments.

@@ -124,9 +124,9 @@
*/
public static final String FETCH_MAX_BYTES_CONFIG = "fetch.max.bytes";
private static final String FETCH_MAX_BYTES_DOC = "The maximum amount of data the server should return for a fetch request. " +
Contributor:

Maybe we should update MAX_PARTITION_FETCH_BYTES_DOC as well.

"This is not an absolute maximum, if the first message in the first non-empty partition of the fetch is larger than " +
"this value, the message will still be returned to ensure that the consumer can make progress. " +
"The maximum message size accepted by the broker is defined via <code>message.max.bytes</code> (broker config) or " +
"This is not an absolute maximum, if the first record batch in the first non-empty partition of the fetch is larger than " +
Contributor:

I think this is the only place in the consumer documentation where we mention the concept of a record batch. I wonder if we should offer a brief explanation. For example: "Records are fetched in batches by the consumer. If the first record batch in the first non-empty.... "
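
As a sketch, the proposed lead-in combined with the diff text above might read something like this (hypothetical wording only, not the merged text):

// Hypothetical sketch of the suggested consumer doc wording.
class FetchDocSketch {
    static final String FETCH_MAX_BYTES_DOC =
        "The maximum amount of data the server should return for a fetch request. " +
        "Records are fetched in batches by the consumer. This is not an absolute maximum: if the " +
        "first record batch in the first non-empty partition of the fetch is larger than this " +
        "value, the batch will still be returned to ensure that the consumer can make progress.";
}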

@@ -74,7 +74,7 @@
"their data.";

public static final String MAX_MESSAGE_BYTES_CONFIG = "max.message.bytes";
- public static final String MAX_MESSAGE_BYTES_DOC = "This is largest message size Kafka will allow to be " +
+ public static final String MAX_MESSAGE_BYTES_DOC = "This is largest record batch size Kafka will allow to be " +
Contributor:

Hmm... Given the name of this config and the fact that we continue to support the v0/v1 formats, I wonder if we're ready to jump to the "record batch" language just yet. Maybe we could add something like this: "Messages are grouped in batches for efficiency. In previous releases, this applied only if compression was enabled, but it is generally true in the latest message format version. This configuration limits the maximum size of a single uncompressed message for older formats and the size of a single message batch otherwise."

Contributor:

Yeah, the config name is annoying. I would document the current behaviour first and then have additional sentences for the older formats (and probably a brief reference to the unintuitive naming, which should come naturally after the older formats are explained).

Also, we should probably change KafkaConfig.MessageMaxBytesDoc to refer to this variable so that the docs remain consistent.
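
As an aside, the topic-level override mentioned in this thread can be set at topic creation time. A hedged sketch using the 0.11 AdminClient (the broker address, topic name, and 2 MB value are illustrative):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.config.TopicConfig;

public class TopicMaxMessageBytesDemo {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative address
        try (AdminClient admin = AdminClient.create(props)) {
            // Per-topic override of the broker-wide limit: the topic-level
            // key is max.message.bytes (TopicConfig.MAX_MESSAGE_BYTES_CONFIG).
            NewTopic topic = new NewTopic("batch-size-demo", 1, (short) 1)
                    .configs(Collections.singletonMap(
                            TopicConfig.MAX_MESSAGE_BYTES_CONFIG, "2097152"));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}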

@@ -109,8 +109,8 @@

/** <code>max.request.size</code> */
public static final String MAX_REQUEST_SIZE_CONFIG = "max.request.size";
- private static final String MAX_REQUEST_SIZE_DOC = "The maximum size of a request in bytes. This is also effectively a cap on the maximum record size. Note that the server "
-     + "has its own cap on record size which may be different from this. This setting will limit the number of record "
+ private static final String MAX_REQUEST_SIZE_DOC = "The maximum size of a request in bytes. This is also effectively a cap on the maximum record batch size. Note that the server "
Contributor:

Maybe we should move "This is also effectively a cap on ... which may be different from this." to after "This setting will limit ... sending huge requests.". It seems like the latter describes the purpose of the setting while the former is an additional implication.
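
Concretely, the reordering might produce something like this (a sketch only; the truncated tail of the doc string is assumed from the released docs, not from this diff):

// Hypothetical reordered doc string: purpose first, server-side cap second.
class RequestSizeDocSketch {
    static final String MAX_REQUEST_SIZE_DOC =
        "The maximum size of a request in bytes. This setting will limit the number of record " +
        "batches the producer will send in a single request to avoid sending huge requests. " +
        "This is also effectively a cap on the maximum record batch size. Note that the server " +
        "has its own cap on record batch size which may be different from this.";
}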

@@ -542,7 +542,7 @@ object KafkaConfig {
val ReplicaSocketTimeoutMsDoc = "The socket timeout for network requests. Its value should be at least replica.fetch.wait.max.ms"
val ReplicaSocketReceiveBufferBytesDoc = "The socket receive buffer for network requests"
val ReplicaFetchMaxBytesDoc = "The number of bytes of messages to attempt to fetch for each partition. This is not an absolute maximum, " +
"if the first message in the first non-empty partition of the fetch is larger than this value, the message will still be returned " +
"if the first record batch in the first non-empty partition of the fetch is larger than this value, the record batch will still be returned " +
"to ensure that progress can be made. The maximum message size accepted by the broker is defined via " +
Contributor:

Do we want to update "The maximum message size.."?

Contributor:

We should also update ReplicaFetchResponseMaxBytesDoc.
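
For reference, the broker-side replication configs this thread touches, as a hedged sketch (the numeric values are assumed defaults of the era, not taken from this patch):

import java.util.Properties;

// Sketch of the replication-related size configs discussed in this thread.
public class ReplicaFetchConfigsDemo {
    public static void main(String[] args) {
        Properties broker = new Properties();
        broker.put("message.max.bytes", "1000012");                 // largest batch the broker accepts
        broker.put("replica.fetch.max.bytes", "1048576");           // per-partition replica fetch; not absolute
        broker.put("replica.fetch.response.max.bytes", "10485760"); // cap on the whole fetch response; also not absolute
        broker.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}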

Apurva Mehta added 2 commits June 22, 2017 15:20
… since everything is written and read in batches in the current version.
@apurvam force-pushed the KAFKA-5032-message-size-docs branch from 8f404db to c4b4f69 on June 22, 2017 22:20
@asfgit commented Jun 22, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk7-scala2.11/5643/
Test PASSed (JDK 7 and Scala 2.11).

@asfgit commented Jun 22, 2017

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-jdk8-scala2.12/5629/
Test PASSed (JDK 8 and Scala 2.12).

@ijuma (Contributor) left a comment

Thanks for the updates, LGTM. I made a couple of small changes before merging to 0.11.0 and trunk. If @hachikuji has additional feedback, we can do that in a follow-up.

@@ -416,7 +416,8 @@ object KafkaConfig {
"start from " + MaxReservedBrokerIdProp + " + 1."
val MessageMaxBytesDoc = "The maximum message size that the server can receive. Note that this limit also applies " +
"to the total size of a compressed batch of messages (when compression is enabled). Additionally, in versions " +
"0.11 and later, all messages are written as batches and this setting applies to the total size of the batch."
"0.11 and later, all messages are written as batches and this setting applies to the total size of the batch. " +
"This can be set per topic with the topic level <code>max.message.bytes</code> config."
ijuma:

I changed this to refer to the TopicConfig definition, concatenated with the last sentence, before merging.

"they can fetch messages this large. </p> " +
"<p>Messages are grouped in batches for efficiency. In previous releases, this applied only if compression was " +
"enabled, but it is generally true in the latest message format version. This configuration limits the maximum " +
"size of of a single uncompressed message for older formats and the size of a single message batch otherwise. </p>";
ijuma:

I tweaked this a little before merging.

asfgit pushed a commit that referenced this pull request Jun 27, 2017
Before 0.11, we used to have limits for maximum message size on the producer, broker, and consumer side.

From 0.11 onward, these limits apply to record batches as a whole. This patch updates the documentation of the configs to make this explicit.

A separate patch will have more extensive upgrade notes to tie all the changes together in one narrative.

Author: Apurva Mehta <apurva@confluent.io>

Reviewers: Jason Gustafson <jason@confluent.io>, Ismael Juma <ismael@juma.me.uk>

Closes #3374 from apurvam/KAFKA-5032-message-size-docs

(cherry picked from commit f1cc800)
Signed-off-by: Ismael Juma <ismael@juma.me.uk>
@asfgit closed this in f1cc800 on Jun 27, 2017