New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-15307: update/note deprecated configs #14360
Conversation
@@ -259,7 +259,7 @@ <h4><a class="toc-backref" href="#id23">num.standby.replicas</a><a class="header | |||
</tr> | |||
<tr class="row-even"><td>cache.max.bytes.buffering</td> | |||
<td>Medium</td> | |||
<td colspan="2">Maximum number of memory bytes to be used for record caches across all threads.</td> | |||
<td colspan="2">Maximum number of memory bytes to be used for record caches across all threads. Deprecated.</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add "Deprecated" to the "head line" in L260 instead? (Similar below)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes! done
Not sure if we need to put this into the docs? If we want to say something about it, we could add a link to the corresponding KIP, ie, "Deprecated via KIP-xxx" -- the KIP will contain the necessary information. What we should do in any case is, link to configs that should be used instead of the deprecated ones (seems we cannot do this in this PR though, as some new config are still missing in the doc -- or we add them in this PR directly)
Yes, very good idea to extend this code and also log an error if other deprecated configs are used. |
happy to add those new config in this PR -- shall I also add that |
Great.
Guess both ways work, but I tend to think splitting out a new PR might be better. |
Here's the new PR: #14448 |
This PR is being marked as stale since it has not had any activity in 90 days. If you would like to keep this PR alive, please ask a committer for review. If the PR has merge conflicts, please update it with the latest from trunk (or appropriate release branch) If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed. |
<tr class="row-even"><td>cache.max.bytes</td> | ||
<td>Medium</td> | ||
<td colspan="2">Maximum number of memory bytes to be used for record caches across all threads. Note that at the debug level you can use cache.size to monitor the actual size of the Kafka Streams cache.</td> | ||
<td>10485760 bytes</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<td>10485760 bytes</td> | |
<td>10485760</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description already say Maximum number of memory bytes
so seems we can drop bytes
here?
<tr class="row-even"><td>cache.max.bytes.buffering</td> | ||
<tr class="row-even"><td>cache.max.bytes</td> | ||
<td>Medium</td> | ||
<td colspan="2">Maximum number of memory bytes to be used for record caches across all threads. Note that at the debug level you can use cache.size to monitor the actual size of the Kafka Streams cache.</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<td colspan="2">Maximum number of memory bytes to be used for record caches across all threads. Note that at the debug level you can use cache.size to monitor the actual size of the Kafka Streams cache.</td> | |
<td colspan="2">Maximum number of memory bytes to be used for record caches across all threads. Note that at the debug level you can use <code>cache.size</code> to monitor the actual size of the Kafka Streams cache.</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% sue what
Note that at the debug level you can use cache.size to monitor the actual size of the Kafka Streams cache
actually mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not done it myself, basing it on what is said in the release https://www.confluent.io/blog/apache-kafka-3-4-0-new-features-and-updates/
@@ -257,7 +258,12 @@ <h4><a class="toc-backref" href="#id23">num.standby.replicas</a><a class="header | |||
<td colspan="2">The maximum number of records to buffer per partition.</td> | |||
<td>1000</td> | |||
</tr> | |||
<tr class="row-even"><td>cache.max.bytes.buffering</td> | |||
<tr class="row-even"><td>cache.max.bytes</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<tr class="row-even"><td>cache.max.bytes</td> | |
<tr class="row-even"><td>statestore.cache.max.bytes</td> |
Serde for the inner class of a windowed record. Must implement the org.apache.kafka.common.serialization.Serde interface. | ||
</p> | ||
<p> | ||
Note that setting this config in KafkaStreams application would result in an error as it is meant to be used only from Plain consumer client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence confuses me. I realize that it's also in the Java code, but not sure what it actually means? Maybe @ableegoldman who reviewed the original PR can help out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bear with me here because it's kind of a long story in full and I only half remember it...and also this statement is only half true. A more accurate statement would be something like this:
"Note that this config is only used by plain consumer/producer clients that set a windowed de/serializer type via configs. For Kafka Streams applications that deal with windowed types you must pass in the inner serde type when you instantiate the windowed serde object for your topology"
In other words, there's no concept of a "default windowed serde" type in Streams because you always have to pass in the actual Serde object, and need to pass in the window size and inner serde type at that time. The same can also be done for plain producer/consumer clients but someone pointed out that the console clients have to be configured via properties, which means they can only invoke the default constructor based on the class name passed in for the key/value de/serializer configs. Which meant we needed to add the window size and inner serde type as configs that could be used to configure a windowed Serde that was instantiated via default constructor
I guess we should update the docs string in StreamsConfig as well. I assume this error slipped in because of the window.size config, which is only needed for the consumer client (since the producer client throws away the window size when encoding a windowed type). But both producer and consumer need to know the inner serde type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you Sophie! Changed the sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Sophie.
Note that this config is only used by plain consumer/producer clients
For this case, why are we documenting is in KS docs -- should it not be in clients docs? (Also, this applies to window.size.ms
introduced in 2.8 via https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size, right, but not to windowed.inner.serde.class
which is a KS config added in 3.0 via https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=177047930
In the end, KS module provides the window-serdes and thus it does not make sense to add window.size.ms
to ClientConfig
-- especially, as you pointed out, because is only necessary for console consumer.
Thus, while StreamsConfig#WINDOW_SIZE_MS_CONFIG
must exist as a variable name, I am wondering if it's actually correct that we added it as a StreamsConfig, ie, via define(...)
? Mabye we should do a small KIP and remove it? -- For use, we should not mention window.size.ms
in KS docs on the web-page (at least not for "top level config" -- we should either add it to a "windowed serde" section, or to (console) consumer config section where it belong to)?
window.inner.serde.class
is a KS config and should just be documented in the regular way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess window.size.ms
is just in a weird place somewhere between a client config and a streams config. Given that it's not for use in an actual Kafka Streams application, it makes sense to me that we should not include it in the Streams config docs. Are you proposing to move it to ClientConfig or just not .define
it in any class? In the end, we added it via KIP and would need to move/remove it via KIP, and I'm not sure it's worth doing a KIP over. We can just leave it out of the Streams config docs
Also -- this applies to both window.size.ms
and window.inner.serde.class
, no? I don't understand this bit:
window.inner.serde.class is a KS config and should just be documented in the regular way.
They are essentially the same kind of config, they just refer to the two different parameters of a windowed serde. If window.size.ms
is not a KS config, then neither is window.inner.serde.class
-- and vice versa
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you proposing to move it to ClientConfig or just not .define it in any class? In the end, we added it via KIP and would need to move/remove it via KIP, and I'm not sure it's worth doing a KIP over. We can just leave it out of the Streams config docs
Yes, the idea was to maybe do a small KIP and remove it from StreamsConfig
as it does not belong there. If Lucia is interested in doing it -- I agree it's not super critical, and just removing from the docs is also fine with me.
About window.inner.serde.class
it is indeed used by Kafka Streams itself (cf TimeWindowedSerializer
and others). If one would set default serde to TimeWindowedSerializer
it would be used, so I think it's ok to have it in the docs, and also correct to have it as StreamsConfig
registered via define()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just did some more digging, and now I actually think that @ableegoldman is right, we might want to treat windowed.inner.serde.class
similar to window.size
... (ie, maybe remove from StreamsConfig -- we could add this to the KIP Lucia started).
I also understand now, why the docs says, using it would result in an error (for both configs): Kafka Streams will always pass window-size and inner-serde via the constructor and we will also verify that we don't get an parameter set twice (or zero time), and throw an error inside configure()
method of the windowed serdes...
Thus, we might want to not add windowed.inner.serde.class
to the docs in this PR to begin with...
Sorry for the back and forth. Reading and understanding code is hard...
streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java
Outdated
Show resolved
Hide resolved
streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java
Outdated
Show resolved
Hide resolved
streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java
Outdated
Show resolved
Hide resolved
streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java
Outdated
Show resolved
Hide resolved
add utils CSS class and resolve conflict
space + javascript
Co-authored-by: Matthias J. Sax <mjsax@apache.org>
Co-authored-by: Matthias J. Sax <mjsax@apache.org>
14fb849
to
5a258b4
Compare
Serde for the inner class of a windowed record. Must implement the org.apache.kafka.common.serialization.Serde interface. | ||
</p> | ||
<p> | ||
Note that setting this config in KafkaStreams application would result in an error as it is meant to be used only from Plain consumer client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just did some more digging, and now I actually think that @ableegoldman is right, we might want to treat windowed.inner.serde.class
similar to window.size
... (ie, maybe remove from StreamsConfig -- we could add this to the KIP Lucia started).
I also understand now, why the docs says, using it would result in an error (for both configs): Kafka Streams will always pass window-size and inner-serde via the constructor and we will also verify that we don't get an parameter set twice (or zero time), and throw an error inside configure()
method of the windowed serdes...
Thus, we might want to not add windowed.inner.serde.class
to the docs in this PR to begin with...
Sorry for the back and forth. Reading and understanding code is hard...
tagging @mjsax here, made some edits in response to the last roung |
<td colspan="2">Maximum number of memory bytes to be used for record caches across all threads.</td> | ||
<td>10485760</td> | ||
</tr> | ||
<tr class="row-even"><td>cache.max.bytes.buffering (Deprecated. Use cache.max.bytes instead.)</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<tr class="row-even"><td>cache.max.bytes.buffering (Deprecated. Use cache.max.bytes instead.)</td> | |
<tr class="row-even"><td>cache.max.bytes.buffering (Deprecated. Use statestore.cache.max.bytes instead.)</td> |
@@ -326,8 +331,15 @@ <h4><a class="toc-backref" href="#id23">num.standby.replicas</a><a class="header | |||
the <code>org.apache.kafka.streams.state.DslStoreSuppliers</code> interface. | |||
</td> | |||
<td><code>BuiltInDslStoreSuppliers.RocksDBDslStoreSuppliers</code></td> | |||
<td colspan="2">Default serializer/deserializer for the inner class of windowed keys, implementing the <code class="docutils literal"><span class="pre">Serde</span></code> interface.</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this line does not belong to dsl.store.suppliers.class
config?
</tr> | ||
<tr class="row-even"><td>max.task.idle.ms</td> | ||
<tr class="row-even"><td>default.windowed.value.serde.inner (Deprecated.)</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems default.windowed.value.serde.inner
exist already? Why do we add it here?
@@ -493,7 +493,8 @@ public class StreamsConfig extends AbstractConfig { | |||
public static final String BUILT_IN_METRICS_VERSION_CONFIG = "built.in.metrics.version"; | |||
private static final String BUILT_IN_METRICS_VERSION_DOC = "Version of the built-in metrics to use."; | |||
|
|||
/** {@code cache.max.bytes.buffering} */ | |||
/** {@code cache.max.bytes.buffering} | |||
* @deprecated since 3.4.0 Use {@link #CACHE_MAX_BYTES_CONFIG "cache.max.bytes"} instead. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* @deprecated since 3.4.0 Use {@link #CACHE_MAX_BYTES_CONFIG "cache.max.bytes"} instead. */ | |
* @deprecated since 3.4.0 Use {@link #STATESTORE_CACHE_MAX_BYTES_CONFIG "statestore.cache.max.bytes"} instead. */ |
@SuppressWarnings("WeakerAccess") | ||
@Deprecated | ||
public static final String DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS = "default.windowed.value.serde.inner"; | ||
private static final String DEFAULT_WINDOWED_VALUE_SERDE_INNER_CLASS_DOC = "Default serializer / deserializer for the inner class of a windowed value. Must implement the " + | ||
"<code>org.apache.kafka.common.serialization.Serde</code> interface."; | ||
|
||
public static final String WINDOWED_INNER_CLASS_SERDE = "windowed.inner.class.serde"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems we cannot just remove it, but also marked as deprecated only instead? Also, given that deprecation is part of KIP-1020, should this be only done in the KIP-1020 PR, and this PR would only do the docs cleanup?
@@ -647,7 +645,8 @@ public class StreamsConfig extends AbstractConfig { | |||
@SuppressWarnings("WeakerAccess") | |||
public static final String METRICS_SAMPLE_WINDOW_MS_CONFIG = CommonClientConfigs.METRICS_SAMPLE_WINDOW_MS_CONFIG; | |||
|
|||
/** {@code auto.include.jmx.reporter} */ | |||
/** {@code auto.include.jmx.reporter | |||
* @deprecated and will removed in 4.0.0 Use {@link JMX_REPORTER "jmx.reporter"} instead.} */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find! Did not even know about this one.
@@ -300,7 +305,7 @@ <h4><a class="toc-backref" href="#id23">num.standby.replicas</a><a class="header | |||
set by the user or all serdes must be passed in explicitly (see also default.key.serde).</td> | |||
<td><code class="docutils literal"><span class="pre">null</span></code></td> | |||
</tr> | |||
<tr class="row-even"><td>default.windowed.key.serde.inner</td> | |||
<tr class="row-even"><td>default.windowed.key.serde.inner (Deprecated.)</td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below is default.window.value.serde.inner
which was also deprecated, right? (L308 original, new L313)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had one last comment, but I think you can address it via #14448.
<td>Medium</td> | ||
<td colspan="2">Default serializer/deserializer for the inner class of windowed keys, implementing the <code class="docutils literal"><span class="pre">Serde</span></code> interface.</td> | ||
<td><code class="docutils literal"><span class="pre">null</span></code></td> | ||
</tr> | ||
<tr class="row-odd"><td>default.windowed.value.serde.inner</td> | ||
<tr class="row-odd"><td>default.windowed.value.serde.inner (Deprecated. Use windowed.inner.class.serde instead.)</td> | ||
<td>Medium</td> | ||
<td colspan="2">Default serializer/deserializer for the inner class of windowed values, implementing the <code class="docutils literal"><span class="pre">Serde</span></code> interface.</td> | ||
<td><code class="docutils literal"><span class="pre">null</span></code></td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just see by change, that we use a different formatting for marking a deprecated config:
<tr class="row-even"><td>default.dsl.store</td>
<td>Low</td>
<td colspan="2">
DEPRECATED] The default state store type used by DSL operators. Deprecated in
favor of <code>dsl.store.suppliers.class</code>
</td>
I think we should use a unified formatting -- don't have a preference which one.
Thanks for the PR @Cerchie! Merged to |
First pass at identifying deprecated configs listed in https://kafka.apache.org/21/documentation/streams/developer-guide/config-streams
There might be some discussion to be had regarding what to say about why these are deprecated.
Also, in the StreamsConfig.java line 1355, some logic throws errors when EXACTLY_ONCE, EXACTLY_ONCE_BETA, and RETRIES are used. Should that be updated as well, considering that there are new pieces of config with the
@Deprecated
annotation?Committer Checklist (excluded from commit message)