[FLINK-19247][docs-zh] Update Chinese documentation after removal of Kafka 0.10 and 0.11 #13410
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community review your pull request.

Automated Checks: last check on commit d785011 (Thu Sep 17 12:29:10 UTC 2020) ✅ no warnings. Mention the bot in a comment to re-run the automated checks.

Review Progress: please see the Pull Request Review Guide for a full explanation of the review process. The bot tracks review progress through labels, applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. The @flinkbot bot supports the following commands:
Hi, @klion26
@Shawn-Hx thanks for your contribution, left some comments, please take a look
docs/dev/connectors/kafka.zh.md
Outdated
@@ -23,90 +23,32 @@ specific language governing permissions and limitations
under the License.
-->
Flink 提供了 [Apache Kafka](https://kafka.apache.org) 连接器,用于向 Kafka topic 中读取或者写入数据,可提供精确一次的处理语义。
Regarding “用于向 Kafka topic 中读取或者写入”: should this sentence be split into “从 xxx 读取” and “向 xxx 写入”? As written it reads more like “向 xxx 读取”, which is a bit odd.
要使用通用的 Kafka 连接器,请为它添加依赖关系:
Apache Flink 集成了通用的 Kafka 连接器,它会尽力与 Kafka client 的最新版本保持同步。
该连接器使用的 Kafka client 版本可能会在 Flink 版本之间发生变化。
当前 Kafka client 向后兼容 0.10.0 或更高版本的 Kafka broker。
I'm not quite sure about this one. The original English documentation reads: "Modern Kafka clients are backwards compatible with broker versions 0.10.0 or later". A literal translation of "backwards compatible" should be “向后兼容”, right?
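As context for the dependency discussion above: the universal connector is typically pulled in via a Maven dependency along these lines (the Scala suffix and version are placeholders that depend on the Flink and Scala versions in use, not taken from this PR):

```xml
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-kafka_2.11</artifactId>
  <version><!-- match your Flink version --></version>
</dependency>
```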
docs/dev/connectors/kafka.zh.md
Outdated
要使用通用的 Kafka 连接器,请为它添加依赖关系:
Apache Flink 集成了通用的 Kafka 连接器,它会尽力与 Kafka client 的最新版本保持同步。
该连接器使用的 Kafka client 版本可能会在 Flink 版本之间发生变化。
Suggest putting these on a single line; otherwise, after rendering there will be an extra space between the two lines.
docs/dev/connectors/kafka.zh.md
Outdated
然后,实例化 source( `FlinkKafkaConsumer`)和 sink( `FlinkKafkaProducer`)。除了从模块和类名中删除了特定的 Kafka 版本外,这个 API 向后兼容 Kafka 0.11 版本的 connector。
Flink 目前的流连接器还不是二进制发行版的一部分。
[在此处]({{ site.baseurl }}/zh/dev/project-configuration.html)可以了解到如何链接它们以实现在集群中执行。
1. For links, suggest using the {% link %} form; see the mailing list discussion [1].
2. Regarding “如何链接它们以实现在集群中执行”: could this be polished further? The meaning is understandable, but it still reads a bit awkwardly.
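For reference, instantiating the source and sink mentioned in the hunk above looks roughly like this (a sketch only: topic names, bootstrap servers, and the string schema are illustrative, and the Flink dependencies are assumed to be on the classpath):

```java
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");
props.setProperty("group.id", "test-group");

// Universal connector: no Kafka version in the class names.
FlinkKafkaConsumer<String> source =
    new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), props);
FlinkKafkaProducer<String> sink =
    new FlinkKafkaProducer<>("output-topic", new SimpleStringSchema(), props);

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.addSource(source).addSink(sink);
env.execute("kafka-passthrough"); // throws Exception; wrap accordingly
```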
docs/dev/connectors/kafka.zh.md
Outdated
用户如果要自己去实现一个`DeserializationSchema`,需要自己去实现 `getProducedType(...)`方法。
为了访问 Kafka 消息的 key、value 和元数据,`KafkaDeserializationSchema` 具有以下反序列化方法 `T deserialize(ConsumerRecord<byte[], byte[]> record)`。
Flink Kafka Consumer 需要知道如何将 Kafka 中的二进制数据转换为 Java 或者 Scala 对象。`KafkaDeserializationSchema` 允许用户指定这样的 schema,为每条 Kafka 消息调用 `T deserialize(ConsumerRecord<byte[], byte[]> record)` 方法,传递来自 Kafka 的值。
Regarding “为每条 Kafka 消息调用 `T deserialize(ConsumerRecord<byte[], byte[]> record)` 方法,传递来自 Kafka 的值” — would it read better as “每条 Kafka 中的消息会调用 `T deserialize(ConsumerRecord<byte[], byte[]> record)` 反序列化”?
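As an aside, a minimal implementation of the interface being discussed might look like this (a sketch; the class name and produced tuple type are illustrative, and Flink plus the Kafka client library are assumed as dependencies):

```java
import java.nio.charset.StandardCharsets;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
import org.apache.kafka.clients.consumer.ConsumerRecord;

public class KeyedStringSchema implements KafkaDeserializationSchema<Tuple2<String, String>> {

    @Override
    public boolean isEndOfStream(Tuple2<String, String> nextElement) {
        return false; // unbounded stream, never ends
    }

    @Override
    public Tuple2<String, String> deserialize(ConsumerRecord<byte[], byte[]> record) {
        // Called once per Kafka record; key, value and metadata are all available here.
        String key = record.key() == null ? null : new String(record.key(), StandardCharsets.UTF_8);
        String value = new String(record.value(), StandardCharsets.UTF_8);
        return Tuple2.of(key, value);
    }

    @Override
    public TypeInformation<Tuple2<String, String>> getProducedType() {
        // Must be implemented by hand, as the surrounding text notes.
        return Types.TUPLE(Types.STRING, Types.STRING);
    }
}
```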
docs/dev/connectors/kafka.zh.md
Outdated
在内部,每个 Kafka 分区执行一个 assigner 实例。当指定了这样的 assigner 时,对于从 Kafka 读取的每条消息,调用 `extractTimestamp(T element, long previousElementTimestamp)` 来为记录分配时间戳,并为 `Watermark getCurrentWatermark()`(定期形式)或 `Watermark checkAndGetNextWatermark(T lastElement, long extractedTimestamp)`(打点形式)以确定是否应该发出新的 watermark 以及使用哪个时间戳。
**请注意**:如果 watermark assigner 依赖于从 Kafka 读取的消息来上涨其 watermark (通常就是这种情况),那么所有主题和分区都需要有连续的消息流。否则,整个应用程序的 watermark 将无法上涨,所有基于时间的算子(例如时间窗口或带有计时器的函数)也无法运行。单个的 Kafka 分区也会导致这种反应。这是一个已在计划中的 Flink 改进,目的是为了防止这种情况发生(请见[FLINK-5479: Per-partition watermarks in FlinkKafkaConsumer should consider idle partitions](https://issues.apache.org/jira/browse/FLINK-5479))。同时,可能的解决方法是将*心跳消息*发送到所有 consumer 的分区里,从而上涨空闲分区的 watermark。
**请注意**:如果 watermark assigner 依赖于从 Kafka 读取的消息来上涨其 watermark (通常就是这种情况),那么所有主题和分区都需要有连续的消息流。否则,整个应用程序的 watermark 将无法上涨,所有基于时间的算子(例如时间窗口或带有计时器的函数)也无法运行。单个的 Kafka 分区也会导致这种反应。考虑设置适当的 [idleness timeouts]({{ site.baseurl }}/dev/event_timestamps_watermarks.html#dealing-with-idle-sources) 来缓解这个问题。
Switch the link to the {% link %} form.
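The idleness handling referenced in the new text can be sketched with Flink's `WatermarkStrategy` API (a sketch under assumptions: `MyEvent` and `consumer` are hypothetical names, and the durations are illustrative):

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

// Mark a source subtask idle after 1 minute without records, so that
// empty Kafka partitions do not hold back the overall watermark.
WatermarkStrategy<MyEvent> strategy =
    WatermarkStrategy
        .<MyEvent>forBoundedOutOfOrderness(Duration.ofSeconds(5))
        .withIdleness(Duration.ofMinutes(1));

consumer.assignTimestampsAndWatermarks(strategy);
```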
@@ -302,8 +217,6 @@ Flink Kafka Consumer 支持发现动态创建的 Kafka 分区,并使用精准
默认情况下,是禁用了分区发现的。若要启用它,请在提供的属性配置中为 `flink.partition-discovery.interval-millis` 设置大于 0 的值,表示发现分区的间隔是以毫秒为单位的。
<span class="label label-danger">局限性</span> 当从 Flink 1.3.x 之前的 Flink 版本的 savepoint 恢复 consumer 时,分区发现无法在恢复运行时启用。如果启用了,那么还原将会失败并且出现异常。在这种情况下,为了使用分区发现,请首先在 Flink 1.3.x 中使用 savepoint,然后再从 savepoint 中恢复。
#### Topic 发现 |
Now that this matches the English document more closely, please also add anchor tags to all the headings, following the wiki convention; you can model them on the English version's URLs.
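For context, the partition-discovery switch discussed in the hunk above is a plain consumer property; a minimal sketch (the bootstrap servers, group id, and the 10-second interval are illustrative — only the property key comes from the documentation):

```java
import java.util.Properties;

public class PartitionDiscoveryConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "test-group");
        // Any value > 0 enables partition discovery; the unit is milliseconds.
        props.setProperty("flink.partition-discovery.interval-millis", "10000");
        System.out.println(props.getProperty("flink.partition-discovery.interval-millis"));
    }
}
```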
docs/dev/connectors/kafka.zh.md
Outdated
默认情况下,如果没有为 Flink Kafka Producer 指定自定义分区程序,则 producer 将使用 `FlinkFixedPartitioner` 为每个 Flink Kafka Producer 并行子任务映射到单个 Kafka 分区(即,接收子任务接收到的所有消息都将位于同一个 Kafka 分区中)。
用户可以对如何将数据写到Kafka进行细粒度的控制。你可以通过 producer record:
Suggested change:
- 用户可以对如何将数据写到Kafka进行细粒度的控制。你可以通过 producer record:
+ 用户可以对如何将数据写到 Kafka 进行细粒度的控制。你可以通过 producer record:
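The fine-grained control mentioned here (choosing the target topic and partition per record) can be sketched with a `KafkaSerializationSchema` that builds the `ProducerRecord` itself (a sketch only: the topic name and the 4-partition modulo rule are illustrative, with Flink and the Kafka client library assumed as dependencies):

```java
import java.nio.charset.StandardCharsets;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

KafkaSerializationSchema<String> schema = (element, timestamp) ->
    new ProducerRecord<>(
        "output-topic",                             // target topic, chosen per record
        Math.floorMod(element.hashCode(), 4),       // explicit target partition
        null,                                       // record key (none here)
        element.getBytes(StandardCharsets.UTF_8));  // record value
```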
docs/dev/connectors/kafka.zh.md
Outdated
### Kafka Producer 和容错
启用 Flink 的checkpointing 后,`FlinkKafkaProducer` 可以提供精确一次的语义保证。
Suggested change:
- 启用 Flink 的checkpointing 后,`FlinkKafkaProducer` 可以提供精确一次的语义保证。
+ 启用 Flink 的 checkpointing 后,`FlinkKafkaProducer` 可以提供精确一次的语义保证。
docs/dev/connectors/kafka.zh.md
Outdated
@@ -512,6 +434,17 @@ Flink 通过 Kafka 连接器提供了一流的支持,可以对 Kerberos 配置
有关 Kerberos 安全性 Flink 配置的更多信息,请参见[这里]({{ site.baseurl }}/zh/ops/config.html)。你也可以在[这里]({{ site.baseurl }}/zh/ops/security-kerberos.html)进一步了解 Flink 如何在内部设置基于 kerberos 的安全性。
## 升级到最近的连接器版本
通用的升级步骤概述见 [升级 Jobs 和 Flink 版本指南]({{ site.baseurl }}/ops/upgrading.html)。对于 Kafka,你还需要遵循这些步骤:
Replace the links with the {% link %} form.
Hi, @klion26
* `Semantic.NONE`:Flink 不会有任何语义的保证,产生的记录可能会丢失或重复。
* `Semantic.AT_LEAST_ONCE`(默认设置):可以保证不会丢失任何记录(但是记录可能会重复)
* `Semantic.EXACTLY_ONCE`:使用 Kafka 事务提供精确一次语义。无论何时,在使用事务写入 Kafka 时,都要记得为所有消费 Kafka 消息的应用程序设置所需的 `isolation.level`(`read_committed` 或 `read_uncommitted` - 后者是默认值)。
##### 注意事项 |
Is the anchor tag missing here?
Thanks for the reminder. I have added the tag.
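For context, wiring up the `Semantic.EXACTLY_ONCE` producer described in the list above, together with the `read_committed` isolation level on the consuming side, might look roughly like this (a sketch under assumptions: topic, servers, timeout value and the inline serialization are illustrative; Flink and the Kafka client library are assumed as dependencies):

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties producerProps = new Properties();
producerProps.setProperty("bootstrap.servers", "localhost:9092");
// Transactional writes must complete within the broker's transaction timeout.
producerProps.setProperty("transaction.timeout.ms", "900000");

FlinkKafkaProducer<String> producer = new FlinkKafkaProducer<>(
    "output-topic",
    (KafkaSerializationSchema<String>) (element, timestamp) ->
        new ProducerRecord<>("output-topic", element.getBytes(StandardCharsets.UTF_8)),
    producerProps,
    FlinkKafkaProducer.Semantic.EXACTLY_ONCE);

// Applications reading these topics should skip uncommitted transactional records:
Properties consumerProps = new Properties();
consumerProps.setProperty("isolation.level", "read_committed");
```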
Hi, @klion26.
What is the purpose of the change
Update Chinese documentation after removal of Kafka 0.10 and 0.11.
Brief change log
Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
- @Public(Evolving): (yes / no)
Documentation