[hotfix][kafka][docs] Add warning regarding data losses when writing … #4631

Closed
wants to merge 2 commits
22 changes: 21 additions & 1 deletion docs/dev/connectors/kafka.md
@@ -475,7 +475,13 @@ are other constructor variants that allow providing the following:

### Kafka Producers and Fault Tolerance

With Flink's checkpointing enabled, the Flink Kafka Producer can provide
#### Kafka 0.8

Before 0.9, Kafka did not provide any mechanism to guarantee at-least-once or exactly-once semantics.

#### Kafka 0.9 and 0.10

With Flink's checkpointing enabled, the `FlinkKafkaProducer010` can provide
Contributor: Should this be `FlinkKafkaProducer09` and `FlinkKafkaProducer010`?

Contributor: I can fix when merging.

at-least-once delivery guarantees.
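
A minimal sketch of what this can look like with the 0.10 connector is shown below. It is an illustration under assumptions (a Flink 1.3-era package layout, a local broker, a hypothetical topic name and checkpoint interval), not text taken from this patch:

```java
import java.util.Properties;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class AtLeastOnceKafkaSinkSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpointing must be enabled, otherwise the producer cannot give
        // at-least-once guarantees. The interval is an arbitrary example value.
        env.enableCheckpointing(5000);

        DataStream<String> stream = env.fromElements("a", "b", "c"); // placeholder source

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address

        FlinkKafkaProducer010<String> producer = new FlinkKafkaProducer010<>(
                "my-topic",                  // hypothetical topic name
                new SimpleStringSchema(),
                props);

        // Flush pending records on every checkpoint and fail the sink (rather than
        // only logging) when a write fails, so failures trigger recovery.
        producer.setFlushOnCheckpoint(true);
        producer.setLogFailuresOnly(false);

        stream.addSink(producer);
        env.execute("at-least-once Kafka sink sketch");
    }
}
```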

Besides enabling Flink's checkpointing, you should also configure the setter
@@ -499,6 +505,20 @@ we recommend setting the number of retries to a higher value.
**Note**: There is currently no transactional producer for Kafka, so Flink cannot guarantee exactly-once delivery
into a Kafka topic.

<div class="alert alert-warning">
<strong>Attention:</strong> Depending on your Kafka configuration, you can still experience data loss
even after Kafka acknowledges writes. In particular, keep in mind the following properties in the
Kafka configuration:
<ul>
<li><tt>acks</tt></li>
<li><tt>log.flush.interval.messages</tt></li>
<li><tt>log.flush.interval.ms</tt></li>
<li><tt>log.flush.*</tt></li>
</ul>
The default values for the above options can easily lead to data loss. Please refer to the Kafka
documentation for more details.
</div>
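
As a rough, hedged illustration of the point above (the concrete values are assumptions for the example, not recommendations taken from this patch), durability-related producer settings such as `acks` and `retries` go into the `Properties` handed to the Flink producer, whereas the `log.flush.*` options are broker-side settings in `server.properties`:

```java
import java.util.Properties;

public class DurableProducerConfigSketch {

    // Producer-side settings passed to the Flink Kafka producer.
    // The concrete values below are illustrative assumptions only.
    static Properties producerProperties() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("acks", "all");   // wait for all in-sync replicas to acknowledge a write
        props.setProperty("retries", "3");  // retry transient send failures

        // Note: log.flush.interval.messages, log.flush.interval.ms and the other
        // log.flush.* options are broker settings (server.properties on the brokers),
        // not producer properties, so they cannot be configured here.
        return props;
    }
}
```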

## Using Kafka timestamps and Flink event time in Kafka 0.10

Since Apache Kafka 0.10+, Kafka's messages can carry [timestamps](https://cwiki.apache.org/confluence/display/KAFKA/KIP-32+-+Add+timestamps+to+Kafka+message), indicating