Skip to content

Conversation

@swuferhong
Copy link
Contributor

Purpose

This pr is aims to change writerState flush from async to sync while rolling log to avoid conflicts with the writerState flush during logTablet close.

In the LogTablet#roll() method, the current implementation writes the writerState while holding the lock, but the file flush operation is performed outside the lock. However, this could potentially conflict with the writerState write operation in the LogTablet#close() method. Considering that the writerState flush operation itself is not very resource-intensive and the roll() method isn't called frequently, we consider change it to a sync flush approach.

Brief change log

Tests

API and Format

Documentation

…g to avoid conflicts with the writerState flush during logTablet close
@wuchong
Copy link
Member

wuchong commented Apr 12, 2025

It seems Kafka introduced the async on purpose to address a performance problem when flushing the writer state takes too long. What exception/problem when using async?

https://issues.apache.org/jira/browse/KAFKA-15046
apache/kafka@d71d063

@swuferhong
Copy link
Contributor Author

It seems Kafka introduced the async on purpose to address a performance problem when flushing the writer state takes too long. What exception/problem when using async?

https://issues.apache.org/jira/browse/KAFKA-15046 apache/kafka@d71d063

When testing the Fluss cluster update, the Fluss tabletServer shutdown also needs to call writerStateManager.takeSnapshot(). If this conflicts with the takeSnapshot in the roll() method, it may happen that the shutdown process acquires the lock before the roll() has flushed the snapshot and proceeds to write the snapshot, which will cause errors. Therefore, a better approach than async flushing should be used here to address the issue of blocking writes.

@wuchong wuchong merged commit f3d924e into apache:main Apr 14, 2025
3 checks passed
polyzos pushed a commit to polyzos/fluss that referenced this pull request Apr 27, 2025
…gTablet#roll (apache#729)

This aims to change the writerState flush from async to sync while rolling log to avoid conflicts with the writerState flush during logTablet close.
ZmmBigdata pushed a commit to ZmmBigdata/fluss that referenced this pull request Jun 20, 2025
…gTablet#roll (apache#729)

This aims to change the writerState flush from async to sync while rolling log to avoid conflicts with the writerState flush during logTablet close.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants