
KAFKA-3968: fsync the parent directory of a segment file when the file is created #10405

Merged: 27 commits merged into apache:trunk on Apr 3, 2021

Conversation

@ccding ccding commented Mar 25, 2021

Kafka does not call fsync() on the directory when a new log segment is created and flushed to disk.

The problem is that the following sequence of calls doesn't guarantee file durability:

fd = open("log", O_RDWR | O_CREAT); // suppose open creates "log"
write(fd);
fsync(fd);

If the system crashes after fsync() but before the parent directory has been flushed to disk, the log file can disappear.

This PR flushes the parent directory when flush() is called for the first time.
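For context, the fix amounts to opening the parent directory and forcing it to disk; the PR adds a helper for this to Utils (used as Utils.flushParentDir in a hunk below). A minimal sketch of the idiom, assuming a hypothetical DirFlush class; note that opening a directory as a FileChannel works on Linux but fails on Windows:

    import java.io.IOException;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.StandardOpenOption;

    public final class DirFlush {
        /**
         * Fsync a directory so that a newly created (and already fsync'ed)
         * file inside it survives a crash. On Linux, force(true) on a
         * directory opened read-only translates to fsync(2) on its fd.
         */
        public static void flushDir(Path dir) throws IOException {
            try (FileChannel channel = FileChannel.open(dir, StandardOpenOption.READ)) {
                channel.force(true);
            }
        }
    }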

@ccding ccding changed the title [KAFKA-3968] fsync the parent directory of a segment file when the file is created KAFKA-3968: fsync the parent directory of a segment file when the file is created Mar 25, 2021
ijuma commented Mar 26, 2021

Thanks for the PR. Did we check the performance impact of this change?

@junrao junrao left a comment

@ccding : Thanks for the PR. A couple of comments below.

@@ -427,7 +445,7 @@ public static FileRecords open(File file,
                                    boolean preallocate) throws IOException {
         FileChannel channel = openChannel(file, mutable, fileAlreadyExists, initFileSize, preallocate);
         int end = (!fileAlreadyExists && preallocate) ? 0 : Integer.MAX_VALUE;
-        return new FileRecords(file, channel, 0, end, false);
+        return new FileRecords(file, channel, 0, end, false, mutable && !fileAlreadyExists);
junrao:

The condition mutable && !fileAlreadyExists doesn't seem complete. When a broker is restarted, all existing log segments are opened with mutable and fileAlreadyExists. However, segments beyond the recovery point may not have been flushed before. When they are flushed, we need to also flush the parent directory.

ccding:

Fixed by passing in the hadCleanShutdown flag.
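As a rough illustration of the fix (not the exact committed lines; it reuses the names from the hunk above), the condition becomes something like:

    // Hypothetical revision: also mark a segment for a parent-directory flush
    // after an unclean shutdown, since segments beyond the recovery point may
    // never have been flushed before the crash.
    boolean needFlushParentDir = mutable && (!fileAlreadyExists || !hadCleanShutdown);
    return new FileRecords(file, channel, 0, end, false, needFlushParentDir);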

@@ -249,6 +266,7 @@ public void renameTo(File f) throws IOException {
         } finally {
             this.file = f;
         }
+        needFlushParentDir.set(true);
junrao:

Hmm, this seems problematic. For example, when we do log cleaning, the steps are: (1) write cleaned data to a new segment with a .clean suffix; (2) flush the new segment; (3) rename the .clean file to .swap; (4) rename .swap to .log. There is no additional flush called after renaming, so this flag won't trigger the flushing of the parent directory.

One way is to add a method that explicitly forces the flushing of the parent directory after renaming, and add the call after step 4.

It also seems that we need the logic to flush the parent directory of topic-partition. This is needed when a new topic partition is added/deleted on a broker, or when moving a partition across disks in JBOD. The latter has the following steps: (1) copy the log segments in directory topic-partition on one disk to directory topic-partition-future on another disk; (2) once the copying is done, rename topic-partition-future to topic-partition. After step (2), it seems that we need to flush the parent directory on both the old and the new disk.

ccding:

Fixed both.
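For illustration, the explicit flush after the final rename in step (4) might look roughly like this (a sketch assuming the hypothetical DirFlush helper from earlier; CleanerSketch and commitCleanedSegment are illustrative names, not the committed code):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    final class CleanerSketch {
        // Rename the cleaned ".swap" segment to ".log", then fsync the parent
        // directory so that the rename itself survives a crash.
        static void commitCleanedSegment(Path swapFile, Path logFile) throws IOException {
            Files.move(swapFile, logFile, StandardCopyOption.ATOMIC_MOVE);
            DirFlush.flushDir(logFile.toAbsolutePath().getParent());
        }
    }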

@ccding ccding requested a review from junrao March 31, 2021 16:52
@junrao junrao left a comment

@ccding : Thanks for the updated PR. A few more comments.

@@ -195,6 +199,17 @@ public int append(MemoryRecords records) throws IOException {
      */
     public void flush() throws IOException {
         channel.force(true);
+        if (needFlushParentDir.getAndSet(false)) {
junrao:

Ideally, we want to flush the parent dir first before setting needFlush to false.

ccding (Apr 1, 2021):

Per our offline discussion, we leave it unchanged. If the flush causes an IOException, the partition goes offline and doesn't get a further chance to call flush again.

* Flush the parent directory of a file to the physical disk, which makes sure the file is accessible after crashing.
*/
public void flushParentDir() throws IOException {
needFlushParentDir.set(false);
junrao:

Ideally, we want to flush the parent dir first before setting needFlush to false.

ccding (Apr 1, 2021):

Same as above.

Setting the flag first prevents other threads from calling flush concurrently.
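For readers following the ordering debate, here is a simplified model of the agreed behavior (a sketch, not the committed code; SegmentFileSketch is an illustrative name and DirFlush is the hypothetical helper from earlier):

    import java.io.File;
    import java.io.IOException;
    import java.nio.channels.FileChannel;
    import java.util.concurrent.atomic.AtomicBoolean;

    class SegmentFileSketch {
        private final File file;
        private final FileChannel channel;
        private final AtomicBoolean needFlushParentDir;

        SegmentFileSketch(File file, FileChannel channel, boolean needFlushParentDir) {
            this.file = file;
            this.channel = channel;
            this.needFlushParentDir = new AtomicBoolean(needFlushParentDir);
        }

        public void flush() throws IOException {
            channel.force(true);
            // getAndSet(false) before flushing: at most one thread wins the right
            // to flush the parent directory. If flushDir() then throws, the
            // partition goes offline and never flushes again, so clearing the
            // flag early is acceptable (per the offline discussion above).
            if (needFlushParentDir.getAndSet(false)) {
                DirFlush.flushDir(file.toPath().toAbsolutePath().getParent());
            }
        }
    }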

@@ -427,7 +442,7 @@ public static FileRecords open(File file,
                                    boolean preallocate) throws IOException {
         FileChannel channel = openChannel(file, mutable, fileAlreadyExists, initFileSize, preallocate);
         int end = (!fileAlreadyExists && preallocate) ? 0 : Integer.MAX_VALUE;
-        return new FileRecords(file, channel, 0, end, false);
+        return new FileRecords(file, channel, 0, end, false, mutable);
junrao:

This is ok but not the most accurate. We only need to set the flush flag to true if it's mutable and log recovery is needed.

ccding:

Fixed by passing in hadCleanShutdown.

@@ -490,11 +490,14 @@ class LogSegment private[log] (val log: FileRecords,
    * Change the suffix for the index and log files for this log segment
    * IOException from this method should be handled by the caller
    */
-  def changeFileSuffixes(oldSuffix: String, newSuffix: String): Unit = {
+  def changeFileSuffixes(oldSuffix: String, newSuffix: String, needFlushParentDir: Boolean = true): Unit = {
junrao:

Hmm, we need to pass needFlushParentDir to each of log.renameTo and index.renameTo to disable flushing, right?

ccding:

Set the flag to false for rename.

@@ -848,6 +849,7 @@ class LogManager(logDirs: Seq[File],
       val dir = new File(logDirPath, logDirName)
       try {
         Files.createDirectories(dir.toPath)
+        Utils.flushParentDir(dir.toPath)
junrao:

I am wondering if we should flush the parent dir when we delete a log too. This is not strictly required for every delete. So one option is to flush every parent dir when closing the LogManager.

ccding:

Per our offline discussion, we decided not to flush at deletion. Deletions are async and can be retried after rebooting.

@junrao junrao left a comment

@ccding : Thanks for the updated PR. Just a couple of minor comments.

Also, as Ismael mentioned, it would be useful to run some perf tests (e.g. ProducerPerformance) to see if there is any noticeable performance degradation with the PR.

@@ -320,7 +320,7 @@ class Log(@volatile private var _dir: File,
     initializeLeaderEpochCache()
     initializePartitionMetadata()

-    val nextOffset = loadSegments()
+    val nextOffset = loadSegments(hadCleanShutdown)
junrao:

There is no need to pass hadCleanShutdown in since it's already accessible from loadSegments().

ccding:

Fixed.

    }

    public static FileRecords open(File file,
                                   boolean fileAlreadyExists,
                                   int initFileSize,
-                                  boolean preallocate) throws IOException {
+                                  boolean preallocate, boolean hadCleanShutdown) throws IOException {
        return open(file, true, fileAlreadyExists, initFileSize, preallocate);
junrao:

It's probably more intuitive to change hadCleanShutdown to needsRecovery and pass in the negation of the flag. Then, the default value of false makes more sense.

Also, could we put the change in a separate line to match the existing format?

ccding:

Fixed.

@junrao junrao left a comment

@ccding : Thanks for the updated PR. A few more minor comments below.

@@ -433,7 +440,8 @@ public static FileRecords open(File file,
     public static FileRecords open(File file,
                                    boolean fileAlreadyExists,
                                    int initFileSize,
-                                   boolean preallocate) throws IOException {
+                                   boolean preallocate,
+                                   boolean needsRecovery) throws IOException {
junrao:

This change seems unneeded?

ccding:

True.

@@ -59,7 +60,8 @@ class LogSegment private[log] (val log: FileRecords,
                                val baseOffset: Long,
                                val indexIntervalBytes: Int,
                                val rollJitterMs: Long,
-                               val time: Time) extends Logging {
+                               val time: Time,
+                               val needsFlushParentDir: Boolean = false) extends Logging {
junrao:

Could we add the new param to the javadoc?

ccding:

Done.

@@ -95,6 +97,9 @@ class LogSegment private[log] (val log: FileRecords,
   /* the number of bytes since we last added an entry in the offset index */
   private var bytesSinceLastIndexEntry = 0

+  /* whether or not we need to flush the parent dir during flush */
junrao:

during flush => during the first flush ?

ccding:

Not really. We change the value of atomicNeedsFlushParentDir after the first flush; the needsFlushParentDir value passed to the constructor only applies to the first flush. Do you have any suggestions on how to comment on them?

ccding:

Changed it to "during the next flush".

@@ -657,17 +668,19 @@ class LogSegment private[log] (val log: FileRecords,
 object LogSegment {

   def open(dir: File, baseOffset: Long, config: LogConfig, time: Time, fileAlreadyExists: Boolean = false,
-           initFileSize: Int = 0, preallocate: Boolean = false, fileSuffix: String = ""): LogSegment = {
+           initFileSize: Int = 0, preallocate: Boolean = false, fileSuffix: String = "",
+           needsRecovery: Boolean = true): LogSegment = {
junrao:

It seems that needsRecovery should default to false?

ccding:

Yeah, I changed this but didn't push. Thanks.

ccding commented Apr 2, 2021

Addressed the comments from @junrao.

Also addressed the two problems we discussed offline:

  • flush the parent directory of a new segment during its first flush: added the needsFlushParentDir = needsRecovery || !fileAlreadyExists check
  • flush the parent directory after flushing the log file and all index files

Please take a look @junrao

ccding commented Apr 2, 2021

Ran bin/kafka-producer-perf-test.sh with default settings and 1KB record size.
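The exact command isn't given in the thread; a representative invocation with these settings would be something like the following (the topic name and record count are illustrative):

    bin/kafka-producer-perf-test.sh \
      --topic perf-test \
      --num-records 10000000 \
      --record-size 1024 \
      --throughput -1 \
      --producer-props bootstrap.servers=localhost:9092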

The result before applying this PR:

1205625 records sent, 241125.0 records/sec (235.47 MB/sec), 127.4 ms avg latency, 204.0 ms max latency.
1177965 records sent, 235593.0 records/sec (230.07 MB/sec), 130.2 ms avg latency, 202.0 ms max latency.
1198860 records sent, 239772.0 records/sec (234.15 MB/sec), 128.2 ms avg latency, 197.0 ms max latency.
1182420 records sent, 236484.0 records/sec (230.94 MB/sec), 129.8 ms avg latency, 201.0 ms max latency.
1172340 records sent, 234468.0 records/sec (228.97 MB/sec), 131.0 ms avg latency, 204.0 ms max latency.
1197570 records sent, 239514.0 records/sec (233.90 MB/sec), 128.3 ms avg latency, 203.0 ms max latency.
1178820 records sent, 235764.0 records/sec (230.24 MB/sec), 130.2 ms avg latency, 225.0 ms max latency.
1173870 records sent, 234774.0 records/sec (229.27 MB/sec), 130.8 ms avg latency, 201.0 ms max latency.
1152990 records sent, 230598.0 records/sec (225.19 MB/sec), 133.1 ms avg latency, 212.0 ms max latency.

The result after applying this PR:

1147650 records sent, 229530.0 records/sec (224.15 MB/sec), 133.9 ms avg latency, 216.0 ms max latency.
1184085 records sent, 236817.0 records/sec (231.27 MB/sec), 129.7 ms avg latency, 213.0 ms max latency.
1213275 records sent, 242655.0 records/sec (236.97 MB/sec), 126.5 ms avg latency, 204.0 ms max latency.
1176105 records sent, 235221.0 records/sec (229.71 MB/sec), 130.5 ms avg latency, 211.0 ms max latency.
1143045 records sent, 228609.0 records/sec (223.25 MB/sec), 134.4 ms avg latency, 231.0 ms max latency.
1113390 records sent, 222678.0 records/sec (217.46 MB/sec), 138.0 ms avg latency, 236.0 ms max latency.
1133850 records sent, 226770.0 records/sec (221.46 MB/sec), 135.3 ms avg latency, 208.0 ms max latency.
1063410 records sent, 212682.0 records/sec (207.70 MB/sec), 144.6 ms avg latency, 216.0 ms max latency.
1128195 records sent, 225639.0 records/sec (220.35 MB/sec), 136.2 ms avg latency, 235.0 ms max latency.

Performance decreases a little, but the difference is not significant.

@ccding ccding requested a review from junrao April 2, 2021 03:15
@junrao junrao left a comment

@ccding : Thanks for the updated PR and the performance results. The changes look good to me.

Are the Jenkins test failures related to this PR?

For the performance results, what's the log segment size you used? It would be useful to try a smaller segment size (e.g. 10MB) to see the impact of this PR.

ccding commented Apr 2, 2021

@junrao Thanks for the review. The previous test was run with the default settings in config/server.properties, where log.segment.bytes=1073741824 (1GB). I changed it to log.segment.bytes=10737418 (~10MB) and re-ran the test, again with a 1KB record size.

The result before applying this PR:

384286 records sent, 76749.8 records/sec (74.95 MB/sec), 361.0 ms avg latency, 568.0 ms max latency.
486450 records sent, 96575.3 records/sec (94.31 MB/sec), 318.9 ms avg latency, 346.0 ms max latency.
476100 records sent, 94802.9 records/sec (92.58 MB/sec), 322.1 ms avg latency, 368.0 ms max latency.
473010 records sent, 94602.0 records/sec (92.38 MB/sec), 328.8 ms avg latency, 370.0 ms max latency.
462570 records sent, 92514.0 records/sec (90.35 MB/sec), 329.6 ms avg latency, 363.0 ms max latency.
462405 records sent, 92481.0 records/sec (90.31 MB/sec), 331.4 ms avg latency, 373.0 ms max latency.
475485 records sent, 95097.0 records/sec (92.87 MB/sec), 322.9 ms avg latency, 353.0 ms max latency.
475980 records sent, 95157.9 records/sec (92.93 MB/sec), 322.4 ms avg latency, 380.0 ms max latency.
476190 records sent, 95238.0 records/sec (93.01 MB/sec), 323.9 ms avg latency, 366.0 ms max latency.
474345 records sent, 94869.0 records/sec (92.65 MB/sec), 326.8 ms avg latency, 386.0 ms max latency.
488115 records sent, 96752.2 records/sec (94.48 MB/sec), 314.0 ms avg latency, 344.0 ms max latency.
485220 records sent, 97044.0 records/sec (94.77 MB/sec), 320.9 ms avg latency, 358.0 ms max latency.
487740 records sent, 97548.0 records/sec (95.26 MB/sec), 311.4 ms avg latency, 353.0 ms max latency.
493755 records sent, 98751.0 records/sec (96.44 MB/sec), 313.8 ms avg latency, 348.0 ms max latency.

The result after applying this PR:

253786 records sent, 50757.2 records/sec (49.57 MB/sec), 542.9 ms avg latency, 1099.0 ms max latency.
439665 records sent, 87862.7 records/sec (85.80 MB/sec), 351.7 ms avg latency, 487.0 ms max latency.
458580 records sent, 91716.0 records/sec (89.57 MB/sec), 337.6 ms avg latency, 417.0 ms max latency.
477015 records sent, 95403.0 records/sec (93.17 MB/sec), 322.0 ms avg latency, 359.0 ms max latency.
492705 records sent, 97584.7 records/sec (95.30 MB/sec), 313.8 ms avg latency, 344.0 ms max latency.
492240 records sent, 98448.0 records/sec (96.14 MB/sec), 314.3 ms avg latency, 358.0 ms max latency.
495810 records sent, 99162.0 records/sec (96.84 MB/sec), 308.9 ms avg latency, 357.0 ms max latency.
483675 records sent, 96735.0 records/sec (94.47 MB/sec), 317.1 ms avg latency, 365.0 ms max latency.
478230 records sent, 95646.0 records/sec (93.40 MB/sec), 319.8 ms avg latency, 365.0 ms max latency.
482295 records sent, 95560.7 records/sec (93.32 MB/sec), 321.1 ms avg latency, 427.0 ms max latency.
491430 records sent, 98286.0 records/sec (95.98 MB/sec), 315.5 ms avg latency, 373.0 ms max latency.
489615 records sent, 97923.0 records/sec (95.63 MB/sec), 314.0 ms avg latency, 367.0 ms max latency.
405855 records sent, 81154.8 records/sec (79.25 MB/sec), 374.0 ms avg latency, 485.0 ms max latency.
455400 records sent, 91061.8 records/sec (88.93 MB/sec), 338.5 ms avg latency, 416.0 ms max latency.
470325 records sent, 94065.0 records/sec (91.86 MB/sec), 327.5 ms avg latency, 424.0 ms max latency.
444465 records sent, 88893.0 records/sec (86.81 MB/sec), 343.3 ms avg latency, 426.0 ms max latency.
410010 records sent, 81789.3 records/sec (79.87 MB/sec), 374.5 ms avg latency, 485.0 ms max latency.
460455 records sent, 92091.0 records/sec (89.93 MB/sec), 338.6 ms avg latency, 411.0 ms max latency.

We can clearly see some differences. Is this a concern?

ccding commented Apr 2, 2021

The failed tests are unrelated (Connect and Streams) and passed in my local run.

junrao commented Apr 2, 2021

@ccding : Thanks for the experimental results. It seems there is a 5-10% throughput drop with the new PR for 10MB segments. This may not be a big concern since it's an uncommon setting. It's interesting that the absolute throughput dropped significantly with 10MB segments compared with 1GB segments.

Could you redo the tests for 100MB segments? Thanks.

ccding commented Apr 2, 2021

With 100MB segment size:

Before this PR:

422206 records sent, 84441.2 records/sec (82.46 MB/sec), 333.0 ms avg latency, 997.0 ms max latency.
1016955 records sent, 203391.0 records/sec (198.62 MB/sec), 152.7 ms avg latency, 230.0 ms max latency.
986760 records sent, 197352.0 records/sec (192.73 MB/sec), 155.5 ms avg latency, 293.0 ms max latency.
1025070 records sent, 205014.0 records/sec (200.21 MB/sec), 150.0 ms avg latency, 231.0 ms max latency.
1034265 records sent, 206853.0 records/sec (202.00 MB/sec), 148.5 ms avg latency, 212.0 ms max latency.
1025280 records sent, 205056.0 records/sec (200.25 MB/sec), 149.2 ms avg latency, 222.0 ms max latency.
1033485 records sent, 206697.0 records/sec (201.85 MB/sec), 148.6 ms avg latency, 212.0 ms max latency.
1036230 records sent, 207246.0 records/sec (202.39 MB/sec), 148.2 ms avg latency, 220.0 ms max latency.
1034385 records sent, 206877.0 records/sec (202.03 MB/sec), 148.4 ms avg latency, 216.0 ms max latency.
1013655 records sent, 201401.7 records/sec (196.68 MB/sec), 151.5 ms avg latency, 247.0 ms max latency.
1035300 records sent, 206481.9 records/sec (201.64 MB/sec), 149.1 ms avg latency, 213.0 ms max latency.
1035585 records sent, 207117.0 records/sec (202.26 MB/sec), 148.4 ms avg latency, 217.0 ms max latency.
1035015 records sent, 205197.3 records/sec (200.39 MB/sec), 149.4 ms avg latency, 231.0 ms max latency.

After this PR:

363796 records sent, 72759.2 records/sec (71.05 MB/sec), 389.1 ms avg latency, 1005.0 ms max latency.
992910 records sent, 198582.0 records/sec (193.93 MB/sec), 154.5 ms avg latency, 281.0 ms max latency.
989655 records sent, 197931.0 records/sec (193.29 MB/sec), 156.4 ms avg latency, 250.0 ms max latency.
1026900 records sent, 205380.0 records/sec (200.57 MB/sec), 149.6 ms avg latency, 217.0 ms max latency.
1033515 records sent, 206703.0 records/sec (201.86 MB/sec), 148.5 ms avg latency, 205.0 ms max latency.
1034775 records sent, 206955.0 records/sec (202.10 MB/sec), 148.5 ms avg latency, 201.0 ms max latency.
1035420 records sent, 207084.0 records/sec (202.23 MB/sec), 148.3 ms avg latency, 210.0 ms max latency.
1013130 records sent, 202626.0 records/sec (197.88 MB/sec), 151.6 ms avg latency, 216.0 ms max latency.
1010295 records sent, 202059.0 records/sec (197.32 MB/sec), 150.9 ms avg latency, 215.0 ms max latency.
1022640 records sent, 204528.0 records/sec (199.73 MB/sec), 151.2 ms avg latency, 219.0 ms max latency.
1015950 records sent, 203190.0 records/sec (198.43 MB/sec), 151.2 ms avg latency, 232.0 ms max latency.
1033725 records sent, 206745.0 records/sec (201.90 MB/sec), 148.5 ms avg latency, 208.0 ms max latency.
1024905 records sent, 204981.0 records/sec (200.18 MB/sec), 149.9 ms avg latency, 213.0 ms max latency.
1035720 records sent, 207144.0 records/sec (202.29 MB/sec), 148.3 ms avg latency, 203.0 ms max latency.
998625 records sent, 199725.0 records/sec (195.04 MB/sec), 153.8 ms avg latency, 214.0 ms max latency.

ccding commented Apr 2, 2021

> It's interesting that the absolute throughput dropped significantly with 10MB segments compared with 1GB segments.

I think it is the cost of the extra flush. We have one extra flush per segment, which is 1 extra flush per 10,000 records for 10MB segments and 1KB records. With 1GB segments, it is 1 extra flush per 1,000,000 records: 1/100 of the amortized extra cost.

@junrao junrao left a comment

@ccding : Thanks for the last experiment. It seems the performance impact there is minimal. So, the PR LGTM.

@junrao junrao merged commit 66b0c5c into apache:trunk Apr 3, 2021
ijuma commented Apr 3, 2021

Would the perf impact of this change be more significant with a larger number of partitions?

ijuma added a commit to ijuma/kafka that referenced this pull request Apr 4, 2021
…e-allocations-lz4

* apache-github/trunk: (243 commits)
  KAFKA-12590: Remove deprecated kafka.security.auth.Authorizer, SimpleAclAuthorizer and related classes in 3.0 (apache#10450)
  KAFKA-3968: fsync the parent directory of a segment file when the file is created (apache#10405)
  KAFKA-12283: disable flaky testMultipleWorkersRejoining to stabilize build (apache#10408)
  MINOR: remove KTable.to from the docs (apache#10464)
  MONOR: Remove redudant LocalLogManager (apache#10325)
  MINOR: support ImplicitLinkedHashCollection#sort (apache#10456)
  KAFKA-12587 Remove KafkaPrincipal#fromString for 3.0 (apache#10447)
  KAFKA-12426: Missing logic to create partition.metadata files in RaftReplicaManager (apache#10282)
  MINOR: Improve reproducability of raft simulation tests (apache#10422)
  KAFKA-12474: Handle failure to write new session keys gracefully (apache#10396)
  KAFKA-12593: Fix Apache License headers (apache#10452)
  MINOR: Fix typo in MirrorMaker v2 documentation (apache#10433)
  KAFKA-12600: Remove deprecated config value `default` for client config `client.dns.lookup` (apache#10458)
  KAFKA-12952: Remove deprecated LogConfig.Compact (apache#10451)
  Initial commit (apache#10454)
  KAFKA-12575: Eliminate Log.isLogDirOffline boolean attribute (apache#10430)
  KAFKA-8405; Remove deprecated `kafka-preferred-replica-election` command (apache#10443)
  MINOR: Fix docs for end-to-end record latency metrics (apache#10449)
  MINOR Replaced File with Path in LogSegmentData. (apache#10424)
  KAFKA-12583: Upgrade netty to 4.1.62.Final
  ...
junrao pushed a commit that referenced this pull request May 14, 2021
…en the file is created (#10680)

(reverted #10405). #10405 has several issues, for example:

  • It fails to create a topic with 9000 partitions.
  • It flushes in several unnecessary places.
  • If multiple segments of the same partition are flushed at roughly the same time, we may end up doing multiple unnecessary flushes: the logic of handling the flush in LogSegments.scala is weird.
Kafka does not call fsync() on the directory when a new log segment is created and flushed to disk.

The problem is that the following sequence of calls doesn't guarantee file durability:

fd = open("log", O_RDWR | O_CREAT); // suppose open creates "log"
write(fd);
fsync(fd);

If the system crashes after fsync() but before the parent directory has been flushed to disk, the log file can disappear.

This PR flushes the parent directory when flush() is called for the first time.

Performance tests show this PR has a minimal performance impact on Kafka clusters.

Reviewers: Jun Rao <junrao@gmail.com>