[IOTDB-2723] Fix sequence inner space compaction lose data by THUMarkLau · Pull Request #5248 · apache/iotdb

THUMarkLau · 2022-03-15T12:26:30Z

When executing inner space compaction for sequence files, the chunks of a series are read into memory one by one, and the program uses three ways to determine how to process a chunk:

If the chunk is small, or part of the data in the chunk has been deleted, the chunk is deserialized into points and rewritten into the chunk writer. The following chunks are written into the chunk writer until the chunk writer is large enough to flush.
If the chunk is too large, the program just flushes it to the disk.
If the chunk is neither too small nor too large, the program caches it in memory and merges it with the following chunk. The cached chunk is not flushed until it is large enough.
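
The three cases above can be sketched as a small decision function. This is only an illustration: the class, the enum, the method, and both byte thresholds are hypothetical names and values chosen for the example, not identifiers or settings taken from the IoTDB code base.

```java
public class ChunkDecisionSketch {
    /** The three ways a chunk can be handled, as described in the text. */
    public enum Action { DESERIALIZE, FLUSH_DIRECTLY, CACHE_AND_MERGE }

    // Hypothetical thresholds (assumptions for this sketch only).
    public static final long SMALL_CHUNK_BYTES = 1024L;
    public static final long LARGE_CHUNK_BYTES = 1024L * 1024L;

    /** Picks an action from the chunk size and whether part of its data was deleted. */
    public static Action decide(long chunkSizeBytes, boolean hasDeletedData) {
        if (hasDeletedData || chunkSizeBytes < SMALL_CHUNK_BYTES) {
            // Small or partially deleted: rewrite point by point into the chunk writer.
            return Action.DESERIALIZE;
        }
        if (chunkSizeBytes >= LARGE_CHUNK_BYTES) {
            // Large enough already: write it to disk as-is.
            return Action.FLUSH_DIRECTLY;
        }
        // In between: keep it in memory and merge it with the following chunk.
        return Action.CACHE_AND_MERGE;
    }

    public static void main(String[] args) {
        System.out.println(decide(100, false));
        System.out.println(decide(4L * 1024 * 1024, false));
        System.out.println(decide(64 * 1024, false));
    }
}
```

The deletion check comes first in this sketch on the assumption that a chunk with deleted data must be rewritten regardless of its size.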

Of course, these are rough descriptions. When the program reads a chunk that satisfies the condition for deserialization and there is already a cached chunk in memory, it first deserializes the cached chunk into the chunk writer and then deserializes the freshly read chunk. Before deserializing the cached chunk, the program calls the flip function on it so that the chunk reader can read it correctly.

However, in some cases the cached chunk is the first cached chunk: it was read directly from the TsFile by the readMemChunk function in TsFileSequenceReader and has not been merged with any other chunk yet. A chunk read by readMemChunk has already been flipped, while a chunk generated by mergeChunk has not, so the program only needs to call flip on the latter. If it calls flip on the former, flip is effectively called twice, which leaves the position and limit of the chunk's data buffer in a wrong state. Consequently, the chunk reader cannot read the data in the cached chunk correctly, and the data is lost.
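
The effect of a double flip can be reproduced with a plain java.nio.ByteBuffer, independent of IoTDB. The class and method names below are hypothetical and exist only for this demonstration. It assumes that the chunk data buffer behaves like a standard ByteBuffer, where flip() sets the limit to the current position and then resets the position to zero.

```java
import java.nio.ByteBuffer;

public class DoubleFlipDemo {
    /**
     * Writes `count` bytes (at most 64) into a fresh buffer, calls flip()
     * `flips` times, and returns how many bytes are still readable.
     */
    public static int remainingAfterFlips(int count, int flips) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        for (int i = 0; i < count; i++) {
            buffer.put((byte) i);
        }
        for (int i = 0; i < flips; i++) {
            // flip(): limit = position, then position = 0
            buffer.flip();
        }
        return buffer.remaining();
    }

    public static void main(String[] args) {
        System.out.println("after one flip:  " + remainingAfterFlips(8, 1));
        System.out.println("after two flips: " + remainingAfterFlips(8, 2));
    }
}
```

After the first flip the position is 0 and the limit is 8, so all 8 bytes are readable. The second flip copies the position (now 0) into the limit, so a reader sees an empty buffer even though the bytes are still in the backing array.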

This bug actually has nothing to do with deletion.

coveralls · 2022-03-15T13:38:31Z

Coverage decreased (-0.001%) to 65.705% when pulling 765c612 on THUMarkLau:IOTDB-2723 into c3d34b6 on apache:master.

fix bug when executing inner space compaction

765c612

THUMarkLau mentioned this pull request Mar 15, 2022

[To rel/0.13][IOTDB-2723] Fix sequence inner space compaction loses data #5249

Merged

JackieTien97 approved these changes Mar 15, 2022

JackieTien97 merged commit 9f04de9 into apache:master Mar 15, 2022

THUMarkLau deleted the IOTDB-2723 branch March 15, 2022 13:47
