Checkpoint manager #341
Project Design: Checkpoint & Recovery
This PR implements checkpointing, logging, and recovery. Checkpointing and recovery are new functionality. The log manager is also substantially rewritten, because the logging on the master branch is too simple to work correctly. Since catalog support is still being implemented, we reverted all catalog-related commits to keep this PR clean. The logging infrastructure still uses data tables to fit the existing APIs and should be migrated to SqlTable in the future.
Parts that our project will rely on:
Parts that our project will modify or create:
The scanning and writing process is illustrated in the diagram here
The layout of a checkpoint block is shown here
Handle crashes during checkpointing
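The mechanism this PR uses for crash handling is not spelled out in the text above. As a minimal sketch of one common approach, and not necessarily what this PR does, a checkpoint can be written under a temporary name and renamed only once it is complete, so recovery never sees a partial file under the final name. The helper names `WriteCheckpointAtomically` and `DiscardIncompleteCheckpoints` and the `.tmp` suffix are hypothetical, chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not from the PR): write the payload to "<final>.tmp",
// then rename it to the final name. If the process dies mid-write, only the
// ".tmp" file exists and the previous checkpoint is left untouched.
inline bool WriteCheckpointAtomically(const fs::path &final_path, const std::string &payload) {
  assert(!final_path.empty());
  fs::path tmp_path = final_path;
  tmp_path += ".tmp";
  {
    std::ofstream out(tmp_path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    out.flush();  // NOTE: flushes to the OS only; durability would also need fsync
    if (!out) return false;
  }  // stream is closed here, before the rename
  std::error_code ec;
  fs::rename(tmp_path, final_path, ec);  // replaces an existing file on POSIX
  return !ec;
}

// Hypothetical recovery-side helper: delete leftover ".tmp" files, which can
// only come from a checkpoint that never finished. Returns how many were removed.
inline std::size_t DiscardIncompleteCheckpoints(const fs::path &dir) {
  std::error_code ec;
  std::vector<fs::path> stale;
  for (fs::directory_iterator it(dir, ec), end; !ec && it != end; it.increment(ec)) {
    if (it->path().extension() == ".tmp") stale.push_back(it->path());
  }
  std::size_t removed = 0;
  for (const auto &path : stale) {
    std::error_code remove_ec;
    if (fs::remove(path, remove_ec)) ++removed;
  }
  return removed;
}
```

With this scheme, recovery would call `DiscardIncompleteCheckpoints` first and then load the newest file that has a final name. A production version would also need to `fsync` the file and its directory, which the standard library does not expose.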
Save tuples as columns vs. rows:
Where to store varlen in checkpoints:
We benchmarked these choices.
We ultimately picked the 512K + block combination.
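To make the columns-versus-rows choice concrete, here is a small sketch of the two layouts for the same tuples. It is illustrative only: the `ExampleTuple` schema and the helpers `Append`, `ReadAt`, `SerializeRows`, and `SerializeColumns` are hypothetical and do not reflect this PR's actual checkpoint format or its varlen handling.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical two-attribute, fixed-length schema used only for this sketch.
struct ExampleTuple {
  int32_t id;
  int64_t balance;
};

// Append the raw bytes of a fixed-size value to the buffer.
template <typename T>
void Append(std::vector<uint8_t> *buf, const T &value) {
  const auto *bytes = reinterpret_cast<const uint8_t *>(&value);
  buf->insert(buf->end(), bytes, bytes + sizeof(T));
}

// Read a fixed-size value back from a byte offset.
template <typename T>
T ReadAt(const std::vector<uint8_t> &buf, std::size_t offset) {
  assert(offset + sizeof(T) <= buf.size());
  T value;
  std::memcpy(&value, buf.data() + offset, sizeof(T));
  return value;
}

// Row layout: id0, balance0, id1, balance1, ...
// One pass over the tuples; each tuple's attributes stay contiguous.
inline std::vector<uint8_t> SerializeRows(const std::vector<ExampleTuple> &tuples) {
  std::vector<uint8_t> buf;
  for (const auto &t : tuples) {
    Append(&buf, t.id);
    Append(&buf, t.balance);
  }
  return buf;
}

// Column layout: id0, id1, ..., balance0, balance1, ...
// One pass per attribute; values of the same attribute stay contiguous.
inline std::vector<uint8_t> SerializeColumns(const std::vector<ExampleTuple> &tuples) {
  std::vector<uint8_t> buf;
  for (const auto &t : tuples) Append(&buf, t.id);
  for (const auto &t : tuples) Append(&buf, t.balance);
  return buf;
}
```

Both layouts produce the same number of bytes for fixed-length attributes; they differ in where a given value lands, which affects how much work the writer and the recovery reader each do per tuple.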
We created a new SqlTableTestObject class that builds tables and runs transactions at the SqlTable level to test our code. We run the GC thread and the logging thread alongside the transactions to make sure our implementation works under realistic conditions.
Trade-offs and Potential Problems
The main trade-off in this project is between performance during checkpointing and performance during recovery. We prioritize checkpointing performance over recovery performance, so our design decisions lean towards faster runtime.
Two major parts that are blocked by other components are:
There is some redundancy in the testing code, which can be compacted further.
Possible future performance optimizations include:
@@            Coverage Diff             @@
##           master     #341      +/-   ##
==========================================
- Coverage   87.62%   87.57%   -0.05%
==========================================
  Files         212      217       +5
  Lines        8257     8546     +289
==========================================
+ Hits         7235     7484     +249
- Misses       1022     1062      +40
What is your plan for ensuring the catalog is persisted in such a way that it is the first set of tables recovered during a recovery? Specifically, your logic for