-
Notifications
You must be signed in to change notification settings - Fork 0
02_On Disk B Tree Design
Data file's offset 0 - 4095 is always 'header_page'
All page's size is fixed as 4096 bytes.
So, the start offset of page in data file is 4096 * pagenum.

All page sizes are 4096 bytes. 4 kinds of pages.




Internal Page Header

Leaf Page Header

union is used to integrate each page structure.
typedef struct page_t {
union {
headerPage h;
freePage f;
page p;
};
} page_t;Each page is structured in different way.
typedef struct headerPage {
pagenum_t free_pagenum;
pagenum_t root_pagenum;
uint64_t num_pages;
// HEADER_PAGE_RESERVED == 4096 - 24
char reserved[HEADER_PAGE_RESERVED];
} headerPage;typedef struct freePage {
pagenum_t next_free_pagenum;
// FREE_PAGE_RESERVED == 4096 - 8
char reserved[FREE_PAGE_RESERVED];
} freePage;Internal & Leaf page are structured in same struct.
There are 2 differences between Internal page and Leaf page.
- one_more_pagenum (8 bytes) & right_sibling_pagenum (8 bytes)
- 248 i_records (3968 bytes) & 31 l_records (3968 bytes)
union is used to integrate them.
typedef struct page {
// Start of page header
pagenum_t parent_pagenum;
uint32_t is_leaf;
uint32_t num_keys;
// PAGE_HEADER_RESERVED == 128 - 24
char reserved[PAGE_HEADER_RESERVED];
union {
// Internal page
pagenum_t one_more_pagenum;
// Leaf page
pagenum_t right_sibling_pagenum;
};
// End of page header
union {
// Internal page : 248 key-pagenum pairs
internalRecord i_records[NUM_INTERNAL_RECORD];
// Leaf page : 31 key-value pairs
leafRecord l_records[NUM_LEAF_RECORD];
};
} page;2 kinds of record.


typedef struct internalRecord {
int64_t key;
pagenum_t pagenum;
} internalRecord;typedef struct leafRecord {
int64_t key;
// VALUE_SIZE == 120
char value[VALUE_SIZE];
} leafRecord;3 cases
When opening the data file from disk, I/O is happening.
When reading page from the data file, I/O is happening.
(Read page, when the content of page is needed)
When writing page to the data file, I/O is happening.
(Write page, when the content of page is changed)
disk_manager.c is a Disk Manager.
'Only' Disk Manager calls system call.

file.c is a File Manager.
File Manager manages the data file.
File Manager calls Disk Manager to access data file.

bpt.c is Index Manager.
Index Manager manages index of data file by maintaining b+ tree structure.

Index Manager calls File Manager in below 6 cases.
→ During Insertion, if new page is needed, get usable new page number by calling File Manager (file_alloc_page()).
→ If page gets empty during a deletion, free a page by calling File Manager (file_free_page(pagenum)).
→ If page reading is needed, call the File Manger(file_read_page(pagenum, dest)) .
→ If page writing is needed, call the File Manger (file_write_page(pagenum, src)).
→ Open the data file by calling the File Manager (open_file(pathname)).
db_api.c is DB APIs.
DB APIs are used as interface between user and data file.
User can open, insert, delete, find or print data file through below DB APIs.
DB APIs calls Index Manager.


Do not merge page until page's number of keys become 0.
2 cases
/* Case : key_page is leaf page */
if (key_page->p.is_leaf)
return coalesce_nodes(parent, key_page, neighbor, neighbor_flag, k_prime);
/* Case : key_page is internal page */
else {
if (neighbor->p.num_keys == internal_order - 1)
return redistribute_nodes(parent, key_page, neighbor, neighbor_flag, k_prime_index, k_prime);
else
return coalesce_nodes(parent, key_page, neighbor, neighbor_flag, k_prime);
}Only do the coalesce.
Because, leaf page can coalesce regardless of neighbor page's number of keys.
2 cases
-
Do the redistribute, when neighbor page's number of keys is internal_order - 1.
It means neighbor has no room for page number in key_page's one_more_pagenum. So, page number can not move to neighbor page.
In this case, redistribute is happened.
Neighbor page gives half of record to key_page.
-
Do the coalesce, when neighbor page's number of keys is less than internal_order - 1.
It means neighbor has room for page number in key_page's one_more_pagenum.