Skip to content

01_Analysis B Tree

youngSSS edited this page Oct 30, 2020 · 1 revision

Possible Call Path

Insertion

Every insertion is starting with insert function in main.c file.

// 'insert funtion' located in 'bpt.c'
node * insert( node * root, int key, int value ) {...}

In each cases below, the insert function does different operation.

Case 1 : Insert duplicate value

If a duplicate value exists, ignore its insertion.

Duplicate value is checked by find function.

find function gets node which has proper range for the value by find_leaf function.

If there is no record having the value, find function returns NULL else it returns record having the value.

Summary for call path in this case.

image

Case 2 : Insert to empty tree

First, it checks for the duplicate value, even though it is empty tree. (Case 1)

From now on Case 2's unique function call.

When tree does not exist yet, first it makes a new record for the value by make_record function and it builds a new tree by start_new_tree function with new record.

In start_new_tree function, it makes leaf node by make_leaf and make_node function.

Summary for call path in this case.

image

Case 3 : Insert into leaf node which has room for key and pointer

Suppose tree is not empty. It means we don't have to call functions to make new tree like Case 2.

First, do the duplicate value check. (Case 1)

Second, make a new record for the value. (Case 2)

From now on Case 3's unique function call.

It gets node which has appropriate position for the value by find_leaf function.

Node's num_keys is less than order - 1 means there is room for new record. So, just insert new record into node without splitting by insert_into_leaf function.

Summary for call path in this case.

image

Case 4 : Insert into leaf node which has no room for key and pointer

Suppose tree is not empty. It means we don't have to call functions to make new tree like Case 2.

First, check the duplicate value. (Case 1)

Second, make a new record for the value. (Case 2)

Third, find appropriate leaf node for new record. (Case 3)

From now on Case 4's unique function call.

Node's num_key is not less then order - 1 means there is no room for new record. So, split node into two (original node + new node) and insert new record into appropriate node among them by insert_into_leaf_after_splitting function.

Make a new leaf node for splitting by make_leaf function.

We can know where to split in original node for splitting by cut function.

After split into two nodes, to add new node's key to parent node, call insert_into_parent function.

Summary for call path until now.

image

In Case 4, there are 3 different cases.

Case 4 - 1 : Node does not have a parent

It means that tree needs new root. So, make a new root, insert new node's key into new root and let root's pointer points new children by insert_into_new_root function.

Summary for call path in this case.

image

Case 4 - 2 : Node has parent that has room for a new key

Find original node's index in parent by get_left_index function to insert new node into right position of parent.

Parent's num_keys is less than order - 1 means there is room for new key. So, insert new key into parent without splitting by insert_into_node function.

Summary for call path in this case.

image

Case 4 - 3 : Node has parent that has no room for a new key

Find original node's index in parent by get_left_index function to insert new node into right position of parent.

Parent's num_keys is not less than order - 1 means there is no room for new key. So, split parent node and insert new key into right parent node by insert_into_node_after_splitting function.

In this case, we need recursive function call between insert_into_parent and insert_into_node_after_splitting. Because, split parents's parent can be split because of k_prime (no room for new key). If parents's parent has room for k_prime, we call insert_into_node function and recursive call ends.

Summary for call path in this case.

image

Deletion

Every deletion is starting with delete and destroy_tree function in main.c file.

// 'deletion funtion' located in 'bpt.c'
node * delete(node * root, int key) {...}
node * destroy_tree(node * root) {...}

1. Deletion starts with destroy_tree function

destroy_tree function calls destroy_tree_nodes function and destroy_tree_nodes calls itself recursively until leaf node appears.

Summary for call path in this case.

image

2. Deletion starts with delete function

It calls find and find_leaf functions to find key_record and key_leaf. If both of them are not NULL, call delete_entry function.

Remove record by remove_entry_from_node function.

Summary for call path until now.

image

Case 1 : Record is in root

Because it is root, we don't have to modify structure. Just call adjust_root function.

Summary for call path in this case.

image

Case 2 : Node's num_keys is equal or more than min_keys

min_keys is the value that limits the minimum value of num_keys in node. It is determined by cut function. If num_keys is equal or more than this, just return root in delete_entry function.

Summary for call path in this case.

image

Case 3 : Node's num_keys is less than min_keys

Get neighbor_index by get_neighbor_index function. neighbor_index is nearest neighbor (sibling) index of parameter node. If neighbor is not exist, it returns -1 for special case.

Case 3 - 1 : coalesce_nodes (no recursion)

Assume that just one coalesce_nodes function call can modify tree well.

If sum of neighbor node and me is less than capacity, do the merge by coalesce_nodes function. We should delete node which key is deleted from, because its all keys move to neighbor node, there is no keys. This can be dealt by delete_entry function.

In delete_entry function, deletion will end with Case 1 or 2.

Summary for call path in this case.

  • Ends with Case 1

image

  • Ends with Case 2

image

Case 3 - 2 : redistribute_nodes

If sum of neighbor node and me is equal or more than capacity, do the split by redistribute_nodes function.

Summary for call path in this case.

image

Case 3 - 3 : coalesce_nodes (recursion)

In Case 3 - 1, if parent node's num_keys is less than min_keys, parent node needs adjust too by coalesce_nodes or redistribute_nodes function.

This case is terminated by 1, 2, 3 and be recursive by 4.

  1. adjust_root
  2. num_keys โ‰ฅ min_keys
  3. redistribute_nodes
  4. coalesce_nodes โ†’ recursion

Summary for call path in this case.

image

Detailed Flow

Split

Case 1 : leaf node์—์„œ split์ด ํ•œ๋ฒˆ๋งŒ ๋ฐœ์ƒํ•˜๋Š” ๊ฒฝ์šฐ

insertion์ด ๋ฐœ์ƒํ•˜๋ฉด ์ œ์ผ ๋จผ์ €, find_leaf ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•ด insert๋˜๋Š” record๊ฐ€ ๋“ค์–ด๊ฐ€์•ผ ํ•˜๋Š” ์ ์ ˆํ•œ leaf node๋ฅผ ์ฐพ์Šต๋‹ˆ๋‹ค.

๋งŒ์•ฝ ํ•ด๋‹น leaf node๊ฐ€ ๊ฐ€๋“ ์ฐจ ์žˆ๋‹ค๋ฉด leaf node๋ฅผ ๋‘๊ฐœ๋กœ ๋ถ„ํ• ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

leaf node๋ฅผ ๋ถ„ํ• ํ•˜๊ณ  record๋ฅผ insertํ•˜๊ธฐ ์œ„ํ•ด์„œ insert_into_leaf_after_splitting ํ•จ์ˆ˜๋ฅผ ํ˜ธ์ถœํ•ฉ๋‹ˆ๋‹ค.

์œ„ ํ•จ์ˆ˜์—์„œ ์ƒˆ๋กœ์šด leaf node๋ฅผ ์ƒ์„ฑํ•˜๊ณ  record๊ฐ€ ๋“ค์–ด๊ฐ€์•ผ ํ•˜๋Š” ์œ„์น˜์ธ insertion_index๋ฅผ ์ฐพ์Šต๋‹ˆ๋‹ค. insertion_index๋Š” record์˜ key ๋Œ€์†Œ๊ด€๊ณ„ ๋น„๊ต๋ฅผ ํ†ตํ•ด ์ฐพ์•„ ์ง‘๋‹ˆ๋‹ค.

temp_node์— ๊ธฐ์กด leaf node์˜ record๋“ค๊ณผ insert๋œ record๋ฅผ insertion_index๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ˆœ์„œ์— ๋งž๊ฒŒ ๋ชจ๋‘ ์‚ฝ์ž…ํ•œ ๋’ค ๊ธฐ์กด leaf node์˜ record๋“ค์„ ๋ชจ๋‘ ์‚ญ์ œํ•ฉ๋‹ˆ๋‹ค.

์ดํ›„ cut ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•ด temp_node ์•ˆ์˜ record๋“ค์„ ๊ธฐ์กด์˜ leaf node์™€ ์ƒˆ๋กœ ๋งŒ๋“  leaf node์— ์ ์ ˆํžˆ ๋ถ„๋ฐฐํ•˜๊ณ  ์ƒˆ๋กœ์šด leaf node์˜ parent๋ฅผ ๊ธฐ์กด์˜ leaf node์˜ parent์™€ ๊ฐ™๊ฒŒ ๋งŒ๋“ญ๋‹ˆ๋‹ค. ์œ„์˜ ์ ์ ˆํžˆ ๋ถ„๋ฐฐ๋Š” cut ํ•จ์ˆ˜๋กœ ์–ป์–ด์ง„ split ์œ„์น˜ ์ด์ „์˜ temp node record๋“ค์€ ๊ธฐ์กด์˜ leaf node์— ์‚ฝ์ž…ํ•˜๊ณ  split ์œ„์น˜๋ถ€ํ„ฐ๋Š” ์ƒˆ๋กœ์šด leaf node์— ์‚ฝ์ž…ํ•˜๋Š” ๊ฒƒ์„ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค.

์œ„์˜ ๊ณผ์ •์„ ๊ฑฐ์น˜๋ฉด ์ƒˆ๋กœ์šด key์ธ new_key๊ฐ€ ๋งŒ๋“ค์–ด ์ง‘๋‹ˆ๋‹ค.

Case 1 - 1 : ๊ธฐ์กด leaf node์˜ parent๊ฐ€ ์—†๋Š” ๊ฒฝ์šฐ

์ด ๊ฒฝ์šฐ๋Š” ๊ธฐ์กด์˜ leaf node๊ฐ€ root node์ธ ๊ฒƒ์„ ๋งํ•ฉ๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ insert_into_new_root ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•ด ์ƒˆ๋กœ์šด root node๋ฅผ ์ƒ์„ฑํ•œ ๋’ค ๊ธฐ์กด์˜ leaf node์™€ ์ƒˆ๋กœ์šด leaf node๋ฅผ parent node์— ์—ฐ๊ฒฐํ•ฉ๋‹ˆ๋‹ค. ์œ„์—์„œ ์–ธ๊ธ‰ํ•œ new_key๋Š” root node์—์„œ ๊ธฐ์กด์˜ leaf node์™€ ์ƒˆ๋กœ์šด leaf node๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋Š” pointer ์‚ฌ์ด์˜ key๊ฐ€ ๋ฉ๋‹ˆ๋‹ค. ์ดํ›„ split์ด ์ข…๋ฃŒ ๋ฉ๋‹ˆ๋‹ค.

Case 1 - 2 : ๊ธฐ์กด leaf node์˜ parent๊ฐ€ ์กด์žฌํ•˜๋Š” ๊ฒฝ์šฐ

Case 1์˜ ์กฐ๊ฑด์—์„œ leaf node์—์„œ split์ด ํ•œ๋ฒˆ๋งŒ ๋ฐœ์ƒํ•œ๋‹ค๊ณ  ๊ฐ€์ •ํ•˜์˜€๊ธฐ ๋•Œ๋ฌธ์— insert_into_node ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•ด new_key๋ฅผ parent ๋…ธ๋“œ์— ์‚ฝ์ž…ํ•˜๊ณ  parent์˜ pointer์— ์ƒˆ๋กœ์šด leaf node๋ฅผ ์—ฐ๊ฒฐํ•œ ๋’ค split์ด ์ข…๋ฃŒ ๋ฉ๋‹ˆ๋‹ค.

Case 2 : split์ด ํ•œ๋ฒˆ ์ด์ƒ ๋ฐ˜๋ณต์ ์œผ๋กœ ์ผ์–ด๋‚˜๋Š” ๊ฒฝ์šฐ

Case 1 - 2์—์„œ ๊ธฐ์กด leaf node์˜ parent์— key์™€ ์ƒˆ๋กœ์šด leaf node๊ฐ€ ๋“ค์–ด๊ฐˆ ๊ณต๊ฐ„์ด ์—†๋‹ค๋ฉด (= parent node๊ฐ€ ๊ฐ€๋“ ์ฐจ ์žˆ๋‹ค๋ฉด), parent node๋ฅผ splitํ•œ ๋’ค new_key๊ฐ€ parent node์— ์‚ฝ์ž…๋˜์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ด๋Š” insert_into_node_after_splitting ํ•จ์ˆ˜๋ฅผ ํ†ตํ•ด ์ด๋ฃจ์–ด ์ง‘๋‹ˆ๋‹ค.

์—ฌ๊ธฐ์„œ ๋ถ€ํ„ฐ ์œ„์˜ parent node๋ฅผ old node, ๊ธฐ์กด์˜ leaf node๋ฅผ left node, ์‚ฝ์ž…๋˜์–ด์•ผ ํ•˜๋Š” new_key๋ฅผ key, ์ƒˆ๋กœ์šด leaf node๋ฅผ right node๋ผ๊ณ  ๋ถ€๋ฅด๊ฒ ์Šต๋‹ˆ๋‹ค.

insert_into_node_after_splitting ํ•จ์ˆ˜์—์„œ split๋œ old node์˜ record๋“ค์ด ๋“ค์–ด๊ฐˆ new_node๋ฅผ ๋งŒ๋“ค๊ณ  temp_keys, temp_pointers๋ฅผ ๋งŒ๋“ค์–ด ์ค๋‹ˆ๋‹ค.

old node์˜ key, pointer๋“ค๊ณผ new_key, right node๋ฅผ temp_keys์™€ temp_pointers๋กœ ์ˆœ์„œ์— ๋งž๊ฒŒ ์˜ฎ๊ฒจ ์ค๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์„œ ์ˆœ์„œ์— ๋งž๊ฒŒ๋Š” old node์˜ left node๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋Š” pointer์˜ ๋ฐ”๋กœ ๋‹ค์Œ pointer ์œ„์น˜์— right node๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋Š” pointer๊ฐ€, left node์™€ right node๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋Š” pointer ์‚ฌ์ด์— key๊ฐ€ ์˜ค๋„๋ก (= temp_keys์—์„œ key๊ฐ€ left node pointer index ์œ„์น˜์— ์˜ค๋„๋ก) ๋ฐฐ์น˜ํ•˜๋Š” ๊ฒƒ์„ ๋งํ•ฉ๋‹ˆ๋‹ค.

cut ํ•จ์ˆ˜๋ฅผ ํ†ตํ•ด split ์œ„์น˜๋ฅผ ์–ป์Šต๋‹ˆ๋‹ค.

temp_keys์˜ split - 1 ์œ„์น˜์˜ key๋Š” k_prime์ด ๋˜๊ณ  k_prime ๊ธฐ์ค€ ์™ผ์ชฝ์€ old node๋กœ ์˜ค๋ฅธ์ชฝ์€ new node๋กœ ์‚ฝ์ž…ํ•œ ๋’ค new node์˜ child๋“ค์˜ parent๋ฅผ new node๋กœ ๊ฐฑ์‹ ํ•˜์—ฌ ์ค๋‹ˆ๋‹ค. (k_prime์„ ๊ธฐ์ค€์œผ๋กœ ์žก์€ ๊ทผ๊ฑฐ๋Š”, pointer์™€ key๋“ค์˜ ๋ฐฐ์น˜๋ฅผ pointer[0] - key[0], pointer[1] - key[1], ... , pointer[x] - key[x] (=k_prime), ... , pointer[n-1] - key[n-1], pointer[n] ์œผ๋กœ ๋ณด์•˜์„ ๋•Œ ๊ธฐ์ค€ ์ž…๋‹ˆ๋‹ค)

์œ„์˜ ๊ณผ์ •์ด ์ข…๋ฃŒ๋˜๋ฉด insert_into_parent ํ•จ์ˆ˜๋ฅผ ํ˜ธ์ถœํ•ฉ๋‹ˆ๋‹ค.

๋งŒ์•ฝ old node๊ฐ€ root๋ผ๋ฉด insert_into_new_root ํ•จ์ˆ˜๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ insertion์ด ์ข…๋ฃŒ๋ฉ๋‹ˆ๋‹ค.

๋งŒ์•ฝ old node์˜ parent์— k_prime๊ณผ new node๊ฐ€ ์‚ฝ์ž…๋  ์ž๋ฆฌ๊ฐ€ ์žˆ๋‹ค๋ฉด insert_into_node ํ•จ์ˆ˜๋ฅผ ํ†ตํ•ด insertion์ด ์ข…๋ฃŒ๋ฉ๋‹ˆ๋‹ค.

๋งŒ์•ฝ old node์˜ parent๊ฐ€ ๊ฐ€๋“ ์ฐฌ ์ƒํƒœ๋ผ๋ฉด insert_into_node_after_splitting ํ•จ์ˆ˜๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ ์œ„์˜ ๊ณผ์ •์„ ๋ฐ˜๋ณต ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

Merge

Redistribution

key๋ฅผ node์—์„œ ์‚ญ์ œ ํ›„ node์˜ key ๊ฐœ์ˆ˜์™€ neighbor์˜ key ๊ฐœ์ˆ˜์˜ ํ•ฉ์ด capacity๋ณด๋‹ค ์ปค์ง€๋Š” ๊ฒฝ์šฐ Redistribution์„ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

Redistribution์€ 2๊ฐ€์ง€ ๊ฒฝ์šฐ๋กœ ๋‚˜๋ˆ„์–ด ์ง‘๋‹ˆ๋‹ค.

Case 1 : node๊ฐ€ left-most child์ธ ๊ฒฝ์šฐ

์ด ๊ฒฝ์šฐ node์˜ neighbor๋Š” node์˜ ์˜ค๋ฅธ์ชฝ sibling์ด ๋ฉ๋‹ˆ๋‹ค.

2๊ฐ€์ง€ ๊ฒฝ์šฐ๊ฐ€ ์กด์žฌํ•ฉ๋‹ˆ๋‹ค

Case 1 - 1 : node is leaf node

node์˜ ๋งˆ์ง€๋ง‰ key์™€ pointer์˜ ๋‹ค์Œ ์œ„์น˜์— neighbor์˜ 0๋ฒˆ์งธ key์™€ pointer๋ฅผ ๊ฐ€์ ธ์˜จ ๋’ค node์˜ parent์˜ k_prime_index ์œ„์น˜์˜ key์— neighbor์˜ 1๋ฒˆ์งธ key ๊ฐ’์„ ๋ณต์‚ฌ ํ•ฉ๋‹ˆ๋‹ค.

neighbor๋Š” 0๋ฒˆ์งธ key์™€ pointer๋ฅผ node์—๊ฒŒ ์ฃผ์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ์ž์‹ ์˜ key์™€ pointer๋“ค์„ ์•ž์œผ๋กœ ํ•œ ์นธ์”ฉ ์ „์ง„ ์‹œํ‚ต๋‹ˆ๋‹ค.

Case 1 - 2 : node is internal node

node์˜ ๋งˆ์ง€๋ง‰ key ๋‹ค์Œ ์œ„์น˜์— k_prime์„ ๊ฐ€์ ธ์˜ค๊ณ , ๋งˆ์ง€๋ง‰ pointer์˜ ๋‹ค์Œ ์œ„์น˜์— neighbor์˜ 0๋ฒˆ์งธ ํฌ์ธํ„ฐ๋ฅผ ๊ฐ€์ ธ์˜จ ๋’ค ๊ฐ€์ ธ์˜จ ํฌ์ธํ„ฐ์˜ ๋ถ€๋ชจ๋ฅผ node๋กœ ๋ณ€๊ฒฝํ•ฉ๋‹ˆ๋‹ค. node์˜ parent์˜ k_prime_index ์œ„์น˜์˜ key ๊ฐ’์„ neighbor์˜ 0๋ฒˆ์งธ key ๊ฐ’์œผ๋กœ ๋ณ€๊ฒฝํ•ฉ๋‹ˆ๋‹ค.

neighbor๋Š” 0๋ฒˆ์งธ pointer๋ฅผ node์—๊ฒŒ, 0๋ฒˆ์งธ key๋ฅผ parent์—๊ฒŒ ์ฃผ์—ˆ์œผ๋ฏ€๋กœ ์ž์‹ ์˜ key์™€ pointer๋“ค์„ ์•ž์œผ๋กœ ํ•œ์นธ์”ฉ ์ „์ง„์‹œํ‚ต๋‹ˆ๋‹ค.

Case 2 : node is not left-most child

์ด ๊ฒฝ์šฐ node์˜ neighbor๋Š” node์˜ ์™ผ์ชฝ sibling์ด ๋ฉ๋‹ˆ๋‹ค.

2๊ฐ€์ง€ ๊ฒฝ์šฐ๊ฐ€ ์กด์žฌํ•ฉ๋‹ˆ๋‹ค.

Case 2 - 1 : node is leaf node

node์˜ key์™€ pointer๋“ค์„ ํ•œ ์นธ์”ฉ ๋’ค๋กœ ์˜ฎ๊ธด ๋’ค neighbor์˜ ๋งˆ์ง€๋ง‰ key์™€ pointer๋ฅผ node์˜ 0๋ฒˆ์งธ key์™€ pointer๋กœ ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค.

์ดํ›„ node์˜ parent์˜ k_prime_index์˜ key ๊ฐ’์„ node์˜ 0๋ฒˆ์งธ key ๊ฐ’์œผ๋กœ ๋ณต์‚ฌํ•ฉ๋‹ˆ๋‹ค.

Case 2 - 2 : node in internal node

node์˜ key์™€ pointer๋“ค์„ ํ•œ ์นธ์”ฉ ๋’ค๋กœ ์˜ฎ๊ธด ๋’ค neighbor์˜ ๋งˆ์ง€๋ง‰ pointer๋ฅผ node์˜ 0๋ฒˆ์งธ pointer๋กœ ๊ฐ€์ ธ์˜ค๊ณ  k_prime ๊ฐ’์„ node์˜ 0๋ฒˆ์งธ key๋กœ ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค. ๊ฐ€์ ธ์˜จ pointer์˜ parent๋ฅผ node๋กœ ๋ณ€๊ฒฝํ•ฉ๋‹ˆ๋‹ค.

์ดํ›„ parent์˜ k_prime_index ์œ„์น˜์˜ key์— neighbor์˜ ๋งˆ์ง€๋ง‰ key๋ฅผ ๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค.

Coalesce

key๋ฅผ node์—์„œ ์‚ญ์ œ ํ›„ node์˜ key ๊ฐœ์ˆ˜์™€ neighbor์˜ key ๊ฐœ์ˆ˜์˜ ํ•ฉ์ด capacity๋ณด๋‹ค ์ž‘์•„์ง€๋Š” ๊ฒฝ์šฐ Coalesce์„ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

๋งŒ์•ฝ node์˜ neighbor๊ฐ€ node์˜ ์˜ค๋ฅธ์ชฝ sibling์ธ ๊ฒฝ์šฐ, node๊ฐ€ neighbor๋ฅผ ๊ฐ€๋ฆฌํ‚ค๊ฒŒ neighbor๊ฐ€ node๋ฅผ ๊ฐ€๋ฆฌํ‚ค๊ฒŒ ๋ฐ”๊พธ์–ด ์ค๋‹ˆ๋‹ค.

Coalesce๋Š” 2๊ฐ€์ง€ ๊ฒฝ์šฐ๋กœ ๋‚˜๋ˆ„์–ด ์ง‘๋‹ˆ๋‹ค.

Case 1 : node is leaf node

neighbor ๋งˆ์ง€๋ง‰ key์™€ pointer ๋’ค์— node์˜ key์™€ pointer๋ฅผ ์ˆœ์„œ๋Œ€๋กœ ์˜ฎ๊ธฐ๊ณ  neighbor์˜ right sibling์„ ๊ฐ€๋ฆฌํ‚ค๋Š” pointer๋ฅผ node๊ฐ€ ๊ธฐ๋ฆฌํ‚ค๊ณ  ์žˆ๋Š” right sibling์œผ๋กœ ๋ณ€๊ฒฝํ•ด ์ค๋‹ˆ๋‹ค.

์ดํ›„ delete_entry ๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ node์˜ parent์—์„œ k_prime ์‚ญ์ œ๋ฅผ ์ง„ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

Case 2 : node is internal node

neighbor์˜ ๋งˆ์ง€๋ง‰ key ๋‹ค์Œ ์œ„์น˜์— k_prime์„ ์‚ฝ์ž…ํ•ฉ๋‹ˆ๋‹ค.

neighbor์˜ ๋งˆ์ง€๋ง‰ key์™€ pointer ๋‹ค์Œ ์œ„์น˜์— node์˜ key์™€ pointer๋ฅผ ์ˆœ์„œ๋Œ€๋กœ ์˜ฎ๊ธด ๋’ค ์˜ฎ๊ธด pointer๊ฐ€ ๊ฐ€๋ฆฌํ‚ค๊ณ  ์žˆ๋Š” node๋“ค์˜ parent๋ฅผ neighbor๋กœ ๋ณ€๊ฒฝํ•˜์—ฌ ์ค๋‹ˆ๋‹ค.

์ดํ›„ delete_entry ๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ node์˜ parent์—์„œ k_prime ์‚ญ์ œ๋ฅผ ์ง„ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

Case 1, 2์—์„œ delete_entry ํ•จ์ˆ˜๋ฅผ ํ˜ธ์ถœํ•˜๋ฉด 4๊ฐ€์ง€์˜ ๊ฒฝ์šฐ๊ฐ€ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๊ทธ ์ค‘ 3๊ฐ€์ง€๋Š” deletion์ด ์ข…๋ฃŒ๋˜๊ฒŒ ๋˜๊ณ , 1๊ฐ€์ง€๋Š” ๋‹ค์‹œ ์œ„์˜ Coalesce๋ฅผ ๋ฐœ์ƒ ์‹œํ‚ต๋‹ˆ๋‹ค.

deletion์ด ์ข…๋ฃŒ๋˜๋Š” 3๊ฐ€์ง€ ๊ฒฝ์šฐ

  1. adjust_root ํ•จ์ˆ˜ ํ˜ธ์ถœ

    โ†’ node์˜ parent๊ฐ€ root์ธ ๊ฒฝ์šฐ์ž…๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์„œ๋Š” adjust_root ํ•จ์ˆ˜ ๋‚ด๋ถ€์—์„œ root๋ฅผ ์žฌ์กฐ์ •ํ•œ ๋’ค deletion์ด ์ข…๋ฃŒ ๋ฉ๋‹ˆ๋‹ค

  2. node์˜ parent์˜ num_keys๊ฐ€ min_keys๋ณด๋‹ค ํฐ ๊ฒฝ์šฐ

    โ†’ ์ด ๊ฒฝ์šฐ๋Š” tree๊ฐ€ modify๋  ํ•„์š”๊ฐ€ ์—†๊ธฐ ๋•Œ๋ฌธ์— deletion์ด ์ข…๋ฃŒ ๋ฉ๋‹ˆ๋‹ค.

  3. redistribute_nodes ํ•จ์ˆ˜ ํ˜ธ์ถœ

    โ†’ redistribute_nodes ํ•จ์ˆ˜ ๋‚ด๋ถ€์—์„œ ์žฌ๋ถ„๋ฐฐ ํ›„ deletion์ด ์ข…๋ฃŒ๋ฉ๋‹ˆ๋‹ค.

Coalesce๊ฐ€ ๋‹ค์‹œ ๋ฐœ์ƒํ•˜๋Š” 1๊ฐ€์ง€ ๊ฒฝ์šฐ

  1. coalesce_nodes ํ•จ์ˆ˜ ํ˜ธ์ถœ

    โ†’ ์ด ๊ฒฝ์šฐ node์˜ parent์—์„œ k_prime์„ ์‚ญ์ œํ•˜๋ฉด, parent์˜ num_keys์™€ parent ์ด์›ƒ์˜ num_keys๊ฐ€ capacity ์ดํ•˜๋กœ ๋–จ์–ด์ง„๋‹ค๋Š” ์˜๋ฏธ์ž…๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์œ„์˜ coalesce ๊ณผ์ •์ด ๋ฐ˜๋ณต๋ฉ๋‹ˆ๋‹ค.

Designs for On-Disk B+ Tree

Page Layout

๊ธฐ์กด์˜ b+ tree์—์„œ ์‚ฌ์šฉ๋˜์—ˆ๋˜ node๋ฅผ page๋ผ๋Š” ์ƒˆ๋กœ์šด ๊ตฌ์กฐ์ฒด๋กœ ๋ณ€๊ฒฝํ•˜์—ฌ ์‚ฌ์šฉํ•  ๊ฒƒ ์ž…๋‹ˆ๋‹ค.

page struct๋Š” 4096 bytes์˜ ๊ณ ์ •๋œ ํฌ๊ธฐ๋ฅผ ๊ฐ€์ง€๋ฉฐ, 4๊ฐ€์ง€ ์ข…๋ฅ˜์˜ ํŽ˜์ด์ง€๊ฐ€ ์กด์žฌ ํ•ฉ๋‹ˆ๋‹ค.

  1. Header page

    โ†’ 8 bytes ํฌ๊ธฐ์˜ Free page number, Root page number, Number of pages๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ๊ฐ๊ฐ ์ฒซ๋ฒˆ์งธ Free page number, Root page number, Page ๊ฐœ์ˆ˜๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค.

  2. Free page

    โ†’ 8 bytes ํฌ๊ธฐ์˜ Next free page number๋ฅผ ๊ฐ–๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋Š” ์ž์‹ ์˜ ๋‹ค์Œ free page number๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค.

  3. Internal page

    โ†’ 128 bytes ํฌ๊ธฐ์˜ page header 1๊ฐœ, 16 bytes์˜ leaf record 248๊ฐœ๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

  4. Leaf page

    โ†’ 128 bytes ํฌ๊ธฐ์˜ page header 1๊ฐœ, 128 bytes์˜ leaf record 31๊ฐœ๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

Record Layout

Internal page์™€ Leaf page์—๋Š” ๊ฐ๊ฐ page header์™€ record๊ฐ€ ์กด์žฌํ•ฉ๋‹ˆ๋‹ค.

  1. Page header

    โ†’ 128 bytes์˜ ํฌ๊ธฐ๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

    โ†’ 8 bytes ํฌ๊ธฐ์˜ Parent page number, Number of keys, 4 bytes์˜ is leaf ๊ฐ€ ์กด์žฌ ํ•ฉ๋‹ˆ๋‹ค. ๊ฐ๊ฐ ์ž์‹ ์˜ parent page number, ๊ฐ€์ง€๊ณ  ์žˆ๋Š” key์˜ ๊ฐœ์ˆ˜, ์ž์‹ ์ด leaf page์ธ์ง€ ์•„๋‹Œ์ง€๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค.

    โ†’ Internal page์˜ ๊ฒฝ์šฐ 8 bytes ํฌ๊ธฐ์˜ One more page number๋ฅผ ๊ฐ€์ง€๋ฉฐ, ์ด๋Š” ์ž์‹ ์˜ record๊ฐ€ ๋ชจ๋‘ ์ฑ„์›Œ์กŒ์„ ๋•Œ ๋งˆ์ง€๋ง‰ key๋ณด๋‹ค ํฌ๊ฑฐ๋‚˜ ๊ฐ™์€ key๋ฅผ ๊ฐ€์ง€๋Š” page์˜ page number๋ฅผ ๊ฐ€๋ฆฌํ‚ค๊ธฐ ์œ„ํ•จ ์ž…๋‹ˆ๋‹ค.

    โ†’ Leaf page์˜ ๊ฒฝ์šฐ 8 bytes ํฌ๊ธฐ์˜ Right sibling page number๋ฅผ ๊ฐ€์ง€๋ฉฐ, ์ด๋Š” ์ž์‹ ์˜ ์˜ค๋ฅธ์ชฝ ํ˜•์ œ์˜ page number๋ฅผ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค. ์ด๋ ‡๊ฒŒ ์‚ฌ์šฉํ•˜๋Š” ์ด์œ ๋Š” scan์˜ ํšจ์œจ์„ฑ์„ ๋†’์ด๊ธฐ ์œ„ํ•จ ์ž…๋‹ˆ๋‹ค.

  2. Internal record

    โ†’ 8 bytes ํฌ๊ธฐ์˜ key์™€ page number๋ฅผ ๋‹ด๊ณ  ์žˆ๋Š” ๊ตฌ์กฐ์ฒด ์ž…๋‹ˆ๋‹ค.

  3. Leaf record

    โ†’ 8 bytes ํฌ๊ธฐ์˜ key์™€ 120 bytes ํฌ๊ธฐ์˜ value๋ฅผ ๋‹ด๊ณ  ์žˆ๋Š” ๊ตฌ์กฐ์ฒด ์ž…๋‹ˆ๋‹ค.

Page Layout์„ ์œ„ํ•œ Design

๊ฐ๊ฐ์˜ page๋Š” ๋ชจ๋‘ 4096 bytes์˜ ํฌ๊ธฐ๋ฅผ ๊ฐ€์ง‘๋‹ˆ๋‹ค.

4๊ฐ€์ง€ ์ข…๋ฅ˜์˜ page๋ฅผ ๊ฐ๊ฐ struct๋กœ ๊ตฌํ˜„ ํ›„ struct page_t ๋‚ด๋ถ€์—์„œ union์„ ์‚ฌ์šฉํ•˜์—ฌ 4๊ฐ€์ง€ page๋“ค์„ page_t ํƒ€์ž…์œผ๋กœ ํ•œ๋ฒˆ์— ์‚ฌ์šฉ ํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

internal page์™€ leaf page๋Š” ๋™์ผํ•œ struct page๋กœ ๊ตฌํ˜„ ํ›„ ๋‚ด๋ถ€์ ์œผ๋กœ ๋‹ค๋ฅธ ๋ถ€๋ถ„๋งŒ union์„ ์‚ฌ์šฉํ•˜์—ฌ ๋‘˜์„ page ํƒ€์ž…์œผ๋กœ ํ•œ๋ฒˆ์— ์‚ฌ์šฉํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๋‚ด๋ถ€์ ์œผ๋กœ ๋‹ค๋ฅธ ๋ถ€๋ถ„์ด๋ž€ one_more_pagenum๊ณผ right_sibling_pagenum, internal record์™€ leaf record๋ฅผ ๋งํ•ฉ๋‹ˆ๋‹ค.

record์˜ ๊ฒฝ์šฐ internal record์™€ leaf record์˜ struct๋ฅผ ๊ฐ๊ฐ ๋งŒ๋“ค์–ด์„œ ์‚ฌ์šฉํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

์š”์•ฝ : ์•„๋ž˜์™€ ๊ฐ™์€ ์ฝ”๋“œ๋กœ page layout์ด ๋งŒ๋“ค์–ด ์ง‘๋‹ˆ๋‹ค.

/* struct which represents all kind of pages
 * (header, free, internal, leaf)
 */
typedef struct page_t {
	union {
		// header page struct
		headerPage h;
		// free page struct
		freePage f;
		// struct for internal page and leaf page
		page p;
	};
} page_t;

Disk File Layout

page๊ฐ€ ์ž์‹ ์˜ page number์— ๋งž๋Š” Disk file ์œ„์น˜์— ์žˆ๋„๋ก ๊ตฌ์„ฑํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

๋”ฐ๋ผ์„œ disk file์˜ ์‹œ์ž‘ ์œ„์น˜๋กœ ๋ถ€ํ„ฐ Page size (4096) * page number ๋งŒํผ ๋–จ์–ด์ง„ ์œ„์น˜์— page๊ฐ€ ์กด์žฌํ•˜๊ฒŒ ๊ตฌ์„ฑํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.

ํŠน๋ณ„ํžˆ header page๋Š” ํ•ญ์ƒ disk file์˜ offset 0๋ฒˆ ์œ„์น˜์—์„œ ์‹œ์ž‘ํ•˜๋„๋ก ๊ตฌ์„ฑ ํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

I/O ์ˆ˜ํ–‰ ์‹œ์ 

๊ฐ๊ฐ์˜ I/O๋Š” Disk Manager APIs๋ฅผ ํ†ตํ•ด ์ˆ˜ํ–‰ ๋˜๋ฉฐ, File APIs ๋‚ด๋ถ€์—์„œ page๋ฅผ ์ฝ์–ด ์˜ค๊ฑฐ๋‚˜, ์ฝ์–ด ์˜จ page์˜ ๋‚ด์šฉ์— ๋ณ€๊ฒฝ์ด ์ƒ๊ธฐ๋Š” ์ฆ‰์‹œ Disk Manager APIs ํ˜ธ์ถœํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

์ฆ‰, page๋ฅผ ์ฝ์–ด ์™€์„œ page ๋‚ด๋ถ€ ๋ฐ์ดํ„ฐ๋ฅผ ๋ด์•ผํ•˜๋Š” ๊ฒฝ์šฐ, ์ฝ์–ด ์˜จ page ๋‚ด๋ถ€ ๊ฐ’์„ ๋ณ€๊ฒฝํ•˜์—ฌ disk์™€ in-memory ์ƒ์˜ ๋ฐ์ดํ„ฐ๊ฐ€ ๊ฐ™์ง€ ์•Š์€ ๊ฒฝ์šฐ I/O๋ฅผ ์ˆ˜ํ–‰ ํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

DB APIs

DB APIs๋Š” user์™€ db๊ฐ„์˜ interaction์„ ์œ„ํ•œ ํ•จ์ˆ˜๋“ค์„ ์„ค๊ณ„ํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

int open_table(char* pathname);
int db_insert(int64_t key, char* value);
int db_delete(int64_t key);
int db_find(int64_t key, char* ret_val);

์œ„์˜ ํ•จ์ˆ˜๋“ค์„ ํ†ตํ•ด์„œ file API๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ db์— ์ ‘๊ทผ ํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•  ๊ฒƒ ์ž…๋‹ˆ๋‹ค.

DB_API ์‚ฌ์šฉ์ž์˜ ๋ช…๋ น์„ ๋ฐ›์•„์˜ค๋Š” API ์ž…๋‹ˆ๋‹ค.

File APIs

File APIs๋Š” file์— ์žˆ๋Š” db ๋‚ด์šฉ์„ ๋‹ด์•„ ์˜ค๊ธฐ ์œ„ํ•œ page์— ๋Œ€ํ•œ ๊ตฌ์กฐ์ฒด๊ฐ€ ์ •์˜๋˜์–ด ์žˆ์œผ๋ฉฐ, in-memory ์ƒ์—์„œ DB APIs์˜ ์š”์ฒญ์— ์•Œ๋งž๋Š” ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•˜๋Š” ํ•จ์ˆ˜๊ฐ€ ์ •์˜๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค.

๋˜ํ•œ file์—์„œ page๋กœ ๊ฐ’์„ ์ฝ์–ด ์˜ค๊ฑฐ๋‚˜, page์—์„œ file๋กœ ๊ฐ’์„ ์“ธ ๋•Œ์™€ ๊ฐ™์ด file๊ณผ page ๊ฐ„์˜ ์ƒํ˜ธ์ž‘์šฉ์ด ํ•„์š”ํ•  ๋•Œ ์‚ฌ์šฉ ๋˜๋Š” ํ•จ์ˆ˜๋“ค์ด ์„ค๊ณ„๋  ๊ฒƒ ์ž…๋‹ˆ๋‹ค.

์—ฌ๊ธฐ์„œ Disk Manager APIs๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ ๋ณ€๊ฒฝ๋œ ๊ฐ’์„ ์“ฐ๊ฑฐ๋‚˜, ์ฝ์–ด์•ผ ํ•  ๊ฐ’์„ ์ฝ์–ด ์˜ค๋Š” ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

File API๋Š” DB APIs์˜ ์š”์ฒญ์„ ๋ฐ›์•„ ์ž‘์—…์„ ์ˆ˜ํ–‰ํ•˜๊ณ , ๊ฒฐ๊ณผ๋ฅผ ๋ฐ˜์˜ํ•˜๊ธฐ ์œ„ํ•ด Disk Manager APIs๋ฅผ ํ˜ธ์ถœํ•˜๋Š” API ์ž…๋‹ˆ๋‹ค.

Disk Manager APIs

Disk Manager APIs๋Š” memory ์ƒ์—์„œ ์ˆ˜์ •๋œ ์ •๋ณด๋“ค์„ ์‹ค์ œ disk์— ์จ์ฃผ๋Š” ์ž‘์—…์„ ํ•˜๋Š” ํ•จ์ˆ˜๋“ค์„ ์„ค๊ณ„ํ•  ๊ฒƒ ์ž…๋‹ˆ๋‹ค.

์˜ค์ง Disk Manager APIs๋ฅผ ํ†ตํ•ด์„œ๋งŒ disk์— ์ ‘๊ทผ ๊ฐ€๋Šฅํ•˜๋„๋ก ๋ถ„๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด์„œ ์ž…๋‹ˆ๋‹ค.

Disk Manager APIs๋Š” File APIs๋กœ ๋ถ€ํ„ฐ ์ฝ๊ธฐ ์“ฐ๊ธฐ ์ž‘์—…์„ ์š”์ฒญ ๋ฐ›์•„ disk์— ์ง์ ‘ ์ ‘๊ทผํ•˜์—ฌ ์ž‘์—…์„ ์ง„ํ–‰ํ•˜๋Š” API ์ž…๋‹ˆ๋‹ค.

๋”ฐ๋ผ์„œ User โ†’ DB APIs โ†’ File APIs โ†’ Disk Manager APIs โ†’ Disk์˜ ๊ตฌ์กฐ๋กœ top-down operation์ด ์ผ์–ด๋‚˜๊ฒŒ ๋” ๋””์ž์ธ ํ•  ๊ณ„ํš์ž…๋‹ˆ๋‹ค.

Clone this wiki locally