Join

Ham Ji Seong edited this page Mar 8, 2019 · 1 revision

Multicolumn Join

Changes from the previous project

Rename the API function "int delete (int table_id, keynum_t key)" to "int remove (int table_id, keynum_t key)".
Change the file extension of every source file in the source folder (project4/bpt/src/) to ".cpp".
Improve code readability by introducing C++ OOP concepts.
Modify the record format to support multiple columns.
Support simple command parsing.
Support join operations.
Support parallel execution.

1. Modify the declaration of the API function

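The signature of the 'delete' API call changes from "int delete (int table_id, keynum_t key)" to "int remove (int table_id, keynum_t key)". Note that `delete` is a reserved keyword in C++, so it cannot be used as a function name once the sources are compiled as C++.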
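
The renamed declaration can be sketched as follows. This is only an illustration: the `keynum_t` alias, the in-memory `stored_keys` set, and the return-value convention (0 on success) are assumptions made to keep the example self-contained, not details taken from the project.

```cpp
#include <cstdint>
#include <set>

using keynum_t = std::int64_t;  // assumption: the key type is a 64-bit integer

// int delete(int table_id, keynum_t key);  // ill-formed in C++: `delete` is a keyword
int remove(int table_id, keynum_t key);     // renamed declaration

// Hypothetical stand-in for the on-disk table, used only to make the sketch runnable.
static std::set<keynum_t> stored_keys = {1, 2, 3};

// Assumed convention: returns 0 on success and a nonzero value if the key is absent.
int remove(int table_id, keynum_t key) {
    (void)table_id;  // a single in-memory table is enough for this sketch
    return stored_keys.erase(key) == 1 ? 0 : -1;
}
```

The standard library also declares `remove(const char*)` for files; the two-argument version above is a separate overload, so both can coexist in C++.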

2. Change the file extensions of all source files into ".cpp"

Rename all source files so that they use the ".cpp" (C++) extension.

3. Enhance the code readability

An object-oriented design improves the readability of the code and the productivity of working with it.

4. Modify the current record format to support multiple columns

Support multiple columns by reusing the space that each record previously used for its data.
Add a column-copying function to "class Page" for convenient access to column values.
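
A minimal sketch of what such a record layout and column-copying function might look like is shown below. The 128-byte record size, the limit of 15 integer columns, and the names `Record`, `append`, and `copy_column` are assumptions for illustration; the project's actual `Page` class may differ.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using keynum_t = std::int64_t;           // assumption: 8-byte integer key

constexpr std::size_t kMaxColumns = 15;  // assumption: 120 bytes / 8 bytes per column

// Hypothetical record: the key followed by fixed-width integer columns
// stored in the area that used to hold a single value.
struct Record {
    keynum_t key;
    std::array<std::int64_t, kMaxColumns> columns;
};
static_assert(sizeof(Record) == 128, "record is assumed to keep its old size");

// Simplified in-memory page holding a list of records.
class Page {
public:
    void append(const Record& record) { records_.push_back(record); }

    // Copies column `col` of every record on the page into one contiguous
    // vector, so callers do not have to walk the records themselves.
    std::vector<std::int64_t> copy_column(std::size_t col) const {
        std::vector<std::int64_t> out;
        out.reserve(records_.size());
        for (const Record& record : records_) {
            out.push_back(record.columns.at(col));  // at() throws on a bad index
        }
        return out;
    }

private:
    std::vector<Record> records_;
};
```

Returning a contiguous copy of one column keeps the join code independent of the record layout, at the cost of one extra copy per page.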

5. Support simple command parsing

Add a simple command parser (ParseTree) to the join query module.
The parser is built on C++ strings (std::string).
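
As an illustration of parsing with `std::string` alone, the sketch below splits a join query into predicates. The query syntax (`table.column=table.column` terms separated by `&`) and the names `ColumnRef`, `JoinPredicate`, and `parse_join_query` are hypothetical, since this page does not specify the command format. The sketch also produces a flat list of predicates rather than a tree.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical types: a column reference and an equality predicate between two columns.
struct ColumnRef { int table_id; int column; };
struct JoinPredicate { ColumnRef left; ColumnRef right; };

// Parses one "table.column" term, e.g. "2.3".
ColumnRef parse_column_ref(const std::string& text) {
    const std::size_t dot = text.find('.');
    if (dot == std::string::npos) {
        throw std::invalid_argument("expected table.column, got: " + text);
    }
    return {std::stoi(text.substr(0, dot)), std::stoi(text.substr(dot + 1))};
}

// Parses an assumed query format such as "1.1=2.3&2.4=3.1".
std::vector<JoinPredicate> parse_join_query(const std::string& query) {
    std::vector<JoinPredicate> predicates;
    std::size_t begin = 0;
    while (begin <= query.size()) {
        std::size_t end = query.find('&', begin);
        if (end == std::string::npos) end = query.size();

        const std::string term = query.substr(begin, end - begin);
        const std::size_t eq = term.find('=');
        if (eq == std::string::npos) {
            throw std::invalid_argument("expected left=right, got: " + term);
        }
        predicates.push_back({parse_column_ref(term.substr(0, eq)),
                              parse_column_ref(term.substr(eq + 1))});
        begin = end + 1;  // skip past the '&' separator
    }
    return predicates;
}
```

Malformed input (an empty term, a missing `=` or `.`, or a non-numeric identifier) is reported through exceptions rather than silently ignored.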

6. Support join operations

Only a naive hash join is used.
- First, read each entire table from disk.
- Second, hash the data into an in-memory hash table.
- Third, join the two tables by matching each key.
- If more tables remain to be joined, repeat the same procedure with them.
- Otherwise, compute the sum of all join keys.
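
The steps above can be sketched in memory as follows. This is a simplified illustration, not the project's code: tables are assumed to be already loaded into vectors, and the names `Row`, `Table`, `hash_join`, and `sum_column` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using Row = std::vector<std::int64_t>;  // one record: a list of integer columns
using Table = std::vector<Row>;         // a table fully loaded into memory

// Joins `build` and `probe` on build[build_col] == probe[probe_col].
// Each output row holds the build-side columns followed by the probe-side columns.
Table hash_join(const Table& build, std::size_t build_col,
                const Table& probe, std::size_t probe_col) {
    // Build phase: hash every row of one table on its join column.
    std::unordered_multimap<std::int64_t, const Row*> index;
    index.reserve(build.size());
    for (const Row& row : build) {
        index.emplace(row.at(build_col), &row);
    }

    // Probe phase: look up each row of the other table and emit all matches.
    Table result;
    for (const Row& row : probe) {
        auto range = index.equal_range(row.at(probe_col));
        for (auto it = range.first; it != range.second; ++it) {
            Row joined = *it->second;
            joined.insert(joined.end(), row.begin(), row.end());
            result.push_back(std::move(joined));
        }
    }
    return result;
}

// Sums one column (for example the join key) over all rows of a table.
std::int64_t sum_column(const Table& table, std::size_t col) {
    std::int64_t sum = 0;
    for (const Row& row : table) sum += row.at(col);
    return sum;
}
```

To join a third table, the result of one `hash_join` call can be passed as an input to the next call; the caller then has to account for the shifted column positions in the concatenated rows.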

7. Parallel Execution

The hashing step runs in parallel: two threads each hash their own table.
The join of the two tables also runs in parallel.
The sum of all keys in the final join result is computed by multiple threads using atomic operations.
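
One way these three phases could fit together is sketched below with `std::thread` and `std::atomic`. It is an assumption-laden simplification: each table is reduced to its join-key column, the hash tables store only a row count per key, and the names `build_index` and `parallel_join_key_sum` are hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

using Key = std::int64_t;
using KeyCount = std::unordered_map<Key, std::int64_t>;  // key -> rows with that key

// Hashes one table's join-key column into `out`.
void build_index(const std::vector<Key>& keys, KeyCount& out) {
    for (Key k : keys) ++out[k];
}

// Returns the sum of the join key over every row of (left JOIN right).
std::int64_t parallel_join_key_sum(const std::vector<Key>& left,
                                   const std::vector<Key>& right,
                                   unsigned workers) {
    // Phase 1: two threads, each hashing its own table.
    KeyCount left_index, right_index;
    std::thread t1(build_index, std::cref(left), std::ref(left_index));
    std::thread t2(build_index, std::cref(right), std::ref(right_index));
    t1.join();
    t2.join();

    // Phase 2 and 3: split the left keys among workers; each worker probes the
    // right index (read-only) and adds its partial sum to a shared atomic.
    const std::vector<std::pair<Key, std::int64_t>> entries(left_index.begin(),
                                                            left_index.end());
    const KeyCount& probe = right_index;
    std::atomic<std::int64_t> sum{0};
    if (workers == 0) workers = 1;
    const std::size_t chunk = (entries.size() + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(entries.size(), w * chunk);
        const std::size_t end = std::min(entries.size(), begin + chunk);
        pool.emplace_back([&entries, &probe, &sum, begin, end] {
            std::int64_t local = 0;
            for (std::size_t i = begin; i < end; ++i) {
                auto it = probe.find(entries[i].first);
                if (it != probe.end()) {
                    // A key with m left rows and n right rows yields m * n joined rows.
                    local += entries[i].first * entries[i].second * it->second;
                }
            }
            sum.fetch_add(local, std::memory_order_relaxed);
        });
    }
    for (std::thread& t : pool) t.join();
    return sum.load();
}
```

Each worker accumulates into a local variable and performs a single `fetch_add`, which keeps contention on the atomic low. Relaxed ordering is sufficient here because joining the threads already synchronizes their writes with the final `load`.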