Ham Ji Seong edited this page Mar 8, 2019 · 1 revision

Multicolumn Join

Changes from the previous project

  1. Rename the API function "int delete (int table_id, keynum_t key)" to "int remove (int table_id, keynum_t key)".
  2. Change the file extension of all source files in the source folder (project4/bpt/src/) to ".cpp".
  3. Improve code readability by introducing OOP concepts in C++.
  4. Modify the record format to support multiple columns.
  5. Support simple command parsing.
  6. Support join operations.
  7. Support parallel execution.

1. Modify the declaration of the API function

  • Rename the 'delete' API call: its signature changes from "int delete (int table_id, keynum_t key)" to "int remove (int table_id, keynum_t key)", since delete is a reserved keyword in C++ and cannot be used as a function name.
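The renamed call can be sketched as follows. Only the signature comes from this page. The keynum_t alias, the return convention (0 on success, -1 when the key is absent), the non-negative table id check, and the in-memory g_tables stand-in for the on-disk tables are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>

using keynum_t = std::int64_t;  // assumption: keys are 64-bit integers

// Hypothetical in-memory stand-in for the on-disk tables (table id -> keys).
static std::map<int, std::set<keynum_t>> g_tables;

// int delete(int table_id, keynum_t key);  // ill-formed in C++: `delete` is a keyword

// Remove `key` from table `table_id`.
// Assumed convention: returns 0 on success, -1 if the table or key is absent.
int remove(int table_id, keynum_t key) {
    assert(table_id >= 0);  // assumption: table ids are never negative
    auto table = g_tables.find(table_id);
    if (table == g_tables.end()) return -1;
    return table->second.erase(key) == 1 ? 0 : -1;
}
```

This overload coexists with the C library's remove(const char*), because the parameter lists differ.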

2. Change the file extensions of all source files into ".cpp"

  • Rename every source file so that it uses the ".cpp" (C++) extension.

3. Enhance the code readability

  • Restructuring the code around an object-oriented design improves its readability and developer productivity.

4. Modify the record format to support multiple columns

  • Provide multiple columns by reusing the space previously occupied by the record data.
  • Add a column-copying function to "class Page" for convenient access to column values.
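A minimal sketch of a multi-column record and a column-copying accessor, under stated assumptions: the column count (15), the 64-bit column type, the Record layout, and the member names append and copy_column are all hypothetical. Only the idea of a column-copying function on "class Page" comes from this page.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using keynum_t = std::int64_t;           // assumption: 64-bit keys

constexpr std::size_t kNumColumns = 15;  // assumption: number of value columns

// Hypothetical record: the former data area is split into fixed-width columns.
struct Record {
    keynum_t key;
    std::array<std::int64_t, kNumColumns> columns;
};

// Hypothetical, simplified page holding records in memory.
class Page {
public:
    void append(const Record& r) { records_.push_back(r); }
    std::size_t size() const { return records_.size(); }

    // Copy the values of column `col` from every record in this page.
    std::vector<std::int64_t> copy_column(std::size_t col) const {
        assert(col < kNumColumns);
        std::vector<std::int64_t> out;
        out.reserve(records_.size());
        for (const Record& r : records_) out.push_back(r.columns[col]);
        return out;
    }

private:
    std::vector<Record> records_;
};
```

Copying a column into a contiguous vector lets later stages, such as hashing for a join, scan one column without touching the rest of each record.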

5. Support simple command parsing

  • Add a simple command parser (ParseTree) to the join query module.
  • The parser is implemented with C++ strings.
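This page does not specify the query syntax, so the sketch below assumes a hypothetical format in which join predicates look like "1.1=2.3&2.2=3.1" (table.column=table.column, separated by '&'). It uses only std::string operations and returns a flat list of predicates rather than the ParseTree mentioned above. JoinPredicate, parse_operand, and parse_join_query are illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical predicate: left_table.left_column = right_table.right_column
struct JoinPredicate {
    int left_table, left_column, right_table, right_column;
};

// Parse an operand of the assumed form "T.C" into table and column numbers.
static void parse_operand(const std::string& s, int& table, int& column) {
    const std::size_t dot = s.find('.');
    assert(dot != std::string::npos);  // sketch only: malformed input aborts
    table = std::stoi(s.substr(0, dot));
    column = std::stoi(s.substr(dot + 1));
}

// Split the query on '&', then split each term on '='.
std::vector<JoinPredicate> parse_join_query(const std::string& query) {
    std::vector<JoinPredicate> out;
    std::size_t begin = 0;
    while (begin < query.size()) {
        std::size_t end = query.find('&', begin);
        if (end == std::string::npos) end = query.size();
        const std::string term = query.substr(begin, end - begin);
        const std::size_t eq = term.find('=');
        assert(eq != std::string::npos);
        JoinPredicate p{};
        parse_operand(term.substr(0, eq), p.left_table, p.left_column);
        parse_operand(term.substr(eq + 1), p.right_table, p.right_column);
        out.push_back(p);
        begin = end + 1;
    }
    return out;
}
```

A real parser would report malformed input through an error value instead of assert, which is compiled out when NDEBUG is defined.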

6. Support join operations

  • Use only a naive hash join.
    • First, read the entire table from disk.
    • Second, hash the data into an in-memory hash table.
    • Third, join the two distinct tables on each key.
    • If there are more tables to join, repeat the same procedure with them.
    • Otherwise, compute the sum of all join keys.
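The steps above can be sketched as a build-and-probe hash join over tables already loaded into memory. This is an illustration, not the project's code: Row, Table, hash_join, and sum_column are hypothetical names, rows are modeled as vectors of 64-bit integers, and reading from disk is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using Row = std::vector<std::int64_t>;  // assumption: all columns are 64-bit integers
using Table = std::vector<Row>;         // assumption: table already read into memory

// Join `left` and `right` where left[left_col] == right[right_col].
// Each output row is the left row followed by the matching right row.
Table hash_join(const Table& left, std::size_t left_col,
                const Table& right, std::size_t right_col) {
    // Build phase: hash the left table's join column to row indices.
    std::unordered_multimap<std::int64_t, std::size_t> index;
    index.reserve(left.size());
    for (std::size_t i = 0; i < left.size(); ++i) {
        assert(left_col < left[i].size());
        index.emplace(left[i][left_col], i);
    }
    // Probe phase: look up each right row's key in the hash table.
    Table out;
    for (const Row& r : right) {
        assert(right_col < r.size());
        auto range = index.equal_range(r[right_col]);
        for (auto it = range.first; it != range.second; ++it) {
            Row joined = left[it->second];
            joined.insert(joined.end(), r.begin(), r.end());
            out.push_back(std::move(joined));
        }
    }
    return out;
}

// Sum one column (for example the join key) over all rows of a result.
std::int64_t sum_column(const Table& t, std::size_t col) {
    std::int64_t sum = 0;
    for (const Row& r : t) sum += r[col];
    return sum;
}
```

Joining more than two tables amounts to feeding the result back in, for example hash_join(hash_join(a, 0, b, 0), 1, c, 0), and calling sum_column on the final result.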

7. Parallel Execution

  • The hashing procedure runs in parallel: two threads, each hashing its own table.
  • The join of two distinct tables also runs in parallel.
  • The sum of all keys in the result of the final join is computed by multiple threads using atomic operations.
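A minimal sketch of two of these ideas using std::thread and std::atomic: each of two threads builds the hash table for its own table, and several threads add partial sums into one atomic total. The names KeyIndex, build_index, build_both, and parallel_sum are hypothetical, and accumulating locally before a single fetch_add per thread is a design choice of this sketch, not something stated on this page. The parallel join itself is not shown.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>
#include <unordered_map>
#include <vector>

using KeyIndex = std::unordered_multimap<std::int64_t, std::size_t>;

// Hash one table's keys to their row positions.
void build_index(const std::vector<std::int64_t>& keys, KeyIndex& index) {
    for (std::size_t i = 0; i < keys.size(); ++i) index.emplace(keys[i], i);
}

// Two threads, each hashing its own table. The threads share no mutable
// state, so no locking is needed.
void build_both(const std::vector<std::int64_t>& a, KeyIndex& index_a,
                const std::vector<std::int64_t>& b, KeyIndex& index_b) {
    std::thread ta(build_index, std::cref(a), std::ref(index_a));
    std::thread tb(build_index, std::cref(b), std::ref(index_b));
    ta.join();
    tb.join();
}

// Sum `keys` with `num_threads` workers that publish into one atomic total.
std::int64_t parallel_sum(const std::vector<std::int64_t>& keys,
                          std::size_t num_threads) {
    assert(num_threads > 0);
    std::atomic<std::int64_t> total{0};
    std::vector<std::thread> workers;
    const std::size_t chunk = (keys.size() + num_threads - 1) / num_threads;
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(t * chunk, keys.size());
        const std::size_t end = std::min(begin + chunk, keys.size());
        workers.emplace_back([&keys, &total, begin, end] {
            std::int64_t local = 0;  // accumulate locally, publish once
            for (std::size_t i = begin; i < end; ++i) local += keys[i];
            total.fetch_add(local, std::memory_order_relaxed);
        });
    }
    for (std::thread& w : workers) w.join();
    return total.load();
}
```

One fetch_add per thread keeps contention on the shared counter low. Calling fetch_add once per key would also give the correct sum, but with far more atomic traffic.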