This project was made during my studies in UOA University and especially for the course Implementation of Database Systems.
There are comments all over my code if there is something that i didn't covered here.
Κάνουμε BF_init() μόνο μία φορά στην main.
Includes Makefile which compiles all C files with the make command. The program is executed with the command ./prog [-arg1]
- arg1: Select a record file (1,5,10,15 for records \ * K.txt)
- Bytes 0-480: Records
- Bytes 504-508: BlockId of the next Block
- Bytes 508-512: Current number of records in Block
The Hash File is implemented via Block File. Each block represents an entry of the Hash Table.
The Secondary Hash File is implemented via Block File. Each block represents an entry in the Secondary Hash Table.
- Reads the records from the file selected through the argument (1,5,10,15) and enter the record for each line of the record file first in the primary HASH TABLE with key id and then in SECONDARY HASH TABLE with key surname
- Then we print all the entries that were entered.
- And we print all the statistics for both hash tables.
Contains auxiliary functions for retrieving / initializing / changing / printing information in a record or block.
It contains auxiliary functions to create the hash of id through the hash function ** SHA1 ** as well as to calculate the corresponding index in the Hash Table.
Create File first checks if the filename given to them through access() exists and if it already exists does not continue to create the HP File. After creating and opening a Block File and allocating the first Block we copy the basic information of the Heap File to a struct structure firstBlockInfo. After copying the struct via memcpy() to the first block we call BF_WriteBlock and close the file with BF_CloseFile.
OpenFile opens the Block File, copies and returns a struct HP_info with the information of the first block. The Heap File remains open until HP_CloseFile.
CloseFile closes the heap file and releases the struct header_info.
InsertEntry, after checking the uniqueness of the Record id given, distinguishes 3 cases:
- There is no record in the Heap (except the 1st block of information) _: Bind the second block, copy the record with memcpy () and increase the number of records of this block.
- The last block has less than 5 records_: We move the pointer to the last record and copy the record having increased the number of records.
- The last block is full (contains 5 entries) : In a similar way to (1) we commit a new block, we copy the record and we change the NextNumber of the previous block to show in the blockId of the new block.
HP_InsertEntrySec does what it does and HP_InsertEntry simply inserts a secondary record instead of a record.
Runs the entries of each block until it finds the corresponding record id and replaces the record data with 0.
- Case value == NULL: We print all the records of all the blocks
- Case value == id: We print only the record with the corresponding id (1 if we checked the insert for uniqueness id).
HP_GetAllEntriesSurname is the same as HP_GetAllEntries just looking for surname instead of id.
Create Index checks if the filename given to them through access () exists and if it already exists does not proceed to the creation of HT File. After creating and opening a Block File and allocating the first block we copy the basic information of the Hash File to a struct structure firstBlockInfoHT. After copying the struct via memcpy () in the first block we initialize each entry (block) of the hash table with the hashTableEntry structure. Call BF_WriteBlock and close the file with BF_CloseFile.
OpenFile opens the Block File, copies and returns a struct HT_info with the information of the first block. The Hash File remains open until HT_CloseFile.
CloseFile closes the heap file for each entry (block) of the hash table, closes the same hash file (bf_CloseFile) and releases the struct header_info.
InsertEntry uses the functions in hashing.c to create the hash and calculate the hash table index and then one of the following:
- A bucket already exists (heapExists == 1) so it simply calls HP_InsertEntry to that bucket
- There is no bucket (heapExists == 0) so it creates a heap file, copies to the specific entry of the hash table information about the heap file and imports the entry via HP_InsertEntry.
DeleteEntry calculates the hashIndex of the given record id, goes to the corresponding entry (block) of the hash table and only if there is a bucket calls HP_DeleteEntry.
- Case value == NULL: For each entry of the block file ** in which there is a bucket ** we print all the entries of all the blocks through HP_GetAllEntries.
- Case value == id: Calculate the hashIndex of the record id, go to the corresponding entry (block) and call HP_InsertEntry for the specific id.
We run through the hash table blocks that have a bucket and:
-
calculate the total number of blocks with totalNumOfBlocks
-
Use HP_print_Min_Av_Max_Records (util.c) to calculate the minimum, average and maximum number of records contained in each bucket of the file.
-
Calculate the average number of blocks contained in the hash table buckets (totalNumOfBlocks / buckets).
-
We use HP_get_Overflow_Blocks (util.c) to calculate the number of overflow buckets and how many blocks there are in each bucket.
SHT_CreateSecondaryIndex checks if the sfilename given to them via access () exists and if it already exists does not proceed to the creation of SET File. After creating and opening a Block File and allocating the first block we copy the basic information of the Secondary Hash File to a struct structure firstBlockInfoSHT. After copying the struct via memcpy () in the first block we initialize each entry (block) of the hash table with the hashTableEntry structure. Call BF_WriteBlock and close the file with BF_CloseFile.
SHT_OpenSecondaryIndex opens the Block File, copies and returns a struct SHT_info with the information of the first block. The Secondary Hash File remains open until SHT_CloseSecondaryIndex.
SHT_CloseSecondaryIndex closes the heap file for each entry (block) of the secondary hash table, closes the same secondary hash file (bf_CloseFile) and releases the struct header_info.
We do not check here for uniqueness as in the Hash Table because there may be two registrations with the same surname.
SHT_SecondaryInsertEntry uses the functions in hashing.c to create the hash and compute the index of the secondary hash table and then one of the following:
- A bucket already exists (heapExists == 1) so it simply calls HP_InsertEntrySec to that bucket
- There is no bucket (heapExists == 0) so it creates a heap file, copies to the specific entry of the hash table information about the heap file and enters the entry via HP_InsertEntrySec.
- Case value == NULL: For each entry of the block file ** in which there is a bucket ** Call HT_GetAllEntries with NULL.
- Case value == id: We calculate the hashIndex of the secondary record id, go to the corresponding entry (block) and run the Bucket looking for the surname we are interested in, retrieve the block id that shows in the primary index and call HP_GetAllEntriesSurname () with arguments to the block file information from the primary index bucket and surname.
Depending on the filename we choose the way we control the records (+96 bytes for primary index and +32 bytes for secondary index)
We run through the hash table blocks that have a bucket and:
- calculate the total number of blocks with totalNumOfBlocks
- We use HP_print_Min_Av_Max_Records (util.c) to calculate the minimum, average and maximum number of records contained in each bucket of the file.
- Calculate the average number of blocks contained in the hash table buckets (totalNumOfBlocks / buckets).
- We use HP_get_Overflow_Blocks (util.c) to calculate the number of overflow buckets and how many blocks there are in each bucket.