This repository shares all the ingestion tasks and queries used to run the test scenarios presented in the master's thesis and in the submitted paper.
There are two main folders in this repository:
- Ingestion Tasks: it is subdivided into two subfolders, named "all_attributes" and "necessary_attributes". The first subfolder contains ingestion tasks that create the different tables based on a denormalized model with all the attributes of the SSB model, applying different segment granularities and hashed partitions. The second subfolder contains ingestion tasks that create the different tables based on a denormalized model that keeps only the attributes needed to answer the SSB queries. These ingestion tasks apply different segment granularities, query granularities and hashed partitions;
- Querying: it is subdivided into two subfolders, named "Druid" and "Hive". The "Druid" subfolder contains scripts that run all 13 SSB queries (single user, multiuser, with cache and without cache), four times each, and record the processing time. The "Hive" subfolder is similar, except that the scripts are modified to run in Hive. Note that these queries were adapted from the original SSB queries so that they query the denormalized tables.
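
As a rough illustration of what the querying scripts do (run each SSB query four times and record the processing time), the sketch below shows one possible timing loop in Python. It is not the code shipped in this repository: the function names `time_queries` and `save_timings`, the `run_query` callable and the CSV layout are all hypothetical, and the actual scripts may be written differently. Only the 13 SSB query identifiers and the four runs per query come from the description above.

```python
import csv
import time
from typing import Callable, Dict, List

# The 13 queries of the Star Schema Benchmark, grouped in four query flights.
SSB_QUERY_NAMES = [
    "Q1.1", "Q1.2", "Q1.3",
    "Q2.1", "Q2.2", "Q2.3",
    "Q3.1", "Q3.2", "Q3.3", "Q3.4",
    "Q4.1", "Q4.2", "Q4.3",
]

RUNS_PER_QUERY = 4  # each query is executed four times


def time_queries(
    queries: Dict[str, str],
    run_query: Callable[[str], object],
    runs: int = RUNS_PER_QUERY,
) -> List[dict]:
    """Run every query `runs` times and return one timing row per run.

    `run_query` is a hypothetical callable that sends the query text to the
    engine under test (Druid or Hive) and blocks until the result arrives.
    """
    rows = []
    for name, text in queries.items():
        for run in range(1, runs + 1):
            start = time.perf_counter()
            run_query(text)
            elapsed = time.perf_counter() - start
            rows.append({"query": name, "run": run, "seconds": elapsed})
    return rows


def save_timings(rows: List[dict], path: str) -> None:
    """Write the timing rows to a CSV file (assumed output format)."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["query", "run", "seconds"])
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Placeholder query texts and a stub client, for demonstration only.
    demo_queries = {name: f"-- text of {name}" for name in SSB_QUERY_NAMES}
    timings = time_queries(demo_queries, run_query=lambda text: None)
    save_timings(timings, "timings.csv")
```

Passing the client as a callable keeps the same loop usable for both engines: only `run_query` would change between the Druid and Hive variants.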