These are course files (instructional materials) for both the BBT3104: Advanced Database Systems and the MIT8107: Advanced Database Systems courses.
Key | Value |
---|---|
Course Code | BBT3104 and MIT8107 |
Course Name | Advanced Database Systems (same name for both courses) |
URL | https://elearning.strathmore.edu/course/view.php?id=3932 and https://elearning.strathmore.edu/course/view.php?id=3277 respectively |
Semester Duration | To be confirmed |
Lecturer | Allan Omondi |
Contact | aomondi_at_strathmore_dot_edu |
IMDb captures more than 2.5 million movie titles produced over 133 years by 234,997 different companies with over 4 million actors. It subsequently contains many join-crossing correlations thus making it challenging for cardinality estimators. This is unlike TPC-H and TPC-DS which are considered trivial for cardinality estimators.
-
Recommended: The version of IMDb used in Leis et al. (2018) can be downloaded from here: http://homepages.cwi.nl/~boncz/job/imdb.tgz. This version was created in May 2013.
-
The current version of IMDb can be downloaded from here: http://www.imdb.com/interfaces
-
Create the PostgreSQL Docker container using Docker-Compose.yaml
-
Create the IMDb database in Postgres using schema.sql
-
Download the
.tgz
file, decompress it, and import each.csv
file into its respective table. You can use DBeaver (recommended) or any other similar database tool when importing. A slightly similar video tutorial is available here on YouTube.
Alternatively, you can use cinemagoer, formerly IMDbPy, to automate this step.
- Proceed to execute the Join Order Benchmark (JOB) queries available here on the IMDb database in PostgreSQL.
- Execute the q-error computation code to quantitatively determine the error between the actual cardinality and the estimated cardinality per node for each query here.
This code repository has been forked from join-order-benchmark by Greg Rahn, Moritz Eyssen, maahl, and Max Halford.
Additional code has been sourced from learnedcardinalities by Andreas Kipf
Kipf, A., Kipf, T., Radke, B., Leis, V., Boncz, P., & Kemper, A. (2019). Learned cardinalities: Estimating correlated joins with deep learning. 9th Biennial Conference on Innovative Data Systems Research (CIDR ‘19), CA, USA. https://arxiv.org/abs/1809.00677
Leis, V., Radke, B., Gubichev, A., Mirchev, A., Boncz, P., Kemper, A., & Neumann, T. (2018). Query optimization through the looking glass, and what we found running the Join Order Benchmark. The VLDB Journal, 27(5), 643–668. https://doi.org/10.1007/s00778-017-0480-7