GitHub - mohammadzainabbas/database-system-architecture-project: Database System Architecture's Project - Selectivity & Join estimations for Range Types in PostgreSQL ✨

Database System Architecture Project 👨🏻‍💻

Range types are data types representing a range of values of some element type (called the range's subtype). For example, ranges of timestamp might be used to represent the ranges of time that a meeting room is reserved. In this case the data type is tsrange (short for “timestamp range”), and timestamp is the subtype. The subtype must have a total order so that it is well-defined whether element values are within, before, or after a range of values.

2. Project Overview

The aim of this project is to improve the overall scheme of statistics collection and cardinality estimation for range types in PostgreSQL.

So, project can be divided up into three parts:

Statistics
Selectivity Estimations
Join Estimations

Note: We will be working with PostgreSQL 13 stable version for the development purpose.

2.1. Statistics

In the current implementation, you have range_typeanalyze function which is called whenever you do vacuum analyze on some relation/table having a range type. You can find the function definition for range_typeanalyze here.

This function sets few configurations/settings and sets a handle for computing range type stats

stats->compute_stats = compute_range_stats;

Here, we are setting compute_range_stats function as a handle to be called later for computing stats.

When you will look into compute_range_stats function (see here), you will notice that three different kinds of histograms are being calculated.

Lower Bound Histogram
Upper Bound Histogram
Length based Histogram

And we store all these into the pg_statistics for later usage.

Note: These stats are usually calculated once per million or so inserts. Or when you do vacuum analyze statement.

The first goal of our project is to determine which statistics should we calculate during the analysis phase. Do we need something extra for doing proper/better estimations for selectivity and joins for range type.

2.2. Selectivity Estimations

The second goal of our project is to implement selectivity estimation functions for the overlaps && and the strictly left of << predicates/operators.

2.3. Join Estimations

The final and main goal of our project is to implement join cardinality estimation for the overlaps && predicate/operator for the range type.

3. Debugging Guide

Please refer to debugging guide for more details.

4. Benchmark Guide

To run the benchmarks on different range_type, follow the below mentioned steps:

Clone this repo

git clone https://github.com/mohammadzainabbas/database-system-architecture-project.git
cd database-system-architecture-project

Run the benchmark script

sh scripts/run_benchmark.sh -d test

Note: Replace test with the name of your database

Note: If you see an error Binary 'psql' not found', run the following command and re-try:

echo 'export PATH="/usr/local/pgsql/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc

Name		Name	Last commit message	Last commit date
Latest commit History 62 Commits
benchmarking_queries		benchmarking_queries
docs		docs
scripts		scripts
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Database System Architecture Project 👨🏻‍💻

Table of contents

1. Introduction

2. Project Overview

2.1. Statistics

2.2. Selectivity Estimations

2.3. Join Estimations

3. Debugging Guide

4. Benchmark Guide

About

Uh oh!

Releases

Packages

Contributors 3

Uh oh!

Languages

License

mohammadzainabbas/database-system-architecture-project

Folders and files

Latest commit

History

Repository files navigation

Database System Architecture Project 👨🏻‍💻

Table of contents

1. Introduction

2. Project Overview

2.1. Statistics

2.2. Selectivity Estimations

2.3. Join Estimations

3. Debugging Guide

4. Benchmark Guide

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Uh oh!

Languages

Packages