
Optimizing database queries with array programming


tomzhang/HorsePower

 
 


HorsePower

HorsePower is a framework for optimizing database queries on modern hardware. At its core is HorseIR, an array-based intermediate representation (IR) for database queries. Because queries are expressed in HorseIR, compiler optimizations can be applied to database operations. In addition, array programming exposes fine-grained parallelism, which offers a promising route to performance speedups.
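To make the idea of an array-based representation of a query concrete, here is a minimal sketch in plain Python. It is not HorseIR syntax, and the primitive names (`lt`, `compress`, `mul`, `total`) as well as the toy table are assumptions made up for illustration. Each primitive consumes and produces whole columns, which is the property that makes element-wise work easy to parallelize.

```python
# Hypothetical array primitives; the names are illustrative, not HorseIR built-ins.

def lt(column, bound):
    """Element-wise comparison: returns a boolean mask."""
    return [value < bound for value in column]

def compress(mask, column):
    """Keep the elements of column whose mask entry is True."""
    return [value for keep, value in zip(mask, column) if keep]

def mul(left, right):
    """Element-wise product of two columns of equal length."""
    return [a * b for a, b in zip(left, right)]

def total(column):
    """Reduce a column to a single sum."""
    return sum(column)

# Toy table, stored column by column (made-up data).
quantity = [10, 30, 5, 24]
price = [100.0, 200.0, 50.0, 80.0]
discount = [0.5, 0.25, 0.5, 0.25]

# SELECT SUM(price * discount) FROM t WHERE quantity < 24
mask = lt(quantity, 24)
revenue = total(mul(compress(mask, price), compress(mask, discount)))
print(revenue)
```

Every step except the final reduction is independent per element, so a parallel back end could split each column into chunks without changing the result.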

Project Overview

Figure 1. The workflow of the HorsePower framework.

We started this project from scratch in summer 2017. Figure 1 shows the workflow of the HorsePower framework. One candidate source language is our Horse language, an extension of standard SQL designed for data analytics. At the current stage, we accept execution plans from standard SQL database queries as well as MATLAB code. A front end parses the source code and transforms it to HorseIR. Static analyses and code optimizations are then performed on HorseIR before target code is generated for one of the supported back ends. Alternatively, an interpreter can run programs directly.
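The interpreter path in this workflow can be pictured with a toy example. The sketch below assumes a made-up mini-IR in which each statement is a `(target, primitive, arguments)` tuple; the statement format, the primitive table, and the `interpret` function are all hypothetical and are not taken from the HorsePower sources.

```python
# Hypothetical mini-IR interpreter; not the HorsePower implementation.

PRIMITIVES = {
    "lt": lambda xs, bound: [x < bound for x in xs],
    "compress": lambda mask, xs: [x for keep, x in zip(mask, xs) if keep],
    "sum": lambda xs: sum(xs),
}

def interpret(program, env):
    """Run statements in order; string arguments name variables, others are literals."""
    env = dict(env)  # work on a copy so the caller's input columns stay untouched
    for target, primitive, args in program:
        values = [env[a] if isinstance(a, str) else a for a in args]
        env[target] = PRIMITIVES[primitive](*values)
    return env

# SELECT SUM(price) FROM t WHERE quantity < 24, written as three array statements.
program = [
    ("t0", "lt", ["quantity", 24]),
    ("t1", "compress", ["t0", "price"]),
    ("result", "sum", ["t1"]),
]

env = interpret(program, {"quantity": [10, 30, 5], "price": [100, 200, 50]})
print(env["result"])
```

A compiled back end would translate the same statement list into target code instead of dispatching on each statement at run time.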

In HorsePower, we focus on the following parts.

- Design and implementation of array-based intermediate representation (IR)
- Static analysis for an array-based IR (i.e. HorseIR)
- Query optimizations with compiler optimizations
- Fine-grained primitive functions and highly tuned libraries
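As an illustration of what a compiler optimization on an array program can buy, the sketch below contrasts a primitive-by-primitive evaluation with a fused loop. Fusion is used here only as a well-known example; this section does not state which optimizations HorsePower applies, and the function names and data are made up.

```python
def unfused(quantity, price, discount, bound):
    """One pass and one temporary array per primitive."""
    mask = [q < bound for q in quantity]
    kept_price = [p for keep, p in zip(mask, price) if keep]
    kept_discount = [d for keep, d in zip(mask, discount) if keep]
    products = [p * d for p, d in zip(kept_price, kept_discount)]
    return sum(products)

def fused(quantity, price, discount, bound):
    """Same computation in a single pass with no temporary arrays."""
    acc = 0.0
    for q, p, d in zip(quantity, price, discount):
        if q < bound:
            acc += p * d
    return acc

# Made-up columns for SELECT SUM(price * discount) FROM t WHERE quantity < 24
quantity = [10, 30, 5, 24]
price = [100.0, 200.0, 50.0, 80.0]
discount = [0.5, 0.25, 0.5, 0.25]

print(unfused(quantity, price, discount, 24))
print(fused(quantity, price, discount, 24))
```

The unfused version materializes four intermediate arrays, while the fused version keeps only a running sum, which reduces memory traffic on large columns.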

Installation

Download the repository

git clone git@github.com:Sable/HorsePower.git

Set up the environment variable HORSE_BASE

cd HorsePower && export HORSE_BASE=$PWD

Install the libraries with the following command (takes about 13 minutes)

(cd libs && sh deploy_linux.sh)

After installation, the following new folders are created.

- libs/include
- libs/lib
- libs/pcre2
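To confirm that the deployment script produced the folders listed above, a small check can help. The helper below is a hypothetical convenience script and is not part of the repository; it assumes only the three folder names given above and the HORSE_BASE variable set earlier.

```python
import os

# Folders that the installation step is expected to create (from the list above).
EXPECTED_DIRS = ["libs/include", "libs/lib", "libs/pcre2"]

def missing_install_dirs(base):
    """Return the expected folders that do not exist under base."""
    return [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(base, d))]

if __name__ == "__main__":
    base = os.environ.get("HORSE_BASE")
    if base is None:
        print("HORSE_BASE is not set")
    else:
        missing = missing_install_dirs(base)
        print("missing:", ", ".join(missing) if missing else "none")
```

Run it from the shell where HORSE_BASE was exported; an empty result means all three folders are present.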

Note: gcc 8.1.0 or higher is recommended, and the additional library uuid-dev may be required during installation.

Build and Run

Multiple versions are under development in src/horseir/. Each version has a script, run.sh, that builds an executable and runs it with the proper parameters. We recommend using the latest version, as this project is still under active development.

To learn how to run, type

(cd src/horseir/v3 && ./run.sh)      # show usage

A Brief Summary

| Name        | Notes                       |
|-------------|-----------------------------|
| Platform    | Cross-platform              |
| Tools       | C/C++, Flex & Bison         |
| Parallelism | OpenMP/Pthread/CUDA/OpenCL  |
| Conventions | docs/conventions            |

Quick Entries

- IR design
- Database TPC-H
- Implementation
- Publications

Copyright and License

Copyright © 2017-2020, Hanfeng Chen, Laurie Hendren and McGill University.

