TPC-DS benchmark kit with some modifications/fixes

tpcds-kit

The official TPC-DS tools can be found at tpc.org.

This version is based on v2.10.0 and has been modified to:

  • Allow compilation under macOS (commit 2ec45c5)
  • Address obvious query template bugs
  • Rename s_web_returns column wret_web_site_id to wret_web_page_id to match specification. See #22 & #42.

To see all modifications, diff the files in the master branch against the corresponding version branch, e.g. master vs v2.10.0.

Setup

Linux

Make sure the required development tools are installed:

Ubuntu:

sudo apt-get install gcc make flex bison byacc git

CentOS/RHEL:

sudo yum install gcc make flex bison byacc git

Then run the following commands to clone the repo and build the tools:

git clone https://github.com/gregrahn/tpcds-kit.git
cd tpcds-kit/tools
make OS=LINUX

macOS

Make sure the required development tools are installed:

xcode-select --install

Then run the following commands to clone the repo and build the tools:

git clone https://github.com/gregrahn/tpcds-kit.git
cd tpcds-kit/tools
make OS=MACOS
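
The only difference between the two build recipes is the value passed as OS=. As a rough sketch, a script that has to run on both platforms could pick that value from `uname -s`. The function name `make_os_for` is hypothetical and not part of the kit, and only the two values shown above (LINUX and MACOS) are handled.

```shell
# Map a kernel name (as printed by `uname -s`) to the OS= value used by make.
# make_os_for is a hypothetical helper, not something shipped with tpcds-kit.
make_os_for() {
  case "$1" in
    Linux)  echo LINUX ;;
    Darwin) echo MACOS ;;
    *)      echo "unsupported kernel: $1" >&2; return 1 ;;
  esac
}

# Print the build command for the current machine; run it from tpcds-kit/tools.
if os=$(make_os_for "$(uname -s)"); then
  echo "make OS=$os"
fi
```

On any other kernel the helper prints an error to stderr and returns a non-zero status, so nothing is built by accident.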

Using the TPC-DS tools

Data generation

Data generation is done via dsdgen. See dsdgen --help for all options. If you do not run dsdgen from the tools/ directory then you will need to use the option -DISTRIBUTIONS /.../tpcds-kit/tools/tpcds.idx.
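
As an illustration of running dsdgen from outside tools/, the sketch below assembles a command line that passes -DISTRIBUTIONS explicitly. It only prints the command (a dry run). The helper name `dsdgen_cmd`, the checkout location, and the output directory are assumptions, and the -SCALE and -DIR options should be confirmed against dsdgen --help for your build.

```shell
# Dry-run helper: prints a dsdgen command line instead of executing it.
# dsdgen_cmd and all paths used here are hypothetical.
dsdgen_cmd() {
  kit_dir=$1   # path to the tpcds-kit checkout
  scale=$2     # scale factor (volume of data, in GB)
  out_dir=$3   # directory that will receive the generated flat files
  printf '%s\n' "$kit_dir/tools/dsdgen -DISTRIBUTIONS $kit_dir/tools/tpcds.idx -SCALE $scale -DIR $out_dir"
}

# Example: scale factor 1, writing to /tmp/tpcds-data.
dsdgen_cmd "$HOME/tpcds-kit" 1 /tmp/tpcds-data
```

Review the printed line, then paste it into the shell once the output directory exists.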

Query generation

Query generation is done via dsqgen. See dsqgen --help for all options.

The following command generates all 99 queries in numerical order (-QUALIFY Y) for the 10 TB scale factor (-SCALE 10000) using the Netezza dialect template (-DIALECT netezza), writing the output to /tmp/query_0.sql (-OUTPUT_DIR /tmp).

dsqgen \
-DIRECTORY ../query_templates \
-INPUT ../query_templates/templates.lst \
-VERBOSE Y \
-QUALIFY Y \
-SCALE 10000 \
-DIALECT netezza \
-OUTPUT_DIR /tmp
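
A quick sanity check after generation is to count the queries in the output file. The sketch below assumes that dsqgen writes a comment line beginning with "-- start query" before each query. That marker format is an assumption, so inspect your own query_0.sql before relying on it. The function name `count_queries` is hypothetical.

```shell
# Count the queries in a dsqgen output file.
# ASSUMPTION: every query is preceded by a line starting with "-- start query".
count_queries() {
  grep -c '^-- start query' "$1"
}

# Only check the file if generation has actually been run.
if [ -f /tmp/query_0.sql ]; then
  count_queries /tmp/query_0.sql
fi
```

For the command above, a result other than 99 would suggest that either the marker assumption does not hold or some templates failed to generate.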