This package provides the code for the LMFAO engine. Please refer to our SIGMOD 2019 paper for details on the engine.
In order to compile the LMFAO project, you need to have cmake and boost installed.
Then, you need to run the following commands:
cmake .
make
In order to run LMFAO, you need to provide the following configuration files. The data/ directory provides examples for these configuration files.
  treedecomposition.conf defines the tree decomposition used to compute the aggregates in LMFAO.
  features.conf enumerates the features of the desired machine learning model and defines the type of each feature (continuous or categorical).
By default, LMFAO assumes that the configuration files are provided in the directory that contains the data, but you can specify a different directory with the --files (-f) flag in the multifaq command (see below).
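Before generating code, it can be useful to check that both configuration files are in place. The following is a minimal sketch: the helper name check_lmfao_conf is hypothetical and not part of LMFAO, and it only tests that the two files named above exist, not that their contents are valid.

```shell
# Hypothetical helper (not part of LMFAO): report which of the two
# configuration files are missing from a directory. Pass the dataset
# directory, or the directory you intend to give to --files.
check_lmfao_conf() {
    conf_dir="$1"
    status=0
    for f in treedecomposition.conf features.conf; do
        if [ ! -f "$conf_dir/$f" ]; then
            echo "missing: $conf_dir/$f" >&2
            status=1
        fi
    done
    # 0 if both files exist, 1 otherwise
    return "$status"
}
```

For example, check_lmfao_conf data/favorita-small should succeed if the example dataset keeps both files in its own directory.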
LMFAO is run in two stages. In the first stage, LMFAO generates C++ code for a specific application and dataset. In the second stage, you compile and run the generated code. The commands to run LMFAO end-to-end are as follows.
First, run the following command to generate the C++ code:
./multifaq --path PATH_TO_DATASET --model DESIRED_MODEL
where
  --path denotes the path to the multi-relational dataset
  --model specifies the model to be computed; the current options include:
    covar computes the covariance matrix
    reg computes a linear regression model
    kmeans computes the k-means clusters with the Rk-means algorithm (see the AISTATS 2020 paper for details)
    mi computes all pairwise mutual information
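The invocations described above can be sketched as a small helper that prints the multifaq command for a given dataset and model. The function name multifaq_cmd is hypothetical, and the set of accepted models is assumed to be exactly the four options listed above; ./multifaq -h remains the authoritative reference.

```shell
# Hypothetical helper (not part of LMFAO): print the multifaq command
# for one dataset and model, rejecting models that are not listed above.
multifaq_cmd() {
    dataset="$1"
    model="$2"
    case "$model" in
        covar|reg|kmeans|mi) ;;
        *)
            echo "unknown model: $model" >&2
            return 1
            ;;
    esac
    printf './multifaq --path %s --model %s\n' "$dataset" "$model"
}

# Print one command per listed model for the example dataset.
for m in covar reg kmeans mi; do
    multifaq_cmd data/favorita-small "$m"
done
```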
Please refer to ./multifaq -h for additional information on the command line options.
By default, this command generates C++ code in the runtime/cpp/ directory. You can change the output directory with the --out (-o) flag.
This command also generates the compiler-data.out file, which reports the number of aggregates, queries, views, and groups that are required for the given application.
Then, run the following commands to execute the generated code:
cd runtime/cpp/
make -j
./lmfao
If you want to inspect the views that are computed by LMFAO, you need to compile the generated code with make -j dump, which will write the views to the output/ directory. Make sure that this directory exists before you run LMFAO.
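Since the output directory has to exist beforehand, it can be created just before building with the dump target. This sketch assumes that the output directory is resolved relative to runtime/cpp/, the directory from which the generated program is run; the text above does not state this, so adjust the path if your setup differs.

```shell
# Assumption: the output directory lives inside runtime/cpp/,
# which is where ./lmfao is executed.
gen_dir="runtime/cpp"

# -p creates missing parent directories and does not fail
# if the directory already exists.
mkdir -p "$gen_dir/output"
echo "created $gen_dir/output/"
```

After that, cd runtime/cpp/, make -j dump, and ./lmfao proceed as described above.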
We provide a fragment of the Kaggle Favorita dataset as an example. To compute the covariance matrix for this dataset, you need to run the following commands:
./multifaq --path data/favorita-small --model covar
cd runtime/cpp/
make -j
./lmfao
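The two stages can also be wrapped in a single shell function. This is a sketch rather than part of LMFAO: the names run and lmfao_end_to_end and the DRY_RUN variable are hypothetical, and the function assumes it is called from the LMFAO root directory with the default output directory runtime/cpp/.

```shell
# Print a command, then execute it unless DRY_RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# Usage: lmfao_end_to_end PATH_TO_DATASET DESIRED_MODEL
# The body runs in a subshell, so the cd does not affect the caller.
# Each step only runs if the previous one succeeded.
lmfao_end_to_end() (
    dataset="$1"
    model="$2"
    run ./multifaq --path "$dataset" --model "$model" &&
    run cd runtime/cpp/ &&
    run make -j &&
    run ./lmfao
)
```

With DRY_RUN=1 set, calling lmfao_end_to_end data/favorita-small covar only prints the four commands from the example above, each prefixed with +, without executing them.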