This package takes a data science grammar and provides code to explore the set of possible pipelines from the grammar. Features:
- Produces multiple executable LALE pipelines from the grammar, subject to optional user constraints. It does so using AI planning.
- Tunes hyperparameters and evaluates the generated pipelines.
- Can use measured pipeline accuracy to produce better pipelines in subsequent iterations.
The full details are in Katz, M., Ram, P., Sohrabi, S., & Udrea, O. (2020). Exploring Context-Free Languages via Planning: The Case for Automating Machine Learning. Proceedings of the International Conference on Automated Planning and Scheduling, 30(1), 403-411.
1. Install Singularity (needed for planutils)
- Singularity does not currently support macOS.
- Check out the Singularity Admin Guide for details.
## Install system dependencies in Debian/Ubuntu
$ sudo apt-get update && sudo apt-get install -y \
build-essential \
uuid-dev \
libgpgme-dev \
squashfs-tools \
libseccomp-dev \
wget \
pkg-config \
git \
cryptsetup-bin
## Install system dependencies in CentOS/Red Hat
$ sudo yum update -y && \
sudo yum groupinstall -y 'Development Tools' && \
sudo yum install -y \
openssl-devel \
libuuid-devel \
libseccomp-devel \
wget \
squashfs-tools \
cryptsetup
## Install Go
$ export VERSION=1.14.12 OS=linux ARCH=amd64 && \
wget https://dl.google.com/go/go$VERSION.$OS-$ARCH.tar.gz && \
sudo tar -C /usr/local -xzvf go$VERSION.$OS-$ARCH.tar.gz && \
rm go$VERSION.$OS-$ARCH.tar.gz
$ echo 'export GOPATH=${HOME}/go' >> ~/.bashrc && \
echo 'export PATH=/usr/local/go/bin:${PATH}:${GOPATH}/bin' >> ~/.bashrc && \
source ~/.bashrc
## Download and build singularity
$ wget https://github.com/hpcng/singularity/releases/download/v3.7.2/singularity-3.7.2.tar.gz
$ tar xvf singularity-3.7.2.tar.gz
$ cd singularity && ./mconfig && cd ./builddir && make && sudo make install && cd ../..
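Because Singularity does not run on macOS, it can help to check the platform before starting the build. The snippet below is a minimal sketch and not part of this package: the function name `check_singularity_host` is hypothetical, and it accepts the kernel name as an optional argument so that it can be tried out without switching machines.

```shell
# Hypothetical helper (not part of this package): reports whether the
# Singularity build steps above apply to a given kernel name.
check_singularity_host() {
    # Default to the kernel name of the current machine, as reported by uname.
    local kernel="${1:-$(uname -s)}"
    case "$kernel" in
        Linux)
            echo "ok: Linux host, the build steps above apply"
            ;;
        Darwin)
            # Darwin is the kernel name that uname reports on macOS.
            echo "unsupported: Singularity does not currently support macOS"
            return 1
            ;;
        *)
            echo "unknown: the steps above were written for Linux, found $kernel"
            return 1
            ;;
    esac
}
```

A possible use is `check_singularity_host && echo "continue with the build"`, which only prints the second message on a Linux host.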
2. Install the Python packages through pip.
3. Set up planutils and install the K* planner and the HTN-to-PDDL translator.
## Create a conda environment
$ conda create -n grammar2plans python=3.7
$ conda activate grammar2plans
## Install Python packages
$ pip install -r requirements.txt
## Set up planutils
$ planutils setup
$ export PATH=$PATH:~/.planutils/bin
$ planutils install kstar
$ planutils install hpddl2pddl
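After these steps, a quick check that the planning tools are reachable can save some debugging later. The following is a sketch under assumptions: `missing_tools` is a hypothetical helper, and it is assumed (not confirmed by this README) that planutils exposes `kstar` and `hpddl2pddl` as commands under ~/.planutils/bin.

```shell
# Hypothetical helper (not part of planutils): prints every command name
# from its arguments that cannot be found on the current PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The names below come from the install commands above; that they are
# available as commands on PATH is an assumption.
missing="$(missing_tools planutils kstar hpddl2pddl)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "not found on PATH:" $missing
else
    echo "all planning tools found on PATH"
fi
```

Note that the export line above only affects the current shell. If the tools are reported missing in a new terminal, the same export probably needs to be added to ~/.bashrc, as was done for Go.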
- Start Jupyter Notebook:
$ jupyter notebook
- Navigate to notebooks/DataSciencePipelinePlanningTutorial.
- Executing the cells creates intermediate planning and result files in the output directory.
- You can run an instance of VS Code with the PDDL language support plugin to see intermediate planning task files:
$ code output/
By default, the code uses the kstar planner that is part of the planutils package. You can, however, use a different planner by setting the PLANNER_URL environment variable to a service with a matching REST API.
- Download the IBM AI Planner Service or any other service with the same REST API.
- Run the service, either as a local Docker container or as part of a cloud service.
- Set PLANNER_URL to the planner you want to use. For instance, if you run the service in a local Docker container as per the service README and you would like to use kstar, you would set PLANNER_URL=http://localhost:4501/planners/topk/kstar-topk.
- To return to using the planutils version of kstar, unset PLANNER_URL.
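Switching between the two planners in a shell session can be sketched as follows. The helper `planner_backend` is hypothetical and only mirrors the rule described above; treating an empty PLANNER_URL the same as an unset one is an assumption, not something this README confirms.

```shell
# Hypothetical helper (not part of this package): reports which planner
# the rule described above would select for the current environment.
planner_backend() {
    if [ -n "${PLANNER_URL:-}" ]; then
        echo "remote service at $PLANNER_URL"
    else
        echo "planutils kstar"
    fi
}

# Point at a planner service running in a local Docker container
# (URL taken from the example above).
export PLANNER_URL=http://localhost:4501/planners/topk/kstar-topk
planner_backend   # prints: remote service at http://localhost:4501/planners/topk/kstar-topk

# Return to the default planutils version of kstar.
unset PLANNER_URL
planner_backend   # prints: planutils kstar
```

Because the variable is exported, it is visible to processes started from that shell afterwards, so it should be set before running `jupyter notebook`.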