compiler for fortran stencils using verified lifting,
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
backend
doc
frontend
tests/integration
website
.gitignore
LICENSE
README.md

README.md

STNG

Compiler for fortran stencils using verified lifting

This repo contains the code for the STNG compiler.

Note: This is still under active development. If you encounter bugs, like to suggest features, etc, please sign up on our mailing list or open a github issue.

Description

STNG consists of the following components:

  • a frontend parser that parses the input Fortran code, identifies potential stencils, and generates input for the synthesizer.
  • a synthesizer that attempts to lift each of the identified stencil to a high-level summary that can be re-targeted to execute on GPUs using Halide.
  • a backend that generates Halide code after a summary has been found.

The following describes how to use the various components in the compiler.

Usage Instructions

Frontend

  1. Clone this repo and run make in frontend. Building the frontend on your own machine requires installing Rose. If you don't want to do that, an easier way is to build the docker image file with:
$ cd frontend
$ docker build --rm -t stng_frontend .

which builds the frontend. After that, starts a container with:

$ docker run -ti stng_frontend /bin/bash 

The frontend executable is /home/stng/stng/frontend/bin/translator.

  1. Run the frontend with the input Fortran file. This will generate a number of output files in the output directory as specified by the -out flag.

If you are using docker then after building the image you can run the frontend with:

$ docker run --rm -v `pwd`:`pwd` -w `pwd` -t stng_frontend translator <input filename> -out <output directory>

This maps the current directory on the host (PWD) to the same directory in the docker container, allowing you to pass files directly from the host's current directory to the container. The outputs are generated in <output directory>.

Backend

  1. Build the backend docker image:
$ cd backend/docker
$ docker build --rm -t stng_backend .
  1. The backend needs to be run from within the stng/backend directory, or, optionally, you can run from another directory while setting the PYTHONPATH variable appropriately. The main executable for the backend is backend/stng-backend.py, which can be run with:
$ cd backend
$ docker run --rm -v `pwd`:`pwd` -w `pwd` -e PYTHONPATH=`pwd` stng_backend ./stng-backend.py --help

It supports three major options: --generate-sketch generates a Sketch from an IR file generated from the frontend. --generate-z3 generates a file to verify in Z3. Finally, --generate-backend-code generates either serial C++ or Halide code for a given stencil, based on the Sketch output. Each has some suboptions described by in the help text above.

An End-to-End Example

Frontend

We will use this stencil as an example (simple.f90 in frontend/tests/simple.f90).

SUBROUTINE simple(j, x_min, x_max)
INTEGER :: j
INTEGER :: x_max
INTEGER :: x_min
REAL(kind=8), DIMENSION(x_min:x_max) :: in, out

DO j = x_min, x_max
  out(j) = in(j)
END DO

END SUBROUTINE

Note that this benchmark consists of one stencil (namely the DO...END DO loop over the array in). The job of the frontend is to identify this stencil and translates the stencil into a simpler intermediate representation (IR) that the synthesizer can use to search for a high-level summary later on.

Running frontend/bin/translator simple.f90 -out simple_out generates the following files in the simple_out directory:

simple_loop0.ir
simple_loop0.f90
simple.f90
simple_loop0_halideglue.cpp
simple_loop0_halideglue.h
simple_loop0_benchmark.f90

In general each identified stencil in file foo is labeled as foo_loop0, foo_loop1, etc.

Here's what each of the files contains:

  • simple_loop0.ir: an internal IR representation of the stencil for the synthesizer and backend to process.
  • simple_loop0.f90: the original stencil code, extracted as a function on its own.
  • simple.f90: a transformed version of the input file where the stencil is replaced by a call to Halide (simple_loop0_halideglue), or the original version if the translation failed (simple_loop0 as defined in simple_loop0.f90).
  • simple_loop0_halideglue.{cpp, h}: glue code to call the Halide runtime. The Halide implementation of the stencil (simple_loop0_halide in simple_loop0_halideglue.cpp) will be generated by the synthesizer and the backend.
  • simple_loop0_benchmark.f90: code for doing timing measurements. This is for running the integration tests and can be ignored otherwise.

Finding summaries using sketch and z3

Given the files generated by the frontend, we will generate a Sketch file that searches for summaries of this stencil, and then we'll verify that the found summary is correct using Z3.

To generate the Sketch file, run:

docker run --rm -v `pwd`:`pwd` -w `pwd` -e PYTHONPATH=`pwd` stng_backend ./stng-backend.py \
  --generate-sketch --sketch-level 11 simple_loop0.ir simple_loop0.sk

In general, different generated Sketch files require different parameters to Sketch in order to resolve. We'll use a set of parameters that works for most generated stencil sketches:

docker run --rm -v `pwd`:`pwd` -w `pwd` -e PYTHONPATH=`pwd` stng_backend sketch \
  -V11 --fe-cegis-path  /sketch/sketch-backend/src/SketchSolver/cegis --beopt:simplifycex NOSIM \
  --fe-fpencoding AS_FFIELD --bnd-arr-size 400 --bnd-arr1d-size 400 --bnd-inbits 2 --bnd-cbits 2 \
  --bnd-unroll-amnt 18 --bnd-inline-amnt 15 --debug-cex simple_loop0.sk \
  > simple_loop0.sk.out

We can observe this running by looking at simple_loop0.sk.out. After a few seconds, it should return, with the end of the file showing that the Sketch problem was correctly solved.

Now, given the candidate postconditions and invariants in the Sketch output, we generate a Z3 file to check if the candidates are correct. To generate the Z3 file:

docker run --rm -v `pwd`:`pwd` -w `pwd` -e PYTHONPATH=`pwd` stng_backend ./stng-backend.py \
  --generate-z3 --sketch-output-src simple_loop0.sk.out simple_loop0.ir simple_loop0.z3

and to check with Z3:

docker run --rm -v `pwd`:`pwd` -w `pwd` -e PYTHONPATH=`pwd` stng_backend \
  z3 -smt2 simple_loop0.z3

Z3 should respond that the problem is UNSAT (i.e. no counterexample exists).

Halide code generation

STNG supports two backends for code generation: we can generate Halide or serial C++ code.

To generate Halide code, we use the --backend-halide option, and for C++, the --backend-cpp option:

docker run --rm -v `pwd`:`pwd` -w `pwd` -e PYTHONPATH=`pwd` stng_backend ./stng-backend.py \
  --generate-backend-code --backend-halide --sketch-output-src simple_loop0.sk.out \
  simple_loop0.ir simple_loop0_halide.cpp

Test Cases

  • frontend/tests includes a number of small benchmark cases that you can play around with.

  • tests/integration includes three benchmarks (Cloverleaf, stencilmark, NAS MG) that are used in our PLDI paper. To run the frontend on those benchmarks, do the following:

    • CloverLeaf: Edit STENCIL_COMPILER in CloverLeaf_Serial/Makefile, then make clover_leaf_stencil in CloverLeaf_Serial
    • Stencilmark: Edit STENCIL_COMPILER in Makefile, then make stencil in stencilmark_fortran
    • MG: first read README_STNG and modify MG/Makefile, then run make MG_stencil in NPB3.2-SER. Note that you need a fortran compiler installed on your system to build the prerequisite files.

Doing so will generate a stencil directory containing the outputs from running the frontend on these benchmarks. Do not run the generated binaries generated by the frontend! Those were artifacts from the original implementations and are not the STNG compiled versions.

Known Issues

  • The current backend is a partial rewrite and is more limited than the backend in the paper. We plan on reintroducing features such as imperfect loop nests that are more general than the original implementation.