Skip to content

microsoft/knossos-ksc

Repository files navigation

Knossos-KSC

Compile a lisp-like IR with automatic differentiation and user-defined rewrites.

This project is a compiler and code-gen tool that will accelerate writing AI algorithms as well as making them easier.
The core is a purely functional lisp-like IR that can be translated from high-level languages, and can be linked to a variety of backends to generate code.

Currently implemented frontends are

  • TorchScript: ts2k
  • Julia: j2ks
  • F#: f2k
  • ONNX: onnx2ks
  • KS-Lisp: The IR itself is exchanged in a lisp-like text format (see below).

Current backends:

  • CPU/C++: In src/ksc and src/runtime
  • GPU/Futhark: Also in src/ksc
  • MLIR: In mlir

Current transformers:

  • KSC: Various Autodiff and optimization transforms, in Haskell
  • RLO: A rewriting-based optimizer using reinforcement learning

Docs are built to https://microsoft.github.io/knossos-ksc/

KS-Lisp: A low-sugar IR

The text form of the IR is "low sugar", focused more on simplicity than user-friendliness. However it is human writable, and lispers in particular may like to play with it. There's a VS Code syntax highlighter extension in etc/ks-vscode.

The AST has just a few concepts: Lambda, Call, Let binding, Conditionals, Constants, Assert, Top-level function definitions, and Rewrite rules.

It's also strongly typed, using a bare set of types: Tensor, Tuple, and Values (Float, Integer, String).
However, unlike say MLIR or TorchScript, these are composable, so e.g. we can have a (Tensor 1 (Tuple Float (Tensor 2 Float))). Tensors are annotated with their "rank" or number of dimensions.

The IR is pure functional, so functions may be called more than once or not at all depending on the use of their outputs. Values are passed and returned "as if" by value, although the optimizer reserves the right implement this using refcounting/copy/move, as long as the same computation is performed.

The lisp-like IR is extremely simple -- all the language builtins are in this code:

;; Externally defined function "atan2", which returns a Float, and takes two Floats
(edef atan2 Float (Float Float)) 

#| Block comments
 -- User-defined function myfun 
 takes an Integer and Tensor of (Float Float) pairs
 and returns a pair of String and Float
|#
(def myfun                                       ; function name
  (Tuple String Float)                           ; return type
  ((i : Integer)                                 ; argument 1: int
   (v : Tensor 3 (Tuple Float Float)))           ; argument 2: 3D tensor of tuple
  (assert (gt i 0)                               ; (assert TEST BODY)
     (if (eq i 0)                                ; (if TEST TEXPR FEXPR)
        ; "then" br
        (tuple "fred"                            ; tuple constructor
           (let (tmp (index 0 v))                ; (let (VAR VAL) BODY)
             (mul (get$1$2 tmp) 2.0)) )          ; no builtins -- even mul is a function
        ; "else" br
        (let ((t1 (index 0 v))                   ; (let ((VAR1 VAL1) ... (VARn VALn)) BODY)
              (t2 (myfun (sub i 1) v)))          ; Recursive call to myfun
          t2))))

;; Rewrite rule
(rule "mul.commute"                     ; rule name
    ((a : Float) (b : Float))           ; "template_vars": free variables in the template
    (mul a b)                           ; "template": a pattern to match
    (mul b a))                          ; replacement
; Rules are *not* applied exhaustively, that's very intentionally left up to the compiler.  
; For example the rule above, if matched exhaustively would not terminate.

;; And compilation produces f and various derivatives, as if
(edef rev$myfun ; classic reverse mode.  If f :: S -> T, then rev$f :: (S, dT) -> dS
    (Tuple (Tuple) (Tensor 1 (Tuple Float Float))) ; dS is tangent-type of inputs (dInteger = void)
    ((#|s  |# (Tuple Integer (Tensor 1 (Tuple Float Float)))) ; inputs in a single tuple
     (#|dt |# (Tuple (Tuple) Float)))) ; dT is tangent type of returns

(edef D$myfun  ; linear map as per Elliot.  If f :: S->T, D$f :: S -> (LM dS dT) where LM is linear map
    (LM
      (Tuple (Tuple) (Tensor 1 (Tuple Float Float))) ; dS
      (Tuple (Tuple) Float) ; dT
    )
    (Integer (Tensor 1 (Tuple Float Float)))) ; inputs as normal (not single-tupled)

See the ksc syntax primer for a longer introduction to the ks language. The ksc test directory provides more examples of the constructs available when writing .ks files.

INSTALLATION/BUILDING

Installation

See ksc or mlir

Running the ksc executable

Running

Run the ksc executable as follows to differentiate, compile and run a .ks program. This example runs hello-world.ks.

./build/bin/ksc --compile-and-run \
  --ks-source-file src/runtime/prelude.ks \
  --ks-source-file test/ksc/hello-world.ks \
  --ks-output-file obj/test/ksc/hello-world.kso \
  --cpp-include prelude.h \
  --cpp-output-file obj/test/ksc/hello-world.cpp \
  --c++ g++ \
  --exe-output-file obj/test/ksc/hello-world.exe

or with PowerShell syntax:

./build/bin/ksc --compile-and-run `
  --ks-source-file src/runtime/prelude.ks `
  --ks-source-file test/ksc/hello-world.ks `
  --ks-output-file obj/test/ksc/hello-world.kso `
  --cpp-include prelude.h `
  --cpp-output-file obj/test/ksc/hello-world.cpp `
  --c++ g++ `
  --exe-output-file obj/test/ksc/hello-world.exe

Which should produce output like this:

read decls
Linted Typechecked defs
Linted Grad
Linted Grad tupled
Linted Optgrad
Linted Optgrad tupled
Linted Diffs
Linted OptDiffs
Linted CSE
Writing to obj/test/ksc/hello-world.kso
Writing to obj/test/ksc/hello-world.cpp
Compiling: g++ -fmax-errors=5 -fdiagnostics-color=always -Wall -Wno-unused -Wno-maybe-uninitialized -Isrc/runtime -O3 -g -std=c++17 -o obj/test/ksc/hello-world.exe obj/test/ksc/hello-world.cpp
Running

Hello world!
If you are seeing this output then knossos-ksc has successfully compiled and run the hello-world.ks program!

PyTorch frontend

See doc/sphinx

Tests

To run the ksc self-tests use the command line

./build/bin/ksc --test --fs-test out.fs

(Don't worry if the final test, of out.fs, fails. It is a test for F#-to-ks, which most users will not have set up.)

Generating a .kso file from a .ks file without differentiating

To generate a .kso file from a .ks file without differentiating, i.e. to type check and apply ksc's heuristic optimisations, use the command line

./build/bin/ksc --generate-cpp \
  --ks-source-file src/runtime/prelude.ks \
  --ks-source-file input.ks \
  --ks-output-file output.ks \
  --cpp-include prelude.h \
  --cpp-output-file output.cpp

KSC Internals

Syntax of .ks files

In the compiler, the IR is defined in Lang.hs. The syntax is defined by the parser in Parse.hs and the pretty-printer in Lang.hs. testRoundTrip in Main.hs checks that they agree.

Continuous integration

knossos-ksc has a continuous integration build set up on Azure DevOps. To manually run a CI build click "Run pipeline", type the name of your branch under "Branch/tag", and click "Run".

Code of Conduct

Collaboration on this project is subject to the Microsoft Open Source Code of Conduct.