High-performance stateful serverless runtime based on WebAssembly

Faasm

Faasm is a high-performance stateful serverless runtime. The goal of the project is enabling fast, efficient serverless big data.

Faasm provides multi-tenant isolation, but also lets functions share regions of memory. These shared memory regions give low-latency concurrent access to data, supporting high-performance distributed serverless applications.

By running WebAssembly, Faasm combines software fault isolation with standard OS tooling to provide security and resource isolation guarantees at low cost. Functions run side-by-side as threads of a single runtime process, with low overheads and fast boot times. The underlying WebAssembly execution and code generation is handled by WAVM, an excellent server-side WebAssembly VM.

Faasm supports C/C++ natively and extends support to dynamic languages such as Python by compiling the language runtime itself to WebAssembly. The Python support is based heavily on the work of the Pyodide project, with custom C-extensions and decorators in Pyfaasm.

Faasm uses a custom host interface to give functions access to state and interact with the runtime. Larger applications can be constructed by composing multiple functions together dynamically in chains. The Faasm scheduler ensures these functions execute close to their required data, reducing unnecessary duplication and overhead.

Faasm is a runtime, intended for integration into other serverless platforms. The primary integration is with Knative.

Quick start

You can start a simple Faasm runtime using the docker-compose.yml file in the root of the project. This creates a couple of worker instances as well as an upload endpoint for receiving function data and state. There is also a Redis container used for communication between the workers.

You can start it by running:

# Single worker
docker-compose up

# Three workers
docker-compose up --scale worker=3

Note that the first time you run the local set-up it will generate some machine code specific to your host. This is stored in the machine-code directory in the root of the project and reused on subsequent runs.

Compiling a C++ function

C++ functions are built with CMake and held in the func directory. demo/hello.cpp is a simple hello world function.

The Faasm toolchain is packaged in the faasm/toolchain container and can be run with the bin/toolchain.sh script, i.e.:

./bin/toolchain.sh

This container has all the tooling ready to use. To compile and upload the hello function you can run:

inv compile demo hello
inv upload demo hello

You can invoke the function as follows:

inv invoke demo hello

You should then see the response Hello faasm!.

Running a Python function

An example Python function is found at func/python/hello.py. This can be uploaded with:

inv upload --py python hello

And invoked with:

inv invoke --py python hello

This should give a message and the version of Python being run.

Writing functions

Faasm aims to be non-invasive, allowing the same code to run both natively and in a serverless context. This is important for local development and testing as well as for porting existing applications.

C++

In C++, functions make use of the Faasm macros. These macros mean the code can be compiled with a standard toolchain and run natively, but when compiled with the Faasm toolchain, it will run in a serverless context.

#include "faasm/faasm.h"

FAASM_MAIN_FUNC() {
    // Do something

    return 0;
}

Some example functions can be found in the func/demo directory.
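One way macros like these can work is through conditional compilation. The sketch below is an illustration of the idea only: `SKETCH_MAIN_FUNC`, `SKETCH_SERVERLESS_BUILD`, `sketchExec` and `runNatively` are hypothetical names invented for this example, not the definitions in faasm/faasm.h.

```cpp
#include <cassert>

// Hypothetical macro, NOT the definition from faasm/faasm.h.
// SKETCH_SERVERLESS_BUILD is an assumed flag that a serverless toolchain might set.
#ifdef SKETCH_SERVERLESS_BUILD
// Serverless build: give the entry point C linkage so a runtime can look it up by name
#define SKETCH_MAIN_FUNC() extern "C" int sketchExec()
#else
// Native build: the same body becomes an ordinary function
#define SKETCH_MAIN_FUNC() int sketchExec()
#endif

// User code is written once against the macro
SKETCH_MAIN_FUNC() {
    // Do something
    return 0;
}

// Native wrapper: what a generated main() could delegate to
int runNatively() {
    int returnCode = sketchExec();
    assert(returnCode == 0 && "sketch function reported failure");
    return returnCode;
}
```

In a native build a `main()` would simply return `runNatively()`; in the assumed serverless build the runtime would call `sketchExec` directly.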

C++ API

Faasm allows users' functions to interact with the underlying system to accomplish a number of things, e.g. accessing input and output, interacting with distributed state, and invoking other functions.

Some of the methods include:

  • faasmGetInput() - allows functions to retrieve their input data
  • faasmSetOutput() - allows functions to return output data to the caller
  • faasmChainFunction() - allows one function to invoke others
  • faasmAwaitCall() - waits for a chained function invocation to finish
  • faasmReadState() and faasmWriteState() - allow functions to read/write key/value state
  • faasmReadStateOffset() and faasmWriteStateOffset() - allow functions to read/write at specific points in existing state (e.g. updating a subsection of an array)
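As a rough illustration of the input and output calls listed above, the sketch below reads a function's input, upper-cases it and sets it as the output. `hostGetInputSize`, `hostGetInput`, `hostSetOutput` and `runWithInput` are hypothetical in-process stand-ins defined here so the example compiles without the Faasm toolchain. They are not the signatures from faasm/faasm.h, which should be checked in the header itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for the host interface (NOT the real faasm.h API).
// In a real function the runtime supplies the input and collects the output.
static std::vector<uint8_t> g_input;
static std::vector<uint8_t> g_output;

long hostGetInputSize() {
    return static_cast<long>(g_input.size());
}

void hostGetInput(uint8_t *buffer, long bufferLen) {
    assert(bufferLen >= 0 && static_cast<std::size_t>(bufferLen) <= g_input.size());
    std::copy(g_input.begin(), g_input.begin() + bufferLen, buffer);
}

void hostSetOutput(const uint8_t *data, long dataLen) {
    g_output.assign(data, data + dataLen);
}

// Body of an illustrative function: read input, upper-case it, set output
int upperCaseFunction() {
    long inputSize = hostGetInputSize();
    std::vector<uint8_t> buffer(inputSize);
    hostGetInput(buffer.data(), inputSize);

    for (auto &b : buffer) {
        b = static_cast<uint8_t>(std::toupper(b));
    }

    hostSetOutput(buffer.data(), static_cast<long>(buffer.size()));
    return 0;
}

// Helper for trying the sketch natively: feed an input, return the output
std::string runWithInput(const std::string &input) {
    g_input.assign(input.begin(), input.end());
    g_output.clear();
    upperCaseFunction();
    return std::string(g_output.begin(), g_output.end());
}
```

Calling `runWithInput("hello faasm")` exercises the whole read, transform and write path without any runtime present.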

Python API

Python functions interact with the Faasm API via pyfaasm, a module with a C-extension providing the relevant bindings to the C++ API.

Chaining

"Chaining" is when one function makes a call to another function (which must be owned by the same user). There are two supported methods of chaining, one for invoking totally separate Faasm functions, the other for automatically parallelising functions in the same piece of code (useful for porting legacy applications).

Chaining a function

Multiple functions can be defined in the same file, invoke each other and await results. For example:

#include "faasm/faasm.h"
#include <vector>

// Define some function to be chained
FAASM_FUNC(funcOne, 1) {
    return 0;
}

// Define another function to be chained
FAASM_FUNC(funcTwo, 2) {
    return 0;
}

// Define the main function
FAASM_MAIN_FUNC() {
    // Chain calls to the other functions
    int callOneId = faasmChainThis(1);
    int callTwoId = faasmChainThis(2);

    // Await results
    faasmAwaitCall(callOneId);
    faasmAwaitCall(callTwoId);

    return 0;
}

In Python this looks like:

from pyfaasm.code import await_call, chain_this, faasm_func, faasm_main

@faasm_func(1)
def func_one():
    pass

@faasm_func(2)
def func_two():
    pass

@faasm_main
def main_func():
    call_one = chain_this(1)
    call_two = chain_this(2)

    await_call(call_one)
    await_call(call_two)

Chaining can also be done across functions defined separately, e.g. in C++:

#include "faasm/faasm.h"

FAASM_MAIN_FUNC() {
    // Chain a call to my other function named "other-func"
    int callId = faasmChainFunction("other-func");
    faasmAwaitCall(callId);

    return 0;
}

State

All of a user's functions have access to shared state. This state is implemented as a simple key-value store and accessed by the functions faasmReadState and faasmWriteState. Values read from and written to this state must be byte arrays.

State can be dealt with in a read-only or lock-free manner (and shared in regions of shared memory with colocated functions), or via synchronous distributed locks.

A function accessing state will look something like:

#include "faasm/faasm.h"
#include <cstdint>

FAASM_MAIN_FUNC() {
    const char *key = "my_state_key";

    // Read the state into a buffer
    long stateSize = 123;
    uint8_t *myState = new uint8_t[stateSize];
    faasmReadState(key, myState, stateSize);

    // Do something useful, modify state

    // Write the updated state
    faasmWriteState(key, myState, stateSize);

    // Release the local buffer
    delete[] myState;

    return 0;
}
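Because state values are plain byte arrays, structured data has to be packed into bytes before it is written and unpacked after it is read. The sketch below shows one way to do this for a vector of 32-bit integers. `toBytes` and `fromBytes` are hypothetical helpers, not part of the Faasm API, and the layout assumes the reader and writer share the same endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: pack integers into a flat byte array suitable for state
std::vector<uint8_t> toBytes(const std::vector<int32_t> &values) {
    std::vector<uint8_t> bytes(values.size() * sizeof(int32_t));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), values.data(), bytes.size());
    }
    return bytes;
}

// Hypothetical helper: unpack a byte array that was read back from state
std::vector<int32_t> fromBytes(const std::vector<uint8_t> &bytes) {
    // The byte count must be a whole number of integers
    assert(bytes.size() % sizeof(int32_t) == 0);

    std::vector<int32_t> values(bytes.size() / sizeof(int32_t));
    if (!bytes.empty()) {
        std::memcpy(values.data(), bytes.data(), bytes.size());
    }
    return values;
}
```

The packed buffer and its size are then the kind of arguments a state write would take, and the same size would be used to allocate the buffer for a later read.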

Offset state

Faasm also exposes the faasmReadStateOffset and faasmWriteStateOffset functions, which allow reading and writing subsections of byte arrays stored against keys in the key-value store.

For example, given an array of 100 bytes stored against a key, a function can read bytes 15-20 and then update just those bytes, without transferring the rest of the array.

This can be useful for implementing distributed iterative algorithms.
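To make the offset semantics concrete, the sketch below models a 100-byte value as a local vector, then reads and updates bytes 15-20 in place. `readOffset` and `writeOffset` are hypothetical local helpers that stand in for the idea behind faasmReadStateOffset and faasmWriteStateOffset; their parameters are an assumption, not the real signatures. The range 15-20 is treated as inclusive here, i.e. offset 15 with length 6.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy `len` bytes starting at `offset` out of the value
std::vector<uint8_t> readOffset(const std::vector<uint8_t> &value, std::size_t offset, std::size_t len) {
    assert(offset + len <= value.size());
    return std::vector<uint8_t>(value.begin() + offset, value.begin() + offset + len);
}

// Hypothetical helper: overwrite part of the value, leaving the rest untouched
void writeOffset(std::vector<uint8_t> &value, std::size_t offset, const std::vector<uint8_t> &chunk) {
    assert(offset + chunk.size() <= value.size());
    std::copy(chunk.begin(), chunk.end(), value.begin() + offset);
}

// Read bytes 15-20 (inclusive) of a zeroed 100-byte value, increment them, write them back
std::vector<uint8_t> updateSubsection() {
    std::vector<uint8_t> value(100, 0);

    std::vector<uint8_t> chunk = readOffset(value, 15, 6);
    for (auto &b : chunk) {
        b += 1;
    }
    writeOffset(value, 15, chunk);

    return value;
}
```

Only the six bytes in the chunk are copied in each direction, which is the property that makes offset access attractive when many functions each update their own slice of a large array.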

Uploading state

If you need to prepopulate state for your functions you can use the state upload endpoint. This can be called with:

# Format /s/<username>/<key>
curl http://localhost:8002/s/user123/my_key -X PUT -d "your data"

This state can then be accessed in your functions using the specified key. For larger state, you can also upload a file:

curl http://localhost:8002/s/user123/my_key -X PUT -T /tmp/my_state_file

Where /tmp/my_state_file contains binary data you wish to be inserted at your specified key.

Proto-functions

Proto-functions are a way to reduce the initialisation time of functions. They are a chunk of code that executes once and is then used to spawn all subsequent function invocations. The complete state of the function after proto-function execution is duplicated for all subsequent function invocations.

The proto-function should be idempotent as it may be run more than once.

Proto-function code is marked up with the FAASM_ZYGOTE macro:

#include "faasm/faasm.h"
#include <cstdio>

int myGlobal;

FAASM_ZYGOTE() {
    // Run once
    myGlobal = 5;

    return 0;
}

FAASM_MAIN_FUNC() {
    // Context available to all subsequent function calls
    printf("My global = %i\n", myGlobal);

    return 0;
}
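Conceptually, a proto-function produces a snapshot that every later invocation starts from. The sketch below models that natively. `FunctionState`, `runProto`, `invoke` and `buildSnapshot` are hypothetical names for illustration, and copying a struct stands in for whatever mechanism the runtime actually uses to duplicate function state.

```cpp
#include <cassert>

// Hypothetical model of the state captured after the proto-function runs
struct FunctionState {
    int myGlobal = 0;
    int scratch = 0;
};

// Proto-function: idempotent, so running it twice gives the same snapshot
void runProto(FunctionState &state) {
    state.myGlobal = 5;
}

// Each invocation receives its own copy of the snapshot (pass by value)
int invoke(FunctionState state, int input) {
    state.scratch = input;  // Changes here do not leak into other invocations
    return state.myGlobal + state.scratch;
}

// Build the snapshot once so it can be reused for every call
FunctionState buildSnapshot() {
    FunctionState snapshot;
    runProto(snapshot);
    runProto(snapshot);  // Safe to repeat because the proto-function is idempotent
    assert(snapshot.myGlobal == 5);
    return snapshot;
}
```

Each call to `invoke` sees `myGlobal` already initialised, while the snapshot itself is left unmodified for the next call.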

Scheduling and Work Sharing

Faasm workers schedule tasks via a standard work-sharing approach. Function calls are initially distributed randomly, and a worker that receives a call will hand it off to another worker if that worker is better suited to execute it. As a result, calls to a given function are run by the same worker (and can therefore share local in-memory state) whenever that worker has resources available.
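One way to picture this hand-off decision is sketched below. It is an illustrative model only, not Faasm's scheduler: `Worker`, `chooseWorker`, the "warm function" set and the per-worker capacity are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a worker: which functions it already runs and how busy it is
struct Worker {
    std::set<std::string> warmFunctions;
    int inFlight = 0;
    int capacity = 2;

    bool hasCapacity() const { return inFlight < capacity; }
    bool isWarm(const std::string &func) const { return warmFunctions.count(func) > 0; }
};

// Decide which worker should execute a call that arrived at `receivedAt`.
// Preference order: a warm worker with capacity, then the receiver, then any
// worker with capacity. Returns the index of the chosen worker.
std::size_t chooseWorker(const std::vector<Worker> &workers, std::size_t receivedAt, const std::string &func) {
    assert(receivedAt < workers.size());

    // Keep the call if the receiver is already running this function
    if (workers[receivedAt].isWarm(func) && workers[receivedAt].hasCapacity()) {
        return receivedAt;
    }

    // Otherwise hand off to a worker that is better suited (warm, with capacity)
    for (std::size_t i = 0; i < workers.size(); i++) {
        if (workers[i].isWarm(func) && workers[i].hasCapacity()) {
            return i;
        }
    }

    // Nobody suitable is warm: run it where it landed if there is room
    if (workers[receivedAt].hasCapacity()) {
        return receivedAt;
    }

    // Receiver is full: pick any worker with spare capacity
    for (std::size_t i = 0; i < workers.size(); i++) {
        if (workers[i].hasCapacity()) {
            return i;
        }
    }

    // Everything is full: fall back to the receiver
    return receivedAt;
}
```

With three workers where only the second is warm for a function, a call arriving at the first worker is forwarded to the second until it runs out of capacity, at which point the call stays with the receiver.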

In auto-scaling environments such as Knative, the workers will auto-scale to meet demand.

Integrations

Faasm's recommended integration is with Knative, although it also works with AWS Lambda and IBM Cloud Functions. For more information, consult the docs folder.
