# SAAF Notebook

This Jupyter Notebook provides an interactive platform for FaaS development. 

## Part 1: Notebook Setup

Welcome to the SAAF Jupyter notebook! This default notebook provides comments to guide you through all of the main features. If you run into errors or problems, please make sure that you have the AWS CLI properly configured so that you can deploy functions with it, have Docker installed and running, have given execute permission to everything in the /jupyter_workspace and /test directories, and have installed all the dependencies. You can use quickInstall.sh in the root folder to walk you through the setup process and install dependencies. This notebook is designed to be run locally or on EC2 instances and works very well with Visual Studio Code's Jupyter plugin. Other environments may work, but getting this notebook to run on cloud-based platforms like Google Colab may be very difficult.

Anyway, this first cell is just the imports needed to set up the magic that goes on behind the scenes. Run it and it should return nothing. In this cell we also define our config object, which contains any information that we need to deploy functions, such as a role for functions on AWS Lambda. If all of your functions will use the same config object, you can set it globally by using setGlobalConfig. Any methods that take a config object will prioritize the object passed to them over the global config.

The setGlobalDeploy function controls whether your cloud functions are automatically deployed when they are run. Automatic deployment can be disabled by passing False.
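
The precedence rule for config objects can be pictured with a small stand-alone sketch. The names `set_global_config_sketch` and `resolve_config` are hypothetical and are not part of interactive_helpers.py. The sketch also assumes that a passed config replaces the global one wholesale rather than being merged key by key.

```python
# Hypothetical illustration only; SAAF's real helpers live in interactive_helpers.py.
_global_config = {}

def set_global_config_sketch(config):
    """Remember a config to use when a call does not supply its own."""
    global _global_config
    _global_config = dict(config)

def resolve_config(config=None):
    """Return the explicitly passed config if there is one, else the global config."""
    return config if config is not None else _global_config

set_global_config_sketch({"lambdaRoleARN": "global-role-placeholder"})
print(resolve_config())                                         # falls back to the global config
print(resolve_config({"lambdaRoleARN": "other-placeholder"}))   # the passed config wins
```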

Function documentation is available in jupyter_workspace/platforms/jupyter/interactive_helpers.py


In [None]:
import os
import sys
sys.path.append(os.path.realpath('..'))
from platforms.jupyter.interactive_helpers import *

# Configure your function details here. Currently the only thing you need is a Lambda role ARN to assign to functions.
config = {
    "lambdaRoleARN": "{FILL THIS IN}"
}
setGlobalConfig(config)

# If you want to disable automatic deployment across the entire notebook change this.
setGlobalDeploy(True)

# Functions

Any function with the @cloud_function decorator will be uploaded to the cloud. Define platforms and memory settings in the decorator. 
Functions are tested locally and must run successfully before being deployed.

## Part 2: Deploying Functions

Here is your first cloud function! Creating cloud functions is as simple as writing Python functions with (request, context) arguments and adding the @cloud_function decorator! The two hello world functions below are nearly identical, but when we run them we will see that the CPU reported on the cloud by the SAAF Inspector's inspectCPU method is different from our local CPU. That is because the function is running on AWS Lambda!

You can add arguments to the cloud_function decorator to define the platform you would like to deploy to, the memory setting, and different context objects. Other arguments like references, requirements, and containerize can be used to change behavior.

Cloud functions defined in this notebook do have a few limitations. The main one is that nothing outside the function is deployed to the cloud. That is why imports are inside the function, which is a little weird and can have an effect on what you can import. But for most things this is fine. 
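
To see why the imports sit inside the function, here is a minimal sketch that runs a function's source in an empty namespace, which loosely mimics shipping only the function body. The helper `run_isolated` and the two source strings are made up for this illustration and say nothing about how SAAF actually packages code.

```python
def run_isolated(source, request):
    """Execute function source in an empty namespace and call its handler."""
    namespace = {}
    exec(source, namespace)
    return namespace["handler"](request, None)

# Relies on an import that lives outside the function body.
OUTSIDE_IMPORT = '''
def handler(request, context):
    return json.dumps(request)
'''

# Carries its own import, so the body is self-sufficient.
INSIDE_IMPORT = '''
def handler(request, context):
    import json
    return json.dumps(request)
'''

try:
    run_isolated(OUTSIDE_IMPORT, {"name": "Steve"})
    outcome = "ran"
except NameError as error:
    outcome = "failed: " + str(error)

print(outcome)                                          # the outside import is missing
print(run_isolated(INSIDE_IMPORT, {"name": "Steve"}))   # the inside import works
```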

Alongside your function code, you can deploy files by adding them to the src/includes_{function name} folder (this function will use src/includes_hello_world). This folder is created automatically when the function is run. You can include basically anything: files, scripts, Python libraries, whatever you need.
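
As a quick illustration of the naming pattern, the sketch below builds the includes folder path from a function's name. `includes_dir` is a hypothetical helper written for this example, and the `hello_world` defined here is only a placeholder. The real folder is created by the notebook tooling.

```python
from pathlib import Path

def includes_dir(function, root="src"):
    """Build the includes folder path for a function from its name."""
    return Path(root) / ("includes_" + function.__name__)

def hello_world(request, context):
    """Placeholder with the same name as the notebook's example function."""
    return {}

print(includes_dir(hello_world).as_posix())
```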

If everything is set up correctly, all you need to do is run this code block and you'll get a hello_world function on AWS Lambda! If not all dependencies are installed, you can use ./quickInstall.sh to download them.

In [None]:
@cloud_function()
def hello_world(request, context): 
    from Inspector import Inspector
    inspector = Inspector()
    inspector.inspectCPU()
    inspector.addAttribute("message", "Hello from the cloud " + str(request["name"]) + "!")
    return inspector.finish()

def hello_world_local(request, context): 
    from Inspector import Inspector
    inspector = Inspector()
    inspector.inspectCPU()
    inspector.addAttribute("message", "Hello from your computer " + str(request["name"]) + "!") 
    return inspector.finish()

# Run our local hello_world function and check the CPU.
local = hello_world_local({"name": "Steve"}, None)
print("Local CPU: " + local['cpuType'])

# Run our cloud hello_world function and check the CPU.
cloud = hello_world({"name": "Steve"}, None)
print("Cloud CPU: " + cloud['cpuType'])


## Part 3: Chaining Functions and Run Modes

What if we want one cloud function to call another function? This jello_world cloud function calls the hello_world cloud function we created earlier. What is going to happen? When deployed, code will be generated automatically so that the jello_world function makes a request to call the hello_world function! Simply add any cloud functions that this function calls to the references list and this code will be generated.

This function isn't cheating by just deploying both hello_world and jello_world together; both are deployed as separate functions, and jello_world makes requests to hello_world. This example isn't practical, but all features of Python, such as multithreading, can be used to make multiple requests to functions in parallel. After running, see src/handler_jello_world.py for the automatically generated source code.
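
The multithreading idea can be sketched with the standard library's thread pool. `stand_in_hello_world` is a local placeholder rather than a deployed function, and `call_in_parallel` is a hypothetical helper. Inside a real cloud function you would call the referenced cloud function instead.

```python
from concurrent.futures import ThreadPoolExecutor

def stand_in_hello_world(request, context):
    """Local placeholder for a deployed function; a real call would be a network request."""
    return {"message": "Hello from the cloud " + str(request["name"]) + "!"}

def call_in_parallel(function, payloads, max_workers=4):
    """Call function once per payload on a thread pool; results keep payload order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda payload: function(payload, None), payloads))

responses = call_in_parallel(stand_in_hello_world, [{"name": "Bob"}, {"name": "Steve"}])
for response in responses:
    print(response["message"].replace("Hello", "Jello"))
```

Threads suit this case because each call mostly waits on a network response rather than using the CPU.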

In addition, this function has a custom run mode. There are three run modes that define how functions are executed when they are called. By default, RunMode.CLOUD is used, and calling cloud functions will run them on the cloud. RunMode.LOCAL makes cloud functions execute locally when called directly, so to run them on the cloud you must use the test method. As you can see here, we have a single function, but as in the previous example we see different CPUs depending on whether it is run locally or on the cloud using the test method. However, since hello_world is still a cloud function with the default RunMode.CLOUD, it will be called on the cloud instead of running locally. Finally, if you don't want your functions to run locally or on the cloud when the cell is run, but only to be deployed, you can use RunMode.NONE.
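
The three run modes can be summarized as a small dispatch sketch. `SketchRunMode` and `dispatch` are hypothetical names, not the notebook's RunMode or its internals. Returning None for the NONE mode is also an assumption made only to keep the example short.

```python
from enum import Enum

class SketchRunMode(Enum):
    """Stand-in for the three run modes described above."""
    CLOUD = "cloud"
    LOCAL = "local"
    NONE = "none"

def dispatch(run_mode, run_locally, run_on_cloud, request):
    """Pick where a direct call executes, following the three modes."""
    if run_mode is SketchRunMode.CLOUD:
        return run_on_cloud(request)
    if run_mode is SketchRunMode.LOCAL:
        return run_locally(request)
    return None  # NONE: assumed to mean deploy only, so a direct call executes nothing

def run_locally(request):
    return "local:" + request["name"]

def run_on_cloud(request):
    return "cloud:" + request["name"]

for mode in SketchRunMode:
    print(mode.name, dispatch(mode, run_locally, run_on_cloud, {"name": "Bob"}))
```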

In [None]:
@cloud_function(references=[hello_world], runMode=RunMode.LOCAL)
def jello_world(request, context): 
    from Inspector import Inspector
    inspector = Inspector()
    inspector.inspectAll()
    
    cloud_request = hello_world(request, None)
    hello_message = cloud_request['message']
    jello_message = hello_message.replace("Hello", "Jello")
    inspector.addAttribute("message", jello_message)
    inspector.addAttribute("cloud_request", cloud_request)
    
    inspector.inspectAllDeltas()
    return inspector.finish()


local = jello_world({"name": "Bob"}, None)
print("---")
print("Local jello_world CPU: " + local['cpuType'])
print("Local hello_world call in jello_world CPU: " + local['cloud_request']['cpuType'])

cloud = test(function=jello_world, payload={"name": "Bob"}, quiet=True, skipLocal=True)
print("---")
print("Cloud jello_world CPU: " + cloud['cpuType'])

## Part 4: Requirements and Containerize

This function requires the igraph dependency, which you can see defined in the requirements argument of the decorator. In addition, this function will be deployed as a container instead of a zip file. Containers are built, pushed to ECR, and deployed to AWS Lambda. For all function builds, you can see the generated files in the /deploy directory. The complete build for this function will be in /deploy/graph_rank_container_aws_build, where you will be able to see all the Python files, dependencies, and the Dockerfile. The build folder is destroyed and recreated every time a function is deployed, so editing it manually is not recommended.

If the run mode were set to local, any dependencies this function uses would need to be installed locally first. But since this function uses the default CLOUD run mode, you do not need to install them.
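
If you do switch to the local run mode, a quick check like the one below can tell you whether a module is importable before you run the function. `missing_modules` is a hypothetical helper built on the standard library's `importlib.util.find_spec`. Note that it takes import names (igraph), not package names (python-igraph).

```python
import importlib.util

def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]

# The result depends on what is installed on your machine.
print(missing_modules(["json", "igraph"]))
```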

This function also uses more memory than the others, so we have changed the memory setting to 1024 MB instead of the default 256 MB.

In [None]:
@cloud_function(memory=1024, requirements="python-igraph", containerize=True)
def page_rank_container(request, context):
    from Inspector import Inspector 
    import datetime
    import igraph
    import time
    
    inspector = Inspector()
    inspector.inspectAll()
    
    size = request.get('size')  
    loops = request.get('loops')

    for x in range(loops):
        graph = igraph.Graph.Tree(size, 10)
        result = graph.pagerank()  

    inspector.inspectAllDeltas()
    return inspector.finish()

page_rank_container({"size": 10000, "loops": 5}, None)

# Execute Experiments

Use FaaS Runner to execute complex FaaS Experiments.

## Part 5: FaaS Runner Experiments

Now, what's cooler than running a function on the cloud once? Running it multiple times! The run_experiment function allows you to create complex FaaS experiments. This function uses our FaaS Runner application to execute functions behind the scenes. Its primary purpose is to run multiple function requests across many threads. You define payloads in the payloads list, choose your memory setting (it will switch settings automatically), and define how many runs you want to do, across how many threads, and how many times you want to repeat the test with iterations. These are the most important parameters, but there are many more, documented at the link below.
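
To get a feel for how these parameters combine, here is a rough back-of-the-envelope sketch. It assumes that runs is the total number of requests per iteration, that those runs are split across the threads, and that every iteration and memory setting repeats the whole set. Warm-up requests are ignored. `planned_requests` is a hypothetical helper, so check the FaaS Runner documentation for the exact semantics.

```python
def planned_requests(experiment):
    """Estimate request counts from an experiment definition (assumed semantics)."""
    settings = max(1, len(experiment["memorySettings"]))
    total = experiment["runs"] * experiment["iterations"] * settings
    per_thread, leftover = divmod(experiment["runs"], experiment["threads"])
    return {"total": total, "runs_per_thread": per_thread, "leftover_runs": leftover}

example = {"memorySettings": [256], "runs": 25, "threads": 5, "iterations": 4}
print(planned_requests(example))
# {'total': 100, 'runs_per_thread': 5, 'leftover_runs': 0}
```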

After an experiment runs, the results are converted into a pandas dataframe that you can continue using in this notebook. For example you can use matplotlib to generate graphs (see below), or do any other form of data processing. 

Below are two different experiments for our functions. Execute them and generate graphs using the code cells below. You now have experienced all the functionality of the SAAF Jupyter Workspace! Happy FaaS developing!


In [None]:

# Define experiment parameters. For more detail see: https://github.com/wlloyduw/SAAF/tree/master/test
hello_experiment = {
  "payloads": [{"name": "Bob"}],
  "memorySettings": [256],
  "runs": 25,
  "threads": 5,
  "iterations": 4,
  "warmupBuffer": 0,
  "sleepTime": 0,
  "randomSeed": 42,
  "showAsList": [],
  "showAsSum": ["newcontainer"],
  "ignoreFromAll": ["zAll", "version", "linuxVersion", "hostname"],
  "invalidators": {},
  "removeDuplicateContainers": False,
  "overlapFilter": "functionName",
  "openCSV": False
}

# Execute experiment
hello_world_results = run_experiment(function=hello_world, experiment=hello_experiment)

In [None]:
page_rank_experiment = {
  "payloads": [{"size": 50000, "loops": 5},
                {"size": 100000, "loops": 5},
                {"size": 150000, "loops": 5}],
  "memorySettings": [512],
  "runs": 100,
  "threads": 100,
  "iterations": 2,
  "warmupBuffer": 1,
  "sleepTime": 0,
  "randomSeed": 42,
  "showAsList": [],
  "showAsSum": ["newcontainer"],
  "ignoreFromAll": ["zAll", "version", "linuxVersion", "hostname"],
  "invalidators": {},
  "removeDuplicateContainers": False,
  "overlapFilter": "functionName",
  "openCSV": True
}

# Execute experiment
page_rank_results = run_experiment(function=page_rank_container, experiment=page_rank_experiment)

# Process Results

FaaS Runner experiment results are parsed into a pandas dataframe, so you can perform any kind of data processing that you would like.
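
As a small example of such processing, the sketch below computes summary statistics with only the standard library. `summarize` is a hypothetical helper and the numbers are made-up placeholders, not measurements. Since it accepts any iterable of numbers, a dataframe column such as hello_world_results['userRuntime'] should work as input too.

```python
import statistics

def summarize(values):
    """Return basic summary statistics for an iterable of numbers."""
    values = [float(value) for value in values]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

placeholder_runtimes = [100, 200, 300]  # made-up values for illustration only
print(summarize(placeholder_runtimes))
```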

In [None]:
# Import matplotlib and setup display.
import matplotlib.pyplot as plt
%matplotlib inline

# Histogram of runtime
plt.hist(hello_world_results['userRuntime'], 10)

In [None]:
# Import matplotlib and setup display.
import matplotlib.pyplot as plt
%matplotlib inline

# Histogram of runtime
plt.hist(page_rank_results['userRuntime'], 10)

## Part 6: Expand to Multiple Platforms

SAAF functions and the cloud functions in this notebook support multiple platforms. Currently, functions can be deployed to AWS Lambda, Google Cloud Functions, and IBM Cloud Functions. Azure Functions should work, but the deployment scripts are currently broken, so Azure Functions support is planned for the future. All three supported platforms offer the same features as AWS Lambda functions, with a few limitations:
 1. If a function is deployed to multiple platforms (such as the one below) then only the LOCAL run mode is supported. If a single platform is selected then the CLOUD run mode is supported on all platforms.
 2. Google Cloud Functions take a while to deploy and regularly need to be deployed more than once to get running. To deploy a function multiple times run this cell, wait for the processing to finish, make a minor change to the code (like adding a space at the end of a line) and run the cell again.
 3. Containerize is only supported on AWS Lambda.
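
Limitations 1 and 3 can be expressed as a small pre-flight check. `check_settings` is a hypothetical helper that uses plain strings in place of the notebook's Platform and RunMode values. It only restates the rules listed above and is not something SAAF provides.

```python
def check_settings(platforms, run_mode="CLOUD", containerize=False):
    """Return a list of problems with a planned deployment, per the limitations above."""
    problems = []
    if len(platforms) > 1 and run_mode != "LOCAL":
        problems.append("multiple platforms: only the LOCAL run mode is supported")
    if containerize and any(platform != "AWS" for platform in platforms):
        problems.append("containerize is only supported on AWS Lambda")
    return problems

print(check_settings(["AWS", "GCF", "IBM"], run_mode="LOCAL"))   # no problems
print(check_settings(["AWS", "GCF"], run_mode="CLOUD"))          # run mode problem
print(check_settings(["GCF"], containerize=True))                # containerize problem
```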

Run the cell below to deploy this cloud function to all supported platforms. After the functions are deployed and working, run the experiment below to compare the runtime of each platform.

In [None]:
@cloud_function(platforms=[Platform.AWS, Platform.GCF, Platform.IBM], runMode=RunMode.LOCAL)
def hello_faas(request, context): 
    from Inspector import Inspector
    inspector = Inspector()
    inspector.inspectAll()
    inspector.addAttribute("message", "Hello from " + inspector.getAttribute("platform") +  " " + str(request["name"]) + "!")
    inspector.inspectAllDeltas()
    return inspector.finish()

hello_faas({"name": "Bob"}, None) # Runs locally: with several platforms there is no single cloud target, so multi-platform functions only support RunMode.LOCAL

# So instead use test to call your function on each platform...
test(function=hello_faas, payload={"name": "Bob"})

In [None]:
# Or run experiments on each platform. Here we define a simple experiment.
platform_experiment = {
  "payloads": [{"name": "Bob"}],
  "memorySettings": [],
  "runs": 10,
  "threads": 10,
  "iterations": 1,
  "warmupBuffer": 0,
  "sleepTime": 0,
  "randomSeed": 42,
  "showAsList": [],
  "showAsSum": ["newcontainer"],
  "ignoreFromAll": ["zAll", "version", "linuxVersion", "hostname"],
  "invalidators": {},
  "removeDuplicateContainers": False,
  "overlapFilter": "functionName",
  "openCSV": False
}

# Then run the experiments on each platform
lambdaResults = run_experiment(function=hello_faas, experiment=platform_experiment, platform=Platform.AWS)
googleResults = run_experiment(function=hello_faas, experiment=platform_experiment, platform=Platform.GCF)
ibmResults = run_experiment(function=hello_faas, experiment=platform_experiment, platform=Platform.IBM)

# And find out the average runtime of each platform.
import numpy as np
print("AWS Lambda Average Runtime: " + str(np.average(lambdaResults['runtime'])))
print("Google Cloud Functions Average Runtime: " + str(np.average(googleResults['runtime'])))
print("IBM Cloud Functions Average Runtime: " + str(np.average(ibmResults['runtime'])))