Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
Idleness optimisation in bag-of-tasks processing through cloud functions
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Type||Name||Latest commit message||Commit time|
|Failed to load latest commit information.|
This little experiment attempts to find out whether the "idleness optimisation" problem in FaaS would have a feasible and practical solution. In short, the problem statement is that since cloud functions are billed in 100ms intervals, there may be a loss if cloud functions execute just a little longer. The idea is when the function processes many units of data, then processing another one (and another one, if needed...) would yield a situation where the execution time comes close to the billing interval, thus reducing the loss at the expense of reduced parallelisation. The datasets 'ds1' and 'ds2' consist of some nice photos (28 and 73, respectively) which are retrieved by the script 'gettestdata.sh' which defaults to the larger dataset 'ds2'. As the files are quite big, the script 'shrinktestdata.sh' produces a smaller variant. See 'gettestdata.doc' for details. Instructions: - run ./gettestdata.sh, it will fetch some sample data, nice pictures of Poland and Switzerland - run python3 strategies.py, it will take some time... to repeatedly blur all images - statistics will be output into strategies.csv and some more on the terminal directly - you can compare strategies.csv against the existing reference results strategies.*.csv Interpretation: - the saving should be close to the theoretic maximum while the maxthread value should be as small as possible When using the FaaS execution instead of the simulation, the local script 'faasproducer.py' uploads the chosen dataset into a queue in the cloud (AWS SQS), and the cloud function script 'faasconsumer.py' then runs as Lambda to fake-process the data. (Actual processing would require the installation of Scipy into the function bundle.) The FaaS execution is currently incomplete. The cloud function script 'faasproxy.py' adds monetary information. The local script 'faasmeasure.py' invokes the proxy and finally retrieves statistics. Reproducibility: 0. only once: ./gettestdata.sh ds2; ./shrinktestdata.sh _testdata.ds2 1. python3 faasproducer.py _testdata.ds2-small/ 73 data units; theoretic max savings boundary = 36.50s ... (wait) 2. python3 faasmeasure.py 1 False 3. # inspect data/faasmeasure.1.False.log 4. # restart with 1. and tune parameters in 2., e.g. 3 False