Merge 6764b6f into 3bc6308

alexcasalboni · Jun 10, 2020 · 32b7ac7 · 32b7ac7
2 parents 3bc6308 + 6764b6f
commit 32b7ac7

Showing 8 changed files with 336 additions and 58 deletions.
diff --git a/README-SAR.md b/README-SAR.md
@@ -47,6 +47,8 @@ The AWS Step Functions state machine accepts the following parameters:
 * **autoOptimize** (false by default): if `true`, the state machine will apply the optimal configuration at the end of its execution
 * **autoOptimizeAlias** (string): if provided - and only if `autoOptimize` is `true`, the state machine will create or update this alias with the new optimal power value
 * **dryRun** (false by default): if true, the state machine will execute the input function only once and it will disable every functionality related to logs analysis, auto-tuning, and visualization; the dry-run mode is intended for testing purposes, for example to verify that IAM permissions are set up correctly
+* **preProcessorARN** (string): it must be the ARN of a Lambda function; if provided, the function will be invoked before every invocation of `lambdaARN`; more details below in the Pre/Post-processing functions section
+* **postProcessorARN** (string): it must be the ARN of a Lambda function; if provided, the function will be invoked after every invocation of `lambdaARN`; more details below in the Pre/Post-processing functions section
 
 
 Additionally, you can specify a list of power values at deploy-time in the `PowerValues` CloudFormation parameter. These power values will be used as the default in case no `powerValues` input parameter is provided.
@@ -97,6 +99,33 @@ To simplify these calculations, you could use weights that sum up to 100.
 
 Note: the number of weighted payloads must always be smaller than or equal to `num` (or `num >= count(payloads)`). For example, if you have 50 weighted payloads, you'll need to set at least `num: 50` so that each payload will be used at least once.
 
+
+### Pre/Post-processing functions
+
+Sometimes you need to power-tune Lambda functions that have side effects such as creating or deleting records in a database. In these cases, you may need to execute some pre-processing or post-processing logic before and/or after each function invocation.
+
+For example, imagine that you are power-tuning a function that deletes one record from a downstream database. Since you want to execute this function `num` times you'd need to insert some records in advance and then find a way to delete all of them with a dynamic payload. Or you could simply configure a pre-processing function (using the `preProcessorARN` input parameter) that will create a brand new record before the actual function is executed.
+
+Here's the flow in pseudo-code:
+
+```
+function Executor:
+  iterate from 0 to num:
+    [payload = execute Pre-processor (payload)]
+    results = execute Main Function (payload)
+    [execute Post-processor (results)]
+```
+
+A few things to note:
+
+* You can configure a pre-processor and/or a post-processor independently
+* The pre-processor will receive the original payload
+* If the pre-processor returns a non-empty output, it will overwrite the original payload
+* The post-processor will receive the main function's output as payload
+* If a pre-processor or post-processor fails, the whole power-tuning state machine will fail
+* Pre/post-processors don't have to be in the same region as the main function
+* Pre/post-processors don't alter the statistics related to cost and performance
+
 ## State Machine Output
 
 The state machine will return the following output:
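
To make the Pre/Post-processing section above more concrete, here is a minimal sketch of what a pre-processor for the "delete one record" scenario might look like. Everything in it is an assumption for illustration: `insertRecord` is a hypothetical stand-in for a real database write, the `recordId` field name is invented, and the incoming payload is assumed to be a JSON object. The returned object is non-empty, so per the notes above it would overwrite the original payload.

```javascript
// Hypothetical pre-processor: creates a fresh record before each invocation
// of the function being power-tuned, then returns the payload it should receive.
const crypto = require('crypto');

// Stand-in for a real database write (assumption: swap in your own client).
const insertRecord = async (record) => record;

const handler = async (event) => {
    // Generate a unique ID so that every invocation deletes a different record.
    const id = crypto.randomUUID();
    await insertRecord({ id, createdBy: 'power-tuning' });

    // Non-empty output: this replaces the original payload.
    return { ...event, recordId: id };
};

module.exports = { handler };
```

A post-processor would follow the same shape, except that its `event` is the main function's output and its return value is ignored by the executor.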

diff --git a/README.md b/README.md
@@ -107,7 +107,7 @@ cd the-lambda-power-tuner
 npm run deploy
 ```
 
-For Python deployment, see the instructions [here](https://github.com/cdk-patterns/serverless#2-download-pattern-in-python-or-typescript-cdk)
+For Python deployment, see the instructions [here](https://github.com/cdk-patterns/serverless#2-download-pattern-in-python-or-typescript-cdk).
 
 ## How to execute the state machine (programmatically)
 
@@ -146,13 +146,15 @@ The AWS Step Functions state machine accepts the following parameters:
 * **lambdaARN** (required, string): unique identifier of the Lambda function you want to optimize
 * **powerValues** (optional, string or list of integers): the list of power values to be tested; if not provided, the default values configured at deploy-time are used (by default: 128MB, 256MB, 512MB, 1024MB, 1536MB, and 3008MB); you can provide any power values between 128MB and 3,008MB in 64 MB increments; if you provide the string `"ALL"` instead of a list, all possible power configurations will be tested
 * **num** (required, integer): the # of invocations for each power configuration (minimum 5, recommended: between 10 and 100)
-* **payload** (string, object, or list): the static payload that will be used for every invocation (object or string); when using a list, a weighted payload is expected in the shape of `[{"payload": {...}, "weight": X }, {"payload": {...}, "weight": Y }, {"payload": {...}, "weight": Z }]`, where the weights `X`, `Y`, and `Z` are treated as relative weights (not percentages); more details below in the Weighted Payloads section
+* **payload** (string, object, or list): the static payload that will be used for every invocation (object or string); when using a list, a weighted payload is expected in the shape of `[{"payload": {...}, "weight": X }, {"payload": {...}, "weight": Y }, {"payload": {...}, "weight": Z }]`, where the weights `X`, `Y`, and `Z` are treated as relative weights (not percentages); more details below in the [Weighted Payloads section](#user-content-weighted-payloads)
 * **parallelInvocation** (false by default): if true, all the invocations will be executed in parallel (note: depending on the value of `num`, you may experience throttling when setting `parallelInvocation` to true)
 * **strategy** (string): it can be `"cost"` or `"speed"` or `"balanced"` (the default value is `"cost"`); if you use `"cost"` the state machine will suggest the cheapest option (disregarding its performance), while if you use `"speed"` the state machine will suggest the fastest option (disregarding its cost). When using `"balanced"` the state machine will choose a compromise between `"cost"` and `"speed"` according to the parameter `"balancedWeight"`
 * **balancedWeight** (number between 0.0 and 1.0, by default is 0.5): parameter that expresses the trade-off between cost and time, 0.0 is equivalent to `"speed"` strategy, 1.0 is equivalent to `"cost"` strategy
 * **autoOptimize** (false by default): if `true`, the state machine will apply the optimal configuration at the end of its execution
 * **autoOptimizeAlias** (string): if provided - and only if `autoOptimize` is `true`, the state machine will create or update this alias with the new optimal power value
 * **dryRun** (false by default): if true, the state machine will execute the input function only once and it will disable every functionality related to logs analysis, auto-tuning, and visualization; the dry-run mode is intended for testing purposes, for example to verify that IAM permissions are set up correctly
+* **preProcessorARN** (string): it must be the ARN of a Lambda function; if provided, the function will be invoked before every invocation of `lambdaARN`; more details below in the [Pre/Post-processing functions section](#user-content-prepost-processing-functions)
+* **postProcessorARN** (string): it must be the ARN of a Lambda function; if provided, the function will be invoked after every invocation of `lambdaARN`; more details below in the [Pre/Post-processing functions section](#user-content-prepost-processing-functions)
 
 
 Additionally, you can specify a list of power values at deploy-time in the `PowerValues` CloudFormation parameter. These power values will be used as the default in case no `powerValues` input parameter is provided.
@@ -205,6 +207,32 @@ To simplify these calculations, you could use weights that sum up to 100.
 Note: the number of weighted payloads must always be smaller than or equal to `num` (or `num >= count(payloads)`). For example, if you have 50 weighted payloads, you'll need to set at least `num: 50` so that each payload will be used at least once.
 
 
+### Pre/Post-processing functions
+
+Sometimes you need to power-tune Lambda functions that have side effects such as creating or deleting records in a database. In these cases, you may need to execute some pre-processing or post-processing logic before and/or after each function invocation.
+
+For example, imagine that you are power-tuning a function that deletes one record from a downstream database. Since you want to execute this function `num` times you'd need to insert some records in advance and then find a way to delete all of them with a dynamic payload. Or you could simply configure a pre-processing function (using the `preProcessorARN` input parameter) that will create a brand new record before the actual function is executed.
+
+Here's the flow in pseudo-code:
+
+```
+function Executor:
+  iterate from 0 to num:
+    [payload = execute Pre-processor (payload)]
+    results = execute Main Function (payload)
+    [execute Post-processor (results)]
+```
+
+A few things to note:
+
+* You can configure a pre-processor and/or a post-processor independently
+* The pre-processor will receive the original payload
+* If the pre-processor returns a non-empty output, it will overwrite the original payload
+* The post-processor will receive the main function's output as payload
+* If a pre-processor or post-processor fails, the whole power-tuning state machine will fail
+* Pre/post-processors don't have to be in the same region as the main function
+* Pre/post-processors don't alter the statistics related to cost and performance
+
 ## State Machine Output
 
 The state machine will return the following output:
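
As a rough illustration of the two new parameters documented above, an execution input could be assembled like this. The ARNs, account ID, function names, and payload are placeholders invented for the example; only the parameter names come from the README.

```javascript
// Example state machine input using the new pre/post-processor parameters.
// All ARNs below are made-up placeholders, not real resources.
const input = {
    lambdaARN: 'arn:aws:lambda:us-east-1:123456789012:function:delete-record',
    powerValues: [128, 256, 512, 1024],
    num: 10,
    payload: { table: 'orders' },
    // invoked before every invocation of lambdaARN
    preProcessorARN: 'arn:aws:lambda:us-east-1:123456789012:function:create-record',
    // invoked after every invocation of lambdaARN
    postProcessorARN: 'arn:aws:lambda:us-east-1:123456789012:function:cleanup',
};

// The state machine expects the input as a JSON string.
const serializedInput = JSON.stringify(input, null, 2);
console.log(serializedInput);
```

Either processor can be omitted; the executor only invokes the ones that are provided.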

diff --git a/lambda/executor.js b/lambda/executor.js
@@ -10,7 +10,16 @@ const minRAM = parseInt(process.env.minRAM, 10);
  */
 module.exports.handler = async(event, context) => {
     // read input from event
-    let {lambdaARN, value, num, enableParallel, payload, dryRun} = extractDataFromInput(event);
+    let {
+        lambdaARN,
+        value,
+        num,
+        enableParallel,
+        payload,
+        dryRun,
+        preProcessorARN,
+        postProcessorARN,
+    } = extractDataFromInput(event);
 
     validateInput(lambdaARN, value, num); // may throw
 
@@ -27,9 +36,9 @@ module.exports.handler = async(event, context) => {
     const payloads = generatePayloads(num, payload);
 
     if (enableParallel) {
-        results = await runInParallel(num, lambdaARN, lambdaAlias, payloads);
+        results = await runInParallel(num, lambdaARN, lambdaAlias, payloads, preProcessorARN, postProcessorARN);
     } else {
-        results = await runInSeries(num, lambdaARN, lambdaAlias, payloads);
+        results = await runInSeries(num, lambdaARN, lambdaAlias, payloads, preProcessorARN, postProcessorARN);
     }
 
     // get base cost
@@ -59,6 +68,8 @@ const extractDataFromInput = (event) => {
         enableParallel: !!event.parallelInvocation,
         payload: event.payload,
         dryRun: event.dryRun === true,
+        preProcessorARN: event.preProcessorARN,
+        postProcessorARN: event.postProcessorARN,
     };
 };
 
@@ -111,14 +122,14 @@ const convertPayload = (payload) => {
     return payload;
 };
 
-const runInParallel = async(num, lambdaARN, lambdaAlias, payloads) => {
+const runInParallel = async(num, lambdaARN, lambdaAlias, payloads, preARN, postARN) => {
     const results = [];
     // run all invocations in parallel ...
     const invocations = utils.range(num).map(async(_, i) => {
-        const data = await utils.invokeLambda(lambdaARN, lambdaAlias, payloads[i]);
+        const data = await utils.invokeLambdaWithProcessors(lambdaARN, lambdaAlias, payloads[i], preARN, postARN);
         // invocation errors return 200 and contain FunctionError and Payload
         if (data.FunctionError) {
-            throw new Error(`Invocation error (running in parallel): ${data.Payload} with payload ${payloads[i]}`);
+            throw new Error(`Invocation error (running in parallel): ${data.Payload} with payload ${JSON.stringify(payloads[i])}`);
         }
         results.push(data);
     });
@@ -127,14 +138,14 @@ const runInParallel = async(num, lambdaARN, lambdaAlias, payloads) => {
     return results;
 };
 
-const runInSeries = async(num, lambdaARN, lambdaAlias, payloads) => {
+const runInSeries = async(num, lambdaARN, lambdaAlias, payloads, preARN, postARN) => {
     const results = [];
     for (let i = 0; i < num; i++) {
         // run invocations in series
-        const data = await utils.invokeLambda(lambdaARN, lambdaAlias, payloads[i]);
+        const data = await utils.invokeLambdaWithProcessors(lambdaARN, lambdaAlias, payloads[i], preARN, postARN);
         // invocation errors return 200 and contain FunctionError and Payload
         if (data.FunctionError) {
-            throw new Error(`Invocation error (running in series): ${data.Payload} with payload ${payloads[i]}`);
+            throw new Error(`Invocation error (running in series): ${data.Payload} with payload ${JSON.stringify(payloads[i])}`);
         }
         results.push(data);
     }
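
The switch to `JSON.stringify(payloads[i])` in the two error messages above matters whenever the payload is an object, because template literals coerce objects with `toString()`. A small standalone snippet (the payload value is invented for the example) shows the difference:

```javascript
// Object payloads lose their content when interpolated directly.
const payload = { id: 42 };

const before = `with payload ${payload}`;
const after = `with payload ${JSON.stringify(payload)}`;

console.log(before); // with payload [object Object]
console.log(after);  // with payload {"id":42}
```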

diff --git a/lambda/utils.js b/lambda/utils.js
@@ -172,6 +172,44 @@ module.exports.deleteLambdaAlias = (lambdaARN, alias) => {
     return lambda.deleteAlias(params).promise();
 };
 
+/**
+ * Invoke a (pre/post-)processor Lambda function and return its output (data.Payload).
+ */
+module.exports.invokeLambdaProcessor = async(processorARN, payload) => {
+    const processorData = await utils.invokeLambda(processorARN, null, payload);
+    if (processorData.FunctionError) {
+        throw new Error(`Processor ${processorARN} failed with error ${processorData.Payload} and payload ${JSON.stringify(payload)}`);
+    }
+    return processorData.Payload;
+};
+
+/**
+ * Wrapper around Lambda function invocation with pre/post-processor functions.
+ */
+module.exports.invokeLambdaWithProcessors = async(lambdaARN, alias, payload, preARN, postARN) => {
+    // first invoke pre-processor, if provided
+    if (preARN) {
+        console.log('Invoking pre-processor');
+        // overwrite payload with pre-processor's output (only if not empty)
+        const preProcessorOutput = await utils.invokeLambdaProcessor(preARN, payload);
+        if (preProcessorOutput) {
+            payload = preProcessorOutput;
+        }
+    }
+
+    // invoke function to be power-tuned
+    const data = await utils.invokeLambda(lambdaARN, alias, payload);
+
+    // then invoke post-processor, if provided
+    if (postARN) {
+        console.log('Invoking post-processor');
+        // note: invocation may have failed (data.FunctionError)
+        await utils.invokeLambdaProcessor(postARN, data.Payload);
+    }
+
+    return data;
+};
+
 /**
  * Invoke a given Lambda Function:Alias with payload and return its logs.
  */
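
The control flow of `invokeLambdaWithProcessors` can be exercised without AWS by injecting a fake invoker. The sketch below is a simplified re-statement of the wrapper, not the committed code: the `invoke` parameter and `fakeInvoke` are hypothetical, the alias argument is dropped, and the ARNs are plain strings. It illustrates two behaviors from the diff: an empty pre-processor output leaves the original payload in place, and the post-processor receives the main function's `Payload`.

```javascript
// Simplified stand-in for the wrapper; `invoke` replaces utils.invokeLambda
// so the flow can run locally (assumption made for this sketch only).
const invokeWithProcessors = async (invoke, lambdaARN, payload, preARN, postARN) => {
    if (preARN) {
        const pre = await invoke(preARN, payload);
        if (pre.FunctionError) throw new Error(`Processor ${preARN} failed`);
        // Only a non-empty output overwrites the original payload.
        if (pre.Payload) payload = pre.Payload;
    }

    const data = await invoke(lambdaARN, payload);

    if (postARN) {
        const post = await invoke(postARN, data.Payload);
        if (post.FunctionError) throw new Error(`Processor ${postARN} failed`);
    }
    return data;
};

// Fake invoker that records the call order and returns canned responses.
const calls = [];
const fakeInvoke = async (arn, payload) => {
    calls.push([arn, payload]);
    if (arn === 'pre') return { Payload: '' }; // empty output
    if (arn === 'main') return { Payload: `done:${payload}` };
    return { Payload: 'ok' };
};

const demo = () => invokeWithProcessors(fakeInvoke, 'main', 'original', 'pre', 'post');
```

Calling `demo()` records three invocations in `calls`, in the order pre-processor, main function, post-processor.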

diff --git a/package-lock.json b/package-lock.json
diff --git a/scripts/deploy.sh b/scripts/deploy.sh
@@ -1,6 +1,6 @@
 # config
-BUCKET_NAME=your-sam-templates-bucket
-STACK_NAME=lambda-power-tuning
+BUCKET_NAME=sam-templates-demos-cpt
+STACK_NAME=lambda-power-tuning-new
 PowerValues='128,256,512,1024,1536,3008'
 LambdaResource='*'