add pre/post-processors to documentation

alexcasalboni · Jun 10, 2020 · 6764b6f
1 parent 9a83281

Showing 2 changed files with 59 additions and 2 deletions.
diff --git a/README-SAR.md b/README-SAR.md
@@ -47,6 +47,8 @@ The AWS Step Functions state machine accepts the following parameters:
 * **autoOptimize** (false by default): if `true`, the state machine will apply the optimal configuration at the end of its execution
 * **autoOptimizeAlias** (string): if provided - and only if `autoOptimize` is `true`, the state machine will create or update this alias with the new optimal power value
 * **dryRun** (false by default): if true, the state machine will execute the input function only once and it will disable every functionality related to logs analysis, auto-tuning, and visualization; the dry-run mode is intended for testing purposes, for example to verify that IAM permissions are set up correctly
+* **preProcessorARN** (string): it must be the ARN of a Lambda function; if provided, the function will be invoked before every invocation of `lambdaARN`; more details below in the Pre/Post-processing functions section
+* **postProcessorARN** (string): it must be the ARN of a Lambda function; if provided, the function will be invoked after every invocation of `lambdaARN`; more details below in the Pre/Post-processing functions section
 
 
 Additionally, you can specify a list of power values at deploy-time in the `PowerValues` CloudFormation parameter. These power values will be used as the default in case no `powerValues` input parameter is provided.
@@ -97,6 +99,33 @@ To simplify these calculations, you could use weights that sum up to 100.
 
 Note: the number of weighted payloads must always be smaller than or equal to `num` (that is, `num >= count(payloads)`). For example, if you have 50 weighted payloads, you'll need to set at least `num: 50` so that each payload will be used at least once.
 
+
+### Pre/Post-processing functions
+
+Sometimes you need to power-tune Lambda functions that have side effects such as creating or deleting records in a database. In these cases, you may need to execute some pre-processing or post-processing logic before and/or after each function invocation.
+
+For example, imagine that you are power-tuning a function that deletes one record from a downstream database. Since you want to execute this function `num` times, you'd need to insert some records in advance and then find a way to delete all of them with a dynamic payload. Alternatively, you could simply configure a pre-processing function (using the `preProcessorARN` input parameter) that creates a brand new record before the actual function is executed.
+
+Here's the flow in pseudo-code:
+
+```
+function Executor:
+  iterate from 0 to num:
+    [payload = execute Pre-processor (payload)]
+    results = execute Main Function (payload)
+    [execute Post-processor (results)]
+```
+
+A few things to note:
+
+* You can configure a pre-processor and/or a post-processor independently
+* The pre-processor will receive the original payload
+* If the pre-processor returns a non-empty output, it will overwrite the original payload
+* The post-processor will receive the main function's output as payload
+* If a pre-processor or post-processor fails, the whole power-tuning state machine will fail
+* Pre/post-processors don't have to be in the same region as the main function
+* Pre/post-processors don't alter the statistics related to cost and performance
+
 ## State Machine Output
 
 The state machine will return the following output:

diff --git a/README.md b/README.md
@@ -107,7 +107,7 @@ cd the-lambda-power-tuner
 npm run deploy
 ```
 
-For Python deployment, see the instructions [here](https://github.com/cdk-patterns/serverless#2-download-pattern-in-python-or-typescript-cdk)
+For Python deployment, see the instructions [here](https://github.com/cdk-patterns/serverless#2-download-pattern-in-python-or-typescript-cdk).
 
 ## How to execute the state machine (programmatically)
 
@@ -146,13 +146,15 @@ The AWS Step Functions state machine accepts the following parameters:
 * **lambdaARN** (required, string): unique identifier of the Lambda function you want to optimize
 * **powerValues** (optional, string or list of integers): the list of power values to be tested; if not provided, the default values configured at deploy-time are used (by default: 128MB, 256MB, 512MB, 1024MB, 1536MB, and 3008MB); you can provide any power values between 128MB and 3,008MB in 64 MB increments; if you provide the string `"ALL"` instead of a list, all possible power configurations will be tested
 * **num** (required, integer): the # of invocations for each power configuration (minimum 5, recommended: between 10 and 100)
-* **payload** (string, object, or list): the static payload that will be used for every invocation (object or string); when using a list, a weighted payload is expected in the shape of `[{"payload": {...}, "weight": X }, {"payload": {...}, "weight": Y }, {"payload": {...}, "weight": Z }]`, where the weights `X`, `Y`, and `Z` are treated as relative weights (not percentages); more details below in the Weighted Payloads section
+* **payload** (string, object, or list): the static payload that will be used for every invocation (object or string); when using a list, a weighted payload is expected in the shape of `[{"payload": {...}, "weight": X }, {"payload": {...}, "weight": Y }, {"payload": {...}, "weight": Z }]`, where the weights `X`, `Y`, and `Z` are treated as relative weights (not percentages); more details below in the [Weighted Payloads section](#user-content-weighted-payloads)
 * **parallelInvocation** (false by default): if true, all the invocations will be executed in parallel (note: depending on the value of `num`, you may experience throttling when setting `parallelInvocation` to true)
 * **strategy** (string): it can be `"cost"` or `"speed"` or `"balanced"` (the default value is `"cost"`); if you use `"cost"` the state machine will suggest the cheapest option (disregarding its performance), while if you use `"speed"` the state machine will suggest the fastest option (disregarding its cost). When using `"balanced"` the state machine will choose a compromise between `"cost"` and `"speed"` according to the parameter `"balancedWeight"`
 * **balancedWeight** (number between 0.0 and 1.0, 0.5 by default): parameter that expresses the trade-off between cost and time; 0.0 is equivalent to the `"speed"` strategy, 1.0 is equivalent to the `"cost"` strategy
 * **autoOptimize** (false by default): if `true`, the state machine will apply the optimal configuration at the end of its execution
 * **autoOptimizeAlias** (string): if provided - and only if `autoOptimize` is `true`, the state machine will create or update this alias with the new optimal power value
 * **dryRun** (false by default): if true, the state machine will execute the input function only once and it will disable every functionality related to logs analysis, auto-tuning, and visualization; the dry-run mode is intended for testing purposes, for example to verify that IAM permissions are set up correctly
+* **preProcessorARN** (string): it must be the ARN of a Lambda function; if provided, the function will be invoked before every invocation of `lambdaARN`; more details below in the [Pre/Post-processing functions section](#user-content-prepost-processing-functions)
+* **postProcessorARN** (string): it must be the ARN of a Lambda function; if provided, the function will be invoked after every invocation of `lambdaARN`; more details below in the [Pre/Post-processing functions section](#user-content-prepost-processing-functions)
 
 
 Additionally, you can specify a list of power values at deploy-time in the `PowerValues` CloudFormation parameter. These power values will be used as the default in case no `powerValues` input parameter is provided.
@@ -205,6 +207,32 @@ To simplify these calculations, you could use weights that sum up to 100.
 Note: the number of weighted payloads must always be smaller than or equal to `num` (that is, `num >= count(payloads)`). For example, if you have 50 weighted payloads, you'll need to set at least `num: 50` so that each payload will be used at least once.
 
 
+### Pre/Post-processing functions
+
+Sometimes you need to power-tune Lambda functions that have side effects such as creating or deleting records in a database. In these cases, you may need to execute some pre-processing or post-processing logic before and/or after each function invocation.
+
+For example, imagine that you are power-tuning a function that deletes one record from a downstream database. Since you want to execute this function `num` times, you'd need to insert some records in advance and then find a way to delete all of them with a dynamic payload. Alternatively, you could simply configure a pre-processing function (using the `preProcessorARN` input parameter) that creates a brand new record before the actual function is executed.
+
+Here's the flow in pseudo-code:
+
+```
+function Executor:
+  iterate from 0 to num:
+    [payload = execute Pre-processor (payload)]
+    results = execute Main Function (payload)
+    [execute Post-processor (results)]
+```
+
+A few things to note:
+
+* You can configure a pre-processor and/or a post-processor independently
+* The pre-processor will receive the original payload
+* If the pre-processor returns a non-empty output, it will overwrite the original payload
+* The post-processor will receive the main function's output as payload
+* If a pre-processor or post-processor fails, the whole power-tuning state machine will fail
+* Pre/post-processors don't have to be in the same region as the main function
+* Pre/post-processors don't alter the statistics related to cost and performance
+
 ## State Machine Output
 
 The state machine will return the following output:
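
As a reading aid for the Pre/Post-processing functions section added in this commit, the executor flow from the pseudo-code can be sketched as a local Python simulation. This is only an illustration under stated assumptions, not the tool's actual implementation: plain Python callables stand in for the Lambda functions, an in-memory dictionary stands in for the downstream database, and every name here (`run_executor`, `create_record`, `delete_record`, `log_deletion`) is hypothetical. The sketch also assumes that "non-empty output" can be approximated by Python truthiness and that the pre-processor receives the original payload on every iteration.

```python
from typing import Any, Callable, List, Optional

Processor = Callable[[Any], Any]


def run_executor(
    num: int,
    payload: Any,
    main_function: Processor,
    pre_processor: Optional[Processor] = None,
    post_processor: Optional[Processor] = None,
) -> List[Any]:
    """Simulate the documented flow: pre-processor, main function, post-processor."""
    results = []
    for _ in range(num):
        current = payload
        if pre_processor is not None:
            # The pre-processor receives the original payload.
            output = pre_processor(payload)
            # Assumption: "non-empty" is approximated by Python truthiness.
            if output:
                current = output
        result = main_function(current)
        if post_processor is not None:
            # The post-processor receives the main function's output.
            post_processor(result)
        results.append(result)
    # Any exception raised by a processor propagates and aborts the whole run,
    # mirroring "the whole power-tuning state machine will fail".
    return results


# Hypothetical stand-ins for the "delete one record" example.
database = {}
next_id = {"value": 0}
deleted_ids = []


def create_record(payload: Any) -> dict:
    """Pre-processor: create a fresh record and return the payload that targets it."""
    record_id = next_id["value"]
    next_id["value"] += 1
    database[record_id] = {"created_for": payload}
    return {"id": record_id}


def delete_record(payload: dict) -> dict:
    """Main function: delete the record identified by the payload."""
    del database[payload["id"]]
    return {"deleted": payload["id"]}


def log_deletion(result: dict) -> None:
    """Post-processor: keep track of what the main function reported."""
    deleted_ids.append(result["deleted"])


results = run_executor(3, {"table": "demo"}, delete_record, create_record, log_deletion)
```

Because the pre-processor creates a new record before each invocation, the main function always has something to delete, and the simulated database is empty again once the run completes.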