This sample application demonstrates processing JSON arrays using AWS Step Functions Distributed Map. The Step Functions workflow reads a JSON file from an Amazon S3 bucket and iterates over the array to process each element.
The following diagram shows the Step Functions workflow.
- The state machine reads the product-updates.json file from an input Amazon S3 bucket. The file contains a JSON array.
- The Distributed Map state in the state machine iterates over the JSON array and, for each item in the array, invokes an AWS Lambda function for data enrichment. The Lambda function adds product stock and price information to the product data.
- The state machine saves the updated product data in an Amazon DynamoDB table.
- Finally, the state machine uploads the execution metadata to an output S3 bucket.
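The data-enrichment step can be pictured with a minimal Lambda handler sketch. The field names (productId, price, stock) and the randomized lookup are assumptions for illustration, not the sample repository's actual implementation:

```python
import random

# Hypothetical sketch of the enrichment Lambda invoked by the Distributed Map
# state. The Map state passes one array element per invocation as the event.
def lambda_handler(event, context):
    product = dict(event)
    # Enrich the product with price and stock information. A real function
    # would look these up in an inventory system; here they are randomized.
    product["price"] = round(random.uniform(5.0, 500.0), 2)
    product["stock"] = random.randint(0, 1000)
    return product
```

Because each iteration receives a single array element, the handler stays free of any batching logic; the Distributed Map state handles fan-out and concurrency.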
- Create an AWS account if you do not already have one and log in.
- Have access to an AWS account through the AWS Management Console and the AWS Command Line Interface (AWS CLI). The AWS Identity and Access Management (IAM) user that you use must have permissions to make the necessary AWS service calls and manage the AWS resources mentioned in this sample. When granting permissions to the IAM user, follow the principle of least privilege.
- AWS CLI installed and configured
- Git installed
- AWS Serverless Application Model (AWS SAM) installed
- Python 3.13+ installed
Clone the GitHub repository into a new folder and navigate to the project root folder:
git clone https://github.com/aws-samples/sample-stepfunctions-json-array-processor.git
cd sample-stepfunctions-json-array-processor
Run the following commands to deploy the application:
sam deploy --guided
Enter the following details:
- Stack name: The CloudFormation stack name (for example, stepfunctions-json-array-processor)
- AWS Region: A supported AWS Region (for example, us-east-1)
- Keep the rest of the options at their default values.
The outputs from sam deploy are used in the subsequent steps.
Run the following command to generate sample test data and upload it to the input S3 bucket. Replace InputBucketName with the value from the sam deploy output.
python3 scripts/generate_sample_data.py <InputBucketName>
Run the following command to start an execution of the Step Functions state machine. Replace StateMachineArn with the value from the sam deploy output.
aws stepfunctions start-execution \
--state-machine-arn <StateMachineArn> \
--input '{}'
The Step Functions state machine parses the input JSON array file and invokes the Lambda function for each product in the array. The Lambda function updates the price and stock information and returns control to the state machine. The state machine stores the updated product data in the Amazon DynamoDB table and uploads the execution metadata to the ResultBucketName bucket.
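The input file uploaded earlier is a JSON array of product records. A minimal sketch of a generator in the spirit of scripts/generate_sample_data.py might look like the following; the record fields are assumptions, and the S3 upload that the real script performs is omitted in favor of a local write:

```python
import json

# Hypothetical sketch of a sample-data generator; the actual script in the
# repository may use different fields and uploads the file to S3.
def generate_products(count=10):
    return [
        {"productId": f"P-{i:04d}", "name": f"Product {i}", "category": "general"}
        for i in range(count)
    ]

if __name__ == "__main__":
    # Write the JSON array locally; the real script uploads product-updates.json
    # to the input S3 bucket passed on the command line.
    with open("product-updates.json", "w") as f:
        json.dump(generate_products(), f, indent=2)
```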
Run the following command to get the details of the execution. Replace the executionArn from the previous command.
aws stepfunctions describe-execution --execution-arn <executionArn>
It should show the status SUCCEEDED.
Run the following commands to validate the processed output in the ProductCatalogTableName DynamoDB table and the generated manifest file in the ResultBucket. Replace ProductCatalogTableName and ResultBucket with the values from the sam deploy output.
aws dynamodb scan --table-name <ProductCatalogTableName>
aws s3 ls s3://<ResultBucket>/results/ --recursive
Check that the DynamoDB table contains the updated product information.
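The results/ prefix holds the Distributed Map ResultWriter output, including a manifest.json. After downloading it, a small sketch like the following can summarize the run; the structure shown (ResultFiles keyed by SUCCEEDED/FAILED/PENDING) reflects the documented ResultWriter format, but verify the field names against your own manifest:

```python
# Summarize a Distributed Map ResultWriter manifest by counting result files
# per status. Treat the manifest layout as an assumption and check it against
# the manifest.json actually written to your result bucket.
def summarize_manifest(manifest: dict) -> dict:
    files = manifest.get("ResultFiles", {})
    return {status: len(entries) for status, entries in files.items()}

# Fabricated example manifest for illustration only.
sample = {
    "DestinationBucket": "example-result-bucket",
    "ResultFiles": {
        "SUCCEEDED": [{"Key": "results/SUCCEEDED_0.json", "Size": 1024}],
        "FAILED": [],
        "PENDING": [],
    },
}
print(summarize_manifest(sample))
```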
Run the following commands to delete the resources deployed in this sample application.
# Delete S3 bucket contents
aws s3 rm s3://<InputBucketName> --recursive
aws s3 rm s3://<ResultBucketName> --recursive
# Delete SAM stack
sam delete
This library is licensed under the MIT-0 License. See the LICENSE file.
