chore: add benchmark tool #1057

Draft · wants to merge 3 commits into `main`
21 changes: 21 additions & 0 deletions examples/standalone-data-generator/Dockerfile
@@ -0,0 +1,21 @@
# The --platform flag builds a linux/amd64 image (required when building on an Apple Silicon / M1 Mac)
FROM --platform=linux/amd64 python:3-alpine

# Set the working directory to /app
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Make the entrypoint script executable
RUN chmod +x /app/performance-entrypoint.sh

# List the copied files as a build-time sanity check
RUN ls -la /app

# ENTRYPOINT is the script that runs when the container starts;
# CMD supplies its default arguments
CMD ["python", "create_event_for_performance.py"]

ENTRYPOINT [ "/app/performance-entrypoint.sh" ]
42 changes: 41 additions & 1 deletion examples/standalone-data-generator/README.md
@@ -109,4 +109,44 @@ DEFAULT_PRODUCT_COUNT = 2

```

Please modify `enums.py` to define the detailed logic for device and user information.

## Clickstream sample data for performance test
We have developed a tool for performance tests, which utilizes ECS Fargate to continuously send test data to Clickstream. The number of ECS Service tasks can be increased or decreased to adjust the RPS (Requests per Second).

Follow the steps below to start your Clickstream performance test:
1. Add configuration file:

Download the `amplifyconfiguration.json` file from the solution's web console, then copy it to the root
folder (the same level as this README). You can then run the command below to create and send your sample
data.

The program parses only three parameters from `amplifyconfiguration.json`: `appId`, `endpoint`, and
`isCompressEvents`.
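As a sketch of that parsing (the nested key path below assumes the `awsClickstreamPlugin` layout of `amplifyconfiguration.json`; it is an assumption, so adjust it if your file nests these fields differently):

```python
import json

def load_clickstream_config(path="amplifyconfiguration.json"):
    """Extract the three fields the generator uses from amplifyconfiguration.json.

    The key path below assumes the awsClickstreamPlugin layout; adjust it
    if your file nests appId/endpoint/isCompressEvents differently.
    """
    with open(path) as f:
        conf = json.load(f)
    plugin = conf["analytics"]["plugins"]["awsClickstreamPlugin"]
    return plugin["appId"], plugin["endpoint"], plugin.get("isCompressEvents", True)
```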

2. Start the performance test:
Run the script below to start the performance test:
```
./create-ecs-performance-test.sh <testName> <taskNumber> <region>
```

`<testName>`: the suffix of the ECS cluster name. For example, if `<testName>` is **MyTest**, your ECS cluster name is **clickstream-sample-data-cluster-MyTest**.
`<taskNumber>`: the number of ECS service tasks; you can change it from the ECS console after the cluster is created.
`<region>`: the region where the ECS cluster is created.
For example:
```
./create-ecs-performance-test.sh MyTest 5 us-east-1
```
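All of the resources the scripts create follow one naming scheme derived from `<testName>`; a minimal sketch (mirroring the variables in `create-ecs-performance-test.sh`):

```shell
#!/bin/sh
# Resource names derived from the <testName> argument (here: MyTest),
# matching the naming used by the create/cleanup scripts.
TEST_NAME="MyTest"
CLUSTER_NAME="clickstream-sample-data-cluster-${TEST_NAME}"
SERVICE_NAME="clickstream-sample-data-service-${TEST_NAME}"
ECR_REPO_NAME="clickstream-sample-data-repo-${TEST_NAME}"
echo "${CLUSTER_NAME}"
```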

According to our testing, when the performance tool's ECS cluster is created in the same region as the Clickstream solution, each task can generate about 200 RPS.
A request payload is ~2 KB and contains 10 events.
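Using those figures (≈200 RPS per task, ~2 KB and 10 events per request), you can estimate the task count and throughput for a target load; a rough back-of-the-envelope helper, where the constants are the approximate measured values above, not guarantees:

```python
import math

RPS_PER_TASK = 200        # approximate per-task request rate (same-region test)
EVENTS_PER_REQUEST = 10   # events bundled into one request payload
PAYLOAD_KB = 2            # approximate payload size per request, in KB

def estimate(target_rps):
    """Estimate ECS task count and resulting throughput for a target RPS."""
    tasks = math.ceil(target_rps / RPS_PER_TASK)
    events_per_second = target_rps * EVENTS_PER_REQUEST
    bandwidth_kb_per_second = target_rps * PAYLOAD_KB
    return tasks, events_per_second, bandwidth_kb_per_second
```

For example, a 1000 RPS target needs about 5 tasks and produces roughly 10,000 events per second.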

3. Clean up performance test resources:
Run the script below to delete all resources created by the performance test tool:
```
./cleanup-ecs-performance-test.sh <testName> <region>
```




47 changes: 47 additions & 0 deletions examples/standalone-data-generator/cleanup-ecs-performance-test.sh
@@ -0,0 +1,47 @@
#!/bin/bash

# Variables
TEST_NAME=$1
AWS_REGION=$2
CLUSTER_NAME="clickstream-sample-data-cluster-${TEST_NAME}"
SERVICE_NAME="clickstream-sample-data-service-${TEST_NAME}"
TASK_DEFINITION_NAME="clickstream-sample-data-task-${TEST_NAME}"
ECR_REPO_NAME="clickstream-sample-data-repo-${TEST_NAME}"
LOG_GROUP_NAME="/ecs/clickstream-sample-data-log-${TEST_NAME}"
ROLE_NAME="clickstream-sample-data-ecs-task-role-${TEST_NAME}-${AWS_REGION}"

# De-register ECS Service
echo "Deleting ECS Service..."
aws ecs update-service --cluster "${CLUSTER_NAME}" --service "${SERVICE_NAME}" --desired-count 0 --region ${AWS_REGION}
aws ecs delete-service --cluster "${CLUSTER_NAME}" --service "${SERVICE_NAME}" --force --region ${AWS_REGION}

# Wait for the service deletion to settle before removing the remaining resources
sleep 30

# Deregister ECS Task Definition
echo "Deregistering Task Definitions..."
TASK_DEFS=$(aws ecs list-task-definitions --family-prefix "${TASK_DEFINITION_NAME}" --region ${AWS_REGION} --query "taskDefinitionArns[]" --output text)
for task_def in $TASK_DEFS; do
aws ecs deregister-task-definition --task-definition "${task_def}" --region ${AWS_REGION}
done

# Delete ECS Cluster
echo "Deleting ECS Cluster..."
aws ecs delete-cluster --cluster "${CLUSTER_NAME}" --region ${AWS_REGION}

# Delete ECR repository
echo "Deleting ECR Repository..."
aws ecr delete-repository --repository-name "${ECR_REPO_NAME}" --region ${AWS_REGION} --force

# Detach IAM policies and delete role
echo "Detaching IAM Policies and Deleting Role..."
aws iam detach-role-policy --role-name "${ROLE_NAME}" --policy-arn "arn:aws:iam::aws:policy/CloudWatchLogsFullAccess"
aws iam detach-role-policy --role-name "${ROLE_NAME}" --policy-arn "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
aws iam delete-role --role-name "${ROLE_NAME}"

# Delete CloudWatch Log Group
echo "Deleting CloudWatch Log Group..."
aws logs delete-log-group --log-group-name "${LOG_GROUP_NAME}" --region ${AWS_REGION}

rm -f amplifyconfiguration-${TEST_NAME}.json

echo "Cleanup completed."
17 changes: 17 additions & 0 deletions examples/standalone-data-generator/configure.py
@@ -37,6 +37,23 @@
BATCH_EVENT_DURATION_IN_MINUTES = 20
IS_LOG_FULL_REQUEST_MESSAGE = True

# for performance tool
ALL_USER_REALTIME_PERFORMANCE = 100000
RANDOM_DAU_PERFORMANCE = range(10000, 10001)
THREAD_NUMBER_FOR_USER_PERFORMANCE = 1
FLUSH_DURATION_PERFORMANCE = 3
BATCH_EVENT_DURATION_IN_MINUTES_PERFORMANCE = 2
NEED_SLEEP = True
# Approximate per-request sleep for a target RPS:
# ~100 RPS: PERFORMANCE_SLEEP_TIME = 0.002
# ~10 RPS:  PERFORMANCE_SLEEP_TIME = 0.08
PERFORMANCE_SLEEP_TIME = 0.003

# common settings
SESSION_TIMES = range(1, 4)
IS_GZIP = True
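The commented values in this file imply that `PERFORMANCE_SLEEP_TIME` is not simply `1/RPS` (at 100 RPS the sleep is 0.002 s, not 0.01 s), because each request also takes time to build and send. A rough sketch of that relationship, with a hypothetical per-request overhead you would tune against the RPS you actually observe:

```python
def sleep_time_for_rps(target_rps, per_request_overhead_s=0.008):
    """Rough per-request sleep for a target per-thread RPS.

    per_request_overhead_s is a hypothetical estimate of the time spent
    building and sending one request; 0.008 s reproduces the
    "100 RPS -> 0.002" comment above but should be calibrated empirically.
    """
    return max(0.0, 1.0 / target_rps - per_request_overhead_s)
```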
169 changes: 169 additions & 0 deletions examples/standalone-data-generator/create-ecs-performance-test.sh
@@ -0,0 +1,169 @@
#!/bin/bash

# Variables
TEST_NAME=$1
DOCKERFILE_PATH="./Dockerfile"
IMAGE_NAME="clickstream-sample-data"
TAG=$TEST_NAME
AWS_REGION=$3
ECR_REPO_NAME="clickstream-sample-data-repo-${TEST_NAME}"
NUMBER_OF_TASKS=$2
CLUSTER_NAME="clickstream-sample-data-cluster-${TEST_NAME}"
SERVICE_NAME="clickstream-sample-data-service-${TEST_NAME}"
TASK_DEFINITION_NAME="clickstream-sample-data-task-${TEST_NAME}"
SUBNET_IDS=""
SECURITY_GROUP_ID=""
LOG_GROUP_NAME="/ecs/clickstream-sample-data-log-${TEST_NAME}"
ROLE_NAME="clickstream-sample-data-ecs-task-role-${TEST_NAME}-${AWS_REGION}"

# backup amplifyconfiguration.json for later checking
cp -f amplifyconfiguration.json "amplifyconfiguration-${TEST_NAME}.json"

AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
ROLE="arn:aws:iam::${AWS_ACCOUNT_ID}:role/${ROLE_NAME}"

# Check if the AWS_ACCOUNT_ID variable is set
if [ -z "$AWS_ACCOUNT_ID" ]; then
echo "Failed to retrieve AWS account ID."
exit 1
else
echo "AWS Account ID: $AWS_ACCOUNT_ID"
fi

# Get the default VPC ID
DEFAULT_VPC_ID=$(aws ec2 describe-vpcs --filters "Name=is-default,Values=true" --region ${AWS_REGION} --query "Vpcs[0].VpcId" --output text)

if [ -z "$DEFAULT_VPC_ID" ]; then
echo "No default VPC found."
exit 1
fi

# Fetch Subnet IDs if SUBNET_IDS is empty
if [ -z "$SUBNET_IDS" ]; then
SUBNET_IDS=$(aws ec2 describe-subnets --filters "Name=vpc-id,Values=$DEFAULT_VPC_ID" --region ${AWS_REGION} --query "Subnets[*].SubnetId" --output text)
SUBNET_IDS=$(echo $SUBNET_IDS | sed 's/ /,/g') # Convert space-separated list to comma-separated
if [ -z "$SUBNET_IDS" ]; then
echo "No subnets found in the default VPC."
exit 1
fi
echo "SUBNET_IDS=\"$SUBNET_IDS\""
fi

# Fetch Security Group ID if SECURITY_GROUP_ID is empty
if [ -z "$SECURITY_GROUP_ID" ]; then
SECURITY_GROUP_ID=$(aws ec2 describe-security-groups --filters "Name=vpc-id,Values=$DEFAULT_VPC_ID" --region ${AWS_REGION} --query "SecurityGroups[?GroupName=='default'].GroupId" --output text)
if [ -z "$SECURITY_GROUP_ID" ]; then
echo "No default security group found in the default VPC."
exit 1
fi
echo "SECURITY_GROUP_ID=\"$SECURITY_GROUP_ID\""
fi

# Create a CloudWatch log group if it doesn't exist
LOG_GROUP_EXISTS=$(aws logs describe-log-groups --log-group-name-prefix "${LOG_GROUP_NAME}" --query 'logGroups[?logGroupName==`'${LOG_GROUP_NAME}'`]' --output text --region ${AWS_REGION})

if [ -z "$LOG_GROUP_EXISTS" ]; then
echo "Creating CloudWatch log group..."
aws logs create-log-group --log-group-name "${LOG_GROUP_NAME}" --region ${AWS_REGION}
fi

# Check if the role exists
ROLE_EXISTS=$(aws iam list-roles --query 'Roles[?RoleName==`'${ROLE_NAME}'`].RoleName' --output text)

if [ -z "$ROLE_EXISTS" ]; then
echo "Creating IAM role..."
# Create a trust policy file
echo '{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "ecs-tasks.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}' > TrustPolicy.json


# Create the role with the trust policy
aws iam create-role --role-name "${ROLE_NAME}" --assume-role-policy-document file://TrustPolicy.json

# Attach the permission
aws iam attach-role-policy --role-name "${ROLE_NAME}" --policy-arn "arn:aws:iam::aws:policy/CloudWatchLogsFullAccess"
aws iam attach-role-policy --role-name "${ROLE_NAME}" --policy-arn "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
fi

# Clean up the trust policy file
rm -f TrustPolicy.json

# Full image name with AWS account ID and region
FULL_IMAGE_NAME="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_REGION}.amazonaws.com/${ECR_REPO_NAME}:${TAG}"

# Get the login command from ECR and execute it directly
aws ecr get-login-password --region $AWS_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_REGION.amazonaws.com

# Build your Docker image locally
docker build -t $IMAGE_NAME -f $DOCKERFILE_PATH .

# Check if ECR repository exists
REPO_EXISTS=$(aws ecr describe-repositories --repository-names "${ECR_REPO_NAME}" --region ${AWS_REGION} 2>&1)

if [ $? -ne 0 ]; then
# If the repository does not exist in ECR, create it.
aws ecr create-repository --repository-name "${ECR_REPO_NAME}" --region ${AWS_REGION}
fi

# Tag the Docker image with the full image name
docker tag $IMAGE_NAME:latest $FULL_IMAGE_NAME

# Push Docker image to ECR
docker push $FULL_IMAGE_NAME

# Create a task definition file
cat > task-definition.json << EOF
{
"family": "${TASK_DEFINITION_NAME}",
"executionRoleArn": "${ROLE}",
"taskRoleArn": "${ROLE}",
"containerDefinitions": [
{
"name": "${IMAGE_NAME}",
"image": "${FULL_IMAGE_NAME}",
"essential": true,
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "${LOG_GROUP_NAME}",
"awslogs-region": "${AWS_REGION}",
"awslogs-stream-prefix": "ClickStream-Test"
}
}
}
],
"networkMode": "awsvpc",
"requiresCompatibilities": ["FARGATE"],
"cpu": "1024",
"memory": "2048"
}
EOF


# Register the task definition with ECS
aws ecs register-task-definition --region $AWS_REGION --cli-input-json file://task-definition.json

# Check if the ECS cluster exists
CLUSTER_EXISTS=$(aws ecs describe-clusters --clusters "${CLUSTER_NAME}" --region ${AWS_REGION} | jq -r .clusters[0].status)

if [ "$CLUSTER_EXISTS" != "ACTIVE" ]; then
# If the cluster does not exist, create it
aws ecs create-cluster --cluster-name "${CLUSTER_NAME}" --region ${AWS_REGION} --settings name=containerInsights,value=enabled
fi

aws ecs create-service --service-name "${SERVICE_NAME}" --desired-count ${NUMBER_OF_TASKS} --task-definition "${TASK_DEFINITION_NAME}" --cluster "${CLUSTER_NAME}" --region ${AWS_REGION} --launch-type "FARGATE" --network-configuration "awsvpcConfiguration={subnets=[$SUBNET_IDS],securityGroups=[$SECURITY_GROUP_ID],assignPublicIp=ENABLED}"

# Clean up
rm -f task-definition.json

