The main purpose of model-simulator is to serve as a placeholder model for testing other tools and workflows. It is a Docker image that can run locally and on AWS SageMaker, and it simulates varying response times, error status codes, artifact reads, log messages, and other model behaviors.
This repo started with the scikit_bring_your_own example, but predictor.py has been modified to handle requests as specified below.
Clone the repo:

```
git clone https://github.com/intuit/model-simulator.git
cd model-simulator
```

Ensure the serve file is executable:

```
chmod +x model/serve
```

Build the Docker image:

```
docker build -t model-simulator .
```

Run the container:

```
docker run -v $(pwd)/data:/opt/ml/model -p 8080:8080 --rm model-simulator:latest serve
```

- The data folder is mounted at /opt/ml/model to match where SageMaker will later mount the model.tar.gz file described below.
- The service is now running at http://localhost:8080/invocations

Send a request:

```
curl --data-binary @input_data.json -H "Content-Type: application/json" -v http://localhost:8080/invocations
```

where input_data.json is a file like the following sample request.
```
{
    "sleep_seconds" : 1.500,
    "status" : 200,
    "file_path" : "1k_characters.txt",
    "message" : "hello world"
}
```

- sleep_seconds: Number of seconds to sleep, to simulate response time. Optional. Default is 0.
- status: HTTP status code to return, to simulate various error codes like 400, 500, etc. Optional. Default is 200.
- file_path: Name of a file under /opt/ml/model to be read. See the files under the data directory. Use large files to simulate large response sizes. Optional. Default is empty string.
- message: Any string value. Use large messages to simulate large request sizes. Optional. Default is empty string.
- exception: Any string value. Raises an exception with the given string, instead of returning the otherwise expected response and status. Optional. Default is to skip raising an exception.
- empty: Boolean. If true, overrides any other settings here and returns an empty string as the response body. Optional. Default is false.
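The same request can also be sent from Python. Below is a sketch (not part of this repo) using the third-party requests library; the build_payload helper is a hypothetical convenience, and it assumes the container from the steps above is running on localhost:8080.

```python
import json


def build_payload(sleep_seconds=0, status=200, file_path="", message=""):
    """Build a model-simulator request body. All fields are optional,
    and the values here mirror the documented defaults."""
    return {
        "sleep_seconds": sleep_seconds,
        "status": status,
        "file_path": file_path,
        "message": message,
    }


if __name__ == "__main__":
    import requests  # third-party: pip install requests

    payload = build_payload(sleep_seconds=1.5,
                            file_path="1k_characters.txt",
                            message="hello world")
    resp = requests.post(
        "http://localhost:8080/invocations",
        data=json.dumps(payload),
        headers={"Content-Type": "application/json"},
    )
    print(resp.status_code)
    print(resp.json().get("echo"))
```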
The response looks like the following:

```
{
    "echo": {
        "file_path": "1k_characters.txt",
        "message": "hello world",
        "sleep_seconds": 1.5,
        "status": 200
    },
    "file_contents": "1000m ipsum dolor sit amet, consectetur adipiscing elit. Phasellus quis sapien sem. Pellentesque rutrum rhoncus lorem, pretium cursus massa aliquet a. Curabitur egestas neque nunc, nec vehicula quam congue id. Mauris feugiat pharetra diam, non sagittis ante tincidunt vitae. Duis enim odio, gravida a mattis et, condimentum sit amet nunc. In tempus quis felis quis scelerisque. Aenean malesuada diam lectus, congue lacinia lacus porttitor id. Pellentesque eu tempor nunc. Cras et semper enim. Praesent sed dolor a nulla molestie fermentum a a enim. Maecenas erat tellus, adipiscing eget massa eu, varius placerat nunc. Morbi eros nunc, consequat quis velit at, pulvinar vulputate risus. Quisque velit tortor, posuere sed molestie at, molestie in risus. Pellentesque pellentesque lobortis nibh, nec hendrerit dolor adipiscing eu.Nunc vel ligula imperdiet, feugiat risus in, viverra metus. Etiam tempor velit in quam facilisis hendrerit. Pellentesque volutpat sollicitudin tortor at consectetur nulEND.\n",
    "file_path": "1k_characters.txt",
    "message": "hello world",
    "sleep_seconds": 1.5,
    "status": 200,
    "version": "v1.3b"
}
```

- The response includes the parameters from the request, plus:
  - file_contents: the contents of the given file, if provided.
  - version: the model-simulator version string.
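To make the parameter semantics concrete, here is an illustrative, simplified sketch of the request handling described above. This is not the actual predictor.py: the function name is invented, and the real service also handles the exception and empty options as well as SageMaker's /ping health check.

```python
import json
import os
import time

MODEL_DIR = "/opt/ml/model"  # where SageMaker mounts model.tar.gz
VERSION = "v1.3b"


def simulate(request_dict, model_dir=MODEL_DIR):
    """Apply the simulator parameters and return (status, response_body)."""
    sleep_seconds = request_dict.get("sleep_seconds", 0)
    status = request_dict.get("status", 200)
    file_path = request_dict.get("file_path", "")
    message = request_dict.get("message", "")

    time.sleep(sleep_seconds)  # simulate model latency

    file_contents = ""
    if file_path:
        # simulate reading a model artifact from the mounted model directory
        with open(os.path.join(model_dir, file_path)) as f:
            file_contents = f.read()

    body = {
        "echo": request_dict,
        "file_contents": file_contents,
        "file_path": file_path,
        "message": message,
        "sleep_seconds": sleep_seconds,
        "status": status,
        "version": VERSION,
    }
    return status, json.dumps(body)
```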
Sample log output from the container:

```
Starting the inference server with 5 workers.
[2021-04-24 06:31:59 +0000] [9] [INFO] Starting gunicorn 20.1.0
[2021-04-24 06:31:59 +0000] [9] [INFO] Listening at: unix:/tmp/gunicorn.sock (9)
[2021-04-24 06:31:59 +0000] [9] [INFO] Using worker: sync
[2021-04-24 06:31:59 +0000] [13] [INFO] Booting worker with pid: 13
[2021-04-24 06:31:59 +0000] [14] [INFO] Booting worker with pid: 14
[2021-04-24 06:31:59 +0000] [15] [INFO] Booting worker with pid: 15
[2021-04-24 06:32:00 +0000] [17] [INFO] Booting worker with pid: 17
[2021-04-24 06:32:00 +0000] [18] [INFO] Booting worker with pid: 18
request.data: b'{\n  "sleep_seconds" : 1.500,\n  "status" : 200,\n  "file_path" : "1k_characters.txt",\n  "message" : "hello world"\n}\n'
request_dict: {'sleep_seconds': 1.5, 'status': 200, 'file_path': '1k_characters.txt', 'message': 'hello world'}
About to sleep 1.5 seconds
Return status 200 and response {"echo": {"file_path": "1k_characters.txt", "message": "hello world", "sleep_seconds": 1.5, "status": 200}, "file_contents": "1000m ipsum dolor sit amet, consectetur adipiscing elit. Phasellus quis sapien sem. Pellentesque rutrum rhoncus lorem, pretium cursus massa aliquet a. Curabitur egestas neque nunc, nec vehicula quam congue id. Mauris feugiat pharetra diam, non sagittis ante tincidunt vitae. Duis enim odio, gravida a mattis et, condimentum sit amet nunc. In tempus quis felis quis scelerisque. Aenean malesuada diam lectus, congue lacinia lacus porttitor id. Pellentesque eu tempor nunc. Cras et semper enim. Praesent sed dolor a nulla molestie fermentum a a enim. Maecenas erat tellus, adipiscing eget massa eu, varius placerat nunc. Morbi eros nunc, consequat quis velit at, pulvinar vulputate risus. Quisque velit tortor, posuere sed molestie at, molestie in risus. Pellentesque pellentesque lobortis nibh, nec hendrerit dolor adipiscing eu.Nunc vel ligula imperdiet, feugiat risus in, viverra metus. Etiam tempor velit in quam facilisis hendrerit. Pellentesque volutpat sollicitudin tortor at consectetur nulEND.\n", "file_path": "1k_characters.txt", "message": "hello world", "sleep_seconds": 1.5, "status": 200, "version": "v1.3b"}
172.17.0.1 - - [24/Apr/2021:06:32:08 +0000] "POST /invocations HTTP/1.1" 200 1247 "-" "curl/7.54.0"
```
Now that the build and local tests are passing, the next step is to deploy to Amazon SageMaker. For
the examples below, 111111111111 represents your AWS Account number.
For deployment to SageMaker, the contents of the data folder need to be packaged and pushed to an S3 location.

- Create the tar.gz file:

```
cd data
tar -zcvf model.tar.gz *
```

- Upload the model.tar.gz file to an S3 location. For example:
```
aws s3 cp model.tar.gz s3://your-bucket-name-here/model-simulator/1-0/model.tar.gz
```

- AWS Console > Elastic Container Registry > Repositories > Create Repository
  - Repository name: 111111111111.dkr.ecr.us-west-2.amazonaws.com/model-simulator
- Push to the repository:

```
aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 111111111111.dkr.ecr.us-west-2.amazonaws.com
docker tag model-simulator:latest 111111111111.dkr.ecr.us-west-2.amazonaws.com/model-simulator:latest
docker push 111111111111.dkr.ecr.us-west-2.amazonaws.com/model-simulator:latest
```

- AWS Console > Amazon SageMaker > Inference > Models
  - Model name: model-simulator
  - IAM role: create a new role
  - Location of inference code image: 111111111111.dkr.ecr.us-west-2.amazonaws.com/model-simulator:latest
  - Location of model artifacts: s3://your-bucket-name-here/model-simulator/1-0/model.tar.gz
- AWS Console > Amazon SageMaker > Inference > Endpoint configurations
  - Endpoint configuration name: LEARNING-model-simulator-1-0
  - Production variants: add the model-simulator model created above
  - Instance type: ml.t2.medium (for testing)
- AWS Console > Amazon SageMaker > Inference > Endpoints
  - Endpoint name: LEARNING-model-simulator-1
  - Endpoint configuration: the one created above
  - The endpoint will be in Creating status. Wait a few minutes until it reaches InService status.
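The console steps above can also be scripted. The following is a rough sketch using boto3's SageMaker client; the account number, bucket, and resource names mirror the examples above, and the IAM role name is a hypothetical placeholder you must replace with your own.

```python
ACCOUNT = "111111111111"
REGION = "us-west-2"
IMAGE = f"{ACCOUNT}.dkr.ecr.{REGION}.amazonaws.com/model-simulator:latest"
MODEL_DATA = "s3://your-bucket-name-here/model-simulator/1-0/model.tar.gz"
# Hypothetical role name; use a role with SageMaker permissions.
ROLE_ARN = f"arn:aws:iam::{ACCOUNT}:role/your-sagemaker-role"


def model_args():
    """Arguments for sagemaker.create_model."""
    return {
        "ModelName": "model-simulator",
        "PrimaryContainer": {"Image": IMAGE, "ModelDataUrl": MODEL_DATA},
        "ExecutionRoleArn": ROLE_ARN,
    }


def endpoint_config_args():
    """Arguments for sagemaker.create_endpoint_config."""
    return {
        "EndpointConfigName": "LEARNING-model-simulator-1-0",
        "ProductionVariants": [{
            "VariantName": "variant-name-1",
            "ModelName": "model-simulator",
            "InitialInstanceCount": 1,
            "InstanceType": "ml.t2.medium",  # for testing
        }],
    }


if __name__ == "__main__":
    import boto3  # third-party: pip install boto3

    sm = boto3.client("sagemaker", region_name=REGION)
    sm.create_model(**model_args())
    sm.create_endpoint_config(**endpoint_config_args())
    sm.create_endpoint(
        EndpointName="LEARNING-model-simulator-1",
        EndpointConfigName="LEARNING-model-simulator-1-0",
    )
```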
One option for sending a request to the SageMaker endpoint is the AWS CLI:

```
aws sagemaker-runtime invoke-endpoint --endpoint-name LEARNING-model-simulator-1 --body eyJkYXRhIjoiaGVsbG8ifQ== outfile.txt
```

where the value for --body is the base64 encoding of your request body. In this example, for the plain-text value:

```
{"data":"hello"}
```

use its base64 encoding:

```
eyJkYXRhIjoiaGVsbG8ifQ==
```
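The base64 value can be produced with the base64 CLI or, as sketched here, with Python's standard library:

```python
import base64

# Encode the request body for the CLI's --body argument.
body = b'{"data":"hello"}'
encoded = base64.b64encode(body).decode("ascii")
print(encoded)  # eyJkYXRhIjoiaGVsbG8ifQ==
```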
The output from the command looks like:

```
{
    "ContentType": "application/json",
    "InvokedProductionVariant": "variant-name-1"
}
```

and the content of outfile.txt looks like:

```
{
    "echo": {
        "data": "hello"
    },
    "file_contents": "",
    "file_path": "",
    "message": [],
    "sleep_seconds": 0,
    "status": 200,
    "version": "v1.3b"
}
```
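The same invocation can be sketched from Python with boto3's SageMaker runtime client; unlike the CLI, it accepts the raw request body, so no base64 encoding is needed. This assumes the endpoint created above has reached InService status.

```python
import json

# Raw JSON request body; boto3 sends it as-is.
payload = {"data": "hello"}
body = json.dumps(payload)

if __name__ == "__main__":
    import boto3  # third-party: pip install boto3

    runtime = boto3.client("sagemaker-runtime", region_name="us-west-2")
    resp = runtime.invoke_endpoint(
        EndpointName="LEARNING-model-simulator-1",
        ContentType="application/json",
        Body=body.encode("utf-8"),
    )
    print(resp["ContentType"])
    print(resp["Body"].read().decode("utf-8"))
```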
Another option is to make an HTTP request directly to the endpoint URL: https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/LEARNING-model-simulator-1/invocations, but this requires signing the request with AWS Signature Version 4 authentication headers.
For more details, see:
For load testing the endpoint using Gatling to make HTTP requests, see:
1. Pick a new version number v#.#.#.
2. Update the version number in predictor.py. TODO: Move this value to an external file.
3. Create a new GitHub Release with the same tag version.
4. The GitHub Action docker-build-push.yml will then build the Docker image and push it to the GitHub Packages Docker repository. TODO: In the future, this will also deploy to Docker Hub.
Feel free to open an issue or pull request!
Make sure to read our code of conduct.
This project is licensed under the terms of the Apache License 2.0.
- Use Your Own Inference Code with Hosting Services: This page describes how Amazon SageMaker interacts with your Docker container and how it should respond.
- Example Bring Your Own Container: Sample code used as a starting point for this repo.
