AlexMeinke/serverless-hosting-of-image-captioning

Serverless ML inference on AWS

Take a deep learning model and host it on AWS, paying only for the inference time you actually use. The solution uses API Gateway to handle requests from the internet, which are passed to AWS Lambda, which in turn loads its code and deep learning model from a Docker image hosted on AWS Elastic Container Registry (ECR).

The model weights and the inference code are taken from this repository. I hope you find this repository useful for hosting your own models serverlessly. Also see the accompanying blog post.

[Architecture diagram: serverless hosting of a deep learning model]

I will assume that you have an AWS account and permissions to use ECR, AWS Lambda, and API Gateway.

Creating the Docker image

First, download the model weights and place them in the project folder. Then build the Docker image:

docker build -t YOUR-IMAGE-NAME .
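The repository ships its own Dockerfile, but if you are adapting this setup to a different model, a minimal Lambda container image for a Python handler might look like the following sketch (the file names `app.py`, `requirements.txt`, and `model_weights.pth`, and the handler name, are illustrative assumptions, not taken from this repo):

```dockerfile
# AWS-provided Lambda base image for Python
FROM public.ecr.aws/lambda/python:3.9

# Install inference dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the inference code and the downloaded model weights into the task root
COPY app.py ${LAMBDA_TASK_ROOT}/
COPY model_weights.pth ${LAMBDA_TASK_ROOT}/

# Tell the Lambda runtime which function handles requests
CMD ["app.handler"]
```

Baking the weights into the image is what makes the single-image deployment work: Lambda pulls everything it needs from ECR on cold start.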

Next, create a repository on ECR and note down your AWS region and your ECR prefix (the first part of your repository's URI). Assuming the right credentials are configured for your AWS CLI, run the following command to allow Docker to push to your ECR:
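If you prefer the CLI over the Management Console, the repository can be created like this (the repository name is a placeholder; a sketch, not the repo's documented workflow):

```
aws ecr create-repository \
    --repository-name YOUR-IMAGE-NAME \
    --region YOUR-AWS-REGION
```

The `repositoryUri` field in the command's output is exactly the `YOUR-ECR-PREFIX.dkr.ecr.YOUR-AWS-REGION.amazonaws.com/YOUR-IMAGE-NAME` value used in the tag and push commands below.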

aws ecr get-login-password --region YOUR-AWS-REGION | docker login --username AWS --password-stdin YOUR-ECR-PREFIX.dkr.ecr.YOUR-AWS-REGION.amazonaws.com

Then push the image to ECR:

docker tag YOUR-IMAGE-NAME:latest YOUR-ECR-PREFIX.dkr.ecr.YOUR-AWS-REGION.amazonaws.com/YOUR-IMAGE-NAME:latest
docker push YOUR-ECR-PREFIX.dkr.ecr.YOUR-AWS-REGION.amazonaws.com/YOUR-IMAGE-NAME:latest

If you hit a snag during any of these steps, also refer to the documentation.

Creating the Lambda Function

Using the Docker container from AWS Lambda is easy. During creation in the AWS Management Console, simply select Container Image as the source. After creating the function, don't forget to increase the timeout to at least 35 seconds, since during cold starts the model takes a long time to load. During warm starts, latency is on the order of 4-5 seconds for this particular model.
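For orientation, the entry point inside the container follows the usual Lambda proxy-integration contract. Here is a hedged sketch of what such a handler looks like (the `run_model` stub and all names are illustrative assumptions, not the repository's actual code):

```python
import base64
import json

def run_model(image_bytes):
    # Placeholder for the captioning model; the real code would load the
    # weights baked into the Docker image and run inference on the bytes.
    return f"an image of {len(image_bytes)} bytes"

def handler(event, context):
    # With Lambda proxy integration, API Gateway delivers binary request
    # bodies base64-encoded once image/* is registered as a binary media type.
    if event.get("isBase64Encoded"):
        image_bytes = base64.b64decode(event["body"])
    else:
        image_bytes = event["body"].encode("utf-8")

    caption = run_model(image_bytes)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"caption": caption}),
    }
```

Returning the `statusCode`/`headers`/`body` dictionary is required by the proxy integration; API Gateway translates it into the HTTP response the client sees.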

Creating the API

In order to host the model as an actual API accessible from the internet, you can use API Gateway. Simply create a method (such as a POST method) and select your Lambda function as the integration endpoint. Also check the box "Use Lambda Proxy integration" in your method's Integration Request menu. Finally, go to Settings, find "Binary Media Types", and add image/*. If you now deploy the API, you should be able to POST an image and receive a short textual description of its content.
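Once deployed, the endpoint can be exercised with any HTTP client; for example (the API ID, stage, and resource path are placeholders for whatever your deployment produces):

```
curl -X POST \
     -H "Content-Type: image/jpeg" \
     --data-binary "@example.jpg" \
     https://YOUR-API-ID.execute-api.YOUR-AWS-REGION.amazonaws.com/YOUR-STAGE/YOUR-RESOURCE
```

The `--data-binary` flag matters: plain `--data` would mangle the image bytes before API Gateway base64-encodes them for the Lambda function.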
