AWS Fargate GPU Support: When is GPU support coming to Fargate? #88

mbnr85 opened this Issue Jan 4, 2019 · 13 comments



mbnr85 commented Jan 4, 2019

Tell us about your request
What do you want us to build?

Which service(s) is this request for?
This could be Fargate, ECS, EKS, ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.

Are you currently working around this issue?
How are you currently solving this problem?

Additional context
Anything else we should know?

If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

@mbnr85 mbnr85 added the Proposed label Jan 4, 2019



pauncejones commented Jan 10, 2019

Hi there, can you give us more details about your use case? Instance type, CUDA version, and more info about what you're trying to do (workload, etc.)? Thanks.



mbnr85 commented Jan 11, 2019

We would like to run object detection on Fargate.

CUDA version 9.0, 9.1 (both work)
Instance type p2.xlarge
Algorithm: Object detection
Input: Frame
Output: Metadata, preferably JSON, with coordinates and confidence.
TPS: 10 frames/sec
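
As a rough illustration of the output described above, here is a minimal sketch of a per-frame metadata record. The thread only specifies "JSON with coordinates and confidence", so every field name here (`label`, `x`, `y`, `width`, `height`, `frame`) is an assumption, not a format anyone in the thread proposed. The frame budget follows directly from the 10 frames/sec target.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


# Hypothetical schema: the field names are illustrative assumptions.
@dataclass
class Detection:
    label: str         # detected object class
    confidence: float  # detector score in [0, 1]
    x: int             # top-left corner of the bounding box, in pixels
    y: int
    width: int         # bounding box size, in pixels
    height: int


def frame_metadata(frame_index: int, detections: List[Detection]) -> str:
    """Serialize the detections for one frame to a JSON string."""
    return json.dumps({
        "frame": frame_index,
        "detections": [asdict(d) for d in detections],
    })


# At 10 frames/sec, each frame must be processed within 1000 ms / 10.
TARGET_FPS = 10
FRAME_BUDGET_MS = 1000 / TARGET_FPS
```

With these assumptions, sustaining the target rate on a single GPU means detection plus serialization has to finish within about 100 ms per frame.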

Does Fargate have some concept of reserved instance discounts in EC2 or Sustained usage discounts?



FernandoMiguel commented Jan 11, 2019

Does Fargate have some concept of reserved instance discounts in EC2 or Sustained usage discounts?


@abby-fuller abby-fuller added the Fargate label Jan 18, 2019



mikaelhg commented Feb 8, 2019

I have a similar use case. I'd like to run deep learning inference tasks on CUDA-capable GPUs on Fargate, and pay per second of usage.

The specific use case is inference tasks that run fairly seldom but need to respond in seconds rather than minutes. In other words, waiting a few minutes for an EC2 instance to boot up just doesn't cut the mustard. But neither should the application tie up a GPU 24/7 unproductively, just to run an inference job for a minute or two, twice a day.
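
To put a number on how idle a dedicated GPU would be in this scenario, here is a small worked calculation. It uses only the figures from the comment (a minute or two per run, twice a day); the function name is made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def gpu_utilization(runs_per_day: int, minutes_per_run: float) -> float:
    """Fraction of the day a dedicated GPU would actually be busy."""
    return runs_per_day * minutes_per_run / MINUTES_PER_DAY


low = gpu_utilization(runs_per_day=2, minutes_per_run=1)
high = gpu_utilization(runs_per_day=2, minutes_per_run=2)
print(f"{low:.2%} to {high:.2%}")
```

This prints `0.14% to 0.28%`: an always-on instance would sit idle for more than 99.7% of the day, which is the motivation for per-second billing.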



juve commented Feb 13, 2019

I also have an inference use case where we would like to be able to autoscale SQS inference workers in Fargate. We originally tried to use ECS, but found it too cumbersome to scale both the containers and the EC2 instances, so we are currently just using EC2 instances with an autoscaling group. We considered using SageMaker, but that would require some engineering effort to adapt our architecture and models.
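
For readers unfamiliar with queue-driven scaling, the sketch below shows one common way to size a worker pool from queue depth. It is not the commenter's setup: the backlog-per-worker rule, the function name, and the default limits are all assumptions. In practice the backlog could come from the SQS `ApproximateNumberOfMessages` queue attribute, and the result would be applied as the desired task count of the service.

```python
import math


def desired_workers(backlog: int, msgs_per_worker: int,
                    min_workers: int = 0, max_workers: int = 10) -> int:
    """Return how many workers to run for a given queue backlog.

    backlog:         messages currently waiting in the queue
    msgs_per_worker: backlog one worker is expected to absorb (assumed tuning knob)
    """
    if msgs_per_worker <= 0:
        raise ValueError("msgs_per_worker must be positive")
    wanted = math.ceil(backlog / msgs_per_worker)
    # Clamp to the configured floor and ceiling.
    return max(min_workers, min(max_workers, wanted))
```

With Fargate only the task count has to change. With ECS on EC2, the instance fleet also has to grow to fit the new tasks, which is the double scaling problem described above.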



aysark commented Feb 16, 2019

I'd be interested in this too, and have use cases similar to those above.



gfodor commented Feb 23, 2019

I have a use case for this too, where we want to spin up GPU resources to do live video streaming of a WebGL application but be able to relinquish those completely after the stream ends, with minimal start up time or over-metering. In our case, we would need the ability to run an X11 server with GPU hardware acceleration.



prameshbajra commented Mar 21, 2019

@mbnr85 I too am trying to do object detection on Fargate.
Is this even possible (for now)? Have you found anything? What did you do in your case?



ngander-amfam commented Apr 8, 2019

When training data science models, our workloads can take advantage of GPU compute. To start, those workloads will run in ECS, although eventually we'd likely migrate them to EKS. We'd like to be able to use Fargate to run GPU-accelerated workloads, but that is not currently supported. Does AWS have GPU compute on the Fargate roadmap, and if so, is there any timeline that can be shared?



tomfranken commented Apr 12, 2019

Also interested for machine learning...



romanovzky commented Apr 18, 2019

Interested for ML training and inference as well. The overhead of moving to SageMaker is too high, so we just train models on EC2 GPU boxes and then use a CPU runtime for inference on Fargate. However, some models would benefit from a GPU at inference time (namely those trained on CUDA-specific implementations, which we are not using for now because we lack the inference infrastructure). The inference use case is sporadic, so a full-time EC2 box is too pricey.



prameshbajra commented Apr 19, 2019

@romanovzky We're both in the same boat, I guess. I too am in a similar situation.



ashirgao commented Apr 19, 2019

I too am looking forward to this feature.

My use-case:

I need to run jobs that benefit from GPU acceleration (mostly model inference, plus some CPU-bound tasks, e.g. embedding clustering and DB insertions). Each job takes around 10-15 minutes on a p2.xlarge. I receive 100-120 such jobs over the course of a day (at most 8-10 jobs within a 30-second span).
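
As a back-of-the-envelope check on these numbers, the sketch below converts the stated job counts and durations into daily GPU time. Only the figures from the paragraph above are used, and the helper name is made up for illustration.

```python
def gpu_hours_per_day(jobs: int, minutes_per_job: float) -> float:
    """Total GPU time consumed per day, in hours."""
    return jobs * minutes_per_job / 60


low = gpu_hours_per_day(jobs=100, minutes_per_job=10)   # lightest stated day
high = gpu_hours_per_day(jobs=120, minutes_per_job=15)  # heaviest stated day

# Average number of GPUs busy if the load were spread evenly over 24 hours.
avg_low = low / 24
avg_high = high / 24
```

That works out to roughly 17 to 30 GPU-hours per day, or about 0.7 to 1.25 GPUs busy on average, while a burst of 8-10 jobs needs 8-10 GPUs at once. The gap between average and peak demand is what makes a fixed-size GPU fleet a poor fit here.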

My requirement:

A server-less GPU container solution.

My current solution:

My GPU-utilizing containers run as custom SageMaker training jobs.


Pros:

  • With my increased SageMaker limit on p2.xlarge instances, I can have 20 jobs running in parallel, with zero idle cost. So, sort of server-less GPU containers :)
  • Per-second billing.
  • My containers have minimal SageMaker-specific code and hence can easily be run on EC2, ECS or even my own desktop system.

Cons:

  • SageMaker actually spawns a new instance for my container. This results in longer wait times. (Usually 2x Fargate wait times.)
  • I need additional logic in my Lambda function to trigger Fargate jobs and SageMaker jobs separately.
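
The extra trigger logic mentioned in the last point could look roughly like the sketch below. This is a guess at the shape of such a dispatcher, not the commenter's code: the `Job` type, the `needs_gpu` flag, and the launcher callables are all hypothetical. In a real Lambda function the launchers would presumably wrap the boto3 calls that start a SageMaker training job and an ECS/Fargate task.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical job description; the fields are illustrative assumptions.
@dataclass
class Job:
    job_id: str
    needs_gpu: bool


def choose_backend(job: Job) -> str:
    """GPU jobs go to SageMaker, everything else to Fargate."""
    return "sagemaker" if job.needs_gpu else "fargate"


def dispatch(job: Job, launchers: Dict[str, Callable[[Job], str]]) -> str:
    """Start the job on the chosen backend and return the launcher's result."""
    backend = choose_backend(job)
    return launchers[backend](job)


# Stub launchers standing in for the real AWS calls.
stub_launchers = {
    "sagemaker": lambda job: f"sagemaker:{job.job_id}",
    "fargate": lambda job: f"fargate:{job.job_id}",
}
```

If Fargate supported GPUs, the branch in `choose_backend` would collapse to a single code path, which is the simplification this comment is asking for.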