
AWS Fargate GPU Support: When is GPU support coming to fargate? #88

Open
mbnr85 opened this issue Jan 4, 2019 · 148 comments
Assignees
Labels
ECS Amazon Elastic Container Service EKS Amazon Elastic Kubernetes Service Fargate AWS Fargate Work in Progress

Comments

@mbnr85

mbnr85 commented Jan 4, 2019

Tell us about your request
What do you want us to build?

Which service(s) is this request for?
This could be Fargate, ECS, EKS, ECR

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.

Are you currently working around this issue?
How are you currently solving this problem?

Additional context
Anything else we should know?

Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)

@mbnr85 mbnr85 added the Proposed Community submitted issue label Jan 4, 2019
@pauncejones
Contributor

Hi there, can you give us more details about your use case? Instance type, CUDA version, and more info about what you're trying to do (workload, etc.)? Thanks.

@mbnr85
Author

mbnr85 commented Jan 11, 2019

We would like to run object detection on Fargate.

Setup:
CUDA version 9.0, 9.1 (both work)
Instance type p2.xlarge
Algorithm: Object detection
Input: Frame
Output: Metadata, preferably JSON with coordinates and confidence scores.
TPS: 10 frames/sec

Does Fargate have some concept of reserved instance discounts in EC2 or Sustained usage discounts?

@FernandoMiguel

Does Fargate have some concept of reserved instance discounts in EC2 or Sustained usage discounts?

No

@abby-fuller abby-fuller added the Fargate AWS Fargate label Jan 18, 2019
@mikaelhg

mikaelhg commented Feb 8, 2019

I have a similar use case. I'd like to run deep learning inference tasks on CUDA-capable GPUs on Fargate (edit: or Lambda), and pay per second of usage.

The specific use case is inference tasks which are run fairly seldom, but need to respond in seconds, rather than minutes. In other words, waiting a few minutes for an EC2 instance to boot up, just doesn't cut the mustard. But neither does the application need to be taking up a GPU 24/7 unproductively, just to run the inference job for a minute or two, twice a day.

Edit: By mid-2021, extremely easy quantization and optimization, along with better models, have removed my need for this use case, but I suppose the people giving the comment a thumbs-up might still have something going on in this direction.

@juve

juve commented Feb 13, 2019

I also have an inference use-case where we would like to be able to autoscale inference sqs workers in Fargate. We originally tried to use ECS, but found it too cumbersome to scale both the containers and the EC2 instances, so we are currently just using EC2 instances with an autoscaling group. We considered using Sagemaker, but that will require some engineering effort for us to adapt our architecture and models.

@aysark

aysark commented Feb 16, 2019

I'd be interested in this too and have similar usecases as above.

@gfodor

gfodor commented Feb 23, 2019

I have a use case for this too, where we want to spin up GPU resources to do live video streaming of a WebGL application but be able to relinquish those completely after the stream ends, with minimal start up time or over-metering. In our case, we would need the ability to run an X11 server with GPU hardware acceleration.

@prameshbajra

@mbnr85 I too am trying to do object detection on fargate.
Is this even possible (for now)? Have you found anything? What did you do in your case?

@ngander-amfam

When training data science models our workloads can take advantage of GPU compute. To start those workloads will run in ECS although eventually we’d likely migrate those to EKS. We’d like to be able to use Fargate to run GPU accelerated workloads but that is not currently supported. Does AWS have GPU compute on the Fargate roadmap, and if so, is there any timeline that can be shared?

@tomfranken

Also interested for machine learning...

@romanovzky

Interested for ML training and inference as well. The overhead to transfer to sagemaker is too high, we just train models on EC2 GPU boxes and then use CPU runtime for inference on Fargate instances. However, some models would benefit from GPU at inference time (namely those trained on CUDA specific implementations, which as of now we are not using for lack of inference infrastructure). The inference use case is sporadic, such that a full-time EC2 box is too pricey.

@prameshbajra

@romanovzky We're both in the same boat, I guess. I too am in a similar situation.

@ashirgao

I too am looking forward to this feature.

My use-case:

I need to run jobs that benefit from GPU acceleration (mostly model inference and some CPU-bound tasks, e.g. embedding clustering, DB insertions, etc.). Each job takes around 10-15 mins on a p2.xlarge. I receive 100-120 such jobs through the day (at most 8-10 jobs within a span of 30 seconds).

My requirement:

A serverless GPU container solution.

My current solution:

My GPU utilizing containers run as custom Sagemaker training jobs.

Advantages:

  • With my increased SageMaker limit on p2.xlarge instances, I can have 20 jobs running in parallel, with zero idle cost. So, sort of serverless GPU containers :)
  • Per-second billing.
  • My containers have minimal SageMaker-specific code and hence can easily run on EC2, ECS, or even my own desktop system.

Disadvantages:

  • Sagemaker actually spawns a new instance for my container. This results in longer wait times. (Usually 2x Fargate wait times.)
  • Need to add additional logic in my lambda function that triggers Fargate jobs and Sagemaker jobs separately.
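
The "custom SageMaker training job" workaround described above can be sketched with boto3's `create_training_job`; the job name, image URI, role ARN, and output path below are hypothetical placeholders, and the instance type and runtime limit follow the numbers in the comment:

```python
def build_training_job_params(job_name, image_uri, role_arn, s3_output):
    """Assemble a create_training_job request that runs a custom container
    on a GPU instance, billed per second with no idle cost."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,      # your own container image, not a built-in algorithm
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "ResourceConfig": {
            "InstanceType": "ml.p2.xlarge",  # GPU instance type from the comment above
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 15 * 60},  # jobs take 10-15 min
        "OutputDataConfig": {"S3OutputPath": s3_output},
    }


# With AWS credentials configured, this would submit the job:
# import boto3
# boto3.client("sagemaker").create_training_job(
#     **build_training_job_params("job-001", "<ecr-image>", "<role-arn>", "s3://bucket/out")
# )
```

Once the job finishes, the instance is released, which is what gives the "zero idle cost" property described above.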

@ctmckee

ctmckee commented May 9, 2019

Also....
Some machine learning models require GPU support for predictions (they will not predict on CPU).

For example, an InternalError that can occur when attempting to get RefineNet predictions on CPU:
InternalError: The CPU implementation of FusedBatchNorm only supports NHWC tensor format for now.

I too would like GPU support in Fargate.

@Zirkonium88

Zirkonium88 commented May 12, 2019

We would like to call several other containers from a Docker container (RStudio) for distributed deep/machine learning training using Fargate/AWS Batch. The results should be saved to S3 and written back to the RStudio container. Unfortunately, Fargate has no GPU support.

@richarms

I would also like to launch GPU containers from Fargate. I have two use-cases: 1. spawning powerful deep learning Jupyterhub development environments for our machine-learning group's researchers that will effortlessly disappear when the individual Jupyterhub kernel is killed. 2. Infrequent, quickly-scaled, deep (i.e. the use of GPU is justified) inference tasks.

A thought on 2.: I hadn't considered the suggestion above of an auto-scaling EC2 group (which presumably then uses something like a scripted docker-machine command to provision the instance and launch a kernel container) to run the GPU containers, but this seems like a nasty, expensive (in time and currency) hack for what should be a bit more elegant.

@ClaasBrueggemann

Any news on this?

@prameshbajra

prameshbajra commented Oct 4, 2019

@ClaasBrueggemann I don't think they will provide this anytime soon. AWS is heavily promoting SageMaker now, and in many/most cases that's the way to go. :)

@jl-DaDar

What about 3D model rendering? We don't need this for machine learning.

@goswamig
Member

goswamig commented Dec 3, 2019

+1 for this support.

@prameshbajra

prameshbajra commented Dec 4, 2019

What about 3D model rendering? We don't need this for machine learning.

In that case, getting a GPU instance like P2, G3, etc. might help.
I believe Amazon won't be providing GPUs in Fargate any time soon.

@srinivaspype

Any SLA for this? Currently the Fargate implementation provides a general-purpose CPU clock speed of 2.2-2.3 GHz for us and is not capable of running CPU/GPU-intensive applications.

@adiii717

Fargate does not support GPUs, but we can expect it in the near future.

In Closing
Fargate helped us solve a lot of problems related to real-time processing, including the reduction of operational overhead, for this dynamic environment. We expect it to continue to grow and mature as a service. Some features we would like to see in the near future include GPU support for our GPU-based AI Engines and the ability to cache container images that are larger for quicker “warm” launch times.
https://aws.amazon.com/blogs/architecture/building-real-time-ai-with-aws-fargate/

@depthwise

FWIW, it'd be great to run a typical deep learning experiment queue on something like this. Upload code+configs to S3. Lambda picks up, stuffs it into a container, training runs to completion and saves back to S3. Super simple, very scalable.
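
The queue-style workflow described above can be approximated today by having the Lambda start an ECS task on EC2 (GPU) capacity rather than Fargate; a minimal sketch with boto3, where the cluster and task-definition names are purely hypothetical:

```python
def build_run_task_request(cluster, task_definition):
    """Request body for ecs.run_task; launchType must be EC2, since
    Fargate rejects task definitions with GPU resource requirements."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "EC2",
        "count": 1,
    }


def handler(event, context):
    """Hypothetical Lambda entry point: an S3 upload event triggers one training task."""
    req = build_run_task_request("gpu-cluster", "training-task:1")
    # With AWS credentials configured, this would start the task:
    # import boto3
    # boto3.client("ecs").run_task(**req)
    return req
```

The catch, as noted elsewhere in the thread, is that the EC2 GPU instances behind the cluster must already be running (or scale up on demand), so it is not truly serverless.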

@prameshbajra

FWIW, it'd be great to run a typical deep learning experiment queue on something like this. Upload code+configs to S3. Lambda picks up, stuffs it into a container, training runs to completion and saves back to S3. Super simple, very scalable.

Sounds much more like something that SageMaker would do.

@mrichman

What is the status of this? I'm very interested in CUDA support in Fargate tasks.

@mikestef9 mikestef9 added the EKS Amazon Elastic Kubernetes Service label Apr 9, 2020
@simjak

simjak commented Dec 6, 2023

The current best practice is to launch GPU instances by executing ECS from lambda, right?

Can you give a reference for this?

@raheem-imcapsule

raheem-imcapsule commented Jan 9, 2024

Any update on this?
Issue 5th anniversary completed successfully 🥳 🎈

@nonedone

Adding my nudge here on this historic ticket.

@backnight

+1

@dscain

dscain commented Jan 29, 2024

For simple generative AI workloads, Fargate+GPU would be really valuable, even more so now with so many companies working on gen AI features.

@oscarnevarezleal

The reality is there's not enough GPU capacity available for on-demand scenarios. What's more, the available supply goes directly to the big players.

I just opened some AWS accounts for new clients, and they start with a zero EC2 GPU quota. You have to raise it by submitting a quota increase request, and even after that they don't give you all the capacity you ask for; instead they tell you to wait and see if you really need more.

Looking at the trends over the years, GPUs will need to become either absurdly cheap or widely available (often the same thing) before we can go on-demand, and later make it IaC-available.

@weijuans weijuans added the ECS Amazon Elastic Container Service label Feb 5, 2024
@rejochandran

rejochandran commented Mar 29, 2024

1911 days and counting...
would love to have this; could be a game changer 🔥

@pepitoenpeligro

Hi team, this feature could be very interesting for certain types of inference, especially considering the weight of ML and AI in general on the overall AWS roadmap. Thank you so much, team :)

@kunal14053

Do we have any updates on this?

@genifycom

Sadly Amazon are letting us down on the AI/ML front by not giving us the flexibility we need to advance. As a result we are falling behind where we should be at this point.

The model seems to be that cloud providers know best and will force us down their path.

We need GPU access in multiple scenarios.

@kmulka-bloomberg

Custom Model Import was announced for Bedrock the other day. Not sure of all your use cases, but might be an option for a managed AI model hosting with pay-per-token pricing. https://aws.amazon.com/about-aws/whats-new/2024/04/custom-model-import-amazon-bedrock/

@FelixRelli

Custom Model Import was announced for Bedrock the other day. Not sure of all your use cases, but might be an option for a managed AI model hosting with pay-per-token pricing. https://aws.amazon.com/about-aws/whats-new/2024/04/custom-model-import-amazon-bedrock/

"Custom models can only be accessed using Provisioned Throughput." => So no pay-per token.
https://aws.amazon.com/bedrock/pricing/

@kmulka-bloomberg

"Custom models can only be accessed using Provisioned Throughput." => So no pay-per token. https://aws.amazon.com/bedrock/pricing/

I'm working on getting clarification from AWS, but this blog post says that custom model import uses the On-Demand mode.
https://aws.amazon.com/blogs/aws/import-custom-models-in-amazon-bedrock-preview/

@tamreddy-dot

.

@johnwheeler

What is gpu_count for on FargateTaskDefinition?

@ssignal

ssignal commented Aug 8, 2024

@johnwheeler
I think the error message and the below link will help you.
"Resource handler returned message: "Invalid request provided: Create TaskDefinition: Tasks using the Fargate launch type do not support GPU resource requirements."

https://nocd.hashnode.dev/registering-gpu-instance-w-aws-elastic-container-service-ecs
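
For context on the error above: GPU reservations surface in a registered ECS task definition as resourceRequirements, which ECS only accepts together with the EC2 launch type. A minimal sketch (family and image names are hypothetical):

```python
def build_gpu_task_definition(family, image):
    """Task definition requesting one GPU; valid only for the EC2 launch type,
    since Fargate rejects GPU resource requirements (per the error above)."""
    return {
        "family": family,
        "requiresCompatibilities": ["EC2"],  # ["FARGATE"] would be rejected here
        "containerDefinitions": [
            {
                "name": "main",
                "image": image,
                "memory": 4096,
                "resourceRequirements": [{"type": "GPU", "value": "1"}],
            }
        ],
    }


# With AWS credentials configured, this would register the definition:
# import boto3
# boto3.client("ecs").register_task_definition(
#     **build_gpu_task_definition("gpu-inference", "<image>")
# )
```

The task must then be placed on a container instance from a GPU-enabled AMI, as the linked article describes.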

@github-project-automation github-project-automation bot moved this to Researching in containers-roadmap Oct 23, 2024
@vibhav-ag vibhav-ag moved this from Researching to We're Working On It in containers-roadmap Oct 23, 2024
@vibhav-ag vibhav-ag added Work in Progress and removed Proposed Community submitted issue labels Oct 23, 2024
@kendrexs

kendrexs commented Oct 23, 2024

This is WIP for ECS (edited)

  • To clarify, this feature launch is not yet named and is tied to the ECS enhanced Capacity Management effort (see @AbhishekNautiyal's post). It is not necessarily labelled Fargate.

@kendrexs

kendrexs commented Nov 19, 2024

updated comment to clarify the feature for GPUs and other advanced capacity
