Fargate Spot Support #2162

Closed
SoManyHs opened this issue Apr 9, 2021 · 3 comments · Fixed by #2188
Labels
type/design Issues that are design proposals.

Comments

@SoManyHs
Contributor

SoManyHs commented Apr 9, 2021

Description

Since the launch of ECS Capacity Providers (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cluster-capacity-providers.html), Copilot customers have been asking for Fargate Spot support so that they can take advantage of the cost savings that spot provides. Capacity Providers manage the infrastructure that the tasks in your clusters use. With Fargate Spot, you can run interruption-tolerant ECS tasks at a discounted rate compared to the Fargate price.

This issue tracks the milestones for supporting Fargate Spot in the Copilot CLI.

User Stories

Current scope

  1. 100% spot. As a Copilot customer, I want to be able to launch my Fargate services on spot capacity only, e.g. in a test environment, in order to save on cost. I want to be able to do this regardless of whether I am using application autoscaling or a fixed count, and I am willing to risk failed placement on spot.
  2. Burst into spot. As a Copilot customer, I want to be able to specify a dedicated number of Fargate services, and then scale on only Spot after that. This ensures that I have guaranteed capacity for a minimum number of services, while being able to use a cheaper option for scaling beyond that.

Out of current scope

  1. As a Copilot customer, I want to be able to specify a dedicated number of Fargate services, and then scale on either Fargate or Spot after that (mixed scaling).
  2. As a Copilot customer, I want to be able to launch Fargate services on dedicated Fargate capacity, and have some number of services run on Spot capacity up to a certain explicit desiredCount.
  3. As a Copilot customer, I want to be able to first launch Fargate services on spot capacity, but if there is no spot capacity available, I want to fall back to on-demand Fargate capacity and maintain my desiredCount. (Related: containers-roadmap#773, "[ECS]: Capacity Strategy to Fall back to OD only When No More Spot Capacity Available", and containers-roadmap#852, "[service] [request]: Fargate Spot failover to Fargate")
  4. As a Copilot customer, I want to use both Fargate and Spot capacity with a fixed desiredCount, but not necessarily prioritize one or the other (i.e. specify some percentage on each type of capacity).

Milestone 1

Use case: 100% spot

Manifest schema:
Introduce spot as a subfield of count. Its value can be either an integer n, to specify a fixed desired count of services launched on spot capacity, or a boolean, to specify the intent to use 100% spot with application autoscaling.

Without autoscaling:

count:
  spot: n

With autoscaling:

count:
  range: n-m
  spot: true
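The two forms of `spot` proposed for Milestone 1 can be told apart by type. The following is a minimal Python sketch of that idea, not Copilot's implementation: `interpret_spot` and the keys of the returned dictionary are hypothetical names chosen for illustration, and the rule that `spot: true` requires `range` is an assumption drawn from the "With autoscaling" example above.

```python
def interpret_spot(count):
    """Classify a Milestone 1 `count` mapping (hypothetical helper)."""
    spot = count.get("spot")
    scaling_range = count.get("range")

    # Check bool before int: in Python, True and False are also ints.
    if isinstance(spot, bool):
        if not spot:
            raise ValueError("spot: false is outside the scope of this sketch")
        if scaling_range is None:
            # Assumption: the boolean form is only meaningful with autoscaling.
            raise ValueError("spot: true requires count.range")
        return {"mode": "autoscaling", "range": scaling_range, "all_spot": True}

    if isinstance(spot, int):
        if spot < 0:
            raise ValueError("spot must be a non-negative integer")
        return {"mode": "fixed", "desired_count": spot, "all_spot": True}

    raise ValueError("spot must be an integer or a boolean")


print(interpret_spot({"spot": 3}))
print(interpret_spot({"range": "1-10", "spot": True}))
```

Checking for `bool` first matters because a plain `isinstance(spot, int)` test would also accept `True` and treat it as a desired count of 1.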

Milestone 2

Use Case: Burst into spot

Several potential options for the manifest:

Option 1

Specify spot as a percentage. The main benefit here is that it would easily extend into other percentages in the case of scaling on both spot and Fargate capacity.

count:
  range: 5-10
  spot: 100%

Option 2

Specify dedicated as an integer p that represents the maximum number of tasks to run on dedicated Fargate capacity as your services scale. There can also be a spot flag to explicitly indicate that any further scaled services will be placed on spot capacity.

count:
  range: 3-10
  dedicated: 5
  spot: true

Option 3

Another possibility, outlined in this proof of concept using the CDK, defines the dedicated field as a range n-p that is a subset of the autoscaling range n-m. This option is similar to Option 2, but makes the range of the dedicated capacity explicit; however, it does mean that n has to match in both ranges.

count:
  range: 2-10
  dedicated: 2-5
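The constraint in Option 3, that `dedicated` must be a subset of `range` and share its minimum, can be checked mechanically. Below is a rough sketch under that reading; `parse_bounds` and `check_dedicated` are hypothetical helpers, not part of Copilot, and the error messages are invented for illustration.

```python
def parse_bounds(text):
    """Parse a range string such as "2-10" into a (low, high) tuple."""
    low, high = (int(part) for part in text.split("-"))
    if low > high:
        raise ValueError(f"invalid range {text!r}: minimum exceeds maximum")
    return low, high


def check_dedicated(range_text, dedicated_text):
    """Validate an Option 3 manifest and return (n, p, m)."""
    n, m = parse_bounds(range_text)
    dedicated_min, p = parse_bounds(dedicated_text)
    if dedicated_min != n:
        # Option 3 requires n to match on both ranges.
        raise ValueError("dedicated must start at the same minimum as range")
    if p > m:
        # dedicated n-p must be a subset of the autoscaling range n-m.
        raise ValueError("dedicated must not extend past the autoscaling maximum")
    return n, p, m


print(check_dedicated("2-10", "2-5"))
```

The duplicated minimum is what makes this option more error-prone than Option 2: the user has to keep two values in sync, and a validator like this one would have to reject manifests where they drift apart.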

Option 4

One of the original proposals for this use case is described here: #1586 (comment). Here, both dedicated and spot accept ranges, n-p and p-m. This syntax assumes that the autoscaling range is n-m, and p would have to be the inflection point specified in both ranges. It is also meant to imply that services would first be placed on dedicated instances before scaling into spot, and would make extensibility to other use cases trickier.

count:
  range:
    dedicated: 2-5
    spot: 5-10

Option 5

Have spot as a top-level field, not under count, with a percentage and a minimum:

spot:
  # Percentage of targets to be launched on spot once `minimum` is reached.
  # Required if using spot.
  percentage: 80%

  # Optional. Minimum desired count for services running on spot capacity, which
  # translates to the `base` value on the FARGATE_SPOT capacity provider strategy.
  # If 0 or undefined, the FARGATE capacity provider strategy
  # will have its `base` value set to the minimum in the `range` field unless
  # `percentage` is set to 100%.
  # Default is 0.
  minimum: 5

Part of #1154
Related: #1586

@SoManyHs SoManyHs self-assigned this Apr 9, 2021
@SoManyHs SoManyHs added this to Backlog in Sprint 🏃‍♀️ via automation Apr 9, 2021
@SoManyHs SoManyHs moved this from Backlog to In progress in Sprint 🏃‍♀️ Apr 9, 2021
@cristim

cristim commented Apr 13, 2021

@SoManyHs which milestone or configuration options are you implementing in your fargate-spot branch? I'll try to take it for a ride tomorrow.

Never mind, I found it in the diff, https://github.com/SoManyHs/copilot-cli/blob/cbaf6b694f6576de50c83d436a0463f4d81ed971/internal/pkg/manifest/svc_test.go#L59-L67

@SoManyHs
Contributor Author

Update: We are reworking the design for using spot with autoscaling. Right now the milestone in progress is supporting 100% spot without autoscaling:

count:
  spot: n

@efekarakus efekarakus added the type/design Issues that are design proposals. label Apr 15, 2021
@SoManyHs
Contributor Author

Update:

Manifest design:

There will be a new subfield spot under count, whose value is an integer. It is to be used without autoscaling.
There will be three new subfields under range: min, max and spot_from.

# Count can either be an integer p, representing desiredCount, or a map
count:
  # Optional. Value is an integer p, representing desiredCount.
  # NOTE: Mutually exclusive with range.
  spot: p
  # Optional. Can either be a range of form "n-m", where n is the minCapacity and 
  # m is the maxCapacity, or a map.
  # Default - none
  range:
    # Optional. Value is integer n that corresponds to
    # `minCapacity` on AutoScalingTarget.
    min: n
    # Optional. Value is integer m that corresponds to
    # `maxCapacity` on AutoScalingTarget.
    max: m
    # Optional. Value is integer q that corresponds to the desiredCount after which 
    # new copies of the service will be launched on spot.
    spot_from: q
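Because `range` may be written either as an "n-m" string or as a map, a consumer of the manifest needs to normalize both forms. The sketch below shows one way to do that in Python; `normalize_range` is a hypothetical helper, and returning `None` for missing optional fields is an assumption rather than documented Copilot behavior.

```python
def normalize_range(value):
    """Return (min, max, spot_from) for either form of `range`."""
    if isinstance(value, str):
        # String form "n-m": minCapacity n, maxCapacity m, no spot_from.
        low, high = (int(part) for part in value.split("-"))
        result = (low, high, None)
    elif isinstance(value, dict):
        # Map form: every subfield is optional, so missing keys become None.
        result = (value.get("min"), value.get("max"), value.get("spot_from"))
    else:
        raise TypeError("range must be an 'n-m' string or a map")

    low, high, _ = result
    if low is not None and high is not None and low > high:
        raise ValueError("min must not exceed max")
    return result


print(normalize_range("1-10"))
print(normalize_range({"min": 2, "max": 10, "spot_from": 5}))
```

Normalizing early means the rest of the pipeline only has to reason about one shape, whichever syntax the user chose.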

100% spot without autoscaling:

count:
  spot: n
  # no autoscaling fields specified

For this case, desiredCount on the service will be set to n, and a Capacity Provider Strategy using FARGATE_SPOT will be specified on the service with a weight of 1 and no base. If range is also specified, validation will fail with an error.

📤 Output parameters:

  • desiredCount: n
  • CapacityProviderStrategies:
    • FARGATE: nil
    • FARGATE_SPOT: { weight: 1 }
  • no autoscaling target
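The mapping from the manifest to these output parameters is small enough to sketch directly. The function below is an illustration only: `fixed_spot_outputs` and the key names in the returned dictionary are hypothetical and are not taken from Copilot's templates.

```python
def fixed_spot_outputs(count):
    """Map `count: {spot: n}` to the output parameters listed above."""
    if "range" in count:
        # The design calls for a validation error when both are present.
        raise ValueError("count.spot and count.range are mutually exclusive")
    return {
        "desiredCount": count["spot"],
        # FARGATE is omitted entirely; FARGATE_SPOT has weight 1 and no base.
        "capacityProviderStrategies": [
            {"capacityProvider": "FARGATE_SPOT", "weight": 1},
        ],
        "autoScalingTarget": None,
    }


print(fixed_spot_outputs({"spot": 4}))
```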

100% spot with autoscaling:

count:
  range:
    min: n
    max: m
    spot_from: n # NOTE: must be the same as min value

📤 Output parameters: (See CDK example)

  • desiredCount: nil
  • CapacityProviderStrategies:
    • FARGATE: nil
    • FARGATE_SPOT: { weight: 1 }
  • AutoscalingTarget: { minCapacity: n, maxCapacity: m }

Scale into spot:

count:
  range:
    min: n
    max: m
    spot_from: q

📤 Output parameters:

  • desiredCount: nil
  • CapacityProviderStrategies:
    • FARGATE: { base: q-1, weight: 0 }
    • FARGATE_SPOT: { weight: 1 }
  • AutoscalingTarget: { minCapacity: n, maxCapacity: m }
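The two autoscaling cases above differ only in whether `spot_from` equals `min`. The following Python sketch mirrors the output tables as written; `range_outputs` and `split_tasks` are hypothetical helpers, the dictionary keys are illustrative, and the upper bound check on `spot_from` is an assumption that the design does not state. `split_tasks` relies on the ECS behavior that a provider's `base` is filled first and that a provider with weight 0 receives no tasks beyond its base.

```python
def range_outputs(min_count, max_count, spot_from):
    """Map `range: {min, max, spot_from}` to the output parameters above."""
    # Assumption: spot_from must fall inside the autoscaling range.
    if not min_count <= spot_from <= max_count:
        raise ValueError("spot_from must lie between min and max")

    strategies = []
    if spot_from > min_count:
        # Scale into spot: FARGATE keeps a base of q-1 and takes no more tasks.
        strategies.append(
            {"capacityProvider": "FARGATE", "base": spot_from - 1, "weight": 0}
        )
    # In both cases, FARGATE_SPOT has weight 1 and no base.
    strategies.append({"capacityProvider": "FARGATE_SPOT", "weight": 1})

    return {
        "desiredCount": None,
        "capacityProviderStrategies": strategies,
        "autoScalingTarget": {"minCapacity": min_count, "maxCapacity": max_count},
    }


def split_tasks(total, base):
    """Tasks on (FARGATE, FARGATE_SPOT) when FARGATE has `base` and weight 0."""
    on_demand = min(total, base)
    return on_demand, total - on_demand


print(range_outputs(2, 10, 5))
print(split_tasks(7, base=4))
```

With `min: 2`, `max: 10` and `spot_from: 5`, the FARGATE base is 4, so a service scaled to 7 tasks would keep 4 on FARGATE and place the remaining 3 on FARGATE_SPOT.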

mergify bot pushed a commit that referenced this issue Apr 19, 2021
This adds a new field, `spot` to the manifest as a subfield of `count`.  

First step of #2162

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
SoManyHs added a commit to SoManyHs/copilot-cli that referenced this issue Apr 20, 2021
Range can now either be a range string or a map containing the subfields
"min", "max" and "spot_from". This allows customers to use Application
Autoscaling with spot capacity.

Part of aws#2162
SoManyHs added a commit to SoManyHs/copilot-cli that referenced this issue Apr 21, 2021
Range can now either be a range string or a map containing the subfields
"min", "max" and "spot_from". This allows customers to use Application
Autoscaling with spot capacity.

Part of aws#2162
@mergify mergify bot closed this as completed in #2188 Apr 21, 2021
Sprint 🏃‍♀️ automation moved this from In progress to Pending release Apr 21, 2021
mergify bot pushed a commit that referenced this issue Apr 21, 2021
Closes #2162

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
mergify bot pushed a commit that referenced this issue Apr 22, 2021
Final part of #2162

NOTE: the new `Capacity Provider` column is placed before the `Health Status` column due to weird spacing issues caused by colorizing the status field.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
thrau pushed a commit to localstack/copilot-cli-local that referenced this issue Dec 9, 2022
This adds a new field, `spot` to the manifest as a subfield of `count`.  

First step of aws#2162

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
thrau pushed a commit to localstack/copilot-cli-local that referenced this issue Dec 9, 2022
Closes aws#2162

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
thrau pushed a commit to localstack/copilot-cli-local that referenced this issue Dec 9, 2022
Final part of aws#2162

NOTE: the new `Capacity Provider` column is placed before the `Health Status` column due to weird spacing issues caused by colorizing the status field.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.