Make the algorithm configurable #5

Closed · cristim opened this issue Aug 19, 2016 · 8 comments

@cristim (Member) commented Aug 19, 2016

This could be achieved using some additional metadata specified in the tag set on the AutoScaling group.

@cristim (Member Author) commented Sep 30, 2016

I've been thinking about changing the tag set on the group and allowing people to configure AutoSpotting based on the data contained in that tag.

This is what my current idea looks like:

  • I'd like to rename the tag to autospotting, to make it clearer that the format has changed. For a while both the current tag name and the new one would be supported, but at some point in the future the current tag would be deprecated.
  • the new tag would contain data in a key=value format, with multiple items separated by a special character, such as &
  • there would be a predefined set of keys, properly documented and with stable names, such as those listed below:
    • enabled=true|false
    • keep_on_demand_capacity=true|false - this would require one of the following two options to be present
    • on_demand_percentage=20 - there may also be a documented 'sane' default value
    • on_demand_number=2 - this would conflict with on_demand_percentage; only one of them would be allowed
    • some way of specifying the weighting of various instance types; this can be decided later when implementing Investigate weighting of instance types #12

Wrapping up:

Key: autospotting
Value: enabled=true&keep_on_demand_capacity=true&on_demand_number=2
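
Since the proposed format is just &-separated key=value pairs, parsing it would be trivial in Go (the project's language). A minimal sketch, with parseAutospottingTag as a hypothetical helper name, exploiting the fact that this format happens to match URL query syntax:

```go
package main

import (
	"fmt"
	"net/url"
)

// parseAutospottingTag parses the proposed tag value format: "key=value"
// pairs joined by "&". Since that matches URL query syntax, net/url can
// do the parsing for us.
func parseAutospottingTag(value string) (map[string]string, error) {
	parsed, err := url.ParseQuery(value)
	if err != nil {
		return nil, err
	}
	config := make(map[string]string, len(parsed))
	for key, values := range parsed {
		config[key] = values[0] // keep the first occurrence of each key
	}
	return config, nil
}

func main() {
	config, _ := parseAutospottingTag(
		"enabled=true&keep_on_demand_capacity=true&on_demand_number=2")
	fmt.Println(config["on_demand_number"]) // prints "2"
}
```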

The benefit is that we don't need any external database (we could create a DynamoDB table if we ever need more, but I doubt it), and we can just use the tags, which we already need anyway.

The only problem with this approach is the limited size of the tag value, which is 255 characters, so you can't specify too many things there, but hopefully we'll never need to configure that much. The example above is 60 characters long, so this approach should allow for about 10-12 options.

One way to mitigate this limit may be to also provide short versions of the parameters, extracted by taking the first letter of each underscore-separated word and uppercasing it, which should compress pretty well:
keep_on_demand_capacity -> KODC

Value: E=true&KODC=true&ODN=2

The length in this case is 22 characters, which should allow for about 30 options in total.

Taking the idea further, we could also compress true/false in the same way to T/F, saving a few more bytes and resulting in a length of 16 characters for that example, which would allow around 50 options. We may never reach that many overall, and it is really unlikely we'd need that many in the same configuration:

Value: E=T&KODC=T&ODN=2
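
For completeness, a sketch of how the short names could be derived mechanically from the long ones, so the two spellings never drift apart (shortKey is a hypothetical helper, not existing code):

```go
package main

import (
	"fmt"
	"strings"
)

// shortKey derives the abbreviated form of a long option name by taking
// the first letter of each underscore-separated word and uppercasing it.
func shortKey(longKey string) string {
	short := ""
	for _, word := range strings.Split(longKey, "_") {
		if word != "" {
			short += strings.ToUpper(word[:1])
		}
	}
	return short
}

func main() {
	fmt.Println(shortKey("keep_on_demand_capacity")) // prints "KODC"
}
```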

I can see how this may become quite cryptic for a human, so at some point we may need to create a more user-friendly tool that sets these options on the tags of all the AutoScaling groups, and perhaps also deploys the software and performs updates, but that will most likely come later, if at all.

@cristim (Member Author) commented Nov 18, 2016

A few other ideas that came up recently, thanks to my colleague Emil Filipov:

  • Make the main function configurable by using CloudFormation parameters, passed to the Lambda function at runtime via the CloudWatch event data.
  • By default, if any parameters are missing, only apply to the AutoScaling groups shipped in the same CloudFormation stack as the function, so that the function could be embedded into multiple stacks and would only act on the groups of each of them.
  • Now that AWS increased the limit on the number of tags that can be applied to AutoScaling groups, we can use more group tags for configuration.
  • We could support configuring a certain on-demand capacity, given as a percentage or an absolute number (see the sketch after these examples):
Key: autospotting-on-demand-capacity-percentage
Value: 20

or

Key: autospotting-on-demand-capacity-instances
Value: 3
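
To illustrate how these two tags could be interpreted, here is a minimal sketch that treats the group's tags as a plain map and ignores the AWS SDK plumbing; giving the absolute-count tag precedence over the percentage one is an assumption, not something decided here:

```go
package main

import (
	"fmt"
	"strconv"
)

// onDemandCapacity extracts the desired on-demand capacity from a
// group's tags. The tag names follow the examples above; letting the
// absolute-count tag win over the percentage one is an assumption.
func onDemandCapacity(tags map[string]string, totalCapacity int64) (int64, error) {
	if v, ok := tags["autospotting-on-demand-capacity-instances"]; ok {
		return strconv.ParseInt(v, 10, 64)
	}
	if v, ok := tags["autospotting-on-demand-capacity-percentage"]; ok {
		percentage, err := strconv.ParseInt(v, 10, 64)
		if err != nil {
			return 0, err
		}
		return totalCapacity * percentage / 100, nil
	}
	return 0, nil // default: keep no on-demand capacity
}

func main() {
	tags := map[string]string{"autospotting-on-demand-capacity-percentage": "20"}
	n, _ := onDemandCapacity(tags, 10)
	fmt.Println(n) // prints "2": 20% of a 10-instance group
}
```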

@xlr-8 (Contributor) commented Dec 14, 2016

Another idea could be to set a certain bidding price, configurable per ASG:

Key: autospotting-bid-price
Value: 0.5

@roeyazroel (Contributor) commented

@xlr-8 that kind of misses the entire point of AutoSpotting: you don't want to manage the bid as the price changes over time. Maybe a max bid price tag would be useful, since the current algorithm takes the on-demand price as the max.

Another idea is configuring how long an instance should be in the ASG before it gets replaced with spot; this is for highly dynamic scaling. In that case, if the ASG scales out for just 30 minutes, AutoSpotting will replace the instance after 5 minutes in the worst case... and you end up paying for an on-demand plus a spot instance for this "peak" time.

@cristim (Member Author) commented Dec 14, 2016

I don't think it makes sense to specify an absolute bid price: just run spot as long as spot instances are cheaper than the initial on-demand ones, and run on-demand when spot is getting more expensive.

I do see a possible use case for a customizable ratio, which would allow bidding at, say, 2x or 0.5x the on-demand price. But then again, you lose money if the price sits constantly between the custom threshold and the base on-demand price.

@xlr-8 (Contributor) commented Dec 15, 2016

@roeyazroel Sorry, I may have expressed myself poorly; that is indeed what I meant: having a spot request bid price that would replace the on-demand max price value.

@cristim Yes, I am aware of the fragile balance between the bid and the on-demand price. Which actually brings up a question: how does AutoSpotting react when spot market prices vary as well?
Let's say the on-demand price in the ASG is $1/h and the lowest spot price found was $0.5/h, but 5-10 minutes later you find $0.4/h. Does it immediately try to replace those instances? It may be a silly question that isn't a problem in practice, depending on how frequently the lowest spot price flaps.

@cristim (Member Author) commented Dec 15, 2016

@xlr-8 the spot bid price is always the same as the initial instance's on-demand hourly price. Currently AutoSpotting doesn't care about price fluctuations as long as the spot instances are running. The price may go up and down as long as it stays below the bid price (at which point the spot instance would be terminated by AWS), regardless of other instance types that may become cheaper than those currently running.

AutoSpotting only takes action when there is at least one on-demand instance in the group (resulting from scaling out or from a spot termination), in which case it picks an on-demand instance from the group, finds the cheapest spot instance available at that time, launches it and swaps it with the on-demand instance. That's really all it does.
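
For readers following along, a toy sketch of that pass; the types and helpers are purely illustrative and not AutoSpotting's real data model:

```go
package main

import "fmt"

// Illustrative types only, not AutoSpotting's real data model.
type instance struct {
	id            string
	lifecycle     string  // "on-demand" or "spot"
	onDemandPrice float64 // hourly on-demand price of this instance type
}

type group struct{ instances []instance }

// cheapestSpotFor stands in for the spot market lookup: assume it
// launches the cheapest compatible spot instance, bid at the on-demand
// price of the instance being replaced.
func cheapestSpotFor(i instance) instance {
	return instance{id: i.id + "-spot", lifecycle: "spot", onDemandPrice: i.onDemandPrice}
}

// replaceOnDemand mirrors the behaviour described above: do nothing
// unless on-demand instances are present, and swap each one for the
// cheapest spot instance available at that moment.
func (g *group) replaceOnDemand() {
	for idx, i := range g.instances {
		if i.lifecycle == "on-demand" {
			g.instances[idx] = cheapestSpotFor(i)
			fmt.Printf("swapped %s for %s\n", i.id, g.instances[idx].id)
		}
	}
}

func main() {
	g := &group{instances: []instance{
		{id: "i-1", lifecycle: "on-demand", onDemandPrice: 0.1},
		{id: "i-2", lifecycle: "spot", onDemandPrice: 0.1},
	}}
	g.replaceOnDemand() // only i-1 gets swapped
}
```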

The current logic could be enhanced to do this kind of termination, but there are some aspects to consider:

  • the lower price may increase above the current price, in which case you may move back and forth or hop between multiple spot prices
  • you still pay an hour of on-demand price for each replacement

So the lower price has to be stable for a relatively long time, long enough to make up for that hour of on-demand price paid on each change.
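
To make that concrete, a back-of-envelope sketch, assuming (as described above) that a swap burns roughly one hour at the on-demand rate:

```go
package main

import "fmt"

// breakEvenHours estimates how long a cheaper spot price must remain
// stable to recoup one replacement, assuming the swap itself costs
// roughly one hour at the on-demand rate.
func breakEvenHours(onDemand, currentSpot, newSpot float64) float64 {
	extraCostOfSwap := onDemand - currentSpot // premium paid during the swap hour
	hourlySavings := currentSpot - newSpot    // saved every hour at the new price
	return extraCostOfSwap / hourlySavings
}

func main() {
	// Using the $1/h on-demand, $0.5/h current and $0.4/h new spot
	// prices from the example earlier in the thread.
	fmt.Printf("%.1f hours to break even\n", breakEvenHours(1.0, 0.5, 0.4))
}
```

With the prices from the earlier example, the cheaper price would have to hold for about five hours before hopping pays off.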

In addition, a way to switch more often between spot instance types might be to not bid the on-demand price for the new spot instance, but to pick a value close enough to its last week's average, in which case larger fluctuations would frequently trigger a replacement, always with the cheapest type.

All of this could be implemented as a sort of 'savings profile' that would always hunt for the cheapest instance types, and it could be enabled once we have AutoSpotting configurable.

@cristim (Member Author) commented Dec 21, 2016

Fixed in #54; other configuration options will be tracked as separate issues.

@cristim cristim closed this as completed Dec 21, 2016
@xlr-8 xlr-8 self-assigned this Dec 2, 2017