feat(aws): add market type to allow usage of spot instances #24
Conversation
Please keep the imports like before; besides that, LGTM.
@tboerger I noticed that; VSCode with the Go extension formatted it like that automatically. I was not sure whether such a PR would be welcome, so I didn't bother with the details. The source code of the autoscaler docs is not public, is it?
Not sure where @bradrydzewski has put the sources; asking him :)
9a9bbd5 to 648faac
The code looks good, but I would caution against using spot instances for auto-scaling. The challenge you will run into is that we have no way to know if or when a spot instance has been terminated, which will prevent the autoscaler from correctly calculating capacity. Also, if an instance is terminated while a build is running, the build will be stuck and will require manual intervention.
Spot instances are automatically killed?
Yes, spot instances are terminated when your bid price falls below the market price.
@bradrydzewski We're also using spot instances for our production services, and we're quite happy with how they work. We don't specify a maximum price, and the default is the on-demand price, so an instance is never terminated because of the market price. However, Amazon terminates them anyway from time to time; in our experience it usually happens after several weeks. What is worse is that they don't have unlimited spot capacity, so if you require only one specific instance type it can happen that they refuse to create an instance for you. The best practice is therefore to let Amazon choose from several instance types, or to manually try a different one when provisioning fails. But for our use case, what is in this PR is enough.

I'm scaling CI quite aggressively and we're using fairly large instances (c5.2xlarge). Only one is running all the time (and I plan to scale it to 0 outside of working hours), and it handles at most 2 tasks. If there are more jobs to be processed, a new instance is created, and it's killed after 1 hour if it's no longer needed. With spot instances the price is $0.1326/hr instead of $0.5006/hr, which is quite a big difference. Build artefacts can be cached on S3. The only thing that sucks is that all the Docker images have to be re-downloaded on fresh instances, so the first build of each project can be slower. But there's probably no easy way to avoid that, is there?

I have one off-topic question: is autoscaling possible when Drone is deployed in a Kubernetes cluster? What would be the recommended way? It would be great to utilize the k8s horizontal pod autoscaler and cluster autoscaler, but Drone is not able to run the pipeline as k8s pods, is it?

Btw, Drone is great, thanks for it! We're really looking forward to 0.9 and beyond.
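(For illustration, a minimal docker-compose sketch of how the new option might be configured once this PR is merged. The variable names, in particular DRONE_AMAZON_MARKET_TYPE, are assumptions based on the PR title rather than confirmed autoscaler settings; the values mirror the setup described above, and leaving the max spot price unset keeps the bid capped at the on-demand price.)

```yaml
# Sketch only: variable names are assumptions, not confirmed autoscaler settings.
version: '2'
services:
  autoscaler:
    image: drone/autoscaler
    environment:
      - DRONE_AMAZON_INSTANCE=c5.2xlarge   # instance type from the setup above
      - DRONE_AMAZON_MARKET_TYPE=spot      # assumed name for the option added in this PR
      # no max spot price configured, so the bid defaults to the on-demand price
```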
There is a Kubernetes runtime which would use Kubernetes resources, but AFAIK it's not in a finished state yet.
very cool, I did not know it worked this way :)
I am replying to my own comment here... I forgot that I implemented a pinger that periodically checks to make sure your servers are available. I never had time to test it in production, so it is currently behind a feature flag.
You could probably create an AMI with all of your commonly used images, and then use this as the default AMI to reduce startup time.
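(If someone goes the custom-AMI route, a rough cloud-config sketch for the instance that would later be snapshotted into the AMI could look like the following. The image list is purely illustrative, and it assumes Docker is already installed on that instance; once the images are pulled, create the AMI from the instance, for example with `aws ec2 create-image`.)

```yaml
#cloud-config
# Illustrative only: pre-pull commonly used images, then snapshot this instance
# into an AMI and use it as the autoscaler's default AMI.
runcmd:
  - docker pull drone/git
  - docker pull golang:1.11
  - docker pull node:10
```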
I have the "pinger" enabled but so far I think the instances were always terminated by the autoscaler so the pinger hasn't had a change to show off. Custom AMI is a good idea but too much hassle I guess. An option to inject a custom code to the autoscaler's userdata could be just as useful and easier to use. Probably an idea for another PR:). |
No need, you can do this today :) Create a cloud-init file on your host machine (where you have the autoscaler installed) and make any customizations that you see fit. You will need to provide your custom cloud-init file to the autoscale server: you can mount the file as a volume, and you will also need to tell the autoscale server where to find the file inside the container.
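(Roughly, that means something like the following compose snippet. The host path and the DRONE_AMAZON_USERDATA_FILE variable name are assumptions here, so verify them against the autoscaler configuration reference.)

```yaml
# Sketch only: mount the custom cloud-init file and point the server at it.
version: '2'
services:
  autoscaler:
    image: drone/autoscaler
    volumes:
      - /etc/drone/cloud-init.yml:/config/cloud-init.yml
    environment:
      - DRONE_AMAZON_USERDATA_FILE=/config/cloud-init.yml   # assumed variable name
```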
How could I have missed that?! Thanks again.
@bradrydzewski you should remove the docker restart, otherwise Docker is restarted twice. I totally forgot about it; I have dropped that from my custom cloud-init. Docker gets started after the installation (at least on Ubuntu).
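(As a hedged example, a trimmed custom cloud-init along these lines avoids the double restart, since the Docker installer already starts the daemon on Ubuntu. The contents are illustrative and not the autoscaler's default userdata.)

```yaml
#cloud-config
# Illustrative custom cloud-init: the installer below already starts Docker on
# Ubuntu, so no explicit "docker restart" step is included.
package_update: true
runcmd:
  - "curl -fsSL https://get.docker.com | sh"
  - docker pull drone/agent   # example of a site-specific customization
```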
OK, I think we are good to merge based on my updated understanding of how spot instances work, and the fact that we have the instance pinger in place. This could still result in some stuck builds, but that is something we should be able to automate (detect and cancel or restart) in a separate patch, once I've had some more time to think through how that would work.
Hmm... looks like somebody forgot to update the autoscale documentation :) |