Updates guide to start-stop script

davidmerfield committed Dec 17, 2019
1 parent 357ef9f commit 0edabb2d6b0e96a252610ca92e06c0e1ee508fd6
Showing with 22 additions and 1 deletion.
  1. +22 −1 notes/guides/hard-stop-start-ec2-instance.txt
@@ -1,2 +1,23 @@
Cloudwatch create alarm -> SMS -> Lambda
What's the problem?
Blot's EC2 instance becomes unresponsive: it doesn't respond to requests, SSH connections or even system reboots. I want to get the server responding again when it's down.

What's the solution?
Instead of a system-level reboot, I manually stop and then manually start the instance.

How to automate this?
CloudWatch offers an action to reboot an instance when an alarm goes off. However, it does not seem to offer a way to stop-then-start an instance when an alarm goes off.

So I ended up publishing a message to SNS when a CloudWatch alarm goes off. A Lambda function subscribes to that SNS topic; it stops the instance, then starts it again.

CloudWatch (EC2 instance monitoring) -> SNS (Simple Notification Service) -> Lambda (serverless function invocation)
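
A minimal sketch of what the Lambda end of this chain could look like, written in Python with boto3. This is an illustration, not the function actually deployed for Blot: the instance ID is a placeholder, the function names are made up, and forcing the stop (Force=True) is an assumption based on the instance not responding to normal reboots.

```python
import json

# Hypothetical placeholder; replace with the real instance ID.
INSTANCE_ID = "i-0123456789abcdef0"


def alarm_state(event):
    """Return the alarm state carried in an SNS-delivered CloudWatch alarm message."""
    # SNS wraps the CloudWatch alarm notification as a JSON string.
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    return message.get("NewStateValue")


def stop_then_start(ec2, instance_id):
    """Stop the instance, wait until it is fully stopped, then start it again."""
    # Force=True is an assumption: the instance is unresponsive, so a clean
    # shutdown may never complete.
    ec2.stop_instances(InstanceIds=[instance_id], Force=True)
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.start_instances(InstanceIds=[instance_id])


def handler(event, context):
    """Lambda entry point, subscribed to the SNS topic the alarm publishes to."""
    if alarm_state(event) != "ALARM":
        return "ignored"  # e.g. the alarm returning to OK
    import boto3  # imported here so the helpers above can be tested without AWS

    stop_then_start(boto3.client("ec2"), INSTANCE_ID)
    return "restarted"
```

The function's timeout would need to be long enough for the stop to complete, and its role would need permission to stop, start and describe the instance.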

- Increase granularity of CloudWatch checks but add ways to prevent an infinite loop
- if the server responds 200 OK to the web request then exit the script
- wait for the server to respond 200 OK before exiting the Lambda script?
- don't run the function more than once in parallel
- wait for at least x minutes before re-running the function?
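
A rough sketch of how some of these guards might look in Python, using only the standard library. Everything here is hypothetical: the function names, the 10-minute cooldown, the polling counts, and the idea that the caller keeps track of when the last restart happened (where that timestamp is stored is left open).

```python
import time
import urllib.request

# Hypothetical value for "x minutes" between restarts.
COOLDOWN_SECONDS = 10 * 60


def is_healthy(url, timeout=5, opener=urllib.request.urlopen):
    """Return True only if the server answers the web request with 200 OK."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection errors, timeouts and HTTP error responses.
        return False


def should_restart(healthy, last_restart, now, cooldown=COOLDOWN_SECONDS):
    """Decide whether to stop-then-start, guarding against a restart loop."""
    if healthy:
        return False  # server responds 200 OK: exit without touching it
    if last_restart is not None and now - last_restart < cooldown:
        return False  # restarted too recently: give the server time to boot
    return True


def wait_until_healthy(check, attempts=30, delay=10, sleep=time.sleep):
    """Poll check() until it returns True or the attempts run out."""
    for _ in range(attempts):
        if check():
            return True
        sleep(delay)
    return False
```

The handler could call should_restart before stopping the instance and wait_until_healthy after starting it. Running at most one copy at a time is not handled by this code; one option is to limit the function's concurrency to 1 in the Lambda configuration.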
