Updates guide to start-stop script

davidmerfield committed Dec 17, 2019
1 parent 357ef9f commit 0edabb2d6b0e96a252610ca92e06c0e1ee508fd6
Showing with 22 additions and 1 deletion.
  1. +22 −1 notes/guides/hard-stop-start-ec2-instance.txt
@@ -1,2 +1,23 @@
CloudWatch create alarm -> SNS -> Lambda
What's the problem?
-------------------
Blot's EC2 instance becomes unresponsive: it doesn't respond to requests, SSH connections or even system reboots. I want to get the server responsive again when it's down.

What's the solution?
--------------------
Instead of a system-level reboot, I manually stop and then manually start the instance.

How to automate this?
---------------------
CloudWatch offers an action to reboot an instance when an alarm goes off. However, it does not seem to offer a way to stop-then-start an instance when an alarm goes off.

So I ended up publishing a message to an SNS topic when a CloudWatch alarm goes off. A Lambda function subscribed to that topic then stops the instance and starts it again.
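
A minimal sketch of the alarm half of this, using boto3's put_metric_alarm. The metric (StatusCheckFailed), the period, the evaluation count and the alarm name are all assumptions for illustration; these notes don't record which metric the real alarm watches. The instance ID and topic ARN are passed in by the caller.

```python
def alarm_params(instance_id, topic_arn):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"unresponsive-{instance_id}",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",  # assumption: metric not named in this guide
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,  # seconds per data point (assumed)
        "EvaluationPeriods": 3,  # consecutive failing periods before alarming (assumed)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],  # publish to this SNS topic on ALARM
    }


def create_alarm(instance_id, topic_arn, cloudwatch=None):
    """Create (or update) the alarm; a client can be injected for testing."""
    if cloudwatch is None:
        import boto3  # imported lazily so the helper above works without AWS

        cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**alarm_params(instance_id, topic_arn))
```

The Lambda function then only needs to be subscribed to the same topic.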

CloudWatch (ec2 instance monitoring) -> SNS (simple notification service) -> Lambda (serverless function invocation)
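
The Lambda end of that chain could look roughly like this. It is a sketch, not the deployed function: the INSTANCE_ID environment variable is a hypothetical way to tell the function which instance to restart, and Force=True is an assumption that a forced stop is wanted for an instance that ignores normal shutdown.

```python
import os


def stop_then_start(ec2, instance_id):
    """Stop the instance, wait until it is fully stopped, then start it."""
    # Force=True requests a forced stop (assumption, see note above).
    ec2.stop_instances(InstanceIds=[instance_id], Force=True)
    # An instance can only be started once it has reached the "stopped" state.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    ec2.start_instances(InstanceIds=[instance_id])


def handler(event, context, ec2=None):
    """Entry point invoked by the SNS notification."""
    instance_id = os.environ["INSTANCE_ID"]  # hypothetical configuration
    if ec2 is None:
        import boto3  # imported lazily so the logic can be tested without AWS

        ec2 = boto3.client("ec2")
    stop_then_start(ec2, instance_id)
    return {"restarted": instance_id}
```

The function's timeout has to be long enough for the stop to finish. Lambda's default of 3 seconds would not be.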

Improvements
------------
- Increase granularity of CloudWatch checks but add ways to prevent an infinite loop
- if the server responds 200 OK to the web request then exit the script
- wait for the server to respond 200 OK before exiting lambda script?
- don't run the function more than once in parallel
- wait for at least x minutes before re-running the function?
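
Two of these guards (exit if the server already answers 200 OK, and wait x minutes between runs) could be sketched as below. None of this is implemented yet. The function names are hypothetical, and where the time of the last restart gets stored between invocations is left open.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=5, opener=urllib.request.urlopen):
    """Return True only if the server answers the request with 200 OK."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts and HTTP error statuses all count as down.
        return False


def should_restart(healthy, last_restart, now, cooldown_seconds):
    """Decide whether this alarm should trigger a stop-then-start."""
    if healthy:
        return False  # server already responds 200 OK: exit the script
    if last_restart is not None and now - last_restart < cooldown_seconds:
        return False  # restarted too recently: avoid a restart loop
    return True
```

For example, should_restart(is_healthy(url), last_restart, time.time(), 600) would allow at most one restart every ten minutes, and none while the site is up.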
