---
AWSTemplateFormatVersion: '2010-09-09'
Description: >
Demonstrates:
* Rebooting instances to apply critical, reboot-required patches during ASG scaling (e.g. Amazon Linux kernel updates)
Supports:
* CF Update for Monthly Server Patching Via Server Replacement
* TroubleShooting mode to enable Session Manager Connection
* Parameter (ELBName) that is
* Optional
* can be sourced from a ParentStack Import
* OR sourced directly as a parameter
* Parameter overrides parent stack if both are present
Parameters:
AAAReadmeBlogPost:
Description: Read the following post to learn why this template is helpful - Darwin.
Type: String
Default: https://cloudywindows.io/post/asg-lifecycle-hook-for-linux-kernel-patching-with-a-reboot-in-aws-autoscaling-groups/
AAAReadmeBlogPost2:
Description: Read the following post to learn why this template is helpful - Darwin.
Type: String
Default: https://cloudywindows.io/post/cloudformation-stack-attack/
AMIID:
Description: >
AMI ID - For testing pick an older one that will need a kernel patch or other reboot required patch for sure.
For Amazon Linux 1 that won't be too hard ;)
Type: String
Default: ami-04768381bf606e2b3
DesiredCapacity:
Description: >
ASG Desired Capacity - 4 is a good number for testing updates.
1 is sufficient to observe how the userdata script performs the yum update and reboot before completing the lifecycle hook.
Type: Number
Default: 4
PatchRunDate:
Description: >
Enter the deploy or update date - changing this is required to force a rolling replacement for patching.
It has a secondary purpose as a convenient way to document the patch date as an environment variable and an EC2 tag.
It is just a string that you could set to any value and has no role in selecting patches or anything else - but it
does need to change from its previous setting in order for the update to be forced.
Type: String
Default: 2019-06-04
TroubleShootingMode:
Description: Enables troubleshooting - currently this just enables SSM so Session Manager can be used to log on to the machine(s).
Type: String
Default: false
AllowedValues:
- true
- false
UpdateType:
Description: (Only Applies To Updates) Whether ASG Update Should Do a Rolling Update or an ASG Replacement.
Type: String
Default: RollingThroughInstances
AllowedValues:
- ReplaceEntireASG
- RollingThroughInstances
SetupPseudoWebApp:
Description: Puts up apache with a simple landing page. Warning, adds an ingress for port 80 to the VPC default security group.
Type: String
Default: false
AllowedValues:
- true
- false
ELBName:
Description: ELBName (Optional) - overrides same name in parent stack exports if both are provided
Type: String
ParentStackName:
Description: Parent stack name (Optional) - passes resources including ELBName
Type: String
Conditions:
CreateDebugResources: !Equals [ !Ref TroubleShootingMode, "true" ]
ReplaceEntireASG: !Equals [ !Ref UpdateType, "ReplaceEntireASG" ]
SetupPseudoWebApp: !Equals [ !Ref SetupPseudoWebApp, "true" ]
ParentStackNameWasPassed: !Not [ !Equals [ !Ref ParentStackName, "" ]]
ELBNameWasPassed: !Not [ !Equals [ !Ref ELBName, "" ]]
Resources:
ASGRebootRole:
Type: AWS::IAM::Role
Properties:
Path: /
AssumeRolePolicyDocument:
Version: 2012-10-17
Statement:
- Effect: Allow
Action: sts:AssumeRole
Principal:
Service:
- ec2.amazonaws.com
ManagedPolicyArns:
!If
- CreateDebugResources
-
- arn:aws:iam::aws:policy/service-role/AmazonSSMMaintenanceWindowRole
- arn:aws:iam::aws:policy/service-role/AmazonSSMAutomationRole
- arn:aws:iam::aws:policy/service-role/AmazonEC2RoleforSSM
- !Ref "AWS::NoValue"
ASGInstanceProfile:
Type: AWS::IAM::InstanceProfile
Properties:
Path: /
Roles: [ !Ref ASGRebootRole ]
ASGSelfAccessPolicy:
Type: AWS::IAM::Policy
Properties:
PolicyName: ASGSelfAccessPolicy
Roles: [ !Ref ASGRebootRole ]
PolicyDocument:
Version: 2012-10-17
Statement:
- Sid: ASGSelfAccessPolicy
Resource: "*"
Effect: Allow
Action:
- iam:ListAccountAliases
- autoscaling:DescribeAutoScalingInstances
- autoscaling:DescribeAutoScalingGroups
- autoscaling:DescribeLifecycle*
- Sid: ASGLifeCycleAccessPolicy
Resource: !Sub 'arn:${AWS::Partition}:autoscaling:${AWS::Region}:${AWS::AccountId}:autoScalingGroup:*:autoScalingGroupName/${AWS::StackName}*'
Effect: Allow
Action:
- autoscaling:CompleteLifecycleAction
- autoscaling:RecordLifecycleActionHeartbeat
#To use a tag condition, update Resource to '*' and uncomment this segment
#Condition:
# StringEquals:
# autoscaling:ResourceTag/Name: !Ref AWS::StackName
EC2SelfAccessPolicy:
Type: AWS::IAM::Policy
Properties:
PolicyName: EC2SelfAccessPolicy
Roles: [ !Ref ASGRebootRole ]
PolicyDocument:
Version: 2012-10-17
Statement:
- Sid: EC2SelfAccessPolicy
Resource: "*"
Effect: Allow
Action:
- ec2:DescribeInstances
- ec2:DescribeTags
WebPlusDefaultSecurityGroupSelfIngress:
Type: 'AWS::EC2::SecurityGroupIngress'
Condition: SetupPseudoWebApp
Properties:
GroupName: default
IpProtocol: tcp
FromPort: 80
ToPort: 80
CidrIp: 0.0.0.0/0
InstanceASG:
Type: AWS::AutoScaling::AutoScalingGroup
CreationPolicy:
AutoScalingCreationPolicy:
MinSuccessfulInstancesPercent: 75
ResourceSignal:
Timeout: PT15M
Count: !Ref DesiredCapacity
# Both UpdatePolicy types are declared below; the UpdateType parameter decides which one takes effect.
UpdatePolicy:
AutoScalingRollingUpdate:
MaxBatchSize: 2
MinInstancesInService: 1
MinSuccessfulInstancesPercent: 75
WaitOnResourceSignals: true
PauseTime: PT0M30S
SuspendProcesses:
- HealthCheck
- ReplaceUnhealthy
- AZRebalance
- AlarmNotification
- ScheduledActions
AutoScalingReplacingUpdate:
# WillReplace=True will make ReplacingUpdate take precedence over RollingUpdate
!If
- ReplaceEntireASG
- WillReplace: 'true'
- WillReplace: 'false'
Properties:
HealthCheckGracePeriod: 3
AvailabilityZones:
Fn::GetAZs:
Ref: AWS::Region
MinSize: '1'
MaxSize: '10'
DesiredCapacity: !Ref DesiredCapacity
#Demonstrates 1) an **optional dependency** on an ELB which can be 2) sourced from **a parent stack**,
# or 3) **overridden by a parameter** in this stack
LoadBalancerNames:
!If
- ELBNameWasPassed
- - !Ref ELBName
- !If
- ParentStackNameWasPassed
- - Fn::ImportValue:
!Sub "${ParentStackName}-ELBName"
- !Ref AWS::NoValue
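# A minimal sketch (an assumption, not part of this template) of the parent stack
# output that the Fn::ImportValue above would resolve. The logical ID ParentELB is
# hypothetical; only the export name pattern "<ParentStackName>-ELBName" comes from
# this template. For a classic load balancer, Ref returns the load balancer name.
# Outputs:
#   ELBName:
#     Value: !Ref ParentELB
#     Export:
#       Name: !Sub "${AWS::StackName}-ELBName"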
LaunchConfigurationName:
Ref: ASGLaunchConfig
LifecycleHookSpecificationList:
- LifecycleTransition: 'autoscaling:EC2_INSTANCE_LAUNCHING'
LifecycleHookName: instance-patching-reboot
HeartbeatTimeout: 3600
Tags:
- Key: Name
Value: !Ref AWS::StackName
PropagateAtLaunch: 'True'
- Key: LAST_CF_PATCH_RUN
Value: !Ref PatchRunDate
PropagateAtLaunch: 'True'
ASGLaunchConfig:
Type: AWS::AutoScaling::LaunchConfiguration
Properties:
ImageId:
Ref: AMIID
InstanceType: t2.small
IamInstanceProfile: !Ref ASGInstanceProfile
BlockDeviceMappings:
- DeviceName: "/dev/xvda"
Ebs:
VolumeType: 'gp2'
VolumeSize: 30
UserData:
Fn::Base64: !Sub |
#!/bin/bash
LAST_CF_PATCH_RUN="${PatchRunDate}" #Forces change for patching rolling replacement and documents last CF triggered patching
ACTUAL_PATCH_DATE=$(date +%Y-%m-%d)
MYINSTANCEID="$(wget -q -O - http://169.254.169.254/latest/meta-data/instance-id)"
MYREGION="$(curl -s 169.254.169.254/latest/meta-data/placement/availability-zone | sed 's/.$//')"
NAMEOFASG=$(aws ec2 describe-tags --region $MYREGION --filters {"Name=resource-id,Values=$MYINSTANCEID","Name=key,Values=aws:autoscaling:groupName"} --output=text | cut -f5)
PATCHDONEFLAG=/root/patchingrebootwasdone.flg
function logit() {
LOGSTRING="$(date +"%_b %e %H:%M:%S") $(hostname) USERDATA_SCRIPT: $1"
echo "$LOGSTRING"
#For CloudFormation, if you already collect /var/log/cloud-init-output.log, then you could mute the next logging line
echo "$LOGSTRING" >> /var/log/messages
}
logit "Processing userdata script on instance: $MYINSTANCEID"
logit "Operating in Region: $MYREGION, launched from ASG: $NAMEOFASG"
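#needs-restarting (used below) ships in yum-utils; -y is required because userdata runs non-interactively
yum install -y yum-utils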
uname -r
if [ -n "$NAMEOFASG" ]; then
logit "Instance is in an ASG, will process lifecycle hooks"
logit "Listing hook to verify permissions and hook presence"
aws --region ${AWS::Region} autoscaling describe-lifecycle-hooks --auto-scaling-group-name $NAMEOFASG
else
logit "Instance is not in an ASG or if it is, the instance profile used does not have permissions to its own tags."
fi
if [ -f $PATCHDONEFLAG ]; then
logit "Completed a post-patching reboot, skipping patching check..."
else
logit "Let's patch (including the kernel if necessary)..."
yum update -y
logit "ACTUAL_PATCH_DATE may be newer because this instance was autoscaled after the LAST_CF_PATCH_RUN"
logit "LAST_CF_PATCH_RUN: $LAST_CF_PATCH_RUN"
echo "export LAST_CF_PATCH_RUN=$LAST_CF_PATCH_RUN" >> /etc/profile.d/lastpatchingdata.sh
logit "ACTUAL_PATCH_DATE: $ACTUAL_PATCH_DATE"
echo "export ACTUAL_PATCH_DATE=$ACTUAL_PATCH_DATE" >> /etc/profile.d/lastpatchingdata.sh
needs-restarting -r
if [ $? -gt 0 ]; then
logit "Detected that a reboot is required, rebooting..."
logit "Resetting userdata semaphore..."
rm /var/lib/cloud/instances/*/sem/config_scripts_user
touch $PATCHDONEFLAG
reboot
logit "Waiting for reboot to complete..."
sleep 30
fi
fi
logit "Continuing..."
if [ -n "$NAMEOFASG" ]; then
logit "Sending a heart beat to reset the timeout counter while doing more things..."
aws --region ${AWS::Region} autoscaling record-lifecycle-action-heartbeat --instance-id $MYINSTANCEID --lifecycle-hook-name instance-patching-reboot --auto-scaling-group-name $NAMEOFASG
fi
if [[ "${SetupPseudoWebApp}" == "true" ]]; then
logit "Installing Web Application (To Emulate a Real Software Stack)"
yum install -y httpd
service httpd start
chkconfig httpd on
cat << HTMLHomePage > /var/www/html/index.html
<html><head><title>CloudyWindows.io ASG With Lifecycle Hooks</title></head>
<body>
<h1>CloudyWindows.io ASG With Lifecycle Hooks</h1>
<p><A HREF="https://cloudywindows.io/post/asg-lifecycle-hook-for-linux-kernel-patching-with-a-reboot-in-aws-autoscaling-groups/"><IMG SRC="https://cloudywindows.io/mstile-150x150.png"> Check Out The Blog Post For This Template!</A>
<BR><BR>Page retrieved from load balanced instance: $MYINSTANCEID
<BR>in ASG: $NAMEOFASG
<BR>As a part of CF Stack: ${AWS::StackName}
<BR>with LAST_CF_PATCH_RUN: $LAST_CF_PATCH_RUN and ACTUAL_PATCH_DATE: $ACTUAL_PATCH_DATE</p>
<iframe src="https://cloudywindows.io" frameborder=0 width="100%" height="100%" ></iframe>
</body>
</html>
HTMLHomePage
fi
logit "Code Deploy Install..."
cd /tmp
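#CodeDeploy installer buckets are per-region, so build the bucket name from the stack's region
aws s3 cp s3://aws-codedeploy-${AWS::Region}/latest/install . --region ${AWS::Region}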
chmod +x ./install
./install auto
if [[ "${TroubleShootingMode}" == "true" ]]; then
logit "TroubleShootingMode is true - Installing SSM for Session Manager Access..."
sudo yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
sudo start amazon-ssm-agent
fi
if [ -n "$NAMEOFASG" ]; then
logit "Completing lifecycle action hook so that ASG knows we are ready to be placed InService..."
aws --region ${AWS::Region} autoscaling complete-lifecycle-action --lifecycle-action-result CONTINUE --instance-id $MYINSTANCEID --lifecycle-hook-name instance-patching-reboot --auto-scaling-group-name $NAMEOFASG
fi
logit "Cfn-signaling success..."
/opt/aws/bin/cfn-signal --success true --stack ${AWS::StackName} --resource InstanceASG --region ${AWS::Region}