ELB Health Check Issue #7

Closed
michael-newman opened this issue Nov 29, 2017 · 8 comments

Comments

@michael-newman

When I run the Master Template, it fails and rolls back at the Web Template. I then ran the Templates individually, and each one succeeded until the Web Template, which fails and rolls back (just the Web Template).

The issue is that new instances fail the Health Check, so the ASG launches another EC2 instance, which also fails, then another, and so on. This loop causes the Web Template to fail and roll back.

As a workaround, I changed the Web Template from "HealthCheckType: ELB" to "HealthCheckType: EC2". With that change the Web Template runs to completion and the ASG does not continually launch instances. So it appears the Health Check is not visible from the ELB as it should be.

Aside from the above modification, the only other change I made to the Templates was changing the subnet IP ranges from 10.0.x.x/xx to 10.10.x.x/xx in the VPC Template.

Is there some other configuration I need to change as a result of adjusting the CIDR ranges? Or is there anything else I need to configure in the AWS Console for these Templates to run in their entirety?
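One quick sanity check after changing CIDR ranges is to confirm that every subnet range still falls inside the VPC range. The sketch below uses only Python's standard `ipaddress` module. The function name and all CIDR values are hypothetical placeholders, not values from these Templates, and a mismatch here is only one possible cause, not a confirmed diagnosis.

```python
import ipaddress


def subnets_outside_vpc(vpc_cidr, subnet_cidrs):
    """Return the subnet CIDRs that do not fall inside the VPC CIDR."""
    vpc = ipaddress.ip_network(vpc_cidr)
    return [
        cidr
        for cidr in subnet_cidrs
        if not ipaddress.ip_network(cidr).subnet_of(vpc)
    ]


# Hypothetical values for illustration only: one subnet was left on the
# old 10.0.x.x range after the VPC moved to 10.10.x.x.
vpc_cidr = "10.10.0.0/16"
subnet_cidrs = ["10.10.0.0/24", "10.10.1.0/24", "10.0.2.0/24"]

print(subnets_outside_vpc(vpc_cidr, subnet_cidrs))  # → ['10.0.2.0/24']
```

An empty list means every subnet sits inside the VPC range, which rules out this particular kind of mismatch.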

Would appreciate any/all guidance and thoughts re how to get the Health Check from the ELB working.

Thank you, Mike

@michael-newman
Author

Since the Web Template only completed with "HealthCheckType: EC2", I decided to test changing the ASG configuration in the console to "HealthCheckType: ELB". Interestingly, I have no issues now. I can adjust the ASG instance settings up/down and instances launch and terminate as expected.

@darrylsosborne
Contributor

I'll take a look at this over the next few days and see whether a change to resolve it can go out with my next push.

@michael-newman
Author

Darryl, just so you know, I'm currently working with AWS Tech Support. It appears the issues I was having with "HealthCheckType: ELB" are timeouts caused by latency within the site. Not sure why yet, but the site is slow and that was causing Health Checks to time out and fail. If the root cause turns out to be related to the CloudFormation Template (versus our WordPress site), I'll post it here.
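To illustrate how a slow site can fail health checks even though it is up, here is a simplified model: a check fails when the response takes longer than the timeout, and the target is marked unhealthy after a run of consecutive failures. The function name, the timeout, the threshold, and the response times are all hypothetical, not values taken from the Templates or from this case.

```python
def first_unhealthy_check(response_times_s, timeout_s, unhealthy_threshold):
    """Return the 1-based index of the check that marks the target unhealthy.

    A check counts as failed when its response time exceeds timeout_s.
    The target is marked unhealthy once unhealthy_threshold checks fail
    in a row. Returns None if that never happens.
    """
    consecutive_failures = 0
    for index, response_time in enumerate(response_times_s, start=1):
        if response_time > timeout_s:
            consecutive_failures += 1
            if consecutive_failures >= unhealthy_threshold:
                return index
        else:
            # A passing check resets the failure streak.
            consecutive_failures = 0
    return None


# Hypothetical numbers: a 5 second timeout and a threshold of 2 failures.
slow_site = [1.2, 6.0, 7.5, 0.8]
print(first_unhealthy_check(slow_site, timeout_s=5.0, unhealthy_threshold=2))  # → 3
```

With "HealthCheckType: ELB", an instance marked unhealthy this way gets replaced by the ASG, which matches the launch/fail loop described at the top of this issue. With "HealthCheckType: EC2", slow page responses alone do not trigger a replacement.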

@darrylsosborne
Contributor

Michael, have you been able to resolve your ELB health check issue?

@michael-newman
Author

Darryl,

Not yet... we have AWS techs reviewing ALB log files and web server packet captures as we speak. We should have an answer soon.

-Mike

@michael-newman
Author

Darryl,

FYI, AWS techs informed me this morning that they are now going to involve a CloudFormation engineer in my case. Current thinking is still that high latency is causing Health Checks to fail, and they are trying to determine the root cause.

Has anyone reported any latency issues with this setup that I could bring to the attention of the techs working the case, particularly once a site is migrated over or heavily built out?

-Mike

@darrylsosborne
Contributor

Mike,

The only issue we're aware of is related to EFS file systems running out of burst credits, which drops the permitted throughput to the baseline throughput of 50 MiB/s per TiB of storage (or 50 KiB/s per GiB). If the WordPress site has a small amount of file system storage on EFS, it doesn't earn enough credits to sustain the throughput it needs, and eventually the burst credit balance drops to zero.

File systems earn credits at the baseline throughput rate and consume credits at the throughput driven by the site. If you're using more than you're earning, the balance will eventually drop to zero. Performance may be great initially, but once the burst credit balance reaches zero and the permitted throughput changes to the baseline throughput, the site experiences performance issues because the file system can no longer deliver the throughput the site demands.

That's one of the reasons I built the burst credit balance alarms and dashboard, so customers can monitor this condition and get notified when it occurs.
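The credit arithmetic described above can be sketched in a few lines. This is a simplified model that uses only the baseline rate quoted in this comment (50 KiB/s per GiB stored). The function names, the stored size, the demand, and the starting credit balance are hypothetical, and the model ignores any cap on burst throughput or on the credit balance.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB
TIB = 1024 * GIB

# Baseline rate quoted in the comment: 50 KiB/s per GiB of storage.
BASELINE_BYTES_PER_SEC_PER_GIB = 50 * KIB


def baseline_throughput(stored_bytes):
    """Baseline throughput in bytes/s for a given amount of stored data."""
    return stored_bytes / GIB * BASELINE_BYTES_PER_SEC_PER_GIB


def seconds_until_credits_exhausted(stored_bytes, demand_bytes_per_sec,
                                    credit_balance_bytes):
    """Seconds until the burst credit balance reaches zero.

    Credits are earned at the baseline rate and spent at the demand rate.
    Returns None when demand does not exceed the baseline, since the
    balance never drains in that case.
    """
    net_drain = demand_bytes_per_sec - baseline_throughput(stored_bytes)
    if net_drain <= 0:
        return None
    return credit_balance_bytes / net_drain


# Hypothetical site: 5 GiB stored, sustained demand of 1 MiB/s,
# and an assumed starting balance of 1 TiB of credits.
stored = 5 * GIB
print(baseline_throughput(stored) / KIB)  # → 250.0 (KiB/s)

seconds = seconds_until_credits_exhausted(stored, 1 * MIB, 1 * TIB)
print(round(seconds / 86400, 1), "days until the balance hits zero")
```

In this example the site demands roughly four times its baseline, so the balance drains steadily even though performance looks fine at first. Storing more data raises the baseline and slows or stops the drain.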

@michael-newman
Author

Darryl,

FYI, the AWS techs started to go down the EFS burst credit path as well, although I must say we had the latency issues before the burst credit issues, which basically shut down the site. We started our effort on v1.0 and were deep into troubleshooting when you released v2.0.x. So with the holidays and an unexpected priority, we decided to delete v1.0 and will restart with v2.0.1 later this month. As such, I'll close this issue.

Very excited about what you are doing here, can't wait to pick this up again in a couple of weeks.

Regards, Mike
