Aws az down #230
Conversation
Force-pushed from a05db13 to ecb7182.
Hi @damienomurchu! Once again, thanks for working on the experiment. I have shared some info here regarding the tests. At this point, okteto is more for repeated dev-test. The tests with the chaosexperiment and chaosengine are a separate step, not integrated with okteto as such.
Force-pushed from bd04c79 to 6da4196.
I've been running this locally with okteto and have verified all but the destructive aspects of the experiment, i.e. the reassignment of the blocking ACLs to take the targeted AZ's instances out of circulation. The multi-AZ cluster I've been testing this against is due to expire, so I will most likely test this on a new cluster on Monday. All going well, I'll tidy up the PR and take it out of draft. Thanks for the helpful info and the fix on #229!
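The destructive step described above first has to work out which network-ACL associations belong to the targeted AZ before repointing them at a blocking ACL. A minimal sketch of that selection step is below; the `Subnet` fields and `targetAssociations` helper are illustrative names, not the experiment's actual code, standing in for what would come back from `ec2:DescribeSubnets`/`DescribeNetworkAcls`.

```go
package main

import "fmt"

// Subnet models the minimal fields the experiment needs from the EC2
// describe calls (illustrative; not the PR's actual types).
type Subnet struct {
	ID            string
	AZ            string
	AssociationID string // current network-ACL association for the subnet
}

// targetAssociations returns the ACL association IDs for every subnet in
// the AZ under test; these are the associations the experiment would
// repoint at the blocking ACL via ec2:ReplaceNetworkAclAssociation.
func targetAssociations(subnets []Subnet, az string) []string {
	var out []string
	for _, s := range subnets {
		if s.AZ == az {
			out = append(out, s.AssociationID)
		}
	}
	return out
}

func main() {
	subnets := []Subnet{
		{ID: "subnet-a", AZ: "us-east-1a", AssociationID: "aclassoc-1"},
		{ID: "subnet-b", AZ: "us-east-1b", AssociationID: "aclassoc-2"},
		{ID: "subnet-c", AZ: "us-east-1a", AssociationID: "aclassoc-3"},
	}
	fmt.Println(targetAssociations(subnets, "us-east-1a"))
}
```

Keeping this filtering pure (no AWS calls) makes it easy to verify locally with okteto before ever touching a live cluster.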
Thanks @damienomurchu!! Also adding other community members specifically interested in this experiment for their perspective/reviews - cc: @kazukousen @suhrud-kumar
Force-pushed from 6da4196 to 2012cfb.
import "github.com/litmuschaos/litmus-go/pkg/log"

func AZDown() error {
Most of the az-down business logic is in the experiment right now, but will look to refactor and move here
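One way the refactor mentioned above could look is to have `AZDown` depend on a small interface over the two EC2 operations, so the chaoslib logic can be unit-tested without AWS credentials. The `aclClient` interface, method names, and fake below are hypothetical, a sketch of the structure rather than the PR's actual code.

```go
package main

import "fmt"

// aclClient abstracts the two EC2 calls the experiment needs
// (hypothetical interface; real code would wrap the AWS SDK or CLI).
type aclClient interface {
	AssociationsInAZ(az string) ([]string, error)
	ReplaceAssociation(assocID, aclID string) error
}

// AZDown repoints every network-ACL association in the target AZ at a
// deny-all ACL, simulating loss of the zone.
func AZDown(c aclClient, az, blockingACL string) error {
	assocs, err := c.AssociationsInAZ(az)
	if err != nil {
		return fmt.Errorf("listing ACL associations in %s: %w", az, err)
	}
	for _, a := range assocs {
		if err := c.ReplaceAssociation(a, blockingACL); err != nil {
			return fmt.Errorf("blocking association %s: %w", a, err)
		}
	}
	return nil
}

// fakeClient records the replacements so the logic can be smoke-tested.
type fakeClient struct{ replaced map[string]string }

func (f *fakeClient) AssociationsInAZ(az string) ([]string, error) {
	return []string{"aclassoc-1", "aclassoc-2"}, nil
}

func (f *fakeClient) ReplaceAssociation(assocID, aclID string) error {
	f.replaced[assocID] = aclID
	return nil
}

func main() {
	f := &fakeClient{replaced: map[string]string{}}
	if err := AZDown(f, "us-east-1a", "acl-deny"); err != nil {
		panic(err)
	}
	fmt.Println(len(f.replaced)) // associations repointed at the blocking ACL
}
```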
Force-pushed from 921e7d8 to 01be13f.
Mostly verified by running the experiment locally with okteto against a multi-AZ cluster. Will run against a fresh cluster later today via the regular litmus workflow.
Signed-off-by: Damien Murphy <damurphy@redhat.com>
Force-pushed from 01be13f to 335129a.
Hmm, getting a panic over a nil map assignment that I wasn't getting when running the experiment locally with okteto. I've rebuilt the binary and the image from the latest PR changes before testing. Any maps I'm using seem to be initialised, but I seem to be missing something, so any fresh eyes would be welcome :)
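For anyone hitting the same thing: in Go, reading from a nil map is safe (it returns the zero value), but assigning into one panics at runtime with "assignment to entry in nil map". A common way this slips in is a struct field of map type that is never initialised on one code path. A minimal illustration (not the experiment's code):

```go
package main

import "fmt"

func main() {
	// A declared-but-uninitialised map is nil.
	var m map[string]string

	// Reading a nil map is safe and returns the zero value.
	fmt.Println(m["key"] == "")

	// m["key"] = "value" // would panic: assignment to entry in nil map

	// Writes require the map to be initialised with make (or a literal).
	// Note: map-typed struct fields are nil after struct construction
	// unless explicitly set, which is how this panic often sneaks in.
	m = make(map[string]string)
	m["key"] = "value"
	fmt.Println(m["key"])
}
```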
@ksatchit I feel the PR is probably sufficiently advanced to consider taking it out of draft, but after today I'm mostly AFK for about a month, so I will be guided by you as to whether I should take this out of draft before then and resolve any outstanding issues thrown up by the CI.
Hi @damienomurchu, yes, I guess we can move it out of draft to an actual PR!! Will take a closer look shortly (just wrapping up the 1.11.0 release).
Tagging @uditgaurav on the e2e failure on container-kill experiment. |
Closing this, as we went in a different direction instead: re-using a shell script that drives the aws-cli to perform the same steps as the logic in this experiment. I will leave the PR branch up in case any of this work is useful to others. Thanks again for the help and feedback with this PR @ksatchit
What
Add a litmus experiment to simulate the failure of an AWS availability zone (AZ).
Why
Evaluate the resiliency of the application under test if an AZ goes down.
How