Databases actions #72

Open
xpdable opened this issue Sep 26, 2019 · 5 comments
Labels
enhancement New feature or request

Comments

@xpdable
Contributor

xpdable commented Sep 26, 2019

We'd like to have more experiments at the database level, such as restarting a database instance.
We currently use the services in the red block below. We have run into real-world restarts many times, especially with MySQL, and each time we had to work on the weekeeeeend :-(
image

@russmiles
Contributor

:( my heart goes out to you on that! Weekend loss is bad for everyone.

Will help if I can with the actions.

russmiles added the enhancement label Sep 26, 2019
@buderre
Contributor

buderre commented Oct 11, 2019

@xpdable Can you tell us whether you run a self-managed SQL server or use the shared database model in Azure?

Some operations can be triggered on an MS Azure database, such as pausing: https://docs.microsoft.com/en-us/rest/api/sql/databases/pause. Since the rented models differ (self-managed or pay-as-you-go), a well-designed chaos action needs to be prepared here :)

@xpdable
Contributor Author

xpdable commented Oct 12, 2019

@buderre As discussed offline, here is a brief account of a real database incident.
Microsoft once had to force-patch a vulnerability, and our Azure Database for MySQL instances were restarted without any notification. We later learned that all users in Azure China Cloud were affected by this restart.
Some services went down because they lost their database connections, and all end users of those services were affected. The service team raised the alarm first, and the Azure team then escalated the incident to Microsoft.
There was nothing to do except wait for the restart to finish. It took no more than one hour before the service was back to normal. The only good news was that we did not lose any data.

@buderre
Contributor

buderre commented Oct 14, 2019

@xpdable This is more than sufficient information for me.

@botobako @mkaszub
I also have a suggestion for turning the "database connection loss" scenario into a chaos action. In your case the server seems to have restarted itself. The MS Azure REST API for MySQL does not offer a "restart server" action, and I don't think we need one. Instead, let's introduce an Azure firewall rule that blocks the connection to the MySQL database for a specified time span. The advantage is that the database remains untouched and we can test the same scenario in a safe way. What do you guys think?
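
To make the "block for a specified time span" idea concrete, here is a minimal sketch of how such an action could be structured so that the block is always rolled back. It makes no Azure calls: `block` and `unblock` are hypothetical callables that would wrap whatever firewall change ends up being used, and the names `connection_blocked` and `block_database_for` are illustrative only, not part of any existing extension.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def connection_blocked(block: Callable[[], None],
                       unblock: Callable[[], None]) -> Iterator[None]:
    """Apply a block and guarantee the rollback, even if the body raises."""
    block()
    try:
        yield
    finally:
        # Always restore connectivity so a failed experiment
        # does not leave the database unreachable.
        unblock()


def block_database_for(duration_s: float,
                       block: Callable[[], None],
                       unblock: Callable[[], None],
                       sleep: Callable[[float], None] = time.sleep) -> None:
    """Keep the database unreachable for duration_s seconds, then restore.

    block/unblock are hypothetical hooks (assumptions for this sketch);
    sleep is injectable so the timing can be faked in tests.
    """
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    with connection_blocked(block, unblock):
        sleep(duration_s)
```

Passing the firewall change in as a pair of callables keeps the timing and rollback logic independent of the cloud API, so it can be exercised locally before it touches a real database.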

@Lawouach
Contributor

Just chiming in :)

I definitely like your idea of impacting the network, and I think it's quite clever to rely on the infra to do that. I wouldn't have thought of setting a firewall rule.

Otherwise, in some other areas, you can sometimes simply route the network traffic to /dev/null via an intermediate proxy. That requires adding something to the infra, which your infra team may not allow.
