Troubleshoot guide for node replacement #205
docs/TroubleshootingGuide.md
Outdated
@@ -0,0 +1,34 @@
# Troubleshoot Guide

This guide explains how to replace a node when the node cannot be replaced with normal procedures. This is especially useful when the node or os-disk is accidentally terminated in the **Azure** environment.
It should be a general troubleshooting guide, so it's weird to see one of the cases in the introductory sentence. It should be just one of the cases.
Sorry for the inconvenience.
I have modified the document based on the review comment.
Co-authored-by: Hiroyuki Yamada <mogwaing@gmail.com>
Overall, the doc is too specific to what you did; it is not generalized or formatted as a troubleshooting guide.
I left suggestions on only part of it, but please update the whole document along the same lines.
The point is that readers of the doc don't know what you did. They ran into some trouble and came here to fix it. Please write with that in mind.
docs/TroubleshootingGuide.md
Outdated
This is a guide for troubleshooting scalar-terraform environment.

Use this Troubleshooting Guide to:
We don't need it.
docs/TroubleshootingGuide.md
Outdated
- Accidental deletion of resources

## Accidental deletion of resources
These troubleshooting steps can be used when the node cannot be replaced with normal procedures. This is especially useful when the node or os-disk is accidentally terminated in the **Azure** environment.
This is not consistent with the title and does not really explain the situation, so it is hard to understand.
Suggested change:
- These troubleshooting steps can be used when the node cannot be replaced with normal procedures. This is especially useful when the node or os-disk is accidentally terminated in the **Azure** environment.
+ When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases.
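To make the inconsistency described above concrete, here is a minimal Python sketch. It is not part of scalar-terraform or Terraform itself. It assumes the resource IDs recorded in the Terraform state and the IDs that actually exist in the cloud have already been collected into two sets. The names `Drift` and `find_drift` and all resource names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Drift:
    # Tracked in the Terraform state, but no longer present in the cloud.
    missing_in_cloud: Set[str]
    # Present in the cloud, but unknown to the Terraform state.
    untracked_in_cloud: Set[str]


def find_drift(state_ids: Set[str], cloud_ids: Set[str]) -> Drift:
    """Compare what Terraform believes exists with what actually exists."""
    return Drift(
        missing_in_cloud=state_ids - cloud_ids,
        untracked_in_cloud=cloud_ids - state_ids,
    )


# Hypothetical resource names, for illustration only.
state = {"cassandra-1-vm", "cassandra-1-os-disk", "cassandra-1-data-disk"}
# Assumed situation: the VM was deleted by hand, its disks are still there.
cloud = {"cassandra-1-os-disk", "cassandra-1-data-disk"}

drift = find_drift(state, cloud)
print(sorted(drift.missing_in_cloud))
```

Anything that ends up in `missing_in_cloud` is a resource the state still tracks although it was removed outside Terraform, which is the kind of situation the recovery steps in this guide are meant for.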
Thank you for your suggestion.
I have applied the review changes.
docs/TroubleshootingGuide.md
Outdated
- Replace accidentally removed node or os-disk
- Replace accidentally removed cassandra node or os-disk with existing data disk

### Replace accidentally removed node or os-disk
Suggested change:
- ### Replace accidentally removed node or os-disk
+ ### Recover from accidental deletion of a node or an OS disk
Thank you for your suggestion.
I have applied the review changes.
docs/TroubleshootingGuide.md
Outdated
* Terminate the node If the os-disk is deleted.
* Follow [Node Replacement](NodeReplacement.md)

### Replace accidentally removed cassandra node or os-disk with existing data disk
Suggested change:
- ### Replace accidentally removed cassandra node or os-disk with existing data disk
+ ### Recover from accidental deletion of a Cassandra node ... (not sure how to fix)
What do you mean by "os-disk with existing data disk"?
Thank you for your suggestion.
I have applied the review changes.

> What do you mean os-disk with existing data disk ?

Sorry for the inconvenience.
I mean recovering an accidentally removed node using its existing data disk.
docs/TroubleshootingGuide.md
Outdated
Please try the following
* Delete the os-disk If the node is terminated.
* Terminate the node If the os-disk is deleted.
Is this a very rare case? I think we can't delete only the os-disk from the console. 🤔
Thank you for your valuable comment.
Yes, it is not possible through the console, because an os-disk cannot be deleted without terminating the VM.
I have modified the document accordingly.
docs/TroubleshootingGuide.md
Outdated
- Recover from accidental deletion of a node
- Recover a node with existing data disk from accidental deletion of a node

### Recover from accidental deletion of a node
Is it needed for AWS as well? Only for Azure?
It must be described.
> Is it needed for AWS as well?

No.

> Only for Azure?

Yes.

> It must be described.

I have modified the document accordingly:
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.
docs/TroubleshootingGuide.md
Outdated
This is a guide for troubleshooting scalar-terraform environment.

## Accidental deletion of resources
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.
This sentence is a general one. Write the Azure-specific details in each section below.
Sorry for the inconvenience.
Document updated with suggestions.
docs/TroubleshootingGuide.md
Outdated
## Accidental deletion of resources
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.

- Recover from accidental deletion of a node
We don't need the list.
Thank you for your suggestion.
I have updated the document and removed the list.
docs/TroubleshootingGuide.md
Outdated
- Recover from accidental deletion of a node
- Recover a node with existing data disk from accidental deletion of a node

### Recover from accidental deletion of a node
Suggested change:
- ### Recover from accidental deletion of a node
+ ### Recover from accidental deletion of a node in Azure
Thank you for your suggestion.
Document updated with suggestions.
docs/TroubleshootingGuide.md
Outdated
* Delete the os-disk If the node is terminated.
* Follow [Node Replacement](NodeReplacement.md)

### Recover a node with existing data disk from accidental deletion of a node
Suggested change:
- ### Recover a node with existing data disk from accidental deletion of a node
+ ### Recover a node with existing data disk from accidental deletion of a node in Azure
Thank you for your suggestion.
Document updated with suggestions.
docs/TroubleshootingGuide.md
Outdated
- Recover a node with existing data disk from accidental deletion of a node

### Recover from accidental deletion of a node
If you accidentally delete a node that does not have an additional data disk, you can recover it in the following steps. It is mainly applicable for scalardl, envoy, cassy, reaper, monitor and ca nodes.
Suggested change:
- If you accidentally delete a node that does not have an additional data disk, you can recover it in the following steps. It is mainly applicable for scalardl, envoy, cassy, reaper, monitor and ca nodes.
+ If you accidentally delete a node that does not have an additional data disk in Azure, you can recover it in the following steps. It is mainly applicable for scalardl, envoy, cassy, reaper, monitor and ca nodes.
Thank you for your suggestion.
Document updated with suggestions.
docs/TroubleshootingGuide.md
Outdated
* Follow [Node Replacement](NodeReplacement.md)

### Recover a node with existing data disk from accidental deletion of a node
If you accidentally delete a node that contains an additional data disk, you can recover that node with an existing data disk using the following steps.
Suggested change:
- If you accidentally delete a node that contains an additional data disk, you can recover that node with an existing data disk using the following steps.
+ If you accidentally delete a node that contains an additional data disk in Azure, you can recover that node with an existing data disk using the following steps.
Thank you for your suggestion.
Document updated with suggestions.
docs/TroubleshootingGuide.md
Outdated
This is a guide for troubleshooting scalar-terraform environment.

## Accidental deletion of resources
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.
Suggested change:
- When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.
+ When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use.
Thank you for your suggestion.
Document updated with suggestion details.
LGTM