Contributor

@scalar-boney scalar-boney commented Sep 14, 2020

Troubleshoot guide for node replacement

  • Recover from accidental deletion of a node
  • Recover a node with existing data disk from accidental deletion of a node

@@ -0,0 +1,34 @@
# Troubleshoot Guide

This guide explains how to replace a node when the node cannot be replaced with normal procedures. This is especially useful when the node or os-disk is accidentally terminated in the **Azure** environment.
Collaborator

It should be a general troubleshooting guide, so it's weird to see one of the cases in the introductory sentence. It should be just one of the cases.

Contributor Author

Sorry for the inconvenience.
I have modified the document based on the review comment.

scalar-boney and others added 2 commits September 15, 2020 10:02
Co-authored-by: Hiroyuki Yamada <mogwaing@gmail.com>
scalar-boney and others added 2 commits September 16, 2020 14:39
Co-authored-by: Hiroyuki Yamada <mogwaing@gmail.com>
Collaborator

@feeblefakie feeblefakie left a comment

Overall, the doc is too specific to what you did; it is not generalized or formatted as a troubleshooting guide.
I left suggestions on only part of it, but please update the rest along the same lines.
The point is that readers of the doc don't know what you did. They ran into trouble and came here to fix it. Please write with that in mind.


This is a guide for troubleshooting scalar-terraform environment.

Use this Troubleshooting Guide to:
Collaborator

We don't need it.

- Accidental deletion of resources

## Accidental deletion of resources
These troubleshooting steps can be used when the node cannot be replaced with normal procedures. This is especially useful when the node or os-disk is accidentally terminated in the **Azure** environment.
Collaborator

This is not consistent with the title and does not really explain the situation, so it is hard to understand.

Suggested change
These troubleshooting steps can be used when the node cannot be replaced with normal procedures. This is especially useful when the node or os-disk is accidentally terminated in the **Azure** environment.
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases.
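
To make that inconsistency concrete, here is a minimal Python sketch (not part of scalar-terraform) of one way to check whether Terraform's state still matches the real resources. It relies on the documented exit codes of `terraform plan -detailed-exitcode` (0 for an empty diff, 1 for an error, 2 for a non-empty diff). The function names, the labels, and the `workdir` parameter are illustrative assumptions.

```python
import subprocess

# Exit codes of `terraform plan -detailed-exitcode`.
# The labels on the right are this sketch's own names, not Terraform's.
PLAN_EXIT_MEANINGS = {
    0: "in-sync",  # plan succeeded, nothing to change
    1: "error",    # plan itself failed
    2: "drift",    # plan succeeded and wants to change something
}


def classify_plan_exit(code: int) -> str:
    """Map a plan exit code to a short label."""
    return PLAN_EXIT_MEANINGS.get(code, "unknown")


def check_drift(workdir: str) -> str:
    """Run `terraform plan` in an initialized module directory.

    Assumes `terraform` is on PATH and `terraform init` was already run
    in `workdir` (a hypothetical path supplied by the caller).
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

If a node was deleted by hand, a check like this would be expected to report `drift`, because the plan would want to recreate the missing resource. It only detects the mismatch; the recovery steps still depend on the cloud.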

Contributor Author

Thank you for your suggestion.
I have applied the review changes.

- Replace accidentally removed node or os-disk
- Replace accidentally removed cassandra node or os-disk with existing data disk

### Replace accidentally removed node or os-disk
Collaborator

Suggested change
### Replace accidentally removed node or os-disk
### Recover from accidental deletion of a node or an OS disk

Contributor Author

Thank you for your suggestion.
I have applied the review changes.

* Terminate the node If the os-disk is deleted.
* Follow [Node Replacement](NodeReplacement.md)

### Replace accidentally removed cassandra node or os-disk with existing data disk
Collaborator

Suggested change
### Replace accidentally removed cassandra node or os-disk with existing data disk
### Recover from accidental deletion of a Cassandra node ... (not sure how to fix)

What do you mean by "os-disk with existing data disk"?

Contributor Author

Thank you for your suggestion.
I have applied the review changes.

> What do you mean by "os-disk with existing data disk"?

Sorry for the confusion.
I mean recovering an accidentally removed node by reusing its existing data disk.


Please try the following
* Delete the os-disk If the node is terminated.
* Terminate the node If the os-disk is deleted.
Contributor

Is this a very rare case? I think we can't delete only the os-disk from the console. 🤔

Contributor Author

Thank you for your valuable comment.
Yes, it is not possible through the console, because the os-disk cannot be deleted without terminating the VM.

I have modified the document accordingly.

- Recover from accidental deletion of a node
- Recover a node with existing data disk from accidental deletion of a node

### Recover from accidental deletion of a node
Collaborator

Is it needed for AWS as well? Only for Azure?
It must be described.

Contributor Author

> Is it needed for AWS as well?

No.

> Only for Azure?

Yes.

> It must be described.

Document modified accordingly.

When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.

This is a guide for troubleshooting scalar-terraform environment.

## Accidental deletion of resources
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.
Collaborator

This sentence is a general one. Put the Azure-specific details in each section below.

Contributor Author

Sorry for the inconvenience.
Document updated with suggestions.

## Accidental deletion of resources
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.

- Recover from accidental deletion of a node
Collaborator

We don't need the list.

Contributor Author

Thank you for your suggestion.
Document updated without the list.

- Recover from accidental deletion of a node
- Recover a node with existing data disk from accidental deletion of a node

### Recover from accidental deletion of a node
Collaborator

Suggested change
### Recover from accidental deletion of a node
### Recover from accidental deletion of a node in Azure

Contributor Author

Thank you for your suggestion.
Document updated with suggestions.

* Delete the os-disk If the node is terminated.
* Follow [Node Replacement](NodeReplacement.md)
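
The first bullet above (deleting the leftover os-disk once the node is gone) could be checked with something like the following sketch. It assumes the JSON comes from `az disk list --output json` and that each entry carries `name`, `osType`, and `managedBy` fields. Those field names and the sample values are assumptions to verify against your own output, not something this guide specifies.

```python
import json
from typing import List


def orphaned_os_disks(disk_list_json: str) -> List[str]:
    """Return names of OS disks that are not attached to any VM."""
    disks = json.loads(disk_list_json)
    return [
        d["name"]
        for d in disks
        # Assumption: OS disks have a non-empty osType, and managedBy is
        # empty or missing once the owning VM has been deleted.
        if d.get("osType") and not d.get("managedBy")
    ]


# Hypothetical sample data, shaped like the assumed CLI output.
sample = json.dumps([
    {"name": "cassandra-1-osdisk", "osType": "Linux", "managedBy": None},
    {"name": "cassandra-1-data", "osType": None, "managedBy": None},
    {"name": "envoy-1-osdisk", "osType": "Linux", "managedBy": "vm-id-placeholder"},
])
print(orphaned_os_disks(sample))  # prints ['cassandra-1-osdisk']
```

The sketch only lists candidates. Deleting the disk and then following the Node Replacement procedure remain manual steps, and the data disk is deliberately left out of the result.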

### Recover a node with existing data disk from accidental deletion of a node
Collaborator

Suggested change
### Recover a node with existing data disk from accidental deletion of a node
### Recover a node with existing data disk from accidental deletion of a node in Azure

Contributor Author

Thank you for your suggestion.
Document updated with suggestions.

- Recover a node with existing data disk from accidental deletion of a node

### Recover from accidental deletion of a node
If you accidentally delete a node that does not have an additional data disk, you can recover it in the following steps. It is mainly applicable for scalardl, envoy, cassy, reaper, monitor and ca nodes.
Collaborator

Suggested change
If you accidentally delete a node that does not have an additional data disk, you can recover it in the following steps. It is mainly applicable for scalardl, envoy, cassy, reaper, monitor and ca nodes.
If you accidentally delete a node that does not have an additional data disk in Azure, you can recover it in the following steps. It is mainly applicable for scalardl, envoy, cassy, reaper, monitor and ca nodes.

Contributor Author

Thank you for your suggestion.
Document updated with suggestions.

* Follow [Node Replacement](NodeReplacement.md)

### Recover a node with existing data disk from accidental deletion of a node
If you accidentally delete a node that contains an additional data disk, you can recover that node with an existing data disk using the following steps.
Collaborator

Suggested change
If you accidentally delete a node that contains an additional data disk, you can recover that node with an existing data disk using the following steps.
If you accidentally delete a node that contains an additional data disk in Azure, you can recover that node with an existing data disk using the following steps.

Contributor Author

Thank you for your suggestion.
Document updated with suggestions.

This is a guide for troubleshooting scalar-terraform environment.

## Accidental deletion of resources
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.
Collaborator

Suggested change
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use. The following explains how to recover from such cases in the AZURE scalar-terraform environment.
When you accidentally delete a resource manually without terraform, it causes some inconsistencies between the actual state of resources and the state that terraform knows. Thus, you might need to take some extra actions to recover from such situations depending on the Cloud you use.

Contributor Author

Thank you for your suggestion.
Document updated with suggestion details.

Collaborator

@feeblefakie feeblefakie left a comment

LGTM

@feeblefakie feeblefakie merged commit eaf41b7 into master Sep 25, 2020
@feeblefakie feeblefakie deleted the troubleshoot-guide branch September 25, 2020 00:42