[remove datanode] Refuse to remove when there are any other unknown or readonly DataNodes in the consensus group #14145

Merged
OneSizeFitsQuorum merged 2 commits into apache:master from HxpSerein:unknown_remove
Nov 21, 2024

@HxpSerein
Collaborator

Description

  1. When a consensus group contains unknown or readonly DataNodes other than the node being removed, the current remove peer operation waits, for safety reasons, until the unknown node comes back online. To avoid this wait, we can check for such nodes beforehand and refuse the removal.
  2. In the future, the two stages of region migration, add and remove, may be exposed to users as separate instructions. For the remove operation, we could then consider adding an unsafe instruction that forces deletion, to avoid getting stuck.
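
The pre-check in item 1 could look roughly like the sketch below. This is not the code from this PR: `NodeStatus`, `RemovePrecheckSketch`, and `findBlockingPeers` are hypothetical names chosen for illustration, and the consensus group is modeled as a plain map from DataNode id to status.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical status values, named after the states mentioned in the description.
enum NodeStatus { RUNNING, UNKNOWN, READ_ONLY, REMOVING }

final class RemovePrecheckSketch {

  /**
   * Returns the ids of DataNodes in the consensus group, other than the node
   * being removed, whose status is UNKNOWN or READ_ONLY.
   * An empty result means the removal may proceed.
   */
  static List<Integer> findBlockingPeers(Map<Integer, NodeStatus> groupMembers, int removedNodeId) {
    List<Integer> blocking = new ArrayList<>();
    for (Map.Entry<Integer, NodeStatus> entry : groupMembers.entrySet()) {
      if (entry.getKey() == removedNodeId) {
        continue; // the status of the node being removed is not checked here
      }
      NodeStatus status = entry.getValue();
      if (status == NodeStatus.UNKNOWN || status == NodeStatus.READ_ONLY) {
        blocking.add(entry.getKey());
      }
    }
    return blocking;
  }

  public static void main(String[] args) {
    // Assumed example group: DataNode 2 is unknown, DataNode 3 is the removal target.
    Map<Integer, NodeStatus> group = new LinkedHashMap<>();
    group.put(1, NodeStatus.RUNNING);
    group.put(2, NodeStatus.UNKNOWN);
    group.put(3, NodeStatus.RUNNING);

    List<Integer> blocking = findBlockingPeers(group, 3);
    if (blocking.isEmpty()) {
      System.out.println("Pre-check passed, removal can be submitted.");
    } else {
      // Refuse up front instead of waiting for the unknown node to come online.
      System.out.println("Refuse to remove DataNode 3, unknown or readonly peers: " + blocking);
    }
  }
}
```

Failing fast here means the user gets an immediate rejection naming the problematic peers, rather than a remove peer operation that blocks until those peers recover.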

Collaborator

@liyuheng55555 liyuheng55555 left a comment

LGTM

Contributor

@OneSizeFitsQuorum OneSizeFitsQuorum left a comment

LGTM

@OneSizeFitsQuorum OneSizeFitsQuorum merged commit adcedc0 into apache:master Nov 21, 2024
HxpSerein added a commit to HxpSerein/iotdb that referenced this pull request Dec 3, 2024
…r readonly DataNodes in the consensus group (apache#14145)

* add uknown check

* remove useless check

(cherry picked from commit adcedc0)
OneSizeFitsQuorum added a commit that referenced this pull request Dec 6, 2024
…14301)

* Use CountDownLatch to replace Semaphore in IoTConsensus log dispatcher closing #13517

(cherry picked from commit 5599859)

* Remove datanode optimization (#13559)

(cherry picked from commit cc73946)

* Split CnToDnRequestType to sync and async & Add check for adding new request type (#13660)

(cherry picked from commit 8187498)

* Enhance Remove DataNode Test (#13809)

(cherry picked from commit 063fddd)

* [remove datanode] Identify and display invalid nodes before removing #13987

(cherry picked from commit 387b4bf)

* [remove datanode] Fixed jvm heap memory in remove datanode script (#13983)

(cherry picked from commit 78e6f3a)

* [remove datanode] Enhance remove message on environment with only ConfigNode #14007

(cherry picked from commit db7c522)

* Remove useless log "datanodeId -1" #14035

(cherry picked from commit 8a92aaf)

* [remove datanode] Log the node information when submitting RegionMigrateProcedure (#14051)

(cherry picked from commit e437ef3)

* [remove datanode] Enhance remove message on environment with only ConfigNode #14123

(cherry picked from commit f3afc20)

* [remove datanode] Refuse to remove when there are any other unknown or readonly DataNodes in the consensus group (#14145)

* add uknown check

* remove useless check

(cherry picked from commit adcedc0)

* Fix stopping cn leader stuck when region region migration #14175

Signed-off-by: OneSizeFitQuorum <tanxinyu@apache.org>
(cherry picked from commit 9b22f7c)

* [remove datanode] Do not allow regions to inherit the Removing state from datanode (#14185)

(cherry picked from commit 8a6405c)

* [remove datanode] Do not disable the entire region group for one removing region (#14241)

* fix disabled

* fix disabled

(cherry picked from commit 041d292)

* [remove datanode] Not re-submit region migration procedure when leader change or reboot #14277

(cherry picked from commit d101d76)

* Fix ConfignNode LoadManager NPE when removing datanodes #14016

Signed-off-by: OneSizeFitQuorum <tanxinyu@apache.org>
(cherry picked from commit b39c325)

* Fix ConfigNode Partition Metric NPE bug #14144

Signed-off-by: OneSizeFitQuorum <tanxinyu@apache.org>
(cherry picked from commit 4a76dfb)

* Enhance procedure recover policy

(cherry picked from commit 496c62e)

* remove table

* fix IT

---------

Co-authored-by: Li Yu Heng <liyuheng55555@126.com>
Co-authored-by: Potato <tanxinyu@apache.org>
@HxpSerein HxpSerein deleted the unknown_remove branch December 6, 2024 05:03
Caideyipi pushed a commit to Caideyipi/iotdb that referenced this pull request Mar 25, 2026
…pache#14301)
