Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDFS-15961. standby namenode failed to start ordered snapshot deletion is enabled while having snapshottable directories #2881

Merged
merged 5 commits into from Apr 27, 2021

Conversation

bshashikant
Copy link
Contributor

…n is enabled while having snapshottable directories.
@bshashikant bshashikant requested a review from smengcl April 9, 2021 06:15
Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @bshashikant .
+1 pending CI

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanx @bshashikant for the fix, Can you extend a test covering the fix as well

@hadoop-yetus

This comment has been minimized.

@bshashikant
Copy link
Contributor Author

Thanx @bshashikant for the fix, Can you extend a test covering the fix as well

Added the test case.

@hadoop-yetus

This comment has been minimized.

@hadoop-yetus

This comment has been minimized.

@ayushtkn
Copy link
Member

Thanx @bshashikant for the fix, this fix seems correct.
I will let @smengcl go through it only, I think I have doubts on HDFS-15614 itself
Really sorry for the noise here

@bshashikant
Copy link
Contributor Author

@smengcl , can you please have a look again?

@ayushtkn
Copy link
Member

ayushtkn commented Apr 14, 2021

I think We should hold this off, I think the code has issues, as I said. Earlier I thought, there is some catch but I don’t think. It is misbehaving only. Ideally such features should go in a branch first....

@bshashikant
Copy link
Contributor Author

bshashikant commented Apr 14, 2021

I think We should hold this off, I think the code has issues, as I said. Earlier I thought, there is some catch but I don’t think. It is misbehaving only. Ideally such features should go in a branch first....

@ayushtkn , IMO its not a misbehaviour here. Once the snapshotTrash feature is enabled, the .Trash directory has to be present to make the feature work. As you can have pre existing snapshottable directories , one way to create the .Trash inside these was to create right after the startup is done. The other solution was to explicitly provision the .Trash using a cmd line option. The choice was made to do it automatically on restart, and fail the Namenode in case any any issues occur.

The other solution is not fail the namenode , but log an warning and later, if any Trash operations gets performed inside the snapshottable root, will fail.

The discussion is here: #2682 (comment)

@bshashikant
Copy link
Contributor Author

I think We should hold this off, I think the code has issues, as I said. Earlier I thought, there is some catch but I don’t think. It is misbehaving only. Ideally such features should go in a branch first....

@ayushtkn , IMO its not a misbehaviour here. Once the snapshotTrash feature is enabled, the .Trash directory has to be present to make the feature work. As you can have pre existing snapshottable directories , one way to create the .Trash inside these was to create right after the startup is done. The other solution was to explicitly provision the .Trash using a cmd line option. The choice was made to do it automatically on restart, and fail the Namenode in case any any issues occur.

The other solution is not fail the namenode , but log an warning and later, if any Trash operations gets performed inside the snapshottable root, will fail.

The discussion is here: #2682 (comment)

Coming to think of it, if providing an external command to create the Trash directory by admins is feasible and makes sense, i think its ok to remove the NN startup logic to create Trash directories inside snapshottable root.

@hadoop-yetus

This comment has been minimized.

@bshashikant
Copy link
Contributor Author

@ayushtkn /@smengcl any further thoughts/reviews?

Copy link
Member

@ayushtkn ayushtkn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanx @bshashikant I just had a cursory look, Won't be able to spend time for a thorough review anytime soon. You can proceed, if others are convinced.

I think @arp7 has also validated HDFS-15614 and was convinced with the changes, so things should be good only, I might have messed up.

Arpit If you can spare some time, do give a check to this one too, just changes from EXIT to LOG in case of failure got changed extra.

@@ -98,6 +98,7 @@ public void setupCluster() throws Exception {
conf.setInt(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, BLOCK_SIZE);
conf.setInt(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 1);
conf.setInt(DFSConfigKeys.DFS_HA_TAILEDITS_PERIOD_KEY, 1);
conf.setBoolean("dfs.namenode.snapshot.trashroot.enabled", false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this config in DFSConfigKyes and use it from there, As of now I think it is defined in FsNamesystem which doesn't makes, The value of this I guess is exposed via getServerDefaults also so no point keeping it in FsN

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is intended to be an hidden config similar to ordered snapshot deletion config. I would prefer to keep it as it is now.

@hadoop-yetus

This comment has been minimized.

@smengcl
Copy link
Contributor

smengcl commented Apr 26, 2021

+1 on the latest change. Let's merge the fix to trunk before dfsadmin -provisionSnapshotTrash -all is implemented.

I have filed HDFS-15997 to implement dfsadmin -provisionSnapshotTrash -all.

@bshashikant
Copy link
Contributor Author

Thanks @ayushtkn and @smengcl for the reviews. Committing this.

@bshashikant
Copy link
Contributor Author

The failures are nit related to the patch.

@bshashikant bshashikant merged commit ef13f8a into apache:trunk Apr 27, 2021
kiran-maturi pushed a commit to kiran-maturi/hadoop that referenced this pull request Nov 24, 2021
…n is enabled while having snapshottable directories (apache#2881)
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
…n is enabled while having snapshottable directories (apache#2881)

Conflicts:
	hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHASafeMode.java
Ref: CDPD-24815

Change-Id: Ibc06963223db50b96d4cbdf64817410db9c8a365
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants