HDFS-16209. Add description for dfs.namenode.caching.enabled #3378
@tasanuma @jojochuang @Hexiaoqiao @ferhui Please help review the change. Thanks a lot.
Hi @ayushtkn, could you please also take a look? Thank you.
HDFS-13820 added this configuration so the feature could be disabled, but it was still set to true by default, I guess for compatibility reasons.
Is something like this possible?
Thanks @ayushtkn for your comments. I have also seen HDFS-13820, but that behavior (auto-enable or auto-disable) is not currently implemented. New users may not know that this feature (Centralized Cache Management) exists, yet it already runs quietly in the background and incurs performance overhead. IMO, if we need this feature, it makes sense to turn it on explicitly and specify the path. What do you think?
💔 -1 overall
This message was automatically generated.
As @ayushtkn said, HDFS-13820 faced the same problem and added the ability to disable the feature, so you can also set it to false.
Thanks @ferhui for your comments. Maybe we can add a release note for this change. New users may not know that this feature (Centralized Cache Management) exists, yet it already runs quietly in the background. I don't think that is a very elegant approach.
Thanks for the reference and the find, @tomscut! However, I believe that, for now, we should log one prominent warning at an appropriate place stating that "please disable this config unless you are using Cache feature and we are going to disable this config by default in 4.0.0 and above releases". We might also want to reference this Jira for the perf degradation case. Thoughts? Overall, perhaps we should wait at least one more major release before disabling this by default, rather than making an incompatible change in the 3.x releases.
@ayushtkn @ferhui @virajjasani Thank you very much for your comments and suggestions. I think what you are saying is reasonable: we should not change the default value of this parameter. But we can add a note, as @virajjasani said. I have changed the parameter description; please have a look. Thanks again.
@tomscut @virajjasani Thanks. I think adding a description here is a good approach.
Thanks @ferhui for your reply. I changed the title of JIRA and PR.
Thanks @tomscut, just one more thing. If you could add a
Thanks @virajjasani for your suggestion, I added the todo description for it.
+1 (non-binding)
+1
LGTM
@tomscut Thanks for the contribution. @ayushtkn @virajjasani @aajisaka @tasanuma Thanks for the review! Merged to trunk.
Thanks @ferhui @ayushtkn @tasanuma @virajjasani @aajisaka for your review and merge.
JIRA: HDFS-16209
Namenode config:
dfs.namenode.write-lock-reporting-threshold-ms=50ms
dfs.namenode.caching.enabled=true (default)
The caching feature is not used in our cluster, but the switch is turned on by default (dfs.namenode.caching.enabled=true), which incurs additional write lock overhead.
We counted the write lock warnings in a log file and found that rescan cache warnings account for about 32% of them, which greatly affects the performance of the Namenode.
We should set 'dfs.namenode.caching.enabled' to false by default and turn it on only when we want to use it.
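For clusters that do not use Centralized Cache Management, the override mentioned in the discussion can be sketched as the hdfs-site.xml fragment below. This is a minimal example rather than the change made in this PR; it assumes the property is set in the Namenode's hdfs-site.xml and that the Namenode picks it up on restart.

```xml
<!-- hdfs-site.xml on the Namenode (placement and restart requirement are assumptions) -->
<property>
  <name>dfs.namenode.caching.enabled</name>
  <!-- Default is true; set to false only if the caching feature is not in use. -->
  <value>false</value>
</property>
```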