-
Notifications
You must be signed in to change notification settings - Fork 491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-8534. Enable asynchronous service logging #4663
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of questions:
- Service logs are not a lot because we don't expect to log per request unless there's an unexpected issue. Audit logs are where we want to start with imho.
- I may not have done enough homework, but I'm wondering if the custom appender to wire rolling file appender to AsyncAppender is too common to be implemented by a lib user like us. Can't it be achieved just by log4j config?
If it's necessary, should it be in some other places thantools
.
Thank you @duongkame for the review.
I think we will need to reevaluate if going by the standard AsyncAppender would suffice and be a simpler implementation when compared to the custom AsyncRFAAppender if we choose to make the service logs asynchronous. |
@jojochuang, @smengcl could you please take a look at this as well, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tanvipenumudy . This change by itself looks fine to me.
--
Hadoop has been using Async log4j appender for NN/DN metrics logger (as least in the tests) as well:
AsyncRFAAppender
class looks the same as the Hadoop one but with some comments removed:
Hadoop's use of log4j for audit logging is archaic. We recently made a change to make it easier to migrate to log4j2. Basically, use log4j.properties to control AsyncAppender. So, keep log4j to log4j2 migration in mind. You don't want to repeat what we went through in Hadoop. |
Thanks @tanvipenumudy for working on this. Findbugs is failing, please check.
https://github.com/tanvipenumudy/ozone/actions/runs/4891447776/jobs/8731974658#step:6:2283 |
Could you please take a final look at the change, @adoroszlai, @duongkame, @jojochuang, thanks! |
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/utils/AsyncRFAAppender.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/utils/AsyncRFAAppender.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/utils/AsyncRFAAppender.java
Outdated
Show resolved
Hide resolved
hadoop-ozone/tools/src/main/java/org/apache/hadoop/ozone/utils/AsyncRFAAppender.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tanvipenumudy for updating the patch.
Thanks @tanvipenumudy for the patch, @duongkame, @jojochuang, @smengcl for the reviews. |
* master: (96 commits) HDDS-8586 Recon. - API for Count of deletePending keys and amount of data mapped to such keys. (apache#4923) HDDS-8908. Intermittent failure in TestBlockDeletion#testBlockDeletion (apache#4958) HDDS-8910. Replace LockManager with striped lock in ContainerStateManager (apache#4962) HDDS-8917. Move protobuf conversion out of the lock in PipelineStateManagerImpl (apache#4965) HDDS-8825. Use apache/hadoop 3.3.5 docker image (apache#4963) HDDS-8906. Avoid stream when getting in-service healthy nodes (apache#4960) HDDS-8907. Store volume count when storage report is updated (apache#4957) HDDS-8905. PipelineManager metrics should not be synchronized (apache#4959) HDDS-8553. Improve scanner integration tests. (apache#4936) HDDS-8854. Avoid unnecessary DatanodeDetails creation for NodeStateManager lookup (apache#4925) HDDS-8315. [Snapshot] Added unit tests for SnapshotDiffManager (apache#4716) HDDS-7968. [Snapshot] Improve KeyDeletingService to reclaim eligible key blocks in snapshot's deletedTable (apache#4935) HDDS-8838. Update default datanode check empty containter on disk to false (apache#4937) HDDS-8763. Support RocksDB iterator with ByteBuffer. (apache#4942) HDDS-8543. FSO directory should reflect bucket/cluster default replication (apache#4947) HDDS-8898. Replication limit should not be less than reconstruction weight (apache#4954) HDDS-8739. Snapdiff should return complete absolute path in Diff Entry (apache#4823) HDDS-8908. Mark TestBlockDeletion#testBlockDeletion as flaky HDDS-8534. Support asynchronous service logging (apache#4663) HDDS-8879. Cleanup SecurityConfig and related class initialization (apache#4921) ...
* tmp-dir-refactor: (99 commits) HDDS-8586 Recon. - API for Count of deletePending keys and amount of data mapped to such keys. (apache#4923) Fix SCM HA finalization compat test HDDS-8908. Intermittent failure in TestBlockDeletion#testBlockDeletion (apache#4958) HDDS-8910. Replace LockManager with striped lock in ContainerStateManager (apache#4962) HDDS-8917. Move protobuf conversion out of the lock in PipelineStateManagerImpl (apache#4965) HDDS-8825. Use apache/hadoop 3.3.5 docker image (apache#4963) HDDS-8906. Avoid stream when getting in-service healthy nodes (apache#4960) HDDS-8907. Store volume count when storage report is updated (apache#4957) HDDS-8905. PipelineManager metrics should not be synchronized (apache#4959) HDDS-8553. Improve scanner integration tests. (apache#4936) HDDS-8854. Avoid unnecessary DatanodeDetails creation for NodeStateManager lookup (apache#4925) HDDS-8315. [Snapshot] Added unit tests for SnapshotDiffManager (apache#4716) HDDS-7968. [Snapshot] Improve KeyDeletingService to reclaim eligible key blocks in snapshot's deletedTable (apache#4935) HDDS-8838. Update default datanode check empty containter on disk to false (apache#4937) HDDS-8763. Support RocksDB iterator with ByteBuffer. (apache#4942) HDDS-8543. FSO directory should reflect bucket/cluster default replication (apache#4947) HDDS-8898. Replication limit should not be less than reconstruction weight (apache#4954) HDDS-8739. Snapdiff should return complete absolute path in Diff Entry (apache#4823) HDDS-8908. Mark TestBlockDeletion#testBlockDeletion as flaky HDDS-8534. Support asynchronous service logging (apache#4663) ...
What changes were proposed in this pull request?
Currently, asynchronous service logging is not supported in Ozone. However, we do support asynchronous audit logging.
Enabling asynchronous service logging can improve performance by allowing the logging thread to continue executing without waiting for the actual logging operation to complete, reducing the impact of logging on the overall system performance.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8534
How was this patch tested?
NA