HDDS-9702. Improve logging when Recon gets a full update from OM #5612

Merged
dombizita merged 1 commit into apache:master from devmadhuu:HDDS-9702 on Nov 28, 2023


@devmadhuu devmadhuu commented Nov 16, 2023

What changes were proposed in this pull request?

When Recon fails to get incremental (delta) updates from OM, it falls back to a full OM DB snapshot update. In that case we don't need the full stack trace shown below; a brief message with the important details is enough.

2023-10-18 23:09:01,152 WARN [pool-28-thread-1]-org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl: Unable to get and apply delta updates from OM.
INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: Unable to read full data from RocksDB wal to get delta updates. It may have partially been flushed to SSTs. Requested sequence number is 69523 and first available sequence number is 224239 in wal.
        at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:728)
        at org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.getDBUpdates(OzoneManagerProtocolClientSideTranslatorPB.java:2091)
        at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.innerGetAndApplyDeltaUpdatesFromOM(OzoneManagerServiceProviderImpl.java:450)
        at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.getAndApplyDeltaUpdatesFromOM(OzoneManagerServiceProviderImpl.java:415)
        at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.syncDataFromOM(OzoneManagerServiceProviderImpl.java:502)
        at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.lambda$startSyncDataFromOM$0(OzoneManagerServiceProviderImpl.java:258)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
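
The difference between the two logging styles can be sketched roughly as follows. This is a minimal illustration that uses java.util.logging so it runs without extra dependencies; Recon's own logging setup differs, and the class and helper names (`BriefWarnExample`, `briefMessage`) are hypothetical, not taken from the patch.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical example class; not part of the Recon code base.
class BriefWarnExample {
  private static final Logger LOG =
      Logger.getLogger(BriefWarnExample.class.getName());

  // Hypothetical helper: one-line summary of the failure, no stack trace.
  static String briefMessage(String context, Exception e) {
    return context + " Cause: " + e.getClass().getSimpleName()
        + ": " + e.getMessage();
  }

  // Passing the exception as the last argument prints the full stack trace.
  static void logVerbose(Exception e) {
    LOG.log(Level.WARNING, "Unable to get and apply delta updates from OM.", e);
  }

  // Passing only a string keeps the log entry to a single message.
  static void logBrief(Exception e) {
    LOG.warning(briefMessage("Unable to get and apply delta updates from OM.", e));
  }

  public static void main(String[] args) {
    Exception e = new IllegalStateException(
        "Requested sequence number is 69523 and first available"
            + " sequence number is 224239 in wal.");
    logVerbose(e); // message followed by the stack trace
    logBrief(e);   // message only
  }
}
```

The brief form still carries the exception type and message, which is usually enough to tell that a fallback to a full snapshot is expected.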

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9702

How was this patch tested?

This patch was tested manually with a JUnit test case, as well as by forcing a full OM DB snapshot update from OM.

@devmadhuu
Contributor Author

@dombizita @sumitagrawl Pls review.

@dombizita dombizita changed the title from "HDDS-9702. Recon - Recon - Improve logging during when Recon gets full updates from OM" to "HDDS-9702. Recon - Improve logging during when Recon gets full updates from OM" on Nov 17, 2023

@ArafatKhan2198 ArafatKhan2198 left a comment

Can we make the same change here as well, printing the error message rather than the whole stack trace?

LOG.error("Unable to update Recon's metadata with new OM DB. ", e);

} catch (Exception e) {
metrics.incrNumDeltaRequestsFailed();
LOG.warn("Unable to get and apply delta updates from OM.", e);
LOG.warn("Unable to get and apply delta updates from OM.",
Contributor

I have a question!
Would it be a good idea to include this.getClass().getName() here, since the stack trace will no longer be printed?

Contributor Author

> I have a question! Would it be a good idea to include this.getClass().getName() here, since the stack trace will no longer be printed?

Whenever a message is logged, log4j prints the name of the class it was logged from, so I believe that is not needed.
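
As a rough illustration of why the logger name already identifies the source: loggers are conventionally named after the class that creates them, and the output layout can print that name with each message (in log4j's PatternLayout this is the `%c` conversion). The sketch below mimics this with java.util.logging so it runs standalone; `LoggerNameExample`, `NameFormatter`, and `render` are hypothetical names, and the format is an assumption, not Recon's actual log configuration.

```java
import java.util.logging.Formatter;
import java.util.logging.Level;
import java.util.logging.LogRecord;

// Hypothetical example class; not part of the Recon code base.
class LoggerNameExample {

  // Formats a record as "<LEVEL> <logger name>: <message>".
  static class NameFormatter extends Formatter {
    @Override
    public String format(LogRecord record) {
      return record.getLevel() + " " + record.getLoggerName() + ": "
          + formatMessage(record) + System.lineSeparator();
    }
  }

  // Builds a WARNING record for the given logger name and renders it.
  static String render(String loggerName, String message) {
    LogRecord record = new LogRecord(Level.WARNING, message);
    record.setLoggerName(loggerName);
    return new NameFormatter().format(record);
  }

  public static void main(String[] args) {
    System.out.print(render(
        LoggerNameExample.class.getName(),
        "Unable to get and apply delta updates from OM."));
  }
}
```

With a layout like this, the class name appears on every line even when no stack trace is logged, which matches the `WARN [pool-28-thread-1]-org.apache...OzoneManagerServiceProviderImpl:` prefix in the log excerpt from the description.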

@devmadhuu
Contributor Author

> Can we make the same change here as well, printing the error message rather than the whole stack trace?
>
> LOG.error("Unable to update Recon's metadata with new OM DB. ", e);

No, we don't want to hide the details of a full snapshot error, so I left that as it is.


@sumitagrawl sumitagrawl left a comment

@devmadhuu LGTM


@dombizita dombizita left a comment

Thanks for improving this @devmadhuu, it looks good to me!

@dombizita dombizita changed the title from "HDDS-9702. Recon - Improve logging during when Recon gets full updates from OM" to "HDDS-9702. Improve logging when Recon gets a full update from OM" on Nov 28, 2023
@dombizita dombizita merged commit 3aca295 into apache:master Nov 28, 2023