Skip to content

Conversation

@ddanielr
Copy link
Contributor

@ddanielr ddanielr commented Sep 7, 2023

Original Ticket Contents

This change allows easy logging targeting by putting all file deletion actions into a separate logger.

  • Adds a separate file logger to GCRun to allow for easy filtering of file deletion log messages.
  • Fixes GCRun logger name improperly created #3744 by changing how the standard logger is created.
  • Moves datalevel to the end of the logger name.

This approach was chosen because log4j2 regexFilters only work against the message body.
That solution would work as long as log messages were written in a specific format with no validation.
By using a completely separate logger, the contents of the log message don't matter.
Instead, as long as the correct logger is used, the contents can be targeted fairly easily.

This logger is created under the existing GCRun prefix, so any logging targeting the GCRun class should still work.

This change allows easy logging targeting by grouping all file deletion actions under a common prefix.
Adding a message prefix is a much smaller scoped change that is better suited to a 2.1.x release.

  • Adds a message prefix to specific log messages in GCRun to allow for easy targeting.
  • Fixes GCRun logger name improperly created #3744 by changing how the standard logger is created.
  • Moves datalevel to the end of the logger name.

How to Test

I added the following changes to the test/src/main/resources/log4j2-test.properties file.

Original Test Contents

These changes setup a separate appender, then create 3 specific loggers to pull those file deletion messages together.

appender.files.type = RollingFile
appender.files.name = FileActionsLogFiles
appender.files.fileName = ./target/test-gc-file-actions.log
appender.files.filePattern = ${filesFilename}-%d{MM-dd-yy-HH}-%i.log.gz
appender.files.layout.type = PatternLayout
appender.files.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
appender.files.policies.type = Policies
appender.files.policies.time.type = TimeBasedTriggeringPolicy
appender.files.policies.time.interval = 1
appender.files.policies.time.modulate = true
appender.files.policies.size.type = SizeBasedTriggeringPolicy
appender.files.policies.size.size=100MB
appender.files.strategy.type = DefaultRolloverStrategy
appender.files.strategy.max = 10


logger.gcFiles1.name = org.apache.accumulo.gc.GCRunROOT.files
logger.gcFiles1.level = debug
logger.gcFiles1.additivity = false
logger.gcFiles1.appenderRef.files.ref = FileActionsLogFiles

logger.gcFiles2.name = org.apache.accumulo.gc.GCRunMETADATA.files
logger.gcFiles2.level = debug
logger.gcFiles2.additivity = false
logger.gcFiles2.appenderRef.files.ref = FileActionsLogFiles

logger.gcFiles3.name = org.apache.accumulo.gc.GCRunUSER.files
logger.gcFiles3.level = debug
logger.gcFiles3.additivity = false
logger.gcFiles3.appenderRef.files.ref = FileActionsLogFiles

These changes create a new appender with a regex filter set to deny message that don't match "FILE-ACTIONS:"

appender.files.type = RollingFile
appender.files.name = FileActionsLogFiles
appender.files.fileName = ./target/test-gc-file-actions.log
appender.files.filePattern = ${filesFilename}-%d{MM-dd-yy-HH}-%i.log.gz
appender.files.layout.type = PatternLayout
appender.files.layout.pattern = %d{ISO8601} [%c{2}] %-5p: %m%n
appender.files.policies.type = Policies
appender.files.policies.time.type = TimeBasedTriggeringPolicy
appender.files.policies.time.interval = 1
appender.files.policies.time.modulate = true
appender.files.policies.size.type = SizeBasedTriggeringPolicy
appender.files.policies.size.size=100MB
appender.files.filter.1.type = RegexFilter
appender.files.filter.1.regex = .*(FILE-ACTION:).*
appender.files.filter.1.onMatch = ACCEPT
appender.files.filter.1.onMismatch = DENY
appender.files.strategy.type = DefaultRolloverStrategy
appender.files.strategy.max = 10

Then the appender is added to the rootLogger

rootLogger.level = debug
rootLogger.appenderRef.console.ref = STDOUT
rootLogger.appenderRef.files.ref = FileActionsLogFiles

When running the org.apache.accumulo.test.functional.GarbageCollectorIT, from the root of the repo directory, the file test/target/test-gc-actions.log is created and contains the file deletion messages from the GC.

:2023-09-07T20:51:42,841 [gc.GCRunROOT] DEBUG: FILE-ACTION: Deleting file:/data/workspace/accumulo/test/target/mini-tests/org.apache.accumulo.test.functional.GarbageCollectorIT_gcTest/accumulo/tables/+r/root_tablet/F000005y.rf
15:2023-09-07T20:51:42,913 [gc.GCRunMETADATA] DEBUG: FILE-ACTION: Deleting file:/data/workspace/accumulo/test/target/mini-tests/org.apache.accumulo.test.functional.GarbageCollectorIT_gcTest/accumulo/tables/!0/table_info/0_1.rf
16:2023-09-07T20:51:42,915 [gc.GCRunMETADATA] DEBUG: FILE-ACTION: Deleting file:/data/workspace/accumulo/test/target/mini-tests/org.apache.accumulo.test.functional.GarbageCollectorIT_gcTest/accumulo/tables/!0/table_info/A000000e.rf

The file deletions are also still present in the main SimpleGarbageCollector_<id>.out logs

Changes the logger name and removes the space and "class" info from the
name.

Moves the datalevel to the end of the logger name to ensure it shows up
in log messages.
Adds a separate logger for file deletion operations in the Garbage
Collector.
ddanielr and others added 2 commits September 7, 2023 13:12
Co-authored-by: EdColeman <dev1@etcoleman.com>
Adds a message prefix instead which can be picked up by a regexFilter.
@ddanielr ddanielr changed the title Add Separate Logger for File deletion actions in GCRun Add Message Prefix for File deletion actions in GCRun Sep 7, 2023
@ddanielr
Copy link
Contributor Author

ddanielr commented Sep 7, 2023

I did some more work and decided to change my direction and switch to adding the prefix in the message body.

The additional logger work is still under commit d083506

public class GCRun implements GarbageCollectionEnvironment {
// loggers are not static to support unique naming by level
private final Logger log;
private static String fileActionPrefix = "FILE-ACTION:";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why we need this variable. You could just add "FILE-ACTION" at the beginning of each log message. The only reason to make it a variable is in the case that you think it would change in a future version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My opinion: having it makes it explicit and also easier to track down with a show usages if ever need. And if it was me, making sure its spelt the same everywhere ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EdColeman you and I had the exact same thought on this.
I had created it as a variable to ensure that misspelling did not occur between log messages since this would be utilized by a regex.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's used in so many places, it probably makes sense to keep it as a variable. But, it should be final:

Suggested change
private static String fileActionPrefix = "FILE-ACTION:";
private static final String fileActionPrefix = "FILE-ACTION:";

Also, I'm not sure if "FILE-ACTION:" is a great prefix, or if we need it at all... isn't the Logger name distinct enough to filter on?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a desire to filter GC logs further and only select log messages related to the deletion of a file.
This prefix standardizes the log messages related to file deletions and allows them to be easily selected.

I used FILE-ACTION as an attempt to make it generic as opposed to sometime tightly linked to DELETE.
If you have a different suggestion I'm open to it.

@EdColeman
Copy link
Contributor

Using thread context / mdc might be something that we would want to explore in general, For a 2.1 release, it may not be the best time to introduce it.

If I recall, there are subtleties with clearing the mdc stack (like lock / unlock semantics?) and sometimes using local variables? Again, not sure, and I think I was considering what it would look like for scans / bulk imports where transaction id or other "per user / thread" could provide additional insight, so might not apply here? It been awhile since I last looked.

@ddanielr
Copy link
Contributor Author

slf4j supports passing the MDC implementation to the underlying logging system, but has no direct support for MDC.
If the user's underlying logging system is log4j2 or logback, then this would work1.

If the log layout format has to be modified, is there a possibility that user established 2.1 log parsing may break between 2.1.x versions?

References:
1. https://www.slf4j.org/api/org/slf4j/MDC.html

With the use of a message prefix, the datalevel does not need to be
appended directly to the GCRun class name.

This was only done to ensure the current precision specifier did not
hide relevant information from the logs.
public class GCRun implements GarbageCollectionEnvironment {
// loggers are not static to support unique naming by level
private final Logger log;
private static String fileActionPrefix = "FILE-ACTION:";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it's used in so many places, it probably makes sense to keep it as a variable. But, it should be final:

Suggested change
private static String fileActionPrefix = "FILE-ACTION:";
private static final String fileActionPrefix = "FILE-ACTION:";

Also, I'm not sure if "FILE-ACTION:" is a great prefix, or if we need it at all... isn't the Logger name distinct enough to filter on?

ddanielr and others added 3 commits September 15, 2023 21:18
Remove redundant `e.getMessage()` in log message

Co-authored-by: Christopher Tubbs <ctubbsii@apache.org>
Co-authored-by: Christopher Tubbs <ctubbsii@apache.org>
@ctubbsii ctubbsii linked an issue Sep 16, 2023 that may be closed by this pull request
@ddanielr ddanielr merged commit 75140d8 into apache:2.1 Sep 18, 2023
@ddanielr ddanielr deleted the 3744-fix branch September 18, 2023 18:13
@ctubbsii ctubbsii modified the milestones: 3.1.0, 2.1.3 Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GCRun logger name improperly created

4 participants