Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sets compaction coordinator status update log to trace #4489

Merged
merged 3 commits into from
May 1, 2024

Conversation

keith-turner
Copy link
Contributor

The log message set to trace in this PR was consuming 20% to 30% of the logs for the compaction coordinator.

@EdColeman
Copy link
Contributor

The change itself looks fine.

How fast were the messages generated? Wondering if this is another case where it would help to somehow track and generate messages at a lower rate (interval or every N).

The tipping point would be, without seeing these messages, can the compaction progress still be seen while its running to show that its likely working and making progress? Basically issue #4485 seems to be calling for better status / logging of compaction progress - so with this at trace, would it cause the same type of hurdle to troubleshoot?

(This PR will not address the other issue, just pointing it out for consideration.)

Copy link
Contributor

@EdColeman EdColeman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - as long as there are other messages / ways that report compaction progress being generated. No issues with the code change itself - more trying to make sure we keep reporting some progress in sight.

Copy link
Contributor

@ddanielr ddanielr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine as the compaction progress is shown in the manager and in the compaction-coordinator.
plus we also export the information as metric data.

Copy link
Contributor

@dlmarion dlmarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this status update information is useful. It provides insight that we never had with internal compactions. I would rather see a different logger being used, that gets directed to a different file. These log messages should go in that fiile:

Compaction start:
https://github.com/apache/accumulo/blob/2.1/server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/CompactionCoordinator.java#L486

Compaction status update:
https://github.com/apache/accumulo/blob/2.1/server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/CompactionCoordinator.java#L594

Compaction completion:
https://github.com/apache/accumulo/blob/2.1/server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/CompactionCoordinator.java#L546

Compaction failure:
https://github.com/apache/accumulo/blob/2.1/server/compaction-coordinator/src/main/java/org/apache/accumulo/coordinator/CompactionCoordinator.java#L565

If you think that the updates sent to the Coordinator from the Compactors is too frequent, then we should change the wait time calculation in the loop where the status is sent:

https://github.com/apache/accumulo/blob/2.1/server/compactor/src/main/java/org/apache/accumulo/compactor/Compactor.java#L680

@ctubbsii
Copy link
Member

@dlmarion Would the extra insights be more suitable as metrics, or are these logs specifically useful because they contain something that isn't possible to communicate via metrics? I like the idea of a dedicated compaction status logger that could be configured to emit to a separate place, if one wished to do so. Maybe all that's needed here is to use a dedicated logger instead of the same one as the rest of the coordinator?

@keith-turner
Copy link
Contributor Author

I think this status update information is useful. It provides insight that we never had with internal compactions. I would rather see a different logger being used, that gets directed to a different file.

A named logger would be nice. That strategy could also be useful for something else I wanted to do. I wanted to add some more detailed trace logging for when the coordinator collects info from tservers. Having a named logger for that would be nice because then could turn on just that trace logging.

@dlmarion
Copy link
Contributor

@ctubbsii - I don't think these log messages make sense as metrics because they are information about a specific transient event. The event is bound by the start and stop time, and you would need to add a tag to the metric with the external compaction id to find it. This would cause an issue because this would generate a ton of unique metrics.

I think it's sufficient to add a different logger in the CompactionCoordinator for these messages, such as:

  private static final Logger LOG = LoggerFactory.getLogger("CompactionStatusLogger");

Copy link
Contributor

@dlmarion dlmarion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Looking again at the Compactor, it calls updateCompactionStatus with STARTED, SUCCEEDED, IN_PROGRESS, CANCELED, and FAILED. I wonder if only that method needs the different logger.

@keith-turner
Copy link
Contributor Author

Looking again at the Compactor, it calls updateCompactionStatus with STARTED, SUCCEEDED, IN_PROGRESS, CANCELED, and FAILED. I wonder if only that method needs the different logger.

@dlmarion made that change in 4aaaa18

@keith-turner keith-turner merged commit 2ce89e0 into apache:2.1 May 1, 2024
8 checks passed
@keith-turner keith-turner deleted the adjust_compactor_logging branch May 22, 2024 15:02
@ctubbsii ctubbsii modified the milestones: 3.1.0, 2.1.3 Jul 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants