always sets needs recovery flag in tablet mgmt iterator by keith-turner · Pull Request #4255 · apache/accumulo

keith-turner · 2024-02-12T16:33:26Z

This commit fixes #4251. The WalSunnyDayIT was failing because the root tablet needed recovery, but the tablet mgmt iterator was not indicating this becaue the manager was in safe mode. Therefore recovery never happened and the test timed out.

This commit fixes apache#4251. The WalSunnyDayIT was failing because the root tablet needed recovery, but the tablet mgmt iterator was not indicating this becaue the manager was in safe mode. Therefore recovery never happened and the test timed out.

keith-turner · 2024-02-12T16:34:22Z

This commit also changes the tablet mgmt iterator to always indicate if volume recovery is needed.

keith-turner · 2024-02-12T16:49:52Z

I put some notes on #4251 showing some of the steps taken while debugging this problem.

dlmarion · 2024-02-12T19:47:49Z

...er/base/src/main/java/org/apache/accumulo/server/manager/state/TabletManagementIterator.java

      Exception error = null;
      try {
+        LOG.trace("Evaluating extent: {}", tm);
+        computeTabletManagementActions(tm, actions);


Prior to this change computeTabletManagementActions would only be called if the Manager state was normal, tservers were online, and there were online tables. We don't need these conditions to be true for the Root table, but I think we do for the Metadata and other tables as they are hosted by TabletServers. I'm wondering the consequence of calling this in all cases.

I'm wondering if we should only do this for the Root table.

When the manager state is SAFE_MODE, want to host the root and metadata table. If the metadata table needs log recovery and we do not call computeTabletManagementActions then the log recovery will not happen. Also if the root or metadata table need volume replacement and we do not call omputeTabletManagementActions then it will not happen. This change fixes those problems, but not sure if it introduces new problems. I did open #4256 about evaluating this code overall.

Not sure how to do it ATM, but would like to minimize the specialized code used by TabletManagementIterator that is not used by TabletGroupWatcher. That was one thing I was wondering about when opening #4256.

Maybe this is sufficient to fix the problem:

if (tm is Root or Metadata Table) { LOG.trace("Evaluating extent: {}", tm); computeTabletManagementActions(tm, actions); } else { if (tabletMgmtParams.getManagerState() != ManagerState.NORMAL || tabletMgmtParams.getOnlineTsevers().isEmpty() || tabletMgmtParams.getOnlineTables().isEmpty()) { // when manager is in the process of starting up or shutting down return everything. actions.add(ManagementAction.NEEDS_LOCATION_UPDATE); } else { LOG.trace("Evaluating extent: {}", tm); computeTabletManagementActions(tm, actions); } }

Maybe this is sufficient to fix the problem:

That is a more conservative change. Made it in 43e11b1. Thinking we may be able to remove this code, but that can be done in #4256 after we get all ITs working in elasticity.

keith-turner mentioned this pull request Feb 12, 2024

Elasticity Broken or Flaky test: WALSunnyDayIT.test() #4251

Closed

keith-turner mentioned this pull request Feb 12, 2024

Re-evaluate need to always return tablet in tablet mgmt iterator #4256

Closed

keith-turner changed the title ~~always sets neeeds recovery flag in tablet mgmt iterator~~ always sets needs recovery flag in tablet mgmt iterator Feb 12, 2024

dlmarion reviewed Feb 12, 2024

View reviewed changes

keith-turner added 2 commits February 12, 2024 16:49

code review update

43e11b1

Merge branch 'elasticity' into accumulo-4251

d8eeca8

keith-turner merged commit 4e1b052 into apache:elasticity Feb 14, 2024

keith-turner deleted the accumulo-4251 branch February 14, 2024 22:13

keith-turner added this to the 4.0.0 milestone Jul 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

always sets needs recovery flag in tablet mgmt iterator#4255

always sets needs recovery flag in tablet mgmt iterator#4255
keith-turner merged 3 commits intoapache:elasticityfrom
keith-turner:accumulo-4251

keith-turner commented Feb 12, 2024

Uh oh!

keith-turner commented Feb 12, 2024

Uh oh!

keith-turner commented Feb 12, 2024

Uh oh!

dlmarion Feb 12, 2024

Uh oh!

keith-turner Feb 12, 2024 •

edited

Loading

Uh oh!

keith-turner Feb 12, 2024

Uh oh!

dlmarion Feb 12, 2024

Uh oh!

keith-turner Feb 12, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

keith-turner commented Feb 12, 2024

Uh oh!

keith-turner commented Feb 12, 2024

Uh oh!

keith-turner commented Feb 12, 2024

Uh oh!

dlmarion Feb 12, 2024

Choose a reason for hiding this comment

Uh oh!

keith-turner Feb 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

keith-turner Feb 12, 2024

Choose a reason for hiding this comment

Uh oh!

dlmarion Feb 12, 2024

Choose a reason for hiding this comment

Uh oh!

keith-turner Feb 12, 2024

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

keith-turner Feb 12, 2024 •

edited

Loading