@keith-turner
Contributor

The TabletGroupWatcher (TGW) uses the TabletManagementIterator to filter tablets that need attention. The TabletManagementIterator had its own custom code to make decisions about tablets. This commit modifies the TabletManagementIterator to use the same code as the TGW when making decisions about tablets. This makes it easier to reason about and change the behavior of the TGW.

In making the TGW and TabletManagementIterator share code, a change was also made to how they get the information used to make decisions. A new class called TabletManagementParameters was introduced that contains an immutable snapshot of the information that both classes need. With this change the iterator and TGW use the same information for a pass over the metadata table. In the past the information was obtained at different times and could differ due to race conditions. These race conditions were probably not harmful, but removing them makes the code easier to reason about.
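To illustrate the idea, here is a minimal sketch of an immutable snapshot that is built once per pass and handed to both the filtering side and the acting side, so both reach the same decision from the same data. Every name in it (`SnapshotSketch`, `Params`, `Tablet`, `Goal`, `computeGoal`, `filter`) is hypothetical and chosen only for illustration. It is not the actual TabletManagementParameters class or any Accumulo API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names are real Accumulo classes.
final class SnapshotSketch {

  enum Goal { HOSTED, UNASSIGNED }

  /** Immutable snapshot of the state needed for one pass over the tablets. */
  static final class Params {
    private final Set<String> onlineTables;

    Params(Set<String> onlineTables) {
      // Defensive copy: later changes to the caller's set are not visible here.
      this.onlineTables = Collections.unmodifiableSet(new HashSet<>(onlineTables));
    }

    Set<String> onlineTables() {
      return onlineTables;
    }
  }

  /** Simplified stand-in for a tablet's metadata. */
  static final class Tablet {
    final String extent;
    final String tableId;
    final boolean hosted;

    Tablet(String extent, String tableId, boolean hosted) {
      this.extent = extent;
      this.tableId = tableId;
      this.hosted = hosted;
    }
  }

  /** The single decision function shared by the filter and the watcher. */
  static Goal computeGoal(Tablet tablet, Params params) {
    return params.onlineTables().contains(tablet.tableId) ? Goal.HOSTED : Goal.UNASSIGNED;
  }

  /** A tablet needs attention when its current state differs from its goal. */
  static boolean needsAttention(Tablet tablet, Params params) {
    Goal current = tablet.hosted ? Goal.HOSTED : Goal.UNASSIGNED;
    return computeGoal(tablet, params) != current;
  }

  /** Filtering side: keeps only tablets that need attention. */
  static List<Tablet> filter(List<Tablet> tablets, Params params) {
    List<Tablet> result = new ArrayList<>();
    for (Tablet tablet : tablets) {
      if (needsAttention(tablet, params)) {
        result.add(tablet);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Set<String> live = new HashSet<>(Arrays.asList("t1"));
    Params params = new Params(live);
    live.add("t2"); // happens after the snapshot, so this pass ignores it

    List<Tablet> tablets = Arrays.asList(
        new Tablet("a", "t1", false),
        new Tablet("b", "t2", false),
        new Tablet("c", "t1", true));

    // Acting side: reuses the same snapshot and the same decision function.
    for (Tablet tablet : filter(tablets, params)) {
      System.out.println(tablet.extent + " -> " + computeGoal(tablet, params));
    }
  }
}
```

Because the snapshot copies its inputs at construction time, a change made while a pass is in progress cannot cause the filter and the watcher to disagree about the same tablet. The change is simply picked up by the next snapshot.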

Also moved some code outside of the TGW away from using the TabletManagementIterator. All code outside of the TGW needs to move away from this iterator to make the code easier to maintain. Left TODOs in the code about this and will open follow-on issues.

Contributor

@dlmarion dlmarion left a comment

Please also check that ManagerAssignmentIT is still passing.

return Collections.unmodifiableMap(copy);
}

private static class JsonData {
Contributor

Would something less generic than JsonData help when it is declared / used? Maybe TMParmsJson or even ParamsJson?

Contributor Author

IMO that's not necessary because it's a private inner class that is only used inside the class and never seen outside of it.

@keith-turner
Contributor Author

keith-turner commented Oct 30, 2023

Please also check that ManagerAssignmentIT is still passing.

@dlmarion that was a good call on running that test. Ran it and it blew up. It turned out the best fix was to move a lot of code from using the tablet management scanner to using Ample. Made that change in e1da6f0. After that change, the only thing left using the tablet management iterator is the TGW, which is good.

Got LocatorIT working while cleaning up the code and removed the Disabled annotation from it.

System.out.println(
mti + " is " + state + " #walogs:" + mti.getTabletMetadata().getLogs().size());
&& tabletMetadata.getHostingGoal() == TabletHostingGoal.ONDEMAND) {
System.out.println(tabletMetadata.getExtent() + " is " + state + " #walogs:"
Contributor

It seems like state here is always going to be hosted, so we could just change the print statement. Also, maybe a better name for this class is ListHostedOnDemandTablets, as it does not list the unhosted ones. I think it's fine to submit a follow-on issue for this if you don't want to address it here.

@keith-turner keith-turner merged commit 5c454e1 into apache:elasticity Oct 31, 2023
@keith-turner keith-turner deleted the tablet_goal_state_refactor branch October 31, 2023 15:01
keith-turner added a commit to keith-turner/accumulo that referenced this pull request Oct 31, 2023
Updates the FATE operations that split tablets to handle walogs.  These
changes should wait for a tablet with walogs to recover before
proceeding.  The changes to actually make recovery happen were already
done in apache#3904 with changes to TabletGoalState.compute().  This change is
a WIP because it depends on apache#3847 and needs those changes to be
complete.

fixes apache#3844
keith-turner added a commit that referenced this pull request Feb 21, 2024
Updates the FATE operations that split tablets to handle walogs.  These
changes should wait for a tablet with walogs to recover before
proceeding.  The changes to actually make recovery happen were already
done in #3904 with changes to TabletGoalState.compute().  

fixes #3844
@ctubbsii ctubbsii added this to the 4.0.0 milestone Jul 12, 2024