Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[workflow] Migrate getCellsWith{Shard,Table}ReadsSwitched, TrafficSwitchDirection and TableRemovalType to package workflow #8190

Merged
merged 3 commits into from
Jun 2, 2021

Conversation

ajm188
Copy link
Contributor

@ajm188 ajm188 commented May 25, 2021

Description

Finally picking up this workstream again. This does pretty much what it says in the title, with a couple important notes to add here:

  • In the original implementation of getCellsWithShardReadsSwitched, there was never a case where noControls would be set to false (because it's initialized to true, and there's only one branch where we say if len(parition.GetShardTabletControls()) == 0) { noControls = true }. I believe this to be a (apparently harmless) bug, so I've changed the implementation to initialize noControls to false. No tests broke when I made this change.
  • I updated the getCellsWith{Shard,Table}ReadsSwitched functions to take a topodatapb.TabletType argument rather than a string representing the tablet type. This made a call to strings.EqualFold from get....ShardReadsSwitched unnecessary, but made a call to strings.ToLower(ttype.String()) required. Just pushing the case-insensitive comparisons around, I guess.
  • When moving the enum types (TrafficSwitchDirection / TableRemovalType), I needed to change a bunch of local variables from workflow to workflowName, so that the identifier workflow could refer to the package. This made the diff much larger as a result. I also took the opportunity to reformat our imports to be a little cleaner since I was already here.

Related Issue(s)

Checklist

  • Tests were added or are not required -- N/a
  • Documentation was added or is not required

Deployment Notes

Copy link
Member

@rafael rafael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good. Good catch around the noControls. Let's take this opportunity to review that. I added one comment around that.

ctx context.Context,
keyspace string,
si *topo.ShardInfo,
ttype topodatapb.TabletType,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - I think across the codebase we use tabletType for this variable.

@@ -156,107 +127,27 @@ func (ts *trafficSwitcher) ForAllTargets(f func(source *workflow.MigrationTarget
func (wr *Wrangler) getCellsWithShardReadsSwitched(ctx context.Context, targetKeyspace string, si *topo.ShardInfo, tabletType string) (
cellsSwitched, cellsNotSwitched []string, err error) {

cells, err := wr.ts.GetCellInfoNames(ctx)
ttype, err := topoproto.ParseTabletType(tabletType)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you follow my suggestion from above, here we can:

  • Rename to tabletType to be tabletTypeStr
  • Just this variable stays ttype.

continue
}

// If reads and writes are both switched it is passible that the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r/passible/possible

// shards are not yet serving, or once reads and writes are both
// switched.
if len(partition.GetShardTabletControls()) == 0 {
noControls = true
Copy link
Member

@rafael rafael May 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the noControls thing. Let's check with @sougou / @rohit-nayak-ps . I think there are other problems with the checks here. My understanding is a follows:

  • A serving type is migrated iff:
    a) The tablet_type / shard if found in the serving graph.
    b) There are no tablet controls set for this tablet_type / shard (this might be optional, see my comment at the end)

One problem I see is that having found=true and noControls, should mean the shard is migrated. I think I'm forgetting something about how this works.

The second problem I see is the following condition:

if !tabletControl.GetQueryServiceDisabled() {
  shardServedTypes = append(shardServedTypes, si.ShardName())
}

Isn't this adding to the list shards that actually have QueryService Enabled?

The presence of TabletControls should indicate that a migration is in progress, but ultimately the routing is determined by what is in the serving graph. The more I think about this, it seems that checking presence in the SrvKeyspaceGraph should be sufficient to determine if it was migrated 🤔

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rohit-nayak-ps do you have a different understanding here? I'm more familiarized with the legacy MigrateServedTypes and maybe the logic changed.

…rkflow.Server`

There is one small change in implementation here, where `noControls`
would never be `false` in the old implementation. I believe this was a
(harmless) bug, and the new implementation is more correct.

Signed-off-by: Andrew Mason <amason@slack-corp.com>
…rkflow

This required some updates of local variables from `workflow` to
`workflowName` so that I could reference the `workflow` package name.

I also reformatted imports in all the files I touched as part of this
change.

Signed-off-by: Andrew Mason <amason@slack-corp.com>
@ajm188 ajm188 force-pushed the am_more_workflow_extraction branch from 6860669 to 15e9ce6 Compare May 27, 2021 13:43
Signed-off-by: Andrew Mason <amason@slack-corp.com>
Copy link
Member

@rafael rafael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajm188 this looks good to me. There is an open question, but is actually outside the scope of this PR. We can revisit that offline with @rohit-nayak-ps.

I think we can merge this as is.

@rafael rafael merged commit 9b3c055 into vitessio:main Jun 2, 2021
@rafael rafael deleted the am_more_workflow_extraction branch June 2, 2021 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants