Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

workflow: Add horizontal resharding workflow. #2393

Merged
merged 1 commit into from Jan 20, 2017

Conversation

wangyipei01
Copy link
Contributor

@wangyipei01 wangyipei01 commented Dec 20, 2016

This is a beta version implementation, which simply supports creating the horizontal workflow through UI and updating progress of each step.

In the end-to-end test (test/workflow_horizontal_resharding.py), only happy path is tested. To simplify the test, rather than fetching information from front-end and checking explicitly, I manually interacting with the UI after setting up the environments.

Unit test will be added later.

Possible Optimizations (wait for feedback from reviewers, the listed
points might be useless):

  • make UI more friendly, s.t. the user input doesn't have to input the shard
    list
  • supports action for the workflow, which allows the user to stop and
    restart the workflow
  • adding possible corner case end-to-end test

This change is Reviewable

@mberlin-bot
Copy link

Overall, the change looks good :)

I've left a couple of comments regarding style.

Regarding the workflow itself: The next step is to break it into sub nodes (see my comment about children). Can you please do that? Thanks!


Reviewed 6 of 6 files at r1.
Review status: all files reviewed at latest revision, 52 unresolved discussions, some commit checks failed.


go/vt/vtctld/workflow.go, line 40 at r1 (raw file):

		schemaswap.RegisterWorkflowFactory()

		// Register the Horizontal Resharding workflow

nit: Missing period.

Please always use full sentences in comments.

The Go comments only ask you do it for comments of declarations: https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences

But I think it's useful to apply this to all comments.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 1 at r1 (raw file):

package horizontalreshard

Please do not use abbreviations ("reshard"). Super short variable names make sense when the scope of a variable is very small e.g.

for _, s := range shards {
  // Do something with "s".
}

But for identifiers with a broader scope, abbreviations can be confusing because not everybody knows them or thinks of a different meaning.

Regarding this package in particular: I suggest to name it `resharding` because we'll probably create a workflow for the vertical split as well. (Maybe, we'll even rename it to vertical resharding or merge both workflows into one. Given that, `resharding` is the best name for this package.)

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 4 at r1 (raw file):

// Package horizontalreshard contains a workflow for automatic horizontal resharding.
// For implementation simplification, the user need to set up the vtworker and sets sharding key before using this workflow.

s/need/needs/ (third person singular)


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 4 at r1 (raw file):
I would not mention that the sharding key must be set. That sounds like too much detail.

Note that there must be one vtworker instance per source shard.

For implementation simplification

I don't think we have to excuse ourselves :) Instead, you can just state it as an assumption. E.g.

The workflow assumes that there are as many vtworker processes running and reachable via RPC as there are source shards.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 5 at r1 (raw file):

// Package horizontalreshard contains a workflow for automatic horizontal resharding.
// For implementation simplification, the user need to set up the vtworker and sets sharding key before using this workflow.
// The Input arguments for the workflow UI are:

s/Input/input/ (lower case)


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 5 at r1 (raw file):

// Package horizontalreshard contains a workflow for automatic horizontal resharding.
// For implementation simplification, the user need to set up the vtworker and sets sharding key before using this workflow.
// The Input arguments for the workflow UI are:

I suggest to delete this comment because it's redundant.

The same information is present in the "ReshardWorkflowData" struct as well. And since this is not Python, we don't have to comment fields just to document their type. This is already done in the code :)


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 6 at r1 (raw file):

keyspace the resharding is working

keyspace to be resharded


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 7 at r1 (raw file):

// The Input arguments for the workflow UI are:
//   - keyspace: string, keyspace the resharding is working
//   - source shards: string, the source shard name to be resharded (example: '0')

Note there can be multiple source shards.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 9 at r1 (raw file):

//   - source shards: string, the source shard name to be resharded (example: '0')
//   - destination shards: string, the destination shards, delimited by ',' (example if resharding to 2 shards: -80,80-)
//   - vtworker address: string, vtworker server address

See comment above. There should be one vtworker per source shard.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 17 at r1 (raw file):

	"strings"

	log "github.com/golang/glog"

Within our code base, we further break down this import group.

We're basically extending this rule: https://github.com/golang/go/wiki/CodeReviewComments#imports

I suggest:

<std libs>

<external libaries>

<own libaries>

<protobuf imports>

i.e. it should look like this:

import (
        "encoding/json"
        "flag"
        "fmt"
        "strings"
 
        "golang.org/x/net/context"
        log "github.com/golang/glog"

        "github.com/youtube/vitess/go/vt/automation"
        "github.com/youtube/vitess/go/vt/logutil"
        "github.com/youtube/vitess/go/vt/tabletmanager/tmclient"
        "github.com/youtube/vitess/go/vt/topo"
        "github.com/youtube/vitess/go/vt/topo/topoproto"
        "github.com/youtube/vitess/go/vt/vtctl"
        "github.com/youtube/vitess/go/vt/workflow"
        "github.com/youtube/vitess/go/vt/wrangler"

        workflowpb "github.com/youtube/vitess/go/vt/proto/workflow"
)

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 31 at r1 (raw file):

const (
	horizontalReshardFactoryName = "horizontal_resharding"

Please rename to horizontalReshardingFactoryName (same comment as above, please no abbreviation)


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 34 at r1 (raw file):

)

// ReshardWorkflowData is the data structure to store resharding arguments

nit: missing period


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 35 at r1 (raw file):

// ReshardWorkflowData is the data structure to store resharding arguments
type ReshardWorkflowData struct {

Please name it HorizontalReshardingWorkflowData because it's more precise. (Soon there'll probably be a VerticalSplitWorkflowData in the same package as well.)


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 39 at r1 (raw file):

	SourceShards      []string
	DestinationShards []string
	VTWorkerServer    string

Please always write "vtworker". When you need to capitalize it, please capitalize only the first letter: Vtworker

I understand that the rest of the team doesn't always follow this rule e.g. some of our docs talk about VTGate instead of vtgate. To get into a more consistent state, let's stick to what I said.

Please also omit "Server". It's okay to assume that it's always the server and not e.g. a client. Given that, I suggest to rename it to "Vtworkers".


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 42 at r1 (raw file):

}

// Workflow contains meta-information and methods controlling horizontal resharding workflow.

nit: missing definite article:

the horizontal resharding workflow

nit 2: "for controlling" or "to control"


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 43 at r1 (raw file):

// Workflow contains meta-information and methods controlling horizontal resharding workflow.
type Workflow struct {

I recommend to name this HorizontalReshardingWorkflow.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 45 at r1 (raw file):

type Workflow struct {
	// manager is the current Manager.
	// TO DO: it can be used to save checkpointer

Remove this comment and instead leave a TODO in the code where you plan to add checkpointing.

Alternatively, add such a general TODO at the beginning of a struct definition or the beginning of the file.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 54 at r1 (raw file):

	rootUINode *workflow.Node

	info *ReshardWorkflowData

Please name this field data. This is consistent with the suffix of the struct and also the rest of the code.

Within the protobuf message, which is used for the checkpoints, this data is stored in a field named "data" as well: https://github.com/youtube/vitess/blob/master/proto/workflow.proto#L46


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 57 at r1 (raw file):

}

// Run is a part of workflow.Workflow interface. This execute the horizontal resharding process and update UI message

nits:

  • the workflow.Workflow interface
  • executes
  • updates
  • the UI message
  • missing period

On a more general note: It's recommended to document what the function does first and then mention that it implements a particular interface. For example:

// Run executes the horizontal resharding workflow and updates the UI message accordingly.
// It is part of the workflow.Workflow interface.

(I understand that most of our code does not follow this guideline, but it should. If the method is trivial, it's fine to omit a comment for the method and only mention that it implements the interface.)


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 71 at r1 (raw file):

}

// executeReshard is the main entry point of the horizontal resharding process. It drived the process from start to finish

typo: drives


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 74 at r1 (raw file):

// and updates the progress of each step.
func (hw *Workflow) executeReshard(ctx context.Context, wr *wrangler.Wrangler) error {
	sourceKeyspace := hw.info.Keyspace

Please remove this and the next variable and inline it instead.

As you noticed, in this type of workflow the keyspace is identical for the source and the destination anyway.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 76 at r1 (raw file):

	sourceKeyspace := hw.info.Keyspace
	destKeyspace := hw.info.Keyspace
	sourceShardInfo := topoproto.KeyspaceShardString(sourceKeyspace, hw.info.SourceShards[0])

Please make this a list in the same way as the destination shards.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 77 at r1 (raw file):

	destKeyspace := hw.info.Keyspace
	sourceShardInfo := topoproto.KeyspaceShardString(sourceKeyspace, hw.info.SourceShards[0])
	numDestShards := len(hw.info.DestinationShards)

This variable has little value. Please inline the code instead.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 78 at r1 (raw file):

	sourceShardInfo := topoproto.KeyspaceShardString(sourceKeyspace, hw.info.SourceShards[0])
	numDestShards := len(hw.info.DestinationShards)
	destShardInfos := make([]string, numDestShards)

If a string has the keyspace and the shard, we sometimes call it keyspaceShard. Given that, you could name the variable destinationKeyspaceShards.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 79 at r1 (raw file):

	numDestShards := len(hw.info.DestinationShards)
	destShardInfos := make([]string, numDestShards)
	for i := 0; i < numDestShards; i++ {

Please use a for each loop with the range command here. Note that you can also use append and that it also works with a nil slice:

var destinationKeyspaceShards []string
for _, s := range hw.info.DestinationShards {
  destinationKeyspaceShards = append(destinationKeyspaceShards, topoproto.KeyspaceShardString(hw.info.Keyspace, s))
}

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 83 at r1 (raw file):

	}

	// Step 1: Run CopySchemaShard to copy the schema of source shard to all destination shards

"to copy the schema from the first source shard ..."


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 83 at r1 (raw file):

	}

	// Step 1: Run CopySchemaShard to copy the schema of source shard to all destination shards

I forgot to mention an important detail.

The workflow package supports hierarchical workflows and that is exactly what we want here.

For each step (copy schema shard, clone, wait for filtered replication, diff, MST), we want a sub group and within each group we want one entry per source (or where applicable) per dest shard.

On the UI side, you present this as Children of the root UI node (and then children of the children). For example, here's the schema swap code which creates a child for each shard: https://github.com/youtube/vitess/blob/master/go/vt/schemamanager/schemaswap/schema_swap.go#L422
And then, each "shardUINode" has multiple children as well.

In our case, we want nodes for the phase on the first level and a node per shard on the second level.

Please change the code accordingly.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 84 at r1 (raw file):

	// Step 1: Run CopySchemaShard to copy the schema of source shard to all destination shards
	// TODO: excludeTable information can be added to UI input parameters, s.t the user can customized excluded tables during resharding.

Please use the TODO format TODO(<ldap>): Comment here.

i.e. TODO(yipeiw): ...


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 85 at r1 (raw file):

	// Step 1: Run CopySchemaShard to copy the schema of source shard to all destination shards
	// TODO: excludeTable information can be added to UI input parameters, s.t the user can customized excluded tables during resharding.
	for _, destShardInfo := range destShardInfos {

optional: It's totally okay to use shorter variable names for variables with a narrow scope.

E.g. here I would have been fine with d or dest.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 87 at r1 (raw file):

	for _, destShardInfo := range destShardInfos {
		args := []string{"CopySchemaShard", sourceShardInfo, destShardInfo}
		if err := vtctl.RunCommand(ctx, wr, args); err != nil {

Please do not use the vtctl package. It's an unnecessary layer on top of Wrangler.

You can call wr.CopySchemaShardFromShard(...) directly here.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 89 at r1 (raw file):

		if err := vtctl.RunCommand(ctx, wr, args); err != nil {
			errorMessage := fmt.Sprintf("Horizontal Resharding: error in CopySchemaShardFromShard from %s to %s:",
				sourceShardInfo, destShardInfo) + err.Error()

Please add err.Error() to the Sprintf call and do not concatenate it:

fmt.Sprintf("Horizontal Resharding: error in CopySchemaShardFromShard from %s to %s: %v", sourceShardInfo, destShardInfo, err)


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 93 at r1 (raw file):

			return err
		}
		progressMessage := fmt.Sprintf("Horizontal Resharding: CopySchemaShardFromShard from %s to %s is finished", sourceShardInfo, destShardInfo)

has finished


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 97 at r1 (raw file):

	}

	// Step 2: Run SplitClone to clone the data within a keyspace from source shards to destination shards.

As mentioned above, please loop over the source shards here and match the n'th source shard with the n'th vtworker.

We want that because we want to execute the per shard actions in parallel.

To run things in parallel, please launch them as Go routines. Use a WaitGroup to wait for all Go routines and an ErrorRecorder to capture if any Go routine failed.

Here's an example for that: https://github.com/youtube/vitess/blob/master/go/vt/wrangler/keyspace.go#L470


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 101 at r1 (raw file):

context.TODO()

Just use ctx which is passed in to this method.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 102 at r1 (raw file):

	// This is because vtworker class doesn't cleanup the environment after execution.
	automation.ExecuteVtworker(context.TODO(), vtworkerServer, []string{"Reset"})
	// The flag min_healthy_rdonly_tablets is set to 1 (default value is 2).

Please add a TODO here that this must become a parameter of the workflow.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 136 at r1 (raw file):

	// Step 4: Run MigrateServedTypes to switch over to serving from the new shards
	servedTypeParams := []string{"rdonly", "replica", "master"}

Please use the String value of the enum types instead:

e.g. topodatapb.TabletType_MASTER

topodatapb is the import alias for "github.com/youtube/vitess/go/vt/proto/topodata"


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 150 at r1 (raw file):

	// Report successful information
	progressMessage := fmt.Sprintf("Horizontal Resharding is finished, source shard %v is resharded into destination shards %v.", sourceShardInfo, destShardInfos)
	progressMessage += "The source shard is migrated and the destination shards are serving"

nits:

  • missing leading space
  • missing period

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 163 at r1 (raw file):

}

// WorkflowFactory is the factory to register the HorizontalReshard Workflows

nit: missing period


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 175 at r1 (raw file):

	subFlags := flag.NewFlagSet(horizontalReshardFactoryName, flag.ContinueOnError)
	keyspace := subFlags.String("keyspace", "", "Name of keyspace to perform horizontal resharding")
	// The shards should be separated by ","

Please remove this comment because it won't be visible to users and therefore won't help them to understand how they need to input the data.

Instead, I recommend to update the help text on the next line:

before: list of source shards of resharding
after: comma-separated list of source shards

In doubt, see the go/vt/vtctl/vtctl.go file for inspirations from existing help texts :)


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 176 at r1 (raw file):

	keyspace := subFlags.String("keyspace", "", "Name of keyspace to perform horizontal resharding")
	// The shards should be separated by ","
	sourceShards := subFlags.String("source_shard_list", "", "list of source shards of resharding")

See also comment below. Please omit the _list suffix. Instead I suggest to name it source_shards.


test/workflow_horizontal_resharding.py, line 27 at r1 (raw file):

class TestWorkflowHorizontalResharding(worker.TestBaseSplitClone):

I think it's better to swap "Horizontal Resharding" and "Workflow" i.e. "TestHorizontalReshardingWorkflow".

The same applies to the filename.


test/workflow_horizontal_resharding.py, line 30 at r1 (raw file):

  """End-to-end test for horizontal resharding workflow.

  This test reuses worker.py, which covers the happy path

We're reusing worker.py because it contains the code to set up the environment and verify the data.

The resharding itself is executed by your workflow and we're not reusing worker.py for that.


test/workflow_horizontal_resharding.py, line 45 at r1 (raw file):

    dest_shard_list = '-80,80-'

    # uncomment this can test the workflow UI

Please do not keep debug code. This can probably be removed? If not, uncomment utils.pause on the next line please.


test/workflow_horizontal_resharding.py, line 63 at r1 (raw file):

                '%s, workflow uuid=%s' % (utils.vtctld.port, hw_uuid))

    vtctl_res = utils.run_vtctl(['WorkflowWait', hw_uuid])

Do not store return variable since you're not using it.


test/workflow_horizontal_resharding.py, line 95 at r1 (raw file):

    utils.run_vtctl(
        ['RunHealthCheck', worker.shard_0_rdonly1.tablet_alias], auto_log=True)
    utils.check_tablet_query_service(

nit: The wrapping here looks to extreme. Can you please try to fit it into as little lines as possible?


web/vtctld2/package.json, line 55 at r1 (raw file):

    "ts-node": "1.2.1",
    "tslint": "3.13.0",
    "typescript": "2.0.10"

Please revert this change.

As far as I remember, the bug which we encountered with the typescript version was fixed by changing the version in a different file (which is not under our source control).


web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 32 at r1 (raw file):

source_shard_list

Please use source_shards instead.

(I know I've called it _list in the old automation code but that's unnecessary. The rest of our code doesn't use such a _list suffix as well.)


web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 35 at r1 (raw file):

destination_shard_list

destination_shards


web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 38 at r1 (raw file):

vtworker_server_address

vtworker_addresses


web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 46 at r1 (raw file):

    super(position, id, 'Factory Name', 'Specifies the type of workflow to create.', '');
    let options = [];
    // tested the UI; TO DO: load workflow.horizontal_resharding component

Please remove this comment.


web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 103 at r1 (raw file):

}

export class HorizontalReshardKeyspaceFlag extends InputFlag {

Please rename the prefix of these flags to HorizontalResharding.


web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 126 at r1 (raw file):

export class HorizontalReshardVTWorkerFlag extends InputFlag {
  constructor(position: number, id: string, value= '') {
    super(position, id, 'VTWorker Server', 'server address of vtworker.', value);

s/VTWorker Server/vtworker addresses/

s/server address of vtworker./List of vtworker addresses./


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/vtctld/workflow.go, line 40 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

nit: Missing period.

Please always use full sentences in comments.

The Go comments only ask you do it for comments of declarations: https://github.com/golang/go/wiki/CodeReviewComments#comment-sentences

But I think it's useful to apply this to all comments.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 1 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please do not use abbreviations ("reshard"). Super short variable names make sense when the scope of a variable is very small e.g.

for _, s := range shards {
  // Do something with "s".
}

But for identifiers with a broader scope, abbreviations can be confusing because not everybody knows them or thinks of a different meaning.

Regarding this package in particular: I suggest to name it `resharding` because we'll probably create a workflow for the vertical split as well. (Maybe, we'll even rename it to vertical resharding or merge both workflows into one. Given that, `resharding` is the best name for this package.)
</blockquote></details>

Done

---


*Comments from [Reviewable](https://reviewable.io:443/reviews/youtube/vitess/2393)*
<!-- Sent from Reviewable.io -->

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 4 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

s/need/needs/ (third person singular)

Done


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 84 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please use the TODO format TODO(<ldap>): Comment here.

i.e. TODO(yipeiw): ...

Done


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 4 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

I would not mention that the sharding key must be set. That sounds like too much detail.

Note that there must be one vtworker instance per source shard.

For implementation simplification

I don't think we have to excuse ourselves :) Instead, you can just state it as an assumption. E.g.

The workflow assumes that there are as many vtworker processes running and reachable via RPC as there are source shards.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 5 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

s/Input/input/ (lower case)

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 5 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

I suggest to delete this comment because it's redundant.

The same information is present in the "ReshardWorkflowData" struct as well. And since this is not Python, we don't have to comment fields just to document their type. This is already done in the code :)

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 6 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

keyspace the resharding is working

keyspace to be resharded

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 7 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Note there can be multiple source shards.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 9 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

See comment above. There should be one vtworker per source shard.

should I make it into string[], like source shards?


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 17 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Within our code base, we further break down this import group.

We're basically extending this rule: https://github.com/golang/go/wiki/CodeReviewComments#imports

I suggest:

<std libs>

<external libaries>

<own libaries>

<protobuf imports>

i.e. it should look like this:

import (
        "encoding/json"
        "flag"
        "fmt"
        "strings"
 
        "golang.org/x/net/context"
        log "github.com/golang/glog"

        "github.com/youtube/vitess/go/vt/automation"
        "github.com/youtube/vitess/go/vt/logutil"
        "github.com/youtube/vitess/go/vt/tabletmanager/tmclient"
        "github.com/youtube/vitess/go/vt/topo"
        "github.com/youtube/vitess/go/vt/topo/topoproto"
        "github.com/youtube/vitess/go/vt/vtctl"
        "github.com/youtube/vitess/go/vt/workflow"
        "github.com/youtube/vitess/go/vt/wrangler"

        workflowpb "github.com/youtube/vitess/go/vt/proto/workflow"
)

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 31 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please rename to horizontalReshardingFactoryName (same comment as above, please no abbreviation)

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 34 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

nit: missing period

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 35 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please name it HorizontalReshardingWorkflowData because it's more precise. (Soon there'll probably be a VerticalSplitWorkflowData in the same package as well.)

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 42 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

nit: missing definite article:

the horizontal resharding workflow

nit 2: "for controlling" or "to control"

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 43 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

I recommend to name this HorizontalReshardingWorkflow.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 45 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Remove this comment and instead leave a TODO in the code where you plan to add checkpointing.

Alternatively, add such a general TODO at the beginning of a struct definition or the beginning of the file.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 54 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please name this field data. This is consistent with the suffix of the struct and also the rest of the code.

Within the protobuf message, which is used for the checkpoints, this data is stored in a field named "data" as well: https://github.com/youtube/vitess/blob/master/proto/workflow.proto#L46

Done.


Comments from Reviewable

@mberlin-bot
Copy link

Review status: 4 of 6 files reviewed at latest revision, 51 unresolved discussions.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 9 at r1 (raw file):

Previously, wangyipei01-bot wrote…

should I make it into string[], like source shards?

Yes, please.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/workflow_horizontal_resharding.py, line 27 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

I think it's better to swap "Horizontal Resharding" and "Workflow" i.e. "TestHorizontalReshardingWorkflow".

The same applies to the filename.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/workflow_horizontal_resharding.py, line 30 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

We're reusing worker.py because it contains the code to set up the environment and verify the data.

The resharding itself is executed by your workflow and we're not reusing worker.py for that.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/workflow_horizontal_resharding.py, line 45 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please do not keep debug code. This can probably be removed? If not, uncomment utils.pause on the next line please.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/wrangler/schema.go, line 29 at r5 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

calling wrangler

Please be more specific here. I suggest to replace "wrangler" with "CopySchemaShard".

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/wrangler/shard.go, line 356 at r6 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

This method should not go into this file because shard.go only contains methods which are related to manipulating the "Shard" object in the topology. This is not the case here.

There's a file called split.go which contains resharding related methods.

Can you please move this method there?

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/wrangler/shard.go, line 356 at r6 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please change this code such that "vtctl.go" is using this method as well and you can remove the duplicate code from there.

For example, you'll have to add maxDelay as a parameter in this method.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 126 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Capitalizing "vtworker" in camel case variable names is fine. Please make the second character "t" lower case then. (We treat it as one word.)

HorizontalReshardingVTWorkerFlag => HorizontalReshardingVtworkerFlag

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 30 at r6 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

8
6

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 105 at r6 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

VTWorkers

vtworkers

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 87 at r1 (raw file):

Previously, wangyipei01-bot wrote…

Does it cost lots of resource calling through vtctl? The reason I call vtctl rather than wrangler here is because in its wrap up function, it handles a lot of default arguments which are passed to wrangler function, therefore it makes my code cleaner by calling this wrap-up command function directly.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 6 at r5 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

This pylint directive is too broad.

This way you're disabling all unused-import occuring after this line.

Please copy & paste what we have in the other files instead:

# "unittest" is used indirectly by importing "worker", but pylint does
# not grasp this.
# Import it explicitly to make pylint happy and stop it complaining about
# setUpModule, tearDownModule and the missing module docstring.
import unittest  # pylint: disable=unused-import

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 35 at r5 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

"""Normal horizontal resharding, "happy path test"."""

I would not mention "happy path test" here because that's obvious to the reader.

In general, I would focus on a short description of what the test does e.g.:

"""Reshard from 1 to 2 shards by running the resharding workflow on vtctld."""

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 70 at r5 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please remove trailing whitespace. This is an issue further down as well.

I recommend to install the AnyEdit Eclipse addon which automatically removes whitespace for you: http://andrei.gmxhome.de/anyedit/index.html

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 40 at r5 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Unused code? Please remove.

no unused code now.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 85 at r6 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

must be disabled
must be enabled

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 49 at r6 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

hw_uuid

I think the abbrevation "hw" is difficult to interpret/guess.

How about workflow_uuid instead?
(There's only one workflow, so you don't need to repeat that its' the "Horizontal Resharding" one.)

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 42 at r6 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

vtctl_res

I don't like "res" as abbreviation for "response". Instead, I suggest to stick with existing code.

The command ack-grep "= (utils.)?run_vtctl\(" reveals that we almost always use stdout. Therefore, please use that.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 43 at r5 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please move the opening bracket to the previous line for several reasons:

  • to be consistent with the remaining code base
  • to save on vertical space and increase the information density per line

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 47 at r5 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Same comment as above. Please move the closing bracket to the previous line.

For reference: In resharding.py line 718 you'll find a call where we have an array literal spanning multiple lines and the brackets are not on separate lines.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

test/horizontal_resharding_workflow.py, line 75 at r5 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

I think you can remove this. See #2477

Done.


Comments from Reviewable

@mberlin-bot
Copy link

I've reviewed everything now.

The unit test looks great. That's the way to go :)

My other comments are mostly style comments. Let's wrap these up and then move on with the implementation of the state based execution :)


Reviewed 2 of 5 files at r6, 1 of 1 files at r7.
Review status: all files reviewed at latest revision, 97 unresolved discussions.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 49 at r7 (raw file):

	ctx context.Context

	// wr is resharding.Wrangler

nit: missing period.

The comment as it is does not provide a lot of information. I suggest to remove it or add a longer comment.

In this case I suggest to remove it.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 52 at r7 (raw file):

	wr Wrangler

	// manager is the current Manager.

Same comment as above. Please remove this comment.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 66 at r7 (raw file):

	migrateUINode    *workflow.Node

	keyspace string

I suggest to structure the list of fields differently:

  • Don't use newlines between fields unless you want to separate groups of fields.
  • Let the first group be the immutable configuration of the object i.e. I would put "keyspace" and "vtworkers" in there.
  • Then you could have another group with the fields which are needed throughout the operation e.g. "ctx", "wr", "manager", "topoServer".
  • Then you could have another group with the dynamic state e.g. "logger", "subWorkflows" and the UI nodes.

go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 82 at r7 (raw file):

PerShardHorizontalResharding

This should be a per phase struct and not per shard.

But let's keep this as is for now because you're likely going to change this later.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 92 at r7 (raw file):

data *PerShardHorizontalReshardingData

This should be an anonymous field (without a name). See: http://golangtutorials.blogspot.com/2011/06/anonymous-fields-in-structs-like-object.html

This way you can reference the data fields in the code below as "hw.Keyspace" instead of "hw.data.Keyspace" i.e. you can omit "data". This makes the code more readable.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 92 at r7 (raw file):

	shardUILogger *logutil.MemoryLogger

	data *PerShardHorizontalReshardingData

Please move this field to the top of the struct for the same reasoning as above.

This is immutable data which will be set at initialization time.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 92 at r7 (raw file):

Please change this from a pointer type to a type.

This is immutable data and by not using a pointer you're expressing this.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 95 at r7 (raw file):

per source

"a per source" because it's one.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 95 at r7 (raw file):

for all source shards

for each source shard


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 104 at r7 (raw file):

workerCount

You don't need this variable.

Instead, the "range" command below will provide the index for you as well:

for i, os := range overlappingShards {

go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 117 at r7 (raw file):

		}

		err := hw.createWorkflowPerShard(sourceShard, destinationShards, hw.vtworkers[workerCount])

This line must go into the "if" on the next line.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 144 at r7 (raw file):

"copy_schema"

This feels redundant because the parent node has the same name.

Additionally, the PathName is not unique per shard. It probably has to be?

If that's the case, I suggest "shard_" + sourceShardName as name i.e. it would be "shard_-80" or "shard_c0-d0".


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 173 at r7 (raw file):

// Run executes the horizontal resharding process and updates the UI message.
// It implements the workflow.Workflow interface.
func (hw *HorizontalReshardingWorkflow) Run(ctx context.Context, manager *workflow.Manager, wi *topo.WorkflowInfo) error {

Please move this function to the top after the definition of the structs.

I recommend to write the code such that somebody can read it from top to bottom.

In this case "Run" would be the entry point to the code and it would then call other functions e.g. "createSubWorkflow()".


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 185 at r7 (raw file):

	hw.rootUINode.BroadcastChanges(true /* updateChildren */)

	// TODO(yipeiw): support action button to allow retry, stop, restart

nit: Please write this as a proper sentence i.e. capitalize "Support" and add a period at the end.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 198 at r7 (raw file):

hw.logger.Infof(errMessage)

No need for errMessage. Please write instead:

hw.logger.Infof("Horizontal Resharding: error in CopySchemaShard: %v.", err)

Also: Please always leave a space " " after the colon ":" i.e. "CopySchemaShard: %v"

Please fix this throughput the file. Same applies for "progressMessage" below.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 219 at r7 (raw file):

parallely

in parallel


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 219 at r7 (raw file):

, fails

Shorter sentences are easier to read.

Therefore, please replace the comma with a period and start a new sentence instead: "It fails when ..."


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 219 at r7 (raw file):

one of the errors when several

This comment is actually wrong because you're using the "AllErrorRecorder". It will return all occurred errors.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 220 at r7 (raw file):

fails

fail (plural because of "several")


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 221 at r7 (raw file):

subworkflow

subWorkflow to be consistent with other names in the file.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 221 at r7 (raw file):

executeFunc func(subworkflow *PerShardHorizontalResharding

FYI: Instead of passing in a function type, you should define an interface with an "Execute" interface. Then each subworkflow object would implement this and you could call it. This would be a more object oriented style of programming.

However, let's hold back on this since you're changing the code to the state approach anyway.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 222 at r7 (raw file):

var errorRecorder concurrency.AllErrorRecorder

Please use the ":=" syntax instead because that's what we're using throughout the code base as well.

Additionally, please shorten the variable name because it's a short scope and the same name is already mentioned in the type.

After these changes, it should look like this:

ec := concurrency.AllErrorRecorder{}
wg := sync.WaitGroup{}

go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 240 at r7 (raw file):

context.TODO()

Please use hw.ctx here and everywhere.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 332 at r7 (raw file):

Workflows
Workflow


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 385 at r7 (raw file):

"split_clone"

Let's keep it simple and just name it "clone".


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 389 at r7 (raw file):

"split_diff"

"diff"


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 393 at r7 (raw file):

"mst"

"migrate"


go/vt/workflow/resharding/horizontal_resharding_workflow_test.go, line 80 at r7 (raw file):

	}

	perShard := &PerShardHorizontalResharding{

As discussed offline, please do not repeat this code in the test.

Instead, I suggest to have a "Run" and a "run" function. In "Run" you can set the wrangler from the workflow library and call "run". Then in "run" you would put this code. This way, you can call "run" from your test.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 49 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

nit: missing period.

The comment as it is does not provide a lot of information. I suggest to remove it or add a longer comment.

In this case I suggest to remove it.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 52 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Same comment as above. Please remove this comment.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 240 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

context.TODO()

Please use hw.ctx here and everywhere.

Done.


Comments from Reviewable

@wangyipei01-bot
Copy link

Review status: all files reviewed at latest revision, 97 unresolved discussions.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 66 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

I suggest to structure the list of fields differently:

  • Don't use newlines between fields unless you want to separate groups of fields.
  • Let the first group be the immutable configuration of the object i.e. I would put "keyspace" and "vtworkers" in there.
  • Then you could have another group with the fields which are needed throughout the operation e.g. "ctx", "wr", "manager", "topoServer".
  • Then you could have another group with the dynamic state e.g. "logger", "subWorkflows" and the UI nodes.

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 82 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

PerShardHorizontalResharding

This should be a per phase struct and not per shard.

But let's keep this as is for now because you're likely going to change this later.

Leave it later.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 92 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

data *PerShardHorizontalReshardingData

This should be an anonymous field (without a name). See: http://golangtutorials.blogspot.com/2011/06/anonymous-fields-in-structs-like-object.html

This way you can reference the data fields in the code below as "hw.Keyspace" instead of "hw.data.Keyspace" i.e. you can omit "data". This makes the code more readable.

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 92 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please move this field to the top of the struct for the same reasoning as above.

This is immutable data which will be set at initialization time.

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 92 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please change this from a pointer type to a type.

This is immutable data and by not using a pointer you're expressing this.

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 95 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

per source

"a per source" because it's one.

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 95 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

for all source shards

for each source shard

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 104 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

workerCount

You don't need this variable.

Instead, the "range" command below will provide the index for you as well:

for i, os := range overlappingShards {

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 117 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

This line must go into the "if" on the next line.

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 144 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

"copy_schema"

This feels redundant because the parent node has the same name.

Additionally, the PathName is not unique per shard. It probably has to be?

If that's the case, I suggest "shard_" + sourceShardName as name i.e. it would be "shard_-80" or "shard_c0-d0".

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 173 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please move this function to the top after the definition of the structs.

I recommend to write the code such that somebody can read it from top to bottom.

In this case "Run" would be the entry point to the code and it would then call other functions e.g. "createSubWorkflow()".

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 185 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

nit: Please write this as a proper sentence i.e. capitalize "Support" and add a period at the end.

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 198 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

hw.logger.Infof(errMessage)

No need for errMessage. Please write instead:

hw.logger.Infof("Horizontal Resharding: error in CopySchemaShard: %v.", err)

Also: Please always leave a space " " after the colon ":" i.e. "CopySchemaShard: %v"

Please fix this throughput the file. Same applies for "progressMessage" below.

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 219 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

parallely

in parallel

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 219 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

, fails

Shorter sentences are easier to read.

Therefore, please replace the comma with a period and start a new sentence instead: "It fails when ..."

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 219 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

one of the errors when several

This comment is actually wrong because you're using the "AllErrorRecorder". It will return all occurred errors.

Done. (removed)


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 220 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

fails

fail (plural because of "several")

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 221 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

subworkflow

subWorkflow to be consistent with other names in the file.

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 221 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

executeFunc func(subworkflow *PerShardHorizontalResharding

FYI: Instead of passing in a function type, you should define an interface with an "Execute" interface. Then each subworkflow object would implement this and you could call it. This would be a more object oriented style of programming.

However, let's hold back on this since you're changing the code to the state approach anyway.

Leave it later.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 222 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

var errorRecorder concurrency.AllErrorRecorder

Please use the ":=" syntax instead because that's what we're using throughout the code base as well.

Additionally, please shorten the variable name because it's a short scope and the same name is already mentioned in the type.

After these changes, it should look like this:

ec := concurrency.AllErrorRecorder{}
wg := sync.WaitGroup{}

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 332 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Workflows
Workflow

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 385 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

"split_clone"

Let's keep it simple and just name it "clone".

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 389 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

"split_diff"

"diff"

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 393 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

"mst"

"migrate"

Done.


go/vt/workflow/resharding/horizontal_resharding_workflow_test.go, line 80 at r7 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

As discussed offline, please do not repeat this code in the test.

Instead, I suggest to have a "Run" and a "run" function. In "Run" you can set the wrangler from the workflow library and call "run". Then in "run" you would put this code. This way, you can call "run" from your test.

Leave it later after refactoring the code.


go/vt/wrangler/keyspace.go, line 29 at r5 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

Please use this constant in the definition of the flag in vtctl.go as well (line 1373).

Here's the current code which I mean:

	maxDelay := subFlags.Duration("max_delay", 30*time.Second,
		"Specifies the maximum delay, in seconds, the filtered replication of the"+
			" given destination shard should lag behind the source shard. When"+
			" higher, the command will block and wait for the delay to decrease.")

Done.


go/vt/workflow/horizontalreshard/horizontal_resharding_workflow.go, line 83 at r1 (raw file):

Previously, mberlin-bot (Michael Berlin) wrote…

I forgot to mention an important detail.

The workflow package supports hierarchical workflows and that is exactly what we want here.

For each step (copy schema shard, clone, wait for filtered replication, diff, MST), we want a sub group and within each group we want one entry per source (or where applicable) per dest shard.

On the UI side, you present this as Children of the root UI node (and then children of the children). For example, here's the schema swap code which creates a child for each shard: https://github.com/youtube/vitess/blob/master/go/vt/schemamanager/schemaswap/schema_swap.go#L422
And then, each "shardUINode" has multiple children as well.

In our case, we want nodes for the phase on the first level and a node per shard on the second level.

Please change the code accordingly.

Done.


Comments from Reviewable

@wangyipei01 wangyipei01 force-pushed the reshardworkflow branch 2 times, most recently from 6d142a8 to 13aebda Compare January 20, 2017 03:50
@mberlin-bot
Copy link

mberlin-bot commented Jan 20, 2017

:lgtm:

Very good!

Please address my last three comments and then you're good to merge it! :)


Reviewed 12 of 12 files at r8.
Review status: all files reviewed at latest revision, 29 unresolved discussions.


go/vt/vtctl/vtctl.go, line 1388 at r8 (raw file):

	}

	if err := wr.WaitForFilteredReplication(ctx, keyspace, shard, *maxDelay); err != nil {

Please simplify this:

return wr.WaitForFilteredReplication(ctx, keyspace, shard, *maxDelay)

go/vt/workflow/resharding/horizontal_resharding_workflow.go, line 156 at r8 (raw file):

Shard

Please write "shard" (all lower case) to be consistent with the other path names ("copy_schema", "clone", "diff", "migrate").


web/vtctld2/src/app/shared/flags/workflow.flags.ts, line 105 at r8 (raw file):

Vtworkers

vtworkers

As far as I can tell all flags are lowercase. Let's keep it this way.


Comments from Reviewable

Approved with PullApprove

It supports creating workflow to reshard multiple source shards.
It will show a phase-based tree-structure UI.
Unit Test is included for testing the happy path.
@wangyipei01 wangyipei01 merged commit 3b8611d into vitessio:master Jan 20, 2017
@wangyipei01 wangyipei01 deleted the reshardworkflow branch January 20, 2017 23:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants