Skip to content

roachprod: Make multiple set [provider]-zones always geo-distribute nodes#43748

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
jlinder:jhl/roachprod-multizone-and-geo
Jan 7, 2020
Merged

roachprod: Make multiple set [provider]-zones always geo-distribute nodes#43748
craig[bot] merged 1 commit intocockroachdb:masterfrom
jlinder:jhl/roachprod-multizone-and-geo

Conversation

@jlinder
Copy link
Copy Markdown
Collaborator

@jlinder jlinder commented Jan 6, 2020

Before: if multiple zones were set for a provider and --geo wasn't set, all hosts
would be started in just one zone in one region.

Why change? Because if multiple zones are set, the intention is that they be
used.

Now, --geo and --[provider]-zones work as follows for gcloud, aws and azure:

  1. when geo and zones are not set, nodes are all placed in one of the default zones
  2. when geo is set but zones aren't, nodes are spread evenly across the default zones
  3. when zones are set, nodes are spread evenly across the specified zones

Fixes #38542.

Release note: None

@jlinder jlinder added first-pr Use to mark the first PR sent by a contributor / team member. Reviewers should be mindful of this. A-roachprod labels Jan 6, 2020
@jlinder jlinder requested a review from bobvawter January 6, 2020 18:58
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@bobvawter
Copy link
Copy Markdown
Contributor

Looks like these changes conflict with some other flag-handling changes that just landed.

@jlinder jlinder force-pushed the jhl/roachprod-multizone-and-geo branch 2 times, most recently from 06a7278 to f2ef78e Compare January 6, 2020 21:19
@jlinder
Copy link
Copy Markdown
Collaborator Author

jlinder commented Jan 6, 2020

Ha. Such luck!

I've done the update. It's RFAL.

@jlinder jlinder requested a review from kenliu January 6, 2020 22:02
Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read this change correctly it's going to massively change the behavior of roachprod for users which don't specify zones. Most users never specify any zones. There is an inbuilt assumption (perhaps bad) that if you don't specify geo you're going to get a cluster in us-east2 and if you do it's going to span us-west, us-east and eu-west (for AWS and equivalent for GCP).

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bobvawter and @kenliu)

@jlinder
Copy link
Copy Markdown
Collaborator Author

jlinder commented Jan 6, 2020

This PR doesn't change how --geo is applied when no zones are specified on the command line. It seems the PR and commit descriptions need to be more extensive in describing the new behavior.

Here are the combinations for how it works:

  • if geo and zones are not set, then all nodes get placed in the same zone in one region
  • if geo is set but zones aren't set, then the nodes are geo-distributed across us-west, us-east and eu-west
  • if zones are set, then the nodes are distributed across the provided zones

I'll make the description updates.

Your question does bring up for me though that it would be good to communicate the change before I merge.

Also, are there any automated tasks that would need to be updated based on the new implementation? Or anything I should check?

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check for uses of create in https://github.com/cockroachdb/cockroach/tree/master/pkg/cmd/roachtest

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bobvawter and @kenliu)

Copy link
Copy Markdown
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it should be alright. I think the thing I missed was that you're defaulting the zones to []string{}

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bobvawter and @kenliu)

Copy link
Copy Markdown
Contributor

@bobvawter bobvawter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits:

  • The default value of nil is idiomatic for providing an empty slice value. cap(nil)==len(nil) == 0
  • You can have the commit automatically close the issue when merged if you use Fixes: #12345. There are also a number of synonyms: https://help.github.com/en/github/managing-your-work-on-github/closing-issues-using-keywords
  • It would be nice to add a table to the roachprod readme showing the interplay of the various geo vs. zones options and where a user can expect nodes to be provisioned.

Changes to tools are usually announced with a note to the eng@ mailing list and the #engineering channel.

Reviewed 4 of 4 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @kenliu)

…odes

Before: if multiple zones were set for a provider and --geo wasn't set, all
hosts would be started in just one zone in one region.

Why change? Because if multiple zones are set, the intention is that they be
used.

Now, --geo and --[provider]-zones work as follows for gcloud, aws and azure:

1. when geo and zones are not set, nodes are all placed in one of the
   default zones
2. when geo is set but zones aren't, nodes are spread evenly across the
   default zones
3. when zones are set, nodes are spread evenly across the specified zones

Fixes cockroachdb#38542.

Release note: None
@jlinder jlinder force-pushed the jhl/roachprod-multizone-and-geo branch from f2ef78e to d24e40e Compare January 7, 2020 18:18
@jlinder
Copy link
Copy Markdown
Collaborator Author

jlinder commented Jan 7, 2020

This is RFAL.

I've updated to use nil instead of []string{}, added Fixes: to the PR description and commit message, updated the PR description and commit msg to fully explain the new use expectations, added some sections to the README for geo + zone flag expectations, and checked roachtest for how it uses roachprod create (it's always setting geo when specifying zones).

Copy link
Copy Markdown
Contributor

@bobvawter bobvawter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 4 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @kenliu)

@jlinder
Copy link
Copy Markdown
Collaborator Author

jlinder commented Jan 7, 2020

bors r+

craig Bot pushed a commit that referenced this pull request Jan 7, 2020
43748: roachprod: Make multiple set [provider]-zones always geo-distribute nodes r=jlinder a=jlinder

Before: if multiple zones were set for a provider and --geo wasn't set, all hosts
would be started in just one zone in one region.

Why change? Because if multiple zones are set, the intention is that they be
used.

Now, --geo and --[provider]-zones work as follows for gcloud, aws and azure:

1. when geo and zones are not set, nodes are all placed in one of the default zones
2. when geo is set but zones aren't, nodes are spread evenly across the default zones
3. when zones are set, nodes are spread evenly across the specified zones

Fixes #38542.

Release note: None

Co-authored-by: James H. Linder <jamesl@cockroachlabs.com>
@craig
Copy link
Copy Markdown
Contributor

craig Bot commented Jan 7, 2020

Build succeeded

@craig craig Bot merged commit d24e40e into cockroachdb:master Jan 7, 2020
@jlinder jlinder deleted the jhl/roachprod-multizone-and-geo branch January 7, 2020 22:55
jlinder added a commit to jlinder/cockroach that referenced this pull request Jan 14, 2020
Before: all zones specified either in the clusterSpec or via the --zones flag
were added to the roachprod command without regard to whether --geo was set.

Why change? Because the way roachprod handled --geo and --[provider]-zones
changed. See cockroachdb#43748 .

Now:

1. if geo is not set and zones are set (via CLI flag or in the clusterSpec),
   only the first zone will be passed to roachprod.
2. if geo is set and zones are set (via CLI flag or in the clusterSpec), all
   zones will be passed to roachprod.

Fixes cockroachdb#43898

Release note: None
jlinder added a commit to jlinder/cockroach that referenced this pull request Jan 14, 2020
Before: all zones specified either in the clusterSpec or via the --zones flag
were added to the roachprod command without regard to whether --geo was set.

Why change? Because the way roachprod handled --geo and --[provider]-zones
changed. See cockroachdb#43748 .

Now:

1. if geo is not set and zones are set (via CLI flag or in the clusterSpec),
   only the first zone will be passed to roachprod.
2. if geo is set and zones are set (via CLI flag or in the clusterSpec), all
   zones will be passed to roachprod.

Fixes cockroachdb#43898

Release note: None
craig Bot pushed a commit that referenced this pull request Jan 14, 2020
43975: roachtest: only pass one zone if geo isn't set r=jlinder a=jlinder

Before: all zones specified either in the clusterSpec or via the --zones flag
were added to the roachprod command without regard to whether --geo was set.

Why change? Because the way roachprod handled --geo and --[provider]-zones
changed. See #43748 .

Now:

1. if geo is not set and zones are set (via CLI flag or in the clusterSpec),
   only the first zone will be passed to roachprod.
2. if geo is set and zones are set (via CLI flag or in the clusterSpec), all
   zones will be passed to roachprod.

Fixes #43898

Release note: None

Co-authored-by: James H. Linder <jamesl@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-roachprod first-pr Use to mark the first PR sent by a contributor / team member. Reviewers should be mindful of this.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

roachprod: Multiple zones are silently ignored without --geo

4 participants