kubernetes: Config files + setup script for secure multiregion clusters #27092

Merged
merged 1 commit into cockroachdb:master on Aug 14, 2018

Conversation

@a-robinson
Member

a-robinson commented Jul 1, 2018

tl;dr fill in a few constants at the top of secure.sh and then run it,
and you'll have a working secure multiregion cluster. Only tested on GKE
so far.

This relies on linking together each cluster's DNS servers so that
they're able to defer requests to each other for a special zone-scoped
namespace. This is a little hacky, but very maintainable and survivable
even if all the nodes in a cluster are taken down and brought back up
again later with different IP addresses.
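
To make the linking concrete, here's a minimal sketch of the idea (placeholder contexts and IPs, not the actual script): each cluster's kube-dns gets a `stubDomains` entry forwarding lookups for the other clusters' zone-named namespaces to those clusters' DNS servers.

```python
# Hedged sketch of the DNS-linking idea (placeholder names, not the real
# setup script). Each cluster's kube-dns learns to forward lookups for
# the other clusters' zone-scoped namespaces to their DNS endpoints.
import json
import subprocess

# kubectl context -> (namespace named after the zone, reachable DNS IP).
# These values are illustrative only.
clusters = {
    'gke_myproject_us-east1-b_cockroachdb1': ('us-east1-b', '10.0.0.1'),
    'gke_myproject_us-west1-a_cockroachdb2': ('us-west1-a', '10.0.0.2'),
}

for context in clusters:
    # Forward every *other* cluster's "<zone>.svc.cluster.local" domain
    # to that cluster's DNS server.
    stub_domains = {
        '%s.svc.cluster.local' % zone: [ip]
        for other, (zone, ip) in clusters.items() if other != context
    }
    config_map = """\
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  stubDomains: '%s'
""" % json.dumps(stub_domains)
    subprocess.run(['kubectl', 'apply', '-f', '-', '--context', context],
                   input=config_map.encode(), check=True)
```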

The big caveat that I learned after I thought I was done, though, is
that GCE's internal load balancers only work within a region.
Unfortunately I had done my prototyping all within one region, with each
cluster just in a different zone. To get around this, I've had to switch
from exposing each DNS server on an internal IP to exposing them on
external IPs, which some users may not like, and which might tip the
scales in favor of a different solution. I'd be happy to discuss the
alternatives (hooking up a CoreDNS instance in each cluster to every
cluster's apiserver, or using Istio multicluster) with anyone
interested.
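
In practice, that workaround amounts to putting a LoadBalancer Service in front of each cluster's kube-dns pods. A rough sketch with placeholder names (not the actual manifest from this PR):

```python
# Hedged sketch: expose each cluster's kube-dns pods on an external IP
# via a LoadBalancer Service, since GCE internal load balancers are
# region-scoped. Service name and contexts are illustrative placeholders.
import subprocess

DNS_LB_SERVICE = """\
apiVersion: v1
kind: Service
metadata:
  name: kube-dns-lb
  namespace: kube-system
spec:
  type: LoadBalancer
  selector:
    k8s-app: kube-dns
  ports:
  - name: dns
    port: 53
    protocol: UDP
"""

# Placeholder kubectl contexts for the participating clusters.
contexts = [
    'gke_myproject_us-east1-b_cockroachdb1',
    'gke_myproject_us-west1-a_cockroachdb2',
]

for context in contexts:
    subprocess.run(['kubectl', 'apply', '-f', '-', '--context', context],
                   input=DNS_LB_SERVICE.encode(), check=True)
```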

I've only handled the secure cluster case in the script, but it can be
easily modified to also handle insecure clusters as an option. I should
translate the script to Python for that, though. It's already more bash
than I'd want to ask people to use (and it requires bash v4, which macOS
doesn't ship with by default).

Release note (general change): Configuration files and a setup script
for secure multiregion deployments on Kubernetes are now provided.

Fixes #25189

@a-robinson a-robinson requested review from bobvawter and kannanlakshmi Jul 1, 2018

@cockroach-teamcity
Member

cockroach-teamcity commented Jul 1, 2018

This change is Reviewable

@kannanlakshmi

kannanlakshmi commented Jul 12, 2018

Thanks, Alex. Some comments on things that weren't intuitive to me, though they're more applicable for @jseldess as he puts together the docs:

  1. It would be good to specify the prerequisites before starting, i.e., two or more K8s clusters in different zones or regions (maybe call out how zones and regions work differently, as you pointed out).
  2. Setting up appropriate RBAC for each cluster: I struggled with this even though it seems obvious in retrospect.
  3. How to delete everything (e.g., namespaces) and clean up folders when you're done.
  4. How to port-forward to the web UI: I wasn't able to actually view it in the web UI even though the CLI suggested the port forwarding worked. Do you just need to port-forward one of the nodes in any region? How does that work? e.g., $ kubectl port-forward cockroachdb-0 8080 --context=gke_cockroach-workers_us-central1-a_lakshmi --namespace=us-central1-a
     [screenshot]
@a-robinson
Member

a-robinson commented Jul 23, 2018

Thanks for the feedback! I've converted everything to Python and added a teardown.py script for cleaning up all the resources that setup.py creates. I'll work with @jseldess on documentation. I'm interested in whether @bobvawter has feedback on this before going too far with it, though, since the DNS linking is not ideal.

@a-robinson
Member

a-robinson commented Jul 23, 2018

> How to port forward to the web UI - I wasn't able to actually view it in the web UI though the CLI suggested port forwarding worked. Do you just need to port forward one of the nodes in any region? How does that work? e.g., $ kubectl port-forward cockroachdb-0 8080 --context=gke_cockroach-workers_us-central1-a_lakshmi --namespace=us-central1-a

Yes, you should just have to port-forward to any of the nodes in any of the regions. I'm not sure what you mean by "I wasn't able to actually view it in the web UI".

@a-robinson
Member

a-robinson commented Aug 13, 2018

ping @bobvawter?

@mberhault

Reviewed 7 of 10 files at r1, 4 of 5 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


cloud/kubernetes/multiregion/client-secure.yaml, line 11 at r2 (raw file):

  containers:
  - name: cockroachdb-client
    image: cockroachdb/cockroach:v2.0.3

2.0.3? Here and elsewhere. We seem to be on 2.0.5 for the non-regional configs.


cloud/kubernetes/multiregion/cockroachdb-statefulset.yaml, line 1 at r2 (raw file):

apiVersion: v1

This is the insecure version, is it needed? All other configs (init, client) use secure mode.


cloud/kubernetes/multiregion/README.md, line 42 at r2 (raw file):

This requirement is satisfied by clusters deployed in cloud environments such as Google Kubernetes Engine, and
can also be satsified by on-prem environments depending on the [Kubernetes networking setup](https://kubernetes.io/docs/concepts/cluster-administration/networking/) used. If you want to test whether your cluster will work, you can run this basic network test:

s/satsified/satisfied/


cloud/kubernetes/multiregion/setup.py, line 9 at r2 (raw file):

# Before running the script, fill in appropriate values for all the parameters
# above the dashed line.

I think something that fails to run if not changed would be helpful. It would prevent people from just blindly running this.


cloud/kubernetes/multiregion/setup.py, line 33 at r2 (raw file):

# Path to the cockroach binary on your local machine that you want to use
# generate certificates. Defaults to trying to find cockroach in your PATH.
# TODO: CHANGE BACK

Was that meant to be addressed?

kubernetes: Config files + setup script for secure multiregion clusters
tl;dr fill in a few constants at the top of secure.py and then run it,
and you'll have a working secure multiregion cluster. Works on GKE. Does
not work on AWS.

This relies on linking together each cluster's DNS servers so that
they're able to defer requests to each other for a special zone-scoped
namespace. This is a little hacky, but very maintainable and survivable
even if all the nodes in a cluster are taken down and brought back up
again later with different IP addresses.

The big caveat that I learned after I thought I was done, though, is
that GCE's internal load balancers only work within a region.
Unfortunately I had done my prototyping all within one region, with each
cluster just in a different zone. To get around this, I've had to switch
from exposing each DNS server on an internal IP to exposing them on
external IPs, which some users may not like, and which might tip the
scales in favor of a different solution. I'd be happy to discuss the
alternatives (hooking up a CoreDNS instance in each cluster to every
cluster's apiserver, or using Istio multicluster) with anyone
interested.

I've only handled the secure cluster case in the script, but it can be
easily modified to also handle insecure clusters as an option. Insecure
clusters can actually run in more environments than secure clusters,
though, and should probably be handled differently for that reason.

Release note (general change): Configuration files and a setup script
for secure multiregion deployments on GKE are now provided.

@a-robinson

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


cloud/kubernetes/multiregion/client-secure.yaml, line 11 at r2 (raw file):

Previously, mberhault (marc) wrote…

2.0.3? Here and elsewhere. We seem to be on 2.0.5 for the non-regional configs.

It's just fallen behind while waiting on review/feedback. Updated.


cloud/kubernetes/multiregion/cockroachdb-statefulset.yaml, line 1 at r2 (raw file):

Previously, mberhault (marc) wrote…

This is the insecure version, is it needed? All other configs (init, client) use secure mode.

No, not needed. I don't think it's likely to be wanted by anyone. Removed.


cloud/kubernetes/multiregion/README.md, line 42 at r2 (raw file):

Previously, mberhault (marc) wrote…

s/satsified/satisfied/

Done.


cloud/kubernetes/multiregion/setup.py, line 9 at r2 (raw file):

Previously, mberhault (marc) wrote…

I think something that fails to run if not changed would be helpful. It would prevent people from just blindly running this.

Done. I made the instructions clearer, kept a commented example, and added some basic input validation.
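
As a rough illustration (hypothetical names, not the actual setup.py), such a fail-fast guard might look like:

```python
# Hedged sketch of a fail-fast guard: refuse to run until the user fills
# in their own cluster contexts. Variable names are illustrative, not
# the real setup.py's.
import sys

# Map each region/zone to the kubectl context for its cluster, e.g.:
# contexts = {
#     'us-east1-b': 'gke_myproject_us-east1-b_cockroachdb1',
#     'us-west1-a': 'gke_myproject_us-west1-a_cockroachdb2',
# }
contexts = {}

if len(contexts) < 2:
    sys.exit('Must fill in the `contexts` map at the top of this file '
             'with at least two Kubernetes cluster contexts before running.')
```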


cloud/kubernetes/multiregion/setup.py, line 33 at r2 (raw file):

Previously, mberhault (marc) wrote…

Was that meant to be addressed?

Indeed, thanks.

@mberhault
Contributor

mberhault commented Aug 14, 2018

Nice! LGTM

@a-robinson
Member

a-robinson commented Aug 14, 2018

bors r+

craig bot pushed a commit that referenced this pull request Aug 14, 2018

Merge #27092
27092: kubernetes: Config files + setup script for secure multiregion clusters r=a-robinson a=a-robinson


Co-authored-by: Alex Robinson <alexdwanerobinson@gmail.com>
@craig

craig bot commented Aug 14, 2018

Build succeeded

craig bot merged commit 1b83033 into cockroachdb:master on Aug 14, 2018

3 checks passed:

  GitHub CI (Cockroach): TeamCity build finished
  bors: Build succeeded
  license/cla: Contributor License Agreement is signed