Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Training updates #4214

Merged
merged 12 commits into from Jan 7, 2019
Merged

Training updates #4214

merged 12 commits into from Jan 7, 2019

Conversation

jseldess
Copy link
Contributor

@jseldess jseldess commented Dec 27, 2018

Addresses #4128.

This PR updates our training content as follows:

  • Adds geo-partitioning, Kubernetes, and TPC-C modules.
  • Updates locality and replication zones and several other modules.
  • Removes training for older versions.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@jseldess
Copy link
Contributor Author

@bdarnell, PTAL at what's here so far. I'll keep working on the other missing pieces.

Copy link
Member

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 131 of 132 files at r1, 3 of 3 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


v2.1/training/geo-partitioning.md, line 20 at r2 (raw file):

Make sure you have already completed [Locality and Replication Zones](locality-and-replication-zones.html) and have 9 nodes running locally across 3 distinct localities:

- `--locality=region=us-east=datacenter=us-east1`

The equals sign before datacenter should be a comma.


v2.1/training/geo-partitioning.md, line 248 at r2 (raw file):

For this service, the most effective technique for improving read and write latency is to geo-partition the data by city. In essence, this means changing the way data is mapped to ranges. Instead of an entire table and its indexes mapping to a specific range or set of ranges, all rows in the table and its indexes with a given city will map to a range or set of ranges.

1. Partition the `users` table by city:

This doesn't scale very well beyond this example. Should we discuss how in the real world you'd probably want a "region" field in the database instead of using city directly?


v2.1/training/geo-partitioning.md, line 322 at r2 (raw file):

    ~~~

    {% include copy-clipboard.html %}

It would be nicer to consolidate this into 1 copyable block (or three, one per table) instead of 18 separate copy/paste steps.


v2.1/training/geo-partitioning.md, line 441 at r2 (raw file):

~~~ sql
> SELECT * FROM [SHOW EXPERIMENTAL_RANGES FROM TABLE vehicles] \
WHERE "start_key" IS NOT NULL AND "start_key" NOT LIKE '%Prefix%';

We should add a note explaining that this excludes the empty space between the listed cities (which is using the default configuration).

Is the start_key IS NOT NULL clause necessary?


v2.1/training/geo-partitioning.md, line 542 at r2 (raw file):

    ~~~

    This simplified shutdown process is only appropriate for a lab/evaluation scenario. In a production environment, you would use `cockroach quit` to gracefully shut down each node.

I don't agree with the second sentence; in production you'd use a process manager (which internally uses kill) instead of cockroach quit. (it's the -9 instead of the use of kill that makes this a non-graceful shutdown)


v2.1/training/orchestration-with-kubernetes.md, line 20 at r2 (raw file):

[minikube](http://kubernetes.io/docs/getting-started-guides/minikube/) | This is the tool you'll use to run a Kubernetes cluster inside a VM on your local workstation.
[pod](http://kubernetes.io/docs/user-guide/pods/) | A pod is a group of one of more Docker containers. In this tutorial, all pods will run on your local workstation, each containing one Docker container running a single CockroachDB node. You'll start with 3 pods and grow to 4.
[StatefulSet](http://kubernetes.io/docs/concepts/abstractions/controllers/statefulsets/) | A StatefulSet is a group of pods treated as stateful units, where each pod has distinguishable network identity and always binds back to the same persistent storage on restart. StatefulSets are considered stable as of Kubernetes version 1.9 after reaching beta in version 1.5.

I'd remove the history lesson and replace the last sentence with "StatefulSets require Kubernetes version 1.9 or newer".


v2.1/training/orchestration-with-kubernetes.md, line 41 at r2 (raw file):

## Step 2. Start CockroachDB

To start your CockroachDB cluster, you can either use our StatefulSet configuration and related files directly, or you can use the [Helm](https://helm.sh/) package manager for Kubernetes to simplify the process.

I think for the training we should pick one instead of offering a choice. I'd also like to use secure mode here since we've moved on from the local toy mode to something closer to production.

@jseldess
Copy link
Contributor Author

TFTR, @bdarnell. I'll get back to this early in the new year.

jseldess and others added 5 commits January 3, 2019 17:31
Move scaling to orchestration into new section.
We never versioned the slides. They tend to reflect the latest
version, which is confusing in combination with older training
modules.
Copy link
Contributor

@lnhsingh lnhsingh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


v2.1/training/geo-partitioning.md, line 20 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

The equals sign before datacenter should be a comma.

Done.


v2.1/training/geo-partitioning.md, line 248 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This doesn't scale very well beyond this example. Should we discuss how in the real world you'd probably want a "region" field in the database instead of using city directly?

Added note.


v2.1/training/geo-partitioning.md, line 322 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

It would be nicer to consolidate this into 1 copyable block (or three, one per table) instead of 18 separate copy/paste steps.

Done.


v2.1/training/geo-partitioning.md, line 542 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I don't agree with the second sentence; in production you'd use a process manager (which internally uses kill) instead of cockroach quit. (it's the -9 instead of the use of kill that makes this a non-graceful shutdown)

Removed second sentence.


v2.1/training/orchestration-with-kubernetes.md, line 20 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I'd remove the history lesson and replace the last sentence with "StatefulSets require Kubernetes version 1.9 or newer".

Done.


v2.1/training/orchestration-with-kubernetes.md, line 41 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I think for the training we should pick one instead of offering a choice. I'd also like to use secure mode here since we've moved on from the local toy mode to something closer to production.

@bdarnell / @jseldess, do you have a preference between using StatefulSet or Helm here? When you say secure mode, are you talking about using cockroachdb-statefulset-secure.yaml instead of cockroachdb-statefulset-insecure.yaml? Is there a secure mode if we use Helm?

Copy link
Member

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


v2.1/training/geo-partitioning.md, line 542 at r2 (raw file):

Previously, lhirata wrote…

Removed second sentence.

I noticed that the same sentence appears in other training docs too: fault tolerance, locality and replication zones, and security.


v2.1/training/orchestration-with-kubernetes.md, line 41 at r2 (raw file):

Previously, lhirata wrote…

@bdarnell / @jseldess, do you have a preference between using StatefulSet or Helm here? When you say secure mode, are you talking about using cockroachdb-statefulset-secure.yaml instead of cockroachdb-statefulset-insecure.yaml? Is there a secure mode if we use Helm?

I'll try both tomorrow and get back to you (I have a slight bias towards the statefulset config because helm is another thing to install, but our usage stats show that helm is growing fast).

Yes, for secure mode I mean using the -secure.yaml file for the statefulset config. For helm, there's a secure variable that can be set but I'm not familiar with the workflow.

Copy link
Contributor

@lnhsingh lnhsingh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


v2.1/training/geo-partitioning.md, line 542 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I noticed that the same sentence appears in other training docs too: fault tolerance, locality and replication zones, and security.

Removed elsewhere.

@bdarnell
Copy link
Member

bdarnell commented Jan 4, 2019

OK, for kubernetes, let's skip helm and just use the statefulset configs. We're not yet able to use helm for things like scaling and upgrades, so I think it's simpler to leave it out and just work with kubectl.

  • Don't link to the performance-optimized yaml file; the training should always use the basic version. Instead, at the end of the page, offer a link to https://www.cockroachlabs.com/docs/stable/kubernetes-performance.html

  • Add some cleanup steps before minikube delete. They may be using minikube for other things and not want to destroy the whole thing:

    • kubectl delete -f https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/cockroachdb-statefulset.yaml
    • kubectl delete job.batch/cluster-init

Copy link
Contributor

@lnhsingh lnhsingh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


v2.1/training/geo-partitioning.md, line 441 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

We should add a note explaining that this excludes the empty space between the listed cities (which is using the default configuration).

Is the start_key IS NOT NULL clause necessary?

To clarify, this example is using the default configuration?

Copy link
Member

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


v2.1/training/geo-partitioning.md, line 441 at r2 (raw file):

Previously, lhirata wrote…

To clarify, this example is using the default configuration?

No, what I mean is that we've created empty ranges in between each of our cities, and those empty ranges use the default zone configuration. For example, if we inserted a record with city='chicago' it would be inserted in the range from /"boston"/PrefixEnd to /"los angeles". That range always exists, but because in this example we don't intend to use anything outside our defined list of cities, we're hiding those in-between ranges with the "start_key" NOT LIKE '%Prefix%' clause.

The includes were resulting in broken links due to the
nested training directory. This is the easiest way to
fix the links, though it creates redundancy that is
more difficult to maintain. Perhaps we can find a better
solution later.
Copy link
Contributor Author

@jseldess jseldess left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


v2.1/training/geo-partitioning.md, line 441 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

No, what I mean is that we've created empty ranges in between each of our cities, and those empty ranges use the default zone configuration. For example, if we inserted a record with city='chicago' it would be inserted in the range from /"boston"/PrefixEnd to /"los angeles". That range always exists, but because in this example we don't intend to use anything outside our defined list of cities, we're hiding those in-between ranges with the "start_key" NOT LIKE '%Prefix%' clause.

Added a note. And Ben's right that start_key IS NOT NULL isn't necessary, so removed that.

@jseldess jseldess changed the title [WIP] Training updates Training updates Jan 7, 2019
@jseldess jseldess merged commit 5ef593d into master Jan 7, 2019
@jseldess jseldess deleted the training-updates branch January 7, 2019 22:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants