Create FAQs for numbering problems. #3104

knz · 2018-05-05T01:04:12Z

As discussed on cockroachdb/cockroach#9227 and other related issues.

Jesse please take this over.

cockroach-teamcity · 2018-05-05T01:04:17Z

This change is

cockroach-teamcity · 2018-05-05T01:05:53Z

http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/15ab607cae1dd2f623e6f21b49ccd09b0d52214a/

bdarnell · 2018-05-06T22:13:00Z

Reviewed 4 of 4 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved, all commit checks successful.

_includes/faq/sequential-numbers.md, line 5 at r1 (raw file):

{{site.data.alerts.callout_info}}Sequences produce <emph>unique</emph> values, however not all values are guaranteed to be produced (e.g., when a transaction is canceled after it consumes a value) and the values may be slightly reordered (e.g., when a transaction that consumes a lower sequence number commits after a transaction that consumes a higher number).{{site.data.alerts.end}}

{{site.data.alerts.callout_info}}For maximum performance, avoid using sequences to generate row IDs or indexed columns. This is because sequence values are logically close to each other and can cause contention on few data ranges during inserts. Instead, prefer <code>UUID</code>  identifiers or integer identifiers generated with <code>unique_rowid()</code>.{{site.data.alerts.end}}

unique_rowid() is still mostly sequential - it will perform more like sequences than like uuid with respect to contention. The main difference between unique_rowid() and sequences is that sequences produce smaller numbers (but generating them is much slower).

So we should recommend that if you need a roughly-ordered id, use unique_rowid() unless you need the numbers to be small. And unless you specifically need a roughly-ordered id, you're probably better off using a UUID (it's important to emphasize this point to wean people off the habits of sequential IDs from other databases).
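
To make the "mostly sequential" point concrete, here is a minimal sketch of a time-plus-node ID next to random UUIDs. The bit layout (`NODE_BITS`, `toy_rowid`) is an assumption chosen for illustration and is not claimed to match how `unique_rowid()` is actually implemented; it only shows why an ID built from a timestamp and a node ID stays roughly time-ordered while a UUID does not.

```python
import uuid

NODE_BITS = 15  # assumed width of the node-ID field in this toy layout


def toy_rowid(timestamp_us: int, node_id: int) -> int:
    """Toy time-plus-node ID: high bits hold a timestamp, low bits a node ID."""
    return (timestamp_us << NODE_BITS) | (node_id & ((1 << NODE_BITS) - 1))


# Three inserts one microsecond apart, arriving on different (hypothetical) nodes.
rowids = [
    toy_rowid(1_525_000_000_000_000 + i, node)
    for i, node in enumerate([3, 1, 2])
]
uuids = [uuid.uuid4() for _ in range(3)]

print(rowids == sorted(rowids))  # True: ordered by time regardless of node
print(uuids)                     # random values, unrelated to insertion order
```

Because the timestamp occupies the high bits, any two IDs generated at different times sort by time, which is the property that makes such IDs behave like sequences with respect to where new rows land.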

v2.0/sql-faqs.md, line 21 at r1 (raw file):

{% include faq/sequential-numbers.md %}

## How do I totally order writes to a table over time in CockroachDB?

I don't think "totally order" is the right term to use here. All rows in CRDB are totally ordered in the mathematical sense. The question here is how to make that total order correspond with insertion order.
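
A small sketch of why ID order and insertion (commit) order can differ, using the two cases named in the sequences callout quoted above: a transaction that consumes a value and is then canceled, and a transaction with a lower value that commits later. The transaction names and the chosen outcome are hypothetical, and `itertools.count` merely stands in for a sequence.

```python
import itertools

seq = itertools.count(1)  # stands in for a sequence: each call yields a new value

# Each hypothetical transaction takes a value when it starts.
txns = [{"name": name, "id": next(seq)} for name in ("A", "B", "C", "D")]

# Assumed outcome: B is canceled after consuming its value, and A commits last.
commit_order = ["C", "D", "A"]
by_name = {t["name"]: t for t in txns}
committed_ids = [by_name[name]["id"] for name in commit_order]

print(committed_ids)          # [3, 4, 1]: commit order differs from ID order
print(sorted(committed_ids))  # [1, 3, 4]: 2 was consumed but never stored
```

The values are unique in both cases, but sorting by ID neither reproduces the commit order nor yields a gapless series.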

Comments from Reviewable

cockroach-teamcity · 2018-05-08T20:27:07Z

http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/22fea6932032f2a39f0f065aa2ba9356005c1a71/

knz · 2018-05-08T20:27:27Z

Amended based on Ben's suggestion. Also added a comparison table. RFAL.

Review status: 0 of 5 files reviewed at latest revision, 2 unresolved discussions.

_includes/faq/sequential-numbers.md, line 5 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

unique_rowid() is still mostly sequential - it will perform more like sequences than like uuid with respect to contention. The main difference between unique_rowid() and sequences is that sequences produce smaller numbers (but generating them is much slower).

So we should recommend that if you need a roughly-ordered id, use unique_rowid() unless you need the numbers to be small. And unless you specifically need a roughly-ordered id, you're probably better off using a UUID (it's important to emphasize this point to wean people off the habits of sequential IDs from other databases).

Thanks for reminding me. Amended.

v2.0/sql-faqs.md, line 21 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

I don't think "totally order" is the right term to use here. All rows in CRDB are totally ordered in the mathematical sense. The question here is how to make that total order correspond with insertion order.

Yes, you're right. Rephrased.


cockroach-teamcity · 2018-05-08T20:29:08Z

http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/3d26424a01d61b027601fc8df447a7bbda973c0e/

bdarnell · 2018-05-08T23:52:55Z

Reviewed 5 of 5 files at r2.
Review status: all files reviewed at latest revision, all discussions resolved, all commit checks successful.

_includes/faq/differences-between-numberings.md, line 7 at r2 (raw file):

| Value distribution             | Uniformly distributed (128 bits)        | Contains time and space (node ID) components | Dense, small values            |
| Performance cost at generation | Small, scalable                         | Small, scalable                              | Variable, can cause contention |
| Locality                       | Maximally distributed, least contention | Somewhat local, may cause INSERT contention  | Highly local, most INSERT contention |

The locality issue is not about insert contention: unique_rowid values are extremely unlikely to directly contend with each other. The difference between unique_rowid and UUID is parallelism: looking up values by a UUID can make use of many nodes because the values will be spread across many ranges. Typical queries by a more time-ordered id will create more hotspots.
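
The parallelism argument can be sketched with a toy model. It assumes the key space is cut into a fixed number of equal-width slices (`NUM_RANGES`, `range_of` are hypothetical names); real ranges are split by data size, so this only illustrates the direction of the effect, not actual range boundaries.

```python
import uuid

NUM_RANGES = 16  # hypothetical: key space cut into 16 equal slices


def range_of(key: int, key_bits: int) -> int:
    """Return the index of the equal-width slice that holds `key`."""
    return (key * NUM_RANGES) >> key_bits


# 1000 random 128-bit keys versus 1000 consecutive 64-bit keys.
uuid_ranges = {range_of(uuid.uuid4().int, 128) for _ in range(1000)}
start = 1 << 62  # arbitrary starting point for the time-ordered batch
ordered_ranges = {range_of(start + i, 64) for i in range(1000)}

print(len(uuid_ranges))     # many slices: lookups can fan out across nodes
print(len(ordered_ranges))  # 1: the whole batch lands in a single slice
```

A query over recently generated time-ordered IDs therefore touches one slice in this model, while the same number of UUID keys is spread over most of them.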

v2.0/sql-faqs.md, line 37 at r2 (raw file):

- [On the Way to Better SQL Joins](https://www.cockroachlabs.com/blog/better-sql-joins-in-cockroachdb/)

## How do I get the last ID/SERIAL value inserted into a table?

Add this to the 2.1 docs too.


knz · 2018-05-09T15:31:47Z

Review status: 3 of 5 files reviewed at latest revision, 2 unresolved discussions.

_includes/faq/differences-between-numberings.md, line 7 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

The locality issue is not about insert contention: unique_rowid values are extremely unlikely to directly contend with each other. The difference between unique_rowid and UUID is parallelism: looking up values by a UUID can make use of many nodes because the values will be spread across many ranges. Typical queries by a more time-ordered id will create more hotspots.

I reworded, can you have another look?

v2.0/sql-faqs.md, line 37 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Add this to the 2.1 docs too.

It is there already.


cockroach-teamcity · 2018-05-09T15:35:34Z

http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/98e0f6612968ab7f42bdcfb75586e177cfe7e833/

cockroach-teamcity · 2018-05-09T15:46:57Z

http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/6c38dfa461f60f8e3dff716329113ab937586c83/

cockroach-teamcity · 2018-05-09T16:00:45Z

http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/66983e1cbba45eaaa3e79516882f53c787694270/

bdarnell · 2018-05-09T16:09:18Z

Review status: 0 of 5 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful.

_includes/faq/differences-between-numberings.md, line 7 at r2 (raw file):

Previously, knz (kena) wrote…

I reworded, can you have another look?

For "data locality" and "read performance", sequences and unique_rowid are equivalent. We don't need to make a distinction between them except for insert performance.

_includes/faq/differences-between-numberings.md, line 9 at r3 (raw file):

| Value distribution                  | Uniformly distributed (128 bits)        | Contains time and space (node ID) components  | Dense, small values            |
| Data locality                       | Maximally distributed                   | Values generated close in time are co-located | Highly local                   |
| INSERT performance when used as key | Highest                                 | Lower for values generated close in time      | Slowest                        |

Break "insert performance" into latency and throughput buckets. For latency, both uuid and unique_rowid have good latency while sequences are slower. For throughput, UUID has best throughput while unique_rowid and sequences are limited.


knz · 2018-05-09T16:43:06Z

Review status: 0 of 5 files reviewed at latest revision, 2 unresolved discussions.

_includes/faq/differences-between-numberings.md, line 7 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

For "data locality" and "read performance", sequences and unique_rowid are equivalent. We don't need to make a distinction between them except for insert performance.

Well I don't agree with that. Two rowids generated a day apart will have values far apart. Sequences are guaranteed to be close to each other. Or are you saying that the "value distance" due to time distance for rowids does not matter?

_includes/faq/differences-between-numberings.md, line 9 at r3 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Break "insert performance" into latency and throughput buckets. For latency, both uuid and unique_rowid have good latency while sequences are slower. For throughput, UUID has best throughput while unique_rowid and sequences are limited.

The latency increases under contention, doesn't it?


cockroach-teamcity · 2018-05-09T16:46:57Z

http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/d4acdee4ac75ce39ed25ecdc66f229796950bb0b/

bdarnell · 2018-05-09T16:51:23Z

Review status: 0 of 5 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.

_includes/faq/differences-between-numberings.md, line 7 at r2 (raw file):

Previously, knz (kena) wrote…

Well I don't agree with that. Two rowids generated a day apart will have values far apart. Sequences are guaranteed to be close to each other. Or are you saying that the "value distance" due to time distance for rowids does not matter?

The "value distance" only matters if there are other keys between them. Each key will (usually) be adjacent to the one generated before it, whether the difference between the keys is 1 or 1000.
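
A quick way to see this: what matters for locality is how many stored keys sort between two keys, not their numeric difference. The helper below (`keys_between`, a hypothetical name) counts the keys in between for a dense, sequence-like key set and a sparse, rowid-like one.

```python
import bisect


def keys_between(keys, a, b):
    """Count stored keys that sort strictly between a and b (keys is sorted)."""
    lo, hi = sorted((a, b))
    return bisect.bisect_left(keys, hi) - bisect.bisect_right(keys, lo)


dense = [1, 2, 3, 4]               # sequence-like: each value is previous + 1
sparse = [1000, 2000, 3000, 4000]  # rowid-like: numerically far apart

print(keys_between(dense, 3, 4))        # 0: adjacent
print(keys_between(sparse, 3000, 4000))  # 0: also adjacent, despite the gap
print(keys_between([1000, 2000, 2500, 3000], 2000, 3000))  # 1: a key in between
```

In both the dense and the sparse set, each new key sits right next to the one generated before it, so the two behave the same for data locality.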

_includes/faq/differences-between-numberings.md, line 9 at r3 (raw file):

Previously, knz (kena) wrote…

The latency increases under contention, doesn't it?

Yes. Under low traffic, UUID and unique_rowid will have the same latency. As traffic increases, unique_rowid latency will degrade while UUID insertion latency will stay the same.


knz · 2018-05-09T17:02:35Z

Review status: 0 of 5 files reviewed at latest revision, 2 unresolved discussions.

_includes/faq/differences-between-numberings.md, line 7 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

The "value distance" only matters if there are other keys between them. Each key will (usually) be adjacent to the one generated before it, whether the difference between the keys is 1 or 1000.

Check. Now I get it (I think).

_includes/faq/differences-between-numberings.md, line 9 at r3 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Yes. Under low traffic, UUID and unique_rowid will have the same latency. As traffic increases, unique_rowid latency will degrade while UUID insertion latency will stay the same.

Ok updated.


cockroach-teamcity · 2018-05-09T17:04:23Z

http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/b09f92f0d9848c3c10b69b23edfa3cdea012ebe2/

bdarnell · 2018-05-09T17:14:11Z

Review status: 0 of 5 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.

_includes/faq/differences-between-numberings.md, line 7 at r2 (raw file):

Previously, knz (kena) wrote…

Check. Now I get it (I think).

For the data locality line, I think both unique_rowid and sequences are "highly local". There's not a meaningful difference between the two.

Similarly, what difference is "somewhat time-ordered" vs "highly time-ordered" supposed to indicate? I'd say that both are equally time-ordered. I think the amount of concurrency required to have out-of-order insertions is similar for both.


knz · 2018-05-09T17:21:25Z

Review status: 0 of 5 files reviewed at latest revision, 2 unresolved discussions.

_includes/faq/differences-between-numberings.md, line 7 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

For the data locality line, I think both unique_rowid and sequences are "highly local". There's not a meaningful difference between the two.

Similarly, what difference is "somewhat time-ordered" vs "highly time-ordered" supposed to indicate? I'd say that both are equally time-ordered. I think the amount of concurrency required to have out-of-order insertions is similar for both.

Yes ok I agree with this too.


cockroach-teamcity · 2018-05-09T17:24:12Z

http://cockroach-docs-review.s3-website-us-east-1.amazonaws.com/39f8f250fa97ab63ec7a8adf55b2cf67c1d91efd/

bdarnell · 2018-05-10T17:42:29Z

Review status: 0 of 5 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


jseldess

This is awesome, @knz. LGTM, with some nits and minor copyedits. Nothing glaring though, so I'll merge and make edits in a follow-up PR.

jseldess · 2018-05-10T18:12:28Z

v2.0/sql-faqs.md

+
+{% include faq/sequential-numbers.md %}
+
+~~~


Remove this extra empty code block.

jseldess · 2018-05-10T18:12:38Z

v2.1/sql-faqs.md

+
+{% include faq/sequential-numbers.md %}
+
+~~~


Remove this extra empty code block.

jseldess · 2018-05-10T18:13:36Z

_includes/faq/sequential-numbers.md

+Sequential numbers can be generated in CockroachDB using the built-in
+function `unique_rowid()` or using [SQL sequences](create-sequence.html).
+
+{{site.data.alerts.callout_info}}Unless you need roughly-ordered


Need to use <code></code> instead of backticks within a callout.

jseldess · 2018-05-10T18:17:43Z

_includes/faq/sequential-transactions.md

+write ordering can be solved with other, more distribution-friendly
+solutions instead. For example, CockroachDB's [time travel queries
+(`AS OF SYSTEM
+TIME`)](https://www.cockroachlabs.com/blog/time-travel-queries-select-witty_subtitle-the_future/)


Once https://github.com/cockroachdb/docs/pull/3018/files lands, we'll have a specific doc page to link to here, which I think is preferable to the blog post.

jseldess · 2018-05-10T18:19:09Z

_includes/faq/sequential-transactions.md

+- initially: `CREATE TABLE cnt(val INT PRIMARY KEY); INSERT INTO cnt(val) VALUES(1);`
+- in each transaction: `INSERT INTO cnt(val) SELECT max(val)+1 FROM cnt RETURNING val;`
+
+This will cause all your INSERT transactions to conflict with each


We should code-format INSERT here.
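
For readers of this thread, a toy model of why the `max(val)+1` pattern quoted above makes every transaction conflict. It assumes a simple optimistic scheme in which all pending transactions read the same maximum and only one insert of `max+1` can succeed per round; the function name and the round-based scheduling are hypothetical and do not describe CockroachDB's actual conflict handling.

```python
def run_counter_rounds(num_txns: int):
    """Toy optimistic-concurrency model of `INSERT ... SELECT max(val)+1`.

    In each round every pending transaction reads the same max(val) and tries
    to insert max+1. The primary key admits one winner; the rest must retry.
    """
    table = [1]                      # initial row, as in the quoted setup
    pending = list(range(num_txns))  # hypothetical transaction IDs
    attempts = 0
    rounds = 0
    while pending:
        rounds += 1
        candidate = max(table) + 1   # every pending txn computes this value
        attempts += len(pending)
        pending.pop(0)               # one txn wins the round
        table.append(candidate)
    return rounds, attempts, table


rounds, attempts, table = run_counter_rounds(4)
print(rounds)    # 4: one commit per round, so the work is serialized
print(attempts)  # 10: 4 + 3 + 2 + 1 attempts, most of them wasted
print(table)     # [1, 2, 3, 4, 5]: gapless and ordered
```

The result is gapless and ordered, but only because the transactions effectively run one at a time, which is the cost the FAQ text warns about.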

knz requested review from bdarnell and jseldess May 5, 2018 01:04

jordanlewis added the in progress label May 5, 2018

knz mentioned this pull request May 5, 2018

sql: support for a globally unique monotonically increasing logical timestamp cockroachdb/cockroach#9227

Closed

knz force-pushed the 20180505-numbering branch from 15ab607 to 22fea69 Compare May 8, 2018 20:25

knz force-pushed the 20180505-numbering branch from 22fea69 to 3d26424 Compare May 8, 2018 20:27

knz force-pushed the 20180505-numbering branch from 3d26424 to 98e0f66 Compare May 9, 2018 15:31

knz force-pushed the 20180505-numbering branch from 98e0f66 to 6c38dfa Compare May 9, 2018 15:44

knz force-pushed the 20180505-numbering branch from 6c38dfa to 66983e1 Compare May 9, 2018 15:58

knz force-pushed the 20180505-numbering branch from 66983e1 to d4acdee Compare May 9, 2018 16:42

knz force-pushed the 20180505-numbering branch from d4acdee to b09f92f Compare May 9, 2018 17:02

Create FAQs for numbering problems.

39f8f25

knz force-pushed the 20180505-numbering branch from b09f92f to 39f8f25 Compare May 9, 2018 17:21

jseldess approved these changes May 10, 2018


jseldess merged commit 4c6eb7e into cockroachdb:master May 10, 2018

jseldess mentioned this pull request May 10, 2018

Minor copyedits to numbering FAQs #3131

Merged
