Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
80 changes: 73 additions & 7 deletions _includes/faq/auto-generate-unique-ids_v1.1.html
Original file line number Diff line number Diff line change
@@ -1,21 +1,87 @@
To auto-generate unique row IDs, use the [`UUID`](uuid.html) column with the `uuid_v4()` function cast to `UUID` as the default value:
To auto-generate unique row IDs, use the [`UUID`](uuid.html) column with the `gen_random_uuid()` [function](functions-and-operators.html#id-generation-functions) as the [default value](default-value.html):

{% include copy-clipboard.html %}
~~~ sql
> CREATE TABLE new_test (id UUID PRIMARY KEY DEFAULT uuid_v4()::UUID, name STRING);
> CREATE TABLE t1 (id UUID PRIMARY KEY DEFAULT gen_random_uuid(), name STRING);
~~~

{% include copy-clipboard.html %}
~~~ sql
> INSERT INTO t1 (name) VALUES ('a'), ('b'), ('c');
~~~

{% include copy-clipboard.html %}
~~~ sql
> SELECT * FROM t1;
~~~

~~~
+--------------------------------------+------+
| id | name |
+--------------------------------------+------+
| 60853a85-681d-4620-9677-946bbfdc8fbc | c |
| 77c9bc2e-76a5-4ebc-80c3-7ad3159466a1 | b |
| bd3a56e1-c75e-476c-b221-0da9d74d66eb | a |
+--------------------------------------+------+
(3 rows)
~~~

Alternatively, you can use the [`BYTES`](bytes.html) column with the `uuid_v4()` function as the default value instead:

{% include copy-clipboard.html %}
~~~ sql
> CREATE TABLE test (id BYTES PRIMARY KEY DEFAULT uuid_v4(), name STRING);
> CREATE TABLE t2 (id BYTES PRIMARY KEY DEFAULT uuid_v4(), name STRING);
~~~

In cases where you specifically need time-locality for unique row IDs, use the [`SERIAL`](serial.html) data type, which is an alias for [`INT`](int.html) with the `unique_rowid()` [function](functions-and-operators.html#id-generation-functions) as the [default value](default-value.html). On insert, the `unique_rowid()` function generates a default value from the timestamp and ID of the node executing the insert:
{% include copy-clipboard.html %}
~~~ sql
> INSERT INTO t2 (name) VALUES ('a'), ('b'), ('c');
~~~

{% include copy-clipboard.html %}
~~~ sql
> CREATE TABLE test (id SERIAL PRIMARY KEY, name STRING);
> SELECT * FROM t2;
~~~

~~~
+---------------------------------------------------+------+
| id | name |
+---------------------------------------------------+------+
| "\x9b\x10\xdc\x11\x9a\x9cGB\xbd\x8d\t\x8c\xf6@vP" | a |
| "\xd9s\xd7\x13\n_L*\xb0\x87c\xb6d\xe1\xd8@" | c |
| "\uac74\x1dd@B\x97\xac\x04N&\x9eBg\x86" | b |
+---------------------------------------------------+------+
(3 rows)
~~~

The `SERIAL` values are likely to be globally unique except in cases where a very large number of IDs (100,000+) are generated per node per second. Because `BYTES` and `UUID` values are 128-bit, much larger than `INT` values at 64-bit, there is virtually no chance of generating non-unique values.
In either case, generated IDs will be 128-bit, large enough for there to be virtually no chance of generating non-unique values. Also, once the table grows beyond a single key-value range (more than 64MB by default), new IDs will be scattered across all of the table's ranges and, therefore, likely across different nodes. This means that multiple nodes will share in the load.

If it's important for generated IDs to be stored in the same key-value range, you can use the [`SERIAL`](serial.html) data type, which is an alias for [`INT`](int.html) with the `unique_rowid()` [function](functions-and-operators.html#id-generation-functions) as the default value:

{% include copy-clipboard.html %}
~~~ sql
> CREATE TABLE t3 (id SERIAL PRIMARY KEY, name STRING);
~~~

{% include copy-clipboard.html %}
~~~ sql
> INSERT INTO t3 (name) VALUES ('a'), ('b'), ('c');
~~~

{% include copy-clipboard.html %}
~~~ sql
> SELECT * FROM t3;
~~~

~~~
+--------------------+------+
| id | name |
+--------------------+------+
| 293807573840855041 | a |
| 293807573840887809 | b |
| 293807573840920577 | c |
+--------------------+------+
(3 rows)
~~~

The distribution of IDs at the key-value level may also be a consideration. When using `BYTES` or `UUID` with `uuid_v4()` as the default value, consecutively generated IDs will be spread across different key-value ranges (and therefore likely across different nodes), whereas when using `INT` with `unique_rowid()` as the default value, consecutively generated IDs may end up in the same key-value range.
On insert, the `unique_rowid()` function generates a default value from the timestamp and ID of the node executing the insert. Such time-ordered values are likely to be globally unique except in cases where a very large number of IDs (100,000+) are generated per node per second.
12 changes: 6 additions & 6 deletions _includes/sidebar-data-v1.1.json
Original file line number Diff line number Diff line change
Expand Up @@ -681,6 +681,12 @@
"/${VERSION}/int.html"
]
},
{
"title": "<code>UUID</code>",
"urls": [
"/${VERSION}/uuid.html"
]
},
{
"title": "<code>SERIAL</code>",
"urls": [
Expand Down Expand Up @@ -746,12 +752,6 @@
"urls": [
"/${VERSION}/array.html"
]
},
{
"title": "<code>UUID</code>",
"urls": [
"/${VERSION}/uuid.html"
]
}
]
},
Expand Down
12 changes: 6 additions & 6 deletions _includes/sidebar-data-v1.2.json
Original file line number Diff line number Diff line change
Expand Up @@ -681,6 +681,12 @@
"/${VERSION}/int.html"
]
},
{
"title": "<code>UUID</code>",
"urls": [
"/${VERSION}/uuid.html"
]
},
{
"title": "<code>SERIAL</code>",
"urls": [
Expand Down Expand Up @@ -746,12 +752,6 @@
"urls": [
"/${VERSION}/array.html"
]
},
{
"title": "<code>UUID</code>",
"urls": [
"/${VERSION}/uuid.html"
]
}
]
},
Expand Down
2 changes: 1 addition & 1 deletion v1.1/data-types.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ CockroachDB supports the following data types. Click a type for more details.
Type | Description | Example
-----|-------------|--------
[`INT`](int.html) | A signed integer, up to 64 bits. | `12345`
[`UUID`](uuid.html) | <span class="version-tag">New in v1.1:</span> A 128-bit hexadecimal value. | `7f9c24e8-3b12-4fef-91e0-56a2d5a246ec`
[`SERIAL`](serial.html) | A unique 64-bit signed integer. | `148591304110702593 `
[`DECIMAL`](decimal.html) | An exact, fixed-point number. | `1.2345`
[`FLOAT`](float.html) | A 64-bit, inexact, floating-point number. | `1.2345`
Expand All @@ -22,7 +23,6 @@ Type | Description | Example
[`COLLATE`](collate.html) | The `COLLATE` feature lets you sort [`STRING`](string.html) values according to language- and country-specific rules, known as collations. | `'a1b2c3' COLLATE en`
[`BYTES`](bytes.html) | A string of binary characters. | `b'\141\061\142\062\143\063'`
[`ARRAY`](array.html) | <span class="version-tag">New in v1.1:</span> A 1-dimensional, 1-indexed, homogenous array of any non-array data type. | `{"sky","road","car"}`
[`UUID`](uuid.html) | <span class="version-tag">New in v1.1:</span> A 128-bit hexadecimal value. | `7f9c24e8-3b12-4fef-91e0-56a2d5a246ec`

## Data Type Conversions & Casts

Expand Down
16 changes: 8 additions & 8 deletions v1.1/import.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ summary: Import CSV data into your CockroachDB cluster.
toc: false
---

The `IMPORT` [statement](sql-statements.html) imports tabular data (e.g. CSVs) into a single table.
The `IMPORT` [statement](sql-statements.html) imports tabular data (e.g. CSVs) into a single table.

{{site.data.alerts.callout_danger}}<strong>This is an experimental feature</strong>. To enable it, you must run <a href="set-cluster-setting.html"><code>SET CLUSTER SETTING experimental.importcsv.enabled = true</code></a>{{site.data.alerts.end}}

Expand Down Expand Up @@ -38,7 +38,7 @@ Before using [`IMPORT`](import.html), you should have:
- The schema of the table you want to import.
- The tabular data you want to import (e.g., CSV), preferably hosted on cloud storage.
- A location to store data before it is fully imported into all your nodes (referred to in this document as a "temp" directory). This location *must* be accessible to all nodes using the same address (i.e. cannot use a node's local file storage).

For ease of use, we recommend using cloud storage. However, if that isn't readily available to you, we also have a [guide on easily creating your own file server](create-a-file-server.html).

## Details
Expand Down Expand Up @@ -79,7 +79,7 @@ It's important to note, though, that after the single machine creates the Cockro

#### Available Storage Requirements

The node's first-listed/default [`store`](start-a-node.html#store) directory must have enough available storage equal to or greater than the size of the file you're importing.
The node's first-listed/default [`store`](start-a-node.html#store) directory must have enough available storage equal to or greater than the size of the file you're importing.

On [`cockroach start`](start-a-node.html), if you set `--max-disk-temp-storage`, it must also be greater than the size of the file you're importing.

Expand Down Expand Up @@ -295,7 +295,7 @@ Convert values to SQL *NULL* if they match the specified string.
> IMPORT TABLE customers
CREATE USING 'azure://acme-co/customer-create-table.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
WITH
temp = 'azure://acme-co/temp/?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
;
~~~
Expand All @@ -304,7 +304,7 @@ WITH

~~~ sql
> IMPORT TABLE customers (
id SERIAL PRIMARY KEY,
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
Expand All @@ -318,7 +318,7 @@ WITH

~~~ sql
> IMPORT TABLE customers (
id SERIAL PRIMARY KEY,
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
Expand All @@ -333,7 +333,7 @@ WITH

~~~ sql
> IMPORT TABLE customers (
id SERIAL PRIMARY KEY,
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
Expand All @@ -348,7 +348,7 @@ WITH

~~~ sql
> IMPORT TABLE customers (
id SERIAL PRIMARY KEY,
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
Expand Down
6 changes: 1 addition & 5 deletions v1.1/serial.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,11 +6,7 @@ toc: false

The `SERIAL` [data type](data-types.html) is a column data type that, on insert, generates a default integer from the timestamp and ID of the node executing the insert. This combination is likely to be globally unique except in extreme cases (see this [example](create-table.html#create-a-table-with-auto-generated-unique-row-ids) for more details). Also, because value generation does not require talking to other nodes, it is much faster than sequentially auto-incrementing a value, which requires distributed coordination.

{{site.data.alerts.callout_info}}
We believe this data type is a better solution than PostgreSQL's <code>SERIAL</code> and MySQL's <code>AUTO_INCREMENT</code> types, both of which auto-increment integers but not necessarily in a strictly sequential fashion (see the <a href="#auto-incrementing-is-not-always-sequential"> Auto-Incrementing Is Not Always Sequential </a> example below). However, if you find that this feature is incompatible with your application, please <a href="https://github.com/cockroachdb/cockroach/issues">open an
issue</a> or <a href="https://gitter.im/cockroachdb/cockroach">chat
with us on Gitter</a>.
{{site.data.alerts.end}}
{{site.data.alerts.callout_info}}In most cases, we recommend using the <a href="uuid.html"><code>UUID</code></a> data type with the <code>gen_random_uuid()</code> function as the default value, which generates 128-bit values (much larger than <code>SERIAL</code>'s 64-bit) and scatters them across all of a table's underlying key-value ranges, ensuring that multiple nodes share in the load. See <a href="uuid.html#create-a-table-with-auto-generated-unique-row-ids">Create a table with auto-generated unique row IDs</a> for more details.{{site.data.alerts.end}}

<div id="toc"></div>

Expand Down
8 changes: 5 additions & 3 deletions v1.1/uuid.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,22 +4,24 @@ summary: The UUID data type stores 128-bit Universal Unique Identifiers.
toc: false
---

<span class="version-tag">New in v1.1:</span> The `UUID` (Universally Unique Identifier) [data type](data-types.html) stores a 128-bit value that is [unique across both space and time](https://www.ietf.org/rfc/rfc4122.txt).
<span class="version-tag">New in v1.1:</span> The `UUID` (Universally Unique Identifier) [data type](data-types.html) stores a 128-bit value that is [unique across both space and time](https://www.ietf.org/rfc/rfc4122.txt).

{{site.data.alerts.callout_success}}To auto-generate unique row IDs, we recommend using <a href="uuid.html"><code>UUID</code></a> with the <code>gen_random_uuid()</code> function as the default value. See the <a href="#create-a-table-with-auto-generated-unique-row-ids">example</a> below for more details.{{site.data.alerts.end}}

<div id="toc"></div>

## Syntax
A `UUID` value can be expressed using the following formats:

Format | Description
Format | Description
-------|-------------
Standard [RCF4122](http://www.ietf.org/rfc/rfc4122.txt)-specified format | Hyphen-seperated groups of 8, 4, 4, 4, 12 hexadecimal digits.<br><br> Example: `acde070d-8c4c-4f0d-9d8a-162843c10333`
With braces | The standard [RCF4122](http://www.ietf.org/rfc/rfc4122.txt)-specified format with braces.<br><br>Example: `{acde070d-8c4c-4f0d-9d8a-162843c10333}`
As `BYTES` | `UUID` value specified as bytes.<br><br>Example: `b'kafef00ddeadbeed'`
`UUID` used as a URN | `UUID` can be used as a Uniform Resource Name (URN). In that case, the format is [specified](https://www.ietf.org/rfc/rfc2141.txt) as "urn:uuid:" followed by standard [RCF4122](http://www.ietf.org/rfc/rfc4122.txt)-specified format.<br><br>Example: `urn:uuid:63616665-6630-3064-6465-616462656564`

## Size
A `UUID` value is 128 bits in width, but the total storage size is likely to be larger due to CockroachDB metadata.
A `UUID` value is 128 bits in width, but the total storage size is likely to be larger due to CockroachDB metadata.

## Examples

Expand Down
2 changes: 1 addition & 1 deletion v1.2/data-types.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ CockroachDB supports the following data types. Click a type for more details.
Type | Description | Example
-----|-------------|--------
[`INT`](int.html) | A signed integer, up to 64 bits. | `12345`
[`UUID`](uuid.html) | <span class="version-tag">New in v1.1:</span> A 128-bit hexadecimal value. | `7f9c24e8-3b12-4fef-91e0-56a2d5a246ec`
[`SERIAL`](serial.html) | A unique 64-bit signed integer. | `148591304110702593 `
[`DECIMAL`](decimal.html) | An exact, fixed-point number. | `1.2345`
[`FLOAT`](float.html) | A 64-bit, inexact, floating-point number. | `1.2345`
Expand All @@ -22,7 +23,6 @@ Type | Description | Example
[`COLLATE`](collate.html) | The `COLLATE` feature lets you sort [`STRING`](string.html) values according to language- and country-specific rules, known as collations. | `'a1b2c3' COLLATE en`
[`BYTES`](bytes.html) | A string of binary characters. | `b'\141\061\142\062\143\063'`
[`ARRAY`](array.html) | <span class="version-tag">New in v1.1:</span> A 1-dimensional, 1-indexed, homogenous array of any non-array data type. | `{"sky","road","car"}`
[`UUID`](uuid.html) | <span class="version-tag">New in v1.1:</span> A 128-bit hexadecimal value. | `7f9c24e8-3b12-4fef-91e0-56a2d5a246ec`

## Data Type Conversions & Casts

Expand Down
16 changes: 8 additions & 8 deletions v1.2/import.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ summary: Import CSV data into your CockroachDB cluster.
toc: false
---

The `IMPORT` [statement](sql-statements.html) imports tabular data (e.g. CSVs) into a single table.
The `IMPORT` [statement](sql-statements.html) imports tabular data (e.g. CSVs) into a single table.

{{site.data.alerts.callout_danger}}<strong>This is an experimental feature</strong>. To enable it, you must run <a href="set-cluster-setting.html"><code>SET CLUSTER SETTING experimental.importcsv.enabled = true</code></a>{{site.data.alerts.end}}

Expand Down Expand Up @@ -38,7 +38,7 @@ Before using [`IMPORT`](import.html), you should have:
- The schema of the table you want to import.
- The tabular data you want to import (e.g., CSV), preferably hosted on cloud storage.
- A location to store data before it is fully imported into all your nodes (referred to in this document as a "temp" directory). This location *must* be accessible to all nodes using the same address (i.e. cannot use a node's local file storage).

For ease of use, we recommend using cloud storage. However, if that isn't readily available to you, we also have a [guide on easily creating your own file server](create-a-file-server.html).

## Details
Expand Down Expand Up @@ -79,7 +79,7 @@ It's important to note, though, that after the single machine creates the Cockro

#### Available Storage Requirements

The node's first-listed/default [`store`](start-a-node.html#store) directory must have enough available storage equal to or greater than the size of the file you're importing.
The node's first-listed/default [`store`](start-a-node.html#store) directory must have enough available storage equal to or greater than the size of the file you're importing.

On [`cockroach start`](start-a-node.html), if you set `--max-disk-temp-storage`, it must also be greater than the size of the file you're importing.

Expand Down Expand Up @@ -295,7 +295,7 @@ Convert values to SQL *NULL* if they match the specified string.
> IMPORT TABLE customers
CREATE USING 'azure://acme-co/customer-create-table.sql?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
CSV DATA ('azure://acme-co/customer-import-data.csv?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co')
WITH
WITH
temp = 'azure://acme-co/temp/?AZURE_ACCOUNT_KEY=hash&AZURE_ACCOUNT_NAME=acme-co'
;
~~~
Expand All @@ -304,7 +304,7 @@ WITH

~~~ sql
> IMPORT TABLE customers (
id SERIAL PRIMARY KEY,
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
Expand All @@ -318,7 +318,7 @@ WITH

~~~ sql
> IMPORT TABLE customers (
id SERIAL PRIMARY KEY,
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
Expand All @@ -333,7 +333,7 @@ WITH

~~~ sql
> IMPORT TABLE customers (
id SERIAL PRIMARY KEY,
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
Expand All @@ -348,7 +348,7 @@ WITH

~~~ sql
> IMPORT TABLE customers (
id SERIAL PRIMARY KEY,
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT,
INDEX name_idx (name)
)
Expand Down
6 changes: 1 addition & 5 deletions v1.2/serial.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,11 +6,7 @@ toc: false

The `SERIAL` [data type](data-types.html) is a column data type that, on insert, generates a default integer from the timestamp and ID of the node executing the insert. This combination is likely to be globally unique except in extreme cases (see this [example](create-table.html#create-a-table-with-auto-generated-unique-row-ids) for more details). Also, because value generation does not require talking to other nodes, it is much faster than sequentially auto-incrementing a value, which requires distributed coordination.

{{site.data.alerts.callout_info}}
We believe this data type is a better solution than PostgreSQL's <code>SERIAL</code> and MySQL's <code>AUTO_INCREMENT</code> types, both of which auto-increment integers but not necessarily in a strictly sequential fashion (see the <a href="#auto-incrementing-is-not-always-sequential"> Auto-Incrementing Is Not Always Sequential </a> example below). However, if you find that this feature is incompatible with your application, please <a href="https://github.com/cockroachdb/cockroach/issues">open an
issue</a> or <a href="https://gitter.im/cockroachdb/cockroach">chat
with us on Gitter</a>.
{{site.data.alerts.end}}
{{site.data.alerts.callout_info}}In most cases, we recommend using the <a href="uuid.html"><code>UUID</code></a> data type with the <code>gen_random_uuid()</code> function as the default value, which generates 128-bit values (much larger than <code>SERIAL</code>'s 64-bit) and scatters them across all of a table's underlying key-value ranges, ensuring that multiple nodes share in the load. See <a href="uuid.html#create-a-table-with-auto-generated-unique-row-ids">Create a table with auto-generated unique row IDs</a> for more details.{{site.data.alerts.end}}

<div id="toc"></div>

Expand Down
Loading