Update of english documentation #2918

Merged
merged 4 commits into from
Sep 4, 2018
432 changes: 237 additions & 195 deletions CHANGELOG.md

Large diffs are not rendered by default.

84 changes: 82 additions & 2 deletions docs/en/data_types/array.md
@@ -1,5 +1,85 @@
<a name="data_type-array"></a>

# Array(T)

An array of elements of type T. The T type can be any type, including an array.
We don't recommend using multidimensional arrays, because they are not well supported (for example, you can't store multidimensional arrays in tables with a MergeTree engine).
Array of `T`-type items.

`T` can be anything, including an array. Use multi-dimensional arrays with caution. ClickHouse has limited support for multi-dimensional arrays. For example, they can't be stored in `MergeTree` tables.
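
As a minimal sketch of the nested case (this query is illustrative and not part of the original change), a multi-dimensional array can be built on the fly and inspected with `toTypeName`:

```sql
-- Build a two-dimensional array and check its inferred type.
-- Per the note above, this is expected to work in a SELECT, while storing
-- such a column in a MergeTree table may be rejected.
SELECT [[1, 2], [3]] AS x, toTypeName(x)
```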

## Creating an array

You can use a function to create an array:

```
array(T)
```

You can also use square brackets.

```
[]
```

Example of creating an array:

```
:) SELECT array(1, 2) AS x, toTypeName(x)

SELECT
[1, 2] AS x,
toTypeName(x)

┌─x─────┬─toTypeName(array(1, 2))─┐
│ [1,2] │ Array(UInt8) │
└───────┴─────────────────────────┘

1 rows in set. Elapsed: 0.002 sec.

:) SELECT [1, 2] AS x, toTypeName(x)

SELECT
[1, 2] AS x,
toTypeName(x)

┌─x─────┬─toTypeName([1, 2])─┐
│ [1,2] │ Array(UInt8) │
└───────┴────────────────────┘

1 rows in set. Elapsed: 0.002 sec.
```

## Working with data types

When creating an array on the fly, ClickHouse automatically defines the argument type as the narrowest data type that can store all the listed arguments. If there are any [NULL](../query_language/syntax.md#null-literal) or [Nullable](nullable.md#data_type-nullable) type arguments, the type of array elements is [Nullable](nullable.md#data_type-nullable).

If ClickHouse can't determine the data type, it throws an exception. For instance, this happens when you try to create an array that mixes strings and numbers (`SELECT array(1, 'a')`).

Examples of automatic data type detection:

```
:) SELECT array(1, 2, NULL) AS x, toTypeName(x)

SELECT
[1, 2, NULL] AS x,
toTypeName(x)

┌─x──────────┬─toTypeName(array(1, 2, NULL))─┐
│ [1,2,NULL] │ Array(Nullable(UInt8)) │
└────────────┴───────────────────────────────┘

1 rows in set. Elapsed: 0.002 sec.
```
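
To see how the "narrowest type" rule reacts to different inputs, the following queries can be run one at a time. They are an illustrative sketch: the values are chosen here for demonstration, and the reported types may vary between server versions.

```sql
-- Each query lists values that need a progressively wider element type.
SELECT toTypeName([1, 2]);      -- small unsigned values
SELECT toTypeName([1, 256]);    -- 256 does not fit into UInt8
SELECT toTypeName([1, -1]);     -- mixing signed and unsigned values
SELECT toTypeName([1, 1.5]);    -- mixing integers and floating-point values
```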

If you try to create an array of incompatible data types, ClickHouse throws an exception:

```
:) SELECT array(1, 'a')

SELECT [1, 'a']

Received exception from server (version 1.1.54388):
Code: 386. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: There is no supertype for types UInt8, String because some of them are String/FixedString and some of them are not.

0 rows in set. Elapsed: 0.246 sec.
```

2 changes: 2 additions & 0 deletions docs/en/data_types/datetime.md
@@ -1,3 +1,5 @@
<a name="data_type-datetime"></a>

# DateTime

Date with time. Stored in four bytes as a Unix timestamp (unsigned). Allows storing values in the same range as for the Date type. The minimal value is output as 0000-00-00 00:00:00.
97 changes: 90 additions & 7 deletions docs/en/data_types/enum.md
@@ -1,18 +1,101 @@
# Enum
<a name="data_type-enum"></a>

Enum8 or Enum16. A finite set of string values that can be stored more efficiently than the `String` data type.
# Enum8, Enum16

Example:
Includes the `Enum8` and `Enum16` types. `Enum` stores a finite set of `'string' = integer` pairs. In ClickHouse, all operations on the `Enum` data type are performed on the numeric values, although the user works with string constants. This is more efficient than working with the `String` data type.

```text
Enum8('hello' = 1, 'world' = 2)
```
- `Enum8` is described by pairs of `'String' = Int8`.
- `Enum16` is described by pairs of `'String' = Int16`.

## Usage examples

Here we create a table with an `Enum8('hello' = 1, 'world' = 2)` type column.

```
CREATE TABLE t_enum
(
x Enum8('hello' = 1, 'world' = 2)
)
ENGINE = TinyLog
```

This column `x` can only store the values that are listed in the type definition: `'hello'` or `'world'`. If you try to save a different value, ClickHouse generates an exception.

```
:) INSERT INTO t_enum Values('hello'),('world'),('hello')

INSERT INTO t_enum VALUES

Ok.

3 rows in set. Elapsed: 0.002 sec.

:) insert into t_enum values('a')

INSERT INTO t_enum VALUES


Exception on client:
Code: 49. DB::Exception: Unknown element 'a' for type Enum8('hello' = 1, 'world' = 2)
```

When you query data from the table, ClickHouse outputs the string values from `Enum`.

```
SELECT * FROM t_enum

┌─x─────┐
│ hello │
│ world │
│ hello │
└───────┘
```

- A data type with two possible values: 'hello' and 'world'.
If you need to see the numeric equivalents of the rows, you must cast the type.

```
SELECT CAST(x, 'Int8') FROM t_enum

┌─CAST(x, 'Int8')─┐
│ 1 │
│ 2 │
│ 1 │
└─────────────────┘
```

To create an Enum value in a query, you also need the `CAST` function.

```
SELECT toTypeName(CAST('a', 'Enum8(\'a\' = 1, \'b\' = 2)'))

┌─toTypeName(CAST('a', 'Enum8(\'a\' = 1, \'b\' = 2)'))─┐
│ Enum8('a' = 1, 'b' = 2) │
└──────────────────────────────────────────────────────┘
```
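
The two casts can also be combined in one query. The sketch below assumes the `t_enum` table from the examples above; the column aliases are made up for illustration.

```sql
-- Filter by the string constant and show both representations of each value.
SELECT
    x,
    CAST(x, 'String') AS as_string,
    CAST(x, 'Int8') AS as_number
FROM t_enum
WHERE x = 'hello'
```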

## General rules and usage

Each of the values is assigned a number in the range `-128 ... 127` for `Enum8` or in the range `-32768 ... 32767` for `Enum16`. All the strings and numbers must be different. An empty string is allowed. When this type is specified in a table definition, the numbers can be listed in any order; the order does not matter.

In RAM, this type of column is stored in the same way as `Int8` or `Int16` of the corresponding numerical values.
Neither the string nor the numeric value in an `Enum` can be [NULL](../query_language/syntax.md#null-literal).

An `Enum` can be wrapped in a [Nullable](nullable.md#data_type-nullable) type. So if you create a table using the query

```
CREATE TABLE t_enum_nullable
(
x Nullable( Enum8('hello' = 1, 'world' = 2) )
)
ENGINE = TinyLog
```

it can store not only `'hello'` and `'world'`, but `NULL`, as well.

```
INSERT INTO t_enum_nullable Values('hello'),('world'),(NULL)
```
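
To check which rows ended up without a value, a query along these lines could be used. It is a sketch that assumes the three rows above were inserted into `t_enum_nullable`; the alias `is_missing` is hypothetical.

```sql
-- isNull returns 1 for rows that store NULL and 0 otherwise.
SELECT x, isNull(x) AS is_missing FROM t_enum_nullable
```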

In RAM, an `Enum` column is stored in the same way as `Int8` or `Int16` of the corresponding numerical values.
When reading in text form, ClickHouse parses the value as a string and looks up the corresponding string in the set of `Enum` values. If it is not found, an exception is thrown.
When writing in text form, it writes the value as the corresponding string. If column data contains garbage (numbers that are not from the valid set), an exception is thrown. When reading and writing in binary form, it works the same way as for Int8 and Int16 data types.
The implicit default value is the value with the lowest number.
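
The default-value rule can be checked with a small experiment. This is a sketch only: the table `t_enum_default` and its columns are hypothetical names introduced here, and the statements are meant to be run one at a time.

```sql
-- 'world' is listed first, but 'hello' has the lowest number.
CREATE TABLE t_enum_default
(
    id UInt8,
    x Enum8('world' = 2, 'hello' = 1)
)
ENGINE = TinyLog;

-- Insert a row without specifying x, so the implicit default is used.
INSERT INTO t_enum_default (id) VALUES (1);

-- If the rule above holds, x should come back as 'hello'.
SELECT id, x FROM t_enum_default;
```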
3 changes: 2 additions & 1 deletion docs/en/data_types/float.md
@@ -16,6 +16,7 @@ We recommend that you store data in integer form whenever possible. For example,
```sql
SELECT 1 - 0.9
```

```
┌───────minus(1, 0.9)─┐
│ 0.09999999999999998 │
@@ -66,5 +67,5 @@ SELECT 0 / 0
└──────────────┘
```

See the rules for ` NaN` sorting in the section [ORDER BY clause](../query_language/select.md#query_language-queries-order_by).
See the rules for `NaN` sorting in the section [ORDER BY clause](../query_language/select.md#query_language-queries-order_by).

2 changes: 2 additions & 0 deletions docs/en/data_types/int_uint.md
@@ -1,3 +1,5 @@
<a name="data_type-int"></a>

# UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64

Fixed-length integers, with or without a sign.
63 changes: 63 additions & 0 deletions docs/en/data_types/nullable.md
@@ -0,0 +1,63 @@
<a name="data_type-nullable"></a>

# Nullable(TypeName)

Allows you to store either a `TypeName` value or the special marker [NULL](../query_language/syntax.md#null-literal), which denotes a missing value, including in table columns alongside regular `TypeName` values. For example, a `Nullable(Int8)` type column can store `Int8` type values, and the rows that don't have a value will store `NULL`.

For a `TypeName`, you can't use the composite data types [Array](array.md#data_type-array) and [Tuple](tuple.md#data_type-tuple). Composite data types can contain `Nullable` type values, such as `Array(Nullable(Int8))`.
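
The nesting restriction can be illustrated with two table definitions. This is a sketch under the rule stated above: the table names are hypothetical, the statements should be run separately, and the exact error text depends on the server version.

```sql
-- Nullable inside a composite type: allowed according to the text above.
CREATE TABLE t_array_of_nullable
(
    a Array(Nullable(Int8))
)
ENGINE = TinyLog;

-- A composite type inside Nullable: expected to fail with an exception.
CREATE TABLE t_nullable_array
(
    a Nullable(Array(Int8))
)
ENGINE = TinyLog;
```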

A `Nullable` type field can't be included in indexes.

`NULL` is the default value for the `Nullable` type, unless specified otherwise in the ClickHouse server configuration.

## Storage features

For storing `Nullable` type values, ClickHouse uses:

- A separate file with `NULL` masks (referred to as the mask).
- The file with the values.

The mask determines what is in a data cell: `NULL` or a value.

When the mask indicates that `NULL` is stored in a cell, the file with values stores the default value for the data type. So if the field has the type `Nullable(Int8)`, the cell will store the default value for `Int8`. Because of the additional mask file, a `Nullable` column takes up more storage space than a comparable non-nullable column.

!!! Note:
Using `Nullable` almost always reduces performance, so keep this in mind when designing your databases.

## Usage example

```
:) CREATE TABLE t_null(x Int8, y Nullable(Int8)) ENGINE TinyLog

CREATE TABLE t_null
(
x Int8,
y Nullable(Int8)
)
ENGINE = TinyLog

Ok.

0 rows in set. Elapsed: 0.012 sec.

:) INSERT INTO t_null VALUES (1, NULL), (2, 3)

INSERT INTO t_null VALUES

Ok.

2 rows in set. Elapsed: 0.007 sec.

:) SELECT x + y from t_null

SELECT x + y
FROM t_null

┌─plus(x, y)─┐
│ ᴺᵁᴸᴸ │
│ 5 │
└────────────┘

2 rows in set. Elapsed: 0.144 sec.
```
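
Because arithmetic with `NULL` yields `NULL`, queries often need to handle missing values explicitly. The following sketch assumes the `t_null` table from the example above and uses the `ifNull` function and the `IS NULL` operator; the alias `total` is made up for illustration.

```sql
-- Replace NULL with 0 before adding, so every row yields a number.
SELECT x + ifNull(y, 0) AS total FROM t_null;

-- Select only the rows where y has no value.
SELECT x FROM t_null WHERE y IS NULL;
```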

20 changes: 20 additions & 0 deletions docs/en/data_types/special_data_types/nothing.md
@@ -0,0 +1,20 @@
<a name="special_data_type-nothing"></a>

# Nothing

The only purpose of this data type is to represent [NULL](../../query_language/syntax.md#null-literal), i.e., no value.

You can't create a `Nothing` type value, because it is used where a value is not expected. For example, the literal `NULL` has the type `Nullable(Nothing)` ([Nullable](../../data_types/nullable.md#data_type-nullable) is the data type that allows storing `NULL` in tables). The `Nothing` type is also used to denote empty arrays:

```bash
:) SELECT toTypeName(Array())

SELECT toTypeName([])

┌─toTypeName(array())─┐
│ Array(Nothing) │
└─────────────────────┘

1 rows in set. Elapsed: 0.062 sec.
```
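
The same check can be applied to the `NULL` literal itself. This short query is an illustrative addition, not part of the original change:

```sql
-- Inspect the type of the bare NULL literal and of an array containing only NULL.
SELECT toTypeName(NULL), toTypeName([NULL])
```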

2 changes: 2 additions & 0 deletions docs/en/data_types/string.md
@@ -1,3 +1,5 @@
<a name="data_types-string"></a>

# String

Strings of an arbitrary length. The length is not limited. The value can contain an arbitrary set of bytes, including null bytes.
52 changes: 50 additions & 2 deletions docs/en/data_types/tuple.md
@@ -1,6 +1,54 @@
<a name="data_type-tuple"></a>

# Tuple(T1, T2, ...)

Tuples can't be written to tables (other than Memory tables). They are used for temporary column grouping. Columns can be grouped when an IN expression is used in a query, and for specifying certain formal parameters of lambda functions. For more information, see "IN operators" and "Higher order functions".
A tuple of elements of any [type](index.md#data_types). There can be one or more types of elements in a tuple.

You can't store tuples in tables (other than Memory tables). They are used for temporary column grouping. Columns can be grouped when an IN expression is used in a query, and for specifying certain formal parameters of lambda functions. For more information, see the sections [IN operators](../query_language/select.md#in_operators) and [Higher order functions](../query_language/functions/higher_order_functions.md#higher_order_functions).

Tuples can be the result of a query. In this case, for text formats other than JSON, values are comma-separated in brackets. In JSON formats, tuples are output as arrays (in square brackets).
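
As a minimal sketch of temporary grouping with `IN` (the constants and the alias `found` are chosen here for illustration), a tuple on the left-hand side can be matched against a list of tuples:

```sql
-- Group two values into a tuple and test membership against a list of tuples.
SELECT (1, 'a') IN ((1, 'a'), (2, 'b')) AS found
```

In a real query the tuple would usually be built from table columns rather than constants.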

## Creating a tuple

You can use a function to create a tuple:

```
tuple(T1, T2, ...)
```

Example of creating a tuple:

```
:) SELECT tuple(1,'a') AS x, toTypeName(x)

SELECT
(1, 'a') AS x,
toTypeName(x)

┌─x───────┬─toTypeName(tuple(1, 'a'))─┐
│ (1,'a') │ Tuple(UInt8, String) │
└─────────┴───────────────────────────┘

1 rows in set. Elapsed: 0.021 sec.
```

## Working with data types

When creating a tuple on the fly, ClickHouse automatically detects the type of each argument as the narrowest type that can store the argument value. If the argument is [NULL](../query_language/syntax.md#null-literal), the type of the tuple element is [Nullable](nullable.md#data_type-nullable).

Example of automatic data type detection:

```
SELECT tuple(1,NULL) AS x, toTypeName(x)

SELECT
(1, NULL) AS x,
toTypeName(x)

┌─x────────┬─toTypeName(tuple(1, NULL))──────┐
│ (1,NULL) │ Tuple(UInt8, Nullable(Nothing)) │
└──────────┴─────────────────────────────────┘

Tuples can be output as the result of running a query. In this case, for text formats other than JSON\*, values are comma-separated in brackets. In JSON\* formats, tuples are output as arrays (in square brackets).
1 rows in set. Elapsed: 0.002 sec.
```
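
Once a tuple exists, its elements can be read back by position. The sketch below assumes the `tupleElement` function and the `.N` access syntax are available in your server version; the aliases are hypothetical.

```sql
-- Read tuple elements by 1-based position.
SELECT
    tuple(1, 'a') AS x,
    tupleElement(x, 1) AS first_element,
    x.2 AS second_element
```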

2 changes: 1 addition & 1 deletion docs/en/development/build_osx.md
@@ -31,7 +31,7 @@ For the latest stable version, switch to the `stable` branch.
```bash
mkdir build
cd build
cmake .. -DCMAKE_CXX_COMPILER=`which g++-8` -DCMAKE_C_COMPILER=`which gcc-8`
ninja
cd ..
```