WIP on docs (#3813)
* CLICKHOUSE-4063: less manual html @ index.md

* CLICKHOUSE-4063: recommend markdown="1" in README.md

* CLICKHOUSE-4003: manually purge custom.css for now

* CLICKHOUSE-4064: expand <details> before any print (including to pdf)

* CLICKHOUSE-3927: rearrange interfaces/formats.md a bit

* CLICKHOUSE-3306: add few http headers

* Remove copy-paste introduced in #3392

* Hopefully better chinese fonts #3392

* get rid of tabs @ custom.css

* Apply comments and patch from #3384

* Add jdbc.md to ToC and some translation, though it still looks badly incomplete

* minor punctuation

* Add some backlinks to official website from mirrors that just blindly take markdown sources

* Do not make fonts extra light

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's//g' {}

* find . -name '*.md' -type f | xargs -I{} perl -pi -e 's/ sql/g' {}

* Remove outdated stuff from roadmap.md

* Not so light font on front page too

* Refactor Chinese formats.md to match recent changes in other languages

* Update some links on front page

* Remove some outdated comment

* Add twitter link to front page

* More front page links tuning

* Add Amsterdam meetup link

* Smaller font to avoid second line

* Add Amsterdam link to README.md

* Proper docs nav translation

* Back to 300 font-weight except Chinese

* fix docs build

* Update Amsterdam link

* remove symlinks

* more zh punctuation

* apply lost comment by @zhang2014

* Apply comments by @zhang2014 from #3417

* Remove Beijing link

* rm incorrect symlink

* restore content of docs/zh/operations/table_engines/index.md

* CLICKHOUSE-3751: stem terms while searching docs

* CLICKHOUSE-3751: use English stemmer in non-English docs too

* CLICKHOUSE-4135 fix

* Remove past meetup link

* Add blog link to top nav

* Add ContentSquare article link

* Add form link to front page + refactor some texts

* couple markup fixes

* minor

* Introduce basic ODBC driver page in docs

* More verbose 3rd party libs disclaimer

* Put third-party stuff into a separate folder

* Separate third-party stuff in ToC too

* Update links

* Move stuff that is not really (only) a client library into a separate page

* Add clickhouse-hdfs-loader link

* Some introduction for "interfaces" section

* Rewrite tcp.md

* http_interface.md -> http.md

* fix link

* Remove unconvenient error for now

* try to guess anchor instead of failing

* remove symlink

* Remove outdated info from introduction

* remove ru roadmap.md

* replace ru roadmap.md with symlink

* Update roadmap.md

* lost file

* Title case in toc_en.yml

* Sync "Functions" ToC section with en

* Remove reference to pretty old ClickHouse release from docs

* couple lost symlinks in fa

* Close quote in proper place

* Rewrite en/getting_started/index.md

* Sync en<>ru getting_started/index.md

* minor changes

* Some gui.md refactoring

* Translate DataGrip section to ru

* Translate DataGrip section to zh

* Translate DataGrip section to fa

* Translate DBeaver section to fa

* Translate DBeaver section to zh

* Split third-party GUI to open-source and commercial

* Mention some RDBMS integrations + ad-hoc translation fixes

* Add rel="external nofollow" to outgoing links from docs

* Lost blank lines

* Fix class name

* More rel="external nofollow"

* Apply suggestions by @sundy-li

* Mobile version of front page improvements

* test

* test 2

* test 3

* Update LICENSE

* minor docs fix

* Highlight current article as suggested by @sundy-li

* fix link destination

* Introduce backup.md (only "en" for now)

* Mention INSERT+SELECT in backup.md

* Some improvements for replication.md

* Add backup.md to toc

* Mention clickhouse-backup tool

* Mention LightHouse in third-party GUI list

* Introduce interfaces/third-party/proxy.md

* Add clickhouse-bulk to proxy.md

* Major extension of integrations.md contents

* fix link target

* remove unneeded file

* better toc item name

* fix markdown

* better ru punctuation

* Add yet another possible backup approach

* Simplify copying permalinks to headers

* Support non-eng link anchors in docs + update some deps

* Generate anchors for single-page mode automatically

* Remove anchors to top of pages

* Remove anchors that nobody links to

* build fixes

* fix few links

* restore css

* fix some links

* restore gifs

* fix lost words

* more docs fixes

* docs fixes

* NULL anchor

* update urllib3 dependency

* more fixes
blinkov committed Dec 12, 2018
1 parent a80376c commit 16ca492
Showing 203 changed files with 933 additions and 907 deletions.
2 changes: 1 addition & 1 deletion docs/en/data_types/array.md
@@ -50,7 +50,7 @@ SELECT

## Working with data types

When creating an array on the fly, ClickHouse automatically defines the argument type as the narrowest data type that can store all the listed arguments. If there are any [NULL](../query_language/syntax.md#null-literal) or [Nullable](nullable.md#data_type-nullable) type arguments, the type of array elements is [Nullable](nullable.md#data_type-nullable).
When creating an array on the fly, ClickHouse automatically defines the argument type as the narrowest data type that can store all the listed arguments. If there are any [NULL](../query_language/syntax.md#null-literal) or [Nullable](nullable.md#data_type-nullable) type arguments, the type of array elements is [Nullable](nullable.md).

If ClickHouse couldn't determine the data type, it will generate an exception. For instance, this will happen when trying to create an array with strings and numbers simultaneously (`SELECT array(1, 'a')`).

1 change: 0 additions & 1 deletion docs/en/data_types/decimal.md
@@ -1,4 +1,3 @@
<a name="data_type-decimal"></a>

# Decimal(P, S), Decimal32(S), Decimal64(S), Decimal128(S)

5 changes: 2 additions & 3 deletions docs/en/data_types/enum.md
@@ -1,4 +1,3 @@
<a name="data_type-enum"></a>

# Enum8, Enum16

@@ -77,9 +76,9 @@ SELECT toTypeName(CAST('a', 'Enum8(\'a\' = 1, \'b\' = 2)'))

Each of the values is assigned a number in the range `-128 ... 127` for `Enum8` or in the range `-32768 ... 32767` for `Enum16`. All the strings and numbers must be different. An empty string is allowed. If this type is specified (in a table definition), numbers can be in an arbitrary order. However, the order does not matter.
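The constraints in the paragraph above (value ranges, distinct strings and numbers, no `NULL`) can be sketched as a small client-side check. This is only an illustration: `validate_enum` and `ENUM_RANGES` are hypothetical names, not part of ClickHouse, and the server performs its own validation.

```python
# Hypothetical helper: checks an Enum definition against the rules quoted above.
ENUM_RANGES = {"Enum8": (-128, 127), "Enum16": (-32768, 32767)}


def validate_enum(kind, pairs):
    """Validate a list of (string, number) pairs for Enum8 or Enum16.

    Raises ValueError on the first violated rule; returns None otherwise.
    """
    low, high = ENUM_RANGES[kind]
    names = [name for name, _ in pairs]
    numbers = [number for _, number in pairs]
    # Neither the string nor the numeric value may be NULL.
    if any(name is None for name in names) or any(n is None for n in numbers):
        raise ValueError("Enum strings and numbers cannot be NULL")
    # All the strings and numbers must be different.
    if len(set(names)) != len(names):
        raise ValueError("Enum strings must be different")
    if len(set(numbers)) != len(numbers):
        raise ValueError("Enum numbers must be different")
    # Each number must fit the range of the chosen Enum type.
    for name, number in pairs:
        if not low <= number <= high:
            raise ValueError(f"{number} for {name!r} is outside {kind} range")


# An empty string is allowed, and the numbers may come in any order.
validate_enum("Enum8", [("b", 2), ("", 0), ("a", 1)])
```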

Neither the string nor the numeric value in an `Enum` can be [NULL](../query_language/syntax.md#null-literal).
Neither the string nor the numeric value in an `Enum` can be [NULL](../query_language/syntax.md).

An `Enum` can be contained in [Nullable](nullable.md#data_type-nullable) type. So if you create a table using the query
An `Enum` can be contained in [Nullable](nullable.md) type. So if you create a table using the query

```
CREATE TABLE t_enum_nullable
2 changes: 1 addition & 1 deletion docs/en/data_types/float.md
@@ -67,7 +67,7 @@ SELECT 0 / 0
└──────────────┘
```

See the rules for `NaN` sorting in the section [ORDER BY clause](../query_language/select.md#query_language-queries-order_by).
See the rules for `NaN` sorting in the section [ORDER BY clause](../query_language/select.md).


[Original article](https://clickhouse.yandex/docs/en/data_types/float/) <!--hide-->
1 change: 0 additions & 1 deletion docs/en/data_types/int_uint.md
@@ -1,4 +1,3 @@
<a name="data_type-int"></a>

# UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64

@@ -25,7 +25,7 @@ CREATE TABLE t
) ENGINE = ...
```

[uniq](../../query_language/agg_functions/reference.md#agg_function-uniq), anyIf ([any](../../query_language/agg_functions/reference.md#agg_function-any)+[If](../../query_language/agg_functions/combinators.md#agg-functions-combine-if)) and [quantiles](../../query_language/agg_functions/reference.md#agg_function-quantiles) are the aggregate functions supported in ClickHouse.
[uniq](../../query_language/agg_functions/reference.md#agg_function-uniq), anyIf ([any](../../query_language/agg_functions/reference.md#agg_function-any)+[If](../../query_language/agg_functions/combinators.md#agg-functions-combine-if)) and [quantiles](../../query_language/agg_functions/reference.md) are the aggregate functions supported in ClickHouse.

## Usage

@@ -60,7 +60,7 @@ SELECT uniqMerge(state) FROM (SELECT uniqState(UserID) AS state FROM table GROUP

## Usage Example

See [AggregatingMergeTree](../../operations/table_engines/aggregatingmergetree.md#table_engine-aggregatingmergetree) engine description.
See [AggregatingMergeTree](../../operations/table_engines/aggregatingmergetree.md) engine description.


[Original article](https://clickhouse.yandex/docs/en/data_types/nested_data_structures/aggregatefunction/) <!--hide-->
4 changes: 2 additions & 2 deletions docs/en/data_types/nullable.md
@@ -2,9 +2,9 @@

# Nullable(TypeName)

Allows to store special marker ([NULL](../query_language/syntax.md#null-literal)) that denotes "missing value" alongside normal values allowed by `TypeName`. For example, a `Nullable(Int8)` type column can store `Int8` type values, and the rows that don't have a value will store `NULL`.
Allows to store special marker ([NULL](../query_language/syntax.md)) that denotes "missing value" alongside normal values allowed by `TypeName`. For example, a `Nullable(Int8)` type column can store `Int8` type values, and the rows that don't have a value will store `NULL`.

For a `TypeName`, you can't use composite data types [Array](array.md#data_type is array) and [Tuple](tuple.md#data_type-tuple). Composite data types can contain `Nullable` type values, such as `Array(Nullable(Int8))`.
For a `TypeName`, you can't use composite data types [Array](array.md#data_type is array) and [Tuple](tuple.md). Composite data types can contain `Nullable` type values, such as `Array(Nullable(Int8))`.

A `Nullable` type field can't be included in table indexes.

3 changes: 1 addition & 2 deletions docs/en/data_types/special_data_types/nothing.md
@@ -1,10 +1,9 @@
<a name="special_data_type-nothing"></a>

# Nothing

The only purpose of this data type is to represent cases where value is not expected. So you can't create a `Nothing` type value.

For example, literal [NULL](../../query_language/syntax.md#null-literal) has type of `Nullable(Nothing)`. See more about [Nullable](../../data_types/nullable.md#data_type-nullable).
For example, literal [NULL](../../query_language/syntax.md#null-literal) has type of `Nullable(Nothing)`. See more about [Nullable](../../data_types/nullable.md).

The `Nothing` type can also be used to denote empty arrays:

1 change: 0 additions & 1 deletion docs/en/data_types/string.md
@@ -1,4 +1,3 @@
<a name="data_types-string"></a>

# String

5 changes: 2 additions & 3 deletions docs/en/data_types/tuple.md
@@ -1,10 +1,9 @@
<a name="data_type-tuple"></a>

# Tuple(T1, T2, ...)

A tuple of elements, each having an individual [type](index.md#data_types).

You can't store tuples in tables (other than Memory tables). They are used for temporary column grouping. Columns can be grouped when an IN expression is used in a query, and for specifying certain formal parameters of lambda functions. For more information, see the sections [IN operators](../query_language/select.md#query_language-in_operators) and [Higher order functions](../query_language/functions/higher_order_functions.md#higher_order_functions).
You can't store tuples in tables (other than Memory tables). They are used for temporary column grouping. Columns can be grouped when an IN expression is used in a query, and for specifying certain formal parameters of lambda functions. For more information, see the sections [IN operators](../query_language/select.md) and [Higher order functions](../query_language/functions/higher_order_functions.md#higher_order_functions).

Tuples can be the result of a query. In this case, for text formats other than JSON, values are comma-separated in brackets. In JSON formats, tuples are output as arrays (in square brackets).

@@ -34,7 +33,7 @@ SELECT

## Working with data types

When creating a tuple on the fly, ClickHouse automatically detects the type of each argument as the minimum of the types which can store the argument value. If the argument is [NULL](../query_language/syntax.md#null-literal), the type of the tuple element is [Nullable](nullable.md#data_type-nullable).
When creating a tuple on the fly, ClickHouse automatically detects the type of each argument as the minimum of the types which can store the argument value. If the argument is [NULL](../query_language/syntax.md#null-literal), the type of the tuple element is [Nullable](nullable.md).

Example of automatic data type detection:

1 change: 0 additions & 1 deletion docs/en/getting_started/example_datasets/ontime.md
@@ -1,4 +1,3 @@
<a name="example_datasets-ontime"></a>

# OnTime

16 changes: 8 additions & 8 deletions docs/en/interfaces/formats.md
@@ -95,7 +95,7 @@ Only a small set of symbols are escaped. You can easily stumble onto a string va

Arrays are written as a list of comma-separated values in square brackets. Number items in the array are formatted as usual, but dates, dates with times, and strings are written in single quotes with the same escaping rules as above.

[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.
[NULL](../query_language/syntax.md) is formatted as `\N`.

<a name="tabseparatedraw"></a>

@@ -141,7 +141,7 @@ SearchPhrase=curtain designs count()=1064
SearchPhrase=baku count()=1000
```

[NULL](../query_language/syntax.md#null-literal) is formatted as `\N`.
[NULL](../query_language/syntax.md) is formatted as `\N`.

``` sql
SELECT * FROM t_null FORMAT TSKV
@@ -267,7 +267,7 @@ If the query contains GROUP BY, rows_before_limit_at_least is the exact number o

This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).

ClickHouse supports [NULL](../query_language/syntax.md#null-literal), which is displayed as `null` in the JSON output.
ClickHouse supports [NULL](../query_language/syntax.md), which is displayed as `null` in the JSON output.

See also the JSONEachRow format.

@@ -361,7 +361,7 @@ Outputs data as Unicode-art tables, also using ANSI-escape sequences for setting
A full grid of the table is drawn, and each row occupies two lines in the terminal.
Each result block is output as a separate table. This is necessary so that blocks can be output without buffering results (buffering would be necessary in order to pre-calculate the visible width of all the values).

[NULL](../query_language/syntax.md#null-literal) is output as `ᴺᵁᴸᴸ`.
[NULL](../query_language/syntax.md) is output as `ᴺᵁᴸᴸ`.

``` sql
SELECT * FROM t_null
@@ -457,11 +457,11 @@ FixedString is represented simply as a sequence of bytes.

Array is represented as a varint length (unsigned [LEB128](https://en.wikipedia.org/wiki/LEB128)), followed by successive elements of the array.

For [NULL](../query_language/syntax.md#null-literal) support, an additional byte containing 1 or 0 is added before each [Nullable](../data_types/nullable.md#data_type-nullable) value. If 1, then the value is `NULL` and this byte is interpreted as a separate value. If 0, the value after the byte is not `NULL`.
For [NULL](../query_language/syntax.md#null-literal) support, an additional byte containing 1 or 0 is added before each [Nullable](../data_types/nullable.md) value. If 1, then the value is `NULL` and this byte is interpreted as a separate value. If 0, the value after the byte is not `NULL`.
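The two rules above (a varint length before array elements, and a 1/0 marker byte before each `Nullable` value) can be illustrated with a minimal Python sketch. The function names are hypothetical, and the single-byte signed encoding of `Int8` is an assumption made for the example; this is not the ClickHouse implementation.

```python
import struct


def encode_varint(n):
    """Encode a non-negative integer as unsigned LEB128."""
    if n < 0:
        raise ValueError("unsigned LEB128 cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F          # take the low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(byte)
            return bytes(out)


def encode_nullable_int8(value):
    """Marker byte 1 means NULL (no value follows); 0 means a value follows."""
    if value is None:
        return b"\x01"
    # Assumption for this sketch: Int8 is written as one signed byte.
    return b"\x00" + struct.pack("<b", value)


def encode_array_nullable_int8(values):
    """Varint length, then the successive (nullable) elements."""
    body = b"".join(encode_nullable_int8(v) for v in values)
    return encode_varint(len(values)) + body


encoded = encode_array_nullable_int8([1, None, -1])
print(encoded.hex())
```

Note that a `NULL` element occupies only the marker byte, so elements of such an array do not all have the same encoded width.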

## Values

Prints every row in brackets. Rows are separated by commas. There is no comma after the last row. The values inside the brackets are also comma-separated. Numbers are output in decimal format without quotes. Arrays are output in square brackets. Strings, dates, and dates with times are output in quotes. Escaping rules and parsing are similar to the [TabSeparated](#tabseparated) format. During formatting, extra spaces aren't inserted, but during parsing, they are allowed and skipped (except for spaces inside array values, which are not allowed). [NULL](../query_language/syntax.md#null-literal) is represented as `NULL`.
Prints every row in brackets. Rows are separated by commas. There is no comma after the last row. The values inside the brackets are also comma-separated. Numbers are output in decimal format without quotes. Arrays are output in square brackets. Strings, dates, and dates with times are output in quotes. Escaping rules and parsing are similar to the [TabSeparated](#tabseparated) format. During formatting, extra spaces aren't inserted, but during parsing, they are allowed and skipped (except for spaces inside array values, which are not allowed). [NULL](../query_language/syntax.md) is represented as `NULL`.

The minimum set of characters that you need to escape when passing data in Values format: single quotes and backslashes.
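As a rough illustration of the rules above, the following Python sketch renders rows in a Values-like form. It assumes that "brackets" around a row means parentheses, as in `INSERT INTO t VALUES (...)`, and it applies only the minimum escaping named above (backslashes and single quotes). `format_value` and `format_rows` are hypothetical helper names, not a ClickHouse API.

```python
def format_value(value):
    """Render one value: NULL, bare number, bracketed array, or quoted string."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)            # numbers: decimal, no quotes
    if isinstance(value, (list, tuple)):
        # Arrays go in square brackets, with no spaces between items.
        return "[" + ",".join(format_value(item) for item in value) + "]"
    # Strings (and anything else, via str()): escape backslashes first,
    # then single quotes, and wrap the result in single quotes.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + text + "'"


def format_rows(rows):
    """Join rows with commas; no comma after the last row, no extra spaces."""
    return ",".join(
        "(" + ",".join(format_value(v) for v in row) + ")" for row in rows
    )


print(format_rows([(1, "it's", None), (2, "a\\b", [1, 2])]))
```

Escaping the backslash before the quote matters: in the opposite order, the backslash added for a quote would itself be doubled.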

@@ -473,7 +473,7 @@ This is the format that is used in `INSERT INTO t VALUES ...`, but you can also

Prints each value on a separate line with the column name specified. This format is convenient for printing just one or a few rows, if each row consists of a large number of columns.

[NULL](../query_language/syntax.md#null-literal) is output as `ᴺᵁᴸᴸ`.
[NULL](../query_language/syntax.md) is output as `ᴺᵁᴸᴸ`.

Example:

@@ -618,7 +618,7 @@ struct Message {
}
```

Schema files are in the file that is located in the directory specified in [ format_schema_path](../operations/server_settings/settings.md#server_settings-format_schema_path) in the server configuration.
Schema files are in the file that is located in the directory specified in [ format_schema_path](../operations/server_settings/settings.md) in the server configuration.

Deserialization is effective and usually doesn't increase the system load.

12 changes: 11 additions & 1 deletion docs/en/interfaces/third-party/gui.md
@@ -38,6 +38,16 @@ The following features are planned for development:
- Cluster management.
- Monitoring replicated and Kafka tables.

### LightHouse

[LightHouse](https://github.com/VKCOM/lighthouse) is a lightweight web interface for ClickHouse.

Features:

- Table list with filtering and metadata.
- Table preview with filtering and sorting.
- Read-only queries execution.

## Commercial

### DBeaver
@@ -63,4 +73,4 @@ Features:
- Refactorings.
- Search and Navigation.

[Original article](https://clickhouse.yandex/docs/en/interfaces/third-party_gui/) <!--hide-->
[Original article](https://clickhouse.yandex/docs/en/interfaces/third-party/gui/) <!--hide-->
34 changes: 31 additions & 3 deletions docs/en/interfaces/third-party/integrations.md
@@ -1,19 +1,47 @@
# Integration Libraries from Third-party Developers

!!! warning "Disclaimer"
Yandex does **not** maintain the libraries listed below and haven't done any extensive testing to ensure their quality.
Yandex does **not** maintain the tools and libraries listed below and haven't done any extensive testing to ensure their quality.

## Infrastructure Products

- Relational database management systems
- [MySQL](https://www.mysql.com)
- [ProxySQL](https://github.com/sysown/proxysql/wiki/ClickHouse-Support)
- [clickhouse-mysql-data-reader](https://github.com/Altinity/clickhouse-mysql-data-reader)
- [PostgreSQL](https://www.postgresql.org)
- [infi.clickhouse_fdw](https://github.com/Infinidat/infi.clickhouse_fdw) (uses [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm))
- Object store
- S3
- [MSSQL](https://en.wikipedia.org/wiki/Microsoft_SQL_Server)
- [ClickHouseMigrator](https://github.com/zlzforever/ClickHouseMigrator)
- Message queues
- [Kafka](https://kafka.apache.org)
- [clickhouse_sinker](https://github.com/housepower/clickhouse_sinker) (uses [Go client](https://github.com/kshvakov/clickhouse/))
- Object storages
- [S3](https://en.wikipedia.org/wiki/Amazon_S3)
- [clickhouse-backup](https://github.com/AlexAkulov/clickhouse-backup)
- Monitoring
- [Graphite](https://graphiteapp.org)
- [graphouse](https://github.com/yandex/graphouse)
- [carbon-clickhouse](https://github.com/lomik/carbon-clickhouse)
- [Grafana](https://grafana.com/)
- [clickhouse-grafana](https://github.com/Vertamedia/clickhouse-grafana)
- [Prometheus](https://prometheus.io/)
- [clickhouse_exporter](https://github.com/f1yegor/clickhouse_exporter)
- [PromHouse](https://github.com/Percona-Lab/PromHouse)
- Logging
- [fluentd](https://www.fluentd.org)
- [loghouse](https://github.com/flant/loghouse) (for [Kubernetes](https://kubernetes.io))

## Programming Language Ecosystems

- Python
- [SQLAlchemy](https://www.sqlalchemy.org)
- [sqlalchemy-clickhouse](https://github.com/cloudflare/sqlalchemy-clickhouse) (uses [infi.clickhouse_orm](https://github.com/Infinidat/infi.clickhouse_orm))
- [pandas](https://pandas.pydata.org)
- [pandahouse](https://github.com/kszucs/pandahouse)
- R
- [dplyr](https://db.rstudio.com/dplyr/)
- [RClickhouse](https://github.com/IMSMWU/RClickhouse) (uses [clickhouse-cpp](https://github.com/artpaul/clickhouse-cpp))
- Java
- [Hadoop](http://hadoop.apache.org)
- [clickhouse-hdfs-loader](https://github.com/jaykelin/clickhouse-hdfs-loader) (uses [JDBC](../jdbc.md))
39 changes: 39 additions & 0 deletions docs/en/interfaces/third-party/proxy.md
@@ -0,0 +1,39 @@
# Proxy Servers from Third-party Developers

## chproxy

[chproxy](https://github.com/Vertamedia/chproxy) is an HTTP proxy and load balancer for the ClickHouse database.

Features:

* Per-user routing and response caching.
* Flexible limits.
* Automatic SSL certificate renewal.

Implemented in Go.

## KittenHouse

[KittenHouse](https://github.com/VKCOM/kittenhouse) is designed to be a local proxy between ClickHouse and application server in case it's impossible or inconvenient to buffer INSERT data on your application side.

Features:

* In-memory and on-disk data buffering.
* Per-table routing.
* Load-balancing and health checking.

Implemented in Go.

## ClickHouse-Bulk

[ClickHouse-Bulk](https://github.com/nikepan/clickhouse-bulk) is a simple ClickHouse insert collector.

Features:

* Group requests and send by threshold or interval.
* Multiple remote servers.
* Basic authentication.

Implemented in Go.

[Original article](https://clickhouse.yandex/docs/en/interfaces/third-party/proxy/) <!--hide-->
2 changes: 1 addition & 1 deletion docs/en/introduction/distinctive_features.md
@@ -59,6 +59,6 @@ ClickHouse provides various ways to trade accuracy for performance:

Uses asynchronous multimaster replication. After being written to any available replica, data is distributed to all the remaining replicas in the background. The system maintains identical data on different replicas. Recovery after most failures is performed automatically, and in complex cases — semi-automatically.

For more information, see the section [Data replication](../operations/table_engines/replication.md#table_engines-replication).
For more information, see the section [Data replication](../operations/table_engines/replication.md).

[Original article](https://clickhouse.yandex/docs/en/introduction/distinctive_features/) <!--hide-->
2 changes: 1 addition & 1 deletion docs/en/introduction/performance.md
@@ -2,7 +2,7 @@

According to internal testing results at Yandex, ClickHouse shows the best performance (both the highest throughput for long queries and the lowest latency on short queries) for comparable operating scenarios among systems of its class that were available for testing. You can view the test results on a [separate page](https://clickhouse.yandex/benchmark.html).

This has also been confirmed by numerous independent benchmarks. They are not difficult to find using an internet search, or you can see [our small collection of related links](https://clickhouse.yandex/#independent-bookmarks).
This has also been confirmed by numerous independent benchmarks. They are not difficult to find using an internet search, or you can see [our small collection of related links](https://clickhouse.yandex/#independent-benchmarks).

## Throughput for a Single Large Query
