diff --git a/docs/best-practices/json_type.md b/docs/best-practices/json_type.md
index aafbb64b48d..b620e0db06e 100644
--- a/docs/best-practices/json_type.md
+++ b/docs/best-practices/json_type.md
@@ -9,7 +9,7 @@ show_related_blogs: true
 doc_type: 'reference'
 ---

-ClickHouse now offers a native JSON column type designed for semi-structured and dynamic data. It's important to clarify that **this is a column type, not a data format**—you can insert JSON into ClickHouse as a string or via supported formats like [JSONEachRow](/docs/interfaces/formats/JSONEachRow), but that does not imply using the JSON column type. Users should only use the JSON type when the structure of their data is dynamic, not when they simply happen to store JSON.
+ClickHouse now offers a native JSON column type designed for semi-structured and dynamic data. It's important to clarify that **this is a column type, not a data format**—you can insert JSON into ClickHouse as a string or via supported formats like [JSONEachRow](/interfaces/formats/JSONEachRow), but that does not imply using the JSON column type. Users should only use the JSON type when the structure of their data is dynamic, not when they simply happen to store JSON.

 ## When to use the JSON type {#when-to-use-the-json-type}
diff --git a/docs/chdb/guides/querying-s3-bucket.md b/docs/chdb/guides/querying-s3-bucket.md
index 23bfff26465..4a7fde5ea27 100644
--- a/docs/chdb/guides/querying-s3-bucket.md
+++ b/docs/chdb/guides/querying-s3-bucket.md
@@ -49,7 +49,7 @@ To do this, we can use the [`s3` table function](/sql-reference/table-functions/
 If you pass just the bucket name it will throw an exception.
 :::

-We're also going to use the [`One`](/interfaces/formats#data-format-one) input format so that the file isn't parsed, instead a single row is returned per file and we can access the file via the `_file` virtual column and the path via the `_path` virtual column.
+We're also going to use the [`One`](/interfaces/formats/One) input format so that the file isn't parsed; instead, a single row is returned per file, and we can access the file via the `_file` virtual column and the path via the `_path` virtual column.

 ```python
 import chdb
diff --git a/docs/integrations/data-ingestion/clickpipes/kinesis.md b/docs/integrations/data-ingestion/clickpipes/kinesis.md
index 052497fd04f..b6c31be26bd 100644
--- a/docs/integrations/data-ingestion/clickpipes/kinesis.md
+++ b/docs/integrations/data-ingestion/clickpipes/kinesis.md
@@ -86,7 +86,7 @@ You have familiarized yourself with the [ClickPipes intro](./index.md) and setup
 ## Supported data formats {#supported-data-formats}

 The supported formats are:
-- [JSON](../../../interfaces/formats.md/#json)
+- [JSON](/interfaces/formats/JSON)

 ## Supported data types {#supported-data-types}
diff --git a/docs/integrations/data-ingestion/data-formats/arrow-avro-orc.md b/docs/integrations/data-ingestion/data-formats/arrow-avro-orc.md
index 86c333d4d83..74e503d8efb 100644
--- a/docs/integrations/data-ingestion/data-formats/arrow-avro-orc.md
+++ b/docs/integrations/data-ingestion/data-formats/arrow-avro-orc.md
@@ -15,7 +15,7 @@ Apache has released multiple data formats actively used in analytics environment
 ClickHouse supports reading and writing [Apache Avro](https://avro.apache.org/) data files, which are widely used in Hadoop systems.
-To import from an [avro file](assets/data.avro), we should use [Avro](/interfaces/formats.md/#data-format-avro) format in the `INSERT` statement:
+To import from an [avro file](assets/data.avro), we should use [Avro](/interfaces/formats/Avro) format in the `INSERT` statement:

 ```sql
 INSERT INTO sometable
@@ -70,7 +70,7 @@ LIMIT 3;

 ### Avro messages in Kafka {#avro-messages-in-kafka}

-When Kafka messages use Avro format, ClickHouse can read such streams using [AvroConfluent](/interfaces/formats.md/#data-format-avro-confluent) format and [Kafka](/engines/table-engines/integrations/kafka.md) engine:
+When Kafka messages use Avro format, ClickHouse can read such streams using [AvroConfluent](/interfaces/formats/AvroConfluent) format and [Kafka](/engines/table-engines/integrations/kafka.md) engine:

 ```sql
 CREATE TABLE some_topic_stream
@@ -87,7 +87,7 @@ kafka_format = 'AvroConfluent';

 ## Working with Arrow format {#working-with-arrow-format}

-Another columnar format is [Apache Arrow](https://arrow.apache.org/), also supported by ClickHouse for import and export. To import data from an [Arrow file](assets/data.arrow), we use the [Arrow](/interfaces/formats.md/#data-format-arrow) format:
+Another columnar format is [Apache Arrow](https://arrow.apache.org/), also supported by ClickHouse for import and export. To import data from an [Arrow file](assets/data.arrow), we use the [Arrow](/interfaces/formats/Arrow) format:

 ```sql
 INSERT INTO sometable
@@ -107,7 +107,7 @@ Also, check [data types matching](/interfaces/formats/Arrow#data-types-matching)

 ### Arrow data streaming {#arrow-data-streaming}

-The [ArrowStream](/interfaces/formats.md/#data-format-arrow-stream) format can be used to work with Arrow streaming (used for in-memory processing). ClickHouse can read and write Arrow streams.
+The [ArrowStream](/interfaces/formats/ArrowStream) format can be used to work with Arrow streaming (used for in-memory processing). ClickHouse can read and write Arrow streams.

 To demonstrate how ClickHouse can stream Arrow data, let's pipe it to the following python script (it reads input stream in Arrow streaming format and outputs the result as a Pandas table):
@@ -140,7 +140,7 @@ We've used `arrow-stream` as a possible source of Arrow streaming data.

 ## Importing and exporting ORC data {#importing-and-exporting-orc-data}

-[Apache ORC](https://orc.apache.org/) format is a columnar storage format typically used for Hadoop. ClickHouse supports importing as well as exporting [Orc data](assets/data.orc) using [ORC format](/interfaces/formats.md/#data-format-orc):
+[Apache ORC](https://orc.apache.org/) format is a columnar storage format typically used for Hadoop. ClickHouse supports importing as well as exporting [Orc data](assets/data.orc) using [ORC format](/interfaces/formats/ORC):

 ```sql
 SELECT *
diff --git a/docs/integrations/data-ingestion/data-formats/binary.md b/docs/integrations/data-ingestion/data-formats/binary.md
index e720dbce392..1d12035d1ab 100644
--- a/docs/integrations/data-ingestion/data-formats/binary.md
+++ b/docs/integrations/data-ingestion/data-formats/binary.md
@@ -16,7 +16,7 @@ We're going to use some_data [table](assets/some_data.sql) and [data](assets/som
 ## Exporting in a Native ClickHouse format {#exporting-in-a-native-clickhouse-format}

-The most efficient data format to export and import data between ClickHouse nodes is [Native](/interfaces/formats.md/#native) format. Exporting is done using `INTO OUTFILE` clause:
+The most efficient data format to export and import data between ClickHouse nodes is [Native](/interfaces/formats/Native) format. Exporting is done using `INTO OUTFILE` clause:

 ```sql
 SELECT * FROM some_data
@@ -74,7 +74,7 @@ FORMAT Native

 ## Exporting to RowBinary {#exporting-to-rowbinary}

-Another binary format supported is [RowBinary](/interfaces/formats.md/#rowbinary), which allows importing and exporting data in binary-represented rows:
+Another binary format supported is [RowBinary](/interfaces/formats/RowBinary), which allows importing and exporting data in binary-represented rows:

 ```sql
 SELECT * FROM some_data
@@ -101,7 +101,7 @@ LIMIT 5
 └────────────────────────────────┴────────────┴──────┘
 ```

-Consider using [RowBinaryWithNames](/interfaces/formats.md/#rowbinarywithnames), which also adds a header row with a columns list. [RowBinaryWithNamesAndTypes](/interfaces/formats.md/#rowbinarywithnamesandtypes) will also add an additional header row with column types.
+Consider using [RowBinaryWithNames](/interfaces/formats/RowBinaryWithNames), which also adds a header row with a columns list. [RowBinaryWithNamesAndTypes](/interfaces/formats/RowBinaryWithNamesAndTypes) will also add an additional header row with column types.

 ### Importing from RowBinary files {#importing-from-rowbinary-files}
 To load data from a RowBinary file, we can use a `FROM INFILE` clause:
@@ -115,7 +115,7 @@ FORMAT RowBinary

 ## Importing single binary value using RawBLOB {#importing-single-binary-value-using-rawblob}

 Suppose we want to read an entire binary file and save it into a field in a table.
-This is the case when the [RawBLOB format](/interfaces/formats.md/#rawblob) can be used. This format can be directly used with a single-column table only:
+This is the case when the [RawBLOB format](/interfaces/formats/RawBLOB) can be used. This format can be directly used with a single-column table only:

 ```sql
 CREATE TABLE images(data String) ENGINE = Memory
@@ -152,7 +152,7 @@ Note that we had to use `LIMIT 1` because exporting more than a single value will

 ## MessagePack {#messagepack}

-ClickHouse supports importing and exporting to [MessagePack](https://msgpack.org/) using the [MsgPack](/interfaces/formats.md/#msgpack). To export to MessagePack format:
+ClickHouse supports importing and exporting to [MessagePack](https://msgpack.org/) using the [MsgPack](/interfaces/formats/MsgPack) format. To export to MessagePack format:

 ```sql
 SELECT *
@@ -173,7 +173,7 @@ FORMAT MsgPack


-To work with [Protocol Buffers](/interfaces/formats.md/#protobuf) we first need to define a [schema file](assets/schema.proto):
+To work with [Protocol Buffers](/interfaces/formats/Protobuf) we first need to define a [schema file](assets/schema.proto):

 ```protobuf
 syntax = "proto3";
@@ -185,7 +185,7 @@ message MessageType {
 };
 ```

-Path to this schema file (`schema.proto` in our case) is set in a `format_schema` settings option for the [Protobuf](/interfaces/formats.md/#protobuf) format:
+Path to this schema file (`schema.proto` in our case) is set in a `format_schema` settings option for the [Protobuf](/interfaces/formats/Protobuf) format:

 ```sql
 SELECT * FROM some_data
@@ -194,7 +194,7 @@ FORMAT Protobuf
 SETTINGS format_schema = 'schema:MessageType'
 ```

-This saves data to the [proto.bin](assets/proto.bin) file. ClickHouse also supports importing Protobuf data as well as nested messages. Consider using [ProtobufSingle](/interfaces/formats.md/#protobufsingle) to work with a single Protocol Buffer message (length delimiters will be omitted in this case).
+This saves data to the [proto.bin](assets/proto.bin) file. ClickHouse also supports importing Protobuf data as well as nested messages. Consider using [ProtobufSingle](/interfaces/formats/ProtobufSingle) to work with a single Protocol Buffer message (length delimiters will be omitted in this case).

 ## Cap'n Proto {#capn-proto}
@@ -212,7 +212,7 @@ struct PathStats {
 }
 ```

-Now we can import and export using [CapnProto](/interfaces/formats.md/#capnproto) format and this schema:
+Now we can import and export using [CapnProto](/interfaces/formats/CapnProto) format and this schema:

 ```sql
 SELECT
diff --git a/docs/integrations/data-ingestion/data-formats/csv-tsv.md b/docs/integrations/data-ingestion/data-formats/csv-tsv.md
index c7b49429ee3..8d5d414d433 100644
--- a/docs/integrations/data-ingestion/data-formats/csv-tsv.md
+++ b/docs/integrations/data-ingestion/data-formats/csv-tsv.md
@@ -31,7 +31,7 @@ To import data from the [CSV file](assets/data_small.csv) to the `sometable` tab
 clickhouse-client -q "INSERT INTO sometable FORMAT CSV" < data_small.csv
 ```

-Note that we use [FORMAT CSV](/interfaces/formats.md/#csv) to let ClickHouse know we're ingesting CSV formatted data. Alternatively, we can load data from a local file using the [FROM INFILE](/sql-reference/statements/insert-into.md/#inserting-data-from-a-file) clause:
+Note that we use [FORMAT CSV](/interfaces/formats/CSV) to let ClickHouse know we're ingesting CSV formatted data. Alternatively, we can load data from a local file using the [FROM INFILE](/sql-reference/statements/insert-into.md/#inserting-data-from-a-file) clause:

 ```sql
 INSERT INTO sometable
@@ -59,7 +59,7 @@ head data-small-headers.csv
 "Aegithina_tiphia","2018-02-01",34
 ```

-To import data from this file, we can use [CSVWithNames](/interfaces/formats.md/#csvwithnames) format:
+To import data from this file, we can use [CSVWithNames](/interfaces/formats/CSVWithNames) format:

 ```bash
 clickhouse-client -q "INSERT INTO sometable FORMAT CSVWithNames" < data_small_headers.csv
@@ -153,17 +153,17 @@ SELECT * FROM file('nulls.csv')

 ## TSV (tab-separated) files {#tsv-tab-separated-files}

-Tab-separated data format is widely used as a data interchange format. To load data from a [TSV file](assets/data_small.tsv) to ClickHouse, the [TabSeparated](/interfaces/formats.md/#tabseparated) format is used:
+Tab-separated data format is widely used as a data interchange format. To load data from a [TSV file](assets/data_small.tsv) to ClickHouse, the [TabSeparated](/interfaces/formats/TabSeparated) format is used:

 ```bash
 clickhouse-client -q "INSERT INTO sometable FORMAT TabSeparated" < data_small.tsv
 ```

-There's also a [TabSeparatedWithNames](/interfaces/formats.md/#tabseparatedwithnames) format to allow working with TSV files that have headers. And, like for CSV, we can skip the first X lines using the [input_format_tsv_skip_first_lines](/operations/settings/settings-formats.md/#input_format_tsv_skip_first_lines) option.
+There's also a [TabSeparatedWithNames](/interfaces/formats/TabSeparatedWithNames) format to allow working with TSV files that have headers. And, like for CSV, we can skip the first X lines using the [input_format_tsv_skip_first_lines](/operations/settings/settings-formats.md/#input_format_tsv_skip_first_lines) option.
 ### Raw TSV {#raw-tsv}

-Sometimes, TSV files are saved without escaping tabs and line breaks. We should use [TabSeparatedRaw](/interfaces/formats.md/#tabseparatedraw) to handle such files.
+Sometimes, TSV files are saved without escaping tabs and line breaks. We should use [TabSeparatedRaw](/interfaces/formats/TabSeparatedRaw) to handle such files.

 ## Exporting to CSV {#exporting-to-csv}
@@ -183,7 +183,7 @@ FORMAT CSV
 "2016_Greater_Western_Sydney_Giants_season","2017-05-01",86
 ```

-To add a header to the CSV file, we use the [CSVWithNames](/interfaces/formats.md/#csvwithnames) format:
+To add a header to the CSV file, we use the [CSVWithNames](/interfaces/formats/CSVWithNames) format:

 ```sql
 SELECT *
@@ -273,7 +273,7 @@ All column types will be treated as a `String` in this case.

 ### Exporting and importing CSV with explicit column types {#exporting-and-importing-csv-with-explicit-column-types}

-ClickHouse also allows explicitly setting column types when exporting data using [CSVWithNamesAndTypes](/interfaces/formats.md/#csvwithnamesandtypes) (and other *WithNames formats family):
+ClickHouse also allows explicitly setting column types when exporting data using [CSVWithNamesAndTypes](/interfaces/formats/CSVWithNamesAndTypes) (and other *WithNames formats family):

 ```sql
 SELECT *
@@ -308,7 +308,7 @@ Now ClickHouse identifies column types based on a (second) header row instead of

 ## Custom delimiters, separators, and escaping rules {#custom-delimiters-separators-and-escaping-rules}

-In sophisticated cases, text data can be formatted in a highly custom manner but still have a structure. ClickHouse has a special [CustomSeparated](/interfaces/formats.md/#format-customseparated) format for such cases, which allows setting custom escaping rules, delimiters, line separators, and starting/ending symbols.
+In sophisticated cases, text data can be formatted in a highly custom manner but still have a structure. ClickHouse has a special [CustomSeparated](/interfaces/formats/CustomSeparated) format for such cases, which allows setting custom escaping rules, delimiters, line separators, and starting/ending symbols.

 Suppose we have the following data in the file:
@@ -341,7 +341,7 @@ LIMIT 3
 └───────────────────────────┴────────────┴─────┘
 ```

-We can also use [CustomSeparatedWithNames](/interfaces/formats.md/#customseparatedwithnames) to get headers exported and imported correctly. Explore [regex and template](templates-regex.md) formats to deal with even more complex cases.
+We can also use [CustomSeparatedWithNames](/interfaces/formats/CustomSeparatedWithNames) to get headers exported and imported correctly. Explore [regex and template](templates-regex.md) formats to deal with even more complex cases.

 ## Working with large CSV files {#working-with-large-csv-files}
diff --git a/docs/integrations/data-ingestion/data-formats/json/exporting.md b/docs/integrations/data-ingestion/data-formats/json/exporting.md
index a1e1627f5be..f13344b357b 100644
--- a/docs/integrations/data-ingestion/data-formats/json/exporting.md
+++ b/docs/integrations/data-ingestion/data-formats/json/exporting.md
@@ -8,7 +8,7 @@ doc_type: 'guide'

 # Exporting JSON

-Almost any JSON format used for import can be used for export as well. The most popular is [`JSONEachRow`](/interfaces/formats.md/#jsoneachrow):
+Almost any JSON format used for import can be used for export as well. The most popular is [`JSONEachRow`](/interfaces/formats/JSONEachRow):
 ```sql
 SELECT * FROM sometable FORMAT JSONEachRow
@@ -19,7 +19,7 @@ SELECT * FROM sometable FORMAT JSONEachRow
 {"path":"Ahmadabad-e_Kalij-e_Sofla","month":"2017-01-01","hits":3}
 ```

-Or we can use [`JSONCompactEachRow`](/interfaces/formats#jsoncompacteachrow) to save disk space by skipping column names:
+Or we can use [`JSONCompactEachRow`](/interfaces/formats/JSONCompactEachRow) to save disk space by skipping column names:

 ```sql
 SELECT * FROM sometable FORMAT JSONCompactEachRow
@@ -32,7 +32,7 @@ SELECT * FROM sometable FORMAT JSONCompactEachRow

 ## Overriding data types as strings {#overriding-data-types-as-strings}

-ClickHouse respects data types and will export JSON accordingly to standards. But in cases where we need to have all values encoded as strings, we can use the [JSONStringsEachRow](/interfaces/formats.md/#jsonstringseachrow) format:
+ClickHouse respects data types and will export JSON according to standards. But in cases where we need to have all values encoded as strings, we can use the [JSONStringsEachRow](/interfaces/formats/JSONStringsEachRow) format:

 ```sql
 SELECT * FROM sometable FORMAT JSONStringsEachRow
@@ -56,7 +56,7 @@ SELECT * FROM sometable FORMAT JSONCompactStringsEachRow

 ## Exporting metadata together with data {#exporting-metadata-together-with-data}

-General [JSON](/interfaces/formats.md/#json) format, which is popular in apps, will export not only resulting data but column types and query stats:
+General [JSON](/interfaces/formats/JSON) format, which is popular in apps, will export not only resulting data but column types and query stats:

 ```sql
 SELECT * FROM sometable FORMAT JSON
@@ -93,7 +93,7 @@ SELECT * FROM sometable FORMAT JSON
 }
 ```

-The [JSONCompact](/interfaces/formats.md/#jsoncompact) format will print the same metadata but use a compacted form for the data itself:
+The [JSONCompact](/interfaces/formats/JSONCompact) format will print the same metadata but use a compacted form for the data itself:

 ```sql
 SELECT * FROM sometable FORMAT JSONCompact
@@ -127,11 +127,11 @@ SELECT * FROM sometable FORMAT JSONCompact
 }
 ```

-Consider [`JSONStrings`](/interfaces/formats.md/#jsonstrings) or [`JSONCompactStrings`](/interfaces/formats.md/#jsoncompactstrings) variants to encode all values as strings.
+Consider [`JSONStrings`](/interfaces/formats/JSONStrings) or [`JSONCompactStrings`](/interfaces/formats/JSONCompactStrings) variants to encode all values as strings.
 ## Compact way to export JSON data and structure {#compact-way-to-export-json-data-and-structure}

-A more efficient way to have data, as well as it's structure, is to use [`JSONCompactEachRowWithNamesAndTypes`](/interfaces/formats.md/#jsoncompacteachrowwithnamesandtypes) format:
+A more efficient way to export data, as well as its structure, is to use the [`JSONCompactEachRowWithNamesAndTypes`](/interfaces/formats/JSONCompactEachRowWithNamesAndTypes) format:

 ```sql
 SELECT * FROM sometable FORMAT JSONCompactEachRowWithNamesAndTypes
diff --git a/docs/integrations/data-ingestion/data-formats/json/formats.md b/docs/integrations/data-ingestion/data-formats/json/formats.md
index b92c64c5365..3ebca753362 100644
--- a/docs/integrations/data-ingestion/data-formats/json/formats.md
+++ b/docs/integrations/data-ingestion/data-formats/json/formats.md
@@ -145,7 +145,7 @@ ENGINE = MergeTree
 ORDER BY tuple(month, path)
 ```

-To import a list of JSON objects, we can use a [`JSONEachRow`](/interfaces/formats.md/#jsoneachrow) format (inserting data from [list.json](../assets/list.json) file):
+To import a list of JSON objects, we can use a [`JSONEachRow`](/interfaces/formats/JSONEachRow) format (inserting data from [list.json](../assets/list.json) file):

 ```sql
 INSERT INTO sometable
@@ -191,7 +191,7 @@ cat objects.json
 }
 ```

-ClickHouse can load data from this kind of data using the [`JSONObjectEachRow`](/interfaces/formats.md/#jsonobjecteachrow) format:
+ClickHouse can load this kind of data using the [`JSONObjectEachRow`](/interfaces/formats/JSONObjectEachRow) format:

 ```sql
 INSERT INTO sometable FROM INFILE 'objects.json' FORMAT JSONObjectEachRow;
@@ -241,7 +241,7 @@ cat arrays.json
 ["1971-72_Utah_Stars_season", "2016-10-01", 1]
 ```

-In this case, ClickHouse will load this data and attribute each value to the corresponding column based on its order in the array. We use [`JSONCompactEachRow`](/interfaces/formats.md/#jsoncompacteachrow) format for this:
+In this case, ClickHouse will load this data and attribute each value to the corresponding column based on its order in the array. We use the [`JSONCompactEachRow`](/interfaces/formats/JSONCompactEachRow) format for this:
 ```sql
 SELECT * FROM sometable
@@ -269,7 +269,7 @@ cat columns.json
 }
 ```

-ClickHouse uses the [`JSONColumns`](/interfaces/formats.md/#jsoncolumns) format to parse data formatted like that:
+ClickHouse uses the [`JSONColumns`](/interfaces/formats/JSONColumns) format to parse data formatted like that:

 ```sql
 SELECT * FROM file('columns.json', JSONColumns)
@@ -282,7 +282,7 @@ SELECT * FROM file('columns.json', JSONColumns)
 └────────────────────────────┴────────────┴──────┘
 ```

-A more compact format is also supported when dealing with an [array of columns](../assets/columns-array.json) instead of an object using [`JSONCompactColumns`](/interfaces/formats.md/#jsoncompactcolumns) format:
+A more compact format is also supported when dealing with an [array of columns](../assets/columns-array.json) instead of an object, using the [`JSONCompactColumns`](/interfaces/formats/JSONCompactColumns) format:

 ```sql
 SELECT * FROM file('columns-array.json', JSONCompactColumns)
@@ -321,7 +321,7 @@ ENGINE = MergeTree
 ORDER BY ()
 ```

-Now we can load data from the file into this table using [`JSONAsString`](/interfaces/formats.md/#jsonasstring) format to keep JSON objects instead of parsing them:
+Now we can load data from the file into this table using [`JSONAsString`](/interfaces/formats/JSONAsString) format to keep JSON objects instead of parsing them:

 ```sql
 INSERT INTO events (data)
@@ -431,7 +431,7 @@ ClickHouse will throw exceptions in cases of inconsistent JSON and table columns

 ClickHouse allows exporting to and importing data from [BSON](https://bsonspec.org/) encoded files. This format is used by some DBMSs, e.g. [MongoDB](https://github.com/mongodb/mongo) database.

-To import BSON data, we use the [BSONEachRow](/interfaces/formats.md/#bsoneachrow) format. Let's import data from [this BSON file](../assets/data.bson):
+To import BSON data, we use the [BSONEachRow](/interfaces/formats/BSONEachRow) format. Let's import data from [this BSON file](../assets/data.bson):

 ```sql
 SELECT * FROM file('data.bson', BSONEachRow)
diff --git a/docs/integrations/data-ingestion/data-formats/json/loading.md b/docs/integrations/data-ingestion/data-formats/json/loading.md
index 25aa01cf6a2..7986925f96f 100644
--- a/docs/integrations/data-ingestion/data-formats/json/loading.md
+++ b/docs/integrations/data-ingestion/data-formats/json/loading.md
@@ -15,7 +15,7 @@ The following examples provide a very simple example of loading structured and s

 ## Loading structured JSON {#loading-structured-json}

-In this section, we assume the JSON data is in [`NDJSON`](https://github.com/ndjson/ndjson-spec) (Newline delimited JSON) format, known as [`JSONEachRow`](/interfaces/formats#jsoneachrow) in ClickHouse, and well structured i.e. the column names and types are fixed. `NDJSON` is the preferred format for loading JSON due to its brevity and efficient use of space, but others are supported for both [input and output](/interfaces/formats#json).
+In this section, we assume the JSON data is in [`NDJSON`](https://github.com/ndjson/ndjson-spec) (Newline delimited JSON) format, known as [`JSONEachRow`](/interfaces/formats/JSONEachRow) in ClickHouse, and well structured, i.e. the column names and types are fixed. `NDJSON` is the preferred format for loading JSON due to its brevity and efficient use of space, but others are supported for both [input and output](/interfaces/formats/JSON).
 Consider the following JSON sample, representing a row from the [Python PyPI dataset](https://clickpy.clickhouse.com/):
diff --git a/docs/integrations/data-ingestion/data-formats/parquet.md b/docs/integrations/data-ingestion/data-formats/parquet.md
index b4fdc91b32f..394339b1930 100644
--- a/docs/integrations/data-ingestion/data-formats/parquet.md
+++ b/docs/integrations/data-ingestion/data-formats/parquet.md
@@ -27,7 +27,7 @@ Before loading data, we can use [file()](/sql-reference/functions/files.md/#file
 DESCRIBE TABLE file('data.parquet', Parquet);
 ```

-We've used [Parquet](/interfaces/formats.md/#data-format-parquet) as a second argument, so ClickHouse knows the file format. This will print columns with the types:
+We've used [Parquet](/interfaces/formats/Parquet) as a second argument, so ClickHouse knows the file format. This will print columns with the types:

 ```response
 ┌─name─┬─type─────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
diff --git a/docs/integrations/data-ingestion/data-formats/sql.md b/docs/integrations/data-ingestion/data-formats/sql.md
index 038196ea3bb..96629fbaf8d 100644
--- a/docs/integrations/data-ingestion/data-formats/sql.md
+++ b/docs/integrations/data-ingestion/data-formats/sql.md
@@ -12,7 +12,7 @@ ClickHouse can be easily integrated into OLTP database infrastructures in many w

 ## Creating SQL dumps {#creating-sql-dumps}

-Data can be dumped in SQL format using [SQLInsert](/interfaces/formats.md/#sqlinsert). ClickHouse will write data in `INSERT INTO VALUES(...` form and use [`output_format_sql_insert_table_name`](/operations/settings/settings-formats.md/#output_format_sql_insert_table_name) settings option as a table name:
+Data can be dumped in SQL format using [SQLInsert](/interfaces/formats/SQLInsert). ClickHouse will write data in `INSERT INTO VALUES(...` form and use the [`output_format_sql_insert_table_name`](/operations/settings/settings-formats.md/#output_format_sql_insert_table_name) setting as the table name:
 ```sql
 SET output_format_sql_insert_table_name = 'some_table';
@@ -43,7 +43,7 @@ SET output_format_sql_insert_max_batch_size = 1000;

 ### Exporting a set of values {#exporting-a-set-of-values}

-ClickHouse has [Values](/interfaces/formats.md/#data-format-values) format, which is similar to SQLInsert, but omits an `INSERT INTO table VALUES` part and returns only a set of values:
+ClickHouse has [Values](/interfaces/formats/Values) format, which is similar to SQLInsert, but omits an `INSERT INTO table VALUES` part and returns only a set of values:

 ```sql
 SELECT * FROM some_data LIMIT 3 FORMAT Values
@@ -54,7 +54,7 @@ SELECT * FROM some_data LIMIT 3 FORMAT Values

 ## Inserting data from SQL dumps {#inserting-data-from-sql-dumps}

-To read SQL dumps, [MySQLDump](/interfaces/formats.md/#mysqldump) is used:
+To read SQL dumps, [MySQLDump](/interfaces/formats/MySQLDump) is used:

 ```sql
 SELECT *
diff --git a/docs/integrations/data-ingestion/data-formats/templates-regex.md b/docs/integrations/data-ingestion/data-formats/templates-regex.md
index 4a9a6a8d522..f77b69522af 100644
--- a/docs/integrations/data-ingestion/data-formats/templates-regex.md
+++ b/docs/integrations/data-ingestion/data-formats/templates-regex.md
@@ -24,7 +24,7 @@ head error.log
 2023/01/16 05:34:55 [error] client: 9.9.7.6, server: example.com "GET /h5/static/cert/icon_yanzhengma.png HTTP/1.1"
 ```

-We can use a [Template](/interfaces/formats.md/#format-template) format to import this data. We have to define a template string with values placeholders for each row of input data:
+We can use a [Template](/interfaces/formats/Template) format to import this data. We have to define a template string with value placeholders for each row of input data:

 ```response