Merged
2 changes: 1 addition & 1 deletion docs/best-practices/json_type.md
@@ -224,7 +224,7 @@ ORDER BY doc.update_date
We provide a type hint for the `update_date` column in the JSON definition, as we use it in the ordering/primary key. This helps ClickHouse to know that this column won't be null and ensures it knows which `update_date` sub-column to use (there may be multiple for each type, so this is ambiguous otherwise).
:::

- We can insert into this table and view the subsequently inferred schema using the [`JSONAllPathsWithTypes`](/sql-reference/functions/json-functions#jsonallpathswithtypes) function and [`PrettyJSONEachRow`](/interfaces/formats/PrettyJSONEachRow) output format:
+ We can insert into this table and view the subsequently inferred schema using the [`JSONAllPathsWithTypes`](/sql-reference/functions/json-functions#JSONAllPathsWithTypes) function and [`PrettyJSONEachRow`](/interfaces/formats/PrettyJSONEachRow) output format:

```sql
INSERT INTO arxiv FORMAT JSONAsObject
@@ -101,7 +101,7 @@ input_format_parquet_case_insensitive_column_matching = 1 -- Column matching bet
:::note Note on nested column structures
The `VARIANT` and `OBJECT` columns in the original Snowflake table schema will be output as JSON strings by default, forcing us to cast these when inserting them into ClickHouse.

- Nested structures such as `some_file` are converted to JSON strings on copy by Snowflake. Importing this data requires us to transform these structures to Tuples at insert time in ClickHouse, using the [JSONExtract function](/sql-reference/functions/json-functions#jsonextract) as shown above.
+ Nested structures such as `some_file` are converted to JSON strings on copy by Snowflake. Importing this data requires us to transform these structures to Tuples at insert time in ClickHouse, using the [JSONExtract function](/sql-reference/functions/json-functions#JSONExtract) as shown above.
:::

## Test successful data export {#3-testing-successful-data-export}
1 change: 1 addition & 0 deletions docs/getting-started/index.md
@@ -42,6 +42,7 @@ by https://github.com/ClickHouse/clickhouse-docs/blob/main/scripts/autogenerate-
| [Foursquare places](/getting-started/example-datasets/foursquare-places) | Dataset with over 100 million records containing information about places on a map, such as shops, restaurants, parks, playgrounds, and monuments. |
| [GitHub Events Dataset](/getting-started/example-datasets/github-events) | Dataset containing all events on GitHub from 2011 to Dec 6 2020, with a size of 3.1 billion records. |
| [Hacker News dataset](/getting-started/example-datasets/hacker-news) | Dataset containing 28 million rows of hacker news data. |
+ | [Hacker News Vector Search dataset](/getting-started/example-datasets/hackernews-vector-search-dataset) | Dataset containing 28+ million Hacker News postings & their vector embeddings |
| [LAION 5B dataset](/getting-started/example-datasets/laion-5b-dataset) | Dataset containing 100 million vectors from the LAION 5B dataset |
| [Laion-400M dataset](/getting-started/example-datasets/laion-400m-dataset) | Dataset containing 400 million images with English image captions |
| [New York Public Library "What's on the Menu?" Dataset](/getting-started/example-datasets/menus) | Dataset containing 1.3 million records of historical data on the menus of hotels, restaurants and cafes with the dishes along with their prices. |
@@ -70,7 +70,7 @@ SELECT JSONExtractString(tags, 'holidays') AS holidays FROM people
1 row in set. Elapsed: 0.002 sec.
```

- Notice how the functions require both a reference to the `String` column `tags` and a path in the JSON to extract. Nested paths require functions to be nested e.g. `JSONExtractUInt(JSONExtractString(tags, 'car'), 'year')` which extracts the column `tags.car.year`. The extraction of nested paths can be simplified through the functions [`JSON_QUERY`](/sql-reference/functions/json-functions#json_query) and [`JSON_VALUE`](/sql-reference/functions/json-functions#json_value).
+ Notice how the functions require both a reference to the `String` column `tags` and a path in the JSON to extract. Nested paths require functions to be nested e.g. `JSONExtractUInt(JSONExtractString(tags, 'car'), 'year')` which extracts the column `tags.car.year`. The extraction of nested paths can be simplified through the functions [`JSON_QUERY`](/sql-reference/functions/json-functions#JSON_QUERY) and [`JSON_VALUE`](/sql-reference/functions/json-functions#json_value).

Consider the extreme case with the `arxiv` dataset where we consider the entire body to be a `String`.

3 changes: 3 additions & 0 deletions scripts/settings/autogenerate-settings.sh
@@ -266,6 +266,7 @@ if [ -f "$FUNCTION_SQL_FILE" ]; then
"Encryption"
"Hash"
"Introspection"
+ "JSON"
)

for CATEGORY in "${FUNCTION_CATEGORIES[@]}"; do
@@ -376,6 +377,7 @@ insert_src_files=(
"encryption-functions.md"
"hash-functions.md"
"introspection-functions.md"
+ "json-functions.md"
)

insert_dest_files=(
@@ -394,6 +396,7 @@ insert_dest_files=(
"docs/sql-reference/functions/encryption-functions.md"
"docs/sql-reference/functions/hash-functions.md"
"docs/sql-reference/functions/introspection.md"
+ "docs/sql-reference/functions/json-functions.md"
)

echo "[$SCRIPT_NAME] Inserting generated markdown content between AUTOGENERATED_START and AUTOGENERATED_END tags"
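
Reviewer note: the `echo` line above says the script inserts generated markdown between `AUTOGENERATED_START` and `AUTOGENERATED_END` tags, and `insert_src_files` and `insert_dest_files` appear to be paired by position. The rest of the script is not shown in this diff, so the following is only a minimal sketch of how such an insertion could work. The function name `insert_between_markers` and the assumption that each tag sits on its own line are hypothetical, not taken from the script.

```bash
# Hypothetical helper (not from the PR): replace everything between the
# AUTOGENERATED_START and AUTOGENERATED_END marker lines in "$dest" with
# the contents of "$src", keeping both marker lines in place.
insert_between_markers() {
  local src="$1" dest="$2" tmp
  tmp="$(mktemp)"
  awk -v src="$src" '
    /AUTOGENERATED_START/ {
      print                                        # keep the start marker
      while ((getline line < src) > 0) print line  # emit the generated content
      close(src)
      skipping = 1                                 # drop the old content
      next
    }
    /AUTOGENERATED_END/ { skipping = 0 }           # end marker is printed below
    !skipping { print }
  ' "$dest" > "$tmp"
  # Write back through redirection so the destination keeps its permissions.
  cat "$tmp" > "$dest"
  rm -f "$tmp"
}
```

Under that assumption, the two arrays could be walked by index, for example `for i in "${!insert_src_files[@]}"; do insert_between_markers "${insert_src_files[$i]}" "${insert_dest_files[$i]}"; done`, in which case the new `json-functions.md` entries would have to sit at the same position in both lists, as they do in this change.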