
ClickHouse ignores settings from subquery #63078

Open
rjoelnorgren opened this issue Apr 27, 2024 · 3 comments
@rjoelnorgren

Describe what's wrong
The SETTINGS clause from a subquery does not apply to the query.

How to reproduce
No subquery

SELECT *
FROM numbers(10)
SETTINGS optimize_read_in_order = 0

-- Query id: d8f9702d-2f45-4976-a1c0-f0401bcf9593

SELECT Settings
FROM system.query_log
WHERE query_id = 'd8f9702d-2f45-4976-a1c0-f0401bcf9593'
LIMIT 1

--    ┌─Settings───────────────────────┐
-- 1. │ {'optimize_read_in_order':'0'} │
--    └────────────────────────────────┘

With subquery

SELECT *
FROM
(
    SELECT *
    FROM numbers(10)
    SETTINGS optimize_read_in_order = 0
)

-- Query id: 98714456-aa89-48b9-a5e0-1ebadc866aaa

SELECT Settings
FROM system.query_log
WHERE query_id = '98714456-aa89-48b9-a5e0-1ebadc866aaa'
LIMIT 1


--    ┌─Settings─┐
-- 1. │ {}       │
--    └──────────┘

Expected behavior

Settings are applied to the subquery.

rjoelnorgren added the potential bug label Apr 27, 2024
@alexey-milovidov (Member)

Settings are scoped. If a setting is specified in a subquery, it applies to the scope of the subquery.
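
As a sketch of what that scoping implies for the original numbers(10) example (assuming the goal is for the setting to apply to the whole query), the SETTINGS clause can be attached to the outermost SELECT instead:

-- Sketch: SETTINGS on the outermost SELECT, so it applies to the full query
-- scope rather than only to the inner subquery.
SELECT *
FROM
(
    SELECT *
    FROM numbers(10)
)
SETTINGS optimize_read_in_order = 0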

alexey-milovidov self-assigned this Apr 28, 2024
alexey-milovidov removed the potential bug label Apr 28, 2024
@rjoelnorgren (Author)

@alexey-milovidov, sorry, my example was not an actual reproduction. Below is the more involved example in which I found this to be a problem: optimize_read_in_order = 0 gives a ~10x performance improvement for this query pattern, but the setting appears not to be applied in the subquery variant. This is problematic for some patterns used in https://github.com/ClickHouse/dbt-clickhouse.

no subquery

SELECT
    item_id,
    location_id,
    created_at,
    metric - lagInFrame(metric) OVER prior AS delta
FROM my_table
FINAL
WHERE (created_at >= (now() - toIntervalDay(7))) AND (created_at <= now())
WINDOW prior AS (PARTITION BY item_id, location_id ORDER BY created_at ASC RANGE BETWEEN 86400 PRECEDING AND 1 PRECEDING)
SETTINGS optimize_read_in_order = 0
FORMAT `Null`

-- optimize_read_in_order = 0
-- 0 rows in set. Elapsed: 4.400 sec. Processed 682.14 million rows, 12.28 GB (155.02 million rows/s., 2.79 GB/s.)
-- Peak memory usage: 10.99 GiB.

-- optimize_read_in_order = 1
-- 0 rows in set. Elapsed: 46.819 sec. Processed 682.14 million rows, 12.28 GB (14.57 million rows/s., 262.26 MB/s.)
-- Peak memory usage: 676.24 MiB.

subquery

SELECT *
FROM
(
    SELECT
        item_id,
        location_id,
        created_at,
        metric - lagInFrame(metric) OVER prior AS delta
    FROM my_table
    FINAL
    WHERE (created_at >= (now() - toIntervalDay(7))) AND (created_at <= now())
    WINDOW prior AS (PARTITION BY item_id, location_id ORDER BY created_at ASC RANGE BETWEEN 86400 PRECEDING AND 1 PRECEDING)
    SETTINGS optimize_read_in_order = 0
)
FORMAT `Null`

-- optimize_read_in_order = 0
-- 0 rows in set. Elapsed: 45.771 sec. Processed 682.14 million rows, 12.28 GB (14.90 million rows/s., 268.26 MB/s.)
-- Peak memory usage: 688.16 MiB.

-- optimize_read_in_order = 1
-- 0 rows in set. Elapsed: 45.059 sec. Processed 682.14 million rows, 12.28 GB (15.14 million rows/s., 272.50 MB/s.)
-- Peak memory usage: 692.87 MiB.
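
A sketch of the same query with the SETTINGS clause hoisted to the outer SELECT (not timed here; the assumption is that the outer scope is where the setting needs to be visible for the read-in-order optimization to be disabled):

-- Sketch (untimed): SETTINGS moved to the outer query so it is not confined
-- to the subquery scope.
SELECT *
FROM
(
    SELECT
        item_id,
        location_id,
        created_at,
        metric - lagInFrame(metric) OVER prior AS delta
    FROM my_table
    FINAL
    WHERE (created_at >= (now() - toIntervalDay(7))) AND (created_at <= now())
    WINDOW prior AS (PARTITION BY item_id, location_id ORDER BY created_at ASC RANGE BETWEEN 86400 PRECEDING AND 1 PRECEDING)
)
SETTINGS optimize_read_in_order = 0
FORMAT `Null`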

ddl

CREATE TABLE my_table
(
    `item_id` UInt64 CODEC(Delta(8), ZSTD(1)),
    `location_id` UInt16 CODEC(Delta(2), ZSTD(1)),
    `metric` SimpleAggregateFunction(max, UInt32) CODEC(T64, ZSTD(1)),
    `created_at` DateTime CODEC(Delta(4), ZSTD(1))
)
ENGINE = AggregatingMergeTree
PARTITION BY toStartOfWeek(created_at)
ORDER BY (item_id, location_id, created_at)
SETTINGS index_granularity = 8192

@italodamato

Yes, I've also just noticed the same thing using optimize_aggregation_in_order = 1 in a subquery.
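
A sketch of the same workaround for that setting, assuming a hypothetical aggregation over the my_table schema above (the query itself is illustrative, not from the report):

-- Hedged sketch: hypothetical aggregation query; the setting is placed on the
-- outer SELECT so it is not confined to the subquery's scope.
SELECT item_id, max_metric
FROM
(
    SELECT item_id, max(metric) AS max_metric
    FROM my_table
    GROUP BY item_id
)
SETTINGS optimize_aggregation_in_order = 1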
