You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Increase the search conditions to achieve the purpose of narrowing the search scope, but the result is that more data is parsed, and the query becomes slower
#13401
Open
MrSsunlight opened this issue
Aug 6, 2020
· 0 comments
The default.test_log table is a MergeTree() table,
The default.rule table is the Dictionary() table.
Query sql statement:
WITH
toDateTime('2020-08-05 00:00:00') As start_time,
toDateTime('2020-08-05 23:59:59') As end_time
SELECT
tnow,
if(ipType = 0, IPv4NumToString(ipv4), IPv6NumToString(ipv6)) AS ip
FROM default.test_log cl
WHERE
cl.tnow >= start_time AND
cl.tnow <= end_time AND
ip in (
SELECT
ra.addr
FROM default.rule ra
WHERE
(ra.is_deleted == 0 or ra.is_deleted == 1) and (ra.name = 'test')
)
LIMIT 0, 10
The difference between the two query sql is that one more condition (ra.name ='test') is added;
set send_logs_level='trace'
A: Filter condition does not contain (ra.name='test'):
<Debug> InterpreterSelectQuery: MergeTreeWhereOptimizer: condition "(if(ipType = 0, IPv4NumToString(ipv4), IPv6NumToString(ipv6)) AS ip) IN ((SELECT ra.addr FROM default.rule AS ra WHERE (ra.is_deleted = 0) OR (ra.is_deleted = 1)) AS _subquery9)" moved to PREWHERE
<Trace> AccessRightsContext (admin): Access granted: SELECT(addr, is_deleted) ON default.rule
<Trace> AccessRightsContext (admin): Access granted: SELECT(addr, is_deleted) ON default.rule
<Trace> AccessRightsContext (admin): Access granted: SELECT(tnow, ipType, ipv4, ipv6) ON default.test_log
<Debug> default.test_log (SelectExecutor): Key condition: unknown, (column 0 in [1596556800, +inf)), (column 0 in (-inf, 1596643199]), and, and, unknown, and
<Debug> default.test_log (SelectExecutor): MinMax index condition: unknown, (column 0 in [1596556800, +inf)), (column 0 in (-inf, 1596643199]), and, and, unknown, and
<Debug> default.test_log (SelectExecutor): Selected 4 parts by date, 4 parts by key, 87898 marks to read from 4 ranges
<Trace> default.test_log (SelectExecutor): Reading approx. 1437378474 rows with 12 streams
<Trace> InterpreterSelectQuery: FetchColumns -> Complete
<Trace> CreatingSetsBlockInputStream: Creating set.
<Trace> InterpreterSelectQuery: FetchColumns -> Complete
<Debug> CreatingSetsBlockInputStream: Created. Set with 477382 entries from 477382 rows. In 0.497 sec.
<Information> executeQuery: Read 903366 rows, 20.78 MiB in 1.300 sec., 695092 rows/sec., 15.99 MiB/sec.
<Debug> MemoryTracker: Peak memory usage (for query): 146.56 MiB.
B: Filter conditions include (ra.name ='test'):
<Debug> InterpreterSelectQuery: MergeTreeWhereOptimizer: condition "(if(ipType = 0, IPv4NumToString(ipv4), IPv6NumToString(ipv6)) AS ip) IN ((SELECT ra.addr FROM default.rule AS ra WHERE ((ra.is_deleted = 0) OR (ra.is_deleted = 1)) AND (ra.name = 'test')) AS _subquery10)" moved to PREWHERE
<Trace> AccessRightsContext (admin): Access granted: SELECT(addr, name, is_deleted) ON default.rule
<Trace> AccessRightsContext (admin): Access granted: SELECT(addr, name, is_deleted) ON default.rule
<Trace> AccessRightsContext (admin): Access granted: SELECT(tnow, ipType, ipv4, ipv6) ON default.test_log
<Debug> default.test_log (SelectExecutor): Key condition: unknown, (column 0 in [1596556800, +inf)), (column 0 in (-inf, 1596643199]), and, and, unknown, and
<Debug> default.test_log (SelectExecutor): MinMax index condition: unknown, (column 0 in [1596556800, +inf)), (column 0 in (-inf, 1596643199]), and, and, unknown, and
<Debug> default.test_log (SelectExecutor): Selected 4 parts by date, 4 parts by key, 87898 marks to read from 4 ranges
<Trace> default.test_log (SelectExecutor): Reading approx. 1437378474 rows with 12 streams
<Trace> InterpreterSelectQuery: FetchColumns -> Complete
<Trace> CreatingSetsBlockInputStream: Creating set.
<Trace> InterpreterSelectQuery: FetchColumns -> Complete
<Debug> CreatingSetsBlockInputStream: Created. Set with 1 entries from 1 rows. In 0.644 sec.
<Debug> MergeTreeReadPool: Slow read, event №1: read 1048576 bytes in 1.081 sec., 0.970 MB/s.
<Debug> MergeTreeReadPool: Slow read, event №2: read 1048576 bytes in 1.687 sec., 0.622 MB/s.
<Debug> MergeTreeReadPool: Will lower number of threads to 11
<Information> executeQuery: Read 1412403247 rows, 27.63 GiB in 34.523 sec., 40912481 rows/sec., 819.54 MiB/sec.
<Debug> MemoryTracker: Peak memory usage (for query): 126.37 MiB.
The difference between A and B is CreatingSetsBlockInputStream: Created. Set with * entries from * rows ;
The number of results of '(ra.is_deleted == 0 or ra.is_deleted == 1) and (ra.name = 'test')' is much smaller than that of '(ra.is_deleted == 0 or ra.is_deleted == 1)', But the 'Created. Set entries' of A needs more than the query of B.<Debug> CreatingSetsBlockInputStream: Created. Set with 477382 entries from 477382 rows. In 0.497 sec and <Debug> CreatingSetsBlockInputStream: Created. Set with 1 entries from 1 rows. In 0.644 sec;
What is the logic of CreatingSetsBlockInputStream: Created. Set with 1 entrie ? Why I added a conditional restriction but got more data?
The text was updated successfully, but these errors were encountered:
system: centos7
version: 20.3.11.97
The default.test_log table is a MergeTree() table,
The default.rule table is the Dictionary() table.
Query sql statement:
The difference between the two query sql is that one more condition (ra.name ='test') is added;
set send_logs_level='trace'
A: Filter condition does not contain (ra.name='test'):
B: Filter conditions include (ra.name ='test'):
The difference between A and B is CreatingSetsBlockInputStream: Created. Set with * entries from * rows ;
The number of results of
'(ra.is_deleted == 0 or ra.is_deleted == 1) and (ra.name = 'test')'
is much smaller than that of'(ra.is_deleted == 0 or ra.is_deleted == 1)'
, But the 'Created. Set entries' of A needs more than the query of B.<Debug> CreatingSetsBlockInputStream: Created. Set with 477382 entries from 477382 rows. In 0.497 sec
and<Debug> CreatingSetsBlockInputStream: Created. Set with 1 entries from 1 rows. In 0.644 sec
;What is the logic of
CreatingSetsBlockInputStream: Created. Set with 1 entrie
? Why I added a conditional restriction but got more data?The text was updated successfully, but these errors were encountered: