
CROSS JOIN using unexpectedly large amount of memory #12571

Open
gkristic opened this issue Jul 17, 2020 · 14 comments
Labels
comp-joins JOINs memory When memory usage is higher than expected minor Priority: minor st-hold We've paused the work on issue for some reason unexpected behaviour

Comments

@gkristic

gkristic commented Jul 17, 2020

I've been having issues trying to find prefix relationships between strings in a table, using startsWith on top of a cross join. Although the dataset is about 20GiB uncompressed, ClickHouse keeps trying to allocate memory well past the point where the entire table would fit in it. I tried on a box with 512GiB of RAM (running ClickHouse exclusively) and the query still aborts due to memory limits. My original query was more complicated and involved aggregation functions, which I blamed at first. But as I simplified the case, I noticed that the problem persists even without aggregation of any kind. In this scenario I'd expect ClickHouse to loop over the data and stream the results right away, but queries still abort instead.

Setup: I'm using ClickHouse in Docker. It's almost exactly the standard yandex/clickhouse-server image in Docker Hub (20.5.2.7), except that I changed the max memory for queries to take advantage of the extra memory in the instance I'm using. Specifically, here's my Dockerfile:

FROM yandex/clickhouse-server:20.5.2.7
RUN sed -i '/max_memory_usage/s/[[:digit:]]\+/494000000000/' /etc/clickhouse-server/users.xml

The maximum total server memory is capped by the 0.9 RAM ratio that comes preset in the standard image.
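
As a sanity check, the effective per-query limit can be verified from a client with something like the following (illustrative; system.settings reflects the settings in effect for the current session):

SELECT name, value FROM system.settings WHERE name = 'max_memory_usage';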

Dataset: The table I'm using has only two columns: an ID (UInt64) and a name (String). In my original case the strings are all strictly different and vary in length (average ~2KiB), but I reduced the problem further to a dataset that can be easily generated on demand. In this case all strings are equal and 2KiB long. Here's the exact table structure:

clickhouse-client -q 'create database testdb'
clickhouse-client -q 'create table testdb.xs (id UInt64, name String) engine=MergeTree() order by id'

You can populate the data with:

X="$(dd if=/dev/zero bs=2048 count=1 status=none | tr '\0' X)"
seq 10000000 | sed "s/\$/,$X/" | clickhouse-client -q 'insert into testdb.xs format CSV'

Query: Here's the query I run:

select q.id, p.id
from xs q, xs p
where startsWith(q.name, p.name);

The problem seems to be related to the presence of startsWith. If I remove it and go for the full cross join, or change it to avoid the reflexive cases (q.id != p.id), ClickHouse starts returning results immediately. With this query, however, it blocks until it aborts due to memory; it doesn't print even a single result. I set send_logs_level to trace to get a better perspective, and here's what I got (summarized below for brevity; full output here).
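
For comparison, the two variants that do stream results right away look like this (reconstructed from the description above):

-- no predicate at all: plain cross join, results stream immediately
select q.id, p.id from xs q, xs p;

-- predicate on the ids only, skipping the reflexive cases: also streams immediately
select q.id, p.id from xs q, xs p where q.id != p.id;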

8c0c3408045c :) select q.id, p.id
:-] from xs q, xs p
:-] where startsWith(q.name, p.name);

SELECT
    q.id,
    p.id
FROM xs AS q
, xs AS p
WHERE startsWith(q.name, p.name)

[8c0c3408045c] 2020.07.17 20:13:04.317559 [ 91 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> executeQuery: (from 172.17.0.3:58310) select q.id, p.id from xs q, xs p where startsWith(q.name, p.name);
↙ Progress: 0.00 rows, 0.00 B (0.00 rows/s., 0.00 B/s.)
← Progress: 71.38 thousand rows, 147.40 MB (641.09 thousand rows/s., 1.32 GB/s.)  0%
  . . .
[8c0c3408045c] 2020.07.17 20:13:04.504685 [ 155 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (total): 1.01 GiB.
[8c0c3408045c] 2020.07.17 20:13:04.507148 [ 137 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (for query): 2.00 GiB.
↑ Progress: 1.14 million rows, 2.35 GB (3.64 million rows/s., 7.52 GB/s.)  11%
  . . .
[8c0c3408045c] 2020.07.17 20:13:06.040223 [ 105 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> CreatingSetsBlockInputStream: Created Join with 10000000 entries from 10000000 rows in 1.709907685 sec.
[8c0c3408045c] 2020.07.17 20:13:06.101276 [ 101 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (total): 16.00 GiB.
[8c0c3408045c] 2020.07.17 20:13:06.109714 [ 104 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (for query): 32.01 GiB.
  . . .
[8c0c3408045c] 2020.07.17 20:13:21.054537 [ 123 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (total): 420.67 GiB.
[8c0c3408045c] 2020.07.17 20:13:21.112827 [ 123 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (for query): 459.00 GiB.
[8c0c3408045c] 2020.07.17 20:13:21.113220 [ 123 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (total): 424.73 GiB.
[8c0c3408045c] 2020.07.17 20:13:21.132869 [ 105 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (for query): 459.01 GiB.
[8c0c3408045c] 2020.07.17 20:14:16.628530 [ 91 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Error> executeQuery: Code: 241, e.displayText() = DB::Exception: Memory limit (for query) exceeded: would use 463.01 GiB (attempt to allocate chunk of 4294967296 bytes), maximum: 460.07 GiB (version 20.5.2.7 (official build)) (from 172.17.0.3:58310) (in query: select q.id, p.id from xs q, xs p where startsWith(q.name, p.name);), Stack trace (when copying this message, always include the lines below):

0. Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x10ed0da0 in /usr/bin/clickhouse
1. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x95c923d in /usr/bin/clickhouse
2. ? @ 0x95c8228 in /usr/bin/clickhouse
3. MemoryTracker::alloc(long) @ 0x95c6724 in /usr/bin/clickhouse
4. Allocator<false, false>::realloc(void*, unsigned long, unsigned long, unsigned long) @ 0x96118fb in /usr/bin/clickhouse
5. ? @ 0xe10356c in /usr/bin/clickhouse
6. DB::ColumnString::insertFrom(DB::IColumn const&, unsigned long) @ 0xe10ceec in /usr/bin/clickhouse
7. DB::HashJoin::joinBlockImplCross(DB::Block&, std::__1::shared_ptr<DB::ExtraBlock>&) const @ 0xdbd7784 in /usr/bin/clickhouse
8. DB::HashJoin::joinBlock(DB::Block&, std::__1::shared_ptr<DB::ExtraBlock>&) @ 0xdc15c1f in /usr/bin/clickhouse
9. DB::ExpressionAction::execute(DB::Block&, bool, std::__1::shared_ptr<DB::ExtraBlock>&) const @ 0xdd46de3 in /usr/bin/clickhouse
10. DB::ExpressionActions::execute(DB::Block&, std::__1::shared_ptr<DB::ExtraBlock>&, unsigned long&) const @ 0xdd4980e in /usr/bin/clickhouse
11. DB::InflatingExpressionTransform::readExecute(DB::Chunk&) @ 0xe991b3d in /usr/bin/clickhouse
12. DB::InflatingExpressionTransform::transform(DB::Chunk&) @ 0xe992539 in /usr/bin/clickhouse
13. DB::ISimpleTransform::work() @ 0xe6d7b50 in /usr/bin/clickhouse
14. ? @ 0xe70aa21 in /usr/bin/clickhouse
15. ? @ 0xe70f1a6 in /usr/bin/clickhouse
16. ? @ 0xe70f7f2 in /usr/bin/clickhouse
17. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x95f6e97 in /usr/bin/clickhouse
18. ? @ 0x95f5383 in /usr/bin/clickhouse
19. start_thread @ 0x76db in /lib/x86_64-linux-gnu/libpthread-2.27.so
20. clone @ 0x12188f in /lib/x86_64-linux-gnu/libc-2.27.so


Received exception from server (version 20.5.2):
Code: 241. DB::Exception: Received from server:9000. DB::Exception: Memory limit (for query) exceeded: would use 463.01 GiB (attempt to allocate chunk of 4294967296 bytes), maximum: 460.07 GiB.

0 rows in set. Elapsed: 72.313 sec. Processed 10.16 million rows, 20.99 GB (140.54 thousand rows/s., 290.22 MB/s.)

I tried with the standard Yandex.Metrica dataset, using the URL field in much the same way:

select q.URL, p.URL
from hits_v1 q, hits_v1 p
where startsWith(q.URL, p.URL);

In this case it sent ~10K results almost immediately (for URL http://public_search) and then kept processing silently. I didn't let it finish because I had to shut down the instance, but I let it run for at least 10 minutes and memory use seemed pretty stable: between 30 and 40GiB most of the time, peaking at ~50GiB. That still seems extremely high, though, considering that the uncompressed dataset is 6GiB. Also, in my case above the string is 2KiB for all rows, whereas in this dataset the URL column is empty for most rows (more than 7 out of 8 million), and most of the non-empty values are shorter than 200 bytes.
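
(The claim about URL lengths can be checked with something like the query below; illustrative only, using the column names from the standard hits_v1 table:)

SELECT
    countIf(empty(URL)) AS empty_urls,
    count() AS total_rows,
    quantile(0.99)(length(URL)) AS p99_length
FROM hits_v1;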

For contrast:

SELECT
    database,
    formatReadableSize(sum(bytes_on_disk)) AS on_disk,
    formatReadableSize(sum(data_compressed_bytes)) AS data_compressed,
    formatReadableSize(sum(data_uncompressed_bytes)) AS data_uncompressed
FROM parts
WHERE database IN ('testdb', 'datasets')
GROUP BY database

┌─database─┬─on_disk────┬─data_compressed─┬─data_uncompressed─┐
│ testdb   │ 115.80 MiB │ 115.70 MiB      │ 19.17 GiB         │
│ datasets │ 1.18 GiB   │ 1.18 GiB        │ 5.73 GiB          │
└──────────┴────────────┴─────────────────┴───────────────────┘

2 rows in set. Elapsed: 0.006 sec.

Let me know if there's anything I can help with.

@4ertus2
Contributor

4ertus2 commented Jul 20, 2020

I've tried to reproduce the issue on my laptop. First of all, it returns some data immediately, though I think there's no guarantee of that.

│ 1294337 │ 1304334 │
│ 1294337 │ 1304335 │
│ 1294337 │ 1304336 │
└─────────┴─────────┘
Cancelling query.
  Showed first 10000.
Query was cancelled.

110000000 rows in set. Elapsed: 22.331 sec.
│  1 │  9998 │
│  1 │  9999 │
│  1 │ 10000 │
└────┴───────┘
Cancelling query.
  Showed first 10000.
Query was cancelled.

230000000 rows in set. Elapsed: 39.441 sec.

The query should generate 10^7 * 10^7 = 10^14 rows. Given that my laptop (running a debug build) generates ~10^7 rows per second, it would need 10^14 / 10^7 = 10^7 seconds (~116 days) to finish.

Then I started the query and left it running for several minutes. My laptop has 16GB of memory, and the query used about 2GB without growth.

2020.07.20 16:11:42.000129 [ 587649 ] {0a5d8a2e-0362-414e-982e-0fecd824ee22} <Debug> MemoryTracker: Current memory usage (for query): 2.03 GiB.
Query was cancelled.

2940000000 rows in set. Elapsed: 491.853 sec. Processed 10.13 million rows, 172.23 MB (20.60 thousand rows/s., 350.16 KB/s.) 

I tested the issue on a newer debug build, but that should not matter in the context of memory usage: there have been no significant changes in the CROSS JOIN logic since 20.5.

@4ertus2 4ertus2 added the st-need-repro We were not able to reproduce the problem, please help us. label Jul 20, 2020
@den-crane
Contributor

@4ertus2 it does not.

The right table is huge:
formatReadableSize(sum(data_uncompressed_bytes))
19.17 GiB

It does not fit into memory even when joined with a single-row left table:

version 20.7.1

set max_memory_usage = 30000000000
set join_algorithm = 'hash'
SET send_logs_level = 'trace' ;

select q.id, p.id from (select 1 id, '' name) q cross join xs p where startsWith(q.name, p.name);
DB::Exception: Memory limit (for query) exceeded: would use 27.94 GiB

@4ertus2
Contributor

4ertus2 commented Jul 20, 2020

@den-crane

There's a message that the right table has been loaded into memory, and we ran out of memory in the join phase, not during right-table insertion.

[8c0c3408045c] 2020.07.17 20:13:06.040223 [ 105 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> CreatingSetsBlockInputStream: Created Join with 10000000 entries from 10000000 rows in 1.709907685 sec.
[8c0c3408045c] 2020.07.17 20:13:06.101276 [ 101 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (total): 16.00 GiB.
[8c0c3408045c] 2020.07.17 20:13:06.109714 [ 104 ] {927d1881-3b29-4a59-8725-92f1f8a453ac} <Debug> MemoryTracker: Current memory usage (for query): 32.01 GiB.

I cannot reproduce this issue. If the problem were just a huge right table that doesn't fit, we would not even see the "CreatingSetsBlockInputStream: Created Join with ..." record in the logs.

@den-crane
Contributor

10 mil rows * 2kb = raw data 20 GB, it will take 100GB RAM in a hash table

10 000
select q.id, p.id from (select * from xs limit 1) q cross join (select * from xs limit 10000) p where startsWith(q.name, p.name) format Null;

Peak memory usage (for query): 125.42 MiB.

100 000
select q.id, p.id from (select * from xs limit 1) q cross join (select * from xs limit 100000) p where startsWith(q.name, p.name) format Null;
Peak memory usage (for query): 975.44 MiB.

1 000 000
select q.id, p.id from (select * from xs limit 1) q cross join (select * from xs limit 1000000) p where startsWith(q.name, p.name) format Null;
Peak memory usage (for query): 7.26 GiB.

10 000 000
select q.id, p.id from (select * from xs limit 1) q cross join (select * from xs limit 10000000) p where startsWith(q.name, p.name) format Null;
Memory limit (for query) exceeded: would use 27.95 GiB (attempt to allocate chunk of 8454144 bytes), maximum: 27.94 GiB: (while reading column name)

@4ertus2
Contributor

4ertus2 commented Jul 20, 2020

My bad. I did something wrong with the population scripts and got empty name strings in the table. It should indeed fail on my laptop with 2KB strings.

10 mil rows * 2kb = raw data 20 GB, it will take 100GB RAM in a hash table

CROSS JOIN does not need a hash table, only the columns. So the 32GB in the query log is probably OK: 20GB of data plus some overhead from memory allocations in the columns.
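
Rough arithmetic behind that estimate (assuming each String value is stored as its raw bytes plus a small per-row offset, and the UInt64 id as 8 bytes per row): 10,000,000 * (2,048 + ~16) bytes ≈ 20.6 GB of column data, so allocation overhead and reallocation slack while the columns are built could plausibly account for the rest of the 32GB.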

In any case, it is expected that it may fail at the right-table creation phase; it is not expected that memory keeps growing during the CROSS JOIN phase.

@4ertus2 4ertus2 removed the st-need-repro We were not able to reproduce the problem, please help us. label Jul 20, 2020
@4ertus2 4ertus2 self-assigned this Jul 20, 2020
@4ertus2 4ertus2 changed the title Query using unexpectedly large amount of memory CROSS JOIN using unexpectedly large amount of memory Jul 20, 2020
@4ertus2 4ertus2 added unexpected behaviour and removed question Question? labels Jul 20, 2020
@4ertus2
Contributor

4ertus2 commented Jul 20, 2020

OK, it looks like I've reproduced the issue.

@gkristic could you run the query below with max_threads = 1 set? Would it fail with that or not?

select q.id, p.id, p.name
from xs q, xs p
where startsWith(q.name, p.name)
settings max_threads = 1

@gkristic
Author

Sure! I've just run that query with the same setup as before. It produced a first batch of results rather quickly (it returned the first 10K rows), then kept processing without printing anything other than the stats, with understandably decreasing rates for rows/s and MB/s. The output is not comfortable to look at with all those useless X's, though 😃 But I let it run for minutes and I didn't see resident memory grow past ~80GiB. Is this enough for what you wanted to know?

Thanks for taking a look at this @4ertus2 and @den-crane! Let me know if there's anything else I can do.

@gkristic
Author

Let me quantify the "I let it run for minutes"... It was ~20 minutes.

@gkristic
Author

I ran it again with format Null so that I could capture the trace without all the interference from the annoying X's, in case it gives you more information. Available here.
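
(For reference, that run looked roughly like this; a reconstruction from the settings mentioned above, not a verbatim capture:)

SET send_logs_level = 'trace';

select q.id, p.id, p.name
from xs q, xs p
where startsWith(q.name, p.name)
settings max_threads = 1
format Null;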

@4ertus2
Contributor

4ertus2 commented Jul 21, 2020

Thanks.

Summary:
Right table in memory: 31.05 GiB. Not great, but expected (10 mil rows * 2KB = 20 GB of raw data).
Additional memory usage even with max_threads = 1: more than 60 GiB, and it grows. Not expected.

Summary 2:
We probably need lazy loading for big strings (some kind of BLOB logic).

@4ertus2 4ertus2 removed their assignment Oct 16, 2020
@alexey-milovidov alexey-milovidov added the memory When memory usage is higher than expected label Feb 21, 2021
@alexey-milovidov
Member

The query

SELECT
    q.id,
    p.id
FROM testdb.xs AS q
, testdb.xs AS p
WHERE (rand() % 100000000) = 123

works alright, the memory usage is minimal (4.70 GiB).

Something is related to the condition on startsWith.

@alexey-milovidov alexey-milovidov added the minor Priority: minor label Feb 21, 2021
@alexey-milovidov
Member

This query requires a lot of RAM:

select q.id, p.id from testdb.xs q, testdb.xs p WHERE concat(p.name, '') = '';

And this query is working fine:

select q.id, p.id from testdb.xs q, testdb.xs p WHERE concat(q.name, '') = '';

@alexey-milovidov
Member

The reason is that the full table (with both id and name) has to be fetched into memory.
I can confirm what @4ertus2 has already written.

@alexey-milovidov
Member

Postponed until #21047.

@alexey-milovidov alexey-milovidov added the st-hold We've paused the work on issue for some reason label Feb 22, 2021