large virtual memory spike on BLOB column select #6462
Closed
Date: 2017-11-09 16:40:36 +0100
From: Anton Kravchenko <<kravchenko.anton86>>
To: SQL devs <>
Version: 11.27.9 (Jul2017-SP2)
CC: kravchenko.anton86, @njnes
Last updated: 2017-12-14 14:46:07 +0100
Comment 25859
Date: 2017-11-09 16:40:36 +0100
From: Anton Kravchenko <<kravchenko.anton86>>
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.89 Safari/537.36
Build Identifier:
'select v_blob from blob_table' produces a virtual memory spike of ~800GB
[adding a 'limit' clause reproduces the spike too].
According to 'select * from storage()', the blob column size is ~25GB.
By contrast, 'create table new_blob_table as (select v_blob from blob_table) with
data;'
consumes only ~50GB of virtual memory.
For a table with a non-blob column, both
'select * from nonblob_table'
and
'create table new_nonblobtable as (select * from nonblob_table) with data;'
consume roughly the same amount of virtual memory.
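The report does not say how the virtual memory figures were measured. One way to observe them on Linux, sketched here under that assumption, is to read the VmPeak and VmSize fields from /proc/<pid>/status of the server process. The helper names parse_vm_status and read_vm_status are hypothetical, and the sample text is synthetic input for illustration, not measured output.

```python
import re


def parse_vm_status(status_text):
    """Extract VmPeak and VmSize (in kB) from the text of /proc/<pid>/status."""
    values = {}
    for key in ("VmPeak", "VmSize"):
        match = re.search(rf"^{key}:\s+(\d+)\s+kB", status_text, re.MULTILINE)
        if match:
            values[key] = int(match.group(1))
    return values


def read_vm_status(pid):
    """Read and parse the status file of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_status(f.read())


# Synthetic sample in the /proc status layout; the numbers are made up.
sample = "Name:\tmserver5\nVmPeak:\t  838860800 kB\nVmSize:\t   52428800 kB\n"
parsed = parse_vm_status(sample)
print(parsed)
```

Polling read_vm_status with the server's pid while the query runs would show whether the peak rises well above the steady-state size.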
Reproducible: Always
Steps to Reproduce:
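# each row is one blob of 2000 bytes, written as 4000 hex characters
hex_blob='47'*2000
# one chunk holds 1000 rows, one per line
hex_blob_chunk=(hex_blob+'\n')*1000
f=open('blob_data.txt','w')
# 10000 chunks of 1000 rows give 10,000,000 rows in total
for irow in range(10000):
    f.write(hex_blob_chunk)
f.close()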
To load the data into the MonetDB table blob_table:
create table blob_table(v_blob blob);
COPY 10000000 RECORDS INTO blob_table FROM '/home/blob_data.txt' using
delimiters ',','\n' NULL AS '' LOCKED;
To produce the virtual memory spike of ~800GB:
select * from blob_table limit 50000;
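As a rough sanity check on the steps above, the row count and data volume follow directly from the generator's constants. This is a sketch of that arithmetic only; the constant names are introduced here for illustration.

```python
# Constants taken from the generator script in the report.
HEX_CHARS_PER_ROW = len('47' * 2000)   # 4000 hex characters per row
ROWS_PER_CHUNK = 1000
CHUNKS = 10000

rows = ROWS_PER_CHUNK * CHUNKS                    # rows written to the file
payload_bytes = rows * (HEX_CHARS_PER_ROW // 2)   # two hex characters per byte
file_bytes = rows * (HEX_CHARS_PER_ROW + 1)       # hex text plus one newline per row

print(rows)
print(payload_bytes / 1e9, "GB of raw blob payload")
print(file_bytes / 1e9, "GB text file")
```

The row count matches the 10000000 RECORDS in the COPY statement. The raw payload comes to 20GB, somewhat below the ~25GB reported by storage(); the difference is presumably per-value storage overhead, which is an assumption and is not stated in the report.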
Actual Results:
virtual memory spike of ~800GB
Expected Results:
virtual memory of ~25GB [MonetDB storage size of the blob column]
MonetDB 5 server v11.27.9 "Jul2017-SP2" (64-bit, 128-bit integers)
Copyright (c) 1993-July 2008 CWI
Copyright (c) August 2008-2017 MonetDB B.V., all rights reserved
Visit https://www.monetdb.org/ for further information
Found 503.6GiB available memory, 32 available cpu cores
Libraries:
libpcre: 8.39 2016-06-14 (compiled with 8.32)
openssl: OpenSSL 1.0.2k 26 Jan 2017 (compiled with OpenSSL 1.0.2l 25 May 2017)
libxml2: 2.9.1 (compiled with 2.9.1)
Compiled by: akravchenko@cent7-1 (x86_64-pc-linux-gnu)
Compilation: gcc -O3 -fomit-frame-pointer -pipe -D_FORTIFY_SOURCE=2
Linking : /usr/bin/ld -m elf_x86_64
Comment 25872
Date: 2017-11-12 16:25:57 +0100
From: @njnes
The spike is caused by concurrent worker threads. Each thread makes a slice of the inputs; combined with aggressive heap copying, this results in gdk_nr_threads*25GB of virtual memory (200GB on my test system, with 8 threads). You have 32 threads, which leads to even more copies.
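The estimate above can be written out as a small calculation. This is only a sketch of the gdk_nr_threads*25GB rule of thumb from this comment, assuming one full heap copy per worker thread; the function name estimated_spike_gb is hypothetical.

```python
def estimated_spike_gb(nr_threads, column_gb=25):
    """Rule-of-thumb estimate: one copy of the column heap per worker thread."""
    return nr_threads * column_gb


print(estimated_spike_gb(8))    # test system with 8 threads
print(estimated_spike_gb(32))   # reporter's system with 32 cores
```

With 8 threads this gives 200GB, and with 32 threads it gives 800GB, which is consistent with the spike in the original report.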
Comment 25886
Date: 2017-11-16 10:11:32 +0100
From: MonetDB Mercurial Repository <>
Changeset 230412789aec made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.
For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=230412789aec
Changeset description:
Comment 25888
Date: 2017-11-16 10:35:39 +0100
From: @sjoerdmullender
The query now runs very fast and does not produce a spike in VM use.
For the slicing, which also shouldn't happen, see bug #6470.