-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Problem with log2m >= 18 #5
Comments
Hi Metin, This is a bug on our end, though it's one we'd need to seriously consider "fixing". Currently, we allocate 128KB as an internal buffer for each I imagine the only real solution to this problem would be to allow for a compile-time flag indicating the maximum log2m to be supported (since I believe PG only allows a fixed size for the backing memory buffer.) Is there a pressing business need for this for you or can we just change the docs to say that the maximum log2m is 17? |
@metdos Hi, haven't heard back from you. Is this a pressing issue for you (that is, do you require log2m > 17?) or is this just experimentation? If it's simply experimentation, I'm inclined to just change the documentation to say the max is 17 and be done with it. We haven't found anyone else who uses a log2m > 16, so I'd love to be able to defer addressing this issue. I should also correct what I said before:
This is not correct. A pointer is possible, but we suspect that it would cause a significant performance degradation compared to the union method currently implemented. |
Hey Timon, my apologies for the late reply. @TimonK We were considering possible use cases and currently there is not any urgent business need. You can change the documentation. If a need for it appears, I will ping you or reopen issue. Thanks a lot. |
Great extension, thanks for making it open-source.
I came across with this issue while playing around with it:
When I use bigger values than 18 for log2m parameter in hll_add_agg function it crashes. Here is how to replicate problem:
I tried this on PostgreSQL 9.2.4
-- First create dataset:
wget http://examples.citusdata.com/tpch_2_13_0.tar.gz
tar xvfz tpch_2_13_0.tar.gz
cd tpch_2_13_0
gmake
./dbgen -f -s 1 -T L
-- Create table and load data
CREATE TABLE LINEITEM
(
l_orderkey BIGINT not null,
l_partkey INTEGER not null,
l_suppkey INTEGER not null,
l_linenumber INTEGER not null,
l_quantity DECIMAL(15,2) not null,
l_extendedprice DECIMAL(15,2) not null,
l_discount DECIMAL(15,2) not null,
l_tax DECIMAL(15,2) not null,
l_returnflag CHAR(1) not null,
l_linestatus CHAR(1) not null,
l_shipdate DATE not null,
l_commitdate DATE not null,
l_receiptdate DATE not null,
l_shipinstruct CHAR(25) not null,
l_shipmode CHAR(10) not null,
l_comment VARCHAR(44) not null
);
COPY lineitem FROM 'YOUR_PATH/tpch_2_13_0/lineitem.tbl' WITH DELIMITER '|';
-- This query works without a problem:
SELECT
hll_cardinality(hll_add_agg(hll_hash_bigint(l_orderkey, 0), 17))
FROM
lineitem;
-- This query crashes:
SELECT
hll_cardinality(hll_add_agg(hll_hash_bigint(l_orderkey, 0), 18))
FROM
lineitem;
The text was updated successfully, but these errors were encountered: