Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault with query null value #1991

Closed
ikingnet opened this issue Nov 30, 2018 · 12 comments
Closed

Segmentation fault with query null value #1991

ikingnet opened this issue Nov 30, 2018 · 12 comments
Milestone

Comments

@ikingnet
Copy link

ikingnet commented Nov 30, 2018

QUERY SQL: SELECT * FROM "rt_view_request_domain_stat" where stat_time is null limit 10

Segmentation fault (PID 107021)
PostgreSQL version: 11.0
PipelineDB version: unknown at revision unknown
query: SELECT * FROM "rt_view_request_domain_stat"  where stat_time is null limit 10
backtrace:
/opt/pg/pgsql11.0/lib/pipelinedb.so(debug_segfault+0x1c)[0x7fd20a18160c]
/lib64/libpthread.so.0(+0xf100)[0x7fd210fb3100]
postgres: hadoop postgres 10.20.64.44(25266) SELECT(MakeExpandedObjectReadOnlyInternal+0x0)[0x77a950]
postgres: hadoop postgres 10.20.64.44(25266) SELECT[0x5eb5e5]
postgres: hadoop postgres 10.20.64.44(25266) SELECT(ExecScan+0x179)[0x5f9a49]
postgres: hadoop postgres 10.20.64.44(25266) SELECT[0x60f936]
postgres: hadoop postgres 10.20.64.44(25266) SELECT(standard_ExecutorRun+0x13d)[0x5f241d]
/opt/pg/pgsql11.0/lib/pg_stat_statements.so(+0x2ee7)[0x7fd20a4b3ee7]
postgres: hadoop postgres 10.20.64.44(25266) SELECT[0x74094f]
postgres: hadoop postgres 10.20.64.44(25266) SELECT(PortalRun+0x18c)[0x7405fc]
postgres: hadoop postgres 10.20.64.44(25266) SELECT[0x73f6a5]
postgres: hadoop postgres 10.20.64.44(25266) SELECT(PostgresMain+0xb07)[0x73d3b7]
postgres: hadoop postgres 10.20.64.44(25266) SELECT(PostmasterMain+0x1b69)[0x6c1cf9]
postgres: hadoop postgres 10.20.64.44(25266) SELECT[0x6385f1]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7fd210c03b15]
postgres: hadoop postgres 10.20.64.44(25266) SELECT[0x476dc9]
@derekjn
Copy link
Contributor

derekjn commented Nov 30, 2018

@ikingnet can you share the CV definition that this is being run against? We can't tell from the stack trace what the target column might be.

@ikingnet
Copy link
Author

ikingnet commented Dec 4, 2018

CREATE SCHEMA bi_monitor;
CREATE FOREIGN TABLE bi_monitor.rt_stream_nginx_access (log jsonb) SERVER pipelinedb;

CREATE VIEW bi_monitor.rt_view_request_domain_stat WITH (action=materialize) AS
SELECT to_timestamp(log#>>'{timestamp}', 'YYYY-MM-DD HH24:MI:00') as stat_time,
log#>>'{c-host}' as request_domain,
COUNT(1) as pv,
SUM(cast(log#>>'{c-bytes}' as numeric)) as recive_bytes,
SUM(cast(log#>>'{s-bytes}' as numeric)) as send_bytes,
COUNT(DISTINCT (log#>>'{c-ip}',log#>>'{c-port}')) as client_conns,
round(avg(cast(log#>>'{c-cost-time}' as numeric)),4) as avg_time,
max(cast(log#>>'{c-cost-time}' as numeric)) as max_time,
percentile_cont(array[0.5, 0.95, 0.99]) WITHIN GROUP (ORDER BY cast(log#>>'{c-cost-time}' as numeric)) as per_99th
FROM bi_monitor.rt_stream_nginx_access
GROUP BY stat_time,request_domain;

CREATE INDEX ON bi_monitor.rt_view_request_domain_stat (stat_time);
CREATE INDEX ON bi_monitor.rt_view_request_domain_stat (request_domain,stat_time);

@derekjn
Copy link
Contributor

derekjn commented Dec 4, 2018

@ikingnet do you have any example INSERTs by chance? If not we'll just create some ourselves based on the columns you're selecting...

@derekjn
Copy link
Contributor

derekjn commented Dec 4, 2018

@ikingnet Currently attempting to repro with this:

INSERT INTO rt_stream_nginx_access (log) SELECT
('{
  "c-host": "c-host-' || (random() * 1000000)::int::text || '",
  "c-port": ' || (random() * 10000)::int::text || ',
  "c-bytes": ' || (random() * 100)::int::text || ',
  "s-bytes": ' || (random() * 100)::int::text || ',
  "c-ip": "127.0.0.1",
  "c-cost-time": ' || (random() * 10000)::int::text || '
 }')::jsonb
FROM generate_series(1, 1000000) x;

Nothing yet, but can you share the plan that was used when the crash happened? e.g.,

EXPLAIN SELECT * FROM "rt_view_request_domain_stat" where stat_time is null limit 10

@derekjn derekjn added this to the 1.1.0 milestone Dec 4, 2018
@derekjn
Copy link
Contributor

derekjn commented Dec 4, 2018

@ikingnet from the stack trace it looks like your installation was built from source--is that correct? If so, were there any modifications to the code (anything at all) and can you share the steps you used to build this binary?

@ikingnet
Copy link
Author

ikingnet commented Dec 4, 2018

@derekjn

  1. INSERT SQL: INSERT INTO bi_monitor.rt_stream_nginx_access_test VALUES(?)
    json data like this {"@version":"1","@timestamp":"2018-11-27T08:12:05.708Z","path":"/home/soft/openresty/nginx/logs/test_demo_com_apigateway.log","host":"10.20.120.76","type":"apigw","timestamp":"2018-11-27T16:12:05+08:00","version":1,"c-httpprotocol":"HTTP/1.1","s-ip":"10.20.69.104:38080","c-ip":"222.216.179.179","s-upstrem":"test_demo_com_log","c-bytes":3109,"s-bytes":212,"s-response_time":0.001,"c-cost-time":0.001,"c-host":"test.demo.com","c-uri-stem":"/v2/log","c-request-args":"b0=4f","c-useragent":"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36 115Browser/9.0.0","c-method":"GET","c-referer":"","c-requestid":"-","s-status":"500"}
  2. EXPLAIN SELECT * FROM "rt_view_request_domain_stat" where stat_time is null limit 10
    execute result:
Limit  (cost=0.42..13.70 rows=10 width=216)
  ->  Index Scan using rt_view_request_domain_stat_mrel_stat_time_idx on rt_view_request_domain_stat_mrel  (cost=0.42..1442.32 rows=1086 width=216)
        Index Cond: (stat_time IS NULL)
  1. build pipelinedb:
git clone https://github.com/pipelinedb/pipelinedb.git
cd pipelinedb
pip install -r src/test/py/requirements.txt
make USE_PGXS=1
make install

@ikingnet
Copy link
Author

ikingnet commented Dec 4, 2018

SET pipelinedb.matrels_writable TO ON;
delete from bi_monitor.rt_view_request_domain_stat_mrel where stat_time is null;

after execute this sql,the query SELECT * FROM "rt_view_request_domain_stat" where stat_time is null limit 10 will be ok.

@derekjn
Copy link
Contributor

derekjn commented Dec 6, 2018

@ikingnet do you have an INSERT statements that can reproduce this issue from an empty CV? We haven't been able to reproduce it yet based on the example data here. It's possible that your CV had some data corruption, which can happen from things like forced system shutdowns (e.g. kill -9). Is that possible?

@derekjn
Copy link
Contributor

derekjn commented Dec 21, 2018

@ikingnet I just wanted to follow up here to see if you're able to provide the INSERT statements in the above comment.

@ikingnet
Copy link
Author

@derekjn

CREATE VIEW bi_monitor.rt_view_request_url_stat_test WITH (action=materialize) AS
SELECT to_timestamp(log#>>'{timestamp}', 'YYYY-MM-DD HH24:MI:00') as stat_time,
log#>>'{c-host}' as request_domain,
log#>>'{c-uri-stem}' as request_uri,
COUNT(*) as pv,
SUM(cast(log#>>'{c-bytes}' as numeric)) as recive_bytes,
SUM(cast(log#>>'{s-bytes}' as numeric)) as send_bytes,
COUNT(DISTINCT (log#>>'{c-ip}',log#>>'{c-port}')) as client_conns,
round(avg(cast(log#>>'{c-cost-time}' as numeric)),4) as avg_time,
max(cast(log#>>'{c-cost-time}' as numeric)) as max_time,
percentile_cont(array[0.5, 0.95, 0.99]) WITHIN GROUP (ORDER BY cast(log#>>'{c-cost-time}' as numeric)) as per_99th
FROM bi_monitor.rt_stream_nginx_access_test
WHERE log#>>'{c-method}' != 'OPTIONS'
GROUP BY stat_time,request_domain,request_uri;

CREATE INDEX ON bi_monitor.rt_view_request_url_stat_test (stat_time);
CREATE INDEX ON bi_monitor.rt_view_request_url_stat_test (request_domain,request_uri,stat_time);

--the field 's-bytes' is missing 
INSERT INTO bi_monitor.rt_stream_nginx_access_test values('{"@timestamp":"2018-12-26T02:13:46.435Z","beat":{"hostname":"BX-P120-129","name":"BX-P120-129","version":"5.5.0"},"c-bytes":174,"c-cost-time":0.006000,"c-host":"test.com","c-httpprotocol":"HTTP/1.1","c-ip":"-","c-method":"POST","c-request-args":"-","c-uri-stem":"/v1/work/getWorkList","c-useragent":"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36","input_type":"log","offset":2228875739,"s-ip":"10.20.120.69:80","s-response_time":0.006000,"s-status":"302","source":"/home/logs/test.com.log","timestamp":"2018-12-26T10:13:45+08:00","true":"","type":"apigw","version":1}')

-- this query will be get segmentation fault
select * from bi_monitor.rt_view_request_url_stat_test limit 10

@ikingnet
Copy link
Author

if I modify this SUM(cast(log#>>'{s-bytes}' as numeric)) as send_bytes to SUM(COALESCE(cast(log#>>'{s-bytes}' as numeric),0)) as send_bytes
this query will be ok
select * from bi_monitor.rt_view_request_url_stat_test limit 10

@derekjn
Copy link
Contributor

derekjn commented Feb 4, 2019

@ikingnet this has been fixed by ead5f49.

@derekjn derekjn closed this as completed Feb 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants