lost data when load data #35

Open
amutu opened this issue Apr 16, 2014 · 2 comments


amutu commented Apr 16, 2014

This may be the cause of #34.
When data is loaded with table_load(), some columns in the columnar store contain fewer values than the original table.
Steps to reproduce:

crash=> select crashlog_truncate();
 crashlog_truncate
-------------------

(1 row)

crash=> select crashlog_load(filter := 'logtime >= $start$2014-04-14$start$ and logtime < $end$2014-04-14 01:00:00$end$');
 crashlog_load
---------------
        902405
(1 row)

-- Here class1 and class2 have different counts from the other columns:
crash=> select * from (select x.clientversion a ,cs_count(x.uin) b,cs_count(x.uin) c,cs_count(x.logtime) d, cs_count(x.class1) e, cs_count(x.class2) f, cs_count(x.class3) g ,cs_count(x.revision) h,cs_count(x.phoneid) i from wx_version,crashlog_get(clientversion) as x where x is not null) t where a = 604176600 and (b <> c or b<> d or b<> e or b<> f or b <> g or b<> h or b<> i);
a | b | c | d | e | f | g | h | i
-----------+-----+-----+-----+---+----+-----+-----+-----
604176600 | 171 | 171 | 171 | 5 | 29 | 171 | 171 | 171
(1 row)

-- 102 ids have this problem:
crash=> select count(1) from (select x.clientversion a ,cs_count(x.uin) b,cs_count(x.uin) c,cs_count(x.logtime) d, cs_count(x.class1) e, cs_count(x.class2) f, cs_count(x.class3) g ,cs_count(x.revision) h,cs_count(x.phoneid) i from wx_version,crashlog_get(clientversion) as x where x is not null) t where b <> c or b<> d or b<> e or b<> f or b <> g or b<> h or b<> i;

 count
-------
   102
(1 row)

-- Counts in the original table:
crash=> select x.clientversion,count(x.uin) b,count(x.uin) c,count(x.logtime) d, count(x.class1) e, count(x.class2) f, count(x.class3) g ,count(x.revision) h,count(x.phoneid) i from crashlog_p_20140414 x where logtime <= '2014-04-14 01:00:00' and clientversion = 604176600 group by 1;
 clientversion | b | c | d | e | f | g | h | i
---------------+-----+-----+-----+-----+-----+-----+-----+-----
604176600 | 171 | 171 | 171 | 171 | 171 | 171 | 171 | 171
(1 row)


amutu commented Apr 16, 2014

I can't reproduce this bug on the 64fb76f tree, so I think the "Support load of unlimited varying size strings to columnar store using dictionary" commit introduced the bug.

In the branch with the bug, imcs.dictionary_size is set to 0.

@knizhnik
Owner

I have no idea how the IMCS dictionary could influence this problem. It is used only when you import attributes of unlimited-length type (VARCHAR) into the columnar store. But you are not using such attributes, are you?
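
One way to answer that question is to look at the declared types of the affected columns. The query below is only a sketch: it uses the standard PostgreSQL information_schema.columns view and assumes that crashlog_p_20140414 (the table name from the report above) is the table whose columns are imported; substitute the actual source table if it differs. A NULL character_maximum_length on a character column means no length limit was declared.

```sql
-- Assumption: crashlog_p_20140414 is the imported source table;
-- adjust table_name if the load reads from a different (e.g. parent) table.
SELECT column_name,
       data_type,
       character_maximum_length  -- NULL for text or varchar without a limit
FROM information_schema.columns
WHERE table_name = 'crashlog_p_20140414'
  AND column_name IN ('class1', 'class2', 'class3')
ORDER BY column_name;
```

If class1 and class2 show up as unlimited character types while class3 does not, that would be consistent with only those two columns going through the dictionary path.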

I tried to reproduce the problem but didn't succeed. Please note that the snapshot of your data I am using differs from what you have now, so I had to adjust your queries:

postgres=# select crash_log_load(filter := 'logtime >= $start$2014-04-09$start$ and logtime < $end$2014-04-09 01:00:00$end$');

 crash_log_load
----------------
         889937
(1 row)

postgres=# select * from (select x.clientversion a ,cs_count(x.uin) b,cs_count(x.uin) c,cs_count(x.logtime) d, cs_count(x.class1) e, cs_count(x.class2) f, cs_count(x.class3) g ,cs_count(x.revision) h,cs_count(x.phoneid) i from wx_version,crash_log_get(clientversion) as x where x is not null) t where a = 604176600 and (b <> c or b<> d or b<> e or b<> f or b <> g or b<> h or b<> i);
a | b | c | d | e | f | g | h | i
---+---+---+---+---+---+---+---+---
(0 rows)

select count(1) from (select x.clientversion a ,cs_count(x.uin) b,cs_count(x.uin) c,cs_count(x.logtime) d, cs_count(x.class1) e, cs_count(x.class2) f, cs_count(x.class3) g ,cs_count(x.revision) h,cs_count(x.phoneid) i from wx_version,crash_log_get(clientversion) as x where x is not null) t where b <> c or b<> d or b<> e or b<> f or b <> g or b<> h or b<> i;

 count
-------
     0
(1 row)

Maybe the problem is caused by "pinning" of an old version of the IMCS data in shared memory? Did you try restarting Postgres and clearing all shared memory segments?
