Loading some char(64) data causes a backend crash #10

Closed
amutu opened this issue Feb 28, 2014 · 3 comments

Comments

@amutu
amutu commented Feb 28, 2014

CREATE TABLE error4 (
ts timestamp without time zone,
data character(64)
);

postgres=# select data::bytea from test.error4;
 data
------
 \x0b2a2a2a212a2a2ac2a12a2a2ae480a12a2a2ae480a12a2a2ad0a12a2a2a202a2a2ac2a12a2a2a212a2a2ad0a12a2a2ae492a02a2a2ad0a12a2a2ae480a02a2a2ae482a02a2a2ae482a12a2a2ac2a02a2a2a
(1 row)

postgres=# select length(data::bytea) from test.error4;
 length
--------
     82
(1 row)

postgres=# select length(data) from test.error4;
 length
--------
     64
(1 row)

postgres=# select data from test.error4;

data

\x0B**!**¡_䀡䀡**С** **¡_!_С**_䒠**_С**_䀠**䂠**䂡** **
(1 row)

I found that in columnar_store_load() the value of len is 82 instead of 64, which causes imcs_append_char() to segfault:
str = (char*)VARDATA(t);
len = VARSIZE(t) - VARHDRSZ;
if (attr_type_oid[i] == BPCHAROID) {
    /* strip the trailing blanks that pad a char(n) value */
    while (len != 0 && str[len-1] == ' ') {
        len -= 1;
    }
}
imcs_append_char(ts, str, len); /* !!! len is 82 here, which causes memset to segfault */
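
The failure mode in the snippet above can be sketched in isolation. The following is a minimal, self-contained illustration and not the IMCS code: the `fixed_char_column` type and the `fixed_char_append` function are hypothetical names invented here. It assumes the crash comes from padding a fixed-width slot with `memset` when the incoming byte length exceeds the slot size, so the pad size wraps around; the sketch rejects such values before any copy happens.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fixed-width character column (not the IMCS type). */
typedef struct {
    char  *data;       /* storage of elem_size * capacity bytes */
    size_t elem_size;  /* bytes reserved per element, e.g. 64 for char(64) */
    size_t count;      /* elements stored so far */
    size_t capacity;   /* maximum number of elements */
} fixed_char_column;

/*
 * Append len bytes of str as one element.
 * Returns 0 on success, -1 if the value is longer than elem_size bytes
 * or the column is full. Nothing is written on failure.
 */
int fixed_char_append(fixed_char_column *col, const char *str, size_t len)
{
    assert(col != NULL && str != NULL);

    /* Without this check, elem_size - len below would wrap around for
     * len > elem_size and memset would write far past the buffer. */
    if (len > col->elem_size || col->count >= col->capacity)
        return -1;

    char *dst = col->data + col->count * col->elem_size;
    memcpy(dst, str, len);
    memset(dst + len, 0, col->elem_size - len);  /* zero-pad the rest of the slot */
    col->count++;
    return 0;
}
```

With a 64-byte element size, an 82-byte value like the one in this report is refused with -1 instead of corrupting memory; the caller can then raise an error.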

@amutu
Author

amutu commented Feb 28, 2014

I think it is caused by wide (multibyte) characters, because char_length(data) returns 64 but octet_length(data) returns 82.
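
The gap between the two lengths can be reproduced outside the database. The sketch below assumes the data is UTF-8 encoded (the byte patterns in the dump look like UTF-8, but the database encoding is not stated in this thread). `utf8_char_count` is a hypothetical helper written for this illustration; it counts every byte that is not a continuation byte (`10xxxxxx`) as the start of a character.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count UTF-8 code points in the first n bytes of s.
 * Assumes s holds valid UTF-8; continuation bytes (10xxxxxx) are skipped,
 * every other byte starts a new character.
 */
size_t utf8_char_count(const char *s, size_t n)
{
    assert(s != NULL || n == 0);

    size_t chars = 0;
    for (size_t i = 0; i < n; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For the first 19 bytes of the row shown above (`0b 2a2a2a 21 2a2a2a c2a1 2a2a2a e480a1 2a2a2a`), this counts 16 characters: the two-byte and three-byte sequences each count once. The same effect over the whole value gives 64 characters in 82 bytes.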

@knizhnik
Owner

I have added a check for overly long strings to avoid a server crash in such cases.
But you are right: the source of the problem is that the CHARACTER type in PostgreSQL by default counts Unicode characters, while IMCS stores bytes.
One possible workaround is to increase the size of the field:

CREATE TABLE error4 (
ts timestamp without time zone,
data character(100)
);

This has no effect on how the data is stored in PostgreSQL (all strings are stored as varying-length data in any case), but IMCS will use a larger element size, so your string will fit in it.

Another solution would be to automatically multiply the size of the type by the maximum number of bytes needed to represent a wide character. But this multiplier can be quite large: some exotic Unicode characters require more than 4 bytes. So I do not like this idea.
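
The trade-off between the two options can be put in numbers. This is a rough sketch with hypothetical helper names (`bytes_for_declared_chars`, `value_fits`); the bytes-per-character multiplier is passed in as a parameter because its right value depends on the encoding, and the value 4 used in the note below is only an illustrative assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Element size in bytes if every declared character may need
 * max_bytes_per_char bytes (the automatic-multiplier option). */
size_t bytes_for_declared_chars(size_t declared_chars, size_t max_bytes_per_char)
{
    assert(max_bytes_per_char > 0);
    return declared_chars * max_bytes_per_char;
}

/* Nonzero if a value of octet_len bytes fits into an element of elem_size bytes. */
int value_fits(size_t elem_size, size_t octet_len)
{
    return octet_len <= elem_size;
}
```

For the row in this report, `value_fits(64, 82)` is false and `value_fits(100, 82)` is true, which is why declaring the column as character(100) works. With a multiplier of 4, `bytes_for_declared_chars(64, 4)` reserves 256 bytes per element, about three times what this value needs.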

@amutu
Author

amutu commented Feb 28, 2014

Thanks for your explanation. I will increase the size of the char type. I think this ticket can be closed.
