Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERL-421: ODBC truncates data for LONGTEXT and BLOB #3321

Open
OTP-Maintainer opened this issue May 10, 2017 · 8 comments · May be fixed by #4822
Open

ERL-421: ODBC truncates data for LONGTEXT and BLOB #3321

OTP-Maintainer opened this issue May 10, 2017 · 8 comments · May be fixed by #4822
Labels
bug Issue is reported as a bug help wanted Issue not worked on by OTP; help wanted from the community priority:low team:PS Assigned to OTP team PS

Comments

@OTP-Maintainer
Copy link

Original reporter: lea
Affected version: OTP-19.1.1
Component: odbc
Migrated from: https://bugs.erlang.org/browse/ERL-421


Tested with MySQL 5.6.36 on CentOS 6.9 64-bit and rpm esl-erlang-19.3-1.x86_64.

If you have either a LONGTEXT or BLOB column, content exceeding 8001 bytes will be truncated when reading from the database.

+To reproduce:+
Create a table customerField in a database named tull and write content of e.g. 10000 characters in a LONGTEXT column. Then, from the erlang shell:

odbc:start().                                                                                           
Connect="DSN=sqlconnection;UID=root;PWD=;MULTI_STATEMENTS=1;BIG_PACKETS=1;charset=utf8mb4;database=tull".
{ok,Ref} = odbc:connect(Connect,[]).                                                                    
odbc:sql_query(Ref,"select value from sessionField ").                                                  

+Reason for problem:+
Looking at ../lib/odbc/c_src/odbcserver.c and odbcserver.h:

In function encode_column_name_list() there is an if clause:
	if(sql_type == SQL_LONGVARCHAR || sql_type == SQL_LONGVARBINARY || sql_type == SQL_WLONGVARCHAR)
	    size = MAXCOLSIZE;

where MAXCLOSIZE is defined to 8001.

+Suggested solution:+
Some kind of dynamic reading using SQLGetData.
@OTP-Maintainer
Copy link
Author

ingela said:

Probably a good idea, but not highly priortized

@OTP-Maintainer
Copy link
Author

arcusfelis said:

Long strings/binaries work with mssql, but not with pgsql.
It would be good to know why do we need MAXCLOSIZE in the first place.

@OTP-Maintainer
Copy link
Author

ingela said:

This code is so old I can not remember, maybe it was a limit that existed way back that is no longer relevant.  Maybe the ODBC
API functions is now smarter? Anyone that knows are welcome to contribute. 


@OTP-Maintainer
Copy link
Author

arcusfelis said:

I've worked the library.
https://github.com/arcusfelis/eodbc

MAXCOLSIZE is 8000 is probably because it's max length of old type VARBINARY(8000), so the result would "fit" (not really, because it's encoded using HEX, so it should be 16000).

I've implemented reading data using chunking API. Which allows to work with long data types.
It's not most efficient code in the world, and there would be performance problems with really long data types.

It would be good to have tests for odbc with different databases and data types, something like 

https://github.com/esl/MongooseIM/blob/master/big_tests/tests/rdbms_SUITE.erl

but less MongooseIM specific :)

@OTP-Maintainer
Copy link
Author

jan.chochol said:

For this problem, we use following patch:
{code:c}
diff --git a/lib/odbc/c_src/odbcserver.c b/lib/odbc/c_src/odbcserver.c
index 8c799f6ff1..ee9c7db5e4 100644
--- a/lib/odbc/c_src/odbcserver.c
+++ b/lib/odbc/c_src/odbcserver.c
@@ -1305,7 +1305,7 @@ static db_result_msg encode_column_name_list(SQLSMALLINT num_of_columns,
 				       &nullable)))
 	    DO_EXIT(EXIT_DESC);
 
-	if(sql_type == SQL_LONGVARCHAR || sql_type == SQL_LONGVARBINARY || sql_type == SQL_WLONGVARCHAR)
+	if(size == 0 && (sql_type == SQL_LONGVARCHAR || sql_type == SQL_LONGVARBINARY || sql_type == SQL_WLONGVARCHAR))
 	    size = MAXCOLSIZE;
     
 	(columns(state)[i]).type.decimal_digits = dec_digits;
{code}

At least for our used databases (Vertica and PostgreSQL)  {{size}} is correctly returned from {{SQLDescribeCol}}.

@OTP-Maintainer
Copy link
Author

ingela said:

If anyone likes this patch please make a PR

@OTP-Maintainer OTP-Maintainer added bug Issue is reported as a bug help wanted Issue not worked on by OTP; help wanted from the community team:PS Assigned to OTP team PS priority:medium labels Feb 10, 2021
@winks
Copy link

winks commented Jan 28, 2022

I spent some time debugging this today (on today's master), and here are some findings:

When connecting to Postgres 14.1, and inspecting some values in the aforementioned file lib/odbc/c_src/odbcserver.c:

# create table barb (bfield bytea);
# select bfield from barb;
size 0, sql_type -4 (SQL_LONGVARBINARY)

# create table bart (tfield text);
# select tfield from bart;
size 8190, sql_type -1 (SQL_LONGVARCHAR)

So the above mentioned patch by jan.chochol seems to be fine for Postgres.

When connecting to Oracle, unfortunately it looks quite different:

# create table barb (bfield blob); insert ... (a BLOB of length 8765)
# select bfield from barb;
size 2147483647, sql_type -4 (SQL_LONGVARBINARY)

# create table bart (bfield clob); insert ... (a CLOB of length 16005)
# select tfield from bart;
size 2147483647, sql_type -1 (SQL_LONGVARCHAR)

Playing around and setting size = MAXCOLSIZE*3 seems to give me the data but that's just another arbitrary limit. (But the length() of the returned data is correct, as long as it's below the MAXCOLSIZE*3)

Maybe I can also try mysql, but I haven't done it so far.

I also read a few other codebases, so for reference here are calls to SQLDescribeCol in

In PyODBC it just seems to work, but I couldn't yet figure out what's different.

In PHP based on the type of the field it's set to retrieve data later, not unlike retrive_binary_data (sic!) here, but to me it seems as if (columns(state)[i]).type.c is never set to SQL_BINARY and thus retrive_binary_data is not invoked.

TLDR: I think the proposed patch is not a full fix, at least not when Oracle is involved.

@garazdawi
Copy link
Contributor

Have you also checked with #4822? It may fix some more things, but I don't think it fixes all things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue is reported as a bug help wanted Issue not worked on by OTP; help wanted from the community priority:low team:PS Assigned to OTP team PS
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants