SQLGetData with SQL_C_WCHAR string truncation and invalid StrLen_or_Ind value #6672

Closed
monetdb-team opened this issue Nov 30, 2020 · 0 comments

@monetdb-team monetdb-team commented Nov 30, 2020

Date: 2018-12-20 14:59:03 +0100
From: jpastuszek
To: clients devs <>
Version: 11.31.11 (Aug2018-SP1)

Last updated: 2019-01-14 17:29:07 +0100

Comment 26740

Date: 2018-12-20 14:59:03 +0100
From: jpastuszek

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:65.0) Gecko/20100101 Firefox/65.0
Build Identifier:

I have a problem with the ODBC driver when querying long STRING values (WCHAR and similar).
I am using an ODBC wrapper that calls SQLGetData with a 512-byte buffer and expects SQL_SUCCESS_WITH_INFO if the data was truncated to fit in that buffer.
Unfortunately the call returns SQL_SUCCESS even when the queried string is longer than the buffer provided, and the StrLen_or_Ind value points outside that buffer (it equals the actual length of the string multiplied by 2, for UTF-16).
This makes it impossible to retrieve the full string, as additional calls to SQLGetData for the same column return SQL_NO_DATA.

The query I use for testing is:
SELECT 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec magna ligula, sollicitudin nec ultricies sit amet, luctus id est. Nunc cursus pellentesque blandit. Nullam dolor nulla, cursus cursus mattis id, scelerisque eu nisl. Vivamus ornare metus nullam.';

The string is 257 characters long. This is the debug output I added around the call:

SQLGetData: col: 1 type: SQL_C_WCHAR start_pos: 0 buf_len: 512
result: SQL_SUCCESS indicator: 514
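
The numbers in this trace can be checked with a little arithmetic. The following is a minimal sketch, assuming 2-byte UTF-16 code units and a string with no characters outside the Basic Multilingual Plane (true for the test string). Both helper names are hypothetical and are not part of ODBC or the MonetDB driver. Note that, as far as the ODBC documentation for SQLGetData reads, the indicator on a truncated call reports the length before truncation, so 514 is a plausible indicator value by itself; the sketch only covers how much data fits in the buffer.

```c
#include <stddef.h>

/* Assumption: one UTF-16 code unit (SQLWCHAR) is 2 bytes wide. */
enum { WCHAR_BYTES = 2 };

/* Hypothetical helper: bytes needed for `nchars` BMP characters in UTF-16,
 * not counting the terminator. */
size_t utf16_total_bytes(size_t nchars)
{
    return nchars * WCHAR_BYTES;
}

/* Hypothetical helper: bytes of string data that fit in a buffer of
 * `buf_len` bytes once room for a terminating NUL code unit is reserved. */
size_t payload_bytes_in_buffer(size_t total_bytes, size_t buf_len)
{
    if (buf_len < WCHAR_BYTES)
        return 0;
    size_t room = buf_len - WCHAR_BYTES;
    room -= room % WCHAR_BYTES;          /* whole code units only */
    return total_bytes < room ? total_bytes : room;
}
```

For the test string, utf16_total_bytes(257) gives 514, and payload_bytes_in_buffer(514, 512) gives 510, which matches the "510 + NULL" observed in the buffer.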

From ODBCDEBUG=odbc.log

ODBCInitResult: querytype Q_TABLE, rowcount 1
SQLNumResultCols 0x7fbd24402b20
SQLDescribeCol 0x7fbd24402b20 1 0x7ffee227f1cc 512 0x7ffee227f3cc 0x7ffee227f3ce 0x7ffee227f3d0 0x7ffee227f3d8 0x7ffee227f3da
SQLFreeStmt 0x7fbd24402b20 SQL_RESET_PARAMS
SQLFetch 0x7fbd24402b20
SQLGetData 0x7fbd24402b20 1 SQL_C_WCHAR 0x10e423000 512 0x7ffee2283408
Writing 514 bytes to 0x10e423000
SQLFreeHandle Stmt 0x7fbd24402b20
SQLDisconnect 0x7fbd24600c30

Now the buffer contains some 510 (?) bytes of the string (plus the NULL terminator), and the wrapper crashes trying to access byte 514 of the buffer (the StrLen_or_Ind value).

I have investigated the MonetDB ODBC driver further and found what the issue could be.

The driver does the following in ODBCFetch (called from SQLGetData):

  1. When both the actual and the requested column type are SQL_WCHAR, it allocates a temporary buffer big enough to hold the whole string. [https://github.com/MonetDB/MonetDB/blob/c8b0e2cbfef5d360b17c48cf7214b939784232ab/clients/odbc/driver/ODBCConvert.c#L1248]
  2. Next it calls copyString to copy the data to the temporary buffer.
    Note that this never fails and never triggers the errfunc "String data, right-truncated" "01004" error case, as the buffer is always big enough. [https://github.com/MonetDB/MonetDB/blob/c8b0e2cbfef5d360b17c48cf7214b939784232ab/clients/odbc/driver/ODBCConvert.c#L1279]
  3. It copies the UTF-16 expanded bytes to the target buffer with the ODBCutf82wchar function.
    This function copies up to 510 (?) bytes of the output UTF-16 data + NULL to the buffer provided to SQLGetData and then continues counting the bytes that it would produce if the buffer were longer.
    The total number of bytes it WOULD produce is then written to StrLen_or_IndPtr, which explains why it is larger than the buffer length.
  4. The log message "Writing 514 bytes to 0x10e423000" is written and SQL_SUCCESS is returned.
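
The "write what fits, keep counting the rest" behaviour in step 3 can be sketched as below. This is only an illustration under stated assumptions: `utf8_to_utf16_counting` and `was_truncated` are hypothetical names, the input is assumed to be well-formed UTF-8, and the code is not the driver's ODBCutf82wchar. The point is that the returned count alone is enough for the caller to detect truncation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in, not the driver's ODBCutf82wchar. Converts
 * NUL-terminated, well-formed UTF-8 in `src` to UTF-16 in `dst`, whose
 * capacity is `dst_units` code units including the terminator. Returns the
 * number of code units the whole string needs (terminator excluded), even
 * when only a prefix was stored. */
size_t utf8_to_utf16_counting(const char *src, uint16_t *dst, size_t dst_units)
{
    const unsigned char *s = (const unsigned char *) src;
    size_t needed = 0;    /* code units the whole string requires */
    size_t written = 0;   /* code units actually stored in dst */
    int full = 0;         /* set once a character no longer fits */

    while (*s) {
        uint32_t cp;
        int extra;        /* number of continuation bytes to read */
        if (*s < 0x80)      { cp = *s;        extra = 0; }
        else if (*s < 0xE0) { cp = *s & 0x1F; extra = 1; }
        else if (*s < 0xF0) { cp = *s & 0x0F; extra = 2; }
        else                { cp = *s & 0x07; extra = 3; }
        s++;
        while (extra-- > 0 && *s)
            cp = (cp << 6) | (*s++ & 0x3F);

        size_t units = cp >= 0x10000 ? 2 : 1;
        /* store only while the character plus the terminator still fits */
        if (!full && written + units + 1 <= dst_units) {
            if (units == 2) {             /* encode as a surrogate pair */
                cp -= 0x10000;
                dst[written++] = (uint16_t) (0xD800 | (cp >> 10));
                dst[written++] = (uint16_t) (0xDC00 | (cp & 0x3FF));
            } else {
                dst[written++] = (uint16_t) cp;
            }
        } else {
            full = 1;
        }
        needed += units;                  /* keep counting past the end */
    }
    if (dst_units > 0)
        dst[written] = 0;
    return needed;
}

/* Truncation check a caller could apply to the returned count. */
int was_truncated(size_t needed, size_t dst_units)
{
    return needed + 1 > dst_units;
}
```

With the 257-character test string and a 256-unit (512-byte) buffer, the function would report 257 units needed, and `was_truncated` would be true, which is the condition under which SQL_SUCCESS_WITH_INFO is expected.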

Assuming my understanding is correct, it is not possible to retrieve more than one buffer length of data for UTF-16 encoded columns. The code responsible for returning SQL_SUCCESS_WITH_INFO on truncation never triggers, because it always has a big enough buffer to store the original UTF-8 encoded string; it does not work with the UTF-16 encoded data at all at this point.
Also note that the debug message "Writing 514 bytes to 0x10e423000" is misleading in this case: it reports the byte count of the would-be output of the ODBCutf82wchar function, not the bytes actually written to the provided output buffer (so no out-of-bounds write actually occurs).

Is there a workaround for this issue?
My client (Rust) uses UTF-8 encoded strings, so converting UTF-8 -> UTF-16 in the driver only to convert UTF-16 -> UTF-8 in my client code is a waste. The returned column description indicates the SQL_EXT_WCHAR type for this column, which is why I request WCHAR. Is there a way to get a UTF-8 string instead, avoiding all conversion and the issue described here?

Reproducible: Always

Steps to Reproduce:

  1. Run "SELECT 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec magna ligula, sollicitudin nec ultricies sit amet, luctus id est. Nunc cursus pellentesque blandit. Nullam dolor nulla, cursus cursus mattis id, scelerisque eu nisl. Vivamus ornare metus nullam.';"
  2. Provide 512 bytes long buffer to SQLGetData for column 1

Actual Results:

StrLen_or_IndPtr value set to 514 and SQL_SUCCESS_WITH_INFO returned allowing further call to obtain remaining of the string.

Expected Results:

StrLen_or_IndPtr value set to 511 (?) and SQL_SUCCESS returned.

Decode the string to UTF-16 and use copyString on that decoded data to copy it to output buffer. Preserve the buffer between calls to allow fetching it in parts.
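
The suggestion above (decode once, keep state between calls, hand the data out in parts) could look roughly like the following. This is a sketch only: `struct wchar_cursor`, `fetch_part` and the `FETCH_*` codes are hypothetical stand-ins for driver state and the ODBC return codes, and the sketch does not guard against splitting a surrogate pair across two parts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the ODBC return codes; the values are illustrative only. */
enum fetch_rc { FETCH_SUCCESS, FETCH_SUCCESS_WITH_INFO, FETCH_NO_DATA };

/* Hypothetical per-column state preserved between calls. */
struct wchar_cursor {
    const uint16_t *data;  /* fully decoded UTF-16 value */
    size_t len;            /* length in code units, terminator excluded */
    size_t pos;            /* code units already handed out */
    int done;              /* set once the final part has been returned */
};

/* Copy the next part into `buf` (`buf_len` bytes). `*ind` receives the bytes
 * still outstanding before this call, terminator excluded. */
enum fetch_rc fetch_part(struct wchar_cursor *c, uint16_t *buf,
                         size_t buf_len, size_t *ind)
{
    size_t cap = buf_len / sizeof(uint16_t);   /* capacity in code units */
    size_t remaining = c->len - c->pos;

    if (c->done)
        return FETCH_NO_DATA;
    *ind = remaining * sizeof(uint16_t);
    if (cap == 0)
        return FETCH_SUCCESS_WITH_INFO;  /* not even a terminator fits */

    size_t n = remaining < cap - 1 ? remaining : cap - 1;
    memcpy(buf, c->data + c->pos, n * sizeof(uint16_t));
    buf[n] = 0;                          /* every part is terminated */
    c->pos += n;
    if (n < remaining)
        return FETCH_SUCCESS_WITH_INFO;  /* truncated: more parts follow */
    c->done = 1;
    return FETCH_SUCCESS;
}
```

A caller keeps calling `fetch_part` while it returns FETCH_SUCCESS_WITH_INFO; the call after the final part returns FETCH_NO_DATA.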

Comment 26743

Date: 2018-12-21 10:26:42 +0100
From: jpastuszek

Sorry I messed up the results section...

Actual Results:

StrLen_or_IndPtr value set to 514 and SQL_SUCCESS returned. Follow up call for same column returns SQL_NO_DATA.

Expected Results:

StrLen_or_IndPtr value set to 511 (?) and SQL_SUCCESS_WITH_INFO returned allowing further call to obtain remaining of the string.

Comment 26745

Date: 2018-12-21 14:54:21 +0100
From: jpastuszek

Looks like the best option is to request the SQL_C_CHAR type and let the MonetDB driver provide the data as is.

Unfortunately there is a bug with this as well.
If I read this correctly:
"On each call, SQLGetData returns the next part of the data. It is up to the application to reassemble the parts, taking care to remove the null-termination character from intermediate parts of character data." [https://docs.microsoft.com/en-us/sql/odbc/reference/syntax/sqlgetdata-function?view=sql-server-2017]
the driver should append a null after each part (the last byte of the buffer needs to be null for intermediate parts), and clients should skip that null byte when reassembling the full value. The MonetDB driver does not set the last byte of intermediate buffers to null, so clients that drop that byte lose the last character of each intermediate part. The copyString macro does not ensure this property; it only appends a null to the last part.
Luckily I could work around this by checking whether the last byte is actually null, but this should be fixed as well.
Please let me know if you need a separate bug report for this issue and I will create one.
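
The workaround described above can be sketched as follows. The helper names are hypothetical; `buf` and `buf_len` stand for the buffer passed to SQLGetData and `ind` for the value it stored in StrLen_or_Ind, assumed non-negative here (SQL_NULL_DATA and SQL_NO_TOTAL would need separate handling). The check is only valid for character data that cannot contain an embedded null at that position.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many bytes of `buf` are data to keep for one
 * part returned by a SQL_C_CHAR fetch. */
size_t part_data_bytes(const char *buf, size_t buf_len, size_t ind)
{
    if (buf_len == 0)
        return 0;
    if (ind < buf_len)
        return ind;                /* final part: data plus terminator fit */
    /* Intermediate part. A conforming driver leaves a NUL in the last byte,
     * which is dropped; the workaround keeps that byte when it is not NUL. */
    return buf[buf_len - 1] == '\0' ? buf_len - 1 : buf_len;
}

/* Append one part to `out` (capacity `out_cap` bytes, current length
 * `out_len`), clamping to the remaining space. Returns the new length. */
size_t append_part(char *out, size_t out_len, size_t out_cap,
                   const char *buf, size_t buf_len, size_t ind)
{
    size_t n = part_data_bytes(buf, buf_len, ind);
    if (n > out_cap - out_len)
        n = out_cap - out_len;
    memcpy(out + out_len, buf, n);
    return out_len + n;
}
```

The same reassembly loop then works against both a driver that terminates intermediate parts and one that fills the whole buffer with data.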

Comment 26746

Date: 2018-12-21 16:54:58 +0100
From: @sjoerdmullender

It is extremely unlikely that I will be able to look at this bug before the year is out. But I am interested in fixing it.

Comment 26771

Date: 2019-01-02 15:02:18 +0100
From: MonetDB Mercurial Repository <>

Changeset 0b23137c480b made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.

For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=0b23137c480b

Changeset description:

Fixes for retrieving data in chunks with SQLGetData.
Also, make sure returned strings are NULL-terminated, truncating them
when needed.
This (hopefully) fixes bug #6672.

Comment 26772

Date: 2019-01-03 12:44:10 +0100
From: MonetDB Mercurial Repository <>

Changeset 8b7be856bafe made by Sjoerd Mullender sjoerd@acm.org in the MonetDB repo, refers to this bug.

For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=8b7be856bafe

Changeset description:

Return SQL_SUCCESS_WITH_INFO and SQLSTATE 01004 when SQLGetData truncates.
More fixing for bug #6672.

Comment 26773

Date: 2019-01-03 17:03:07 +0100
From: @sjoerdmullender

A number of bugs in SQLGetData have been fixed, among them the two mentioned here. (One more bug that was fixed was handling of binary data.)
