
cursor.columns doesn't return column names #506

Closed
bforst opened this issue Jan 11, 2019 · 26 comments
@bforst

bforst commented Jan 11, 2019

Environment

  • Python: 2.7.15 (32 Bit)
  • pyodbc: 4.0.25 (verified that problem doesn't exist on 4.0.24 with exactly the same code)
  • OS: Windows 10
  • DB: SQL Server 2017
  • driver: {ODBC Driver 13 for SQL Server}, {SQL Server Native Client 11.0}, {SQL Server}

Issue

This issue came up with pyodbc 4.0.25 on 32-bit Python 2.7.15 on Windows 10 and was not present in 4.0.24. The cursor.columns() call intermittently fails to return the columns in a table. To reproduce the bug, I first create tables with names of the following lengths (other lengths sometimes trigger it too):

from __future__ import print_function, unicode_literals
import pyodbc
cnxn = pyodbc.connect(r"DRIVER={ODBC Driver 13 for SQL Server};SERVER=srv;DATABASE=test_db;Trusted_Connection=yes;")

sql = """CREATE TABLE A234567
(
  ID TINYINT,
  PRIMARY KEY (ID)
);
"""
with cnxn:
    cnxn.execute(sql)

sql = """CREATE TABLE A
(
  ID TINYINT,
  PRIMARY KEY (ID)
);
"""
with cnxn:
    cnxn.execute(sql)

Then I try to get the list of column names for those tables:

from __future__ import print_function, unicode_literals
import pyodbc
cnxn = pyodbc.connect(r"DRIVER={ODBC Driver 13 for SQL Server};SERVER=srv;DATABASE=test_db;Trusted_Connection=yes;")

curs = cnxn.cursor()
print([col.column_name for col in curs.columns('A234567')])
print([col.column_name for col in curs.columns('A234567')])
# Fine from here for A234567

curs = cnxn.cursor()
print([col.column_name for col in curs.columns('A')])
print([col.column_name for col in curs.columns('A')])
print([col.column_name for col in curs.columns('A')])
print([col.column_name for col in curs.columns('A')])
print([col.column_name for col in curs.columns('A')])
print([col.column_name for col in curs.columns('A')])
# Only succeeds 1/3 times for {ODBC Driver 13 for SQL Server}
# Only succeeds 1/3 times for {SQL Server Native Client 11.0}
# Only succeeds 1/2 times for {SQL Server}

Tables with seven-character names return an empty result set the first time they are queried, then return correctly afterward.

Tables with one-character names return empty result sets on the first two calls, then a correct result set, and then keep repeating that pattern (two failures followed by one success) as more calls are made.

Unlike issue #501, I cannot replicate this on Python 3.7.2.

@gordthompson
Collaborator

I was able to partially reproduce the issue under 64-bit Python 2.7.14 on Windows 7 using the "SQL Server" ODBC driver (SQLSRV32.DLL).

With pyodbc 4.0.25, print([col.column_name for col in curs.columns('A')]) consistently produced an empty result, while pyodbc 4.0.24 consistently returned the column name.

One difference I noticed is that the SQLColumnsW call produced by 4.0.24 was

main            1fe4-1db8   EXIT  SQLColumnsW  with return code 0 (SQL_SUCCESS)
        HSTMT               0x0000000000517FB0
        WCHAR *             0x0000000000000000 <null pointer>
        SWORD                       -3 
        WCHAR *             0x0000000000000000 <null pointer>
        SWORD                       -3 
        WCHAR *             0x000000000013DB90 [      -3] "A\ 0"
        SWORD                       -3 
        WCHAR *             0x0000000000000000 <null pointer>
        SWORD                       -3 

whereas with 4.0.25 the call included a ? along with the table name

main            1f28-c10    EXIT  SQLColumnsW  with return code 0 (SQL_SUCCESS)
        HSTMT               0x0000000000487FB0
        WCHAR *             0x0000000000000000 <null pointer>
        SWORD                       -3 
        WCHAR *             0x0000000000000000 <null pointer>
        SWORD                       -3 
        WCHAR *             0x00000000021F3A78 [      -3] "A?\ 0"
        SWORD                       -3 
        WCHAR *             0x0000000000000000 <null pointer>
        SWORD                       -3 

Nothing else in the ODBC logs looked suspicious to me, except of course that when 4.0.24 went to retrieve the actual column information it succeeded

main            1fe4-1db8   EXIT  SQLFetch  with return code 0 (SQL_SUCCESS)
        HSTMT               0x0000000000517FB0

while 4.0.25 was told that there was no data to retrieve

main            1f28-c10    EXIT  SQLFetch  with return code 100 (SQL_NO_DATA_FOUND)
        HSTMT               0x0000000000487FB0

4.0.24.log
4.0.25.log

@gjb1002

gjb1002 commented Jan 18, 2019

I have the same issue using Python 3.6.8 on 64-bit Windows with MS SQL Server 2012. The columns() call works on some tables but not others, and I cannot find any common factor between the ones that work and the ones that don't.

That makes 4.0.25 pretty much unusable for me. 4.0.24 works fine, though.

Possibly this comment belongs on #501 instead; those two bugs seem pretty similar.

@gordthompson
Collaborator

gordthompson commented Jan 19, 2019

Additional information re: my earlier test:

This code tests Cursor#columns for tables with names of various lengths, from 1 to 27 characters:

# cnxn is an open pyodbc connection (created as in the examples above)
for i in range(1, 28):
    table_name = u'ABCDEFGHIJKLMNOPQRSTUVWXYZ_'[:i]

    crsr = cnxn.cursor()
    crsr.execute("""\
    BEGIN TRY
        DROP TABLE {0};
    END TRY
    BEGIN CATCH
    END CATCH
    CREATE TABLE {0} (id INT PRIMARY KEY);
    """.format(table_name))

    col_list = [col.column_name for col in crsr.columns(table_name)]
    col_count = len(col_list)
    print('table [{}] ({} characters): {} columns{}'.format(table_name, i, col_count, ' <-' if col_count == 0 else ''))
    # print(col_list)

    crsr.execute("DROP TABLE {};".format(table_name))

The results are very consistent. 4.0.25 returns an empty list when the length of the table name is 1, 4, 5, 6, 8, 17, 21, 24, 25, or 26 characters. Examples of the failing entries in the ODBC log are

        WCHAR *             0x00000000022E7960 [      -3] "A?8\ 0"
        WCHAR *             0x00000000022F5380 [      -3] "ABCD???;??\ 0"
        WCHAR *             0x00000000022F5380 [      -3] "ABCDE??;??\ 0"
        WCHAR *             0x00000000022F5380 [      -3] "ABCDEF?;??\ 0"
        WCHAR *             0x00000000022F0BB0 [      -3] "ABCDEFGH?;\ 0"
        WCHAR *             0x00000000022F13F8 [      -3] "ABCDEFGHIJKLMNOPQ?\ 0"
        WCHAR *             0x00000000022E9460 [      -3] "ABCDEFGHIJKLMNOPQRSTU?\ 0"
        WCHAR *             0x00000000022C1680 [      -3] "ABCDEFGHIJKLMNOPQRSTUVWX???s\ 0"
        WCHAR *             0x00000000022C1680 [      -3] "ABCDEFGHIJKLMNOPQRSTUVWXY??s\ 0"
        WCHAR *             0x00000000022C1680 [      -3] "ABCDEFGHIJKLMNOPQRSTUVWXYZ?s\ 0"

while the parameters for a successful call look like

        WCHAR *             0x00000000022E79D8 [      -3] "AB\ 0"
        WCHAR *             0x00000000022E79B0 [      -3] "ABC\ 0"

@gordthompson
Collaborator

Extra additional information: Repeated the above test with 4.0.25 under Python 3.7. The results are less consistent and the failing name lengths are different: 1 (usually, but not always), 4 (usually, but not always), 5, 8, 9, 12, 17, 20, 21, 22, 24, or 25 characters.

mkleehammer added a commit that referenced this issue Jan 30, 2019
This was noticed in Cursor.columns, but I'm sure it affects a couple of other places in the
code.

Unfortunately the solution is hideously inefficient.  I tried not null terminating, but I'm
afraid driver differences when not using SQLWCHAR, such as drivers that want UTF8, will cause
non-stop errors.  Suggestions for a more efficient implementation are welcome.  Alternatively,
if someone wants to try converting to using lengths instead of null terminated strings *and*
testing it, I'd be interested.
@mkleehammer
Owner

I got to the bottom of this: the Unicode encoding doesn't automatically add null terminators. Encoding to UTF-8 does, but the named encodings do not. In particular, we need two null bytes at the end of a UTF-16 string and four at the end of a UTF-32 string.

There doesn't seem to be a way to force nulls on the end during conversion. My first thought was to not use null terminators and pass string lengths into the ODBC APIs. This is easy to do when following the specification, but a big part of pyodbc is dealing with non-standard drivers like MySQL and PostgreSQL that will want UTF8. What length would they want - characters or bytes? And what about all of the other drivers out there that I can't test?

In the end, I bit the bullet and concatenated four null bytes onto the end of the encoded bytes, which requires a memory copy. It isn't used for data we're reading, just for column names, so I think it is acceptable.
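
To illustrate what that means in plain Python (just a sketch of the encoding behaviour described above, not pyodbc's internal C code):

# Sketch only: Python's named encodings do not append a terminator, so a buffer
# handed to a wide-character ODBC API can run past the end of the intended string.
name = u"A"

utf16 = name.encode("utf-16-le")   # b'A\x00' -- no UTF-16 terminator (two zero bytes) appended
utf32 = name.encode("utf-32-le")   # b'A\x00\x00\x00' -- no UTF-32 terminator (four zero bytes) appended

# The fix described above: append four zero bytes after encoding, which is enough
# to terminate a UTF-8 (1-byte), UTF-16 (2-byte) or UTF-32 (4-byte) string.
utf16_terminated = utf16 + b"\x00" * 4
utf32_terminated = utf32 + b"\x00" * 4

print(repr(utf16), repr(utf16_terminated))
print(repr(utf32), repr(utf32_terminated))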

@gordthompson Does this work for you?

@mkleehammer
Owner

Also, I could not figure out how to make it fail in a unit test. If you can come up with one, I'd love to add it. I could get it to fail every time by using a separate test file, but could never get it to fail in the SQL Server unit tests.
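
A hedged sketch of the kind of regression test being discussed, based on the name-length loop above (the connection string is a placeholder, and this is not the test that was eventually merged):

import unittest
import pyodbc

# Placeholder connection string -- adjust for your own server and database.
CNXNSTR = "DRIVER={ODBC Driver 17 for SQL Server};SERVER=srv;DATABASE=test_db;Trusted_Connection=yes;"

class ColumnsNullTerminationTest(unittest.TestCase):
    def test_columns_for_various_name_lengths(self):
        cnxn = pyodbc.connect(CNXNSTR, autocommit=True)
        crsr = cnxn.cursor()
        for i in range(1, 28):
            table_name = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ_'[:i]
            # Drop any leftover table, then recreate it with a single column.
            crsr.execute("""\
            BEGIN TRY
                DROP TABLE {0};
            END TRY
            BEGIN CATCH
            END CATCH
            CREATE TABLE {0} (id INT PRIMARY KEY);
            """.format(table_name))
            cols = [col.column_name for col in crsr.columns(table_name)]
            self.assertEqual(cols, ['id'], 'failed for {}-character table name'.format(i))
            crsr.execute("DROP TABLE {};".format(table_name))

if __name__ == '__main__':
    unittest.main()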

@mkleehammer mkleehammer self-assigned this Jan 30, 2019
gordthompson added a commit to gordthompson/pyodbc that referenced this issue Jan 30, 2019
mkleehammer added a commit that referenced this issue Feb 2, 2019
This was noticed in Cursor.columns, but I'm sure it affects a couple of other places in the
code.

Unfortunately the solution is hideously inefficient.  I tried not null terminating, but I'm
afraid driver differences when not using SQLWCHAR, such as drivers that want UTF8, will cause
non-stop errors.  Suggestions for a more efficient implementation are welcome.  Alternatively,
if someone wants to try converting to using lengths instead of null terminated strings *and*
testing it, I'd be interested.
mkleehammer pushed a commit that referenced this issue Feb 2, 2019
* Fix double-decref on fast_executemany error return path. (#504)

* test for null termination fix (issue #506)
mkleehammer pushed a commit that referenced this issue Feb 2, 2019
* Fix double-decref on fast_executemany error return path. (#504)

* test for null termination fix (issue #506)

(cherry picked from commit 9597e6b)
@drmikecrowe

Is this issue really fixed? I could not get Python 3.6/3.8 + pyodbc==4.0.30 to work.

Example: I was researching 127 tables. For 62 of them, cursor.columns() returned 0 results. Here's the essence of my code:

cnxn = pyodbc.connect("Driver={ODBC Driver 17 for SQL Server};" + "Server={};Database={};Trusted_Connection=yes".format(server, database))
...
rows = cursor.columns(table=table, schema=schema_name).fetchall()

I used conda to downgrade Python to 2.7, did a pip install pyodbc==4.0.24, and all 62 tables (with the same code) pulled correctly.

How should I structure my code so that it works on Python 3.x?

@v-chojas
Contributor

Do you have ODBC traces of both the working and non-working cases?

@drmikecrowe

This is bizarre -- I cannot get this to successfully run again. I got all 62 successfully one time and now I cannot reproduce it. I've created new conda environments and reinstalled the exact versions I used earlier (I still have the window open), and it repeatedly fails.

@v-chojas
Contributor

In any case, please provide an ODBC trace and more details on your environment.

@drmikecrowe

@drmikecrowe

To be clear, I found what went wrong with my attempt to reproduce the issue. IMO, this issue is still open, and the logs attached above show the same code working on 2.7/4.0.24 and failing on 3.7/4.0.30.

@drmikecrowe

Also, I've attached the function that I'm using in both scenarios:
example.zip

@v-chojas
Contributor

Your success log appears to be from Windows while the failure log is from Linux/Mac. If you change more than one variable at a time, it becomes harder to diagnose. Please provide more details about your environment.

@gt6989b

gt6989b commented Jun 17, 2020

This is definitely still a problem for me on Windows 10, running Python 2.7.18 (not anaconda) with SQL Server 2016 and SSMS 18 on the local box.

I ran the code in the attached example as follows:
python colsExample.py >30.txt 2>30.err
where 24, 25, and 30 correspond to the version of pyodbc installed at the time of the run.
colsExample.zip

4.0.24 is perfect, 4.0.25 crashes, and 4.0.30 produces no output from a plain command prompt but, when run inside Git Bash, results in a segmentation fault.

What other information could I provide to make this helpful?

@v-chojas
Contributor

You can provide an ODBC trace.

@gt6989b

gt6989b commented Jun 18, 2020

I am sorry, I had to learn how to turn on the ODBC trace. Here is the updated archive with ODBC trace logs.
colsExampleWithTrace.zip

@gordthompson
Collaborator

24.trace.log:372 (in SQLColumnsW call)

        WCHAR *             0x00000000035F66A0 [      -3] "MarketSectors\ 0"

25.trace.log:372 (in SQLColumnsW call)

        WCHAR *             0x0000000067F54676 [      -3] "????????????T????????\ 0"

@keitherskine
Collaborator

Just FYI, there's a test for columns() in tests3\sqlservertests.py but not in tests2\sqlservertests.py.

@gordthompson
Collaborator

Just FYI, there's a test for columns() in tests3\sqlservertests.py but not in tests2\sqlservertests.py.

Hmm. I guess I forgot to add it to tests2 when I submitted that test. At least I don't remember purposely omitting it from tests2 for any particular reason. Oh well, Python 2 is done now, so if it works for Python 3 that's good enough for me.

@gt6989b

gt6989b commented Jun 18, 2020

Is there still a chance to please fix this for Python 2.7.18?

@keitherskine
Collaborator

Even if this does get fixed, @gt6989b, the reality is that it probably won't happen soon. In the meantime, perhaps you can query the metadata tables directly for the information you're looking for, e.g.:

select * from INFORMATION_SCHEMA.COLUMNS;
select * from sys.all_columns;
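
For example, a minimal pyodbc sketch of that workaround (the connection string, schema, and table name are placeholders):

import pyodbc

# Placeholder connection string -- adjust driver/server/database as needed.
cnxn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=srv;DATABASE=test_db;"
    "Trusted_Connection=yes;"
)
crsr = cnxn.cursor()
# Query the metadata view directly instead of calling cursor.columns().
crsr.execute(
    """
    SELECT COLUMN_NAME
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ?
    ORDER BY ORDINAL_POSITION
    """,
    ("dbo", "A"),
)
print([row.COLUMN_NAME for row in crsr.fetchall()])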

@gt6989b

gt6989b commented Jun 18, 2020

@keitherskine thank you so much. For now, I have downgraded to version 4.0.24, which does not have the bug :) I hope this will get fixed at some point so we can upgrade the package. Does the same code now work on Python 3?

@ewongbb

ewongbb commented Jul 9, 2020

@keitherskine thanks for the workaround. I was completely baffled as to why an MSSQL table (using the SQL Server Native Client ODBC driver) wasn't showing any columns when it's supposed to have two. I'm using this workaround to bypass the issue.

@gt6989b

gt6989b commented Jul 22, 2020

Could we possibly reopen this issue so it may get fixed at some point? Thank you.

@v-chojas
Contributor

You are free to open a new issue, as the cause may not be the same as in the original one (which was fixed).
