Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pyodbc 5.0.0 core dump on connecting with token to mssql #1289

Closed
karoldob opened this issue Oct 10, 2023 · 15 comments
Closed

pyodbc 5.0.0 core dump on connecting with token to mssql #1289

karoldob opened this issue Oct 10, 2023 · 15 comments

Comments

@karoldob
Copy link

karoldob commented Oct 10, 2023

Environment

  • Python: 3.10.12
  • pyodbc: 5.0.0
  • sqlalchemy: 2.0.21
  • OS: Ubuntu 22.04 64bit
  • DB: MSSQL (AzureSQL)
  • driver: Microsoft ODBC Driver 18 for SQL Server (/opt/microsoft/msodbcsql18/lib64/libmsodbcsql-18.3.so.2.1)
  • login method: token

Issue

Trying to connect with token as described in:
https://docs.sqlalchemy.org/en/20/dialects/mssql.html#connecting-to-databases-with-access-tokens

How to reproduce:

  • login with az login on your machine (in MS way)
  • copy-paste sample code from the link and fill server/database
  • add engine.connect() at the end
  • observe segmentation fault (core dumped)

Observations:
pyodbc 4.0.38 is not affected - above samples works well
Just pip install --upgrade pyodbc==4.0.38 solves the issue, in the same environment
Login with user-password works well

Trace in attachment
odbctrace.log

@v-chojas
Copy link
Contributor

The fact that the ODBC trace is almost 1MB is already a sign that something is not right, and inspecting closer, it looks like a major breakage happened in the connection attribute processing code. Have you tried using a pure Python+pyODBC script to reproduce it, just to eliminate possibility it's SQLAlchemy?

@karoldob
Copy link
Author

karoldob commented Oct 10, 2023

Yes, already tested with this samples:
https://github.com/mkleehammer/pyodbc/wiki/Connecting-to-SQL-Server-from-Linux#connecting-to-servers-with-access-tokens

Behavior is the same: segfault on pyodbc 5.0.0, normal working on 4.0.38.
And the trace file looks similar

@keitherskine
Copy link
Collaborator

Hi @karoldob , as an experiment, could I ask you to try using a bytearray rather than bytes for the access token in your call to pyodbc.connect. That is, instead of:

cnxn = pyodbc.connect(connection_string, attrs_before={SQL_COPT_SS_ACCESS_TOKEN: token_struct})

try:

cnxn = pyodbc.connect(connection_string, attrs_before={SQL_COPT_SS_ACCESS_TOKEN: bytearray(token_struct)})

It might work. (But of course bytes should work too.) Many thanks.

@jacovg91
Copy link

jacovg91 commented Oct 11, 2023

Just reporting in here, not sure if there is any relationship, but we have noticed, using Pyodbc 5.0.0, this morning we are no longer able to connect to Azure SQL database. This stopped working over night. We are also using token authentication (to Azure) using the msal library. Getting InterfaceError: (pyodbc.InterfaceError) ('28000', "[28000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Login failed for user ''. (18456) (SQLDriverConnect)")

100% certain right user and password (clientId & secret) are used and neither are expired.

EDIT: downgrading to pyodbc==4.0.38 solved it.

@ Pyodbc team, can you please look into this?

@karoldob
Copy link
Author

@keitherskine
Thanks, bytearray(token_struct) instead token_struct solved issue. But now I'm not sure to close or not: it will be changed internally or just updated documenattion (with bytearray) :)

@jacovg91
Copy link

jacovg91 commented Oct 11, 2023

Changing to bytearray(token_struct) didn't work for us.

Python: 3.9.5
pyodbc: 5.0.0
sqlalchemy: 1.4.0
OS: Linux (Databricks)
DB: MSSQL (AzureSQL)
driver: Microsoft ODBC Driver 17 for SQL Server
login method: token using msal

@mike-f50
Copy link

Hi. We're seeing the same issue, manifesting when our Great Expectations checkpoint is run. (We have an Azure SQL database configured as a store, which GE connects to using a token.)

For us, the error message is:

ConnectException: Connection refused (Connection refused)
Error while obtaining a new communication channel

The pyodbc.connect command is buried within GE, so the suggested token_struct change is hard for us to test, but rolling back to 4.0.38 has solved the problem for now.

@keitherskine
Copy link
Collaborator

Hi @karoldob , please do keep this issue open. This looks like a genuine bug.

@jabbera
Copy link
Contributor

jabbera commented Oct 12, 2023

I compiled with address sanitization. There is a use after free bug:
==437320==ERROR: AddressSanitizer: heap-use-after-free on address 0x625000057900 at pc 0x7ff9c2b6b397 bp 0x7ffd76615910 sp 0x7ffd766150b8
READ of size 125829128 at 0x625000057900 thread T0
#0 0x7ff9c2b6b396 in __interceptor_memcpy ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
#1 0x7ff9be2ff632 (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0xff632)
#2 0x7ff9be30723d (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0x10723d)
#3 0x7ff9be32fafd (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0x12fafd)
#4 0x7ff9be32fbef (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0x12fbef)
#5 0x7ff9be324113 (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0x124113)
#6 0x7ff9be3249a0 (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0x1249a0)
#7 0x7ff9be327c48 (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0x127c48)
#8 0x7ff9be328463 (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0x128463)
#9 0x7ff9be2972c8 (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0x972c8)
#10 0x7ff9be2cb95d (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0xcb95d)
#11 0x7ff9be2966e9 in SQLDriverConnectW (/opt/microsoft/msodbcsql17/lib64/libmsodbcsql-17.10.so.4.1+0x966e9)
#12 0x7ff9bf735e91 in SQLDriverConnectW (/lib/x86_64-linux-gnu/libodbc.so.2+0x35e91)
#13 0x7ff9bf790d5b in Connect src/connection.cpp:92
#14 0x7ff9bf790d5b in Connection_New(_object*, bool, long, bool, _object*, Object&) src/connection.cpp:205
#15 0x7ff9bf7af4a0 in mod_connect src/pyodbcmodule.cpp:523
#16 0x5562002c2e0d (/usr/bin/python3.10+0x15fe0d)
#17 0x5562002b95ea in _PyObject_MakeTpCall (/usr/bin/python3.10+0x1565ea)
#18 0x5562002b2907 in _PyEval_EvalFrameDefault (/usr/bin/python3.10+0x14f907)
#19 0x55620039ce55 (/usr/bin/python3.10+0x239e55)
#20 0x55620039ccf5 in PyEval_EvalCode (/usr/bin/python3.10+0x239cf5)
#21 0x5562003c77d7 (/usr/bin/python3.10+0x2647d7)
#22 0x5562003c10ba (/usr/bin/python3.10+0x25e0ba)
#23 0x5562003c7524 (/usr/bin/python3.10+0x264524)
#24 0x5562003c6a07 in _PyRun_SimpleFileObject (/usr/bin/python3.10+0x263a07)
#25 0x5562003c6652 in _PyRun_AnyFileObject (/usr/bin/python3.10+0x263652)
#26 0x5562003b941d in Py_RunMain (/usr/bin/python3.10+0x25641d)
#27 0x55620038fcac in Py_BytesMain (/usr/bin/python3.10+0x22ccac)
#28 0x7ff9c27f5d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#29 0x7ff9c27f5e3f in __libc_start_main_impl ../csu/libc-start.c:392
#30 0x55620038fba4 in _start (/usr/bin/python3.10+0x22cba4)

0x625000059b61 is located 0 bytes to the right of 8801-byte region [0x625000057900,0x625000059b61)
freed by thread T0 here:
#0 0x7ff9c2be5537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
#1 0x7ff9bf75249d (/lib/x86_64-linux-gnu/libodbc.so.2+0x5249d)

previously allocated by thread T0 here:
#0 0x7ff9c2be5887 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7ff9bf74248d (/lib/x86_64-linux-gnu/libodbc.so.2+0x4248d)

@jabbera
Copy link
Contributor

jabbera commented Oct 12, 2023

4.0.39 has the same issue, but get's lucky to not hit the bug I think.

@v-chojas
Copy link
Contributor

Is the token being freed before it gets used in the connect?

@jabbera
Copy link
Contributor

jabbera commented Oct 12, 2023

I compiled a sanitized version of unixodbc 2.3.9 and got this additional information:

0x62500003e361 is located 0 bytes to the right of 8801-byte region [0x62500003c100,0x62500003e361)
freed by thread T0 here:
#0 0x7f3f2ab59537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
#1 0x7f3f271b3891 in __connect_part_one (/lib/x86_64-linux-gnu/libodbc.so.2+0x4f891)
#2 0x7f3f27229653 in SQLDriverConnectW (/lib/x86_64-linux-gnu/libodbc.so.2+0xc5653)
#3 0x7f3f2762721f in Connect src/connection.cpp:92
#4 0x7f3f2762721f in Connection_New(_object*, bool, long, bool, _object*, Object&) src/connection.cpp:205
#5 0x7f3f27646eb3 in mod_connect src/pyodbcmodule.cpp:523
#6 0x55c73057ee0d (/usr/bin/python3.10+0x15fe0d)

previously allocated by thread T0 here:
#0 0x7f3f2ab59887 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x7f3f2725d502 in unicode_to_ansi_alloc (/lib/x86_64-linux-gnu/libodbc.so.2+0xf9502)
#2 0x7f3f27248fb7 in SQLSetConnectAttrW (/lib/x86_64-linux-gnu/libodbc.so.2+0xe4fb7)
#3 0x7f3f276264eb in ApplyPreconnAttrs src/connection.cpp:147
#4 0x7f3f27626eb8 in Connection_New(_object*, bool, long, bool, _object*, Object&) src/connection.cpp:198
#5 0x7f3f27646eb3 in mod_connect src/pyodbcmodule.cpp:523
#6 0x55c73057ee0d (/usr/bin/python3.10+0x15fe0d)

@v-chojas
Copy link
Contributor

What parameters are you seeing in your SQLSetConnectAttrW call for the access token? Looks like it's somehow going through the string conversion path, which shouldn't be reached by this code path. Also, is that with or without the proposed fix at #1290 ?

@jabbera
Copy link
Contributor

jabbera commented Oct 12, 2023

@v-chojas ugh, I had a bug i my test harness causing it to go down the string path. The PR does indeed fix the issue!

@mkleehammer
Copy link
Owner

The pull request keitherskine, 1290, fixes this it sounds like. I'm releasing 5.0.1 with this fix now. Please reopen if I missed something.

Thanks everyone for troubleshooting this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants