sqlite3 segfaults and bus errors when given certain unicode strings as queries #56778

Closed
JeremyBanks mannequin opened this issue Jul 14, 2011 · 11 comments
Labels
stdlib Python modules in the Lib dir type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@JeremyBanks
Mannequin

JeremyBanks mannequin commented Jul 14, 2011

BPO 12569
Nosy @amauryfa, @vstinner, @ned-deily

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2011-07-15.04:36:06.200>
created_at = <Date 2011-07-14.23:23:42.101>
labels = ['library', 'type-crash']
title = 'sqlite3 segfaults and bus errors when given certain unicode strings as queries'
updated_at = <Date 2011-07-15.09:56:41.957>
user = 'https://bugs.python.org/jeremybanks'

bugs.python.org fields:

activity = <Date 2011-07-15.09:56:41.957>
actor = 'amaury.forgeotdarc'
assignee = 'none'
closed = True
closed_date = <Date 2011-07-15.04:36:06.200>
closer = 'ned.deily'
components = ['Library (Lib)']
creation = <Date 2011-07-14.23:23:42.101>
creator = 'jeremybanks'
dependencies = []
files = []
hgrepos = []
issue_num = 12569
keywords = []
message_count = 11.0
messages = ['140381', '140384', '140385', '140386', '140387', '140389', '140392', '140393', '140395', '140398', '140400']
nosy_count = 5.0
nosy_names = ['ghaering', 'amaury.forgeotdarc', 'vstinner', 'ned.deily', 'jeremybanks']
pr_nums = []
priority = 'normal'
resolution = 'works for me'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue12569'
versions = ['Python 3.1']

@JeremyBanks
Mannequin Author

JeremyBanks mannequin commented Jul 14, 2011

I was experimenting with the sqlite3 library and noticed that using certain strings as identifiers causes bus errors or segfaults. I'm not very familiar with Unicode, but after some Googling I'm pretty sure this happens when I use non-characters or surrogate characters incorrectly.

This causes a bus error:

import sqlite3
c = sqlite3.connect(":memory:")
table_name = '"' + chr(0xD800) + '"'
c.execute("create table " + table_name + " (bar)")

The equivalent Python 2 (replacing chr with unichr) works properly.

@JeremyBanks JeremyBanks mannequin added stdlib Python modules in the Lib dir type-crash A hard crash of the interpreter, possibly with a core dump labels Jul 14, 2011
@ned-deily
Member

On what operating system platform and version are you seeing this behavior? Also, can you report the versions of the sqlite3 adapter and the sqlite3 library by executing the following in the interpreter?

>>> sqlite3.version
'2.6.0'
>>> sqlite3.sqlite_version
'3.6.12'
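
For anyone collecting the same details from several installations, the following is a minimal sketch that gathers them in one place. The helper name `sqlite_environment` is made up for illustration, and the sketch assumes `sqlite3.version` may be missing on newer Python releases, so it falls back to `None` there.

```python
import platform
import sqlite3
import sys
import warnings


def sqlite_environment():
    """Collect the interpreter, platform, and SQLite details for a bug report."""
    with warnings.catch_warnings():
        # sqlite3.version is deprecated in newer Python releases and may be
        # absent entirely, so silence the warning and tolerate its absence.
        warnings.simplefilter("ignore")
        adapter_version = getattr(sqlite3, "version", None)
    return {
        "python": sys.version.split()[0],
        "executable": sys.executable,
        "platform": platform.platform(),
        "adapter_version": adapter_version,
        "sqlite_version": sqlite3.sqlite_version,
    }


if __name__ == "__main__":
    for key, value in sqlite_environment().items():
        print(f"{key}: {value}")
```

Running this with each interpreter in turn shows which adapter and which SQLite library that particular installation is linked against.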

On Linux and OS X systems I've tested, rather than a segfault your test case causes an exception to be raised.

For Python 3.1.4:
"sqlite3.Warning: SQL is of wrong type. Must be string or unicode."

For Python 3.2.1:
"UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 14: surrogates not allowed"
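
To compare installations without risking an unhandled traceback, the test case can be wrapped as in the sketch below. The helper `try_create_table` is hypothetical and only reports which exception, if any, was raised. It cannot catch a genuine segfault or bus error, which would still terminate the interpreter.

```python
import sqlite3


def try_create_table(identifier):
    """Create a table named `identifier`; return the exception raised, or None."""
    connection = sqlite3.connect(":memory:")
    try:
        connection.execute('create table "' + identifier + '" (bar)')
    except (UnicodeError, sqlite3.Error, sqlite3.Warning) as exc:
        # Different Python versions report this case with different exception types.
        return exc
    finally:
        connection.close()
    return None


result = try_create_table(chr(0xD800))
print(type(result).__name__ if result is not None else "no exception")
```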

@JeremyBanks
Mannequin Author

JeremyBanks mannequin commented Jul 15, 2011

I'm using OS X 10.6.7.

The bus error is occurring with my Python 3.1 installation:
path: /Library/Frameworks/Python.framework/Versions/3.1/bin/python3
sqlite3.version == 2.4.1
sqlite3.sqlite_version == 3.6.11

But now that you mention it, my MacPorts installations of Python 3.0 and 3.1 just get an exception like you do:
paths: /opt/local/bin/python3.0 / python3.1
sqlite3.version == 2.4.1
sqlite3.sqlite_version == 3.7.7.1

A Python 2.7 installation where it works without any error:
path: /Library/Frameworks/Python.framework/Versions/2.7/bin/python
sqlite3.version == 2.6.0
sqlite3.sqlite_version == 3.6.12

A MacPorts Python 2.6 installation where it works without any error:
path: /opt/local/bin/python2.6
sqlite3.version == 2.4.1
sqlite3.sqlite_version == 3.7.7.1

@ned-deily
Member

Sorry, I cannot reproduce the crash behavior you report on Mac OS X 10.6.8 using various Python 3.1.x versions installed from the python.org OS X installers, in particular 3.1 and 3.1.4 (the first and the most recent 3.1 releases). If this Python instance was not installed from a python.org installer, I suggest contacting the distributor that supplied it. If you built it from source, I suggest checking which ./configure options you used and which copy of the sqlite3 library was used. You might want to take this opportunity to update to Python 3.2.1, since no further bug fixes (other than security fixes) are expected to be released for 3.1.x.

@JeremyBanks
Mannequin Author

JeremyBanks mannequin commented Jul 15, 2011

I'll try that, thank you.

If it works without exception in Python 2, isn't the behaviour in Python 3 a regression bug, even if it doesn't crash? If so, should I create a new/separate issue for the behaviour?

@ned-deily
Member

0xD800 does not represent a valid Unicode character; it's a surrogate code point (see http://en.wikipedia.org/wiki/Mapping_of_Unicode_characters#Surrogates). If you use a code point that does represent a Unicode character, say 0xA800, there is no error. If there is a bug here, it's that the Python 2 version does not report an error for this edge case.
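
The distinction can be checked directly in Python 3. This is a small sketch with made-up helper names (`is_surrogate`, `can_encode_utf8`). It relies only on the surrogate range U+D800 to U+DFFF and on the strict UTF-8 codec rejecting lone surrogates.

```python
def is_surrogate(code_point):
    """Return True if the code point lies in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= code_point <= 0xDFFF


def can_encode_utf8(text):
    """Return True if the string can be encoded with the strict UTF-8 codec."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


for code_point in (0xD800, 0xA800):
    print(hex(code_point), is_surrogate(code_point), can_encode_utf8(chr(code_point)))
```

Checking identifiers this way before building a query avoids passing a string to sqlite3 that cannot be converted to UTF-8.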

@vstinner
Member

I already fixed this issue in Python 3.1, 3.2 and 3.3: issue bpo-6697 (e.g. commit 7ba851d1b46e).

$ ./python 
Python 3.3.0a0 (default:ab162f925761, Jul 15 2011, 09:36:17) 
>>> import sqlite3
>>> c = sqlite3.connect(":memory:")
>>> table_name = '"' + chr(0xD800) + '"'
>>> c.execute("create table " + table_name + " (bar)")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'utf-8' codec can't encode character '\ud800' in position 14: surrogates not allowed

@jeremyBanks: I don't think you are using the sqlite3 module that ships with Python 3, but rather the third-party module.

@vstinner
Member

> I already fixed this issue in Python 3.1, 3.2 and 3.3:
> issue bpo-6697 (e.g. commit 7ba851d1b46e).

Oh, wrong: the bug was only fixed in Python 3.2 and 3.3. There was already a check after _PyUnicode_AsStringAndSize(), but the test was on the wrong variable (operation vs operation_cstr).

Because only security bugs can be fixed in Python 3.1, I think that this issue should be closed. Or do you consider dereferencing a NULL pointer in sqlite3 as a security vulnerability?

@amauryfa
Member

It seems that a fix was merged in the 3.1 branch, somewhere between 3.1.2 and 3.1.3.

@vstinner
Member

> It seems that a fix was merged in the 3.1 branch,
> somewhere between 3.1.2 and 3.1.3.

Which fix? The code is still wrong in Mercurial (branch 3.1):

operation_cstr = _PyUnicode_AsStringAndSize(operation, &operation_len);
if (operation == NULL)
    goto error;

http://hg.python.org/cpython/file/42ec507815d2/Modules/_sqlite/cursor.c
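
The effect of testing the wrong variable can be illustrated with a Python analogy. This is not the CPython source: `encode_or_none`, `buggy_prepare`, and `fixed_prepare` are hypothetical stand-ins, with `None` playing the role of the NULL pointer returned on a failed conversion.

```python
def encode_or_none(text):
    """Hypothetical stand-in for a conversion that returns None on failure."""
    try:
        return text.encode("utf-8")
    except UnicodeEncodeError:
        return None


def buggy_prepare(operation):
    operation_cstr = encode_or_none(operation)
    if operation is None:  # wrong variable: the input itself is never None here
        raise ValueError("could not encode statement")
    # With a failed conversion, operation_cstr is None and this line blows up.
    return len(operation_cstr)


def fixed_prepare(operation):
    operation_cstr = encode_or_none(operation)
    if operation_cstr is None:  # checks the result of the conversion
        raise ValueError("could not encode statement")
    return len(operation_cstr)


for prepare in (buggy_prepare, fixed_prepare):
    try:
        prepare('create table "' + chr(0xD800) + '" (bar)')
    except (TypeError, ValueError) as exc:
        print(prepare.__name__, "->", type(exc).__name__)
```

In Python the unchecked failure surfaces as a `TypeError` further down. In C, the equivalent mistake leaves a NULL pointer in use, which is the kind of dereference discussed above.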

@amauryfa
Member

The fix was c073f3c3276e (found thanks to hg bisect).
The variable operation_cstr is not used before the call to pysqlite_cache_get(), which also tries to encode the statement into UTF-8 and correctly raises an exception.
In early 3.1.2, the segfault came from the DECREF of an uninitialized member...

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022