BTree.get() swallows POSKeyError on internal corruption (C only) #161

Closed
jamadden opened this issue Apr 6, 2021 · 2 comments · Fixed by #162

jamadden commented Apr 6, 2021

(Discovered when researching #82 and #91.)

If an internal node of a BTree is missing, searching through the BTree will encounter POSKeyError when it calls PER_USE(child). Unfortunately, because POSKeyError is a subclass of KeyError, the get() method will interpret this as a missing key and return the default value. The internal corruption in the BTree goes silently unreported.

BTree_getm(BTree *self, PyObject *args)
{
  PyObject *key, *d=Py_None, *r;

  UNLESS (PyArg_ParseTuple(args, "O|O", &key, &d))
    return NULL;
  if ((r=_BTree_get(self, key, 0, _BGET_REPLACE_TYPE_ERROR)))
    return r;
  /* Any KeyError is treated as "key not found". POSKeyError is a
     KeyError subclass, so a failed node load is cleared here too. */
  UNLESS (PyErr_ExceptionMatches(PyExc_KeyError))
    return NULL;
  PyErr_Clear();
  Py_INCREF(d);
  return d;
}
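
The control flow above can be mimicked in pure Python to show why the error vanishes. This is only an illustrative sketch: `StandInPOSKeyError`, `lookup`, `lossy_get`, `strict_get`, and the toy `nodes` layout are hypothetical names invented here, not part of BTrees or ZODB, and `strict_get` is not necessarily how the fix in #162 works.

```python
class StandInPOSKeyError(KeyError):
    """Hypothetical stand-in for POSKeyError: a KeyError subclass."""


def lookup(nodes, key):
    """Walk a two-level toy tree stored as a dict of dicts."""
    child_oid = nodes["root"]["child"]
    if child_oid not in nodes:
        # The child node's record is gone: internal corruption.
        raise StandInPOSKeyError(child_oid)
    # A plain KeyError here just means the key is absent.
    return nodes[child_oid][key]


def lossy_get(nodes, key, default=None):
    """Same shape as BTree_getm: every KeyError becomes the default."""
    try:
        return lookup(nodes, key)
    except KeyError:
        return default


def strict_get(nodes, key, default=None):
    """Let the corruption error through; only a plain KeyError means 'absent'."""
    try:
        return lookup(nodes, key)
    except StandInPOSKeyError:
        raise
    except KeyError:
        return default


healthy = {"root": {"child": "oid-1"}, "oid-1": {"a": 1}}
corrupt = {"root": {"child": "oid-1"}}  # the child record is missing

print(lossy_get(healthy, "a"))           # value found
print(lossy_get(corrupt, "a", "dflt"))   # corruption reported as a missing key
```

Calling `strict_get(corrupt, "a")` raises `StandInPOSKeyError` instead of returning the default, which is the behaviour the reproduction script below expects from `get()`.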

This script can reproduce it:

from ZODB.DB import DB
from ZODB.utils import p64 as int2b
from ZODB.MappingStorage import MappingStorage
import transaction
import BTrees
from ZODB.POSException import POSKeyError

BIG = 1
BIG_UPPER_BOUND = 10000000

def test_mod(bt_mod):
    db = DB(MappingStorage())
    bt = bt_mod.BTree()
    if BIG:
        for i in range(BIG_UPPER_BOUND):
            bt[i] = i
        for i in range(0, -BIG_UPPER_BOUND, -1):
            bt[i] = i
    else:
        for i in range(10000):
            bt[i] = i


    conn = db.open()
    conn.root.key = bt
    transaction.commit()

    conn.cacheMinimize()
    if BIG:
        bt._p_activate() # bring in the root

    oss = conn.setstate
    def setstate(*args):
        print("Loading", *args)
        return oss(*args)

    conn.setstate = setstate

    # Delete some random leaves, but keep the first node we will try
    # to access.
    data = db._storage._data.copy()
    db._storage._data.clear()
    if BIG:
        for oid_to_keep in 0x04, 0x29, 0x52, 0x05:
            db._storage._data[int2b(oid_to_keep)] = data[int2b(oid_to_keep)]

    print("Getting key")
    try:
        bt.get(BIG_UPPER_BOUND + 1)
    except POSKeyError as e:
        print("Got expected POSKeyError", e)
    else:
        print("HEY NO EXCEPTION!!!")

for bt_mod in BTrees.family64.OO, BTrees.family64.II:
    test_mod(bt_mod)

With a big tree consisting of internal nodes, no exception is reported to Python:

$ python test.py
Getting key
Loading <BTrees.OOBTree.OOBTree object at 0x185e71370 oid 0x52 in <Connection at 104bc42d0>>
Loading <BTrees.OOBTree.OOBTree object at 0x104be8870 oid 0x13b6b2 in <Connection at 104bc42d0>>
Couldn't load state for BTrees.OOBTree.OOBTree 0x13b6b2
Traceback (most recent call last):
  File "//lib/python3.8/site-packages/ZODB/Connection.py", line 791, in setstate
    p, serial = self._storage.load(oid)
  File "//lib/python3.8/site-packages/ZODB/mvccadapter.py", line 143, in load
    r = self._storage.loadBefore(oid, self._start)
  File "//python3.8/site-packages/ZODB/utils.py", line 288, in __call__
    return func(*args, **kw)
  File "//python3.8/site-packages/ZODB/MappingStorage.py", line 168, in loadBefore
    raise ZODB.POSException.POSKeyError(oid)
ZODB.POSException.POSKeyError: 0x13b6b2
HEY NO EXCEPTION!!!
Getting key
Loading <BTrees.LLBTree.LLBTree object at 0x10484c910 oid 0x5 in <Connection at 104becd70>>
Loading <BTrees.LLBTree.LLBTree object at 0x15dac1cd0 oid 0x50d61 in <Connection at 104becd70>>
Couldn't load state for BTrees.LLBTree.LLBTree 0x050d61
Traceback (most recent call last):
  File "//python3.8/site-packages/ZODB/Connection.py", line 791, in setstate
    p, serial = self._storage.load(oid)
  File "//python3.8/site-packages/ZODB/mvccadapter.py", line 143, in load
    r = self._storage.loadBefore(oid, self._start)
  File "//python3.8/site-packages/ZODB/utils.py", line 288, in __call__
    return func(*args, **kw)
  File "//python3.8/site-packages/ZODB/MappingStorage.py", line 168, in loadBefore
    raise ZODB.POSException.POSKeyError(oid)
ZODB.POSException.POSKeyError: 0x050d61
HEY NO EXCEPTION!!!

Setting BIG to 0, so that the root BTree itself is the missing object, does raise a POSKeyError.

jamadden commented Apr 6, 2021

This is only an issue in the C implementation; the Python implementation lets the POSKeyError percolate.

jamadden changed the title from "BTree.get() swallows POSKeyError on internal corruption" to "BTree.get() swallows POSKeyError on internal corruption (C only)" on Apr 6, 2021
jamadden self-assigned this on Apr 7, 2021

jamadden commented Apr 7, 2021

This also applies to key in map, map.setdefault(k, v), and map.pop(k).
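
One way to check all four operations at once is a small probe that reports which calls complete without raising. The sketch below is an assumption-laden illustration, not code from BTrees: `StandInPOSKeyError`, `BrokenMapping`, and `swallowing_operations` are hypothetical names, and the toy mapping relies on the standard library's `collections.abc.MutableMapping` mixin methods, which happen to use the same catch-every-KeyError pattern.

```python
from collections.abc import MutableMapping


class StandInPOSKeyError(KeyError):
    """Hypothetical stand-in for POSKeyError: a KeyError subclass."""


class BrokenMapping(MutableMapping):
    """Toy mapping whose every lookup fails as if a node could not be loaded."""

    def __init__(self):
        self.writes = []

    def __getitem__(self, key):
        raise StandInPOSKeyError(key)

    def __setitem__(self, key, value):
        self.writes.append((key, value))

    def __delitem__(self, key):
        raise StandInPOSKeyError(key)

    def __iter__(self):
        return iter(())

    def __len__(self):
        return 0


def swallowing_operations(mapping, key, corruption_error):
    """Return the names of operations that finish without raising corruption_error."""
    operations = {
        "get": lambda: mapping.get(key),
        "in": lambda: key in mapping,
        "setdefault": lambda: mapping.setdefault(key, None),
        "pop": lambda: mapping.pop(key, None),
    }
    swallowed = []
    for name, operation in operations.items():
        try:
            operation()
        except corruption_error:
            continue  # the error was reported, as desired
        swallowed.append(name)
    return swallowed


print(swallowing_operations(BrokenMapping(), "k", StandInPOSKeyError))
```

For the toy mapping every operation is listed, because the mixin methods treat any KeyError as "key absent". Note that the probe calls `setdefault` and `pop`, so it may modify the mapping it is given and should only be pointed at throwaway data.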
