Skip to content

Savepoint commit logging is misleading (and potentially expensive) #282

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jamadden opened this issue Sep 26, 2019 · 3 comments · Fixed by #283
Closed

Savepoint commit logging is misleading (and potentially expensive) #282

jamadden opened this issue Sep 26, 2019 · 3 comments · Fixed by #283

Comments

@jamadden
Copy link
Member

When savepoints are in use, ZODB.Connection logs something like this (notice that's 80GB):

DEBUG [ZODB.Connection]] Committing savepoints of size 86970105856

ZODB/src/ZODB/Connection.py

Lines 1026 to 1033 in 2f8cc67

def _commit_savepoint(self, transaction):
"""Commit all changes made in savepoints and begin 2-phase commit
"""
src = self._savepoint_storage
self._storage = self._normal_storage
self._savepoint_storage = None
try:
self._log.debug("Committing savepoints of size %s", src.getSize())

Where src is a TmpStore. TmpStore copies the getSize() method from the underlying original storage, so what's being logged isn't the size of the savepoint, it's the size of the entire database.

ZODB/src/ZODB/Connection.py

Lines 1131 to 1141 in 2f8cc67

class TmpStore(object):
"""A storage-like thing to support savepoints."""
def __init__(self, storage):
self._storage = storage
for method in (
'getName', 'new_oid', 'getSize', 'sortKey',
'isReadOnly'
):
setattr(self, method, getattr(storage, method))

On RelStorage, in order to provide a consistent, up-to-date answer (and be usable no matter what else the state of the storage is) this always opens a new connection to the database, which can be slow.

https://github.com/zodb/relstorage/blob/2c8997d6c862de7e6a4d9f2a8d368a5471e93ff7/src/relstorage/adapters/postgresql/stats.py#L45-L50

Perhaps what we want to log is len(src)? That's the number of entries in the TmpStore.

@jmuchemb
Copy link
Member

Why not src.position ?

@jmuchemb
Copy link
Member

BTW, IStorage defines both __len__ & getSize so it's not consistent that TmpStore redirects only 1 of them to the underlying storage.

@jamadden
Copy link
Member Author

src.position includes extra header data for each object and I think in the event of multiple savepoints that write the same OID more than once it also includes duplicates, but I haven't looked that closely. Either way seems like an improvement.

jamadden added a commit that referenced this issue Sep 26, 2019
This is consistent with its __len__ and provides a better figure for existing logging.

Fixes #282
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants