Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix fork deadlock #299

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

arieleiz
Copy link

When a fork occurs while the sqlite db is mid-transaction, so sqlite transaction lock remains locked forever in the foree, deadlocking it.

This patch protects against this case using os.register_at_fork().

When a fork occurs while the sqlite db is mid-transaction,
so sqlite transaction lock remains locked forever in the foree,
deadlocking it.

This patch protects against this case using os.register_at_fork().
@arieleiz arieleiz mentioned this pull request Dec 24, 2023
Previous commit was on Windows after testing there too
Copy link
Owner

@grantjenks grantjenks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked deeply at the changes yet but thanks for the PR. I approved the CI workflow and asked a couple questions.

diskcache/core.py Outdated Show resolved Hide resolved
tests/test_fork_multithreading.py Outdated Show resolved Hide resolved
arieleiz and others added 2 commits January 18, 2024 11:35
Co-authored-by: Grant Jenks <grant.jenks@gmail.com>
_disk_remove = self._disk.remove
tid = threading.get_ident()
txn_id = self._txn_id
_acquireLock()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried that acquiring the lock at the start of _transact and releasing it at the end is holding it for too long. The yield in the middle of the context manager means the lock could be held for a long while. I haven't thought of a specific scenario that makes this problematic but I'm thinking about it.

The fork system call copies all memory but not all threads. Only the currently executing thread is copied to the forked process. So if one thread holds the SQLite transaction lock and another thread forks ... what's the problem? I don't see how the SQLite transaction remains forever locked.

try:
_lock.acquire()
except BaseException:
_lock.release()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could an exception occur here in which the lock is released twice?


def _after_at_fork_child_reinit_locks():
global _lock
_lock = threading.RLock()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why re-init the lock in the child? I would expect the child to already have a copy of the lock.

if thread.is_alive():
os.kill(pid, signal.SIGKILL)
thread.join()
assert False, "Deadlock detected."
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some more comments to explain how this test works? I appreciate the test a lot but am not sure how to follow it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants