Management Command Database Locking #219

Merged: 20 commits, Feb 10, 2022
5 changes: 2 additions & 3 deletions .github/workflows/test.yml
@@ -44,15 +44,14 @@ jobs:
        python-version: ['3.7', '3.8', '3.9', '3.10']
        django-version: ['2.2', '3.2', '4.0']
        xapian-version: ['1.4.18']
        filelock-version: ['3.4.2']
        exclude:
          # Django added python 3.10 support in 3.2.9
          - python-version: '3.10'
            django-version: '2.2'
            xapian-version: '1.4.18'
          # Django dropped python 3.7 support in 4.0
          - python-version: '3.7'
            django-version: '4.0'
            xapian-version: '1.4.18'

    steps:
      - name: Set up Python ${{ matrix.python-version }}
@@ -74,7 +73,7 @@ jobs:
      - name: Install Django and other Python dependencies
        run: |
          python -m pip install --upgrade pip
-         pip install django~=${{ matrix.django-version }} coveralls xapian*.whl
+         pip install django~=${{ matrix.django-version }} filelock~=${{ matrix.filelock-version }} coveralls xapian*.whl

      - name: Checkout django-haystack
        uses: actions/checkout@v2
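
For readers checking how many jobs this matrix produces, here is a rough Python sketch of the expansion. It assumes the documented GitHub Actions behaviour that an `exclude` entry removes every combination it partially matches; the `expand` helper is hypothetical and only mirrors the values shown in the hunk above.

```python
from itertools import product

# Matrix values copied from the workflow hunk above.
matrix = {
    "python-version": ["3.7", "3.8", "3.9", "3.10"],
    "django-version": ["2.2", "3.2", "4.0"],
    "xapian-version": ["1.4.18"],
    "filelock-version": ["3.4.2"],
}

# Exclude rules; a rule only needs to match the keys it names.
exclude = [
    {"python-version": "3.10", "django-version": "2.2"},
    {"python-version": "3.7", "django-version": "4.0"},
]


def expand(matrix, exclude):
    """Return every matrix combination that no exclude rule matches."""
    keys = list(matrix)
    jobs = [dict(zip(keys, values)) for values in product(*matrix.values())]
    return [
        job
        for job in jobs
        if not any(all(job[k] == v for k, v in rule.items()) for rule in exclude)
    ]


jobs = expand(matrix, exclude)
print(len(jobs))  # 12 combinations minus 2 excluded = 10
```

Because `filelock-version` has a single value, adding it as a matrix dimension does not change the number of jobs.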
2 changes: 2 additions & 0 deletions CHANGELOG.rst
@@ -6,6 +6,8 @@ Unreleased
----------

- Dropped support for Python 3.6.
- Fixed ``DatabaseLockError`` errors when running management commands with
  multiple workers.

v3.0.1 (2021-11-12)
-------------------
2 changes: 2 additions & 0 deletions README.rst
@@ -92,6 +92,8 @@ The backend has the following optional settings:
See `here <http://xapian.org/docs/apidoc/html/classXapian_1_1QueryParser.html#ac7dc3b55b6083bd3ff98fc8b2726c8fd>`__ for
more information about the different strategies.

- ``HAYSTACK_XAPIAN_USE_LOCKFILE``: Use a lockfile to prevent database locking errors when running management commands with multiple workers.
  Defaults to ``True``.
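
As an illustration of the new setting, a Django settings fragment might look like the sketch below. The `PATH` value is a placeholder, and turning the lockfile off is shown only to demonstrate the switch, not as a recommendation.

```python
# settings.py fragment (illustrative sketch)

HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "xapian_backend.XapianEngine",
        "PATH": "/path/to/xapian_index",  # placeholder index directory
    },
}

# Opt out of the lockfile added by this PR; when the setting is
# omitted the backend behaves as if it were True.
HAYSTACK_XAPIAN_USE_LOCKFILE = False
```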

Testing
-------
1 change: 1 addition & 0 deletions requirements.txt
@@ -1,2 +1,3 @@
Django>=2.2
Django-Haystack>=3.0
filelock>=3.4
1 change: 1 addition & 0 deletions setup.py
@@ -28,5 +28,6 @@ def read(fname):
    install_requires=[
        'django>=2.2',
        'django-haystack>=2.8.0',
        'filelock>=3.4',
    ]
)
19 changes: 19 additions & 0 deletions tests/xapian_tests/tests/test_management_commands.py
@@ -1,3 +1,5 @@
import sys
from io import StringIO
from unittest import TestCase

from django.core.management import call_command
@@ -82,3 +84,20 @@ def test_remove(self):
        # … but remove does:
        call_command("update_index", remove=True, verbosity=0)
        self.verify_indexed_document_count(self.NUM_BLOG_ENTRIES - 3)

    def test_multiprocessing(self):
        self.verify_indexed_document_count(0)

        old_stderr = sys.stderr
        sys.stderr = StringIO()
        call_command(
            "update_index",
            verbosity=2,
            workers=10,
            batchsize=2,
        )
Collaborator
According to https://docs.djangoproject.com/en/4.0/ref/django-admin/#output-redirection, you should be able to capture stderr with the stderr= param of call_command.

Contributor Author
I tried that and it didn't work.
It turns out that for that to work, the command implementer has to explicitly write its output through self.stderr, e.g. self.stderr.write("blah"), where self is the BaseCommand instance. And I think these messages aren't even coming from Python, but from the underlying xapian database implementation.

Contributor Author
And the underlying xapian database implementation is probably where all this file locking nonsense should be, but I digress.

Contributor Author
Also, this test only occasionally fails at the stderr assertion. Most of the time call_command() throws an exception and halts the test before the assert is reached.
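
To make the point in this thread concrete, here is a simplified model of why a `stderr=` argument only captures what a command writes through its own `self.stderr`. `MiniCommand` is a hypothetical stand-in written for this note, not Django's `BaseCommand`.

```python
import io
import sys


class MiniCommand:
    """Hypothetical stand-in for a management command."""

    def __init__(self, stderr=None):
        # Use the caller's stream if one was given, else the real stderr.
        self.stderr = sys.stderr if stderr is None else stderr

    def handle(self):
        # Written through the command's own stream: redirected.
        self.stderr.write("via self.stderr\n")
        # Written straight to the process stream: not redirected.
        print("via sys.stderr", file=sys.stderr)


captured = io.StringIO()
MiniCommand(stderr=captured).handle()
print(repr(captured.getvalue()))
```

Only the first message ends up in `captured`; anything emitted elsewhere, such as by worker processes or a C library, never passes through the command's stream.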

        err = sys.stderr.getvalue()
        sys.stderr = old_stderr
        print(err)
        self.assertNotIn("xapian.DatabaseLockError", err)
        self.verify_indexed_documents()
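
One possible refinement of the test above: the manual swap leaves `sys.stderr` replaced if `call_command()` raises. A sketch using the standard library's `contextlib.redirect_stderr`, which restores the stream on exit either way, could look like this. `noisy_command` is a hypothetical stand-in for the `call_command("update_index", ...)` call.

```python
import contextlib
import io
import sys


def noisy_command():
    """Hypothetical stand-in for call_command("update_index", ...)."""
    print("worker output", file=sys.stderr)


original = sys.stderr
buffer = io.StringIO()

# sys.stderr is swapped only inside the block and restored afterwards,
# even if the body raises.
with contextlib.redirect_stderr(buffer):
    noisy_command()

err = buffer.getvalue()
stream_restored = sys.stderr is original
```

Like the manual swap, this only replaces the Python-level `sys.stderr` object in the current process; output written directly to file descriptor 2 is not captured.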
30 changes: 30 additions & 0 deletions xapian_backend.py
@@ -1,5 +1,6 @@
import datetime
import pickle
from pathlib import Path
import os
import re
import shutil
@@ -8,6 +9,8 @@
from django.conf import settings
from django.core.exceptions import ImproperlyConfigured

from filelock import FileLock

from haystack import connections
from haystack.backends import BaseEngine, BaseSearchBackend, BaseSearchQuery, SearchNode, log_query
from haystack.constants import ID, DJANGO_ID, DJANGO_CT, DEFAULT_OPERATOR
@@ -73,6 +76,24 @@
# texts with positional information
TERMPOS_DISTANCE = 100


def filelocked(func):
    """Decorator to wrap a XapianSearchBackend method in a filelock."""

    def wrapper(self, *args, **kwargs):
        """Run the function inside a lock."""
        if self.path == MEMORY_DB_NAME or not self.use_lockfile:
            func(self, *args, **kwargs)
        else:
            lockfile = Path(self.filelock.lock_file)
            lockfile.parent.mkdir(parents=True, exist_ok=True)
            lockfile.touch()
            with self.filelock:
                func(self, *args, **kwargs)

    return wrapper


class InvalidIndexError(HaystackError):
"""Raised when an index can not be opened."""
pass
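
The locking decorator in this hunk follows a general pattern that can be sketched with the standard library alone. In the sketch below `threading.Lock` stands in for `FileLock` (both are context managers, but only a file lock works across processes), and `locked` and `Index` are hypothetical names. It also differs from the PR's decorator in two small ways: it uses `functools.wraps` and passes the wrapped method's return value through.

```python
import functools
import threading


def locked(func):
    """Run a method while holding ``self.lock`` (any context manager)."""

    @functools.wraps(func)  # keep the wrapped method's name and docstring
    def wrapper(self, *args, **kwargs):
        if not self.use_lock:
            return func(self, *args, **kwargs)
        with self.lock:
            return func(self, *args, **kwargs)

    return wrapper


class Index:
    """Hypothetical stand-in for the search backend."""

    def __init__(self, use_lock=True):
        self.use_lock = use_lock
        self.lock = threading.Lock()  # stand-in for a cross-process FileLock
        self.documents = []

    @locked
    def update(self, items):
        self.documents.extend(items)
        return len(self.documents)


index = Index()
threads = [threading.Thread(target=index.update, args=([i],)) for i in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

The early-return branch mirrors the PR's bypass for in-memory databases and for `HAYSTACK_XAPIAN_USE_LOCKFILE = False`.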
@@ -168,6 +189,9 @@ def __init__(self, connection_alias, **connection_options):

        Also sets the stemming language to be used to `language`.
        """
        self.use_lockfile = bool(
            getattr(settings, 'HAYSTACK_XAPIAN_USE_LOCKFILE', True)
        )
        super().__init__(connection_alias, **connection_options)

        if not 'PATH' in connection_options:
@@ -182,6 +206,10 @@ def __init__(self, connection_alias, **connection_options):
            except FileExistsError:
                pass

        if self.use_lockfile:
            lockfile = Path(self.path) / "lockfile"
            self.filelock = FileLock(lockfile)

        self.flags = connection_options.get('FLAGS', DEFAULT_XAPIAN_FLAGS)
        self.language = getattr(settings, 'HAYSTACK_XAPIAN_LANGUAGE', 'english')
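
The way `__init__` resolves the setting and places the lockfile can be tried in isolation. The sketch below uses `SimpleNamespace` as a stand-in for `django.conf.settings`; `lockfile_for` is a hypothetical helper, not part of the backend.

```python
from pathlib import Path
from types import SimpleNamespace


def lockfile_for(settings, index_path):
    """Return the lockfile path for an index, or None if locking is off."""
    # A missing setting counts as True, matching the default in the hunk above.
    use_lockfile = bool(getattr(settings, "HAYSTACK_XAPIAN_USE_LOCKFILE", True))
    if not use_lockfile:
        return None
    # The lockfile lives inside the index directory itself.
    return Path(index_path) / "lockfile"


default_settings = SimpleNamespace()
disabled_settings = SimpleNamespace(HAYSTACK_XAPIAN_USE_LOCKFILE=False)

print(lockfile_for(default_settings, "/path/to/xapian_index"))
print(lockfile_for(disabled_settings, "/path/to/xapian_index"))
```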

@@ -225,6 +253,7 @@ def column(self):
        self._update_cache()
        return self._columns

    @filelocked
    def update(self, index, iterable, commit=True):
        """
        Updates the `index` with any objects in `iterable` by adding/updating
@@ -476,6 +505,7 @@ def add_datetime_to_document(termpos, prefix, term, weight):
        finally:
            database.close()

    @filelocked
    def remove(self, obj, commit=True):
        """
        Remove indexes for `obj` from the database.