Error on prefetch when object is already deleted but Xapian index is not yet updated #122

Open
GoogleCodeExporter opened this issue Mar 16, 2015 · 6 comments

Comments

@GoogleCodeExporter

When prefetch is used and a hit refers to a model object that has already been 
deleted, Djapian raises a KeyError. Such a hit for a non-existent model object 
turns up when the Xapian index is out of date.

In our case, we call index --rebuild once a day. In between, objects are 
deleted by users. To prevent the resulting KeyError in search results, please 
change the following in resultset.py:

resultset.py, line 150+

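# Iterate over a copy so that removing from the cache is safe even if
# hits and self._resultset_cache are the same list.
for hit in list(hits):
    try:
        hit.instance = instances[hit.pk]
    except KeyError:
        self._resultset_cache.remove(hit)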

Instead of attaching instances[hit.pk] to hit.instance (which throws the actual 
error), we simply remove this hit from the resultset :)
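
The same idea can be tried outside Djapian with stand-in objects. This is only a sketch: `Hit`, `attach_instances`, and the `instances` dict are hypothetical placeholders for the real resultset internals, which are not shown in this report.

```python
class Hit(object):
    """Hypothetical stand-in for a Djapian search hit."""

    def __init__(self, pk):
        self.pk = pk
        self.instance = None


def attach_instances(hits, instances):
    """Attach prefetched objects to hits and drop hits whose object is gone."""
    kept = []
    for hit in hits:
        try:
            hit.instance = instances[hit.pk]
        except KeyError:
            # Stale index entry: the object is no longer in the database.
            continue
        kept.append(hit)
    return kept


hits = [Hit(1), Hit(2), Hit(3)]
instances = {1: "first", 3: "third"}  # pk 2 was deleted after indexing
live = attach_instances(hits, instances)
print([hit.pk for hit in live])
```

Building a new list sidesteps the question of whether a list is being modified while it is iterated.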

Cheers,
Simon

Original issue reported on code.google.com by i...@pagewizz.com on 12 Nov 2010 at 6:26

@GoogleCodeExporter

Yes,

I ran into this as well. But this is not a 100% solution when pagination is 
used: it only prevents the code from showing errors.
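
A small sketch of the pagination problem, using plain lists instead of Djapian. `search_page`, `index_pks`, and `live_pks` are hypothetical names; the point is only that dropping stale hits after slicing leaves the reported count and the page sizes inconsistent.

```python
def search_page(index_pks, live_pks, page, per_page):
    """Slice the stale index first, then drop hits missing from the DB."""
    start = (page - 1) * per_page
    window = index_pks[start:start + per_page]
    return [pk for pk in window if pk in live_pks]


index_pks = [1, 2, 3, 4, 5, 6]   # what the outdated index still returns
live_pks = {1, 4, 5, 6}          # what is actually left in the database

reported_count = len(index_pks)  # the count a paginator would see
page_one = search_page(index_pks, live_pks, page=1, per_page=3)
page_two = search_page(index_pks, live_pks, page=2, per_page=3)
print(reported_count, page_one, page_two)
```

Here the first page holds a single result while the second holds three, and the reported total is still six although only four objects exist.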

bye

Original comment by matu...@gmail.com on 24 Nov 2010 at 3:39

@GoogleCodeExporter

I think the true solution (which supports count() and pagination) would be 
to use a custom MatchDecider. Below are two examples of a custom MatchDecider 
for a Djapian-enabled search.

The first one is a naive DBExistsCompositeDecider, which just checks whether 
the match we got from the Xapian search backend still exists in the DB. It 
queries the DB once for every hit in the Xapian search results, so performance 
degrades quite a bit.

{{{
from django.core.exceptions import ObjectDoesNotExist
from djapian.decider import CompositeDecider


class DBExistsCompositeDecider(CompositeDecider):
    def __init__(self, model, *args, **kwargs):
        super(DBExistsCompositeDecider, self).__init__(model, *args, **kwargs)
        self.to_python = model._meta.pk.to_python
        self.objects = model._default_manager

    def __call__(self, document):
        # Let the parent decider apply its own checks first.
        res = super(DBExistsCompositeDecider, self).__call__(document)
        if not res:
            return res

        # Value slot 1 is read as the primary key.
        pk = self.to_python(document.get_value(1))
        try:
            self.objects.get(pk=pk)  # one DB query per checked document
        except ObjectDoesNotExist:
            return False

        return True
}}}

The second one uses Djapian's Change model (DB table) to check whether there 
are pending requests to update the search index after some objects have been 
deleted. The ChangesCompositeDecider keeps the object IDs of the known 
content_type (Django model) that are marked as 'deleted', so they are 
already gone from the DB but still exist in the Xapian index. The DB is hit 
once for each Django model used during the search (in 90% of cases that 
would be only once, but if you are using CompositeIndexer your mileage 
may vary).

{{{
from django.contrib.contenttypes.models import ContentType
from djapian.decider import CompositeDecider
from djapian.models import Change


class ChangesCompositeDecider(CompositeDecider):
    def __init__(self, model, *args, **kwargs):
        super(ChangesCompositeDecider, self).__init__(model, *args, **kwargs)
        self.to_python = model._meta.pk.to_python
        # Pending changes for this model whose action is Change.ACTIONS[2][0],
        # which is used here as the 'deleted' action code.
        deleted_ids = Change._default_manager.filter(
            content_type=ContentType.objects.get_for_model(model),
            action=Change.ACTIONS[2][0],
        ).values_list('object_id', flat=True)
        # A set gives fast membership tests and, unlike the result of map()
        # on Python 3, can be checked more than once.
        self._deleted = set(self.to_python(object_id) for object_id in deleted_ids)

    def __call__(self, document):
        res = super(ChangesCompositeDecider, self).__call__(document)
        if not res:
            return res

        pk = self.to_python(document.get_value(1))
        return pk not in self._deleted
}}}
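
To make the cost difference between the two deciders concrete, here is a rough simulation that does not use Djapian or Django at all. `FakeTable` and both filter functions are hypothetical stand-ins; the sketch only counts how many lookups each strategy issues.

```python
class FakeTable(object):
    """Hypothetical in-memory table that counts how often it is queried."""

    def __init__(self, live_pks, deleted_pks):
        self.live_pks = set(live_pks)
        self.deleted_pks = set(deleted_pks)
        self.queries = 0

    def exists(self, pk):
        self.queries += 1
        return pk in self.live_pks

    def pending_deletes(self):
        self.queries += 1
        return set(self.deleted_pks)


def filter_per_hit(table, hit_pks):
    # One lookup per hit, like the existence-check decider.
    return [pk for pk in hit_pks if table.exists(pk)]


def filter_with_snapshot(table, hit_pks):
    # One lookup up front, then in-memory membership tests only.
    deleted = table.pending_deletes()
    return [pk for pk in hit_pks if pk not in deleted]


hit_pks = [1, 2, 3, 4, 5]
naive = FakeTable(live_pks=[1, 3, 5], deleted_pks=[2, 4])
snapshot = FakeTable(live_pks=[1, 3, 5], deleted_pks=[2, 4])
naive_result = filter_per_hit(naive, hit_pks)
snapshot_result = filter_with_snapshot(snapshot, hit_pks)
print(naive_result, naive.queries)
print(snapshot_result, snapshot.queries)
```

Both strategies keep the same hits, but the per-hit version issues one lookup per hit while the snapshot version issues one in total. The snapshot is only as fresh as the moment it was taken.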

Well, the last thing to be done is to use the appropriate decider for our 
ModelIndexer class and that's it!

{{{
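from datetime import date

import djapian

# WeightenedIndexer and Poster are not defined in this snippet and must be
# imported from wherever they live; ChangesCompositeDecider is defined above.


class PosterIndexer(WeightenedIndexer):
    decider = ChangesCompositeDecider
    fields = ('show__name', 'show__description')

    # True for posters whose date_end is today or later.
    trigger = lambda indexer, obj: obj.date_end >= date.today()


# Registering models for the full-text search index (powered by djapian).
djapian.add_index(Poster, PosterIndexer, attach_as='indexer')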
}}}

We should consider including these (or similar) MatchDeciders in the Djapian 
distribution as examples and as handy tools for direct use.

Original comment by esizi...@gmail.com on 4 Mar 2011 at 2:26

@GoogleCodeExporter

BTW, the examples above do not support the CompositeIndexer use case directly. 
The update is possible and quite trivial, though.
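
One possible shape of such an update, sketched without Djapian: keep one set of deleted primary keys per model and consult the set that matches each hit. `DeletedRegistry` and the `(model_label, pk)` pairs are hypothetical; how a real decider would learn which model a document belongs to is not covered in this thread.

```python
class DeletedRegistry(object):
    """Hypothetical per-model lookup of primary keys pending deletion."""

    def __init__(self, deleted_pairs):
        self._deleted = {}
        for model_label, pk in deleted_pairs:
            self._deleted.setdefault(model_label, set()).add(pk)

    def is_live(self, model_label, pk):
        # Unknown models have no pending deletions recorded.
        return pk not in self._deleted.get(model_label, set())


registry = DeletedRegistry([("poster", 2), ("show", 7)])
hits = [("poster", 1), ("poster", 2), ("show", 2), ("show", 7)]
live_hits = [hit for hit in hits if registry.is_live(*hit)]
print(live_hits)
```

Keying by model matters because the same primary key value can be live in one model and deleted in another.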

Original comment by esizi...@gmail.com on 5 Mar 2011 at 8:25

@GoogleCodeExporter

In r384 a test case demonstrating the issue has been committed. I would prefer 
we implement a fix or any workaround as part of the default 'out of the box' 
solution for better user experience.

Original comment by esizi...@gmail.com on 21 Oct 2011 at 12:51

@GoogleCodeExporter

Original comment by esizi...@gmail.com on 24 Oct 2011 at 11:27

  • Changed state: Accepted

@GoogleCodeExporter

In r385 a trivial fix has been committed, needs a review.

It may be nice to notify somehow that not all results which were found by 
Xapian are really available.
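
A minimal sketch of what such a notification could be based on, again with plain Python stand-ins rather than the code committed in r385: return the number of unavailable hits alongside the usable ones. `split_hits` and its arguments are hypothetical names.

```python
def split_hits(hit_pks, instances):
    """Return the hits that still have an object, plus the count of the rest."""
    available = [pk for pk in hit_pks if pk in instances]
    missing = len(hit_pks) - len(available)
    return available, missing


available, missing = split_hits([1, 2, 3], {1: "first", 3: "third"})
if missing:
    print("%d result(s) found by the index are no longer available" % missing)
```

The caller can then decide whether to show a message, log the mismatch, or schedule an index update.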

Original comment by esizi...@gmail.com on 26 Oct 2011 at 10:47
