Support for PostgreSQL to_tsvector function with multiple fields #884

Charnelx · 2018-03-09T09:37:32Z

Hi.
I added SearchVectorFilter to support the to_tsvector function for full-text search. There are two problems I faced:

this is vendor-specific (yes, PostgreSQL only)
to search by multiple fields we need to add one more argument (e.g. field_names) or reuse the existing field_name with an additional type check

Pros:

fast search
search by multiple fields at once

Cons:

works only with PostgreSQL
dual meaning of the field_name argument, which may lead to misunderstanding and mistakes

Use case:

models.py

class Person(models.Model):

    name = models.CharField(max_length=128, verbose_name='person name')
    surname = models.CharField(max_length=128, verbose_name='person surname')

views.py

class PersonFilter(filters.FilterSet):
    person = SearchVectorFilter(field_names=['name', 'surname'], lookup_expr='icontains')

rpkilby

Hi. I've added more detailed feedback below, but in short, overloading field_name isn't the correct thing to do here. Instead, extend the Filter to use additional arguments.

Also, tests will need to be provided.

rpkilby · 2018-03-12T10:02:59Z

django_filters/utils.py

@@ -78,8 +78,19 @@ def get_field_parts(model, field_name):
        >>> [p.verbose_name for p in parts]
        ['author', 'first name']

+        >>> parts = get_field_parts(Book, [author__first_name, 'surname'])
+        >>> [p.verbose_name for p in parts]
+        ['author', 'first_name', 'surname']


get_field_parts is not intended to return a list of field names, but a list of the individual parts of a related field name (FK, m2m, 1to1). This result would be nonsensical, as first_name is a regular field on the author model, and does not represent a relationship to an additional Name model or something.
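To illustrate the point about traversal, here is a minimal sketch that assumes a toy dictionary schema in place of Django model metadata. `SCHEMA` and `toy_field_parts` are hypothetical names and not part of django-filter; the sketch only shows that one lookup path is split into the fields it passes through, not that several field names are merged into one list.

```python
# Toy schema standing in for Django model metadata (hypothetical, not the
# django-filter implementation). Each model maps a field name to a
# (verbose_name, related_model_or_None) pair.
SCHEMA = {
    "Book": {"title": ("title", None), "author": ("author", "Author")},
    "Author": {"first_name": ("first name", None)},
}


def toy_field_parts(model, field_name):
    """Return the verbose name of each field traversed along ONE lookup path."""
    parts = []
    for name in field_name.split("__"):
        verbose_name, related = SCHEMA[model][name]
        parts.append(verbose_name)
        # Follow the relationship, if any, so the next part is resolved
        # on the related model.
        model = related
    return parts


print(toy_field_parts("Book", "author__first_name"))
# ['author', 'first name']
```

In this sketch `first_name` only resolves because `author` was traversed first, which is why appending an unrelated name such as `surname` to the same result would not describe a path.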

Either way, as explained above, overloading field_name is not the correct path, and these changes are a symptom of that. I would revert these changes.

You're right. I read the method description about "the traversable relationships" and... ignored it.

rpkilby · 2018-03-12T10:10:26Z

django_filters/filters.py

@@ -484,6 +486,38 @@ class TimeRangeFilter(RangeFilter):
    field_class = TimeRangeField


+class SearchVectorFilter(Filter):


I would inherit CharFilter, as you are validating text.

Reasonable, and thus accepted.

rpkilby · 2018-03-12T10:20:55Z

django_filters/filters.py

@@ -484,6 +486,38 @@ class TimeRangeFilter(RangeFilter):
    field_class = TimeRangeField


+class SearchVectorFilter(Filter):
+


Instead of overloading the purpose of field_name, create a custom __init__ that requires users to pass in a list of search fields. eg,

# Note that this provides a default `field_name` to be used for the annotation
# and query, and that `search_fields` will be a required keyword argument.
def __init__(self, field_name='search_vector', lookup_expr='exact', *, search_fields, **kwargs):
    super().__init__(field_name, lookup_expr, **kwargs)
    self.search_fields = search_fields

From here, it's really simple to overload the .filter() method to add in the annotation

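from django.contrib.postgres.search import SearchVector

def filter(self, qs, value):
    # Note that the annotation is operating on the filter's `field_name`, which was
    # set above to 'search_vector' as the filter's default value.
    # The name is dynamic, so it is passed via dict unpacking, and the result is
    # reassigned because annotate() returns a new queryset.
    qs = qs.annotate(**{self.field_name: SearchVector(*self.search_fields)})
    return super().filter(qs, value)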
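The two ideas in this comment (a required keyword-only `search_fields` argument, and an annotation stored under the filter's `field_name`) can be tried without a database. The sketch below is only an illustration: `FakeQuerySet` and `FakeFilter` are hypothetical stand-ins for Django's `QuerySet` and django-filter's `Filter`, and a plain tuple stands in for `SearchVector(...)`. Because a keyword name in a Python call must be a literal identifier, the dynamic annotation name is passed through dict unpacking.

```python
class FakeQuerySet:
    """Hypothetical stand-in for a Django QuerySet; it only records annotations."""

    def __init__(self, annotations=None):
        self.annotations = dict(annotations or {})

    def annotate(self, **kwargs):
        # Like QuerySet.annotate(), return a new object instead of mutating
        # in place, so the caller has to keep the return value.
        return FakeQuerySet({**self.annotations, **kwargs})


class FakeFilter:
    """Hypothetical stand-in for django_filters.Filter."""

    def __init__(self, field_name=None, lookup_expr="exact", **kwargs):
        self.field_name = field_name
        self.lookup_expr = lookup_expr

    def filter(self, qs, value):
        # The real Filter would apply the lookup here; the stand-in does not.
        return qs


class SearchVectorFilter(FakeFilter):
    # The bare `*` makes `search_fields` keyword-only; with no default
    # value it is also required.
    def __init__(self, field_name="search_vector", lookup_expr="exact",
                 *, search_fields, **kwargs):
        super().__init__(field_name, lookup_expr, **kwargs)
        self.search_fields = search_fields

    def filter(self, qs, value):
        # A tuple stands in for SearchVector(*self.search_fields).
        qs = qs.annotate(**{self.field_name: tuple(self.search_fields)})
        return super().filter(qs, value)


f = SearchVectorFilter(search_fields=["name", "surname"])
result = f.filter(FakeQuerySet(), "john")
print(result.annotations)
# {'search_vector': ('name', 'surname')}
```

Calling `SearchVectorFilter()` without `search_fields` raises a `TypeError`, so the list of fields cannot be silently omitted and `field_name` keeps its single meaning.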

Yes, this was the way it should be implemented. Thanks.

Charnelx · 2018-03-12T12:19:48Z

@rpkilby many thanks for the detailed response!
Adding an additional argument to Filter was my first idea, but I thought I could make something that would not require any external changes at all. Now I see why that approach fails.

I'll take your remarks into account, make another attempt, and write tests.

rpkilby requested changes Mar 12, 2018

Charnelx closed this Mar 12, 2018

Charnelx force-pushed the master branch from 762da93 to a88c2bc Compare March 12, 2018 12:27

Charnelx mentioned this pull request Mar 12, 2018

Support for PostgreSQL to_tsvector function with multiple fields #885

Open
