[BUG] `Text` query doesn't work with `fetch_links=True` #606

humbertoortuzar · 2023-07-05T14:01:35Z

Describe the bug
Beanie @ 1.20.0
fetch_links=True works now with a .count() and .aggregate(...), but...
When querying a document with links and a $text statement, the query fails because in the pipeline prefetched lookups are built before the {$match: {$text: ... }}. stage. Or at least when using build_aggregation_pipeline

def build_aggregation_pipeline(self):
    aggregation_pipeline: List[Dict[str, Any]] = construct_lookup_queries(
        self.document_model
    )

    aggregation_pipeline.append({"$match": self.get_filter_query()})

    sort_pipeline = {"$sort": {i[0]: i[1] for i in self.sort_expressions}}
    if sort_pipeline["$sort"]:
        aggregation_pipeline.append(sort_pipeline)
    if self.skip_number != 0:
        aggregation_pipeline.append({"$skip": self.skip_number})
    if self.limit_number != 0:
        aggregation_pipeline.append({"$limit": self.limit_number})
    return aggregation_pipeline

So when you have a .find(Text(search_query), fetch_links=True) you get this error:

To Reproduce

class ModelB(Document):
    some_field: str = Field(..., description="A field")
    other_field: Optional[str] = Field(None, description="Another field")

class ModelA(Document):
    name: Indexed(str, pymongo.TEXT) = Field(
        ..., description="Name of the ModelA instance."
    )

    model_b: Optional[Link[ModelB]] = Field(
        None, description="Reference to ModelA."
    )

    is_deleted: StrictBool = Field(False, description="Indicates if it's soft deleted")

query_result = await ModelA.find(Text("some text"), fetch_links=True).to_list()

Expected behavior
It should return a result instead of an error. To do this, {$match: {$text: ...}} queries should be included before the construct_lookup_queries(self.document_model)

Additional context
If anyone else is facing the same, what I'm doing right now is going inside the package to the function mentioned before and printing the stages needed to fetch your links and adding it to a .aggregate() query. If you also need to do a $match, you can add that stage to your pipeline.

In terms of the current beanie code, I was able to fix my case (multiple filters, fetch_links, and an aggregate pipeline) by doing this:

def build_aggregation_pipeline(self):
    aggregation_pipeline: List[Dict[str, Any]] = []
    filter_query = self.get_filter_query()
    text_queries = [match_case for match_case in filter_query.get("$and", filter_query) if "$text" in match_case]
    other_queries = [match_case for match_case in filter_query.get("$and", filter_query) if "$text" not in match_case]
    aggregation_pipeline.append({"$match": {"$and": text_queries}})
    aggregation_pipeline += construct_lookup_queries(self.document_model)
    aggregation_pipeline.append({"$match": {"$and": other_queries}})

    sort_pipeline = {"$sort": {i[0]: i[1] for i in self.sort_expressions}}
    ...

Disclaimer: This won't work to do $text queries in a linked document, but since $text must be declared first in the pipeline, you'd be better by querying the linked document and using a backlink 👍 .

The text was updated successfully, but these errors were encountered:

roman-right · 2023-07-05T19:50:44Z

Hi @humbertoortuzar ,
You are right,
I'll fix this soon. Thank you very much for the catch!

roman-right · 2023-08-21T21:41:07Z

This bug is fixed in #669. Please try

gsakkis · 2023-09-16T20:02:01Z

@roman-right build_aggregation_pipeline is also called in FindMany.aggregate but only if fetch_links is True. So this fix for $text does not apply for aggregates with fetch_links=False. The same holds for other FindMany stages ($sort, $skip, $limit). It's not obvious to me why these are not taken into account when fetch_links=False so perhaps there is a more general bug here but I haven't tested it.

github-actions · 2023-10-17T01:48:39Z

This issue is stale because it has been open 30 days with no activity.

MrEarle · 2023-10-18T19:17:57Z

I believe this is still an issue. I'm using beanie 1.23.

As far as I managed to understand, I believe this happens because in build_aggregation_pipeline it does this:

        filter_query = self.get_filter_query()
        if "$text" in filter_query:
            ...

But get_filter_query joins all filters under an $and statement, so the filter_query looks like {'$and': [{'$text': ...}]}, resulting in "$text" in filter_query being false.

Update:

I managed to get it to work with a very similar suggestion than what @humbertoortuzar said. I replaced the build_aggregation_pipeline with:

        aggregation_pipeline: list[dict[str, Any]] = construct_lookup_queries(
            self.document_model
        )
        filter_query = self.get_filter_query()

        if filter_query:
            text_queries = [
                match_case
                for match_case in filter_query.get("$and", [filter_query])
                if "$text" in match_case
            ]
            non_text_queries = [
                match_case
                for match_case in filter_query.get("$and", [filter_query])
                if "$text" not in match_case
            ]

            if text_queries:
                aggregation_pipeline.insert(0, {"$match": {"$and": text_queries}})

            if non_text_queries:
                aggregation_pipeline.append({"$match": {"$and": non_text_queries}})

        if extra_stages:
            aggregation_pipeline.extend(extra_stages)

        sort_pipeline = {"$sort": {i[0]: i[1] for i in self.sort_expressions}}
        if sort_pipeline["$sort"]:
            aggregation_pipeline.append(sort_pipeline)
        if self.skip_number != 0:
            aggregation_pipeline.append({"$skip": self.skip_number})
        if self.limit_number != 0:
            aggregation_pipeline.append({"$limit": self.limit_number})

        return aggregation_pipeline

roman-right · 2023-10-18T22:02:08Z

Thank you. It will be fixed on the next bug-fixing session.

MrEarle · 2024-01-07T18:01:43Z

I believe this is fixed for aggregate, but I missed that count didn't use the same code in its implementation. I'll draft a PR to fix the count function.

(count still doesn't work with text queries and fetch_links=True)

roman-right added the bug Something isn't working label Jul 5, 2023

roman-right removed the bug Something isn't working label Aug 21, 2023

github-actions bot added the Stale label Oct 17, 2023

roman-right added bug Something isn't working and removed Stale labels Oct 18, 2023

MrEarle mentioned this issue Oct 19, 2023

Fix aggregations with text queries #752

Merged

MrEarle mentioned this issue Jan 7, 2024

fix: count with text queries and links #826

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] `Text` query doesn't work with `fetch_links=True` #606

[BUG] `Text` query doesn't work with `fetch_links=True` #606

humbertoortuzar commented Jul 5, 2023

roman-right commented Jul 5, 2023

roman-right commented Aug 21, 2023

gsakkis commented Sep 16, 2023

github-actions bot commented Oct 17, 2023

MrEarle commented Oct 18, 2023 •

edited

Loading

roman-right commented Oct 18, 2023

MrEarle commented Jan 7, 2024

[BUG] Text query doesn't work with fetch_links=True #606

[BUG] Text query doesn't work with fetch_links=True #606

Comments

humbertoortuzar commented Jul 5, 2023

roman-right commented Jul 5, 2023

roman-right commented Aug 21, 2023

gsakkis commented Sep 16, 2023

github-actions bot commented Oct 17, 2023

MrEarle commented Oct 18, 2023 • edited Loading

roman-right commented Oct 18, 2023

MrEarle commented Jan 7, 2024

[BUG] `Text` query doesn't work with `fetch_links=True` #606

[BUG] `Text` query doesn't work with `fetch_links=True` #606

MrEarle commented Oct 18, 2023 •

edited

Loading