Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Text query doesn't work with fetch_links=True #606

Open
humbertoortuzar opened this issue Jul 5, 2023 · 7 comments
Open

[BUG] Text query doesn't work with fetch_links=True #606

humbertoortuzar opened this issue Jul 5, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@humbertoortuzar
Copy link

Describe the bug
Beanie @ 1.20.0
fetch_links=True works now with a .count() and .aggregate(...), but...
When querying a document with links and a $text statement, the query fails because in the pipeline prefetched lookups are built before the {$match: {$text: ... }}. stage. Or at least when using build_aggregation_pipeline

def build_aggregation_pipeline(self):
    aggregation_pipeline: List[Dict[str, Any]] = construct_lookup_queries(
        self.document_model
    )

    aggregation_pipeline.append({"$match": self.get_filter_query()})

    sort_pipeline = {"$sort": {i[0]: i[1] for i in self.sort_expressions}}
    if sort_pipeline["$sort"]:
        aggregation_pipeline.append(sort_pipeline)
    if self.skip_number != 0:
        aggregation_pipeline.append({"$skip": self.skip_number})
    if self.limit_number != 0:
        aggregation_pipeline.append({"$limit": self.limit_number})
    return aggregation_pipeline

So when you have a .find(Text(search_query), fetch_links=True) you get this error:
image

To Reproduce

class ModelB(Document):
    some_field: str = Field(..., description="A field")
    other_field: Optional[str] = Field(None, description="Another field")

class ModelA(Document):
    name: Indexed(str, pymongo.TEXT) = Field(
        ..., description="Name of the ModelA instance."
    )

    model_b: Optional[Link[ModelB]] = Field(
        None, description="Reference to ModelA."
    )

    is_deleted: StrictBool = Field(False, description="Indicates if it's soft deleted")

query_result = await ModelA.find(Text("some text"), fetch_links=True).to_list()

Expected behavior
It should return a result instead of an error. To do this, {$match: {$text: ...}} queries should be included before the construct_lookup_queries(self.document_model)

Additional context
If anyone else is facing the same, what I'm doing right now is going inside the package to the function mentioned before and printing the stages needed to fetch your links and adding it to a .aggregate() query. If you also need to do a $match, you can add that stage to your pipeline.

In terms of the current beanie code, I was able to fix my case (multiple filters, fetch_links, and an aggregate pipeline) by doing this:

def build_aggregation_pipeline(self):
    aggregation_pipeline: List[Dict[str, Any]] = []
    filter_query = self.get_filter_query()
    text_queries = [match_case for match_case in filter_query.get("$and", filter_query) if "$text" in match_case]
    other_queries = [match_case for match_case in filter_query.get("$and", filter_query) if "$text" not in match_case]
    aggregation_pipeline.append({"$match": {"$and": text_queries}})
    aggregation_pipeline += construct_lookup_queries(self.document_model)
    aggregation_pipeline.append({"$match": {"$and": other_queries}})

    sort_pipeline = {"$sort": {i[0]: i[1] for i in self.sort_expressions}}
    ...

Disclaimer: This won't work to do $text queries in a linked document, but since $text must be declared first in the pipeline, you'd be better by querying the linked document and using a backlink 👍 .

@roman-right
Copy link
Member

Hi @humbertoortuzar ,
You are right,
I'll fix this soon. Thank you very much for the catch!

@roman-right roman-right added the bug Something isn't working label Jul 5, 2023
@roman-right
Copy link
Member

This bug is fixed in #669. Please try

@roman-right roman-right removed the bug Something isn't working label Aug 21, 2023
@gsakkis
Copy link
Contributor

gsakkis commented Sep 16, 2023

@roman-right build_aggregation_pipeline is also called in FindMany.aggregate but only if fetch_links is True. So this fix for $text does not apply for aggregates with fetch_links=False. The same holds for other FindMany stages ($sort, $skip, $limit). It's not obvious to me why these are not taken into account when fetch_links=False so perhaps there is a more general bug here but I haven't tested it.

@github-actions
Copy link
Contributor

This issue is stale because it has been open 30 days with no activity.

@github-actions github-actions bot added the Stale label Oct 17, 2023
@MrEarle
Copy link
Contributor

MrEarle commented Oct 18, 2023

I believe this is still an issue. I'm using beanie 1.23.

As far as I managed to understand, I believe this happens because in build_aggregation_pipeline it does this:

        filter_query = self.get_filter_query()
        if "$text" in filter_query:
            ...

But get_filter_query joins all filters under an $and statement, so the filter_query looks like {'$and': [{'$text': ...}]}, resulting in "$text" in filter_query being false.

Update:

I managed to get it to work with a very similar suggestion than what @humbertoortuzar said. I replaced the build_aggregation_pipeline with:

        aggregation_pipeline: list[dict[str, Any]] = construct_lookup_queries(
            self.document_model
        )
        filter_query = self.get_filter_query()

        if filter_query:
            text_queries = [
                match_case
                for match_case in filter_query.get("$and", [filter_query])
                if "$text" in match_case
            ]
            non_text_queries = [
                match_case
                for match_case in filter_query.get("$and", [filter_query])
                if "$text" not in match_case
            ]

            if text_queries:
                aggregation_pipeline.insert(0, {"$match": {"$and": text_queries}})

            if non_text_queries:
                aggregation_pipeline.append({"$match": {"$and": non_text_queries}})

        if extra_stages:
            aggregation_pipeline.extend(extra_stages)

        sort_pipeline = {"$sort": {i[0]: i[1] for i in self.sort_expressions}}
        if sort_pipeline["$sort"]:
            aggregation_pipeline.append(sort_pipeline)
        if self.skip_number != 0:
            aggregation_pipeline.append({"$skip": self.skip_number})
        if self.limit_number != 0:
            aggregation_pipeline.append({"$limit": self.limit_number})

        return aggregation_pipeline

@roman-right roman-right added bug Something isn't working and removed Stale labels Oct 18, 2023
@roman-right
Copy link
Member

Thank you. It will be fixed on the next bug-fixing session.

@MrEarle
Copy link
Contributor

MrEarle commented Jan 7, 2024

I believe this is fixed for aggregate, but I missed that count didn't use the same code in its implementation. I'll draft a PR to fix the count function.

(count still doesn't work with text queries and fetch_links=True)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants