Min score for vector learning resources endpoint #3285
OpenAPI Changes: 1 change (0 errors, 0 warnings, 1 info). Unexpected changes? Ensure your branch is up-to-date with
Pull request overview
Adds a score_cutoff query parameter to the v0 vector learning-resources search endpoint to apply a minimum similarity threshold when a query string is present, and adjusts the Search UI behavior so facet filtering/counts remain coherent when the endpoint returns “all hits above cutoff” (no pagination).
Changes:
- Backend: introduce score_cutoff -> Qdrant score_threshold, and when q + cutoff are present, bypass pagination by fetching all hits above the cutoff and recomputing facet counts from the returned hits.
- Frontend: for vector searches with a query, request an “unfaceted” vector search and apply facet filtering + facet counts locally; hide pagination in this mode.
- API/docs/tests: update request serializer + OpenAPI spec + generated v0 client, and add/adjust tests for the new behavior.
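The backend control flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in vector_search/views.py: the function name `build_search_kwargs`, the `paginated` flag, and the `FETCH_ALL_HARD_LIMIT` value are assumptions made for the example, and treating a cutoff of 0 as "keep paginating" follows the behavior discussed later in the review.

```python
from typing import Any, Optional

# Hypothetical cap for the "fetch every hit above the cutoff" branch.
FETCH_ALL_HARD_LIMIT = 1000


def build_search_kwargs(
    q: Optional[str],
    score_cutoff: float,
    limit: int,
    offset: int,
    hard_limit: int = FETCH_ALL_HARD_LIMIT,
) -> dict[str, Any]:
    """Return the paging and threshold arguments for one vector search."""
    if q and score_cutoff > 0:
        # Query + positive cutoff: ignore the requested page and fetch
        # every hit scoring above the cutoff, up to the hard cap.
        return {
            "limit": hard_limit,
            "offset": 0,
            "score_threshold": score_cutoff,
            "paginated": False,
        }
    # No query, or cutoff disabled: ordinary paginated search.
    return {
        "limit": limit,
        "offset": offset,
        "score_threshold": None,
        "paginated": True,
    }
```

With these assumptions, `build_search_kwargs("biology", 0.1, 10, 20)` drops the requested offset and forwards the cutoff as `score_threshold`, while a cutoff of 0 leaves pagination untouched.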
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| vector_search/views.py | Implements score_cutoff, changes query-with-cutoff behavior to fetch all hits and recompute aggregations from returned hits. |
| vector_search/views_test.py | Updates pagination test to disable cutoff; adds test asserting cutoff bypasses pagination and forwards score_threshold. |
| vector_search/utils_test.py | Updates hybrid-search test expectations to account for new cutoff-driven control flow. |
| vector_search/serializers.py | Adds score_cutoff request field (default 0.1). |
| openapi/specs/v0.yaml | Documents score_cutoff in the v0 vector search endpoint. |
| frontends/main/src/page-components/SearchDisplay/SearchDisplay.tsx | Implements local facet filtering/counts + pagination suppression for vector query searches. |
| frontends/main/src/app-pages/SearchPage/SearchPage.test.tsx | Adds UI test asserting pagination hidden + local facet filtering avoids extra API calls. |
| frontends/api/src/generated/v0/api.ts | Regenerates client types/params to include score_cutoff. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
abeglova left a comment
I'm not sure how to review this. I don't think this is a good approach. Both the manual filtering in the UI and the manual facet aggregation seem problematic, and there doesn't seem to be an easy fix that keeps the current approach. The current score threshold is high enough that lots of relevant resources are just not returned if you make a query that should have lots of results, like "AI". If you lower the threshold and some queries return more results than fit on a page, the filtering does not work correctly. I think realistically we need to either do something else or rethink how facets work
  return true
}
const resourceValues = getResourceFacetValues(resource, facet)
return selectedValues.some((value) => resourceValues.includes(value))
Why are you filtering in the UI and not the backend? I think manually filtering for facets is a bad idea in general, but at least it should be done in the backend so the results are cacheable. Also, filtering the results in Python wouldn't be fast, but at least it would be faster than filtering in the UI code.
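For reference, a backend version of the same check might look like the sketch below. It assumes each hit is a plain mapping from facet name to a list of values and that selections combine as OR within a facet and AND across facets; the helper names and the field names in the usage note are hypothetical, not taken from the PR.

```python
from typing import Iterable, Mapping, Sequence


def matches_selected_facets(
    resource: Mapping[str, Iterable[str]],
    selected: Mapping[str, Sequence[str]],
) -> bool:
    """True if the resource satisfies every facet that has a selection."""
    for facet, selected_values in selected.items():
        if not selected_values:
            # Nothing selected for this facet, so it does not constrain.
            continue
        resource_values = set(resource.get(facet, ()))
        # OR within a facet: any one selected value is enough.
        if not any(value in resource_values for value in selected_values):
            return False
    return True


def filter_hits(
    hits: Sequence[Mapping[str, Iterable[str]]],
    selected: Mapping[str, Sequence[str]],
) -> list:
    """Keep only the hits matching the selected facet values."""
    return [hit for hit in hits if matches_selected_facets(hit, selected)]
```

For example, `filter_hits(hits, {"topic": ["Biology"], "resource_type": []})` keeps only hits whose hypothetical `topic` field contains "Biology" and ignores the empty `resource_type` selection.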
Also, with the 0.1 cutoff I never had a query return more than 15 or so results locally, even for very broad searches like "science". The facets worked as expected, but many relevant results were not returned. When I lowered the cutoff so that some queries would return more results than fit on a page, Qdrant would still limit the number of results returned, and some facets would report more results than were shown, because the remaining results did not fit on the first page.
I tried again with score_cutoff 0.00001 and got the expected results. I previously tried setting score_cutoff=0, which caused the backend to still paginate while the UI filtered.
I went this route to get the interface and facet counts to line up (regardless of Qdrant/OpenSearch endpoint), and to work around some fundamental quirks in how Qdrant applies filtering as well as a limitation of its facet API (I left some notes on this here).
OpenSearch:
Search “Biology” -> I get a stable set of matching docs -> compute facets over that same set -> clicking a facet ("Earth Science" etc.) narrows that same set
Qdrant:
Search “Biology” -> I get the best candidates -> fuse/rank them -> cut by score/limit -> apply filters over the surviving candidates
- Qdrant's facet API does not support hybrid/complex searches involving a pre-filter or passing in a score (presumably it would be just as expensive as performing the entire search and manually aggregating counts).
- This was a non-issue without the score cutoff, since all searches would yield all results.
- Qdrant performs neither a pre-filter nor a post-filter (do the search first, then apply the filters) like OpenSearch.
- It traverses a filterable HNSW as it performs the search. To further confound this, Qdrant also dynamically selects how it determines the filter counts depending on the cardinality of the field(s) being filtered.
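The manual aggregation mentioned above ("compute facets over that same set") amounts to counting facet values over the hits that survived the cutoff. A minimal sketch, assuming each hit is a mapping from facet name to a list of values (the function name and the field names in the usage note are illustrative, not the PR's):

```python
from collections import Counter
from typing import Iterable, Mapping, Sequence


def facet_counts(
    hits: Sequence[Mapping[str, Iterable[str]]],
    facets: Sequence[str],
) -> dict:
    """Count how many hits carry each value, per requested facet."""
    counts = {facet: Counter() for facet in facets}
    for hit in hits:
        for facet in facets:
            # set() so a hit is counted once per value even if the
            # value is repeated in its list.
            counts[facet].update(set(hit.get(facet, ())))
    return {facet: dict(counter) for facet, counter in counts.items()}
```

Because the counts are derived from exactly the hits being displayed, they stay consistent with the result list, at the cost of having to fetch the whole hit set first.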
/>
)}
/>
{!isVectorQuerySearch && (
What is the plan for this long term? What should happen to searches that have more results than can fit on a page?
As a first step, to make it easier to iterate on vector-specific logic/interface, my plan was to properly subclass and separate out the vector-specific logic on the frontend into its own component as part of this ticket.
Long term we could consider:
- what it currently does: limit the number of results by using a threshold score when dealing with the "brute force branch" (query is present + filters are applied + there is a score_cutoff > 0). It's a non-issue if we are just dealing with the learning resources collection and the cutoff is appropriate (we are not dealing with thousands of results after applying a specific query + filter + score_cutoff). If we present the user with more than 10 results after the query + filter + score_cutoff filters, given the size of our learning resource catalog, there is likely something else wrong IMO.
- changing the design of the search page/facet counts: consider hiding counts once a query is present, or even altogether
Long term I agree with what you said: "I think realistically we need to either do something else or rethink how facets work". This is a first dip (admin-only, to see how it works or doesn't).
@abeglova This is ready for another look. I resolved the min-score = 0 issue you spotted, in addition to adding an absolute hard limit for the case where we need to fetch all results without pagination.
score_cutoff = serializers.FloatField(
    required=False,
    default=settings.VECTOR_SEARCH_MIN_SCORE,
    min_value=settings.VECTOR_SEARCH_MIN_SCORE,
Can you remove this from the serializer now that this is an environment variable? Or I think the better thing to do is to add it to the search params, similar to the other admin variables.
Also why is min_value the same as default?
I was planning on factoring it out to be configurable in the next feature that builds on top of this. What do you think about setting min_value=0 and default=0.1 for now?
In terms of removing this: I think we may need to adjust this (have the frontend pass it in) depending on the interface (channel pages may need a slightly different score to be effective), just because of how this applies in Qdrant. I'm going to move this to constants instead of a variable setting.
type: number
format: double
minimum: 0.1
default: 0.1
This will be incorrect if VECTOR_SEARCH_MIN_SCORE changes.
That's a good point. I've suggested hard-coded values here for now.
What are the relevant tickets?
Closes https://github.com/mitodl/hq/issues/11079
Description (What does it do?)
This PR adds a score_cutoff parameter to the vector learning resources endpoint that controls the minimum threshold score for results that are returned when there is a query string present.
There are some quirks in how this had to be implemented to get the facet counts to behave as expected (documented here)
How this works
when the "hybrid search" admin option is enabled
How can this be tested?
python manage.py generate_embeddings --courses --skip-contentfiles