use qdrant vectors in hybrid search #2718

abeglova · 2025-11-20T14:47:12Z

What are the relevant tickets?

closes https://github.com/mitodl/hq/issues/9148

Description (What does it do?)

This PR updates the hybrid search index in OpenSearch to reuse vectors from Qdrant. Hybrid search is currently behind a flag (search_mode=hybrid).

How can this be tested?

Set OPENAI_API_KEY to the value from RC, and set
QDRANT_ENCODER=vector_search.encoders.litellm.LiteLLMEncoder

Run docker-compose run web ./manage.py generate_embeddings --skip-contentfiles to ensure you have resource embeddings in Qdrant.

Run docker-compose run web ./manage.py recreate_index --combined_hybrid

Go to
http://open.odl.local:8062/search

Search for "intro to ai course". You should get no results. Apart from that, search with search_mode not set should behave normally.

Select search_mode=hybrid from the admin options panel (login required), or just add it to the URL.

Search for "intro to ai course" again. You should now get results. Facets and base search should also still work.
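For the "just add it to the url" route, the two URLs being compared can be built as in the rough sketch below. The base URL and the search_mode parameter come from the steps above; the "q" key for the search text and the helper name are assumptions for illustration only, so adjust them to whatever the search page actually uses.

```python
from urllib.parse import urlencode

# Base URL taken from the testing steps above.
SEARCH_URL = "http://open.odl.local:8062/search"


def build_search_url(text, search_mode=None):
    """Build a search page URL, optionally with the hybrid flag set.

    Assumption: the search text is passed under the "q" key.
    """
    params = {"q": text}
    if search_mode is not None:
        params["search_mode"] = search_mode
    return f"{SEARCH_URL}?{urlencode(params)}"


# Default search (flag not set) versus hybrid search.
default_url = build_search_url("intro to ai course")
hybrid_url = build_search_url("intro to ai course", search_mode="hybrid")
print(default_url)
print(hybrid_url)
```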

github-actions · 2025-11-20T14:47:27Z

OpenAPI Changes

No detectable change.

Unexpected changes? Ensure your branch is up-to-date with main (consider rebasing).

shanbady

Functionally it works great! Just some minor cleanup suggestions/comments.

shanbady · 2025-11-21T16:00:31Z

vector_search/utils.py

    }


+def get_vector_for_learning_resource(readable_id):


It might be cleaner to add "with_vectors" as an optional parameter to retrieve_points_matching_params and use that:

retrieve_points_matching_params({"readable_id": "course-234"}, with_vectors=True)[0]
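A rough sketch of the shape this suggestion could take. The real retrieve_points_matching_params talks to Qdrant and its actual signature is not shown in this thread, so everything below (the Point class, the in-memory _STORE, the sample vectors) is a hypothetical stand-in. It only illustrates how an optional with_vectors flag lets the new helper become a thin wrapper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Point:
    """Hypothetical stand-in for a stored point (payload plus vector)."""

    payload: dict
    vector: Optional[list] = None


# Hypothetical in-memory data standing in for the Qdrant collection.
_STORE = [
    Point({"readable_id": "course-234"}, [0.1, 0.2, 0.3]),
    Point({"readable_id": "course-999"}, [0.4, 0.5, 0.6]),
]


def retrieve_points_matching_params(params, with_vectors=False):
    """Return points whose payload matches every key/value in params.

    Vectors are only included when with_vectors=True, so existing
    callers that omit the flag keep getting payload-only results.
    """
    matches = []
    for point in _STORE:
        if all(point.payload.get(key) == value for key, value in params.items()):
            vector = list(point.vector) if with_vectors else None
            matches.append(Point(dict(point.payload), vector))
    return matches


def get_vector_for_learning_resource(readable_id):
    """Thin wrapper: first matching point's vector, or None if not found."""
    points = retrieve_points_matching_params(
        {"readable_id": readable_id}, with_vectors=True
    )
    return points[0].vector if points else None
```

Defaulting with_vectors to False keeps the change backward compatible, and returning None for a missing resource avoids an IndexError from the bare [0] lookup.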

shanbady · 2025-11-21T16:08:33Z

learning_resources_search/api.py

-        }
+        encoder = dense_encoder()
+        query_vector = encoder.embed_query(text)
+        vector_query = {"knn": {"vector_embedding": {"vector": query_vector, "k": 5}}}


It might be good to have "k" and "pagination_depth" defined in constants or in settings.py (or omitted altogether if they default to something reasonable).
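One way this could look, as a minimal sketch. The constant names HYBRID_KNN_K and HYBRID_PAGINATION_DEPTH, the helper functions, and the "title" match clause are all hypothetical. The value 5 for "k" comes from the diff above. The pagination_depth value is not shown in this excerpt, so it is left unset here and omitted from the query, which leaves the OpenSearch default in place.

```python
# Hypothetical constants; in the project these might live in
# constants.py or be read from settings.py.
HYBRID_KNN_K = 5  # value used in the diff above
HYBRID_PAGINATION_DEPTH = None  # not shown in the excerpt; None means "omit"


def build_knn_query(query_vector, k=HYBRID_KNN_K):
    """Build the knn clause from the diff, with k no longer hard-coded."""
    return {"knn": {"vector_embedding": {"vector": query_vector, "k": k}}}


def build_hybrid_query(text_query, query_vector, pagination_depth=HYBRID_PAGINATION_DEPTH):
    """Combine a text clause and the knn clause into a hybrid query.

    pagination_depth is only added when a value is configured.
    """
    hybrid = {"queries": [text_query, build_knn_query(query_vector)]}
    if pagination_depth is not None:
        hybrid["pagination_depth"] = pagination_depth
    return {"hybrid": hybrid}


# Usage with a hypothetical text clause and a tiny placeholder vector.
example_query = build_hybrid_query(
    {"match": {"title": "intro to ai course"}},
    [0.1, 0.2, 0.3],
)
```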

shanbady · 2025-11-21T16:09:40Z

learning_resources_search/api.py

                            "combination": {
                                "technique": "arithmetic_mean",
-                                "parameters": {"weights": [0.6, 0.2, 0.2]},
+                                "parameters": {"weights": [0.8, 0.2]},


These also seem like magic numbers we may tweak often. Consider moving them to settings or making them a constant. Since all of search_pipeline is static JSON, we could do something like:

OPENSEARCH_HYBRID_PIPELINE_CONFIGURATION = {...}
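Filling in the {...} as a sketch only: the combination technique and the [0.8, 0.2] weights come from the diff above. The description string, the min_max normalization technique, and the validation helper are assumptions, since the rest of search_pipeline is not shown here. The check reflects my understanding that OpenSearch expects one weight per sub-query, summing to 1.0.

```python
import math

# Sketch of the pipeline body as a single settings-level constant.
OPENSEARCH_HYBRID_PIPELINE_CONFIGURATION = {
    "description": "Post-processor for hybrid search",  # hypothetical text
    "phase_results_processors": [
        {
            "normalization-processor": {
                # Assumption: the diff excerpt does not show this part.
                "normalization": {"technique": "min_max"},
                # Taken from the diff above.
                "combination": {
                    "technique": "arithmetic_mean",
                    "parameters": {"weights": [0.8, 0.2]},
                },
            }
        }
    ],
}


def validate_pipeline_weights(config, num_sub_queries):
    """Return the weights, or raise if they do not fit the hybrid query."""
    processor = config["phase_results_processors"][0]["normalization-processor"]
    weights = processor["combination"]["parameters"]["weights"]
    if len(weights) != num_sub_queries:
        raise ValueError(
            f"expected {num_sub_queries} weights, got {len(weights)}"
        )
    if not math.isclose(sum(weights), 1.0):
        raise ValueError(f"weights must sum to 1.0, got {sum(weights)}")
    return weights
```

A check like this would catch the case where the number of sub-queries changes (as in this PR, from three to two) but the weights are not updated to match.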

shanbady · 2025-11-21T16:22:09Z

learning_resources_search/indexing_api.py

        }
    }

    if object_type == COMBINED_INDEX:


IMO the naming of these indexes is getting a bit confusing (constants.COMBINED_INDEX vs. constants.ALL_INDEXES vs. constants.BOTH_INDEXES). It might be better to call this HYBRID_INDEX or HYBRID_COMBINED_INDEX, etc.

shanbady

LGTM 👍

abeglova force-pushed the ab/use-qdrant-vector branch 2 times, most recently from 11eb260 to 2bca7f2 Compare November 20, 2025 16:36

abeglova marked this pull request as ready for review November 20, 2025 17:35

abeglova marked this pull request as draft November 20, 2025 18:17

abeglova marked this pull request as ready for review November 20, 2025 20:27

shanbady self-requested a review November 21, 2025 14:45

shanbady requested changes Nov 21, 2025

shanbady assigned abeglova Nov 21, 2025

shanbady added the Waiting on author label Nov 21, 2025

shanbady approved these changes Nov 21, 2025

use qdrant for hybrid search

3586b9e

abeglova force-pushed the ab/use-qdrant-vector branch from 0a3af13 to 3586b9e Compare November 21, 2025 21:38

abeglova merged commit 3c35312 into main Nov 21, 2025
13 checks passed

abeglova deleted the ab/use-qdrant-vector branch November 21, 2025 21:51

odlbot mentioned this pull request Nov 24, 2025

Release 0.47.14 #2725

Open

11 tasks


3 participants
