removing unused payload fields by shanbady · Pull Request #3154 · mitodl/mit-learn

shanbady · 2026-04-02T14:53:24Z

What are the relevant tickets?

Closes https://github.com/mitodl/hq/issues/10782

Description (What does it do?)

This PR removes unnecessary fields from the payload index in Qdrant. Clients in mit-learn and learn-ai that call the contentfile vector endpoint filter by only a few fields. In Qdrant, each field added to the payload index carries significant storage and memory costs, plus a performance hit, because Qdrant has to maintain additional HNSW graph structures for it. We should therefore be intentional about what is actually indexed.

This PR also contains a small bugfix for an issue where the learning resource responses were not sorted by score.
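The ordering problem is easy to hit in general: a database lookup by a list of ids does not return rows in the order of that list, so the ranking from the vector search is lost unless it is re-applied. The PR text does not show the actual fix, so the following is only a minimal sketch of the idea. `order_by_score`, `hits`, and `resources` are hypothetical names, not code from this branch.

```python
def order_by_score(hits, resources):
    """Return resources sorted by descending search score.

    hits: iterable of (resource_id, score) pairs from the vector search.
    resources: iterable of dicts with an "id" key, in arbitrary order
               (for example, the result of an id-based database lookup).
    """
    best = {}
    for resource_id, score in hits:
        # Several hits may point at the same resource; keep the best score.
        if resource_id not in best or score > best[resource_id]:
            best[resource_id] = score
    # Drop anything the search did not return, then re-apply the ranking.
    found = [r for r in resources if r["id"] in best]
    return sorted(found, key=lambda r: best[r["id"]], reverse=True)


# Hypothetical data: resource 7 is hit twice, resource 9 was never a hit.
hits = [(7, 0.91), (3, 0.42), (5, 0.77), (7, 0.60)]
resources = [{"id": 3}, {"id": 5}, {"id": 7}, {"id": 9}]
ranked = order_by_score(hits, resources)
```

Here `ranked` holds the resources with ids 7, 5, and 3, in that order.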

How can this be tested?

On the live performance side of things, I have validated this on the RC Qdrant cluster.

Check out this branch
Restart celery
Run python manage.py generate_embeddings --courses
Check the contentfiles collection info in your local Qdrant dashboard and look at the "payload" list. Ensure that the fields line up with the fields configured in the constants file
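The last check can also be done programmatically. With `qdrant-client`, `client.get_collection(name).payload_schema` returns a mapping keyed by indexed field name. The sketch below only compares those keys with the configured list, so it runs without a Qdrant server. The field names are placeholders and are not the actual contents of `QDRANT_CONTENT_FILE_INDEXES`; `diff_payload_indexes` is a hypothetical helper.

```python
def diff_payload_indexes(configured, existing):
    """Compare configured index fields with those present in the collection.

    configured: iterable of field names from the constants file.
    existing: iterable of field names currently indexed (for example,
              the keys of the collection's payload schema).
    """
    configured, existing = set(configured), set(existing)
    return {
        "missing": sorted(configured - existing),  # configured, not indexed yet
        "stale": sorted(existing - configured),    # indexed, no longer configured
    }


# Placeholder field names, for illustration only.
configured_fields = ["resource_readable_id", "run_readable_id"]
# Stand-in for the payload schema mapping reported by Qdrant.
existing_fields = {"resource_readable_id": "keyword", "platform": "keyword"}

report = diff_payload_indexes(configured_fields, existing_fields)
```

An empty `missing` and an empty `stale` list mean the collection matches the constants. If stale indexes remain on an existing collection, `client.delete_payload_index(collection_name, field_name)` is one way to drop them.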

Checklist:

Double-check that no calls to our vector search endpoint /api/v0/vector_content_files_search/ use any of the fields removed in this PR as a query parameter.
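One way to run this check against a list of request URLs (from logs or a code search, say) is sketched below. The removed field names and the `q` and `limit` parameters are assumptions made for illustration; substitute the fields this PR actually removes. `removed_fields_in_use` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def removed_fields_in_use(urls, removed_fields):
    """Map each URL to the removed fields it still passes as query parameters."""
    removed = set(removed_fields)
    offenders = {}
    for url in urls:
        # keep_blank_values=True so "?field=" is still detected.
        params = parse_qs(urlsplit(url).query, keep_blank_values=True)
        used = sorted(removed.intersection(params))
        if used:
            offenders[url] = used
    return offenders


# Hypothetical field names and request URLs.
REMOVED = ["content_feature_type", "offered_by"]
calls = [
    "/api/v0/vector_content_files_search/?q=calculus&offered_by=ocw",
    "/api/v0/vector_content_files_search/?q=calculus&limit=5",
]
offenders = removed_fields_in_use(calls, REMOVED)
```

Only the first URL is reported, because it filters on `offered_by`.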

Copilot

Pull request overview

This PR aims to reduce Qdrant storage/memory overhead by removing unused payload indexes for the content_files embeddings collection, keeping only the fields intended for frequent filtering/faceting.

Changes:

Removed multiple fields from QDRANT_CONTENT_FILE_INDEXES to reduce payload index footprint.
Added an inline note to encourage intentional selection of indexed fields.

github-actions · 2026-04-02T17:40:46Z

OpenAPI Changes

5 changes: 0 error, 5 warning, 0 info

Unexpected changes? Ensure your branch is up-to-date with main (consider rebasing).

abeglova

👍

removing unused payload fields

e3a6823

shanbady added the "Needs Review" label (an open pull request that is ready for review) Apr 2, 2026

shanbady marked this pull request as ready for review April 2, 2026 14:54

Copilot AI review requested due to automatic review settings April 2, 2026 14:54

Copilot started reviewing on behalf of shanbady April 2, 2026 14:55

sentry bot reviewed Apr 2, 2026

Comment thread vector_search/constants.py

Copilot AI reviewed Apr 2, 2026

Comment thread vector_search/constants.py

Comment thread vector_search/constants.py

Comment thread vector_search/constants.py

shanbady added 4 commits April 2, 2026 12:09

fix field mapping

2434a09

remove unused from serializer

be3c20a

fix test

d9be639

regenerate api spec

093ac99

sentry bot reviewed Apr 2, 2026

Comment thread vector_search/constants.py

fix issue with resource result ordering

aa335e9

sentry bot reviewed Apr 2, 2026

Comment thread vector_search/tasks.py Outdated

shanbady added 2 commits April 2, 2026 18:02

adding fix for ordering

d582707

remove debug code

91acc00

sentry bot reviewed Apr 3, 2026

Comment thread vector_search/constants.py

shanbady and others added 3 commits April 3, 2026 10:44

fix formatting for array notation

efb1b3e

Merge branch 'main' into shanbady/remove-unused-qdrant-payloads

b57b10b

[pre-commit.ci] auto fixes from pre-commit.com hooks

184bc96

for more information, see https://pre-commit.ci

abeglova self-assigned this Apr 6, 2026

abeglova approved these changes Apr 6, 2026

shanbady merged commit 15d7683 into main Apr 6, 2026
14 checks passed

shanbady deleted the shanbady/remove-unused-qdrant-payloads branch April 6, 2026 19:27

odlbot mentioned this pull request Apr 6, 2026

Release 0.62.1 #3169

Merged

4 tasks

shanbady merged 11 commits into main from shanbady/remove-unused-qdrant-payloads

3 participants
