dfs_query_then_fetch #1518

abeglova · 2024-09-04T21:13:23Z

What are the relevant tickets?

closes https://github.com/mitodl/hq/issues/5351

Description (What does it do?)

Currently OpenSearch calculates term frequencies at the per-index level. This means that scores are not consistent between learning resource types, and resource types with small indexes (Programs) are penalized.
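To illustrate the effect, here is a minimal sketch that compares a per-index inverse document frequency (IDF) with one computed from pooled statistics. It assumes the Lucene-style BM25 IDF formula; the index names and document counts are made up for illustration and are not taken from the real indexes.

```python
import math


def bm25_idf(doc_count: int, doc_freq: int) -> float:
    """Lucene-style BM25 inverse document frequency for a single term."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics for one query term:
# index name -> (documents in the index, documents containing the term)
INDEX_STATS = {
    "course": (10_000, 500),
    "program": (100, 30),
}


def per_index_idf(stats: dict) -> dict:
    """IDF computed separately for each index (the default behaviour)."""
    return {name: bm25_idf(n, df) for name, (n, df) in stats.items()}


def pooled_idf(stats: dict) -> float:
    """IDF computed from statistics summed over all indexes."""
    total_docs = sum(n for n, _ in stats.values())
    total_freq = sum(df for _, df in stats.values())
    return bm25_idf(total_docs, total_freq)


if __name__ == "__main__":
    print("per-index:", per_index_idf(INDEX_STATS))
    print("pooled:   ", pooled_idf(INDEX_STATS))
```

With these made-up numbers the term weighs roughly 3.0 in the course index but only about 1.2 in the program index, while pooled statistics give both about 2.9. A program matching the term therefore scores lower than an otherwise equivalent course unless the statistics are shared.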

Setting search_type=dfs_query_then_fetch makes OpenSearch run a pre-query to gather term frequencies from across all the indexes used in the query, which gives programs more reasonable scores. However, the documentation warns that this is turned off by default because it's slower. I am adding use_dfs_query_then_fetch as a parameter for now so we can test the performance on RC before committing to the change. I have not noticed performance issues locally.
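A minimal sketch of how an opt-in flag like this could be translated into search arguments. The helper name `build_search_kwargs` and the index names are hypothetical and are not the code in this PR; only the `search_type` value comes from the description above.

```python
from typing import Any, Dict, List


def build_search_kwargs(
    index_names: List[str],
    query_body: Dict[str, Any],
    use_dfs_query_then_fetch: bool = False,
) -> Dict[str, Any]:
    """Build keyword arguments for a multi-index search call.

    search_type is only set when the flag is on, so the default
    (query_then_fetch) behaviour is untouched otherwise.
    """
    kwargs: Dict[str, Any] = {
        "index": ",".join(index_names),  # comma-separated list of target indexes
        "body": query_body,
    }
    if use_dfs_query_then_fetch:
        kwargs["search_type"] = "dfs_query_then_fetch"
    return kwargs


if __name__ == "__main__":
    # Hypothetical index names and query, for illustration only.
    body = {"query": {"match": {"title": "machine learning"}}}
    print(build_search_kwargs(["courses", "programs"], body, True))
```

Assuming the opensearch-py client, the result would be passed along as `client.search(**kwargs)`. Keeping the flag off by default makes it possible to compare latency with and without the pre-query on the same deployment.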

If this doesn't work, we will probably need to get rid of the resource-specific OpenSearch indexes and store all learning resources in one index. We might want to do that anyway: it will make queries simpler, which might also make them faster. We had separate indexes in open-discussions because different resources had different data fields. Now that learning resources are standardized, there isn't really a good reason to have separate indexes by learning resource type.

How can this be tested?

Go to http://open.odl.local:8062/search/?q=Machine+Learning&resource_category=program
Verify that you see "Professional Certificate Program in Machine Learning & Artificial Intelligence" and "Machine Learning, Modeling, and Simulation: Engineering Problem-Solving in the Age of AI". If you don't, run the backpopulate commands to populate them.

Go to http://open.odl.local:8062/search/?q=Machine+Learning. You will not see programs in the first few pages. For me, "Professional Certificate Program in Machine Learning & Artificial Intelligence" is on the third page and "Machine Learning, Modeling, and Simulation: Engineering Problem-Solving in the Age of AI" is not in the first 8 pages.

Go to http://open.odl.local:8062/search/?q=machine+learning&use_dfs_query_then_fetch=True. Verify that you see the programs in the results. For me, "Professional Certificate Program in Machine Learning & Artificial Intelligence" is on page one and "Machine Learning, Modeling, and Simulation: Engineering Problem-Solving in the Age of AI" is on page two.

mbertrand

"Machine Learning, Modeling, and Simulation: Engineering Problem-Solving in the Age of AI" went from page 4 to page 3.

"Professional Certificate Program in Machine Learning" went from page 6 to page 2.

👍

abeglova force-pushed the ab/dfs_query_then_fetch branch from 4da5d95 to f92a62a Compare September 5, 2024 14:23

abeglova marked this pull request as ready for review September 5, 2024 15:37

mbertrand self-assigned this Sep 5, 2024

mbertrand approved these changes Sep 5, 2024


mbertrand added the Work in Progress label Sep 5, 2024

dfs_query_then_fetch

74ff4e7

abeglova force-pushed the ab/dfs_query_then_fetch branch from 0ba5660 to 74ff4e7 Compare September 5, 2024 16:55

abeglova removed the Work in Progress label Sep 5, 2024

abeglova merged commit 875b53c into main Sep 5, 2024

This was referenced Sep 6, 2024

Release 0.18.2 #1523

Closed

Release 0.18.2 #1524

Merged

abeglova mentioned this pull request Sep 13, 2024

always use dfs_query_then_fetch #1558

Merged

rhysyngsun deleted the ab/dfs_query_then_fetch branch February 7, 2025 20:36
