What are the relevant tickets?
closes https://github.com/mitodl/hq/issues/5351
Description (What does it do?)
Currently OpenSearch calculates term frequencies on a per-index level, which means that scores are not consistent between learning resource types, and resource types with small indexes (programs) are penalized.
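To illustrate why per-index statistics can penalize a small index, here is a minimal sketch of the inverse document frequency term from BM25 (the default similarity in Lucene-based engines). The document counts are made-up numbers for illustration, not measurements from our indexes, and `bm25_idf` is a hypothetical helper name.

```python
import math


def bm25_idf(doc_count: int, docs_with_term: int) -> float:
    """BM25 inverse document frequency for a single term."""
    return math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))


# Hypothetical counts: a small programs index where the term is common,
# and a large courses index where the same term is comparatively rare.
programs = {"doc_count": 50, "docs_with_term": 20}
courses = {"doc_count": 5000, "docs_with_term": 100}

# Per-index statistics: the same term gets a different weight in each index.
idf_programs = bm25_idf(**programs)
idf_courses = bm25_idf(**courses)

# Combined statistics: one weight shared by every matching document,
# which is roughly what a distributed-frequency pre-query provides.
idf_global = bm25_idf(
    programs["doc_count"] + courses["doc_count"],
    programs["docs_with_term"] + courses["docs_with_term"],
)

print(f"programs only: {idf_programs:.3f}")
print(f"courses only:  {idf_courses:.3f}")
print(f"combined:      {idf_global:.3f}")
```

With these assumed counts the term weight in the small index is much lower than in the large one, so a matching program scores below a comparable course until the statistics are combined.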
Setting search_type=dfs_query_then_fetch makes OpenSearch run a pre-query to gather term frequencies across all the indexes used in the query, which should give programs more reasonable scores. However, the documentation warns that this is turned off by default because it's slower. I am adding use_dfs_query_then_fetch as a parameter for now so we can test the performance on RC before committing to the change. I have not noticed performance issues locally.
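As a rough sketch of how an opt-in flag like this can be mapped onto the search request, the snippet below builds the keyword arguments for a search call. `build_search_kwargs` and the index names are hypothetical and are not the code in this PR; only the `search_type` value `dfs_query_then_fetch` comes from the description above.

```python
def build_search_kwargs(index_names, body, use_dfs_query_then_fetch=False):
    """Assemble keyword arguments for a search call.

    Hypothetical helper: the search type is only set when the caller opts in,
    so the default (faster) behavior is unchanged.
    """
    kwargs = {"index": ",".join(index_names), "body": body}
    if use_dfs_query_then_fetch:
        # Ask the cluster to collect term statistics across all target
        # indexes before scoring, at the cost of an extra round trip.
        kwargs["search_type"] = "dfs_query_then_fetch"
    return kwargs


# Example usage with made-up index names and a simple match query.
query_body = {"query": {"match": {"title": "machine learning"}}}
default_kwargs = build_search_kwargs(["course", "program"], query_body)
dfs_kwargs = build_search_kwargs(
    ["course", "program"], query_body, use_dfs_query_then_fetch=True
)
print(default_kwargs)
print(dfs_kwargs)
```

The resulting dictionary could then be passed to a client's search method, for example `client.search(**dfs_kwargs)` with an opensearch-py client, assuming one is configured.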
If this doesn't work we will probably need to get rid of the resource-specific OpenSearch indexes and store all learning resources in one index. We might want to do that anyway: it would make queries simpler, which might also make them faster. We had separate indexes in open-discussions because different resources had different data fields. Now that learning resources are standardized, there isn't really a good reason to have separate indexes by learning resource.
How can this be tested?
Go to http://open.odl.local:8062/search/?q=Machine+Learning&resource_category=program
And verify that you see "Professional Certificate Program in Machine Learning & Artificial Intelligence" and "Machine Learning, Modeling, and Simulation: Engineering Problem-Solving in the Age of AI". If you don't, run the backpopulate commands to populate them.
Go to http://open.odl.local:8062/search/?q=Machine+Learning. You will not see programs in the first few pages. For me, "Professional Certificate Program in Machine Learning & Artificial Intelligence" is on the third page and "Machine Learning, Modeling, and Simulation: Engineering Problem-Solving in the Age of AI" is not in the first 8 pages.
Go to http://open.odl.local:8062/search/?q=machine+learning&use_dfs_query_then_fetch=True. Verify that you see the programs in the results. For me, "Professional Certificate Program in Machine Learning & Artificial Intelligence" is on page one and "Machine Learning, Modeling, and Simulation: Engineering Problem-Solving in the Age of AI" is on page two.