Improve ES search for archival descriptions, #10082 #416

Closed
wants to merge 6 commits into from


@MikeFE MikeFE commented Jul 15, 2016

This PR attempts to accomplish three things:

  1. Narrow the set of fields we search in nested, foreign types inside an information object. We now search only fields such as repository name and creator name, not their histories. This avoids odd search results caused by a keyword that appears only in, say, a creator's biographical history.
  2. Add weighting to search fields, so that matches in certain fields yield higher scores. Also added a 'sort by relevance' option so we can take advantage of this weighting.
  3. Change the default search operator from OR to AND. This should return fewer, more precise results.
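
To make the three changes concrete, here is a minimal sketch of the kind of request body they imply, built as a plain PHP array. It relies only on Elasticsearch's `query_string` query (`fields` with `field^boost` weights, `default_operator`) and a sort on `_score`. The function name, the field names, and the weights are assumptions made up for illustration; they are not taken from this PR's code.

```php
<?php
// Hypothetical helper: the field names and weights passed in below are
// illustrative only and are not the values used by this PR.
function buildDescriptionQuery(string $userQuery, array $boostedFields): array
{
    $fields = [];
    foreach ($boostedFields as $name => $boost) {
        // Elasticsearch's "field^boost" syntax weights matches in that field.
        $fields[] = sprintf('%s^%d', $name, $boost);
    }

    return [
        'query' => [
            'query_string' => [
                'query' => $userQuery,
                // Only the listed fields are searched (point 1).
                'fields' => $fields,
                // Every term must match, rather than any term (point 3).
                'default_operator' => 'AND',
            ],
        ],
        // Sort by relevance: highest score first (point 2).
        'sort' => [['_score' => ['order' => 'desc']]],
    ];
}

$body = buildDescriptionQuery('fonds photographs', [
    'i18n.en.title' => 10,
    'creators.i18n.en.authorizedFormOfName' => 6,
    'repository.i18n.en.authorizedFormOfName' => 3,
]);

echo json_encode($body, JSON_PRETTY_PRINT), "\n";
```

Listing the fields explicitly is what keeps a term that appears only in a repository or creator history from producing a hit, since those fields are never part of the query.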

Mike Gale added 5 commits July 15, 2016 16:17

When searching for information objects, only fields that make sense are now
included in our ES _all queries. Information object searches will no longer
yield results when the text matches only irrelevant fields such as repository
history, creator slugs, etc.

Add sort by relevance so we can see results ordered by their ES scores.

Also tweaked the boost of creator name to account for creators also being
automatically added to name access points, which was making the score higher
than expected.

For information object searches, the default operator will now be AND instead
of OR.
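
The boost adjustment mentioned in the commit messages above can be pictured with a small sketch that appends boost values to a list of field names. This is not the PR's `addBoostValuesToFields()` implementation, whose body is not shown here; `withBoosts`, the field names, and the weights are hypothetical. The creator field is given a deliberately modest weight on the assumption that a creator match is also counted through the name access point field.

```php
<?php
// Hypothetical sketch, not the PR's addBoostValuesToFields().
// Appends "^boost" to each field that has an entry in $boosts and leaves
// the other fields unboosted (Elasticsearch treats those as boost 1).
function withBoosts(array $fields, array $boosts): array
{
    return array_map(
        function (string $field) use ($boosts): string {
            return isset($boosts[$field]) ? $field.'^'.$boosts[$field] : $field;
        },
        $fields
    );
}

// Illustrative field names and weights only.
$fields = ['title', 'creators.name', 'names.name', 'scopeAndContent'];
$boosts = [
    'title' => 10,
    // Kept low: a creator is assumed to also match via 'names.name'.
    'creators.name' => 3,
    'names.name' => 3,
];

$boosted = withBoosts($fields, $boosts);
print_r($boosted);
```

How the per-field scores are combined depends on the query type, so the right creator weight is something to tune against real results rather than derive up front.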
@@ -439,6 +459,39 @@ protected static function getAllObjectStringFields($rootIndexType, $object, $pre
return $fields;
}

public static function addBoostValuesToFields(&$fields, $i18nFields, $nonI18nFields)

Shouldn't this method belong to arElasticSearchModelBase?


sevein commented Jul 18, 2016

Looking great! Just one small observation about the location of the addBoostValuesToFields method, but other than that 👍 !


MikeFE commented Jul 20, 2016

squashed&merged

@MikeFE MikeFE closed this Jul 20, 2016
@qubot qubot deleted the dev/issue-10082-improved-info-obj-es-search branch July 20, 2016 22:01