Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Highlighting: dedup field names when using wildcard expression to define which fields to highlight #10916

Closed

Conversation

@javanna
Copy link
Member

commented May 1, 2015

When specifying a wildcard expression to define which fields to highlight, we retrieve the matching field names from the mappers. The returned list of fields may contain duplicates, which is why we should make sure we don't go over the same field more than once. This is just to avoid performing the same task multiple times, but doesn't change the result of highlighting.

…ntify which fields to highlight
@jpountz

This comment has been minimized.

Copy link
Contributor

commented May 1, 2015

Should the fix be in simpleMatchToFullName instead? Also maybe we should leave a comment explaining that we need to deduplicate fields today but this might not be necessary anymore if we store mappings at the index level?

@javanna

This comment has been minimized.

Copy link
Member Author

commented May 1, 2015

Should the fix be in simpleMatchToFullName instead?

good question, I was wondering myself, to be honest I have no clue :) I do have the feeling that I might be hiding some other issue with this fix, not sure.

@javanna

This comment has been minimized.

Copy link
Member Author

commented May 1, 2015

One other question could be if we should go ahead and try to highlight tons of metadata fields, like _routing, _ttl and such for instance when using *. I noticed this while looking at #10914 .

@jpountz

This comment has been minimized.

Copy link
Contributor

commented May 1, 2015

Let's ping @rjernst who might have ideas/opinions here as this is related to other refactorings he is performing.

@clintongormley

This comment has been minimized.

Copy link
Member

commented May 4, 2015

@javanna re

One other question could be if we should go ahead and try to highlight tons of metadata fields, like _routing, _ttl and such for instance when using *.

We have an adoptme issue for that: #9881

@nik9000

This comment has been minimized.

Copy link
Contributor

commented May 18, 2015

Should the fix be in simpleMatchToFullName instead? Also maybe we should leave a comment explaining that we need to deduplicate fields today but this might not be necessary anymore if we store mappings at the index level?

probably

@@ -76,12 +75,12 @@ public boolean hitExecutionNeeded(SearchContext context) {
public void hitExecute(SearchContext context, HitContext hitContext) {
Map<String, HighlightField> highlightFields = newHashMap();
for (SearchContextHighlight.Field field : context.highlight().fields()) {
List<String> fieldNamesToHighlight;
Set<String> fieldNamesToHighlight;

This comment has been minimized.

Copy link
@nik9000

nik9000 May 18, 2015

Contributor

Why not add

if (highlightFields.contains(fieldName)) { continue; }

around line 94?

That'll prevent two regexes that find the same field from highlighting it twice.

This comment has been minimized.

Copy link
@javanna

javanna May 18, 2015

Author Member

how can the same field be highlighted twice if we use a set that doesn't allow for duplicates? not sure what I'm missing here. Anyway, agree that the problem is probably in mappings code, but I am not familiar with it, somebody else will have to jump in on that.

This comment has been minimized.

Copy link
@nik9000

nik9000 May 18, 2015

Contributor

I'm thinking something like:

"highlight": {
  "cat_*": <other stuff>,
  "*": <stuff>
}

As is now the last matching regex wins but it does so by highlighting the field twice. That's what I meant.

@javanna

This comment has been minimized.

Copy link
Member Author

commented May 27, 2015

@rjernst any idea about whether simpleMatchToFullName should return duplicates or not? Should we fix the problem there (unless it was fixed already and I missed it) or in the highlighters code?

@rjernst

This comment has been minimized.

Copy link
Member

commented May 27, 2015

@javanna Yes, please make the fix in simpleMatchToFullName. It should be simpler in the future and not need a set (because only the full names will need to be checked).

javanna added a commit to javanna/elasticsearch that referenced this pull request May 27, 2015
… & `simpleMatchToIndexNames` in FieldMappersLookup

Relates to elastic#10916
@javanna

This comment has been minimized.

Copy link
Member Author

commented May 27, 2015

Superseded by #11377

@javanna javanna closed this May 27, 2015
@kevinkluge kevinkluge removed the review label May 27, 2015
javanna added a commit to javanna/elasticsearch that referenced this pull request May 27, 2015
… & `simpleMatchToIndexNames` in FieldMappersLookup

Relates to elastic#10916
Closes elastic#11377
javanna added a commit to javanna/elasticsearch that referenced this pull request May 27, 2015
… & `simpleMatchToIndexNames` in FieldMappersLookup

Relates to elastic#10916
Closes elastic#11377
javanna added a commit to javanna/elasticsearch that referenced this pull request May 28, 2015
… & `simpleMatchToIndexNames` in FieldMappersLookup

Relates to elastic#10916
Closes elastic#11377
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.