elasticsearch highlight lose character when index_options=offsets #60168

Open
tenlee2012 opened this issue Jul 24, 2020 · 6 comments
Labels
>bug good first issue low hanging fruit :Search/Highlighting How a query matched a document Team:Search Meta label for search team

Comments

@tenlee2012

Elasticsearch version (bin/elasticsearch --version): 7.7.1

Plugins installed: []

JVM version (java -version): openjdk version "11.0.7" 2020-04-14 LTS

OS version (uname -a if on a Unix-like system): GNU/Linux

Description of the problem including expected versus actual behavior:
With index_options=offsets set on the field, a document whose title is "can work?shit" is highlighted as "<em>shit</em>" when searching for "shit"; the expected highlight is "can work?<em>shit</em>".
If the query matches the last word and that word is directly preceded by a punctuation symbol (here "?"), the highlight drops the preceding text when index_options=offsets is set.

Steps to reproduce:

  1. Create the index
PUT demo_1
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "standard",
        "index_options": "offsets"
      }
    }
  }
}
  2. Put data
PUT demo_1/_doc/1
{
    "title": "can work?shit"
}
  3. Search
POST demo_1/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "shit"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "title": {}
    }
  }
}

The search result is below:

{
    "took":2,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":{
            "value":1,
            "relation":"eq"
        },
        "max_score":0.2876821,
        "hits":[
            {
                "_index":"demo_1",
                "_type":"_doc",
                "_id":"1",
                "_score":0.2876821,
                "_source":{
                    "title":"can work?shit"
                },
                "highlight":{
                    "title":[
                        "<em>shit</em>"
                    ]
                }
            }
        ]
    }
}
  4. If the highlighter type is set to plain, it works.
POST demo_1/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "shit"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "title": {
        "type": "plain"
      }
    }
  }
}

The result is below:

{
    "took":2,
    "timed_out":false,
    "_shards":{
        "total":1,
        "successful":1,
        "skipped":0,
        "failed":0
    },
    "hits":{
        "total":{
            "value":1,
            "relation":"eq"
        },
        "max_score":0.2876821,
        "hits":[
            {
                "_index":"demo_1",
                "_type":"_doc",
                "_id":"1",
                "_score":0.2876821,
                "_source":{
                    "title":"can work?shit"
                },
                "highlight":{
                    "title":[
                      "can work?<em>shit</em>"
                    ]
                }
            }
        ]
    }
}
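
The difference between the two responses above can be checked mechanically. The following is a minimal sketch, not part of Elasticsearch: the helper names `strip_highlight_tags` and `highlight_covers_source` are hypothetical, and it assumes the default `<em>` highlight tags and a field value short enough that a correct highlight would return the whole value as a single fragment (as is the case for "can work?shit").

```python
import re

# Matches the default <em>...</em> highlight tags; custom pre_tags/post_tags
# would need a different pattern (assumption: defaults are in use).
_EM_TAG = re.compile(r"</?em>")


def strip_highlight_tags(fragment: str) -> str:
    """Return the fragment text with the default <em> tags removed."""
    return _EM_TAG.sub("", fragment)


def highlight_covers_source(hit: dict, field: str) -> bool:
    """True if some highlight fragment, once untagged, equals the full field value.

    Only meaningful for short field values, where the whole value is
    expected to come back as one fragment.
    """
    source_value = hit["_source"][field]
    fragments = hit.get("highlight", {}).get(field, [])
    return any(strip_highlight_tags(f) == source_value for f in fragments)


# Hits trimmed down from the two responses in this issue.
default_hit = {
    "_source": {"title": "can work?shit"},
    "highlight": {"title": ["<em>shit</em>"]},
}
plain_hit = {
    "_source": {"title": "can work?shit"},
    "highlight": {"title": ["can work?<em>shit</em>"]},
}

print(highlight_covers_source(default_hit, "title"))  # False: "can work?" is missing
print(highlight_covers_source(plain_hit, "title"))    # True
```

A check like this could serve as the assertion in a regression test for a fix, run against the response of the first search request.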
@tenlee2012 tenlee2012 added >bug needs:triage Requires assignment of a team area label labels Jul 24, 2020
@jimczi jimczi added :Search/Highlighting How a query matched a document and removed needs:triage Requires assignment of a team area label labels Jul 27, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-search (:Search/Highlighting)

@elasticmachine elasticmachine added the Team:Search Meta label for search team label Jul 27, 2020
@mayya-sharipova mayya-sharipova added the good first issue low hanging fruit label Mar 12, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@DHRUV6029

Hi, can I work on this issue?

@mayya-sharipova
Contributor

@DHRUV6029 So far there is no PR submitted for this issue. You are welcome to submit yours.

@arshPratap

Hi @mayya-sharipova, I would like to work on this issue. Can it be assigned to me?

@mayya-sharipova
Contributor

@arshPratap You are welcome to submit a PR; we don't assign external contributors to issues.
