Postings highlighter does not support wildcards #4042

Closed
roytmana opened this issue Nov 1, 2013 · 8 comments


roytmana commented Nov 1, 2013

Currently the postings highlighter does not support wildcards and, I suspect, prefix queries as well; it only highlights complete terms.

Would it be possible to make it support at least trailing wildcards and prefix queries such as photo*?
If that is not possible, the documentation should say so.

Member

javanna commented Nov 1, 2013

Thanks for reporting this, we'll see what we can do about it.
The documentation already states "Note that the postings highlighter is meant to perform simple query terms highlighting" but maybe that sentence needs to be made even clearer.

Author

roytmana commented Nov 1, 2013

Given the emphasis in the doc, I read that as meaning it does not account for boolean logic and phrases, and I missed the wildcard aspect until I hit the issue.

Hopefully trailing wildcards/prefixes will be easy :-)

Author

roytmana commented Nov 4, 2013

Hi Luca,

Are trailing wildcards supposed to work? I built 0.90.7-SNAPSHOT as of an hour ago and they do not seem to work. Actually, even regular terms seem to fail to highlight at random; I can't quite pin down when it highlights and when it doesn't. Something seems to be broken?

Alex

Member

javanna commented Nov 4, 2013

This is in the just released 0.90.6, no need to build from a snapshot. Just tested with 0.90.6 and saw trailing wildcard matches highlighted, as well as ordinary queries (e.g. match query).
I'd love to help if you saw issues but I'm going to need a curl recreation. Maybe you can open a new issue and describe what happens?

Author

roytmana commented Nov 4, 2013

Luca,

Let me test 0.90.6 (it was not in Maven yet a couple of hours ago), and if it
fails I will try to prepare the recreation late tonight.
I just wanted to check whether it is supposed to work before trying to extract a
recreation.

thanks,
Alex


Author

roytmana commented Nov 5, 2013

Luca,

It does not work with 0.90.6 either. Here is a very simple example (not exactly my problem of not highlighting at all, but of highlighting incorrectly):

curl  -XDELETE http://localhost:9200/test
curl  -XPOST http://localhost:9200/test -d'{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "ht": {
      "properties": {
        "name": {
          "type": "string",
          "index_options": "offsets"
        }
      }
    }
  }
}'

# Indexing step missing from the recreation as posted; the two documents are
# taken from the result shown below (assumed: auto-generated ids, type "ht").
curl -XPOST http://localhost:9200/test/ht -d'{"name": "photography"}'
curl -XPOST http://localhost:9200/test/ht -d'{"name": "photo equipment"}'
curl -XPOST http://localhost:9200/test/_refresh

curl -XPOST "http://localhost:9200/test/_search" -d'
{
   "query": {
      "query_string": {
         "query": "name:photo*"
      }
   },
   "highlight": {
      "fields": {
         "name": {
            "type": "postings"
         }
      }
   }
}'

result:


{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 1,
      "hits": [
         {
            "_index": "test",
            "_type": "ht",
            "_id": "JxpSkfEwTJyJ5tiKDk951Q",
            "_score": 1,
            "_source": {
               "name": "photography"
            },
            "highlight": {
               "name": [
                  "<em>photography</em>"
               ]
            }
         },
         {
            "_index": "test",
            "_type": "ht",
            "_id": "O-KeBwKpTxeXQdfvfc84mg",
            "_score": 1,
            "_source": {
               "name": "photo equipment"
            },
            "highlight": {
               "name": [
                  "<em>photo equip</em>ment"
               ]
            }
         }
      ]
   }
}

@clintongormley

Yeah, I'm seeing weird stuff happening here too, eg:

POST /test/ht/_search
{
   "query": {
      "match": {
         "name": "equipment little"
      }
   },
   "highlight": {
      "fields": {
         "name": {
            "type": "postings"
         }
      }
   }
}


  "hits": [
     {
        "_index": "test",
        "_type": "ht",
        "_id": "2",
        "_score": 0.061554804,
        "_source": {
           "name": "photo equipment"
        },
        "highlight": {
           "name": [
              "photo <em>equipment</em>"
           ]
        }
     },
     {
        "_index": "test",
        "_type": "ht",
        "_id": "1",
        "_score": 0.049243845,
        "_source": {
           "name": "littlexxxxxxx photograph equipment"
        },
        "highlight": {
           "name": [
              "little<em>xxxxxxx p</em>hotograph equipment"
           ]
        }
     }
  ]

It's like the offsets are set on the first document, then not reset for subsequent documents.
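
The stale-offsets guess can be checked against both outputs reported in this thread with a few lines of Python. This is only a sketch of the hypothesis, not the highlighter's actual code: `term_offsets` and `highlight` are hypothetical helpers, and the only assumption is that the character offsets computed for the first hit get reused unchanged on the second hit.

```python
def term_offsets(text, term):
    """Return (start, end) character offsets of the first occurrence of term."""
    start = text.index(term)
    return start, start + len(term)


def highlight(text, start, end, pre="<em>", post="</em>"):
    """Wrap text[start:end] in highlight tags."""
    return text[:start] + pre + text[start:end] + post + text[end:]


# Case 1: query name:photo*. Offsets come from the first hit ("photography")
# and are then applied, unchanged, to the second hit.
stale = term_offsets("photography", "photography")
wildcard_result = highlight("photo equipment", *stale)

# Case 2: match query "equipment little". Offsets of "equipment" in the first
# hit are applied, unchanged, to the second hit.
stale = term_offsets("photo equipment", "equipment")
match_result = highlight("littlexxxxxxx photograph equipment", *stale)

print(wildcard_result)  # <em>photo equip</em>ment
print(match_result)     # little<em>xxxxxxx p</em>hotograph equipment
```

Both strings match the incorrect fragments returned above, which is consistent with offsets carrying over from one document to the next, though it does not by itself show where in the highlighter that happens.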

Member

javanna commented Nov 5, 2013

Indeed, this is odd. I'm looking into it and I'll open another issue later on as the problem is not only with wildcards.

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015