Highlighted matched terms #25

Open
cbou opened this Issue May 11, 2013 · 22 comments

cbou commented May 11, 2013

It would be nice if Lunr.js could highlight matched terms.

olivernn May 14, 2013

Owner

Do you mean being able to wrap matched terms in some markup that can be styled later? For example:

This is a <span class="lunr-match">matched</span> term

I think this would be cool, although I'm not sure it's currently possible. There are a couple of things that lunr needs to be doing before this can be achieved. I also think the functionality for actually marking up matched terms is probably better handled outside of lunr, in some kind of plugin.

What lunr could do is return the positions of the matches in a particular document. Currently lunr discards the position of words in the documents that it indexes. Making use of term positions is something I've been thinking about a little; it would allow for more powerful queries, e.g. documents with "hello" and "world" next to each other.

I think this might be a little way off yet, but it's definitely something that I'm interested in adding. If you have any thoughts on implementations then I'd be glad to hear them too.
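
To make the idea of marking up matches outside of lunr concrete, here is a minimal sketch of what such a plugin-side helper might look like. It assumes match positions arrive as `[start, length]` pairs of character offsets; that shape, the `wrapMatches` and `escapeHtml` names, and the default class name are assumptions for illustration, not an API lunr provided at the time of this comment.

```javascript
// Escape characters that are significant in HTML so that document text is
// inserted as text, not as markup.
function escapeHtml(s) {
  return s.replace(/[&<>"']/g, (c) => ({
    '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'
  }[c]));
}

// Hypothetical helper: wraps each [start, length] range of `text` in a span.
// Overlapping ranges are skipped rather than merged, to keep the sketch short.
function wrapMatches(text, positions, className = 'lunr-match') {
  const sorted = positions.slice().sort((a, b) => a[0] - b[0]);
  let out = '';
  let cursor = 0;
  for (const [start, length] of sorted) {
    if (start < cursor) continue; // range overlaps the previous one
    out += escapeHtml(text.slice(cursor, start));
    out += `<span class="${className}">` +
      escapeHtml(text.slice(start, start + length)) + '</span>';
    cursor = start + length;
  }
  return out + escapeHtml(text.slice(cursor));
}

console.log(wrapMatches('This is a matched term', [[10, 7]]));
```

Working from offsets rather than from the query string means the helper does not need to know anything about stemming or tokenization; it only needs the ranges.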

michaelmior Oct 26, 2013

Being able to get the position of the result in a document would be great. lunr is the closest thing I've found to what I need for a new project, but I really need the ability to get the position of the matched word in the document. Any pointers on where to start would be great. I assume this could be done fairly simply by changing the index format.

Anyway, once positions are tracked, I'd imagine an interface for highlighting would be fairly straightforward.

sergey-tihon Nov 13, 2013

Agreed, highlight should be a killer feature.

andrewwakeling Apr 19, 2014

+1. It's vital to provide context for matches especially when indexing multiple fields.

Stephenitis Sep 30, 2014

+1. Agreed, making the results user-friendly is a top priority.

ysadka Oct 2, 2014

+1 would be great to have this!

EricLongpre Oct 16, 2014

+1. Having the concept of matched tokens is definitely required. It would be a major step toward really good usability. Without it, we'd have to try to reverse-engineer the search logic to show the user what exactly we found to merit the results.

@JaimeObregon

👍

gouldingken May 29, 2015

Does the matching and storing of tokens with position have to happen on the indexing side? When populating search results, you typically have the full text. The only complex part is figuring out how stemmed or other loosely matched terms are being handled. For direct matches you can basically just do a split and join on the search term, truncate it appropriately and wrap it in some styling.

Would it be possible to have a method taking the full text and search term that would return the positions? I imagine it would basically use the tokenizers and stemming algorithms to find the location for the given string. That seems simpler (and less memory intensive) than storing every token and its original position as part of the indexing process.
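
The query-time approach described above could be sketched roughly as follows: tokenize the full text, normalize each token the same way the query term is normalized, and record the offsets of the tokens that match. The `toyNormalize` function is a deliberately crude stand-in, not lunr's stemmer, and the regular-expression tokenizer is likewise an assumption; a real version would need to reuse exactly the tokenizer and pipeline the index was built with.

```javascript
// Toy normalizer standing in for a real pipeline: lowercases the token and
// strips a few common suffixes. This is NOT lunr's stemmer.
function toyNormalize(token) {
  return token.toLowerCase().replace(/(ing|ed|es|s)$/, '');
}

// Hypothetical helper: returns [start, length] pairs for every token in
// `text` that normalizes to the same string as `queryTerm`.
function findTermPositions(text, queryTerm, normalize = toyNormalize) {
  const target = normalize(queryTerm);
  const positions = [];
  if (target === '') return positions; // nothing meaningful to match
  const tokenPattern = /[A-Za-z0-9]+/g; // assumption: simple word tokenizer
  let m;
  while ((m = tokenPattern.exec(text)) !== null) {
    if (normalize(m[0]) === target) {
      positions.push([m.index, m[0].length]);
    }
  }
  return positions;
}

const sample = 'Searching is fun. She searched and searches.';
console.log(findTermPositions(sample, 'search'));
```

This trades memory for CPU: nothing extra is stored in the index, but every displayed result has to be re-tokenized, and the highlighted words only agree with the search results if both sides normalize tokens identically.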

slashdotdash referenced this issue in slashdotdash/jekyll-lunr-js-search Nov 14, 2015

Closed

Display matching line or context? #79

julmot Feb 3, 2016

I would like to implement this with a highlighting component. However, first of all we need to make sure that the highlighted words and lunr's matches are exactly the same, for good usability. Therefore I created #200.

olivernn Jan 4, 2017

Owner

I have published an alpha release of the next version of lunr with support for highlighting matched search terms.

In addition I've put together a basic demo showing, amongst other things, highlighting of matched terms.

The intention of this demo and alpha release is to get some feedback on the interface that lunr provides to aid highlighting search terms. Any and all feedback is welcome.
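
As a rough illustration of how a consumer might work with per-match metadata of this kind, the sketch below flattens a nested `term -> field -> position` structure into one sorted list of ranges per field. The nested shape is an assumption modelled on the match metadata that lunr 2.x exposes when positions are whitelisted; the sample object is hand-written, not produced by lunr, and `positionsByField` is a hypothetical helper.

```javascript
// Assumed input shape (hand-written here, not produced by lunr):
//   { term: { field: { position: [[start, length], ...] } } }
// Hypothetical helper: groups all match ranges by field, sorted by offset.
function positionsByField(metadata) {
  const byField = {};
  for (const term of Object.keys(metadata)) {
    for (const field of Object.keys(metadata[term])) {
      const positions = metadata[term][field].position || [];
      if (!byField[field]) byField[field] = [];
      byField[field].push(...positions);
    }
  }
  for (const field of Object.keys(byField)) {
    byField[field].sort((a, b) => a[0] - b[0]);
  }
  return byField;
}

const sampleMetadata = {
  plugin: {
    title: { position: [[5, 6]] },
    body: { position: [[23, 6], [0, 7]] }
  },
  lunr: {
    body: { position: [[15, 4]] }
  }
};

console.log(positionsByField(sampleMetadata));
```

Grouping by field matters when several fields are indexed, since an offset is only meaningful relative to the text of the field it came from.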

MykolaGolubyev Jan 10, 2017

Wonderful demo. Will try to make it work for my case. In my case I am extracting text from more complex structures to feed the indexer. Later I don't have the extracted text, but only the original data structure. So in order to highlight I have to repeat the "indexing" process. The problem is I am doing indexing on a server, but search is purely on a client...

MykolaGolubyev Jan 11, 2017

Making some progress with it. Would you prefer to get feedback on the resulting structure here, or do you have other channels?

olivernn Jan 11, 2017

Owner

@MykolaGolubyev I think it's best to open a new issue and reference this one, so there is still a link. Thanks for taking the time to test things out; all feedback is super useful.

julmot Jan 13, 2017

@olivernn I would investigate this too, if you could let me know whether you'll make the highlighting component (wrapper.js) a fully working solution. In that case I don't think there would be a need for mark.js.

MykolaGolubyev Jan 13, 2017

mark.js is still a solution for my case. The text I display and the text I index are not strictly the same.

julmot Jan 13, 2017

@MykolaGolubyev Thanks for your reply. Could you please elaborate on this a bit? I can't fully imagine the use case.

MykolaGolubyev Jan 13, 2017

I have a data structure that represents rich text content. I have many blocks of those on different pages.
When I create an index, I extract certain text from them and associate it with a block. Then I extract a different kind of text, give it a higher weight, and again associate it with the block.
When I search, I find the associated block, navigate there, and highlight certain words in the already rich text.
To do that, I convert positions to unique words and call mark.js.
So I have to re-extract the text from that block first. At the moment I share code with the server side (Nashorn) to achieve that.
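
The "convert positions to unique words" step might look something like the sketch below. It assumes positions are `[start, length]` offsets into the re-extracted text; `uniqueWordsAt` is a hypothetical helper, and the sample text and offsets are made up for illustration.

```javascript
// Hypothetical helper: slices each [start, length] range out of the
// extracted text and returns the distinct words, lowercased.
function uniqueWordsAt(text, positions) {
  const seen = new Set();
  for (const [start, length] of positions) {
    seen.add(text.slice(start, start + length).toLowerCase());
  }
  return Array.from(seen);
}

// Made-up example: text re-extracted from a block, plus match offsets.
const extracted = 'Plugins extend lunr. A plugin is a function; plugins are optional.';
const matchPositions = [[0, 7], [23, 6], [45, 7]];

const words = uniqueWordsAt(extracted, matchPositions);
console.log(words);
```

The resulting array of words could then be handed to a highlighter such as mark.js, which searches the rendered rich text for those words itself, so the offsets never have to line up with the displayed markup.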

olivernn Jan 13, 2017

Owner

@julmot If there are already libraries out there that do the job and handle browser inconsistencies correctly (mark.js) then I'd be much more inclined to look at ways of integrating with them instead of creating (and maintaining) another library.

I'd be interested in working with you on a way of providing some Lunr support in mark.js. What is the best way to collaborate on that: an issue here or on mark.js?

julmot Jan 13, 2017

@olivernn Thanks for your reply. Sure, we can discuss this. Please open a new issue; then we'll see whether a plugin (in a separate repository) would be helpful.

@MykolaGolubyev If I understand you correctly you're building your search index by extracting text from pages but don't use this search index to display the results. That's why mark.js is helpful here. Am I right?

MykolaGolubyev Jan 13, 2017

Correct, with one amendment: I am not extracting text from pages. I am using a data source to build a page and to prepare text for indexing.

olivernn referenced this issue in julmot/mark.js Jan 17, 2017

Closed

Lunr Integration #105

julmot Sep 14, 2017

Hey @olivernn,

As v2 is now released, I would just like to ask about the status of highlighting matches in this release. Did you decide to go with mark.js, or do you recommend a different way?

Thanks for your reply.
