Enhancement: Allow full text search #294

Closed
estruyf opened this issue Mar 21, 2022 · 6 comments
Labels
enhancement New feature or request
Comments

Owner

estruyf commented Mar 21, 2022

#292 - allow a full-text search to go through the contents of the pages/articles.

To offer this functionality, the search index needs to be maintained in the file storage, meaning that the search logic needs to move from the webview to the extension worker context.
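
A minimal sketch of what keeping the index in file storage could look like on the extension side, assuming Node's `fs` module is available there. The function names (`toEntry`, `buildIndex`, `saveIndex`, `loadIndex`) and the `{ path, title, body }` entry shape are hypothetical and only illustrate the idea; they are not the extension's actual implementation.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical entry shape: { path, title, body }.
// Splits a Markdown file into front matter and body, and keeps the body text.
function toEntry(filePath, raw) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(raw);
  const frontMatter = match ? match[1] : "";
  const body = match ? raw.slice(match[0].length) : raw;
  const titleLine = /^title:\s*(.*)$/m.exec(frontMatter);
  // Fall back to the file name when the front matter has no title.
  const title = titleLine
    ? titleLine[1].trim().replace(/^["']|["']$/g, "")
    : path.basename(filePath);
  return { path: filePath, title, body: body.trim() };
}

// Builds the index from a list of { path, raw } objects.
function buildIndex(files) {
  return files.map((file) => toEntry(file.path, file.raw));
}

// Persists the index as JSON so it survives between sessions.
function saveIndex(indexFile, entries) {
  fs.writeFileSync(indexFile, JSON.stringify(entries), "utf8");
}

// Reads a previously saved index back into memory.
function loadIndex(indexFile) {
  return JSON.parse(fs.readFileSync(indexFile, "utf8"));
}
```

With something like this, the webview would only send the query and receive results, while reading, writing, and searching the index stay in the extension process.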

@estruyf estruyf added the enhancement New feature or request label Mar 21, 2022

bwklein commented Mar 21, 2022

I'm already generating JSON files for full-text search on my sites. If you specify a JSON data structure that we can put in the root of our project, we could point your search function at it whenever we want full-text search in the extension UI.

Here is an example of one of my search index files. https://www.femtc.com/search.json
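
One way to pin down such a data structure is a small validator that could be run over a user-supplied file. The sketch below assumes each entry carries `title`, `contents`, and `permalink` string fields, the same three fields the Hugo template later in this thread emits. That shape and the function name `validateSearchIndex` are illustrative assumptions, not a format the extension defines.

```javascript
// Hypothetical entry shape: { title: string, contents: string, permalink: string }.
const REQUIRED_FIELDS = ["title", "contents", "permalink"];

// Returns a list of human-readable problems; an empty list means the index is usable.
function validateSearchIndex(data) {
  if (!Array.isArray(data)) {
    return ["index must be a JSON array of entries"];
  }
  const problems = [];
  data.forEach((entry, i) => {
    if (entry === null || typeof entry !== "object") {
      problems.push(`entry ${i}: must be an object`);
      return;
    }
    for (const field of REQUIRED_FIELDS) {
      if (typeof entry[field] !== "string") {
        problems.push(`entry ${i}: "${field}" must be a string`);
      }
    }
  });
  return problems;
}

// Example: the second entry is missing two of the assumed fields.
const sample = JSON.parse(
  '[{"title":"About","contents":"Who we are","permalink":"/about/"},{"title":"Broken"}]'
);
console.log(validateSearchIndex(sample));
```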

Owner Author

estruyf commented Mar 22, 2022

@bwklein which library do you use for the search experience on the website?


bwklein commented Mar 22, 2022

For that site I am using https://www.meilisearch.com/, but for smaller Hugo sites I use something like this.

Index the content to JSON (e.g. https://psrfcu.org/index.json).

layouts/_default/index.json

{{- $.Scratch.Add "index" slice -}}
{{- range .Site.RegularPages -}}
  {{- $title := .Title -}}
  {{- $relURL := .RelPermalink -}}
  {{- $tags := .Params.tags -}}
  {{- $categories := .Params.categories -}}
  {{- range .Params.page_sections -}}
    {{- if (and (ne "cta-banner" .block) (ne "hero" .block) (ne "slider" .block)) }}
      {{- $fmContent := "" -}}
      {{- $sectionURL := "" -}}
      {{- $anchor := "" -}}
      {{- if .heading -}}
        {{- $title = .heading -}}
        {{- $anchor = ( print "#" (urlize .heading) ) -}}
      {{- else if .headline -}}
        {{- $title = .headline -}}
        {{- $anchor = ( print "#" (urlize .headline) ) -}}
      {{- end -}}
      {{- with .content_markdown -}}
      {{- $fmContent = (print $fmContent (markdownify . | plainify)) -}}
      {{- end -}}
      {{- with .description_markdown -}}
      {{- $fmContent = (print $fmContent (markdownify . | plainify)) -}}
      {{- end -}}
      {{- with .footnotes_markdown -}}
      {{- $fmContent = (print $fmContent (markdownify . | plainify)) -}}
      {{- end -}}
      {{- if (ne $fmContent "") -}}
      {{- $.Scratch.Add "index" (dict "title" $title "contents" (replace $fmContent "\n" " ") "permalink" (print $relURL $anchor) ) -}}
      {{- end -}}
    {{- end -}}
  {{- end -}}
{{- end -}}
{{- $.Scratch.Get "index" | jsonify -}}

JavaScript search function that uses jQuery (I'm lazy and haven't rewritten this in vanilla JS yet).

static/js/search.js

var summaryInclude = 100;

var fuseOptions = {
  shouldSort: true,
  includeMatches: true,
  threshold: 0.0,
  tokenize:true,
  location: 0,
  distance: 100,
  maxPatternLength: 32,
  minMatchCharLength: 1,
  keys: [
    {name:"title",weight:0.8},
    {name:"contents",weight:0.5}
  ]
};

var searchQuery = param("s");

if(searchQuery){
  $("#search-query").val(searchQuery);
  executeSearch(searchQuery);
} else {
  window.addEventListener('load', function () {
    document.getElementById('search-results').innerHTML = '<p>Please enter a word or phrase in the Search box at the top of the page.</p>';
  });
}

function executeSearch(searchQuery){
  $.getJSON( "/index.json", function( data ) {
    var pages = data;
    var fuse = new Fuse(pages, fuseOptions);
    var result = fuse.search(searchQuery);
    console.log({"matches":result});
    if(result.length > 0){
      populateResults(result);
    }else{
      $('#search-results').append("<p>No matches found, please try with a different search word or phrase.</p>");
    }
  });
}

function populateResults(result){
  $.each(result,function(key,value){
    var contents= value.item.contents;
    var snippet = "";
    var snippetHighlights=[];
    var tags =[];
    if( fuseOptions.tokenize ){
      snippetHighlights.push(searchQuery);
    }else{
      $.each(value.matches,function(matchKey,mvalue){
        if(mvalue.key == "tags" || mvalue.key == "categories" ){
          snippetHighlights.push(mvalue.value);
        }else if(mvalue.key == "contents"){
          // Declare locally so start/end do not leak into the global scope.
          var start = mvalue.indices[0][0]-summaryInclude>0?mvalue.indices[0][0]-summaryInclude:0;
          var end = mvalue.indices[0][1]+summaryInclude<contents.length?mvalue.indices[0][1]+summaryInclude:contents.length;
          snippet += contents.substring(start,end);
          // indices are inclusive [start, end] pairs; substring() takes an exclusive end index.
          snippetHighlights.push(mvalue.value.substring(mvalue.indices[0][0],mvalue.indices[0][1]+1));
        }
      });
    }

    if(snippet.length<1){
      snippet += contents.substring(0,summaryInclude*2);
    }
    //pull template from hugo template definition
    var templateDefinition = $('#search-result-template').html();
    //replace values
    var output = render(templateDefinition,{key:key,title:value.item.title,link:value.item.permalink,tags:value.item.tags,categories:value.item.categories,snippet:snippet});
    $('#search-results').append(output);

    $.each(snippetHighlights,function(snipkey,snipvalue){
      $("#summary-"+key).mark(snipvalue);
    });

  });
}

function param(name) {
    return decodeURIComponent((location.search.split(name + '=')[1] || '').split('&')[0]).replace(/\+/g, ' ');
}

function render(templateString, data) {
  var conditionalMatches,conditionalPattern,copy;
  conditionalPattern = /\$\{\s*isset ([a-zA-Z]*) \s*\}(.*)\$\{\s*end\s*}/g;
  //since loop below depends on re.lastIndex, we use a copy to capture any manipulations whilst inside the loop
  copy = templateString;
  while ((conditionalMatches = conditionalPattern.exec(templateString)) !== null) {
    if(data[conditionalMatches[1]]){
      //valid key, remove conditionals, leave contents.
      copy = copy.replace(conditionalMatches[0],conditionalMatches[2]);
    }else{
      //not valid, remove entire section
      copy = copy.replace(conditionalMatches[0],'');
    }
  }
  templateString = copy;
  //now any conditionals removed we can do simple substitution
  var key, find, re;
  for (key in data) {
    find = '\\$\\{\\s*' + key + '\\s*\\}';
    re = new RegExp(find, 'g');
    templateString = templateString.replace(re, data[key]);
  }
  return templateString;
}
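
Two of the helpers above can be written without jQuery. This is a rough sketch rather than a drop-in replacement: `getQueryParam` and `buildSnippet` are hypothetical names, and `buildSnippet` assumes match indices are inclusive `[start, end]` pairs, as in the `indices` arrays used above.

```javascript
// Reads a query-string parameter by exact name using the standard URLSearchParams API.
// Splitting on "s=" would also match inside a parameter such as "tags=", which this avoids.
function getQueryParam(search, name) {
  const value = new URLSearchParams(search).get(name);
  return value === null ? "" : value;
}

// Returns the text around the first match, padded by `padding` characters on each side.
// Without any match it falls back to the first `padding * 2` characters.
function buildSnippet(contents, indices, padding) {
  if (!indices || indices.length === 0) {
    return contents.substring(0, padding * 2);
  }
  const [matchStart, matchEnd] = indices[0];
  const start = Math.max(0, matchStart - padding);
  const end = Math.min(contents.length, matchEnd + padding);
  return contents.substring(start, end);
}

console.log(getQueryParam("?tags=x&s=credit+union", "s"));
console.log(buildSnippet("Open a savings account today", [[7, 13]], 5));
```

Both functions are pure, so they can be exercised outside the browser; in the page itself `getQueryParam(location.search, "s")` would take the place of `param("s")`.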

The Hugo search page that brings it all together.

layouts/_default/search.html

{{- define "main" -}}
<script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/3.2.0/fuse.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/jquery.mark.min.js"></script>
<script src="/js/search.js"></script>

<section class="w-full px-4 lg:px-8">
  <div class="block mt-8 px-8 max-w-full lg:max-w-6xl mx-auto pb-8">
    <h2 class="font-heading text-xl lg:text-4xl text-psrfcu mb-6">Search Results</h2>
    <div id="search-results"></div>
  </div>
</section>

<!-- this template is sucked in by search.js and appended to the search-results div above. So editing here will adjust style -->
<script id="search-result-template" type="text/x-js-template">
  <div id="summary-${key}" class="ml-8 mb-4">
    <a href="${link}">
      <h3 class="text-2xl font-heading text-psrfcu">${title}</h3>
      <p class="pl-4">${snippet}&nbsp;&#8230;</p>
    </a>
  </div>
</script>
{{- end -}}

Owner Author

estruyf commented Mar 23, 2022

@bwklein it might start to create conflicts. Internally the extension was already using Fuse.js, but was not indexing the body content. In the upcoming beta, this will be changed.
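
For illustration, indexing body content with Fuse.js mostly comes down to adding a body field to each document and listing it under `keys`. The sketch below is an assumption about the general approach, not the extension's code: the `body` field name, the weights, the threshold, and `toSearchDocument` are hypothetical, and the `Fuse` call is left as a comment so the snippet runs without the library.

```javascript
// Options in the shape Fuse.js accepts; `keys` lists the fields to search.
const bodySearchOptions = {
  includeMatches: true,
  threshold: 0.3,
  keys: [
    { name: "title", weight: 0.8 },
    { name: "body", weight: 0.5 }, // the newly indexed body content
  ],
};

// Hypothetical page shape: { title, path, rawBody }.
// Collapses whitespace so line breaks in the Markdown do not split phrases.
function toSearchDocument(page) {
  return {
    title: page.title,
    path: page.path,
    body: page.rawBody.replace(/\s+/g, " ").trim(),
  };
}

const searchDocs = [
  { title: "Hello", path: "content/hello.md", rawBody: "First line\n\nsecond   line\n" },
].map(toSearchDocument);

// With Fuse.js loaded:
// const results = new Fuse(searchDocs, bodySearchOptions).search("second line");
console.log(searchDocs);
```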

@estruyf estruyf added this to In progress in v7.1.0 Mar 23, 2022
estruyf added a commit that referenced this issue Mar 23, 2022
Owner Author

estruyf commented Mar 23, 2022

@grahampcharles this is enabled in the latest beta


bwklein commented Mar 23, 2022

@estruyf this is great! It has the advantage of not requiring any specific setup for search to work. Plus, just knowing that a word is in a file is enough; I don't really need a link to the specific section of the document like I do with my more advanced search index/results.

@estruyf estruyf closed this as completed Apr 6, 2022
v7.1.0 automation moved this from In progress to Done Apr 6, 2022