add options for custom lunr Liquid and JS code #1068

Merged (4 commits) on Jan 14, 2023
Empty file added _includes/lunr/custom-data.json
Empty file.
Empty file added _includes/lunr/custom-index.js
Empty file.
1 change: 1 addition & 0 deletions assets/js/just-the-docs.js
@@ -87,6 +87,7 @@ function initSearch() {
this.metadataWhitelist = ['position']

for (var i in docs) {
{% include lunr/custom-index.js %}
this.add({
id: i,
title: docs[i].title,
2 changes: 2 additions & 0 deletions assets/js/zzzz-search-data.json
@@ -51,6 +51,7 @@ permalink: /assets/js/search-data.json
"title": {{ title | jsonify }},
"content": {{ content | replace: '</h', ' . </h' | replace: '<hr', ' . <hr' | replace: '</p', ' . </p' | replace: '<ul', ' . <ul' | replace: '</ul', ' . </ul' | replace: '<ol', ' . <ol' | replace: '</ol', ' . </ol' | replace: '</tr', ' . </tr' | replace: '<li', ' | <li' | replace: '</li', ' | </li' | replace: '</td', ' | </td' | replace: '<td', ' | <td' | replace: '</th', ' | </th' | replace: '<th', ' | <th' | strip_html | remove: 'Table of contents' | normalize_whitespace | replace: '. . .', '.' | replace: '. .', '.' | replace: '| |', '|' | append: ' ' | jsonify }},
"url": "{{ url | relative_url }}",
{% include lunr/custom-data.json page=page %}
"relUrl": "{{ url }}"
}
{%- assign i = i | plus: 1 -%}
@@ -62,6 +63,7 @@ permalink: /assets/js/search-data.json
"title": {{ page.title | jsonify }},
"content": {{ parts[0] | replace: '</h', ' . </h' | replace: '<hr', ' . <hr' | replace: '</p', ' . </p' | replace: '<ul', ' . <ul' | replace: '</ul', ' . </ul' | replace: '<ol', ' . <ol' | replace: '</ol', ' . </ol' | replace: '</tr', ' . </tr' | replace: '<li', ' | <li' | replace: '</li', ' | </li' | replace: '</td', ' | </td' | replace: '<td', ' | <td' | replace: '</th', ' | </th' | replace: '<th', ' | <th' | strip_html | remove: 'Table of contents' | normalize_whitespace | replace: '. . .', '.' | replace: '. .', '.' | replace: '| |', '|' | append: ' ' | jsonify }},
"url": "{{ page.url | relative_url }}",
{% include lunr/custom-data.json page=page %}
"relUrl": "{{ page.url }}"
}
{%- assign i = i | plus: 1 -%}
39 changes: 39 additions & 0 deletions docs/search.md
@@ -125,3 +125,42 @@ $ bundle exec just-the-docs rake search:init

This command creates the `assets/js/zzzz-search-data.json` file that Jekyll uses to create your search index.
Alternatively, you can create the file manually with [this content]({{ site.github.repository_url }}/blob/main/assets/js/zzzz-search-data.json).

## Custom content for search index

The standard text that is indexed is the page (or post) `.content`, `.title`, and sometimes headers which are already in the `.content`. Other text (Front Matter, data files, etc.) is not indexed. When you want additional text to be indexed, you can customize Just the Docs.
mattxwang marked this conversation as resolved.

{: .warning }
> Customizing search indices is an advanced feature that requires Javascript and Liquid knowledge.

1. When your site used a previous version of Just the Docs, you must update the file
`assets/js/zzzz-search-data.json` using the [above methods](#generate-search-index-when-used-as-a-gem).
Member

Suggested change
- 1. When your site used a previous version of Just the Docs, you must update the file
- `assets/js/zzzz-search-data.json` using the [above methods](#generate-search-index-when-used-as-a-gem).
+ 1. First, ensure that `assets/js/zzzz-search-data.json` is up-to-date; it can be regenerated with `rake` or manually (see: ["Generate search index when used as a gem"](#generate-search-index-when-used-as-a-gem)).

Contributor

@mattxwang I'm a bit puzzled by this bit of the docs update. Surely the sources of a site using JtD as a (remote) theme should not have a copy of the file assets/js/zzzz-search-data.json unless it is already customising search. Please clarify!

Contributor Author

Good question, and I have no answer to that 🤷‍♀️... I have no experience with remote themes. Since it is... remote... I would assume there are no "local" core theme files. Unless they are overridden... if remote themes support that.

Contributor

The tests site uses remote_theme: just-the-docs/just-the-docs, and I think it's equivalent to theme: just-the-docs with a Gemfile that pulls the gem from GitHub.

I find it quite convenient, because I can simply append @REF where REF can be any tag (release, pre-release, branch, commit) and restart Jekyll, without needing to rerun bundle.

And yes, the tests site generally relies on the (remote) theme providing everything in _includes, _layouts, and _sass, but it can override them when needed.

Member

Generally, users of our theme shouldn't have a copy of this; however, in my experience I have seen many, many sites that have copied the file directly into their repo. I had also mostly made this edit originally based off of @diablodale's writing.

(there's some reason that this rake task exists - my understanding is that sometimes, the JSON file doesn't exist. Perhaps I need to do more digging here)

Contributor Author

I found in existing JTD docs https://just-the-docs.github.io/just-the-docs/docs/search/#generate-search-index-when-used-as-a-gem

If you use Just the Docs as a remote theme, you do not need the following steps. If you use the theme as a gem....do rake

When that is not true, that is a separate issue.
I assume for now it is true.

For gem use, rake needs to be run because those "true" instructions tell me to do that. Any zzzz-search-data.json previously created by rake is out of date and will not work with this PR: it will silently do nothing. This PR needs an updated version of zzzz-search-data.json, which rake creates.

An experiment: When I bundle install and my Gemfile points to gem "just-the-docs", github: "diablodale/just-the-docs", branch: "dp-release" it creates a local /usr/local/bundle/bundler/gems/just-the-docs-5d413480c165/assets/js/zzzz-search-data.json. I ?assume? that jekyll will use that local system zzzz file. Therefore, if a user updated the gem containing this PR, then it should have a new local zzzz file. I am not a Ruby or Gem expert...so these are more logical guesses.

Member

Right - my comment is more that, bundle exec just-the-docs rake search:init shouldn't be necessary, since we bundle the assets folder as part of the gem. Definitely a separate issue, just wanted to provide some context for the above conversation.

2. Add a new file named `_includes/lunr/custom-data.json` to your site. Insert your custom Liquid
code that reads the page object at `include.page` then generates custom Javascript
fields that hold the custom text you want to index. You can verify the fields you output at
the generated `assets/js/search-data.json`.
Member

Suggested change
- 2. Add a new file named `_includes/lunr/custom-data.json` to your site. Insert your custom Liquid
- code that reads the page object at `include.page` then generates custom Javascript
- fields that hold the custom text you want to index. You can verify the fields you output at
- the generated `assets/js/search-data.json`.
+ 2. To add Liquid/Jekyll-based data: create a new include at the path `_includes/lunr/custom-data.json`. Insert custom Liquid code that reads various data (ex: `include.page`, `site.data`, `site.static_files`) that then generates valid [JSON](https://www.json.org/json-en.html) to add to the index. Verify the fields in the generated `assets/js/search-data.json`.

Contributor

@pdmosses, Jan 14, 2023

@mattxwang A very minor meta-nit: "ex" could be read as "excluding", and I don't think we should use it as a short form of "e.g." or "for example".

This seems even more likely to be confusing (at least for European readers) when the colon is omitted, e.g., in "ex front matter, …".

Contributor Author

I'm not confident the recent edits to steps 2 and 3 are clear. As now written, they read as disconnected: "Use 2 for Liquid", "Use 3 for JavaScript".
But the truth is that the coder must use both. That is why I wrote them tightly together, in sequence, and as part of the same numbered list.

This is already a very advanced feature. Anyone using it will know what they are doing. I suspect the majority of these advanced users want liquid->lunr, and the docs should explain that majority scenario.

If you want to document the ability of two different scenarios:

  • liquid->javascript->lunr
  • javascript->lunr

...then we need to separate sections and two separate numbered lists.

Contributor Author

> @mattxwang A very minor meta-nit: "ex" could be read as "excluding", and I don't think we should use it as a short form of "e.g." or "for example".
>
> This seems even more likely to be confusing (at least for European readers) when the colon is omitted, e.g., in "ex front matter, …".

Agreed. "ex" should not be used. "e.g." or "for example" is the clear and correct usage.

3. Add a new file named `_includes/lunr/custom-index.js` to your site. Insert your custom Javascript
code that reads your custom Javascript fields and inserts them into the search index.
mattxwang marked this conversation as resolved.
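
The verification mentioned in step 2 can be sketched as a small script. This is a minimal sketch and not part of the theme: `findMissingFields`, the inlined sample data, and the field names `myusage` and `myexamples` are hypothetical. In practice you would load the generated `assets/js/search-data.json` from your built site instead of the inlined string.

```javascript
// Hypothetical excerpt of a generated search-data.json, inlined so the sketch runs on its own.
const searchData = JSON.parse(`{
  "0": { "title": "Intro", "content": "Intro text. ", "url": "/intro/", "myusage": "Run it.", "myexamples": "", "relUrl": "/intro/" },
  "1": { "title": "Old page", "content": "Old text. ", "url": "/old/", "relUrl": "/old/" }
}`);

// Returns the ids of entries that lack at least one of the expected custom fields.
function findMissingFields(data, fieldNames) {
  return Object.keys(data).filter((id) =>
    fieldNames.some((name) => !(name in data[id]))
  );
}

// Lists the ids of entries where a custom field is absent.
console.log(findMissingFields(searchData, ['myusage', 'myexamples']));
```

If every entry is reported, one likely cause is that the site still uses a `zzzz-search-data.json` that predates the include hook.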

#### Example

`_includes/lunr/custom-data.json` custom code reads the page's custom Front Matter `usage` and `examples`
fields, normalizes the text, and writes the text to custom Javascript `myusage` and `myexamples` fields.
mattxwang marked this conversation as resolved.

{% raw %}
```liquid
{%- capture newline %}
{% endcapture -%}
"myusage": {{ include.page.usage | markdownify | replace:newline,' ' | strip_html | normalize_whitespace | strip | jsonify }},
"myexamples": {{ include.page.examples | markdownify | replace:newline,' ' | strip_html | normalize_whitespace | strip | jsonify }},
```
{% endraw %}
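
Because the include is rendered between the `"url"` and `"relUrl"` entries of `zzzz-search-data.json`, every field it emits has to end with a comma, while an empty include still leaves valid JSON. The following is a minimal sketch of that constraint; `buildEntry`, `isValidEntry`, and the sample values are hypothetical and only mimic how the template concatenates one entry.

```javascript
// Hypothetical helper that mimics how the template assembles one index entry
// around the output of the custom-data.json include.
function buildEntry(customFragment) {
  return '{ "title": "Intro", "url": "/intro/", ' + customFragment + ' "relUrl": "/intro/" }';
}

// Returns true when the assembled entry parses as JSON.
function isValidEntry(customFragment) {
  try {
    JSON.parse(buildEntry(customFragment));
    return true;
  } catch (err) {
    return false;
  }
}

console.log(isValidEntry(''));                       // empty include
console.log(isValidEntry('"myusage": "Run it.",'));  // trailing comma present
console.log(isValidEntry('"myusage": "Run it."'));   // trailing comma missing
```

Only the last call reports an invalid entry, which is why both lines in the Liquid example end with a comma.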

`_includes/lunr/custom-index.js` custom code is within a Javascript loop. All custom
Javascript fields are accessed as fields of `docs[i]` such as `docs[i].myusage`.
Finally, append your custom fields on to the already existing `docs[i].content`.

```javascript
const content_to_merge = [docs[i].content, docs[i].myusage, docs[i].myexamples];
docs[i].content = content_to_merge.join(' ');
```
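
Pages without `usage` or `examples` Front Matter may yield empty or missing custom fields. As a variation on the snippet above (a sketch under that assumption, not the theme's code; `mergeCustomFields` and the sample `docs` object are hypothetical), the merge can skip such values so that no stray separators end up in the indexed content:

```javascript
// Hypothetical stand-in for the parsed search-data.json that the theme iterates over.
const docs = {
  "0": { title: "Intro", content: "Intro text.", myusage: "Run it.", myexamples: "" },
  "1": { title: "Plain", content: "Plain text." }
};

// Appends only non-empty string fields to the existing content.
function mergeCustomFields(doc, fieldNames) {
  const parts = [doc.content];
  for (const name of fieldNames) {
    const value = doc[name];
    if (typeof value === 'string' && value.length > 0) {
      parts.push(value);
    }
  }
  return parts.join(' ');
}

// Same loop shape as in just-the-docs.js; here it runs standalone on the sample data.
for (var i in docs) {
  docs[i].content = mergeCustomFields(docs[i], ['myusage', 'myexamples']);
}
```

Inside `_includes/lunr/custom-index.js` itself, `docs` and `i` are already provided by the surrounding loop, so only the function and the assignment would be needed there.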
2 changes: 2 additions & 0 deletions lib/tasks/search.rake
@@ -61,6 +61,7 @@ permalink: /assets/js/search-data.json
"title": {{ title | jsonify }},
"content": {{ content | replace: \'</h\', \' . </h\' | replace: \'<hr\', \' . <hr\' | replace: \'</p\', \' . </p\' | replace: \'<ul\', \' . <ul\' | replace: \'</ul\', \' . </ul\' | replace: \'<ol\', \' . <ol\' | replace: \'</ol\', \' . </ol\' | replace: \'</tr\', \' . </tr\' | replace: \'<li\', \' | <li\' | replace: \'</li\', \' | </li\' | replace: \'</td\', \' | </td\' | replace: \'<td\', \' | <td\' | replace: \'</th\', \' | </th\' | replace: \'<th\', \' | <th\' | strip_html | remove: \'Table of contents\' | normalize_whitespace | replace: \'. . .\', \'.\' | replace: \'. .\', \'.\' | replace: \'| |\', \'|\' | append: \' \' | jsonify }},
"url": "{{ url | relative_url }}",
{% include lunr/custom-data.json page=page %}
"relUrl": "{{ url }}"
}
{%- assign i = i | plus: 1 -%}
@@ -72,6 +73,7 @@ permalink: /assets/js/search-data.json
"title": {{ page.title | jsonify }},
"content": {{ parts[0] | replace: \'</h\', \' . </h\' | replace: \'<hr\', \' . <hr\' | replace: \'</p\', \' . </p\' | replace: \'<ul\', \' . <ul\' | replace: \'</ul\', \' . </ul\' | replace: \'<ol\', \' . <ol\' | replace: \'</ol\', \' . </ol\' | replace: \'</tr\', \' . </tr\' | replace: \'<li\', \' | <li\' | replace: \'</li\', \' | </li\' | replace: \'</td\', \' | </td\' | replace: \'<td\', \' | <td\' | replace: \'</th\', \' | </th\' | replace: \'<th\', \' | <th\' | strip_html | remove: \'Table of contents\' | normalize_whitespace | replace: \'. . .\', \'.\' | replace: \'. .\', \'.\' | replace: \'| |\', \'|\' | append: \' \' | jsonify }},
"url": "{{ page.url | relative_url }}",
{% include lunr/custom-data.json page=page %}
"relUrl": "{{ page.url }}"
}
{%- assign i = i | plus: 1 -%}