-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search problems #46
Comments
Yes, you're right there is some search weirdness. Honestly I was hoping to have someone who is more proficient in JavaScript / lunr.js help out here. I think the search engine itself works fine (I've tested it on other sites), but the way that the search index is getting generated and initialized is probably causing the majority of the problems. |
One quick fix I was able to do when it was erroring out on my content was to convert crlf, cr, lf, and tab to a single space: "content": "{{ page.content | markdownify | strip_html | xml_escape | remove: 'Table of contents' | normalize_whitespace | replace: "\r\n", " " | replace: "\r", " " | replace: "\n", " " | replace: "\t", " " | replace: '\', ' ' }}", The odd thing is that reading the documentation this seems to be what normalize_whitespace should do. That improved the quality of matches, but it's still got issues - I'll take a peek at lunr next. |
I ended up somewhere interesting: {
{% assign docId = 1 %}{% for page in site.html_pages %}{% unless page.search == "exclude" %}"{{ docId }}": {
"id": "{{ docId }}",
"title": "{{ page.title | xml_escape }}",
"content": "{{ page.content | markdownify | strip_html | xml_escape | remove: 'Table of contents' | replace: "\r", " " | replace: "\n", " " | replace: "\t", " " | normalize_whitespace | replace: '\', ' ' }}",
"url": "{{ page.url | absolute_url | xml_escape }}",
"relUrl": "{{ page.url | xml_escape }}"
}{% if forloop.last %}{% else %},
{% endif %}{% assign docId = docId | plus: 1 %}{% endunless %}{% endfor %}
{% for collection in site.collections %}{% unless collection.output == false %}{% for page in collection.docs %}{% unless page.search == "exclude" %}{% if forloop.first %},{% else %}{% endif %}"{{ docId }}": {
"id": "{{ docId }}",
"title": "{{ page.title | xml_escape }}",
"content": "{{ page.content | markdownify | strip_html | xml_escape | remove: 'Table of contents' | replace: "\r", " " | replace: "\n", " " | replace: "\t", " " | normalize_whitespace | replace: '\', ' ' }}",
"url": "{{ page.url | absolute_url | xml_escape }}",
"relUrl": "{{ page.url | xml_escape }}"
}{% if forloop.last %}{% else %},
{% endif %}{% assign docId = docId | plus: 1 %}{% endunless %}{% endfor %}{% endunless %}{% endfor %}
} In this case I made changes to:
|
@kendallmiller thanks for your digging in here... I will take a look at this soon. |
Thanks for this theme. While we are waiting for some help on this issue, can we have an option for google search opening a new tab as in this theme: https://phlow.github.io/feeling-responsive/search/ Would make this fantastic theme be deployed by many as there would be no more serious issues. Thanks again. |
I think i have solved the search bugs in #52 ... I will open an issue to use collections as a separate work stream to not hold up v0.2.2 from being released. |
The Search box seems to be pretty unreliable to me. Sometimes it shows false positives, sometimes actual results are missing.
Example: On your site, type "searchsearch_enabled" to the search box.
As long as you type "searchsearch_en", there are matches. As you keep typing "abl", no more matches. When you type the trailing "ed", the matches come back.
Part of the story is that plenty of space characters are missing from search-data.json, e.g. whenever there's an explicit newline in markdown (to keep a sensible width of the source file), the two words become concatenated. That's how lines 14-15 of docs/configuration.md become "searchsearch_enabled", that's why searching for "abcdef" (present in ui-components/typography.md) doesn't give any match, etc.
Another part of the story is that the search engine must be buggy as well, for example not returning the match for the substring "searchsearch_ena". Another example: utilities/color.md contains the word "systemsized", this correctly appears in search_data.json, yet not found via the search box. It's quite easy to find other examples too.
The text was updated successfully, but these errors were encountered: