Large search indexes can cause Lunr to freeze the page while it creates the index. #859

Closed
sashadt opened this Issue Mar 3, 2016 · 56 comments


sashadt commented Mar 3, 2016

When using the readthedocs theme, search shows "no results" for a few seconds while the search is being performed, and then results are loaded. However, a few seconds are enough to convince users that nothing was found, and they move on.

For example, see http://docs.datatorrent.com. Try searching for "test" and you will see the following for a while:

[screenshot: "Sorry, page not found" shown while the search runs]

Perhaps an indicator of search in progress can be added?

sashadt changed the title from Search shows "No results" for a few seconds before results are loaded to Search shows "Sorry, page not found" for a few seconds before results are loaded Mar 3, 2016

waylan (Member) commented Mar 3, 2016

Perhaps an indicator of search in progress can be added?

That seems reasonable to me. Presumably the problem occurs due to the large number of pages. On a smaller site, this would not be a problem as the wait time would be much less.


sashadt commented Mar 3, 2016

We have 50 files with 18,936 lines of Markdown. I don't know if that's considered large, but even for smaller projects running on slower servers it would provide a better user experience. Thanks for looking into it!


waylan (Member) commented Mar 4, 2016

It appears that the RTD theme includes the default content "Sorry, page not found." on the search results page (see search.html#L18). However, the search tool will also replace any content with <p>No results found</p> (see search.js#L63), which makes the initial filler redundant. It appears that RTD itself displays nothing. Should we do the same, or display a message telling the user to wait for results?


d0ugal (Member) commented Mar 4, 2016

This is quite interesting. I assumed the delay was the length of time it takes to download the search index (about 0.5 MB). However, there is a long delay after it is downloaded. So is the delay while lunr.js is processing the index? That would mean the slowness depends more on the user's machine than on the server. Looking at the Lunr changelog, there are a few performance improvements in 0.5.8 and later (we ship 0.5.7). So I think there are two things we should do:

  • Add a loading indicator (this could just be static text saying "Searching...") and remove the incorrect text in ReadTheDocs's search.html
  • Update Lunr.js to 0.6.0

d0ugal (Member) commented Mar 4, 2016

I just cloned and tested with the DataTorrent docs locally, which confirms the slowness is in the JavaScript, as downloading is almost instantaneous.

Even worse, my browser (Chrome) is locked up by the JavaScript for a good few seconds, so I can't actually click anything. I did some very rough measurements to try to find what was making it so slow. My timings below are all rounded because I don't have much confidence in my measuring.

The loading of the JSON into Lunr is the culprit. This for loop takes around 8(!!) seconds. More specifically, it is the call to index.add(doc), so I timed that for each document (the project has 1022, as we add individual Markdown documents and page headers separately to improve the search results). Most of them are very fast (less than 50ms), but two are really slow. The Application Developer Guide takes around 1 second consistently, and the Release Notes take around 4.5 seconds consistently.

I don't have time to keep looking now, but we need to try to determine what makes these so slow. Is it just because these are long pages? Is there something in them that causes an issue for Lunr? I tested with both Lunr 0.5.7 and 0.6.0 and got similar results. 0.6.0 might actually be a little slower, but I didn't do enough tests to be confident about that.
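
For reference, a minimal sketch of this kind of per-document timing, assuming lunr 0.x and a docs array shaped like the one in MkDocs' search_index.json; the field names and the 50ms threshold are illustrative, not the project's actual code:

// Hypothetical timing harness; `docs` is the parsed page array from
// MkDocs' search_index.json.
var index = lunr(function () {
  this.field('title', { boost: 10 })
  this.field('text')
  this.ref('location')
})

docs.forEach(function (doc) {
  var start = performance.now()
  index.add(doc)                     // the slow call identified above
  var elapsed = performance.now() - start
  if (elapsed > 50) {                // log only the slow documents
    console.log(doc.location + ' took ' + elapsed.toFixed(1) + 'ms')
  }
})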


d0ugal added the Bug label Mar 4, 2016

waylan (Member) commented Mar 4, 2016

Even worse, my browser (Chrome) is locked up by the JavaScript for a good few seconds, so I can't actually click anything.

I noticed the same behavior when looking into this as well, although I didn't dig any deeper. The point is that I can confirm this behavior is not isolated.

Add a loading indicator (this could just be static text saying "Searching...") and remove the incorrect text in ReadTheDocs's search.html

I don't see any reason why we can't do this now. I'll push a PR when I get a few minutes unless someone beats me to it.


waylan added a commit that referenced this issue Mar 4, 2016

Update default text when search is running.
The search tool will replace the text with "No results found" if that is the case.
No reason to display it here. Also, the default text displays when the search
is running (especially if the search takes a long time), so it might as well be 
accurate and indicate that the search is running. Addresses #859.

d0ugal added a commit to d0ugal/lunr.js that referenced this issue Mar 4, 2016

Improve the performance of Index.add.
The loops in Index.add can easily be called hundreds of thousands of times for
large documents. Larger documents can perform particularly badly[1] under
Chrome. Removing the usage of reduce and filter in favour of inline native for
loops can lead to a 20-30% performance increase in Chrome. Firefox also sees a
similar but much smaller improvement (~5%).

[1]: mkdocs/mkdocs#859 (comment)

d0ugal referenced this issue in olivernn/lunr.js Mar 4, 2016

Closed

Improve the performance of Index.add. #208

d0ugal (Member) commented Mar 6, 2016

I investigated further. Firefox is much faster than Chrome. I managed to improve the performance under Chrome a bit. See: olivernn/lunr.js#208


olivernn added a commit to olivernn/lunr.js that referenced this issue Mar 8, 2016

Improve the performance of Index.add.
The loops in Index.add can easily be called hundreds of thousands of times for
large documents. Larger documents can perform particularly badly[1] under
Chrome. Removing the usage of reduce and filter in favour of inline native for
loops can lead to a 20-30% performance increase in Chrome. Firefox also sees a
similar but much smaller improvement (~5%).

[1]: mkdocs/mkdocs#859 (comment)
olivernn commented Mar 8, 2016

I've pushed @d0ugal's changes in 0.7.0 of lunr, so there should hopefully be some improvements from that change.

Without knowing anything of how mkdocs makes use of lunr, I have a few suggestions that could improve the performance and user experience further:

  • Pre-build the index and then load this pre-built index
  • Move searching/building over to a webworker

As an example, the example lunr app loads a pre-built index. I think a member of the Angular.js team wrote up how they improved performance using web workers too.
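
As a sketch of the first suggestion: lunr 0.x can load a serialized index via lunr.Index.load, skipping the per-document add() calls entirely. The URL here is hypothetical, not MkDocs' actual layout:

// Hypothetical: fetch a pre-built, serialized index instead of rebuilding
// it from the raw page data on every page load.
var xhr = new XMLHttpRequest()
xhr.open('GET', 'mkdocs/js/prebuilt_index.json')  // assumed path
xhr.onload = function () {
  var index = lunr.Index.load(JSON.parse(xhr.responseText))
  console.log(index.search('test'))  // results are {ref, score} pairs
}
xhr.send()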


d0ugal added a commit to d0ugal/mkdocs that referenced this issue Mar 9, 2016

d0ugal changed the title from Search shows "Sorry, page not found" for a few seconds before results are loaded to Large search indexes can cause Lunr to freeze the page while it creates the index. Mar 9, 2016

d0ugal (Member) commented Mar 9, 2016

Our usage of Lunr is very simple. I think all the relevant code is less than 100 lines. It is a testament to Lunr that it has gotten us this far before we hit an issue like this.

Thanks for the pointers; it sounds like both of those would be valuable additions. I'll look into pre-building the index; I think that would be the biggest win.


d0ugal (Member) commented Mar 9, 2016

The main issue for us is that we are Python-based, so pre-building the index would be tricky. We would need to shell out to Node and call a file. I would be okay with that if it gracefully fell back to the current approach when Node isn't available. However, this sort of functionality might be better suited to a plugin (so another candidate for #206).

That probably means we should look into web workers for now and revisit this later.


pcdinh commented Mar 16, 2016

I got the same issue, and I only have a few files. Every time I visit my mkdocs-based web documentation, the page is frozen. Firefox's warning dialog appeared several times, which allowed me to stop lunr.js.

I found that the search_index.json response was about 1.2 MB. It is fast to download but quite slow to parse in JavaScript.


waylan (Member) commented Mar 16, 2016

The main issue for us is that we are Python-based, so pre-building the index would be tricky. We would need to shell out to Node and call a file.

I had wondered if there was a way to pre-build the index in Python. This is what I found:

The index is just JSON, so technically it should be possible to pre-build the index and serve that rather than the JSON file we serve today. However, there is no published spec for the index, although it is versioned. The index stores the version of lunr.js which was used to create it, and an error is raised if a different version of lunr.js attempts to load the index. This suggests that backward-incompatible changes between versions are possible and expected. However, as the lunr.js lib is shipped with MkDocs, we have full control over which version is being used.

The tricky part is the lack of a spec. Someone would have to go through the JS code and re-implement the index builder in Python, and transitioning from one version to the next has the potential to require a similar amount of work. I suppose if a third party created such a lib, it might make sense to use it (or via a plugin, if/when MkDocs adds support), but until then, calling out to Node is the more sensible approach. And even that feels more like something for a plugin.

In the end, using webworkers to avoid a browser freeze is probably the most sensible short-term solution.
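
For a sense of what such a spec would have to cover: the serialized form produced by lunr 0.7's Index#toJSON has roughly the following top-level shape (read from the lunr source; illustrative, not a published spec):

// Approximate top-level structure of a serialized lunr 0.7 index.
// The nested store formats are undocumented and are the hard part.
var serializedIndexShape = {
  version: '0.7.0',                  // checked on load; mismatches are rejected
  fields: [{ name: 'title', boost: 10 }, { name: 'text', boost: 0 }],
  ref: 'location',
  documentStore: {},                 // ref -> tokens for each document
  tokenStore: {},                    // token trie with per-document scores
  corpusTokens: [],                  // sorted list of every token in the corpus
  pipeline: ['trimmer', 'stopWordFilter', 'stemmer']
}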


rickpeters commented Mar 25, 2016

Maybe I don't understand it, but search_index.json is already generated when you do mkdocs build. The problem is not in generating or downloading it; the problem is lunr.js parsing that download. I am not a real expert on this; I just looked at the webworker example, and I have no idea about all the JavaScript dependencies that are necessary. It just seems that Python is not the issue here?
Keep up the good work, guys. I love the product!

regards,
Rick


waylan (Member) commented Mar 25, 2016

@rickpeters the search_index.json file generated by MkDocs is not a lunr.js search index. It is simply a JSON file which contains all of the content of every page in a project. Every time you do a search, that file is downloaded by the browser, and the list of JSON objects is stepped through (one page at a time), with each one being passed to lunr.js to build the index which lunr.js then uses to run the search. And with each page load, the entire index has to be rebuilt. At least, that is the way it works now.

The ideal solution would be for search_index.json to be an actual pre-built lunr.js index. However, as a workaround, a webworker allows the browser to parse and build the index in a separate process from the one which is displaying the page. On page reloads, that existing separate process would continue to be used and the index would not need to be rebuilt. Even if we had a proper pre-built index, the webworker would eliminate the need to reload the index on each page load, so using a webworker is a good idea regardless of how we generate the index. It also avoids browser freezes while the index loads and builds. Hope that clears things up.
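
A sketch of that worker approach, assuming lunr 0.x; the message protocol, file names, and fields are invented for illustration:

// search-worker.js (hypothetical): builds the index off the main thread,
// so only the worker blocks while index.add() runs.
importScripts('lunr.min.js')

var index = null

onmessage = function (e) {
  if (e.data.docs) {
    index = lunr(function () {
      this.field('title', { boost: 10 })
      this.field('text')
      this.ref('location')
    })
    e.data.docs.forEach(function (doc) { index.add(doc) })
    postMessage({ ready: true })
  } else if (index) {
    postMessage({ results: index.search(e.data.query) })
  }
}

// On the main page, the UI thread only passes messages, so it never freezes:
var worker = new Worker('search-worker.js')
worker.postMessage({ docs: searchIndexDocs })  // parsed search_index.json (assumed)
worker.onmessage = function (e) { /* render e.data.results */ }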


rickpeters commented Mar 25, 2016

Thanks very much for this clarification!
Do you have pointers on where I could find how to pre-build this index? I currently perform the full continuous integration / continuous delivery of my mkdocs site (> 250 pages) using Docker, including extra steps like inclusion of a dot-language plugin. It generates the site and then packages it into an nginx container for presentation. Works great and is very transferable.
I would be willing to look into integrating the necessary parts for pre-building the index, which could then be used by lunr.js, I guess?
Even then, the webworker would be a good idea, I think. Would the webworker need server-side JavaScript as well? If needed, I could package that in the container too.
And of course all of this could be added to mkdocs as open-source code showing how to host your results using Docker containers or eventually transferring the static site to another hosting solution.


waylan (Member) commented Mar 25, 2016

The lunr.js library includes an example script which could be adapted and run under Node.js to build the index. You would need to call out to Node in your build process (or call JavaScript from Python in some way). That would give you a pre-built index. However, the current lunr.js calling code in MkDocs would need to be updated to accept a pre-built index (at approximately search.js#L32), similar to this.

That code in search.js would need to be modified for a webworker as well. The code would need to be placed in a webworker, with a fallback for when a webworker is not available (in some older browsers).
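
A sketch of what such a Node build step might look like, assuming the lunr npm package and MkDocs' search_index.json layout; the paths and output file name are illustrative:

// build_index.js (hypothetical): pre-build the lunr index at build time
// and write the serialized form next to the raw search data.
var fs = require('fs')
var lunr = require('lunr')  // assumes `npm install lunr`

var data = JSON.parse(fs.readFileSync('site/mkdocs/search_index.json', 'utf8'))

var index = lunr(function () {
  this.field('title', { boost: 10 })
  this.field('text')
  this.ref('location')
})

data.docs.forEach(function (doc) { index.add(doc) })

// Index#toJSON makes the index serialize cleanly with JSON.stringify
fs.writeFileSync('site/mkdocs/search_index_prebuilt.json', JSON.stringify(index))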


waylan (Member) commented Mar 28, 2016

FYI, I have succeeded in getting the search index to build here: 3009a3a. It was relatively easy using the PyExecJS lib. It is still not usable, though, as search.js has not yet been modified.

You can follow my progress here: master...waylan:search


waylan (Member) commented Mar 29, 2016

So I thought I had a working solution for pre-building the index (here: master...waylan:search), but I'm getting weird errors on Windows that I can't make sense of. It works fine on Linux, and I don't have access to a Mac for testing right now. If anyone wants to help, please test my branch on your local machine and report back.


facelessuser (Contributor) commented Mar 29, 2016

The require path must use forward slashes:

        lunr_path = os.path.join(
            os.path.dirname(os.path.abspath(__file__)),
            'assets/search/mkdocs/js/lunr.min.js'
        ).replace('\\', '/')
waylan (Member) commented Mar 29, 2016

Thanks @facelessuser. That solved one problem.

I have Node installed on all my machines and it is working now. However, as I understand it, PyExecJS does not need any extra dependencies installed on Windows or OS X, as it can use JScript or Apple JavaScriptCore, each of which is installed by default on its respective system. (Linux will always need an additional dependency, one of Node.js, PyV8, SpiderMonkey, PhantomJS, etc., but that seems like an acceptable compromise.)

When I force the use of JScript, I'm getting an error I can't make sense of (TypeError: Object expected, with no indication of where in the JS the error is being raised), and I don't have access to a Mac, so any additional feedback would be appreciated.


facelessuser (Contributor) commented Mar 29, 2016

Yeah, I have Node installed as well. I would have to look at it another time, as I probably don't have time to dig into really obscure issues today.


rickpeters commented Mar 29, 2016

Hi Waylan,

I would at least try to help you test this (if it would help). However, I currently have a test environment in which I use requirements.txt to specify the beta build of mkdocs itself.
It looks like this:
https://github.com/mkdocs/mkdocs/archive/master.tar.gz

I tried to reference your GitHub repo with this:
-e git+https://github.com/waylan/mkdocs.git@search#egg=mkdocs

However, this does not work. I guess I only need the mkdocs subdirectory of the repo, and I don't know how to specify that in requirements.txt.


[a minimized comment from d0ugal is hidden here]

rickpeters commented Mar 29, 2016

@d0ugal thanks, yes that works :-)
I am able to build; now I get an error on starting mkdocs.
The error is on import execjs.
That seems logical, since in essence in Docker I'm running on Linux, and I think I need an extra dependency for this. I just have to find out what to add to my requirements.txt.


facelessuser (Contributor) commented Mar 29, 2016

@rickpeters

now I get an error on starting mkdocs.
Error is on import execjs

I got that error at first too in my virtual environment. I then did pip install pyexecjs and it said the dependencies were already met, but then it worked. I don't know why.


rickpeters commented Mar 29, 2016

The nice thing about Docker is that I package all my dependencies in the container and never have to contaminate my system :-)
I added pyexecjs to requirements.txt.
After that I changed my Dockerfile and also installed a recent Node.js because of the JavaScript dependency.

Lo and behold... It magically works :-)

I put this mkdocs container in dev mode using my very simple site (one page) with a separate search.html (because of the slow loading). It also works nicely, so functionally it seems correct.
Hopefully tomorrow I will be able to test it against a large (static) site; for now I have to stop (family duties...).


waylan (Member) commented Mar 29, 2016

@rickpeters glad to hear it is working for you. As you have access to a large site, I would like to hear how it helps performance. How long does your browser hang as it loads the search index? Is it even perceivable?

Oh, and I just realized I never added PyExecJS as a requirement in the setup.py script. I only added it to the requirements.txt file (which is only used by tests). It should be fixed now.


rickpeters commented Mar 29, 2016

Couldn't help myself, had to test!
I just tried it on a large site, built statically and served with nginx, with the same separate search page as before. I attached the timeline for the search page. I think it's fairly spectacular. Keep in mind this is from a local webserver, but even then it downloads the 33 MB index in 680 ms. More importantly, the search itself performs very nicely! I have the feeling this is way better.

[screenshot: network timeline for the search page]


d0ugal (Member) commented Mar 29, 2016

33MB index? That is huge!


waylan (Member) commented Mar 29, 2016

33MB index? That is huge!

Wow, it is! And the search data (the file we previously served as the index) is only 3.59 MB according to that report. We still need the data file to display the search results, but I had no idea that the index would be so much larger than the data. Crazy!

It probably makes sense to still look into a web worker then.


d0ugal (Member) commented Jun 9, 2016

I have now also tested on Fedora 23 and Python 3.5.1.

python -m execjs --print-available-runtimes   
Node
Nashorn

Node again completed without issues.

Nashorn failed again, but differently, and with a lengthy traceback.

INFO    -  Using 'Nashorn' JavaScript runtime to build search index. 
Traceback (most recent call last):
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/mkdocs/search.py", line 134, in generate_search_index
    return context.call('build_index', self.generate_search_data())
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/execjs/_abstract_runtime_context.py", line 37, in call
    return self._call(name, *args)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/execjs/_external_runtime.py", line 87, in _call
    return self._eval("{identifier}.apply(this, {args})".format(identifier=identifier, args=args))
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/execjs/_external_runtime.py", line 73, in _eval
    return self.exec_(code)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/execjs/_abstract_runtime_context.py", line 18, in exec_
    return self._exec_(source)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/execjs/_external_runtime.py", line 83, in _exec_
    return self._extract_result(output)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/execjs/_external_runtime.py", line 165, in _extract_result
    raise exceptions.ProgramError(value)
execjs._exceptions.ProgramError: ReferenceError: "require" is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/bin/mkdocs", line 9, in <module>
    load_entry_point('mkdocs==0.15.3', 'console_scripts', 'mkdocs')()
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/mkdocs/__main__.py", line 142, in build_command
    ), clean_site_dir=clean)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/mkdocs/commands/build.py", line 352, in build
    build_pages(config)
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/mkdocs/commands/build.py", line 318, in build_pages
    search_index = search_index.generate_search_index()
  File "/home/dougal/.virtualenvs/tmp-9e96b7b7b00b3ba/lib/python3.5/site-packages/mkdocs/search.py", line 138, in generate_search_index
    e.__class__.__name__, e.message
AttributeError: 'ProgramError' object has no attribute 'message'

It looks like the debug code wasn't Python 3 compatible, so I ran it again on Python 2.7.11 (still on Fedora) and got the following (which is actually all in the traceback above).

INFO    -  Using 'Nashorn' JavaScript runtime to build search index. 
DEBUG   -  Skipped building search index. Failed with ProgramError: "ReferenceError: "require" is not defined" 

d0ugal (Member) commented Jun 9, 2016

I then tried testing it with mkdocs serve. By mistake, I started it with Nashorn on OSX, but the search still worked despite getting an error. Does it fall back to the existing code? Is there a way for me to tell the difference?

I also noticed that if I pick a runtime that I don't have, or one that doesn't exist, it just falls back to Node without warning.

I also updated my comments above, so if you read them quickly the results might have changed (I made some copy-paste errors in the OSX results).


waylan (Member) commented Jun 10, 2016

After posting the testing instructions, I realized that the serve command ignores the env variable. Not sure why. I know that exporting the variable on the same line as the command sets the variable for that command only, so maybe tornado is creating a separate process which doesn't get the variable. And as far as I can tell, if a runtime is not specified, it will use Node every time it is available.

My latest update today added a log message which tells us which runtime is being used, which helps a little. And yes, if a runtime is not available, or if a runtime fails, then it falls back to building the index in the browser.

I'm inclined to just hardcode Node as the only runtime (as it is the only one that seems to work); many users will then get the fallback behavior, as they won't have Node installed. For most small sites that should be okay (adding a webworker should still improve things). If a user has a big site or really needs a better solution, then installing Node is the answer for them. It may turn out that those who need it already have Node installed to begin with.

In any event, thanks for the feedback. It helps tremendously. At this point, I think I'm ready to tackle the webworker.


waylan added a commit to waylan/mkdocs that referenced this issue Jun 10, 2016

Optimize search by building index locally.
This builds an actual lunr.js search index via Node.js in addition to a JSON
array of page objects previously used to build the index client side. Note that
both are needed as the array of page objects is used to display search
results.

If Node fails (not installed or errors out), then fall back to the old behavior
by creating an empty index. If the client (browser) receives an empty index,
then it builds the index from the array of pages as before, which is
fine for most (small) sites. Large sites will want to make sure Node is
installed to keep the browser from hanging during index creation.

Partially addresses #859.

waylan (Member) commented Jun 10, 2016

I've left my search branch as-is and created a new branch, webworker, to work on that. The first thing I did was hardcode Node as the only runtime and clean up the commit history (squashed 9 commits into 1 (eb9f72b) with a comprehensive commit message). I'm moving on from the runtimes for now. If you want to play with them, use the search branch.


waylan (Member) commented Jul 15, 2016

Just a reminder to myself to explore this comment to see if we could acquire the index from within a webworker without using an XMLHttpRequest. If not, we'll leave that as a simpler search to be implemented as a separate plugin.
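
One possibility along those lines, sketched under the assumption that the index is published as a JavaScript file assigning a global (the file and variable names are hypothetical):

// Inside the worker: importScripts() loads the serialized index
// synchronously, with no XMLHttpRequest involved.
importScripts('lunr.min.js', 'search_index.js')  // assumed to define self.__index
var index = lunr.Index.load(self.__index)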


YoungElPaso commented Sep 30, 2016

Maybe I don't know enough about the details, but could the index, once created, be saved into localStorage and then accessed on any subsequent page load, without requiring it to be built again and again?


waylan (Member) commented Oct 1, 2016

@YoungElPaso, that feature would need to be added to the library we use (olivernn/lunr.js), and if I recall correctly, such a feature request was made in the past and denied for good reason. I don't recall the details now, but perhaps a search of that project would answer your question better.


olivernn commented Oct 3, 2016

@YoungElPaso @waylan You can achieve this without any modifications to lunr. There is good support for serializing an index to JSON and then loading an index from JSON, which will be significantly quicker than regenerating the index.

Perhaps something like this:

var stored = localStorage.getItem('lunr-index')
var idx
if (stored) {
  idx = lunr.Index.load(JSON.parse(stored))
} else {
  idx = buildIndexFromScratch()  // placeholder: the usual lunr() + index.add() build
  localStorage.setItem('lunr-index', JSON.stringify(idx))  // uses Index#toJSON
}

waylan (Member) commented Oct 3, 2016

Before we jump down the "store the index in LocalStorage" rabbit hole, we need to address the creation of the index in the first place. Even using LocalStorage, the first visit to a site would be painfully slow and unresponsive for the user (on a large site). You have lost your user on their first visit, and they probably won't be coming back. Until we have that problem fixed, there is no point in exploring how to store the index across multiple visits. And if we do fix that problem, then maybe we don't need to worry about storing the index across visits. My point is, even if LocalStorage is possible, it doesn't solve the problem; it just addresses one of the symptoms.

As an aside, my current roadmap for this issue is as follows:

  1. Pre-build the index (with a client-side fallback for those who don't have node installed)
  2. Finish the Plugin API (see #206)
  3. Refactor search to be a plugin
  4. Wrap client built index in a webworker
  5. Explore other options for storing index client-side longterm

Note that I have only item 1 completed. I started item 4 and realized that the above would be a better approach, so I stopped and moved my attention to the Plugin API. The great thing about using a plugin is that others can create their own search which starts with a different set of assumptions (or different backend tools) to better meet their needs.


olivernn commented Oct 3, 2016

@waylan Completely agree that pre-building is the best option here. I was just saying that there is nothing in lunr preventing a serialized index from being stored client side. The tradeoff with pre-building the index and serving it to the client is that, even gzipped, it can be quite large. That said, if it were me, I'd still opt for serving a pre-built index.

If there is anything I can help with from lunr then please do let me know, I'm more than happy to help.


waylan (Member) commented Oct 3, 2016

If there is anything I can help with from lunr then please do let me know, I'm more than happy to help.

What would be ideal is a Python library that builds the index. Requiring our users to have both Python and Node installed is a little much (our Python script calls out to a Node script to build the index). As it happens, many users have both installed anyway, but I'm not comfortable making it a hard requirement, and having the fallback results in two very different user experiences. Of course, I don't expect you to go build a Python clone of your JavaScript lib, but a publicly published spec of your index file format could go a long way toward someone else creating a Python lib (or Ruby, or whatever other language static site generators are written in).


olivernn commented Oct 3, 2016

This is similar to an approach I am hoping to move towards with a new version of lunr: documenting the serialization format (JSON Schema?) would open up the possibility of using the JavaScript version as a client, with the actual indexing being done in any other language that can generate the serialized index. Python would probably be an excellent choice for this, given the number of NLP and IR libraries available.

My Python is non-existent, so I'm unlikely to be able to produce anything production-ready myself, but by documenting the format, others could certainly produce an implementation that would be easier to integrate into MkDocs (or other tools).


waylan added a commit to waylan/mkdocs that referenced this issue Oct 3, 2016

Optimize search by building index locally.
This builds an actual lunr.js search index via Node.js in addition to a JSON
array of page objects previously used to build the index client side. Note that
both are needed as the array of page objects is used to display search
results.

If Node fails (not installed or errors out), then fall back to the old behavior
by creating an empty index. If the client (browser) receives an empty index,
then it builds the index from the array of pages as before, which is
fine for most (small) sites. Large sites will want to make sure Node is
installed to keep the browser from hanging during index creation.

Partially addresses #859.


waylan self-assigned this Dec 2, 2016

fskreuz referenced this issue in ractivejs/ractivejs.github.io Apr 3, 2017

Closed

Copy over examples from examples.ractivejs.org #6

waylan added this to To Do in Refactor search. May 2, 2017

fskreuz referenced this issue in ractivejs/ractivejs.github.io Jun 5, 2017

Closed

Add back the mkdocs search #67

squidfunk commented Oct 22, 2017

Just a note: this may be a perfect case for requestIdleCallback, which allows processing in the background without freezing the UI. I will think about including this in Material. Reasonable polyfills are available.
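
A sketch of how requestIdleCallback might be applied here, assuming lunr's incremental index.add(); the 5ms batching threshold is illustrative:

// Hypothetical incremental build: add documents only while the browser
// reports idle time, so the UI thread never freezes for seconds at once.
function buildIncrementally(index, docs, done) {
  var i = 0
  function step(deadline) {
    while (i < docs.length && deadline.timeRemaining() > 5) {
      index.add(docs[i++])
    }
    if (i < docs.length) {
      requestIdleCallback(step)  // yield and continue in the next idle slot
    } else {
      done(index)
    }
  }
  requestIdleCallback(step)
}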


waylan added the Plugin label Nov 1, 2017

ihnorton referenced this issue in JuliaDocs/Documenter.jl Dec 27, 2017

Merged

Better search results #560

waylan referenced this issue Jan 16, 2018

Merged

Defer scripts #1380

waylan added a commit to waylan/mkdocs that referenced this issue Jan 31, 2018

Implement fallback for no worker support
Fixes #1218, fixes #1127, and partially addresses #859.

waylan moved this from To Do to In Progress in Refactor search. Feb 5, 2018

waylan added a commit to waylan/mkdocs that referenced this issue Feb 27, 2018

Implement fallback for no worker support
Fixes #1218, fixes #1127, and partially addresses #859.

waylan (Member) commented Feb 28, 2018

FYI to everyone following this: a refactor of search is available for review in #1418, which addresses all of the concerns raised here and more.


waylan closed this in #1418 Mar 6, 2018

The "Refactor search." project automation moved this from In Progress to Done Mar 6, 2018

waylan added a commit that referenced this issue Mar 6, 2018

Refactor search plugin (#1418)
* Use a web worker in the browser with a fallback (fixes #859 & closes #1396).
* Optionally pre-build search index (fixes #859 & closes #1061).
* Upgrade to lunr.js 2.x (fixes #1319).
* Support search in languages other than English (fixes #826).
* Allow the user to define the word separators (fixes #867).
* Only run searches for queries of length > 2 (fixes #1127).
* Remove dependency on require.js, mustache, etc. (fixes #1218).
* Compress the search index (fixes #1128).