Initial implementation of local search fix #1805

@k4kfh k4kfh commented Jun 4, 2019

I use mkdocs at work, and we generally just view the site using file:/// urls, so we have been missing the search function since browser security tightened up. This is a simple fix for that (fixes #871). Here is the summary:


By slightly modifying the search plugin in mkdocs and adding a small JS file to your docs directory, it's possible to get the search function working in all browsers, even if you open the HTML files directly (a situation which would previously cause cross-origin request errors).

The Problem:

The mkdocs search feature relies on a client-side JavaScript search library called lunr.js. Out of the box, the JavaScript that powers the docs site tries to fetch search_index.json using the fetch API. When viewing the docs site locally (i.e. via a file:// URL rather than a web server), modern browsers block this fetch request under their same-origin (CORS) restrictions.

The Workaround:

The browser still allows .js files to be embedded in HTML (via <script> tags) when viewing the site from your filesystem. It just doesn't allow JavaScript to fetch files from your filesystem programmatically (for security reasons). However, if we change the search_index.json file to search_index.js, we can turn the JSON file into an executable JavaScript file which sets the search index JSON object as a global variable. See the pseudo-code below:

Original search_index.json

// Nothing but a big JSON object
{key:value, key:value}

Modified search_index.js

// Just store the big JSON object in a global variable
// (the name must match the one the fetch shim reads)
shim_localSearchIndex = {key:value, key:value}

With these modifications, search_index.js can be embedded in mkdocs HTML pages just like any other JavaScript, making the search index accessible from the browser (via a global variable instead of via an HTML5 fetch call).
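
As a rough sketch of that transformation (not the PR's actual plugin code), the helper below wraps the text of a JSON file in a global assignment. The function name `toShimSource`, the sample index content, and the default variable name are assumptions for illustration; the variable name only has to match whatever the page's scripts read.

```javascript
// Hypothetical helper: turn the text of search_index.json into JavaScript
// source that assigns the same data to a global variable.
function toShimSource(jsonText, globalName = "shim_localSearchIndex") {
  // Parse first so that malformed JSON fails here rather than in the browser.
  const data = JSON.parse(jsonText);
  // Re-serialize and escape U+2028/U+2029, which older JavaScript engines
  // reject inside string literals even though JSON allows them.
  const literal = JSON.stringify(data)
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
  return "var " + globalName + " = " + literal + ";\n";
}

// Example with a tiny stand-in for a real search index (assumed shape).
const source = toShimSource('{"docs": [{"location": "index.html", "title": "Home"}]}');
console.log(source);
```

A top-level `var` in a classic `<script>` becomes a property of `window`, so the data is reachable from any script loaded afterwards.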

These modifications can be automated by adjusting the mkdocs search plugin. The modifications I've made allow you to enable/disable this "shim" from within mkdocs.yml, and it still generates the regular search_index.json file as well.

plugins:
  - search:
      # true tells the search plugin to generate both search_index.json (unmodified)
      # and search_index.js (modified as shown above)
      local_search_shim: true

The Last of the Fix

The modifications above solve the problem almost completely. However, the theme (in our case, mkdocs-material) is unaware of these changes, so it's still trying to grab the search index by calling the fetch function.

The solution is to add another short JavaScript to the site which decorates/wraps the fetch function. It behaves just like the normal native fetch code, UNLESS it detects a file:// URL that ends in search_index.json, in which case it "fakes" the expected output of the fetch command. So no modification to the theme is necessary, because the theme has no idea it's not dealing with the normal fetch API.

This file looks messy, particularly the return statement, but it only modifies fetch's behavior in one very specific situation (the only situation that fetch gets used, in our case).

// Simple decorator shim for fetch, which makes the search_index.json file fetchable (ONLY if you enable search:local_search_shim in mkdocs.yml)
var fetch_native = window.fetch;
window.fetch = function(url, options) {

    // Simple helper to resolve relative url (./search/search_index.json) to absolute (file://C:/Users...)
    var absolutePath = function(href) {
        var link = document.createElement("a");
        link.href = href;
        return link.href;
    };

    var absoluteUrl = absolutePath(url);

    // Check if this fetch call is one we need to "intercept"
    if (absoluteUrl.startsWith("file:") && absoluteUrl.endsWith("search_index.json")) {
        // If we detect that this IS a call trying to fetch the search index, then...
        console.log("LOCAL SEARCH SHIM: Detected search_index fetch attempt! Using search index shim for " + url);

        // Return a "forged" object that mimics a normal fetch call's output.
        // Only the parts the caller uses are provided: a promise that resolves
        // to an object whose json() resolves to the search index.
        return Promise.resolve({
            ok: true,
            json: function() {
                // search_index.js must already have defined this global
                return Promise.resolve(shim_localSearchIndex);
            }
        });
    }

    // In all other cases, behave normally. This must call the saved native
    // function; calling fetch() here would recurse into this wrapper forever.
    console.log("LOCAL SEARCH SHIM: Using native fetch code for " + url);
    return fetch_native(url, options);
};
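
The wrapper's branching can also be exercised outside a browser. The sketch below is a variant with injected dependencies, not the code from this PR: `createFetchShim`, `resolveUrl`, `getIndex`, and the stand-in objects are all hypothetical names chosen for illustration.

```javascript
// Sketch: build a fetch wrapper from injected parts so it can be tried anywhere.
// nativeFetch: the real fetch; resolveUrl: turns a relative URL into an absolute
// one; getIndex: returns the preloaded search index. All names are hypothetical.
function createFetchShim(nativeFetch, resolveUrl, getIndex) {
  return function shimmedFetch(url, options) {
    const absolute = resolveUrl(String(url));
    const isLocalIndex =
      absolute.startsWith("file:") && absolute.endsWith("search_index.json");
    if (!isLocalIndex) {
      // Delegate to the saved native function, never to the wrapper itself.
      return nativeFetch(url, options);
    }
    // Mimic only the parts of a Response that a typical caller uses.
    return Promise.resolve({
      ok: true,
      status: 200,
      json: () => Promise.resolve(getIndex()),
    });
  };
}

// Stand-ins for the browser pieces.
const calls = [];
const fakeNativeFetch = (url) => {
  calls.push(url);
  return Promise.resolve({ ok: true, json: () => Promise.resolve({ from: "network" }) });
};
const resolveAgainstPage = (href) =>
  new URL(href, "file:///C:/docs/site/index.html").href;

const shimmed = createFetchShim(fakeNativeFetch, resolveAgainstPage, () => ({ docs: [] }));
shimmed("search/search_index.json")
  .then((response) => response.json())
  .then((index) => console.log("index entries:", index.docs.length));
```

Passing the native function in explicitly makes it hard to call the wrapper from inside itself by accident, and lets the interception rule be checked without a real page.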

I didn't update the documentation yet because I have a feeling that using the extra_javascript directive to add search_index.js and fetch_shim.js is unnecessary, but I couldn't find a programmatic way to do that. Please advise. After that's addressed, I'll be happy to write proper documentation for this modification.

This is a first shot at fixing #871. It works by adding a config option to the search plugin
which generates an additional search_index.js file. This is simply search_index.json encapsulated in a
global variable. This gets embedded into the site, along with a small JavaScript shim,
by using the extra_javascript directive.
@waylan
Member

waylan commented Jun 5, 2019

I was aware of how to work around this, but haven't done so because existing third party themes expect the json file. I hadn't considered using a shim as a workaround. I'm not sure how I feel about that. Any input @squidfunk?

@k4kfh
Author

k4kfh commented Jun 5, 2019 via email

@squidfunk
Contributor

@waylan @k4kfh In general, I would regard serving documentation locally as an edge case. As the OP suggested, there's no possibility to fetch local files, i.e. file:///... from JavaScript for security reasons. Patching fetch is ... interesting, but will only work for themes that use fetch + some polyfill for browsers that don't support it, and not XMLHttpRequest or some other older APIs. However, as I understand this feature request to only propose a search_index.js and not the general inclusion of the fetch patch, it's a patch that is local to a theme and only used locally, so the patch should be fine.

Regarding the creation of a search_index.js: I think there's no need to implement this on your side, as this could easily be done by the OP himself with a simple one-line bash script:

echo "shim_localSearchIndex = $(cat site/search/search_index.json)" > site/search/search_index.js

Valid JSON is also valid JavaScript.

@k4kfh
Author

k4kfh commented Jun 5, 2019

@squidfunk @waylan You're right that it could be handled with a batch script, and this would arguably be the easiest way, and wouldn't require any modifications to mkdocs. This was the direction I initially thought of going. However, I'm trying to make this work with existing automated building tools (and have the convenience of still being able to use mkdocs build and mkdocs serve when developing the docs).

Perhaps a better solution would be for me to PR a simple plugin for adding pre/post build shell scripts (or additional Python scripts) stored in your docs directory by adding them to the config. Something like:

build_scripts:
  - pre_build:
    - "shellscripts/searchindexshim.sh"
  - post_build:
    - "shellscripts/otherscript.sh"

This would let me (and anyone else) fix the local search problem by just dragging some script files into their docs directory and adding directives to the config file, no need to modify mkdocs itself. And it would also add some general convenience features. This could be shipped with mkdocs like the search plugin to make it easy for someone to make simple additions to the build process without having to actually change the underlying program. It would solve my search issue easily, but it would also be a nice feature to have generally speaking. No matter whose mkdocs install you build your docs on, modifications like this would "just work".

The potential problem with this would be cross-platform support, because if I wrote a bash script, that doesn't always translate 1:1 to a Windows .bat file. So maybe Python scripts instead of Bat scripts? Or have a "windows" and "linux" and "osx" suboption? I don't know.
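
One way around the bash/.bat split is a script in a runtime that behaves the same on every platform. The sketch below uses Node.js purely as an illustration (Python would serve equally well); `writeSearchIndexShim`, the `site` default, and the variable name are assumptions, and the paths follow the one-liner quoted earlier in this thread.

```javascript
// Sketch of a cross-platform post-build step using only Node.js built-ins.
const fs = require("fs");
const path = require("path");

// Hypothetical helper; siteDir is assumed to be the MkDocs output directory.
function writeSearchIndexShim(siteDir, globalName = "shim_localSearchIndex") {
  const jsonPath = path.join(siteDir, "search", "search_index.json");
  const jsPath = path.join(siteDir, "search", "search_index.js");
  const jsonText = fs.readFileSync(jsonPath, "utf8");
  JSON.parse(jsonText); // throws if the index is not valid JSON
  fs.writeFileSync(jsPath, "var " + globalName + " = " + jsonText.trim() + ";\n", "utf8");
  return jsPath;
}

// Usage: node make_shim.js site
// Does nothing if no search index exists at the expected location.
const siteDir = process.argv[2] || "site";
if (fs.existsSync(path.join(siteDir, "search", "search_index.json"))) {
  console.log("wrote " + writeSearchIndexShim(siteDir));
}
```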

@waylan
Member

waylan commented Jun 5, 2019

I would regard serving documentation locally as an edge case.

As do I. In fact, this is why I haven't bothered to work out a solution. It is not something we aim to support.

... but will only work for themes that use...

Right. Looking at the patch as it stands now, it is incomplete. The extension currently adds a search script to the theme when search.search_index_only is False. Presumably, the shim could be included there as well. But that doesn't address when search.search_index_only is True which is what Material and other third-party themes expect.

My expectation has been that while we provide a basic search plugin, someone would come along and implement a better third-party search plugin. I'm inclined to take the position that supporting search over file:/// should be implemented by a third party plugin. @k4kfh you or anyone else are free to fork the search plugin and add any features you want, so long as you respect the existing (very permissive) license. But that fork should be maintained as a separate third-party plugin.

@k4kfh
Author

k4kfh commented Jun 5, 2019

Well, having looked into the current design for plugins a little more, I think maybe a good solution would be to allow plugins to be stored within the documentation directory. Something like the custom_dir option for themes. That would mean "modifications" are written in Python, solving the cross-platform issue, and it would be a nice general feature that would also solve my problem (I could just write a search plugin like you said, but I wouldn't have to worry about packaging it, keeping it updated on all our mkdocs installs, etc).

Thoughts?

@waylan
Member

waylan commented Jun 5, 2019

I think maybe a good solution would be to allow plugins to be stored within the documentation directory.

I disagree. Plugins are Python packages. They belong where Python packages go. As the search plugin demonstrates, they already can add themselves to the theme env. And using the on_files event, they have the ability to inject files (which may or may not be in the docs_dir) into the site. In other words, plugins already have everything they need to accomplish what you want. Someone just needs to do the work and then maintain it. If you are not willing to maintain it, then why should I when I don't even want/need the feature?

@k4kfh
Author

k4kfh commented Jun 5, 2019

I disagree. Plugins are Python packages. They belong where Python packages go.

Okay, fair enough. What do you think of a plugin to add shell scripts to the build process? Better idea? Worse idea?

@waylan
Member

waylan commented Jun 5, 2019

What do you think of a plugin to add shell scripts to the build process?

That is certainly a use-case which many MkDocs users have for various reasons. And any existing plugin events could be used to make calls to shell scripts. There is no need to modify MkDocs to support that. Of course, I expect such a plugin to be implemented and maintained as a third-party plugin.

@waylan
Member

waylan commented Jun 13, 2019

My expectation has been that while we provide a basic search plugin, someone would come along and implement a better third-party search plugin. I'm inclined to take the position that supporting search over file:/// should be implemented by a third party plugin. @k4kfh you or anyone else are free to fork the search plugin and add any features you want, so long as you respect the existing (very permissive) license. But that fork should be maintained as a separate third-party plugin.

I am closing this based on the above reasoning.

@waylan waylan closed this Jun 13, 2019
@wilhelmer
Contributor

Thanks for providing this. As @squidfunk pointed out, there's no need to change the search plugin. Adding search_index.js and fetch_shim.js to extra_javascript is sufficient.

Plus some one-line script to create the search_index.js from the JSON. I'm running Windows, so I used PowerShell:

@("shim_localSearchIndex = ") + (Get-Content "search_index.json") | Set-Content "search_index.js"

@wilhelmer
Contributor

@k4kfh Why did you keep the JSON file around instead of using the JS file for both local and web? Performance reasons?

@k4kfh
Author

k4kfh commented Sep 13, 2019

@wilhelmer No particular reason. I guess I just didn't want to change something that didn't need changing.

@berot3

berot3 commented Oct 9, 2019

I would be interested in this "edge case" @k4kfh! Have you maybe documented somewhere what you did to make mkdocs work completely with a server-less setup (file:/// URL)?

I also haven't quite managed to get this local search to work. Can you maybe share what you all did? A step-by-step guide for noobies would be perfect :)

@wilhelmer
Contributor

wilhelmer commented Oct 9, 2019

Step-by-step guide:

  1. Install mkdocs-localsearch
  2. Done 😊

@berot3

berot3 commented Oct 11, 2019

Indeed! 😮 Simply works! Thanks!

@wilhelmer
Contributor

Currently, the mkdocs-localsearch plugin only works with the Material theme, and will stop working with Material when 5.x is released. I had a quick look at the 5.x search implementation, and it seems I'll have to rewrite the code entirely to make it work. I don't know RxJS, so maybe I won't be able to update the plugin at all.

@waylan @squidfunk I'd love to reopen this discussion – why not provide the search index as a JS variable (in a separate JS file) instead of a JSON? This way, no fetch (or ajax() call) would be necessary and all CORS issues could be avoided. If some 3rd party themes require the JSON, make it optional.

While running a production documentation system locally may be an edge case, I think this is also a matter of UX. Right now, if you run mkdocs build and open the resulting site_dir from disk, everything will look fine except that the search isn't working, without any indication why.

Providing the search index via JS file would solve this, with zero downsides. Or are there any? Love to hear your thoughts on this.

@waylan
Member

waylan commented Dec 27, 2019

why not provide the search index as a JS variable (in a separate JS file) instead of a JSON?

Because it would be backward incompatible. As a policy we always try to provide a graceful deprecation when introducing a backward incompatible change. In other words, for at least one release cycle we add the new way but continue to support the old way, with a deprecation warning being issued when people use it. This gives theme devs the opportunity to update as their schedules allow during that release cycle. I looked at doing this when I refactored search into a plugin, but decided against it when I realized it would require a hard break of every third party theme out there. There's not really any way to issue a deprecation warning for this sort of thing.

A second reason is that MkDocs does not "officially" support file based browsing (and never has). So it is not high on our priority list to find a workable solution which doesn't isolate all of the existing third party themes.

@wilhelmer
Contributor

For backward compatibility, you could provide something like

plugins:
    - search:
        json_index: true # defaults to false

... and remove that option once most 3rd party themes have adopted the JS index.

@squidfunk
Contributor

@wilhelmer integrating this into Material v5 will probably be harder, yes. Search is actually carried out inside a web worker, but the search index is fetched in the main thread, as the main thread does some more work in trying to load pre-serialized indexes and merging them with the payload sent to the web worker. When releasing v5 I'll try to provide some architectural notes which should clear up how things are divided and why.

Regardless of that I'll still consider this pretty much an edge case. However, one other option to make it work locally would be to inline the search index into the template as JSON. I'm doing the same for the localizations which are needed by the application logic, see this snippet.

You could just shim fetch with retrieving the textContent from the actual element. If this works (haven't prototyped it), we could integrate a flag into Material which will inline the search_index.json automatically, so there would be no need to shim fetch.
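
A minimal sketch of the reading side of that idea, assuming the index has been inlined into a non-executing script element. The element id `__search_index` and the function name `readInlineIndex` are hypothetical, not names used by Material.

```javascript
// Sketch: read a search index that was inlined into the page as JSON, e.g.
//   <script id="__search_index" type="application/json">{ ... }</script>
// The element id is hypothetical; `doc` is anything with getElementById.
function readInlineIndex(doc, elementId = "__search_index") {
  const element = doc.getElementById(elementId);
  if (!element) {
    return null; // nothing inlined: the caller should fall back to a request
  }
  return JSON.parse(element.textContent);
}

// Minimal stand-in for `document`, so the sketch also runs outside a browser.
const fakeDocument = {
  getElementById: (id) =>
    id === "__search_index" ? { textContent: '{"docs": [{"title": "Home"}]}' } : null,
};
console.log(readInlineIndex(fakeDocument).docs[0].title);
```

Whatever generates the inline element would need to escape `<` in the JSON (for example as `\u003c`), since a literal `</script>` inside the data would end the element early.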

@wilhelmer
Contributor

wilhelmer commented Jan 2, 2020

@squidfunk As far as I can tell, v5 uses ajax() instead of fetch() to retrieve the JSON, so I need a shim for that. Or inline the search index, but I guess that requires modifications on your side?

I'm afraid whichever approach I choose, I won't be able to pull this off on my own. I'm a tech writer, not a developer. Any help would be much appreciated 🙏

@squidfunk
Contributor

@wilhelmer ajax() is a wrapper around XMLHttpRequest. I decided to use ajax() instead of fetch(), as the latter would need a Polyfill to properly work in IE. Also given that RxJS models everything in streams, the DX is exactly the same.

I'm actually not sure if this inlining approach is feasible, as the search_index.json is generated during (after?) building the documentation, and thus may not be available for automatic inclusion. I don't know enough about MkDocs' plugin architecture, so this was just an educated guess.

I can look into this when I find some time, which unfortunately will definitely be after the release of v5. As v5 is a complete rewrite, I also suspect that there may be some problems/issues which I/we might have to deal with first. You can open an issue on the Material issue tracker for that, maybe somebody else can help out. It would be awesome if you could write up some learnings / directions / ideas, so we can start a discussion on how we could maybe support local search.

BTW, I'm in favor of *.json, not *.js. It's data after all.

@wilhelmer
Contributor

wilhelmer commented Jan 3, 2020

@squidfunk For an inelegant and despicable, yet pragmatic solution, maybe you could change the ajax() call to

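// A single declaration keeps data$ in scope after the check, and typeof
// avoids a ReferenceError when the global was never defined
const data$ = typeof g_localSearchIndex !== "undefined"
   ? g_localSearchIndex
   : ajax({...})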

... so my plugin can put the search index data in g_localSearchIndex?

@squidfunk
Contributor

@wilhelmer yes, that would be possible, but there might be more things to consider. Let's discuss it in a new issue over at mkdocs-material.

Is the README of your project still up to date? I'm asking because #1805 (comment) doesn't match the instructions.

@wilhelmer
Contributor

Yes, the README is up to date. The comment is outdated.

@waylan
Member

waylan commented Jan 3, 2020

As an off-topic discussion has continued despite multiple requests to take the discussion elsewhere, I am locking this issue.

@mkdocs mkdocs locked as off-topic and limited conversation to collaborators Jan 3, 2020