Search: don't show content of raw HTML <script> tags (and maybe others) in search preview
#12052
Comments
Wow, I never expected that this bug had gone unnoticed until now! A PR is welcome, though I will be less available over the next 10 days.
We may need to implement equivalent filtering in the client-side JavaScript code to resolve this; when configured to, the client makes one HTTP request per search result to retrieve the summary content (as HTML). The relevant code is in sphinx/sphinx/themes/basic/static/searchtools.js, lines 547 to 572 in cf7d275.
HTML/XML comments occurred to me as another thing that we'd want to filter out of the summary text; they already are, though. I'll spend some more time thinking about other elements that could require filtering. In the meantime, #12057 begins by handling the
According to the search index filtering code I referenced above, it looks like I assume that the third regular expression is supposed to filter out the tags themselves, without removing the content within those tags. It looks like this is already working correctly.
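The three-step filtering discussed in this comment can be sketched as follows. This is a minimal illustration based on the description in this thread, not a copy of the Sphinx source: the helper name `strip_raw_html` and the sample string are hypothetical. The first two substitutions drop `<style>` and `<script>` elements together with their content; the third drops only the remaining tags and keeps the text between them.

```python
import re


# Hypothetical helper for illustration; not part of the Sphinx API.
def strip_raw_html(raw: str) -> str:
    """Return the text of a raw HTML fragment that should be indexed."""
    flags = re.IGNORECASE | re.DOTALL
    # 1. Drop <style> elements together with their content.
    text = re.sub(r"<style.*?</style>", "", raw, flags=flags)
    # 2. Drop <script> elements together with their content.
    text = re.sub(r"<script.*?</script>", "", text, flags=flags)
    # 3. Drop any remaining tags, but keep the text between them.
    text = re.sub(r"<[^<]+?>", "", text)
    return text


# Made-up sample: "green" appears in visible text and inside a script.
sample = '<p>The sky is <b>green</b>.</p><script>alert("green");</script>'
print(strip_raw_html(sample))  # The sky is green.
```

Note that the `<b>` tags are removed while the word inside them survives, which matches the behaviour described for the third expression.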
Note: this can only occur when the non-visible HTML elements are located within
freaking love this! Thanks! :)
Describe the bug
Some tags are filtered out before the search index is created (I don't know why this appears in two places in the code):
sphinx/sphinx/search/__init__.py, lines 221 to 227 in ae51974
sphinx/sphinx/search/__init__.py, lines 487 to 495 in ae51974
However, those parts still seem to be contained in the search preview.
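To picture the filtering the preview would also need, here is a rough sketch that extracts only the visible text from a rendered HTML fragment, skipping `<script>` and `<style>` content. It uses Python's standard `html.parser` purely to illustrate the idea; according to the comments on this issue, the preview itself is built by client-side JavaScript, so an actual fix would live there. The class name `VisibleTextExtractor`, the function `visible_text`, and the sample input are all assumptions made for this example.

```python
from html.parser import HTMLParser


# Hypothetical names for illustration; not taken from Sphinx.
class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside script/style elements."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # HTML comments go to handle_comment, so they never reach this list.
        if self._skip_depth == 0:
            self.parts.append(data)


def visible_text(html: str) -> str:
    """Return the whitespace-normalised visible text of an HTML fragment."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


page = '<p>The sky is green.</p><script>var color = "green";</script><!-- green -->'
print(visible_text(page))  # The sky is green.
```

A parser-based approach like this avoids matching tags with regular expressions, at the cost of more code; which trade-off suits the real search code is left open here.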
How to Reproduce
index.rst:
When searching for "green", the JavaScript code is shown in the preview text:
Note that this only happens if the word "green" is also part of the "normal" text.
Environment Information
Sphinx extensions
none
Additional context
This problem was reported to nbsphinx, where the raw HTML is created from output cells of Jupyter notebooks: spatialaudio/nbsphinx#777