-
-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include H3-H6 as individual search results and highlight the search term in the result. #624
Comments
It only includes h1 and h2 by design. This is the just like we only include h1 and h2 in the table of contents and thus made the same choice here to match. I don't think there would be a problem with making the search more find grained, I can't really think of anything. I wrote this particular section of the search code in December I think, so my memory of the reasoning behind it is a little rusty :) |
Would we want to include all of them from h1 - h6? |
I don't see why not. But the other option which might make more sense is to improve the sample shown in each search result. It would be great if the sample was adjusted to include the search result rather than just being the first paragraph of the section with the result in (I assume?), I guess it would also be useful if it highlighted the search result by making it bold. Does that make sense? |
Yeah, that makes sense. I feel like we should probably do both. Adding H3-H6 is really trivial. So maybe we should do that first. Improving the display of the results is somewhat trickier. I guess we need to update this code and probably find the sentence with the search term. This could be tricky if the word was only found due to tokenisation. Then highlight the search term in that sentence. |
I've marked this for 0.15 to at least do the H3-H6 change. I'm not sure I have the capacity to improve the search results too - so, some help there would be good if anyone is interested. |
looks very doable, and I'm keen to help but honestly I'm pretty busy right now, I'm not going to get round to it in June. |
Sure :) |
@d0ugal - "This is the just like we only include h1 and h2 in the table of contents" - is that true? I thought I'd observed more "we only include the first two levels of heading, whatever they are". Case in point: simple hierarchy of headings: ... and then comment out (or just break...) one level in the middle: |
We take the first two levels as Python-Markdown interprets them when it creates a table of contents. It is a bit wonky, and not an ideal approach. I'd be happy to see a better on me :) |
I think the second case would be good in some situations, but the obvious solution is to make this a config variable so the default behaviour stays the same. Do we generate search headings from the same object as the main toc/scroll-spy? I can see that people (me for a start) are going to want one setting heading depth in search and another for everything else. |
@d0ugal, that's fine, I didn't know from your comment whether it was actually meant to be h1 and h2, and thus not working as intended. It's easy enough to live with the current behaviour. The conversation above is probably related to roll-up headings as well, as limiting the heading styles in the ToC (or using more levels in your docs that aren't included) is just another way to limit ToC overload. I have less of an opinion on the search angle. |
No, this is done in a different area as the search index needs the full document contents and not just the headers. |
I don't know either :) That bit of code predates my involvement with MkDocs. I do think we should revisit the TOC generation at some point. I think @waylan has stated that the current way isn't ideal. Basically, Python-Markdown creates the table of contents in HTML. We then parse that to create an object representing the table of contents for a page. It is a bit of a hack :) |
Actually, the bits of MkDocs that I didn't like (at least regarding the TOC), was removed/fixed in a6e6eb1. As far as I'm concerned the TOC handling code in MkDocs is fine. However, that doesn't mean there isn't room for improvement. Under the hood, Python-Markdown extracts the headers into a nested list of tokens (here) which it then converts into an HTML list, serializes as a string, and assigns to the Either way, how should skipped/missing levels be handled? Consider this example:
If we require every level (even skipped levels to be part of the TOC list, how would that work?
I don't see a clear answer to what goes in the level 2 slot. It could be left blank, but a blank menu item would be bad, so that doesn't work. And if it is not blank, what would go there? Therefore, it is left out completely. The really tricky part would be if we had this instead:
How would that look?
Today, you get this:
While that is obviously "wrong," I think it is the "correct" result for Markdown to return. It clearly communicates to the document author that they have an error in their document that they need to fix. One of Markdown's "features" is that it never breaks the parser on bad input, it simply outputs unexpected formatting that clues the author in to the problem. In other words, it is good practice to never publish a Markdown document without first reviewing a rendered copy of it. Interestingly, this Markdown Style Guide suggests that document authors should avoid skipping header levels in their documents. In other words, I don't think MkDocs needs to try to support something that is generally considered bad form. In fact, this Markdown Lint Tool (which is based on the previously mentioned style guide) will raise an error on skipped header levels. P.S.: @d0ugal, it might be interesting to add that lint tool to your flake8 tests to check the Markdown source in the docs dir (as well as the README, etc.) to enforce good form in the docs. Some of them are pretty awful to read/edit right now. It is a Ruby project, but is does have a CLI so it should not be impossible to use from Tox/Travis (the Node.js clone does not have a CLI and I'm not aware of a Python clone outside of this, which is a little too special-purpose for general use). |
The additional headers are now included individually in the search results by #644 |
If I search for a subsection heading (eg.
h3
) I would expect the subsection I searched for to appear in the search results, instead only it's parenth1
seems to appear.Perhaps most easily demonstrated with an example: on help.tutorcruncher.com If I search for "credit note" (which is the text of an
h3
heading: help.tutorcruncher.com/accounting/#credit-notes) I would expect to see "credit note" in the results. I guess as a heading would make most sense, but highlighted in the text of a result would be ok too.Instead the parent
h1
"Invoices" shows in results without "credit note" shown anywhere.Perhaps we need to alter
search.py > create_entry_for_section
to recursively work right down the hierarchy of sections?The text was updated successfully, but these errors were encountered: