TOC generator for generated html #180
I bundled dom.d right here to avoid having an external dependency. Since it is just a single file anyway, it isn't a huge problem to drop it in here. And since it is reasonably stable, we probably won't hit any bugs... but if we do, I'll fix them upstream then copy the file over here again - KISS package management :) |
|
FWIW chmgen also does HTML parsing to some degree and warns about broken internal links. |
|
Thanks! I think this is a neat idea, and having an official means of postprocessing our HTML opens many possibilities - hey, including perhaps cross-referencing and A few notes about tactics:
@adamdruppe how do you think we can address these? |
|
On Tue, Jan 12, 2016 at 05:57:44AM -0800, Andrei Alexandrescu wrote:
Yeah, it could do that too, though cross-referencing is really more of a source thing. Doing it as a post processor is pretty leaky.
Maybe the tools could try using dub and dfmt. Dog food a little!
What do you have in mind? |
Nothing fancy, just generate a ddoc file with |
What's the state of this? What was the intended goal? Just a TOC or simply a Ddoc postprocessor?
Yeah I have recently added footer navigation for the spec, which is really hard to do for such a trivial problem... |
|
I'll leave this to @CyberShadow. I recall he had concerns about adding additional dependencies. |
|
Ouch, dom.d is 6K LOC. I'm not sure. Is an alternative approach an option?
|
|
@CyberShadow the path of least resistance is to generate ddoc with a different .ddoc macros file and then filter out the chaff with sed. |
Well, we could always use the ugly `std.xml`, e.g.:

```d
void main(string[] args)
{
    import std.file, std.meta, std.stdio, std.typecons, std.xml;

    // Read the generated HTML file named on the command line.
    string s = readText(args[1]);
    alias TocEntry = Tuple!(string, "id", string, "name");
    TocEntry[] toc;
    auto xml = new DocumentParser(s);
    // Register one handler per heading level.
    foreach (heading; AliasSeq!("h1", "h2", "h3", "h4", "h5"))
    {
        xml.onStartTag[heading] = (ElementParser parser)
        {
            TocEntry entry;
            // The anchor inside the heading carries the id to link to.
            parser.onStartTag["a"] = (ElementParser e) {
                if (auto v = "id" in e.tag.attr)
                    entry.id = *v;
            };
            parser.onText = (string s) { entry.name = s; };
            parser.parse();
            toc ~= entry;
        };
    }
    xml.parse();
    // Emit one link per collected heading.
    foreach (entry; toc)
        writefln(`<a href="#%s">%s</a>`, entry.id, entry.name);
}
```
It's not that hard to include a DUB dependency nowadays, but this brings another series of problems with it, as e.g. I am pretty sure we don't want to maintain an XML library.
Yeah, not sure when/if this is going to happen. I see these okayish solutions here:
Probably improving |
|
On Tue, Jul 04, 2017 at 08:17:50PM +0000, Sebastian Wilzbach wrote:
> Well, we could always use the ugly `std.xml`, e.g.:

I don't think std.xml can read the broken html ddoc tends to produce.

> It's not that hard to include a DUB dependency nowadays, but this brings another series of problems with it, as e.g. I am pretty sure we don't want to maintain an XML library.

dom.d is fairly stable and has built with several versions of dmd (so you could probably just keep the fork unmodified without issue), and besides, I use it heavily so you don't have to maintain it if you'd prefer to just keep up with me.

While I don't believe in this approach any more, I'll still work with you on the html lib.
|
|
On Fri, Jun 30, 2017 at 08:58:10AM -0700, Vladimir Panteleev wrote:
> - Post-process the HTML with ugly but simple hacks, like regular expressions
> - A combination of the above (make DDoc generate HTML markup that's useful for post-processing HTML)

You could probably also just make a null-defined macro with a name that is easily searchable to store some meta info and then grep for it.
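
For illustration, the regular-expression option quoted above could be sketched roughly as follows. This is only a sketch under stated assumptions: it assumes headings appear as `<hN><a id="...">text</a></hN>`, which the real Ddoc output may not always do, and the `TocEntry` and `extractToc` names are made up for this example.

```d
import std.regex : matchAll, regex;
import std.stdio : writefln;

/// One table-of-contents entry: anchor id plus visible heading text.
/// (Hypothetical type, introduced only for this sketch.)
struct TocEntry
{
    string id;
    string name;
}

/// Collect headings of the assumed form <hN><a id="...">text</a></hN>.
TocEntry[] extractToc(string html)
{
    // Assumed pattern; broken or differently nested markup will be missed.
    auto heading = regex(`<h[1-5][^>]*>\s*<a[^>]*\bid="([^"]+)"[^>]*>([^<]*)</a>`);
    TocEntry[] toc;
    foreach (m; html.matchAll(heading))
        toc ~= TocEntry(m[1], m[2]);
    return toc;
}

void main()
{
    // Made-up sample input, not taken from the real spec pages.
    auto html = `<h2><a id="syntax">Syntax</a></h2><p>text</p>`
        ~ `<h3><a id="auto">Implicit Type Inference</a></h3>`;
    foreach (entry; extractToc(html))
        writefln(`<a href="#%s">%s</a>`, entry.id, entry.name);
}
```

Headings without an `id` are silently skipped here, which is exactly the kind of case a real tool would probably want to warn about.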
|
|
Okay, I went with improving the existing footer generation script, and it doesn't seem too hard to parse the ddoc macros directly from their raw sources: So in this approach of parsing it directly from Ddoc, we couldn't have used your @adamdruppe I'm sorry that your work got wasted :/ |
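
As a rough sketch of what scanning the raw Ddoc sources for an easily searchable macro might look like (this is not the actual footer script): the macro name `TOCENTRY`, its `id, title` argument order, and the `findMarkers` helper are all hypothetical, and titles containing parentheses are not handled.

```d
import std.regex : matchAll, regex;
import std.stdio : writeln;
import std.string : strip;
import std.typecons : Tuple;

/// Hypothetical record for one marker found in the Ddoc source.
alias Marker = Tuple!(string, "id", string, "title");

/// Find every `$(TOCENTRY id, title)` occurrence in a Ddoc source string.
/// TOCENTRY is an assumed macro name, not one defined by dlang.org.
Marker[] findMarkers(string ddocSource)
{
    // Id: up to the first comma; title: up to the closing parenthesis.
    auto pattern = regex(`\$\(TOCENTRY\s+([^,\s]+)\s*,\s*([^)]*)\)`);
    Marker[] found;
    foreach (m; ddocSource.matchAll(pattern))
        found ~= Marker(m[1], m[2].strip);
    return found;
}

void main()
{
    // Made-up sample source for demonstration only.
    auto source = "$(TOCENTRY syntax, Declaration Syntax)\n"
        ~ "Some prose.\n"
        ~ "$(TOCENTRY auto, Implicit Type Inference)\n";
    foreach (entry; findMarkers(source))
        writeln(entry.id, " -> ", entry.title);
}
```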
This PR is not yet ready to be pulled, I just want to show what it does and see if @andralex feels it is useful to continue with.
Compile the new file with `dmd generate_toc.d dom.d`, then you can run it on a downloaded copy of HTML from the website: first `wget http://dlang.org/spec/declaration.html`, then `./generate_toc declaration.html`, and you will get this:

The warnings ought to be fixed in the source... or I could make the program do that automatically too, that's trivial; the question is just what we want to do. We so badly need these sections to be easy to link to though! So something needs to be done.

Then the generated TOC is HTML suitable to be pasted right in. This program could also edit it in itself very easily too, provide a `<div id="toc"></div>` or something. Or, we could make an index or sitemap file with links to all the sections too.

As you can see in the source, dom.d makes postprocessing html really easy and there's tons of possibilities to improve the website by expanding on this idea.
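
To make the `<div id="toc"></div>` idea concrete, here is a minimal sketch that fills such a placeholder with plain string replacement. It deliberately does not use dom.d, so it does not reflect how `generate_toc.d` works; `TocEntry`, `renderToc` and `injectToc` are names invented for this example, and the placeholder is assumed to appear in the page exactly as written.

```d
import std.array : replace;
import std.format : format;
import std.stdio : writeln;

/// Hypothetical entry type: anchor id plus heading text.
struct TocEntry
{
    string id;
    string name;
}

/// Render the entries as an unordered list inside the toc div.
/// Note: ids and names are inserted as-is, without HTML escaping.
string renderToc(const TocEntry[] entries)
{
    string items;
    foreach (e; entries)
        items ~= format(`<li><a href="#%s">%s</a></li>`, e.id, e.name);
    return `<div id="toc"><ul>` ~ items ~ `</ul></div>`;
}

/// Replace the empty placeholder div with the rendered TOC.
/// Pages without the placeholder are returned unchanged.
string injectToc(string html, const TocEntry[] entries)
{
    return html.replace(`<div id="toc"></div>`, renderToc(entries));
}

void main()
{
    // Made-up page and entries for demonstration only.
    auto page = `<body><div id="toc"></div><h2><a id="syntax">Syntax</a></h2></body>`;
    writeln(injectToc(page, [TocEntry("syntax", "Syntax")]));
}
```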