@adamdruppe
Contributor

This PR is not yet ready to be pulled; I just want to show what it does and see if @andralex feels it is useful to continue with.

Compile the new file with dmd generate_toc.d dom.d, then run it on a downloaded copy of HTML from the website: first wget http://dlang.org/spec/declaration.html, then ./generate_toc declaration.html, and you will get this:

WARNING: no anchor for heading <h1>Declarations</h1>
WARNING: no anchor for heading <h3>Declaration Syntax</h3>
WARNING: no anchor for heading <h3>Void Initializations</h3>
WARNING: no anchor for heading <h3>Global and Static Initializers</h3>
WARNING: no anchor for heading <h3>Type Qualifiers vs. Storage Classes</h3>
<ol>
        <li><a href="#AutoDeclaration">Implicit Type Inference</a></li>
        <li><a href="#alias">Alias Declarations</a></li>
        <li><a href="#extern">Extern Declarations</a></li>
        <li><a href="#typeof">typeof</a></li>
</ol>

The warnings ought to be fixed in the source... or I could make the program do that automatically too; that's trivial. The question is just what we want to do. We so badly need these sections to be easy to link to though! So something needs to be done.
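As a rough illustration of fixing the warnings automatically, an anchor id could be derived from the heading text itself. This is only a sketch: slugify is a hypothetical helper, not part of generate_toc.d or dom.d, and the lowercase-with-dashes scheme is an assumption (the existing ids on the site, such as AutoDeclaration, follow no such rule).

```d
import std.ascii : isAlphaNum, toLower;
import std.stdio : writeln;

/// Hypothetical helper: derive an anchor id from heading text.
/// ASCII letters and digits are kept (lowercased); every other run of
/// characters collapses into a single dash between words.
string slugify(string title)
{
    string id;
    bool pendingDash = false;
    foreach (char c; title)
    {
        if (isAlphaNum(c))
        {
            // Emit a dash only between two kept runs, never at the start.
            if (pendingDash && id.length)
                id ~= '-';
            pendingDash = false;
            id ~= toLower(c);
        }
        else
        {
            pendingDash = true;
        }
    }
    return id;
}

void main()
{
    writeln(slugify("Void Initializations"));
    writeln(slugify("Type Qualifiers vs. Storage Classes"));
}
```

Generated ids would still need a uniqueness check per page, since two headings can share the same text.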

The generated TOC is HTML suitable to be pasted right in. The program could also very easily insert it itself if the page provides a <div id="toc"></div> or something. Or we could make an index or sitemap file with links to all the sections.
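A minimal sketch of the placeholder idea, using plain string replacement rather than dom.d. TocEntry, renderToc, and insertToc are hypothetical names chosen for illustration, and the exact placeholder markup is an assumption.

```d
import std.array : appender, replace;
import std.format : format;
import std.stdio : write;

/// Hypothetical record for one heading: its anchor id and visible title.
struct TocEntry
{
    string id;
    string title;
}

/// Render the entries as the same kind of <ol> list shown above.
string renderToc(const TocEntry[] entries)
{
    auto html = appender!string();
    html.put("<ol>\n");
    foreach (e; entries)
        html.put(format("\t<li><a href=\"#%s\">%s</a></li>\n", e.id, e.title));
    html.put("</ol>\n");
    return html.data;
}

/// Fill an empty <div id="toc"></div> placeholder with the rendered list.
/// Pages without the placeholder are returned unchanged.
string insertToc(string page, const TocEntry[] entries)
{
    enum placeholder = `<div id="toc"></div>`;
    return page.replace(placeholder,
        `<div id="toc">` ~ "\n" ~ renderToc(entries) ~ "</div>");
}

void main()
{
    auto entries = [TocEntry("alias", "Alias Declarations"),
                    TocEntry("typeof", "typeof")];
    write(insertToc("<body>\n<div id=\"toc\"></div>\n</body>\n", entries));
}
```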

As you can see in the source, dom.d makes postprocessing HTML really easy, and there's tons of possibilities to improve the website by expanding on this idea.

@adamdruppe
Contributor Author

I bundled dom.d right here to avoid having an external dependency. Since it is just a single file anyway, it isn't a huge problem to drop it in here. And since it is reasonably stable, we probably won't hit any bugs... but if we do, I'll fix them upstream then copy the file over here again - KISS package management :)

@CyberShadow
Member

FWIW chmgen also does HTML parsing to some degree and warns about broken internal links.

@andralex
Member

Thanks! I think this is a neat idea, and having an official means of postprocessing our HTML opens many possibilities - hey, including perhaps cross-referencing and &shy; insertion. @MartinNowak what do you think?

A few notes about tactics:

  • There is appeal to KISS package management, but we should also be wary of unwittingly plopping 6KLOC of code into the project.
  • The code looks foreign compared to all other tools - style is very different etc. It's not that it's worse - I actually prefer Egyptians myself - but it's just different, creates a precedent, etc.
  • Generation should probably issue ddo(c|x) code, so we can control it further. The page would need more than a bare HTML list.
  • In the same spirit as @CyberShadow in another PR, I think a great way to introduce a tool is together with a use of it, in this case to automate some page of the site or to generate a new neat page.

@adamdruppe how do you think we can address these?

@adamdruppe
Contributor Author

On Tue, Jan 12, 2016 at 05:57:44AM -0800, Andrei Alexandrescu wrote:

Thanks! I think this is a neat idea, and having an official means of postprocessing our HTML opens many possibilities - hey, including perhaps cross-referencing and &shy; insertion. @MartinNowak what do you think?

Yeah, it could do that too, though cross-referencing is really more of a source thing. Doing it as a post processor is pretty leaky.

  • There is appeal to KISS package management, but we should also be wary of unwittingly plopping 6KLOC of code into the project.
  • The code looks foreign compared to all other tools - style is very different etc. It's not that it's worse - I actually prefer Egyptians myself - but it's just different, creates a precedent, etc.

Maybe the tools could try using dub and dfmt. Dog food a little!

  • Generation should probably issue ddo(c|x) code, so we can control it further. The page would need more than a bare HTML list.

What do you have in mind?

@andralex
Member

What do you have in mind?

Nothing fancy, just generate a ddoc file with $(UL ...) and $(LI ...). Then we can put it in any shape by defining DDOC appropriately.
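A sketch of what that output could look like, assuming the tool already has a list of (id, title) pairs. TocEntry and renderDdocToc are hypothetical names. LINK2 is one of Ddoc's predefined macros; whether the site would want it or a custom macro here is an open choice.

```d
import std.array : appender;
import std.format : format;
import std.stdio : write;

/// Hypothetical record for one heading: its anchor id and visible title.
struct TocEntry
{
    string id;
    string title;
}

/// Emit the TOC as Ddoc macros instead of raw HTML, so the final markup
/// is decided by whatever the .ddoc file defines for UL, LI and LINK2.
string renderDdocToc(const TocEntry[] entries)
{
    auto ddoc = appender!string();
    ddoc.put("$(UL\n");
    foreach (e; entries)
        ddoc.put(format("    $(LI $(LINK2 #%s, %s))\n", e.id, e.title));
    ddoc.put(")\n");
    return ddoc.data;
}

void main()
{
    write(renderDdocToc([TocEntry("alias", "Alias Declarations"),
                         TocEntry("extern", "Extern Declarations")]));
}
```

Titles containing unbalanced parentheses or commas would need escaping before being placed inside a macro call; the sketch does not handle that.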

@wilzbach
Contributor

We so badly need these sections to be easy to link to though! So something needs to be done.

What's the state of this? What was the intended goal? Just a TOC or simply a Ddoc postprocessor?
In any case it seems like this PR is dead?

there's tons of possibilities to improve the website by expanding on this idea.

Yeah I have recently added footer navigation for the spec, which is really hard to do for such a trivial problem...

@andralex
Member

I'll leave this to @CyberShadow. I recall he had concerns about adding additional dependencies.

@CyberShadow
Member

Ouch, dom.d is 6K LOC. I'm not sure. Is an alternative approach an option?

  • Pre-process the DDoc instead of the HTML
  • Post-process the HTML with ugly but simple hacks, like regular expressions
  • A combination of the above (make DDoc generate HTML markup that's useful for post-processing HTML)
  • Wait until dlang.org is built with DDox (though I don't think anyone is working on this).
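For a sense of scale, the regular-expression option might look roughly like the sketch below. It assumes each linkable heading wraps its text in an <a ... id="..."> element directly inside the <hN> tag; that markup shape is an assumption, and extractHeadings and TocEntry are hypothetical names.

```d
import std.regex : matchAll, regex;
import std.stdio : writefln;

/// Hypothetical record for one heading: its anchor id and visible title.
struct TocEntry
{
    string id;
    string title;
}

/// Pull (id, title) pairs out of headings of the assumed form
/// <hN><a ... id="ID" ...>TITLE</a></hN>. Headings without such an
/// anchor are silently skipped.
TocEntry[] extractHeadings(string html)
{
    // Group 1 is the id attribute, group 2 the link text.
    // The "s" flag lets '.' match newlines inside a heading.
    auto re = regex(
        `<h[1-6][^>]*>\s*<a[^>]*\sid="([^"]*)"[^>]*>(.*?)</a>\s*</h[1-6]>`,
        "s");
    TocEntry[] toc;
    foreach (m; html.matchAll(re))
        toc ~= TocEntry(m[1], m[2]);
    return toc;
}

void main()
{
    auto page = `<h1>Declarations</h1>
<h3><a class="anchor" id="alias">Alias Declarations</a></h3>
<h3><a id="typeof">typeof</a></h3>`;
    foreach (e; extractHeadings(page))
        writefln(`<a href="#%s">%s</a>`, e.id, e.title);
}
```

This is exactly the kind of hack that breaks when the markup changes, which is the trade-off against pulling in a full parser.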

@andralex
Member

@CyberShadow the path of least resistance is to generate ddoc with a different .ddoc macros file and then filter out chaff with sed.

@wilzbach
Contributor

wilzbach commented Jul 4, 2017

Ouch, dom.d is 6K LOC. I'm not sure. Is an alternative approach an option?

Well, we could always use the ugly std.xml, e.g.:

void main(string[] args)
{
    import std.file, std.meta, std.stdio, std.typecons, std.xml;

    // Read the generated HTML page named on the command line.
    string s = readText(args[1]);

    alias TocEntry = Tuple!(string, "id", string, "name");

    TocEntry[] toc;
    auto xml = new DocumentParser(s);
    // Register the same handler for every heading level.
    foreach (heading; AliasSeq!("h1", "h2", "h3", "h4", "h5"))
    {
        xml.onStartTag[heading] = (ElementParser parser)
        {
            TocEntry entry;

            // The anchor inside the heading carries the id to link to.
            parser.onStartTag["a"] = (ElementParser e) {
                if (auto v = "id" in e.tag.attr)
                    entry.id = *v;
            };
            // The heading text becomes the TOC label.
            parser.onText = (string text) { entry.name = text; };
            parser.parse();
            toc ~= entry;
        };
    }
    xml.parse();
    // Emit one link per heading found.
    foreach (entry; toc)
        writefln(`<a href="#%s">%s</a>`, entry.id, entry.name);
}

Post-process the HTML with ugly but simple hacks, like regular expressions
A combination of the above (make DDoc generate HTML markup that's useful for post-processing HTML)

It's not that hard to include a DUB dependency nowadays, but this brings another series of problems, as e.g. I am pretty sure we don't want to maintain an XML library.

Wait until dlang.org is built with DDox (though I don't think anyone is working on this).

Yeah, not sure when/if this is going to happen.

I see these okayish solutions here:

  • use the ugly std.xml to transform the generated HTML
  • use something like footer_gen to generate the headings in Ddoc and store the result in git (the tool should be idempotent, so that it can be rerun to update all headings)

Improving footer_gen is probably a bit better.
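One way to get the idempotence mentioned above is to keep the generated part between two fixed markers and always replace everything between them. The sketch below is not how footer_gen works; updateBlock and the TOC-BEGIN / TOC-END marker strings are hypothetical, and it assumes the generated text never contains the markers itself.

```d
import std.stdio : write;
import std.string : indexOf;

/// Replace whatever sits between beginMarker and endMarker with `generated`.
/// The markers themselves are kept, so running the tool again on its own
/// output produces the same file. Sources without both markers are left alone.
string updateBlock(string source, string beginMarker, string endMarker,
                   string generated)
{
    const start = source.indexOf(beginMarker);
    if (start < 0)
        return source;
    const bodyStart = cast(size_t) start + beginMarker.length;

    // Search for the end marker only after the begin marker.
    const rel = source[bodyStart .. $].indexOf(endMarker);
    if (rel < 0)
        return source;
    const stop = bodyStart + cast(size_t) rel;

    return source[0 .. bodyStart] ~ "\n" ~ generated ~ source[stop .. $];
}

void main()
{
    auto src = "intro\nTOC-BEGIN\nstale content\nTOC-END\nrest\n";
    auto once = updateBlock(src, "TOC-BEGIN", "TOC-END", "$(UL ...)\n");
    auto twice = updateBlock(once, "TOC-BEGIN", "TOC-END", "$(UL ...)\n");
    write(twice);
    assert(once == twice); // rerunning changes nothing
}
```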

@adamdruppe
Contributor Author

adamdruppe commented Jul 5, 2017 via email


@wilzbach
Contributor

wilzbach commented Jan 2, 2018

Okay, I went with improving the existing footer generation script, and it doesn't seem too hard to parse the ddoc macros directly from their raw sources:

dlang/dlang.org#2043

So in this approach of parsing it directly from Ddoc, we couldn't have used your dom.d.

@adamdruppe I'm sorry that your work got wasted :/

@wilzbach closed this Jan 2, 2018
@Geod24 added PR.NeedsAttention and removed stalled labels Apr 8, 2020