Find and list a web page's headings using this Perl script. Helpful for making a table of contents.

README.txt

This script helps you quickly make a table of contents for a web page, by identifying
the page's section headings (<H1>...</H1>, etc.) and printing a list of them.
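
To make the idea concrete, here is a minimal sketch of regex-based heading extraction. It is written in Python for illustration and is not hlist's own Perl code; the pattern and the names list_headings and format_outline are assumptions made up for this example, and hlist's actual matching rules and output format may differ.

```python
import re

# Assumed pattern (not taken from hlist): matches <h1>...</h1> through
# <h6>...</h6>, case-insensitively, capturing the level and the inner HTML.
HEADING_RE = re.compile(
    r"<h([1-6])\b[^>]*>(.*?)</h\1\s*>",
    re.IGNORECASE | re.DOTALL,
)

def list_headings(html):
    """Return (level, inner_html) pairs for each heading, in document order."""
    return [
        # Collapse runs of whitespace so multi-line headings fit on one line.
        (int(m.group(1)), " ".join(m.group(2).split()))
        for m in HEADING_RE.finditer(html)
    ]

def format_outline(headings, indent="  "):
    """Indent each heading by its level: h1 flush left, h2 one step in, etc."""
    return "\n".join(indent * (level - 1) + text for level, text in headings)

if __name__ == "__main__":
    sample = "<H1>Pterosaur</H1><p>...</p><h2 id='desc'>Description</h2><h3>Wings</h3>"
    print(format_outline(list_headings(sample)))
```

Run on the sample string, this prints "Pterosaur", then "Description" indented one step, then "Wings" indented two steps.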

Command-line options let you adjust the output by, among other things, turning each
item into a hyperlink.  Run 'hlist -h' for complete instructions on the syntax.  Or,
for a demonstration, try out these commands on the included file example.html
(adapted from Wikipedia's article on pterosaurs):

hlist example.html
hlist -iT example.html
hlist -i2 -x3 example.html
hlist -oc -i2 example.html
hlist -uan -i2 example.html

Note that hlist is not a CGI script and does not add a table of contents to your page
on the fly.  The idea is that you'll run it yourself and then paste the output into
your page before finalizing it.

If you use named anchors (<H2><A name="...">...</A></H2>) instead of ID attributes
(<H2 id="...">...</H2>), you should probably run hlist before you add them.  The
script preserves any HTML inside the heading tags, in case it's important formatting,
so anchors that are already in place would be copied into the list along with the
heading text.
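
The difference between the two styles can be seen in a small sketch. This is Python written for illustration, not hlist's own Perl code; the pattern and the helper name inner_html are assumptions, chosen only to show what "the HTML inside the heading tags" contains in each case.

```python
import re

# Assumed pattern (not taken from hlist): capture whatever sits between
# the opening and closing heading tags.
HEADING_RE = re.compile(
    r"<h([1-6])\b[^>]*>(.*?)</h\1\s*>",
    re.IGNORECASE | re.DOTALL,
)

def inner_html(html):
    """Return the inner HTML of every heading, markup included."""
    return [m.group(2) for m in HEADING_RE.finditer(html)]

# The id is an attribute of the heading tag itself, so it stays outside
# the captured inner HTML.
with_id = '<H2 id="wings"><i>Wings</i></H2>'

# The named anchor wraps the heading text, so it is part of the inner HTML
# and would travel with it into any list built from that text.
with_anchor = '<H2><A name="wings"><i>Wings</i></A></H2>'

print(inner_html(with_id))      # ['<i>Wings</i>']
print(inner_html(with_anchor))  # ['<A name="wings"><i>Wings</i></A>']
```

In both cases the <i> formatting survives, which is the point of preserving inner HTML; only the named-anchor version also drags the anchor along.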

This script uses regular expressions to parse HTML.  I await your criticism.

hlist is free software.  As of version 2.0.0, it is released under the terms of the
MIT License, which is permissive, non-viral, and short.  See LICENSE.txt for complete
details.  Earlier versions used the GNU General Public License, and may still be
copied on those terms if you prefer.  The example page (example.html) was adapted
from Wikipedia's "Pterosaur" article ( https://en.wikipedia.org/wiki/Pterosaur )
under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported License
( https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License )
and thus may be reused under the same terms.