A tool that extracts links and definitions from the CL HyperSpec as an RDF graph
Common Lisp
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.


Interlinking information for the HyperSpec

The CLHS comes with ~110000 links. We know where each one of them goes, but where do they come from?

Imagine you're looking at a glossary entry. It's normative, but to which areas of the spec is it relevant? Which functions depend on its definition? I often asked myself that, and it bothered me enough that I wrote this program to extract the hyperlinking structure from the CLHS into an RDF graph, which can be imported into any database (of course, I recommend agraph) that can answer these types of questions.

= Creating the triple information

  1. Install the dependencies:
  • asdf
  • cxml-stp
  • closure-html
  • cl-ppcre
  • drakma
  1. Load the clsem.asd, (asdf:oos 'asdf:load-op :clsem)
  2. (clsem:do-it #p"/path/to/output/file.ttl")

This will query the lispworks HTTP servers and will take a long, long time. If you have a copy of the HyperSpec downloaded, you can use:

      (clsem:do-it #p"/path/to/output/file.ttl"
                   :prefix "file:///Users/asf/Downloads/HyperSpec-7-0/HyperSpec")

And it will finish in ~9 seconds.

This file is in the turtle RDF triple language format. If you import the data into a graph database, things may be easier if you convert it to the ntriples format first. I recommend using the most excellent rapper for this task.