Skip to content

Commit

Permalink
Basic citation support (#69)
Browse files Browse the repository at this point in the history
* Basic citation support

 - Convert git authors to an oracle cache. When you need to rebuild a
   lot of pages, this can get quite slow, so useful to save between
   runs.
 - Add a bibliography to the source tree (with a single citation in) and
   get Pandoc to read that.
 - Do some massaging to the parsed Markdown/HTML to put the references
   block at the very end of the page (after footnotes). Also patch up
   citation links to include the paper's title:

     [Steele 2017](#ref-Steele2017)

   becomes

     [Steele 2017](#ref-Steele2017 "It’s Time for a New Old Language")

Closes #68

* Take a more structured approach to references

Basically try to parse our references from the tree Pandoc produces
(it's gross, I know, but otherwise we need to copy a lot of Pandoc's
internal logic) and then emit them within the template itself.
  • Loading branch information
SquidDev committed May 9, 2022
1 parent de8ff87 commit 8fd4400
Show file tree
Hide file tree
Showing 5 changed files with 114 additions and 21 deletions.
10 changes: 4 additions & 6 deletions src/1Lab/intro.lagda.md
Expand Up @@ -347,11 +347,9 @@ _$\beta$-reducible expression_, but since we're not really interested in
the theory of programming languages here, this terminology will not come
up.

[^newoldlang]: A [survey paper] of POPL proceedings by Guy L. Steele
identified **twenty-eight** different notations for substitution, so the
word "typical" is.. questionable, at best.

[survey paper]: https://dl.acm.org/doi/abs/10.1145/3155284.3018773
[^newoldlang]: A survey paper of POPL proceedings by Guy L. Steele
[@Steele2017] identified **twenty-eight** different notations for
substitution, so the word "typical" is.. questionable, at best.

In addition, function types enjoy a definitional _uniqueness_ rule,
which says "any function is a $\lambda$ expression". Symbolically, this
Expand Down Expand Up @@ -1298,7 +1296,7 @@ is not organised like this. It's meant to be an _explorable_
presentation of HoTT, where concepts can be accessed in any order, and
everything is hyperlinked together. However, we can highlight the
following modules as being the "spine" of the development, since
everything depends on them, and they're roughly linearly ordered.
everything depends on them, and they're roughly linearly ordered.\\\
## Paths, in detail
Expand Down
19 changes: 19 additions & 0 deletions src/bibliography.bibtex
@@ -0,0 +1,19 @@
@article{Steele2017,
author = {Steele, Guy L.},
title = {It's Time for a New Old Language},
year = {2017},
issue_date = {August 2017},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {52},
number = {8},
issn = {0362-1340},
url = {https://doi.org/10.1145/3155284.3018773},
doi = {10.1145/3155284.3018773},
abstract = {The most popular programming language in computer science has no compiler or interpreter. Its definition is not written down in any one place. It has changed a lot over the decades, and those changes have introduced ambiguities and inconsistencies. Today, dozens of variations are in use, and its complexity has reached the point where it needs to be re-explained, at least in part, every time it is used. Much effort has been spent in hand-translating between this language and other languages that do have compilers. The language is quite amenable to parallel computation, but this fact has gone unexploited.In this talk we will summarize the history of the language, highlight the variations and some of the problems that have arisen, and propose specific solutions. We suggest that it is high time that this language be given a complete formal specification, and that compilers, IDEs, and proof-checkers be created to support it, so that all the best tools and techniques of our trade may be applied to it also.},
journal = {SIGPLAN Not.},
month = {jan},
pages = {1},
numpages = {1},
keywords = {programming languages, specifications, compilers}
}
1 change: 1 addition & 0 deletions support/shake/1lab-shake.cabal
Expand Up @@ -17,6 +17,7 @@ executable shake
Agda,
blaze-html,
bytestring,
citeproc,
containers,
deepseq,
directory,
Expand Down
94 changes: 79 additions & 15 deletions support/shake/app/Main.hs
Expand Up @@ -23,7 +23,7 @@ import Data.Char (isDigit)
import Data.List

import Development.Shake.FilePath
import Development.Shake.Classes
import Development.Shake.Classes (Hashable, Binary, NFData)
import Development.Shake

import Network.URI.Encode (decodeText)
Expand All @@ -33,10 +33,12 @@ import qualified System.Directory as Dir
import Text.HTML.TagSoup
import Text.DocTemplates

import Text.Pandoc.Citeproc
import Text.Pandoc.Filter
import Text.Pandoc.Walk
import Text.Pandoc
import Text.Printf
import qualified Citeproc as Cite

import Agda.Interaction.FindFile (SourceFile(..))
import Agda.TypeChecking.Monad.Base
Expand All @@ -48,7 +50,7 @@ import Agda.Utils.FileName
import qualified System.Environment as Env
import HTML.Backend
import HTML.Base
import System.IO hiding (readFile')
import System.IO (IOMode(..), hPutStrLn, withFile)
import Agda

{-
Expand All @@ -60,7 +62,8 @@ main :: IO ()
main = shakeArgs shakeOptions{shakeFiles="_build", shakeChange=ChangeDigest} $ do
fileIdMap <- newCache parseFileIdents
gitCommit <- newCache gitCommit
gitAuthors <- newCache (gitAuthors (gitCommit ()))
gitAuthors' <- addOracleCache (gitAuthors (gitCommit ()))
let gitAuthors = gitAuthors' . GitAuthors

{-
Write @_build/all-pages.agda@. This imports every module in the source tree
Expand Down Expand Up @@ -264,9 +267,14 @@ gitCommit () = do
Stdout t <- command [] "git" ["rev-parse", "--verify", "HEAD"]
pure (head (lines t))

newtype GitAuthors = GitAuthors FilePath
deriving (Show, Typeable, Eq, Hashable, Binary, NFData)

type instance RuleResult GitAuthors = [Text]

-- | Get the authors for a particular commit.
gitAuthors :: Action String -> FilePath -> Action [Text]
gitAuthors commit path = do
gitAuthors :: Action String -> GitAuthors -> Action [Text]
gitAuthors commit (GitAuthors path) = do
_commit <- commit -- We depend on the commit, but don't actually need it.

-- Sort authors list and make it unique.
Expand All @@ -291,6 +299,18 @@ gitAuthors commit path = do
-- Markdown Compilation
--------------------------------------------------------------------------------

data MarkdownState = MarkdownState
{ mdReferences :: [Val Text]
, mdDependencies :: [String]
}

instance Semigroup MarkdownState where
(MarkdownState r s) <> (MarkdownState r' s') = MarkdownState (r <> r') (s <> s')

instance Monoid MarkdownState where
mempty = MarkdownState mempty mempty


{-| Convert a markdown file to templated HTML.
After parsing the markdown, we perform the following post-processing steps:
Expand All @@ -302,6 +322,8 @@ After parsing the markdown, we perform the following post-processing steps:
which runs @support/build-diagram.sh@ to build the SVG.
- For each equation, invoke katex to compile them to HTML. This is cached
between runs (search for 'LatexEquation' in 'main').
- Extract the references block (if present), passing it through to the
template instead. Also add the paper's title to all reference links.
- Fetch all git authors for this file and add it to the template info.
Finally, we emit the markdown to HTML using the @support/web/template.html@
Expand All @@ -314,9 +336,10 @@ buildMarkdown :: String
buildMarkdown gitCommit gitAuthors input output = do
let
templateName = "support/web/template.html"
bibliographyName = "src/bibliography.bibtex"
modname = dropDirectory1 (dropDirectory1 (dropExtension input))

need [templateName, input]
need [templateName, bibliographyName, input]

modulePath <- findModule modname
authors <- gitAuthors modulePath
Expand All @@ -327,15 +350,28 @@ buildMarkdown gitCommit gitAuthors input output = do
| length modname > 24 = '':reverse (take 24 (reverse modname))
| otherwise = modname

Pandoc meta markdown <- liftIO do
mStr = MetaString . Text.pack
patchMeta
= Meta
. Map.insert "title" (mStr title)
. Map.insert "source" (mStr permalink)
. Map.insert "bibliography" (mStr bibliographyName)
. Map.insert "link-citations" (MetaBool True)
. unMeta

(markdown, references) <- liftIO do
contents <- Text.readFile input
either (fail . show) pure =<< runIO do
md <- readMarkdown def { readerExtensions = getDefaultExtensions "markdown" } [(input, contents)]
applyFilters def [JSONFilter "agda-reference-filter"] ["html"] md
markdown <- readMarkdown def { readerExtensions = getDefaultExtensions "markdown" } [(input, contents)]
Pandoc meta markdown <- applyFilters def [JSONFilter "agda-reference-filter"] ["html"] markdown
let pandoc = Pandoc (patchMeta meta) markdown
(,) <$> processCitations pandoc <*> getReferences Nothing pandoc

let
htmlInl = RawInline (Format "html")

refMap = Map.fromList $ map (\x -> (Cite.unItemId . Cite.referenceId $ x, x)) references

-- | Replace any expression $foo$-bar with <span ...>$foo$-bar</span>, so that
-- the equation is not split when word wrapping.
patchInlines (m@Math{}:s@(Str txt):xs)
Expand All @@ -356,30 +392,57 @@ buildMarkdown gitCommit gitAuthors input output = do
digest = showDigest . sha1 . LazyBS.fromStrict $ Text.encodeUtf8 contents
title = fromMaybe "commutative diagram" (lookup "title" attrs)
liftIO $ Text.writeFile ("_build/diagrams" </> digest <.> "tex") contents
tell ["_build/html" </> digest <.> "svg"]
tell mempty { mdDependencies = ["_build/html" </> digest <.> "svg"] }

pure $ Div ("", ["diagram-container"], [])
[ Plain [ Image (id, "diagram":classes, attrs) [] (Text.pack (digest <.> "svg"), title) ]
]
-- Find the references block, parse the references, and remove it. We write
-- the references as part of our template instead.
patchBlock (Div ("refs", _, _) body) = do
for_ body \ref -> case ref of
(Div (id, _, _) body) -> do
-- If our citation is a single paragraph, don't wrap it in <p>.
let body' = case body of
[Para p] -> [Plain p]
body -> body
-- Now render it the citation itself to HTML and add it to our template
-- context.
body <- either (fail . show) pure . runPure $
writeHtml5String def { writerExtensions = getDefaultExtensions "html" } (Pandoc mempty body')
let ref = Context $ Map.fromList
[ ("id", toVal id)
, ("body", toVal body)
]
tell mempty { mdReferences = [ MapVal ref ]}

_ -> fail ("Unknown reference node " ++ show ref)
pure Null

patchBlock h = pure h

-- Pre-render latex equations.
patchInline (Math DisplayMath contents) = htmlInl <$> askOracle (LatexEquation (True, contents))
patchInline (Math InlineMath contents) = htmlInl <$> askOracle (LatexEquation (False, contents))
-- Add the title to reference links.
patchInline (Link attrs contents (target, ""))
| Just citation <- Text.stripPrefix "#ref-" target
, Just ref <- Map.lookup citation refMap
, Just title <- Cite.valToText =<< Cite.lookupVariable "title" ref
= pure $ Link attrs contents (target, title)
patchInline h = pure h

mStr = MetaString . Text.pack
patchMeta = Meta . Map.insert "title" (mStr title) . Map.insert "source" (mStr permalink) . unMeta

liftIO $ Dir.createDirectoryIfMissing False "_build/diagrams"

markdown <- pure . walk patchInlines . Pandoc (patchMeta meta) $ markdown
markdown <- pure $ walk patchInlines markdown
markdown <- walkM patchInline markdown
(markdown, dependencies) <- runWriterT $ walkM patchBlock markdown
(markdown, MarkdownState references dependencies) <- runWriterT (walkM patchBlock markdown)
need dependencies

text <- liftIO $ either (fail . show) pure =<< runIO do
template <- getTemplate templateName >>= runWithPartials . compileTemplate templateName
>>= either (throwError . PandocTemplateError . Text.pack) pure

let
authors' = case authors of
[] -> "Nobody"
Expand All @@ -389,6 +452,7 @@ buildMarkdown gitCommit gitAuthors input output = do
context = Context $ Map.fromList
[ (Text.pack "is-index", toVal (modname == "index"))
, (Text.pack "authors", toVal authors')
, (Text.pack "reference", toVal references)
]
options = def { writerTemplate = Just template
, writerTableOfContents = True
Expand Down
11 changes: 11 additions & 0 deletions support/web/template.html
Expand Up @@ -150,6 +150,17 @@ <h3 class="Agda" style="margin-top: 0; margin-bottom: 0; white-space: pre;">
</div>

$body$
$if(reference)$
<hr />
<section id="refs" role="doc-bibliography">
<h2>References</h2>
<ul>
$for(reference)$
<li id="$reference.id$" role="doc-biblioentry">$reference.body$</li>
$endfor$
</ul>
</section>
$endif$
</article>
</div>
</main>
Expand Down

0 comments on commit 8fd4400

Please sign in to comment.