Remove anchor tag in xml #144

Closed
hcf-n opened this issue Jan 22, 2024 · 5 comments

hcf-n commented Jan 22, 2024

I use make4ht to convert to DocBook. The resulting XML contains several anchor tags. Is there a way to remove or prevent these tags? (Some anchor tags are self-closing, and some are open-close pairs.)

best regards
hcf

@michal-h21
Owner

These anchors are usually created by \label, or at places where labels could be used, such as sections or tables. I think the easiest way to remove them is to create a make4ht DOM filter that removes anchors that no link points to.

Can you create a MWE that shows these extra anchors?

@hcf-n
Author

hcf-n commented Jan 22, 2024

Of course :)

The following minimal example produces several different anchor tags when converted to DocBook XML.

\documentclass[11pt, a4paper]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc,url}
\usepackage{textcomp}
\usepackage[style=authoryear-comp]{biblatex}
\addbibresource{\jobname.bib}
\usepackage{filecontents}

\begin{filecontents}{\jobname.bib}
@book{key,
author = {Author, A.},
year = {2001},
title = {Title},
publisher = {Publisher},
}
\end{filecontents}

\title{Placeholder for title}
\author{Firstname Lastname}
\date{\today}

\begin{document}
\maketitle
\section{Test}
\begin{enumerate}
  \item First item
  \item Second item
\end{enumerate}

Sentence.\footnote{Example footnote.}

Sentence.\footcite{key}

\end{document}
	

@michal-h21
Owner

This make4ht build file should do it:

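local domfilter = require "make4ht-domfilter"

local process = domfilter {
  function(dom)
    -- collect the targets of all links, with the leading "#" stripped
    local links = {}
    for _, el in ipairs(dom:query_selector("link")) do
      local href = el:get_attribute("xlink:href")
      -- skip links without xlink:href instead of failing on nil
      if href then
        local id = href:gsub("^#", "")
        links[id] = true
      end
    end
    -- remove every anchor that no link points to
    for _, el in ipairs(dom:query_selector("anchor")) do
      if not links[el:get_attribute("xml:id")] then
        el:remove_node()
      end
    end

    return dom
  end
}

-- run the filter on every generated XML file
Make:match("xml$", process)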

It first collects the targets of all links, so that anchors some link points to are kept. Then it loops over the anchors and removes the ones that no link points to.
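To try the same two-pass logic outside of a make4ht run, here is a minimal sketch that applies it to a small hand-written fragment. It assumes texlua and the LuaXML library (luaxml-domobject, which make4ht DOM filters are built on) are installed. The sample XML and the function name remove_unused_anchors are made up for illustration and do not come from the MWE output.

```lua
-- Run with texlua; kpse lets require find LuaXML in the TeX tree.
kpse.set_program_name "luatex"
local domobject = require "luaxml-domobject"

-- Hypothetical sample input, not actual make4ht output.
local sample = [[
<article xmlns:xlink="http://www.w3.org/1999/xlink">
<anchor xml:id="used"/>
<anchor xml:id="unused"/>
<para>See <link xlink:href="#used">the target</link>.</para>
</article>
]]

-- Hypothetical helper: same logic as the build file, returns the
-- number of anchors it removed.
local function remove_unused_anchors(dom)
  -- first pass: remember every id that some link points to
  local targets = {}
  for _, el in ipairs(dom:query_selector("link")) do
    local href = el:get_attribute("xlink:href")
    if href then
      local id = href:gsub("^#", "")
      targets[id] = true
    end
  end
  -- second pass: drop anchors whose id is not a link target
  local removed = 0
  for _, el in ipairs(dom:query_selector("anchor")) do
    local id = el:get_attribute("xml:id")
    if not (id and targets[id]) then
      el:remove_node()
      removed = removed + 1
    end
  end
  return removed
end

local dom = domobject.parse(sample)
local removed = remove_unused_anchors(dom)
print("removed anchors:", removed)
print(dom:serialize())
```

In this fragment only the anchor with xml:id="unused" has no link pointing to it, so that is the one the function is expected to remove. The anchor with xml:id="used" should remain in the serialized output.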

Unfortunately, I've found a bug in make4ht, so the filter may fail for you if your document contains any links (for example, links created by the \ref command). In that case it will print that XML parsing failed. This should be fixed in the development version of make4ht.

@hcf-n
Author

hcf-n commented Jan 24, 2024

Thank you, this works fine!

@michal-h21
Owner

Great! So should I close this issue?
