structured abstract generated for HTML and LaTeX from JATS XML #8015
Comments
I'm tempted to say: if you don't want level-1 headings in the abstract, don't use Markdown If you don't have control over that, another option is to use a Lua filter that converts level-1 headings in an abstract to something else.
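The filter idea can be sketched outside Lua as well. The following is a minimal illustration in Python that works on pandoc's JSON AST, assuming the abstract arrives as a `MetaBlocks` value under `meta.abstract` and that the headings sit at the top level of that block list. The function names `demote_headers` and `fix_abstract` are made up for this example, and turning a heading into a bold paragraph is just one possible choice of "something else".

```python
def demote_headers(blocks):
    """Return a copy of `blocks` with each Header replaced by a bold paragraph."""
    out = []
    for block in blocks:
        if block.get("t") == "Header":
            # A Header's content is [level, attr, inlines]; only the inlines are kept.
            _level, _attr, inlines = block["c"]
            out.append({"t": "Para", "c": [{"t": "Strong", "c": inlines}]})
        else:
            out.append(block)
    return out


def fix_abstract(doc):
    """Demote headers inside meta.abstract; the document body is left untouched."""
    abstract = doc.get("meta", {}).get("abstract")
    # Assumption: the reader stored the abstract as MetaBlocks.
    if abstract and abstract.get("t") == "MetaBlocks":
        abstract["c"] = demote_headers(abstract["c"])
    return doc


# Tiny hand-written document standing in for real pandoc JSON output.
example = {
    "meta": {
        "abstract": {
            "t": "MetaBlocks",
            "c": [
                {"t": "Header", "c": [1, ["objective", [], []], [{"t": "Str", "c": "Objective"}]]},
                {"t": "Para", "c": [{"t": "Str", "c": "Text"}]},
            ],
        }
    },
    "blocks": [],
}
print(fix_abstract(example)["meta"]["abstract"]["c"][0]["t"])
```

To use something like this as a real filter, it would need to read the JSON document from stdin and write the result to stdout, which is how pandoc's `--filter` option talks to external programs.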
In hindsight, my repro steps are confusing in that I start with So having I haven't learned how to make Lua filters, but that sounds like a reasonable approach if one wants to generate HTML and LaTeX from JATS XML. In my particular situation I have a quick workaround, so I'm good for now. For the long term, I suspect I will want to upgrade my JATS -> HTML/LaTeX conversion from the Swiss Army knife that is pandoc to a more specialized knife that only cuts JATS. It's amazing that pandoc can convert so much to so much! But I bet it's inevitable I'll want to upgrade to a specialized JATS -> HTML/LaTeX solution soon.
To help clarify this issue, here is a summary. The attached JATS XML example is (roughly):
which pandoc converts to (roughly):
where the pandoc template variable
So the issue here is that pandoc is converting JATS
The root cause here is that A solution to this could be to write a customized treatment for pandoc/src/Text/Pandoc/Readers/JATS.hs Line 389 in 16f28ef
To a behaviour that processes the inner
Sounds like a promising idea. Thanks for thinking it through! However, to be honest, I barely understand the code. I'm not very fluent in Haskell. A net result that seems like a big improvement is something like:
getting converted to
But if it's easier to output
An easier fix would be to add a function to the abstract processing that just converts the Header elements to something more appropriate.
FWIW, my really easy fix is to just not use headers in abstracts. 😅 So as an author I do this instead of authoring section headers:
which after
Actually, an even easier solution would be to wrap the pandoc/src/Text/Pandoc/Readers/JATS.hs Lines 336 to 340 in 16f28ef
So the
To be honest, this level of JATS processing is probably beyond the scope of pandoc. I imagine that at some point there is a level of JATS-specific semantics for which pandoc is no longer the right tool for the job. So I'm labeling this as an enhancement.
Nonetheless, I report the limitation here with pandoc 2.18.
REPRO STEPS
With source.md
GOT
jats.xml.txt
got.tex.txt
got.html.txt
EXPECTED
The abstract to NOT be the same conversion of JATS XML <sec>, <title>, <p> elements that is done in the body. Rather, it should be something semantic for the abstract section.
For instance, the HTML generated for the abstract is
which isn't really right because "Objective" is not an h1-level heading. Although CSS is powerful enough to hack around this, it would be more appropriate to output something like:
In the case of LaTeX, the current output is:
My guess is there is a way to hack around this in LaTeX, but I'm not as knowledgeable with LaTeX as with HTML/CSS.
Currently the default output looks pretty bad for JATS structured abstracts in both default HTML and LaTeX.
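To make the expectation more concrete, here is a rough sketch of a JATS-specific conversion for the abstract only, written in Python with the standard library. It is not what pandoc does, and the desired markup is not settled in this issue: the `abstract` and `abstract-heading` class names, the choice of `<strong>` for section titles, and the sample XML are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape


def abstract_to_html(jats_abstract: str) -> str:
    """Render a structured <abstract> as HTML without using heading elements."""
    root = ET.fromstring(jats_abstract)
    parts = ['<div class="abstract">']  # hypothetical class name
    for sec in root.findall("sec"):
        title = sec.findtext("title", default="")
        # The section title becomes a labelled paragraph rather than an <h1>.
        parts.append(
            f'<p class="abstract-heading"><strong>{escape(title)}</strong></p>'
        )
        for p in sec.findall("p"):
            # itertext() flattens inline markup; acceptable for a sketch.
            parts.append(f"<p>{escape(''.join(p.itertext()))}</p>")
    parts.append("</div>")
    return "\n".join(parts)


# Made-up sample input, not the attachment from this issue.
sample = """
<abstract>
  <sec><title>Objective</title><p>To test structured abstracts.</p></sec>
  <sec><title>Methods</title><p>We converted JATS XML.</p></sec>
</abstract>
"""
print(abstract_to_html(sample))
```

Inline markup inside the paragraphs (emphasis, links, and so on) is flattened to plain text here, so a real converter would need to handle those elements separately.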