
# Explicit Figure element in Block #3177

Open
opened this issue Oct 23, 2016 · 64 comments · May be fixed by jgm/pandoc-types#83

### jgm commented Oct 23, 2016

 Currently we represent figures in the AST using this hack: a figure is an Image whose title attribute starts with fig: and which is by itself in a Para. Short of a full-featured figure environment in the AST, it would make sense to move to a less hacky representation: a Div with class figure containing the image (which need not have a title starting with fig:). This would involve changes to readers and writers. Indeed, if we did this, we could support figures containing multiple images, via explicit Divs.
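
To make the two representations concrete, here is a minimal sketch. It assumes simplified stand-ins for the pandoc-types definitions (String instead of Text, only the constructors needed); the function names `isImplicitFigure` and `toFigureDiv` are hypothetical and not part of pandoc.

```haskell
import Data.List (isPrefixOf, stripPrefix)

-- Simplified stand-ins for the pandoc-types definitions (assumption:
-- String instead of Text, and only the constructors needed here).
type Attr = (String, [String], [(String, String)])  -- (id, classes, key-value pairs)

data Inline
  = Str String
  | Image Attr [Inline] (String, String)  -- attributes, alt text, (url, title)
  deriving (Show, Eq)

data Block
  = Para [Inline]
  | Div Attr [Block]
  deriving (Show, Eq)

-- | The current convention: a paragraph holding exactly one image
-- whose title starts with "fig:".
isImplicitFigure :: Block -> Bool
isImplicitFigure (Para [Image _ _ (_, title)]) = "fig:" `isPrefixOf` title
isImplicitFigure _ = False

-- | Rewrite an implicit figure into the proposed form: a Div with class
-- "figure" wrapping the image, with the "fig:" marker removed.
toFigureDiv :: Block -> Block
toFigureDiv (Para [Image attr alt (url, title)])
  | Just rest <- stripPrefix "fig:" title =
      Div ("", ["figure"], []) [Para [Image attr alt (url, rest)]]
toFigureDiv block = block
```

Any other block passes through `toFigureDiv` unchanged, so it could in principle be mapped over a whole document.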

### jgm commented Oct 23, 2016

 Another advantage is that attributes could be added explicitly to the Div. 
![my image](img.jpg){.imageclass}
 See #3094.

### hrehfeld commented Oct 24, 2016 • edited

 Seconded, I just spent an hour figuring out why multiple images in a paragraph don't create a figure. Is there any way to create a figure with multiple images right now? (I guess creating Rawblocks will work?) Relevant code is here? https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/HTML.hs#L450 Why is the match only for one image and not multiple ones in the first place?

### jgm commented Oct 25, 2016

 If multiple images were allowed, which one would form the figure's caption? (There is only one caption.) What would determine how the images are arrayed in the figure? In retrospect, some kind of more explicit syntax for figures would have been desirable, and maybe that's the direction we should move in.

### hrehfeld commented Oct 25, 2016 • edited

Layout: Hm, I only use html and latex backends, but both of those handle multiple images in a figure in a reasonable way without extra specification. However, in latex IIRC it makes a difference if there is a SoftBreak between images (break vs. no break). IMHO it would be up to the writer/backend to break up figures if multiple images are not supported.

Caption: None unless explicitly stated? That's legal in both html and latex iirc. However I believe Paras do not support extra data like attrs, so that a div or figure node would be easier.

### jgm commented Oct 25, 2016

+++ Hauke Rehfeld [Oct 24 16 16:42 ]:

> Seconded, I just spent an hour figuring out why multiple images in a paragraph don't create a figure. Is there any way to create a figure right now (I guess creating Rawblocks will work?)

You can paste the images together into one image, I suppose, or use a filter.

> Relevant code is here? https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/HTML.hs#L450 Why is the match only for one image and not multiple ones in the first place?

Good question. I suppose that since I'm in a field where pictorial figures aren't used much, I didn't realize how common figures with multiple images and a single caption are. But how would it work to allow a paragraph with multiple images (and nothing else) to form a figure? From which image would the figure's caption be taken? What determines how the images are arranged in the figure (side by side, in a square, etc.)? Really we need a more explicit syntax for figures if this kind of thing is going to be allowed.


### jgm commented Feb 26, 2017

 I think we need an explicit Block element for Figure.

added the AST change label Feb 26, 2017

### mb21 commented Feb 26, 2017 • edited

 The question is whether this figure element should only contain images, or if it should be a general floating-container more analogous to the LaTeX \begin{figure} and HTML5 figure elements (emphasis added): Usually a 

### jgm commented Feb 26, 2017 via email

+++ Mauro Bieg [Feb 26 17 05:07 ]:

> If so, the figure element should contain a caption (multiple paragraphs allowed) and arbitrary block content: Figure Attr [Block] [Block]

That's what I was thinking.
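
As a rough illustration of the `Figure Attr [Block] [Block]` shape, the sketch below builds one figure with a two-paragraph caption and two images. The surrounding types are simplified stand-ins for pandoc-types, the order of the two block lists (caption first, then content) is an assumption, and `figureImageUrls` is a hypothetical helper.

```haskell
-- Simplified stand-ins for pandoc-types (assumption: String instead of Text).
type Attr = (String, [String], [(String, String)])

data Inline
  = Str String
  | Space
  | Image Attr [Inline] (String, String)  -- attributes, alt text, (url, title)
  deriving (Show, Eq)

data Block
  = Para [Inline]
  | Plain [Inline]
  | Figure Attr [Block] [Block]  -- attributes, caption blocks, content blocks (order assumed)
  deriving (Show, Eq)

-- | A figure with a multi-paragraph caption and two images as content.
exampleFigure :: Block
exampleFigure =
  Figure
    ("foo", ["right"], [])
    [ Para [Str "First", Space, Str "caption", Space, Str "paragraph."]
    , Para [Str "Second", Space, Str "caption", Space, Str "paragraph."]
    ]
    [ Plain [Image ("", ["imageclass"], []) [] ("img.jpg", "")]
    , Plain [Image ("", [], []) [] ("img2.jpg", "")]
    ]

-- | URLs of the images placed directly in a figure's content blocks.
figureImageUrls :: Block -> [String]
figureImageUrls (Figure _ _ content) =
  [ url | block <- content, Image _ _ (url, _) <- inlinesOf block ]
  where
    inlinesOf (Para ils) = ils
    inlinesOf (Plain ils) = ils
    inlinesOf _ = []
figureImageUrls _ = []
```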

added this to the pandoc 2.0 milestone Aug 15, 2017

### jgm commented Aug 20, 2017 • edited

 Development on figures branch for pandoc, pandoc-types.

### jgm commented Aug 20, 2017 • edited

Thinking about explicit Markdown syntaxes for figures. If we had a native syntax for divs, we could treat any div with a following caption as a figure:

    ;--- {#foo .right}
    ![my image](img.jpg){.imageclass}
    ![second image](img2.jpg)
    ;---
    ^^ [This is the optional short caption.] This is the long caption. It can span multiple blocks. Syntax like footnotes.

       Subsequent paragraphs indented.

We automatically treat a div as a figure if it is followed by a caption. Or is it too confusing if the caption comes outside the div?

Another possibility would be to have a special kind of marking for figures, like:

    !--- {#foo .right}
    ![my image](img.jpg){.imageclass}
    ![second image](img2.jpg)

    [This is the optional short caption.] This is the long caption. It can span multiple blocks. Syntax like footnotes.

    In this version, all the Para elements at the end of the structure are treated as the caption, so we don't need an explicit syntax to mark the caption.
    !---

I'm trying to avoid syntaxes that require you to use the word figure. I like the ^^ syntax for attaching captions; we might want to use this for tables as well (and code blocks) if we use it for this.
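
To see how the proposed `^^` caption line might be split into its optional short caption and long caption, here is a small sketch. It is not pandoc's parser: `Caption` and `parseCaptionLine` are hypothetical names, it handles only a single line, and it assumes the short caption contains no nested brackets.

```haskell
import Data.Char (isSpace)
import Data.List (stripPrefix)

-- | Hypothetical result type: optional short caption plus the long caption text.
data Caption = Caption
  { shortCaption :: Maybe String
  , longCaption  :: String
  } deriving (Show, Eq)

-- | Parse one line of the form "^^ [short] long" or "^^ long".
-- Returns Nothing when the line does not start with the "^^" marker.
parseCaptionLine :: String -> Maybe Caption
parseCaptionLine line = do
  rest <- stripPrefix "^^" line
  let body = dropWhile isSpace rest
  pure $ case body of
    -- A leading "[...]" is taken as the short caption.
    '[' : more
      | (short, ']' : long) <- break (== ']') more ->
          Caption (Just short) (dropWhile isSpace long)
    -- Otherwise the whole remainder is the long caption.
    _ -> Caption Nothing body
```

A real reader would also have to collect the indented continuation paragraphs, which this sketch ignores.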

changed the title from "Better handling of implicit figures" to "Explicit Figure element in Block" Aug 20, 2017

### TODO on figures branch on pandoc and pandoc-citeproc

• Finish updating writers to handle Figure
• RST
• Markdown
• LaTeX
• HTML
• Org/Blocks
• MediaWiki
• Get everything to compile
• Update tests
• Update ToJSON, FromJSON instance in pandoc-types to include Figure, Caption
• Update Arbitrary in pandoc-types (it lacks Div, Figure, Span, probably others)
• Markdown syntax, new extension?
• Figure numbering and internal refs?

### jgm commented Aug 20, 2017 • edited

 Having second thoughts about the type now. I suspect that allowing ANY kind of block content inside a figure is not going to work well in many output formats. E.g. in docbook, only certain elements are allowed inside a figure element. http://tdg.docbook.org/tdg/4.5/figure.html Perhaps instead the contents should be limited to a list of images (or perhaps a list of lists of images, so they can be organized on lines? -- though it may be better to let the layout happen automatically, given width information). Perhaps listings could go in figures as well?
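
One way to picture the restricted alternative (a list of lists of images) is a conversion from general block content that fails when anything other than images is present. This is only a sketch under assumed, simplified types; `FigureImage`, `FigureRows`, and `toRows` are hypothetical names.

```haskell
-- Simplified stand-ins for pandoc-types (assumption: String instead of Text).
type Attr = (String, [String], [(String, String)])

data Inline
  = Str String
  | Space
  | SoftBreak
  | Image Attr [Inline] (String, String)  -- attributes, alt text, (url, title)
  deriving (Show, Eq)

data Block
  = Para [Inline]
  | Plain [Inline]
  | CodeBlock Attr String
  deriving (Show, Eq)

-- | One image inside a restricted figure (hypothetical type).
data FigureImage = FigureImage Attr [Inline] (String, String)
  deriving (Show, Eq)

-- | Rows of images, so that images can be organized on lines.
type FigureRows = [[FigureImage]]

-- | Read block content as rows of images: each Para or Plain becomes one
-- row, whitespace between images is ignored, and any other content makes
-- the whole conversion fail with Nothing.
toRows :: [Block] -> Maybe FigureRows
toRows = traverse blockToRow
  where
    blockToRow (Para ils) = inlinesToRow ils
    blockToRow (Plain ils) = inlinesToRow ils
    blockToRow _ = Nothing
    inlinesToRow ils = traverse inlineToImage (filter (not . isBreak) ils)
    inlineToImage (Image attr alt target) = Just (FigureImage attr alt target)
    inlineToImage _ = Nothing
    isBreak x = x == Space || x == SoftBreak
```

Under this restriction a `CodeBlock` is rejected, which is exactly the open question about whether listings should be allowed in figures.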

### jgm commented Aug 20, 2017

 I think I'm going to remove this from the 2.0 milestone as it still needs more thought.

removed this from the pandoc 2.0 milestone Aug 20, 2017

### mb21 commented Aug 21, 2017

> Having second thoughts about the type now. I suspect that allowing ANY kind of block content inside a figure is not going to work well in many output formats.

I still think taking the most general approach in the AST makes sense. There are always going to be some formats that don't support certain things, but that should be handled by the respective writers and the AST design shouldn't be held up by those. It would be great to have a general block figure element to output to HTML/ePUB/LaTeX...

### mb21 commented Aug 21, 2017 • edited

Concerning the caption syntax, I kind of prefer the second one, since it is clearly placed inside the figure/div element. A third variant:

    ;--- {#foo .right}
    ![my image](img.jpg){.imageclass}
    ![second image](img2.jpg)
    ;--
    This is the long caption. It can span multiple blocks. Syntax like footnotes.
    ;--
    This is the optional short caption. Since it's optional, it needs to go at the end in this syntax.
    ;---

### jgm commented Aug 22, 2017

 What kinds of things do people really put in figures, besides images?

### mb21 commented Aug 23, 2017 • edited

 Maybe the element I have in mind is more of a Float than a Figure. Again the MDN extract posted above: Usually a 

### jgm commented Aug 23, 2017 via email

 Yes, this is the big conceptual issue to decide. Whether to have separate elements for (possibly floating) figures, tables, and listings with captions, or to make all of these non-floating and non-captioned, but add a container element that makes them floating and adds a caption.

### mb21 commented Aug 26, 2017

I'm currently leaning towards the second option (general float/caption container). Use cases include floating more than just images (e.g. float two tables that share a caption), or having one figure with a caption that contains subfigures (or images), each with its own caption.

It's probably true that it gets a bit trickier to consider all cases in all writers, but it is a more flexible option.
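
A general container of this kind is naturally recursive. The sketch below uses a deliberately tiny, hypothetical `Block` type (the `Float` constructor and `floatDepth` are illustrative names, not pandoc-types) to show a captioned figure with two captioned subfigures, and two tables sharing one caption.

```haskell
-- Hypothetical, heavily simplified block type for illustration only.
data Block
  = Para String
  | Image String           -- image URL
  | Table [[String]]       -- rows of cells
  | Float [Block] [Block]  -- caption blocks, body blocks
  deriving (Show, Eq)

-- | One figure with a caption, containing two subfigures with their own captions.
subfigureExample :: Block
subfigureExample =
  Float [Para "Two views of the sample."]
    [ Float [Para "(a) Front."] [Image "front.png"]
    , Float [Para "(b) Back."] [Image "back.png"]
    ]

-- | Two tables floated together under a shared caption.
sharedCaptionExample :: Block
sharedCaptionExample =
  Float [Para "Results for both runs."]
    [ Table [["run", "1"]]
    , Table [["run", "2"]]
    ]

-- | Nesting depth of floats: 0 for a non-float, 1 for a float without
-- nested floats, and so on. Writers would need this to decide whether
-- the output format can express the nesting.
floatDepth :: Block -> Int
floatDepth (Float _ body) = 1 + maximum (0 : map floatDepth body)
floatDepth _ = 0
```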


## Captions

Markdown and HTML (and MediaWiki etc.) support six heading levels; plain Latex supports up to seven named levels (\part, \chapter, \section, \subsection, \subsubsection, \paragraph, \subparagraph). Most authors reserve the top level for single use at the start of the document, i.e. as title (although HTML and Latex have different dedicated markup for them), if they use it at all. Most style guides tell writers to avoid more than three heading levels, but in technical documents deeply nested hierarchies do occur. The general syntax could support deeper levels as well: just repeat the prefix (and optional postfix) character # more often.

My point is, captions could use the lowest heading level already available:

###### Caption

![text](target) 

or another level could be introduced for them, systematically:

####### Caption

![text](target) 

## Contents

In modern forum, blog and chat software and social websites, plain links are often automatically converted to informative “cards” by fetching metadata like title, author and cover image. Links to media files, audio and video recordings in particular, are also displayed with embedded playback controls. These can hardly be distinguished, conceptually, from traditional (floating) figures.

I therefore suggest that implicit figures shall support any number and combination of links ([foo](bar), [foo][baz], [foo][], [foo], <bar>) and embedded media (![foo](bar), ![foo][baz], ![foo][], ![foo]) as long as they are the only contents of a paragraph. In practice, authors will often put each one in a line of its own, but this, probably, cannot be relied upon.

A single, complex figure:

![text](target)
![text](target)

Another single, complex figure:

![text](target)![text](target)

Two simple figures:

![text](target)

![text](target) 
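
The rule suggested here can be written as a predicate over a paragraph's inlines. This is a sketch with a hypothetical, simplified `Inline` type; ignoring spaces as well as soft breaks between items is an assumption, since the proposal only shows line breaks and directly adjacent images.

```haskell
-- Hypothetical, simplified inline type for illustration only.
data Inline
  = Str String
  | Space
  | SoftBreak
  | Link String String   -- link text, target
  | Image String String  -- alt text, target
  deriving (Show, Eq)

-- | A paragraph forms an implicit figure when, after ignoring whitespace
-- between items (assumption), it is non-empty and consists only of links
-- and embedded media.
isFigureParagraph :: [Inline] -> Bool
isFigureParagraph inlines = not (null items) && all isMedia items
  where
    items = filter (not . isBreak) inlines
    isBreak x = x `elem` [Space, SoftBreak]
    isMedia (Link _ _) = True
    isMedia (Image _ _) = True
    isMedia _ = False
```

With this predicate, both "complex figure" examples above are single figures, and the last example yields two figures because it is two separate paragraphs.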

### lierdakil commented Jul 6, 2020 • edited

 @Crissov Pandoc Markdown does not have a limit on header level. Additionally, many other formats don't either, and we want Pandoc Markdown to be (at least somewhat) interoperable with those. So that's a hard blocker to your proposal.

### hubgit commented Jul 14, 2020

 It would be great to see support for figures as essentially wrappers that associate a caption with some content. Both HTML and JATS XML allow a fairly wide range of content inside their 

### despresc commented Sep 5, 2020

Just to collect some thoughts from reading the thread above and looking through some of the supported formats:

There are two sorts of figures. One type is a floating captioned container, which would most easily have this type in Pandoc:

    -- Caption from Table could be removed.
    data Block = ...
      | Figure Attr Caption CaptionPos FigureWidth [Block]
      ...

    -- A Figure with [Table...] or [CodeBlock...] content could be a
    -- captioned table or listing (for numbering or in output, if there
    -- are separate captions for those elements).
    -- Not sure how the Figure and Table Attrs would be handled in
    -- HTML output in that case. Just use the Figure's? Or merge.

    -- A Figure with [Plain [Image...]] content (or a Figure with
    -- a sequence of those figures as content) could be a
    -- gallery-type figure (the second kind).

    -- Caption position is frequently customizable
    data CaptionPos = CaptionBelow | CaptionAbove

    -- The figure width is necessary for subfigures in many formats.
    -- Handling it like Table columns (fractions of the enclosing
    -- container width) should work.
    data FigureWidth = FigureWidth Double | FigureWidthDefault

Some support for this type of figure, that I know of: HTML5 has a 

### despresc commented Sep 5, 2020

 The second (gallery) version could be a grid, but I don't know how well that's supported in the outputs. Possibly support could be added with whatever native tables the output supported, if there wasn't native support for the grid version.
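
The distinction between captioned tables, listings, galleries, and generic floats could be computed from a figure's body. The following is a sketch under assumptions: the types are simplified stand-ins (no Attr, caption as a plain block list), only singleton `[Table]` and `[CodeBlock]` bodies are recognized, and `FigureKind` and `figureKind` are hypothetical names.

```haskell
-- Hypothetical, simplified types for illustration only.
data Inline
  = Str String
  | Image String String  -- alt text, URL
  deriving (Show, Eq)

data CaptionPos = CaptionBelow | CaptionAbove
  deriving (Show, Eq)

data FigureWidth = FigureWidth Double | FigureWidthDefault
  deriving (Show, Eq)

data Block
  = Plain [Inline]
  | CodeBlock String
  | Table [[String]]
  | Figure [Block] CaptionPos FigureWidth [Block]  -- caption, position, width, body
  deriving (Show, Eq)

data FigureKind = CaptionedTable | Listing | Gallery | GenericFloat
  deriving (Show, Eq)

-- | Classify a figure by its body, following the cases listed in the
-- comments of the proposed type.
figureKind :: [Block] -> FigureKind
figureKind [Table _] = CaptionedTable
figureKind [CodeBlock _] = Listing
figureKind body
  | isImageBody body = Gallery
  | not (null body) && all isImageFigure body = Gallery
  | otherwise = GenericFloat
  where
    isImage (Image _ _) = True
    isImage _ = False
    -- A body that is a single Plain block containing only images.
    isImageBody [Plain inlines] = not (null inlines) && all isImage inlines
    isImageBody _ = False
    -- A nested figure whose own body is images only.
    isImageFigure (Figure _ _ _ inner) = isImageBody inner
    isImageFigure _ = False
```

A writer could dispatch on the result to pick a native table caption, a listing environment, or a grid layout where the output format has one.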

### hrehfeld commented Sep 7, 2020 via email

To be precise/nitpicking, my understanding is that in html5 the figure/caption element is just a semantically tagged container (a block element), without any special formatting.

I think one confusion that we frequently deal with when converting between formats is when to focus on retaining semantics versus retaining presentation. I would almost always argue that semantics need to be preserved and visuals need to be adapted, but I am not familiar with too many formats.

So for me a figure is "a block element with an optional caption, possibly containing a series of these". It's hard to define without recursion. However, in latex you frequently do not use subfigure (each block has its own caption), but just lay out the sub-blocks and put the captions in a list textually, e.g. "(left) first image caption. (right) second image caption."

### despresc commented Sep 7, 2020

That is my understanding of HTML5 figures as well.

> The figure element represents some flow content, optionally with a caption, that is self-contained (like a complete sentence) and is typically referenced as a single unit from the main flow of the document.

From the spec. Both it and MDN mention that it can be (or usually is) used for content that can be moved elsewhere without affecting the main flow of a document, and is best referenced in the text with a label like "Figure 7" so it can be moved. That's how floats are used in LaTeX, though LaTeX is much happier to move floats around by default than web browsers are figures, and will automatically put "Table 3" or "Figure 2" in the caption for you.

I also prefer the semantically cleaner container version of figures. Their recursive structure does mean that figures inside Pandoc may be nested more deeply than figures are allowed in the output, so it's important to know what extensions (LaTeX packages and the like) can be used in the outputs to deal with nested figures.

### despresc commented Sep 7, 2020

 The HTML writer currently has to deal with the fact that HTML headings only go up to h6. Its fallback when encountering a Header past 6 is to render it as a paragraph with the heading class. So, eventually, the writers that have depth-limited figures and tables could keep track of the current figure depth. If they encounter too-deep nesting, they could convert the Figure to a Div containing its body and a Div caption (with appropriate classes), then attempt to render that. Otherwise they would render figures (and tables and galleries) however they're supported in the output. Initially, of course, every writer would need to fall back in this way, except for figures with [Table...] and [Plain [Image...]] content, which would be rendered as tables and figures currently are. Then better support (for figures, subfigures, subtables, and galleries) could be added to the relevant outputs.
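
The depth-tracking fallback described above might look roughly like this. It is a sketch over a hypothetical, simplified `Block` type; the class names "figure" and "caption" and the choice to put the caption Div after the body are assumptions, and `limitFigureDepth` is not an existing pandoc function.

```haskell
-- Hypothetical, simplified block type for illustration only.
data Block
  = Para String
  | Div [String] [Block]    -- classes, contents
  | Figure [Block] [Block]  -- caption blocks, body blocks
  deriving (Show, Eq)

-- | Keep figures nested up to maxDepth levels; any figure nested more
-- deeply is converted to a Div holding its body followed by a Div
-- with its caption.
limitFigureDepth :: Int -> [Block] -> [Block]
limitFigureDepth maxDepth = map (go 0)
  where
    go depth (Figure caption body)
      | depth >= maxDepth =
          -- Too deep for the output format: fall back to plain Divs.
          Div ["figure"] (map (go depth) body ++ [Div ["caption"] caption])
      | otherwise = Figure caption (map (go (depth + 1)) body)
    go depth (Div classes blocks) = Div classes (map (go depth) blocks)
    go _ block = block
```

With `maxDepth` set to 0, every figure becomes a Div, which corresponds to the initial state in which every writer falls back.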

linked a pull request Sep 16, 2020 that will close this issue

### tarleb commented May 6, 2021

 See #6782 for important info on accessibility.

### tarleb commented May 6, 2021

 Noting that #5994 depends on this.

added this to To do in Better figures May 14, 2021