New Feature: internal links to tables and figures and headers #813

Open
GeraldLoeffler opened this Issue Apr 4, 2013 · 136 comments

@GeraldLoeffler

It's currently possible to include internal links to sections. I'd like to propose a similar feature for links to figures/images and tables.

It may make sense to provide this feature only if the figure/image or table that is being linked to has a caption. In that case Pandoc can today automatically generate a number for the figure or table and include it in the caption, e.g. "Figure 15".

At the most basic, the text of the link would be provided by the user, as is currently the case for links to sections.

Of course it would be very convenient if the automatically generated number for the figure or table would also be used for the text of the link, e.g. "as can be seen in Figure 15, blah", where "Figure 15" would be the internal link whose text is auto-generated from the figure it points to.

kovla May 3, 2013

That would be lovely indeed. In academic writing it is quite often necessary, and while automatic numbering of figures and tables is nice, it really should be linked to what is in the text.

nichtich May 21, 2013

Contributor

One could use the figure caption as link target, similar to links to captions:

![la lune](lalune.jpg "Voyage to the moon")

...is shown in figure [la lune]...

And/or without automatic generation of link text:

...is shown in [the figure](#la-lune)...

See also issue #615 on automatic numbering of figures and tables in HTML output.

liob Jul 23, 2013

I concur. However, @nichtich's suggestion breaks the current syntax. Maybe a less intrusive approach would be a syntax like:

![Voyage to the moon](lalune.jpg){la lune}

It would be great to be able to reference figures. As @nichtich said: it is nearly a requirement in academic writing.

jgm Jul 23, 2013

Owner

A more consistent format would be

![Voyage to the moon](lalune.jpg){#lalune}

See the current attribute format for headers.

liob Jul 27, 2013

Indeed, that is a more consistent format.

About the implementation:
I see two major ways to implement this feature:

  1. Emulate something like the LaTeX figure environment and output the figure as an image with plain text underneath. Very much like figures are handled now in docx format, except that you put "Figure 1:" at the beginning. This would be the most portable way and should be fairly easy to implement in all format writers. However, pandoc then has to keep track of the references itself for cross-referencing.
  2. Implement it the "proper" way in the corresponding format writer. Sticking with the docx example: add a caption to the image and then cross-reference it in the text.

Can anybody (@jgm ?) make an educated guess on how much work either of the solutions will be?

aaren Sep 10, 2013

I agree - this is essential for academic writing. I wish I knew Haskell!

The current way around this, in the mailing list discussion, is functional but clumsy.

Would this mean using \autoref in the LaTeX output? Then, from the Markdown input:

...is shown in [the figure](#la-lune)...

you would get the LaTeX output:

...is shown in \autoref{la-lune}...

9peppe Sep 19, 2013

![Voyage to the moon](lalune.jpg){#lalune}

I just tried to write something like

Some text

![Bla blah](pic.png)   {#something}

Some other text

I was surprised that did not work. It showed the image without caption, and a raw "{#something}" afterwards.

I assumed curly braces were for assigning attributes to anything... :D

CFCF Nov 16, 2013

A workaround with numbered example lists is added to #904

For my purposes, this method works well with docx.

Utsira Mar 29, 2014

I agree that being able to reference figures is essential to academic writing. The workarounds linked to above aren't really satisfactory, in my opinion.

![Voyage to the moon](lalune.jpg){#lalune} would be perfect.

jgm added the enhancement label Mar 31, 2014

srhb Apr 24, 2014

Similar syntaxes would be very interesting for equations, too. In fact, why not adopt a completely general syntax? It would be especially nice if it could carry over to LaTeX bits, once you have to bail out and use, say, \begin{align} and friends.

frederik-elwert Apr 25, 2014

I have sympathy for the numbered example list approach, mainly for two reasons: Firstly, what we want are not really links but references, and secondly, the use case for numbered example lists is already close to, e.g., numbered equations. The example from the docs is close to a typical use case for figure references:

(@good)  This is a good example.

As (@good) illustrates, ...

This mechanism can already be used for figure references, as CFCF pointed out:

![Figure (@primitive_hut): The primitive hut](Illustrations\primitive_hut.png)

As can be seen in Figure (@primitive_hut), huts may be primitive.

# Index of Figures

(@primitive_hut) *Primitive hut* from the frontispiece of Marc-Antoine Laugier’s 1755 second edition of *Essay on Architecture*, illustration by Charles-Dominique-Joseph-Eisen.

However, there are a few drawbacks:

  • You currently need an index of figures, since example lists require the (@id) to be at the beginning of a line at least once.
  • You have to add the Figure (@id): bit to the caption manually.
  • This breaks LaTeX/PDF output, since LaTeX adds a “Figure” prefix itself.

Thus, a proper referencing scheme would need a bit of additional thought. In particular, PDF and HTML output should behave alike, probably with pandoc adding the Figure: bit to HTML output while leaving it to LaTeX in the PDF case. Additionally, this should also work for referencing numbered sections, as in see chapter (@mychapter).

btel Apr 25, 2014

Your workaround works as suggested, but I had to remove the parentheses when referencing the label, otherwise they were rendered in the output. After this modification my example looks like this:

Figure @figure is about being in time

![Figure @figure: Cubes](cubes.png)

(@figure) Figure 1

To remove the automatic numbering in LaTeX (Figure 1:, etc.) you can add to the template:

\usepackage[labelformat=empty]{caption}

After rendering to PDF, this produces the following output:

[image: figure]

bitsgalore May 8, 2014

Just came across this issue as well and ended up here. I'm also really in favor of support for the syntax suggested by @jgm above:

![Voyage to the moon](lalune.jpg){#lalune}

Especially since this is the standard way of dealing with this in PHP Markdown Extra:

http://michelf.ca/projects/php-markdown/extra/#spe-attr

mangecoeur Jul 10, 2014

Have there been any developments on this? It also seems to me that @jgm's suggestion

![Voyage to the moon](lalune.jpg){#lalune}

is the most consistent internally and with other tools. What would need to happen for this to be implemented?

edwardabraham Jul 23, 2014

I wanted to add my support for this addition to the syntax. When trying to replicate papers using markdown for the scholmd project, this is the feature that stands out as most needed in Pandoc. In short, this can be addressed through the general use of {#lalune} for labelling elements, and of @lalune for referencing the number of the corresponding element. The syntax (@) may be used to number elements that are otherwise unnumbered.

A general syntax for labels, {#lalune}, that are associated with the preceding element would allow any element to be labelled (paragraphs, equations, tables, etc.). By associating the label with an element in the abstract syntax tree, the properties of the element would be available when the reference is made, so references can be numbered appropriately. This syntax is already used in one context in Pandoc (section heading labels), and is used by PHP Markdown Extra. For elements that don't have numbers, such as equations, the syntax (@) may be used (from the example_lists extension). So an equation would be numbered and labelled as $$ F = G{m_1 m_2 \over r^2}$$ (@) {#gravity}. (An alternative could be to use the example_lists extension style and number and label it in one go as $$ F = G{m_1 m_2 \over r^2}$$ (@gravity). There are clearly some details and edge cases to be thought through here.)

When the document is rendered, Pandoc would associate a number with each labelled element, based on its type and its position in the document. This logic would need to be carried out in Pandoc, so that it is available to the range of back-end writers (including HTML). The philosophy would be similar to pandoc-citeproc, which carries out its own formatting of citations rather than delegating to writers that support this approach (such as BibTeX for LaTeX). An option would be to have this behaviour depend on the backend, so that in LaTeX it inserts \label and \ref commands, but elsewhere it inserts calculated numbers if referencing is not supported by the backend. This has the advantage that it will work easily in contexts where only a fragment of the document is rendered. If pandoc is calculating the numbering, a syntax would be needed for specifying the start numbers in a fragment that isn't being compiled in standalone mode.

Labelled elements may be linked to, with the @ symbol being used to indicate the reference. So
a trip to [the moon](@lalune) would be an anchor link to the element labelled {#lalune}. In this case the text is rendered as a trip to the moon.

The syntax The moon is illustrated in Figure @lalune may be used to insert the number of the referenced element, as well as a link to that element, with the text rendered as The moon is illustrated in Figure 1. This follows the syntax used for referencing numbered lists with the example_lists extension.

A further syntax could be to use square brackets, [@lalune], to insert the type and number of the element that is referenced, similar to the behaviour of LaTeX's \autoref command. So, the moon is illustrated in [@lalune] would be rendered as the moon is illustrated in Figure 1 (including a link to the anchor). Implementing this feature would require some localisation or customisation capability, so that the word used to describe the element can be specified. In its simplest form, this customisation could be put in the YAML header, with for example figure_label: Fig. if the style requires a shortened label. The syntax for the reference, [@lalune], is the same as is used by the pandoc-citeproc library, so it would be overloading that usage to implement a self-citation. Pandoc would have the contextual information needed to format it either as a citation or as a reference, assuming that there is no collision between the labels and the citation keys.
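
A minimal sketch of the disambiguation described above, written in Python and independent of how Pandoc itself might implement it. Labelled elements are numbered per type in document order, and a [@key] is rendered as an internal reference when the key is a known label, otherwise left for the citation processor. The function names and the `prefixes` table (standing in for a YAML setting such as figure_label) are hypothetical.

```python
from collections import defaultdict

def number_labels(elements):
    """Assign a per-type running number to each labelled element.

    `elements` is a list of (element_type, label) pairs in document order.
    Returns {label: (element_type, number)}.
    """
    counters = defaultdict(int)
    numbered = {}
    for element_type, label in elements:
        if label in numbered:
            raise ValueError(f"duplicate label: {label}")
        counters[element_type] += 1
        numbered[label] = (element_type, counters[element_type])
    return numbered

def resolve_key(key, numbered, citation_keys, prefixes):
    """Render [@key] as 'Fig. 2' style text, or leave it as a citation."""
    if key in numbered and key in citation_keys:
        # The collision case mentioned above: the key is ambiguous.
        raise ValueError(f"'{key}' is both a label and a citation key")
    if key in numbered:
        element_type, number = numbered[key]
        return f"{prefixes[element_type]} {number}"
    if key in citation_keys:
        return f"[@{key}]"  # untouched, for the citation processor
    raise KeyError(key)

# Hypothetical document: two figures and one equation, plus one bibliography key.
numbered = number_labels(
    [("figure", "lalune"), ("equation", "gravity"), ("figure", "hut")]
)
prefixes = {"figure": "Fig.", "equation": "Equation"}
print(resolve_key("hut", numbered, {"smith2010"}, prefixes))
```

Raising an error on a collision is only one possible policy; preferring the internal label, or the citation, would be equally easy to express.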

kovla Jul 23, 2014

@edwardabraham It must be pointed out that the syntax [@lalune] is already used in pandoc for bibliographical citations.

timtylin Jul 23, 2014

Contributor

@kovla @edwardabraham I don't see why #lalune couldn't be used also as a reference to the defined symbol. With this scheme [the moon](#lalune) could be a normal text link to the figure, while [#lalune] could do the numbered reference thing as mentioned. In fact, I have a custom build of Pandoc that does exactly this.

edwardabraham Jul 24, 2014

@kovla The idea was to deliberately overload the [@lalune] syntax that is used for citations. The reason being that references to another part of the document are similar to citations (in essence they are self-citations). This has the benefit of avoiding introducing additional syntax. During processing the filter would identify which element the label was attached to, and use that information to appropriately format the text that is inserted into the document.

@evitaerc I prefer using the @ symbol, as it extends functionality that is already used by example_lists. Are you able to structure your pandoc build so that it may be implemented as a filter?

Note that this [@lalune] extension is a convenience and is not necessary, provided that the numbers can be accessed through the @lalune method.

timtylin Jul 24, 2014

Contributor

@edwardabraham I tried the @ approach as well, but internal feedback in our lab showed that people get confused by what is a citation and what is an internal reference even when editing. The conclusion was that the mental model of keeping # for internal refs and using @ for external refs is the simplest to grok.

In fact, no one out of ten or so people has used example_lists (we are mostly writing extended abstracts and journal papers in the field of physics/engineering/applied math). When encountering a "list of scenarios" situation, the content was so static that people simply used literal numbers without issue.

Unfortunately the internal reference mechanism required heavy modification of the Markdown reader (additional state must be kept during the parsing process) and a custom AST, so I can't conceive of a filter implementation in the near future.

elcritch Jul 24, 2014

Personally, the # symbol and the {#label} syntax would be easier to understand and use. In my mind citations and internal references follow very distinct "mental models". Many academic papers use distinct numbering for figures, tables, and equations, but the proposed syntaxes don't appear to have a way to support distinct numbering by type. It would be an important design criterion (I only got to skim the comments, so hopefully it's not a redundant suggestion).
@edwardabraham You mentioned scholmd. Is it currently just a repository of ideas, or have they implemented any of the academic markdown features?
@evitaerc Great work! Is it possible for you to propose submitting the changes to the pandoc project, or alternatively to create a GitHub fork to allow others to experiment?

mangecoeur Jul 24, 2014

+1 for use of # symbol for internal references. But it's really important that the references can distinguish between equations, figures, and tables to have distinct numbering sequences.

There are two approaches to my mind:

  1. Make the "thing referenced" explicit in the tag, for instance using namespaces like #eqn.maxwells, #fig.hockeystick. Pandoc would have to track the objects in each namespace and format the references appropriately.
  2. Depend on pandoc's parser to know what type of thing is referenced and handle it appropriately. So if you tag an image and then use a # reference, pandoc automatically treats it as a "fig" reference; if you embed a LaTeX formula, it becomes an equation reference, etc. This would be cool, but I suspect it would be a) complex and b) fragile: you get issues, for instance, if someone wants to embed an image for a formula.
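
The first approach can be sketched in a few lines of Python, working on raw Markdown text and not on Pandoc's AST. It assumes, purely for illustration, that labels are written {#namespace.name} and references [#namespace.name]; that combination of the syntaxes proposed in this thread is an assumption, as are the function names.

```python
import re
from collections import defaultdict

LABEL = re.compile(r"\{#(\w+)\.([\w-]+)\}")      # definition, e.g. {#fig.hockeystick}
REFERENCE = re.compile(r"\[#(\w+)\.([\w-]+)\]")  # reference, e.g. [#fig.hockeystick]

def number_by_namespace(text):
    """Map 'namespace.name' to its position within that namespace."""
    counters = defaultdict(int)
    numbers = {}
    for namespace, name in LABEL.findall(text):
        key = f"{namespace}.{name}"
        if key not in numbers:  # the first definition of a label wins
            counters[namespace] += 1
            numbers[key] = counters[namespace]
    return numbers

def resolve_references(text):
    """Replace each [#ns.name] with its number; unknown references are kept."""
    numbers = number_by_namespace(text)

    def substitute(match):
        key = f"{match.group(1)}.{match.group(2)}"
        return str(numbers[key]) if key in numbers else match.group(0)

    return REFERENCE.sub(substitute, text)

source = (
    "![Temperature](t.png){#fig.hockeystick}\n"
    "$$\\nabla \\cdot B = 0$$ {#eqn.maxwells}\n"
    "![Voyage to the moon](lalune.jpg){#fig.lalune}\n"
    "See Figure [#fig.lalune] and Equation [#eqn.maxwells].\n"
)
print(resolve_references(source))
```

Because each namespace keeps its own counter, figures and equations are numbered independently, which is the requirement stated above. A text-level pass like this would not know about code blocks or escaped characters, so a real implementation would more plausibly live in the reader or in a filter.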

bpj Jul 24, 2014

I agree very much that internal references, citations and references to numbered examples are different, and @ is already too overloaded by being used for both of the latter. The problem I can see with #reference is that it might get confused with a level-one header, since atx headers don't require a space after the hashmarks as far as I know. I think {#anchor} and [#reference] would be good because then the id could be any valid HTML id, including LaTeX-y things like {#img:la-lune}. As for doing different things with different anchors, that is probably best left to filters.

Note that you could already do something like <span id="img:la-lune">![Voyage to the Moon](lalune.jpg)</span> and then [Voyage to the Moon](#img:la-lune) and get anchors/labels and links which work in both HTML and LaTeX. If you don't want hyperlinks in your LaTeX, numbered images in your HTML etc. that can be done with filters, e.g. replacing links with URLs like the one in my example with a span containing the link text plus a raw LaTeX string \ref{img:la-lune}.
(See
http://johnmacfarlane.net/pandoc/try/?text=%3Cspan+id%3D%22img%3Afoo-bar%22%3E!%5BA+bar+frequented+by+foos%5D(foo-bar)%3C%2Fspan%3E%0A%0A%5BThe+foo+bar%5D(%23img%3Afoo-bar).&from=markdown&to=latex)

It would be nice to have a less ugly syntax, but note that you would need to turn references into links when generating HTML, while it is rather trivial to have a filter do the opposite when generating LaTeX. Note also that you could abuse the link title, having the filter leave links with a title alone so that you get a hyperlink. It would even be easy to use spans with certain classes and/or attributes in the source and have one filter which turns them into references when generating LaTeX and one which turns them into links when generating HTML. I'm going on holiday tomorrow but I would be happy to write those filters when I get back! :-)
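
A rough sketch of the LaTeX-side filter idea, in plain Python without any filter library. It walks a document in Pandoc's JSON representation and rewrites internal links that have no title into a span holding the link text followed by a raw \ref, while links with a title are left alone. It assumes the JSON shape of recent Pandoc versions, where a Link carries [attr, inlines, [url, title]]; the function name `links_to_refs` and the sample paragraph are made up for illustration.

```python
def links_to_refs(node):
    """Recursively rewrite '#id' links without a title into text plus \\ref{id}."""
    if isinstance(node, list):
        return [links_to_refs(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Link":
            attr, inlines, (url, title) = node["c"]
            if url.startswith("#") and not title:
                ref = {"t": "RawInline", "c": ["latex", "\\ref{" + url[1:] + "}"]}
                body = links_to_refs(inlines) + [{"t": "Space"}, ref]
                # Empty attributes: no id, no classes, no key-value pairs.
                return {"t": "Span", "c": [["", [], []], body]}
        return {key: links_to_refs(value) for key, value in node.items()}
    return node  # strings, numbers, booleans

# Hand-built fragment corresponding to: See [Figure](#img:la-lune)
para = {"t": "Para", "c": [
    {"t": "Str", "c": "See"},
    {"t": "Space"},
    {"t": "Link", "c": [["", [], []],
                        [{"t": "Str", "c": "Figure"}],
                        ["#img:la-lune", ""]]},
]}
print(links_to_refs(para))
```

To use something like this as an actual filter, one would read the JSON document from standard input, apply the function only when the target format is LaTeX, and write the result to standard output. The HTML direction needs no such rewriting, since the links already work there.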

szarnyasg Aug 10, 2014

My goal is to produce HTML and PDF outputs from the same Markdown file, with the PDF containing references that can be printed (e.g. "See figure 1"). I found a cumbersome workaround inspired by @bpj's idea. Note that it does not work with pandoc 1.12.2.1 from the Ubuntu APT repository, so I installed 1.12.4.2 from Cabal instead.

The following Markdown code:

<span id="pic.jpg"></span>

![A bar frequented by foos](pic.jpg)

[The foo bar](#pic.jpg).

Produces the following HTML code:

<p><span id="pic.jpg"></span></p>
<div class="figure">
<img src="pic.jpg" alt="A bar frequented by foos" /><p class="caption">A bar frequented by foos</p>
</div>
<p><a href="#pic.jpg">The foo bar</a>.</p>

This works reasonably well: the empty paragraph is not displayed so the link will navigate you to the image.

The generated LaTeX code is the following:

\label{pic.jpg}{}

\begin{figure}[htbp]
\centering
\includegraphics{pic.jpg}
\caption{A bar frequented by foos}
\end{figure}

\hyperref[pic.jpg]{The foo bar}.

The generated \label is of no use. Instead, we should add the label after the caption has been inserted. To do this, we save the filename of the current figure to a variable (\currentfigure) by redefining the \includegraphics command. We then redefine the \caption command to insert the caption and add the label from the variable. We also have to redefine the \hyperref command to \autoref.
To achieve this, we edit the LaTeX template file's preamble:

\let\oldincludegraphics\includegraphics
\renewcommand*{\includegraphics}[1]{\oldincludegraphics{#1}\def\currentfigure{#1}}
\let\oldcaption\caption
\renewcommand*{\caption}[1]{\oldcaption{#1}\label{\currentfigure}}
\renewcommand*{\hyperref}[2][\ar]{%
  \def\ar{#2}
  #2 (\autoref{#1})}

In the final PDF document, the caption and the reference look like this. "Figure 1" is also a hyperlink.

Figure 1: A bar frequented by foos

The foo bar (Figure 1).

While I think this workaround can be used in practice, it would be nice to have a syntax for inserting cross references in a simpler and less error-prone way.

bpj Aug 11, 2014

Note that the following works identically without the need to (re)define any LaTeX commands and without generating an 'empty' paragraph (including the fact that at least in my PDF reader the link jumps to the caption rather than to the top of the image):

<div id="fig:lalune">
![A voyage to the moon\label{fig:lalune}](lalune.jpg)

</div>

[The voyage to the moon](#fig:lalune).

It is slightly less elegant in that you have to specify the id/label twice, and slightly more elegant in that you avoid the empty span element and the resulting empty paragraph.

Note that the blank line inside the div is necessary in order to make pandoc see the div contents as a paragraph, and thus to get the image inside a figure environment. With the blank line the resulting LaTeX is like this:

\begin{figure}[htbp]
\centering
\includegraphics{lalune.jpg}
\caption{A voyage to the moon\label{fig:lalune}}
\end{figure}

but without it it is just like this:

\includegraphics{lalune.jpg}
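
The div idiom above is easy to get wrong by hand because the label appears twice and the blank line is significant. A small helper can generate it. This is only a sketch; the function names `anchored_figure` and `figure_link` are invented for this example.

```python
def anchored_figure(label, caption, path):
    """Return Markdown for a figure that can be linked in HTML and LaTeX."""
    lines = [
        '<div id="{0}">'.format(label),
        "![{0}\\label{{{1}}}]({2})".format(caption, label, path),
        "",  # blank line: pandoc then sees a paragraph and emits a figure
        "</div>",
    ]
    return "\n".join(lines)


def figure_link(label, text):
    """Return a Markdown internal link to the figure with the given label."""
    return "[{0}](#{1})".format(text, label)


print(anchored_figure("fig:lalune", "A voyage to the moon", "lalune.jpg"))
print()
print(figure_link("fig:lalune", "The voyage to the moon") + ".")
```

Running it prints the same Markdown as the hand-written example, with the label typed only once.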

szarnyasg Aug 11, 2014

@bpj, thanks for the quick reply.

(including the fact that at least in my PDF reader the link jumps to the caption rather than to the top of the image)

I looked at this issue and found that it can be fixed easily by adding \usepackage{caption} to the template (see http://tex.stackexchange.com/questions/27096/href-to-an-image-label-how-to-jump-to-the-image-instead-of-the-caption-below-t for details).

It is slightly less elegant in that you have to specify the id/label twice, and slightly more elegant in that you avoid the empty span element and the resulting empty paragraph.

I agree, I also prefer your solution.

Note that the blank line inside the div is necessary in order to make pandoc see the div contents as a paragraph, and thus to get the image inside a figure environment.

Wow. I experimented with using a div element but couldn't get it working. Adding the extra newline did the trick.

bpj Aug 11, 2014

On 2014-08-11 10:03, Gábor Szárnyas wrote:

@bpj, thanks for the quick reply.

@szarnyasg, you are welcome; sleeplessness has its advantages! ;)

(including the fact that at least in my PDF reader the link jumps to the caption rather than to the top of the image)

I looked at this issue and found that it can be fixed easily by adding \usepackage{caption} to the template (see http://tex.stackexchange.com/questions/27096/href-to-an-image-label-how-to-jump-to-the-image-instead-of-the-caption-below-t for details).

It does indeed! If a similar trick were possible in HTML one could just wrap the caption text in a span:

![<span id="fig:lalune">A voyage to the Moon</span>](lalune.jpg)

In practice you don't even see that you end up at a figure caption, though.

It is slightly less elegant in that you have to specify the id/label twice, and slightly more elegant in that you avoid the empty span element and the resulting empty paragraph.

I agree, I also prefer your solution.

Not needing to hack LaTeX is always preferable! :)

Note that the blank line inside the div is necessary in order to make pandoc see the div contents as a paragraph, and thus to get the image inside a figure environment.

Wow. I experimented with using a div element but couldn't get it working. Adding the extra newline did the trick.

See here and here for the explanation!

BTW for those who use Vim with the UltiSnips plugin I made a snippet for using this idiom which endeavors to (optionally) reduce typing duplication to a minimum:

# # ANCHORED FIGURE IN SOURCE FOR BOTH HTML AND LATEX
# 
#     <div id="fig:ID/LABEL/NAME">
#     ![CAPTION\label{fig:ID/LABEL/NAME}](ID/LABEL/NAME.jpg)
# 
#     </div>
# 
# ## Tabs
#
# Tab     Description                  Default
# ------  ---------------------------  -------
# $1      the entire id/label[^a]
# $2      id/label prefix              fig[^b]
# $3      id/label unique part
# $4      caption text
# $5      filename minus extension     $3
# $6      extension                    jpg[^c]
#
# [^a]: You should normally just skip this tab or the 'magic' with $5 will be lost!
# [^b]: The following `:` is automatically removed if you delete the prefix.
# [^c]: The separating dot is automatically removed if you delete the extension
#       -- for use with the --default-image-extension option!
#
# NOTE: The blank line inside the div is necessary to make pandoc
#   see a paragraph and thus place the image inside a figure environment!
#
snippet figdiv "Anchored figure in HTML and LaTeX" b
<div id="${1:${2:fig}${2/.+/:/}${3:ID/LABEL/NAME}}">
![${4:CAPTION}\\label\{$1}](${5:$3}${6/.+/./}${6:jpg})

</div>

$0
endsnippet

ivotron Aug 25, 2014

Quick clarification regarding the div method: redefining \hyperref is still needed in order to include a "(Figure #)" in the reference text.

srhb Sep 2, 2014

@bpj

I'm sorry, but I had a bit of trouble following the last bit of your conversation with @szarnyasg. Does it mean that you found a way to produce correct (LaTeX) references to figures, images and tables?

szarnyasg Sep 2, 2014

@srhb: I managed to get it working, although it's a bit cumbersome. I created a Hungarian thesis template which is available here:

It requires Pandoc 1.12.4.2+.

HTH,
Gabor

srhb Sep 2, 2014

@szarnyasg Thank you kindly!

balachia Sep 16, 2014

I spent a bit of time trying to sort through this issue with a pandoc filter instead of by redefining \hyperref in default.latex.

TL;DR: compile the script below and use it as a pandoc --filter when converting to latex. Use the div trick to get html internal linking to work.

https://gist.github.com/balachia/d836f8829aec61cb4b54#file-pandoc-internalref-hs

Pandoc doesn't generate \ref anywhere when writing LaTeX, so you have to inject it yourself using the RawInline type of the pandoc AST, with some kind of pattern matching to decide where. Right now I'm matching any internal link whose target starts with "#fig:" or "#tab:", and I discard whatever link text the user specified in favor of the \ref text. So you get the equivalent of:

Figure words -> Figure \ref*{fig:lefig}

That said, it's probably possible to pattern match on a better pattern. With some thinking, it might be possible to avoid the div trick too, by pulling out images in divs. Not sure about that one yet, though.
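
The matching step described above can be sketched in a few lines of Python. This is not the Haskell code from the gist. The prefix table `REF_WORDS` and the function `latex_ref` are hypothetical names, and generating the word "Figure" or "Table" from the prefix is an assumption about the intended output.

```python
import re

# Hypothetical table: link-target prefix -> word placed before the number.
# Only "fig" and "tab" are mentioned in the comment above.
REF_WORDS = {"fig": "Figure", "tab": "Table"}


def latex_ref(target):
    """Return e.g. 'Figure \\ref*{fig:x}' for a matching target, else None."""
    match = re.match(r"#(\w+):(.+)$", target)
    if match is None or match.group(1) not in REF_WORDS:
        return None  # not an internal figure or table link; leave it alone
    return "{0} \\ref*{{{1}}}".format(REF_WORDS[match.group(1)], target[1:])


print(latex_ref("#fig:lefig"))
print(latex_ref("#intro"))
```

A filter would wrap the returned string in a RawInline with format `latex` and substitute it for the link. Note that the starred form `\ref*` is provided by the hyperref package.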

tomduck Dec 31, 2015

Reminder: there are filter-based solutions that can be used while this issue gets worked out. The following implement numbering and references using the syntax advocated by @scaramouche1:

They are Python-based and easy to use. Alternatives are provided above by @aaren and @lierdakil.

Note: pandoc-fignos has been updated to work with the new figure attributes syntax that will appear in pandoc 1.16.

beinvest Jan 10, 2016

Now that pandoc 1.16 is out, is this a bug, or does it point to my misunderstanding of the new link_attributes extension?

Converting ![My caption](myfigure.png){#fig:myfigure} from Markdown to LaTeX, I would have expected

\begin{figure}[htbp]
\centering
\includegraphics{myfigure.png}
\caption{My caption}
\label{fig:myfigure}
\end{figure}

but instead the figure id/label is ignored.

mb21 Jan 10, 2016

Collaborator

@beinvest good point, must have overlooked that back in the day when I did the image sizes ;) fixed in #2637

beinvest Jan 10, 2016

@mb21 Thanks for the help and your work!!

lierdakil Jan 26, 2016

Contributor

So, I mused on this for a bit, and here are some questions and ideas, in no particular order

  1. Do we want to add attributes (i.e. identifiers) to all elements that can be referenced? Images and sections have those already. But what about tables/equations/etc? Could we use divs/spans? Should we?
  2. Related to (1), if we go with divs/spans, would it be a good idea to add an "implicit spans" extension that would wrap any (or some) inline element followed by an attribute specification in a span? It's an easy change to the Markdown parser. A shorthand for divs that wrap a single Block element could also be a good idea.
  3. After some thought, I think a dedicated reference element/syntax is a must. [#id] seems nice, and the semantics should be basically the same as Citation. It should be possible to reuse the code that parses citations for references with little effort.
  4. In general, it should be possible to determine the referenced element type based on a heuristic. There should be a way to override it, though. I suppose a key-value attribute should do the trick here (e.g. {#someid ref-type=figure}). Classes are a tougher sell IMO, although the syntax is a little cleaner.
  5. For numbering, I think that should mostly be delegated to the writer. For formats that don't support that, a filter similar to pandoc-citeproc could be employed. That said, even HTML supports counters nowadays, so this would only be relevant for plain text formats or cases where native numbering is suboptimal for some reason (I shudder at the thought of the pains I endured fighting MS Word counters, so from my point of view, this is definitely a use case to consider). A filter could also be a good option for a transitional period, while writer code is catching up. Not to sound like shameless self-promotion, but pandoc-crossref could easily be repurposed for this.
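
Point 4 above (a heuristic with an explicit override) could look roughly like the following Python sketch. The mapping from element kinds to reference types and the function name `reference_type` are assumptions made for illustration. Only the `ref-type` attribute name comes from the comment.

```python
# Hypothetical heuristic: AST element kind -> default reference type.
ELEMENT_TYPES = {
    "Image": "figure",
    "Table": "table",
    "Math": "equation",
    "Header": "section",
}


def reference_type(element_tag, attributes):
    """Resolve the type of a referenced element.

    An explicit ref-type key-value attribute wins; otherwise the
    type is guessed from the kind of element carrying the id.
    """
    override = attributes.get("ref-type")
    if override:
        return override
    return ELEMENT_TYPES.get(element_tag, "unknown")


# An ordinary image is treated as a figure ...
print(reference_type("Image", {}))
# ... unless the author says it is really an equation.
print(reference_type("Image", {"ref-type": "equation"}))
```

The override covers the case raised earlier in the thread, where someone embeds an image for a formula and wants it referenced as an equation.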

edusantana Feb 17, 2016

Just a contribution for a workaround while this issue is open: http://tex.stackexchange.com/questions/139106/referencing-tables-in-pandoc

sjackman Nov 29, 2016

@lierdakil pandoc-crossref works great! Thanks for this work!

jgm Mar 2, 2017

Owner

I'm adding the pandoc-2.0 milestone so we at least think about whether to add some of these features to standard pandoc. (I'm using pandoc-crossref now and it works very well indeed.)

@jgm jgm added this to the pandoc 2.0 milestone Mar 2, 2017

ibutra Mar 2, 2017

It would be nice, though, if either pandoc or pandoc-crossref supported auto-identifiers.

lierdakil Mar 2, 2017

Contributor

@ibutra, not sure what you mean by 'auto-identifiers' exactly.

ibutra Mar 2, 2017

Manual: The second entry, named auto_identifiers, is what I mean: basically, the identifier that pandoc assigns by default for referencing if none is given manually.

@mangecoeur

mangecoeur Mar 3, 2017

@lierdakil I think @ibutra is referring to how pandoc can auto-generate section reference tags from the heading text (pandoc-crossref already supports this for headings, with caveats).

I can see the appeal. For example, I end up following this pattern:

![Plot text](../fig/plot_filename){#fig:plot_filename}

It could be an idea to generate a tag fig:plot_filename if one isn't explicitly given. Might be a bit unnecessary though (I just added an editor snippet to generate the pattern) but on the other hand, why not?
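
A minimal Python sketch of that idea, not code from pandoc or pandoc-crossref: the hypothetical helper `default_figure_label` derives a `fig:` tag from the image path's basename when no identifier is given. The function name and the set of characters kept in the tag are assumptions made for illustration.

```python
import posixpath
import re


def default_figure_label(image_path: str) -> str:
    """Build a tag such as 'fig:plot_filename' from an image path.

    Hypothetical helper: neither pandoc nor pandoc-crossref is known
    to expose a function with this name.
    """
    # Take the last path component and drop the extension, if any.
    stem, _ext = posixpath.splitext(posixpath.basename(image_path))
    # Assumption: keep only letters, digits, '_' and '-' in the tag;
    # every other run of characters collapses to a single '-'.
    slug = re.sub(r"[^A-Za-z0-9_-]+", "-", stem).strip("-")
    return f"fig:{slug}"


print(default_figure_label("../fig/plot_filename"))
```

With the path from the pattern above, this yields the same tag that was typed by hand, which is the point of generating it automatically.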

@jgm

jgm Mar 3, 2017

Owner

@ibutra

ibutra Mar 3, 2017

I specifically meant the headers though the same feature for figures and tables would be nice too.

What I didn't know, @mangecoeur, is that pandoc-crossref already supports this.

@lierdakil

lierdakil Mar 3, 2017

Contributor

@ibutra, from https://github.com/lierdakil/pandoc-crossref#section-labels

You can also use autoSectionLabels variable to automatically prepend all section labels (automatically generated with pandoc included) with "sec:". Bear in mind that references can't contain periods, commas etc, so some auto-generated labels will still be unusable.

Generating labels for figures/tables/other has another drawback. Right now, the default behavior in pandoc-crossref is to ignore unlabelled elements (since this is least intrusive), so

![Caption](image) 

will be an unnumbered (or rather, unprocessed) figure.

This kind of behavior is useful for informal writing, when you don't need to number the figures you're not referencing. It is also useful when running pandoc-crossref on documents that don't need cross-referencing at all, for example from an automated script.

@jgm, for figures, a better (more concise) source of auto identifiers is probably not a title, but a filename (or rather, basename). Tables and listings are another matter, and I don't think it's feasible for math.
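
The opt-in behavior described above can be sketched in Python as follows. This is a toy illustration only: pandoc-crossref works on pandoc's document tree rather than on raw text, and the regular expression, the function name `number_labelled_figures` and the sample document are all assumptions.

```python
import re

# Matches ![caption](target) with an optional {#fig:label} attribute.
# Assumption: a deliberately simplified view of the Markdown syntax.
FIGURE = re.compile(
    r"!\[(?P<caption>[^\]]*)\]\((?P<target>[^)]*)\)"
    r"(?:\{#(?P<label>fig:[^}\s]+)\})?"
)


def number_labelled_figures(markdown: str) -> dict:
    """Number only figures that carry an explicit fig: label.

    Unlabelled figures are skipped, so they stay unnumbered
    (or rather, unprocessed).
    """
    numbers = {}
    for match in FIGURE.finditer(markdown):
        label = match.group("label")
        if label is None:
            continue  # no label: leave the figure alone
        if label not in numbers:
            numbers[label] = len(numbers) + 1
    return numbers


doc = "![Caption](image)\n\n![A](a.png){#fig:a}\n\n![B](b.png){#fig:b}"
print(number_labelled_figures(doc))
```

Auto-generating labels for every figure would remove the `continue` branch, which is exactly the change in default behavior being weighed here.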

@gappleto97

gappleto97 Apr 19, 2017

For RST the syntax should be much easier. Just use the already-available name field:

.. figure:: image.png
    :name: example
    :alt: an image

    This is the caption

@DaveJarvis

DaveJarvis Apr 28, 2017

see Example (14)
...
see Figure 5

FWIW, the Markdown should not include the caption type text (e.g., "Equation", "Table", "Figure") as that is presentation logic. That is, without changing the source, it should be possible to replace "Figure" with "Illustration" throughout the output document.

Here are a few others, which suggest that the solution should be caption-type agnostic. The complete set of possible caption types is fairly long, and we probably shouldn't try to restrict the syntax to a particular subset, as some could get missed, such as:

see Listing (14)
see Algorithm 5

Thus, in the text "As seen in Figure @fig:force", the word "Figure" is redundant (the @fig prefix already signifies that the caption is a figure). With that particular syntax, "As seen in @fig:force" allows the rendering component (e.g., LaTeX or ConTeXt) to determine what caption type text to inject, if any.
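
As a rough Python sketch of that separation (not an existing pandoc feature): the source only contains `@fig:force`, and the caption type word comes from a lookup table supplied at render time. The function `render_refs`, its parameters and the sample numbering are hypothetical.

```python
import re

# A reference is "@" + kind prefix + ":" + name, e.g. @fig:force.
REF = re.compile(r"@(?P<kind>[A-Za-z]+):(?P<name>[A-Za-z0-9_-]+)")


def render_refs(text: str, numbers: dict, names: dict) -> str:
    """Replace @kind:name with '<caption type word> <number>'.

    numbers maps full labels ('fig:force') to numbers; names maps a
    kind prefix ('fig') to the word to print. Both are assumptions
    standing in for whatever the renderer would provide.
    """
    def replace(match):
        kind = match.group("kind")
        label = f"{kind}:{match.group('name')}"
        if label not in numbers:
            return match.group(0)  # unknown label: leave untouched
        word = names.get(kind, "")
        number = str(numbers[label])
        return f"{word} {number}" if word else number

    return REF.sub(replace, text)


source = "As seen in @fig:force."
print(render_refs(source, {"fig:force": 5}, {"fig": "Figure"}))
print(render_refs(source, {"fig:force": 5}, {"fig": "Illustration"}))
```

Swapping "Figure" for "Illustration" changes only the table passed in, not the source text, and new kinds such as listings or algorithms need only a new prefix entry.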

@sjackman

sjackman Apr 28, 2017

The above is also helpful when referencing multiple items, for example
As shown in @fig:a;@fig:b => As shown in Figures 1 and 2
and ranges
As shown in @fig:a;@fig:b;@fig:c => As shown in Figures 1–3
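
One possible way to produce such text, sketched in Python under stated assumptions: the figure numbers are already known, runs of three or more consecutive numbers collapse into a range, and the helper names `format_numbers` and `figure_ref` are invented for this illustration.

```python
def format_numbers(numbers):
    """Format numbers as '1 and 2', '1–3', '1–3 and 5', and so on."""
    runs = []
    for n in sorted(set(numbers)):
        if runs and n == runs[-1][1] + 1:
            runs[-1][1] = n  # extend the current consecutive run
        else:
            runs.append([n, n])  # start a new run
    parts = []
    for low, high in runs:
        if high - low >= 2:
            # Three or more in a row: join with an en dash (U+2013).
            parts.append(f"{low}\u2013{high}")
        else:
            parts.extend(str(k) for k in range(low, high + 1))
    if len(parts) <= 2:
        return " and ".join(parts)
    return ", ".join(parts[:-1]) + " and " + parts[-1]


def figure_ref(numbers, singular="Figure", plural="Figures"):
    """Prefix the formatted numbers with the singular or plural word."""
    word = singular if len(set(numbers)) == 1 else plural
    return f"{word} {format_numbers(numbers)}"


print("As shown in " + figure_ref([1, 2]))
print("As shown in " + figure_ref([1, 2, 3]))
```

Because the word is chosen after the numbers are collected, the plural form cannot be written by hand in the source, which is another argument for leaving "Figure" out of the Markdown.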

@mangecoeur

mangecoeur May 30, 2017

Hopefully, if this is built into core pandoc, the docx writer could gain the ability to output 'real' reference fields (using the Office XML reference tags). This would allow you to post-process fields in Word, for example to generate tables of figures and tables of tables (Word can generate these when caption fields are used).

@Hipomenes

Hipomenes Jun 22, 2017

Here it goes again... How does one cross-reference figures in Pandoc?

Thanks!

@iandol

iandol Jun 22, 2017

Contributor

For the moment you should use filters: either pandoc-crossref (installable via Homebrew if you use a Mac: brew install pandoc-crossref) or pandoc-fignos (which needs a working Python installation). Personally, I do all my writing in Scrivener, which has its own cross-reference system that outputs to Pandoc, so I don't use these myself.

@jgm jgm changed the title from New Feature: internal links to tables and figures to New Feature: internal links to tables and figures and headers Aug 13, 2017

@jgm jgm removed this from the pandoc 2.0 milestone Aug 20, 2017

@petterreinholdtsen

petterreinholdtsen Feb 25, 2018

It would be great if pandoc supported adding image/figure IDs and cross-references by default when converting Markdown to DocBook. This would ensure that the software needed is available in Debian.

I am currently typesetting a set of books using a Markdown->DocBook pipeline, and need a way to reference figures in the text.

@ikcalB

ikcalB Jul 27, 2018

@jgm is there any progress on this inside the main tree, or would you suggest using the filter pandoc-crossref?

@mb21

mb21 Jul 27, 2018

Collaborator

for now, use pandoc-crossref
