Add image caption support for ODT #570

Open

6 participants

@ArloL

I looked at how OpenOffice.org 3.4.0 renders image captions and tried to implement it for pandoc.

Please do check for code style issues or possible problems as I am still very new to Haskell.

Currently the numbers are all set to 1 when opening the output file but "Tools -> Update -> Update All" updates all of the numbers properly.

@jgm
Owner

Do we really need "Illustration" and the number (which isn't correctly updated at first)? Why not just include the caption, without the label and number? (I try hard to avoid hard-coded English in pandoc, so people can use it to write in other languages.)

@ArloL

I see your point. My motivation is that I am trying to use pandoc to write my thesis, and therefore I need the ability to number figures.

As for the label "Illustration": I see this as a sensible command line option.

About the numbering not being updated: this is solvable but my Haskell skills are not sufficient yet. The images are already numbered (because they are renamed to 0.jpg, etc.) so one could just pass in that number as a separate parameter (or something Haskell-esque) and use it.

For internationalization of the caption: How are captions created in different languages? Would it be sufficient to have an option to just disable the label and number? Or would it make sense to add a broader implementation? I only know German and English, so I can't tell. I could ask speakers of Russian, Hebrew and Mandarin, though, whether they actually treat numbers differently.

I checked which numbering options OOO offers: there are a whole bunch.

I then went and tested an idea:
Let's say I add a few images with a specific numbering style (А, Б, .., Аа, Аб, ... (bg), to be precise) and then edit content.xml so that the numbers are Arabic numerals instead of the given ones, while leaving the num-format option untouched. I zipped the file, opened it in OOO and ran "Update All". Et voilà, it updates the numbers to the specified format.

Thus, to support different number formats in this setup, one could add a second command line option that changes the num-format option, provided you are OK with the additional step of running "Update All".
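To make the idea concrete, here is a minimal sketch (in Python rather than Haskell, purely for illustration) of the kind of sequence field that would end up in content.xml. The function name `sequence_field`, its parameters, and the `ooow:` formula string are assumptions modelled on what OOO appears to write; this is not code from the pull request. The image index is passed in so the initial number is already correct, and `num_format` stands in for the proposed second command line option.

```python
import xml.etree.ElementTree as ET

# ODF namespaces for the text: and style: prefixes.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
ET.register_namespace("text", TEXT_NS)
ET.register_namespace("style", STYLE_NS)


def sequence_field(index, label="Illustration", num_format="1"):
    """Build a text:sequence element for the image with 0-based `index`.

    Hypothetical helper: `label` and `num_format` would come from
    command line options that do not exist yet.
    """
    seq = ET.Element(
        f"{{{TEXT_NS}}}sequence",
        {
            f"{{{TEXT_NS}}}ref-name": f"ref{label}{index}",
            f"{{{TEXT_NS}}}name": label,
            # Assumed formula syntax, copied from OOO-style output.
            f"{{{TEXT_NS}}}formula": f"ooow:{label}+1",
            f"{{{STYLE_NS}}}num-format": num_format,
        },
    )
    # Images are renamed 0.jpg, 1.jpg, ... so the figure number is index + 1.
    # The number is written as an Arabic numeral; the office suite re-renders
    # it according to num-format when the fields are updated.
    seq.text = str(index + 1)
    return seq


if __name__ == "__main__":
    print(ET.tostring(sequence_field(2, num_format="A"), encoding="unicode"))
```

With something like this, the stored text is only a placeholder; the displayed format still depends on running "Update All".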

It seems that OOO adds additional numbering formats to the ones in the official standard (assuming that your target is ODF 1.2):
http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html#__RefHeading__1417966_253892949

So the question is whether to support OOO only (potentially losing portability) or ODF 1.2 (losing internationalization).

For 1.1 the info can be found under section 12.2.2 here: http://docs.oasis-open.org/office/v1.1/OS/OpenDocument-v1.1-html/OpenDocument-v1.1.html

For 1.0 the info is also under section 12.2.2: http://www.oasis-open.org/committees/download.php/12572/OpenDocument-v1.0-os.pdf

@jgm
Owner

There is also a problem here. Pandoc only creates a "figure" from an image when the image is alone in its paragraph. This change will make every inline image into a figure, which is not right.
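The distinction could be sketched roughly as follows. This is an illustrative Python model, not pandoc's actual Haskell code: the `{"t": ..., "c": ...}` dictionaries are loosely modelled on pandoc's JSON AST, and `is_figure_paragraph` is a hypothetical name.

```python
def is_figure_paragraph(block):
    """Return True only for a paragraph whose sole inline is an image.

    Simplified model: any other content in the paragraph means the
    image is an inline image and must not be turned into a figure.
    """
    if block.get("t") != "Para":
        return False
    inlines = block.get("c", [])
    return len(inlines) == 1 and inlines[0].get("t") == "Image"


# An image alone in its paragraph: should become a captioned figure.
standalone = {"t": "Para", "c": [{"t": "Image", "c": []}]}

# An image surrounded by text: should stay a plain inline image.
inline = {
    "t": "Para",
    "c": [{"t": "Str", "c": "see"}, {"t": "Space"}, {"t": "Image", "c": []}],
}

if __name__ == "__main__":
    print(is_figure_paragraph(standalone), is_figure_paragraph(inline))
```

The caption frame would then be emitted only when a check of this kind succeeds, and other images would be written as before.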

@jondo

Ping?

@andrewheiss

Any updates here, @jgm ? I have a slight dilemma because of this…

Because there's no way to use a template for docx files (according to jgm/pandoc-templates#20), I convert to odt and then use soffice (bundled with LibreOffice) to convert to docx. But I can't get image captions that way.

Alternatively, I can go straight from Markdown to docx, but then I can't use a custom template.

Thanks!

@jkr
Collaborator

@andrewheiss : Out of curiosity, what do you want from a template that isn't available in a reference docx file?

@jondo

@jkr: This is off-topic and should be posted in issue jgm/pandoc-templates#20.

@andrewheiss referenced this pull request in jgm/pandoc-templates on Sep 11, 2014: "how to create docx template" #20 (Closed)

@blindmelon

I'm very interested in having proper captions (i.e. using frames and autonumbering) for figures in .odt.
The implementation in this pull request might be slightly overcomplicated (see #2401 for an XML snippet), but I think it is much better to use the native method for captioning than the manually added captions that .odt output currently receives from pandoc.
