Custom styles: fenced div syntax and ODT, ICML (and maybe LaTeX and RST) writers #2106

Open
matthijskooijman opened this Issue Apr 22, 2015 · 49 comments

matthijskooijman commented Apr 22, 2015

Currently, you can customize formatting by providing some customized styles in a reference odt/docx file to pandoc. These styles are copied into the generated document and applied to selected elements in the document.

However, this only supports a limited, hard-coded list of styles. E.g. the 'Heading 1' style is applied to level-1 headings, the 'Text body' style to regular paragraphs, the 'Bullet Symbols' style to bullets. On top of this, I'd like to be able to specify other styles to use, on a block-by-block basis.

My use case is that I'm writing a book, and my publisher requires the text to be delivered in docx or odt format, using a set of custom styles with non-standard style names. I have two challenges:

  1. Regular styles have different names, so I need to use e.g. "Publisher Heading 1" for first-level headings.
  2. Some parts of my document must use completely different styles, which have no equivalent in the pandoc AST (e.g. notes for the layouter must use the "Layout Notes" style).

If I have some way to specify custom style names in the AST, I can use a pandoc filter to introduce the proper style names in the AST and have the writers use the right styles. Ideally, I'd be able to specify the style names for challenge 2 above directly in my markdown source.

Attributes seem like a good fit for this, so that would work out to something like this in markdown:

# My Header {style="Fancy Header 1"}

Note that this is already supported by the current markdown parser, and results in this JSON:

[{"unMeta":{}},
 [{"t":"Header","c":[1,["my-header",[],[["style","Fancy Header 1"]]],
 [{"t":"Str","c":"My"},{"t":"Space","c":[]},{"t":"Str","c":"Header"}]]}]
]

However, converting this to ODT or DOCX drops the attributes completely.
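
To make the AST layout concrete, here is a minimal Python sketch that reads the JSON quoted above and pulls out the style attribute a writer would need. It assumes the old two-element `[meta, blocks]` layout shown in this comment, and `header_styles` is a hypothetical helper name, not part of pandoc.

```python
import json

# The JSON AST quoted above (old layout: [meta, blocks]).
AST = '''[{"unMeta":{}},
 [{"t":"Header","c":[1,["my-header",[],[["style","Fancy Header 1"]]],
 [{"t":"Str","c":"My"},{"t":"Space","c":[]},{"t":"Str","c":"Header"}]]}]
]'''


def header_styles(doc):
    """Return (level, identifier, style) for each top-level Header that has a 'style' attribute."""
    _meta, blocks = doc
    found = []
    for block in blocks:
        if block["t"] != "Header":
            continue
        # A Header carries [level, [identifier, classes, key-value pairs], inlines].
        level, (identifier, _classes, keyvals), _inlines = block["c"]
        attrs = dict(keyvals)
        if "style" in attrs:
            found.append((level, identifier, attrs["style"]))
    return found


print(header_styles(json.loads(AST)))  # [(1, 'my-header', 'Fancy Header 1')]
```

A filter or writer could use the same lookup to decide which named style to apply to a block.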

I'm aware that not all elements currently allow specifying styles, but that's another discussion. What I'd like to see is that, for all elements that currently support attributes, the "style" attribute be interpreted as a style name in the ODT and DOCX writers.

ODT and DOCX have a distinction between paragraph and character styles, but it seems that using the paragraph styles for block elements and character styles for inline elements is sufficient?

Does this sound reasonable? Is the "style" attribute the right fit here? For the HTML writer, the style attribute should contain a bit of raw CSS. I don't think we can have a single syntax that works for both HTML and ODT/DOCX, though perhaps we can have something that allows specifying two distinct attributes to allow working with both? At first glance, the class is more appropriate than an attribute, since a class is also a name that indirectly specifies formatting to be used. However, a class can be specified multiple times, which I don't think is applicable to DOCX / ODT styles? Or perhaps just use the first / last class name specified?

I'll likely spend a bit of time implementing this for my own needs, but I'd rather know up-front if this has any chance of being merged.

@mpickering, It seems that such a style= attribute could perhaps also be used by an asciidoc reader to fix #1234 / #1235.

gbjbaanb commented Sep 23, 2015

It is possible to alter the styles in the docx after the generation is complete. For example, here I set the style for all tables to BlueTableStyle (which is a style in the template dotx I use to generate docx from pandoc). The styles are contained in the document.xml file inside the docx archive.

Obviously this is not an ideal solution, and it would be great if pandoc could emit the desired style name given the right directives, similar to how you can specify them for HTML. I don't see why they would need different input syntax; generating different styles based on output type is a can of worms best left closed.

7z x -y %~1.docx word\document.xml 
sed "s/<w:tblStyle w:val=\"TableNormal\"/<w:tblStyle w:val=\"BlueTableStyle\"/g" word\document.xml > word\document2.xml
copy word\document2.xml word\document.xml /y
7z u -y %~1.docx word\document.xml
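
For readers without 7z and sed at hand, the same post-processing step can be sketched in Python with only the standard library. This is an illustration of the workaround above, not a pandoc feature; `retarget_table_style` is a hypothetical name, and the sketch assumes the attribute is written exactly as in the sed pattern.

```python
import io
import zipfile


def retarget_table_style(docx_bytes, old="TableNormal", new="BlueTableStyle"):
    """Return a copy of the docx archive with table style references renamed."""
    needle = '<w:tblStyle w:val="%s"' % old
    replacement = '<w:tblStyle w:val="%s"' % new
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            # Only the main document part is rewritten; everything else is copied as-is.
            if item.filename == "word/document.xml":
                data = data.decode("utf-8").replace(needle, replacement).encode("utf-8")
            dst.writestr(item, data)
    return out.getvalue()
```

Usage would be along the lines of reading the generated file with `open(path, "rb").read()`, passing the bytes through the function, and writing the result to a new .docx file.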

krtek4 commented Aug 3, 2016

I am in the exact same situation: writing a book, and the publisher insists on the docx file format using their custom styles.

Is there any chance this could get implemented?

The proposed workaround works really well when you want to style all elements of a given kind the same way, or rename a given style to some other one. But when just some elements need to use a specific style, I don't see how it can be made to work.

Thanks !

matthijskooijman commented Aug 3, 2016

@krtek4, I ended up using docutils's rst2odt tool with the --odf-config-file and --stylesheet options to customize the output styles, which ended up suiting my needs well enough (with a few minor customizations to better handle images, code, etc.). I haven't found time to clean up my build process well enough to publish it, but if you're interested I can do a quick sweep, tar it up and send it to you. If so, drop me an email so we don't further pollute this issue.

josdirksen commented Aug 31, 2016

I'm writing a book for Packt, which also uses a crappy docx template. I'm now using pandoc to convert my markdown file to docx, and after that I use a couple of XSLT steps to convert the styles from pandoc to the styles required by the template from Packt. This seems to work OK for the styles I'm currently using (basic bullets, lists, images, etc.).

I haven't needed tables yet, but that should also be doable I guess. If you're interested let me know.
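
The XSLT steps themselves are not shown in this thread. As a rough stand-in, the renaming idea can be sketched with a regular-expression substitution over word/document.xml. Note that this swaps XSLT for a regex, and that the style IDs in `STYLE_MAP` are hypothetical placeholders rather than Packt's real names.

```python
import re

# Hypothetical mapping from style IDs in the generated file to the publisher's IDs.
STYLE_MAP = {"Heading1": "PublisherHeading1", "BodyText": "PublisherBody"}

# Matches paragraph style references such as <w:pStyle w:val="Heading1"/>.
_PSTYLE = re.compile(r'(<w:pStyle w:val=")([^"]*)(")')


def rename_paragraph_styles(document_xml, mapping=STYLE_MAP):
    """Rewrite w:pStyle references; styles missing from the mapping are left alone."""
    def swap(match):
        name = match.group(2)
        return match.group(1) + mapping.get(name, name) + match.group(3)
    return _PSTYLE.sub(swap, document_xml)
```

A real XSLT or XML-parser based step would be more robust against attribute ordering and whitespace differences than this text-level substitution.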

krtek4 commented Aug 31, 2016

Hi @josdirksen,

In the end I am using rst2odt, as @matthijskooijman suggested. The tool allows a custom mapping from node types to ODT styles. For now I am able to manage quite well with that.

Hadn't thought of XSLT at the time, but that is definitely a great idea too :) Good luck with your book.

Collaborator

mb21 commented Feb 4, 2017

Is this request fulfilled with http://pandoc.org/MANUAL.html#custom-styles-in-docx-output ? (at least for docx)

krtek4 commented Feb 4, 2017

As far as I can tell, this is exactly what is needed for docx. It is really similar to the feature I am using in rst2odt

@mb21 mb21 changed the title from Custom styles in ODT / DOCX writers to Custom styles in ODT / ICML writers Feb 5, 2017

Collaborator

mb21 commented Feb 5, 2017

Since this feature was implemented in docx, we might think about doing the same for ODT, ICML and maybe LaTeX and RST writers...

matthijskooijman commented Feb 6, 2017

From a quick glance at the docs, I think what I requested in this issue would be made possible with that new feature. However, I've long since finished writing my book, so I have no specific interest in this feature anymore, so feel free to close this issue (or leave it open if someone else wants to do actual testing of the feature and report back).

simongareste commented Feb 28, 2017

@mb21: I am very much interested in this for odt, and available to test.

Collaborator

mb21 commented Feb 28, 2017

@simongareste Personally, I'm not too familiar with the ODT format... feel free to take a look at the writer generating the OpenDocument format and the one packaging it into a .odt zip, pull requests welcome!

marcban commented May 10, 2017

Unfortunately I'm not a Haskell developer, but here are some specs to help a little:

  • source md file : this is a <span custom-style="mystyle">custom style span</span>
  • in content.xml part of .odt file, this should generate:
<text:p text:style-name="Text_20_body">
  this is a <text:span text:style-name="mystyle">custom style span</text:span>
</text:p>
  • in styles.xml part, this should add a style (if not present) under <office:styles> tag, with value:
<style:style style:name="mystyle" style:family="text">
	<style:text-properties fo:font-weight="bold"/>
</style:style>

I propose that the custom style is bold by default, so that it will be visible in the output. Of course, by using a custom template, the user can define his own style.
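
The spec above can be sketched as two small Python helpers that produce the content.xml span and the fallback styles.xml entry. This is only an illustration of the proposed output, not pandoc's writer; the function names are hypothetical, and encoding spaces as `_20_` is an assumption taken from the "Text_20_body" name above (other special characters are not handled).

```python
from xml.sax.saxutils import escape, quoteattr


def odt_style_name(name):
    # Assumption: only spaces need encoding, as in "Text_20_body" above.
    return name.replace(" ", "_20_")


def odt_span(style, text):
    """content.xml fragment for a span carrying a custom character style."""
    return "<text:span text:style-name=%s>%s</text:span>" % (
        quoteattr(odt_style_name(style)), escape(text))


def odt_default_style(style):
    """styles.xml fragment: a bold placeholder style, to be added only if absent."""
    return ('<style:style style:name=%s style:family="text">'
            '<style:text-properties fo:font-weight="bold"/>'
            '</style:style>') % quoteattr(odt_style_name(style))


print(odt_span("mystyle", "custom style span"))
```

A writer following this spec would emit the placeholder style only when the reference document does not already define a style of that name.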

@mb21 mb21 changed the title from Custom styles in ODT / ICML writers to Custom styles in ODT, ICML (and maybe LaTeX and RST) writers Nov 3, 2017

Collaborator

mb21 commented Nov 3, 2017

I'm reposting here some discussion from the now closed #2542

+++ Mauro Bieg [Dec 31 15 09:38 ]:

In LaTeX those divs/spans could be rendered as custom
environments/commands. This was actually proposed by hadley in
#168 (comment), or do you think the concepts are not analogous enough?

+++ jgm:

I want to avoid generating environments and commands that
aren't defined (and similarly for styles in Word and ICML).
If we parse the styles, and thus know what is available,
that may not be a big problem. In LaTeX it's harder,
because commands and environments may be defined in
included packages. The idea of having a special prefix
like style- might be a good one.

Since the following already works for docx output:

::: {custom-style=poetry}
My example poem,
is bad.
:::

the idea is to change the syntax (and probably AST representation) to use a class with the style- prefix:

::: style-poetry
My example poem,
is bad.
:::

and make it work for ODT and ICML (and maybe even LaTeX and RST) as well (although for RST output, pandoc already uses the role attribute). For example LaTeX output:

\begin{poetry}
  My example poem, is bad.
\end{poetry}
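
A minimal Python sketch of the prefix idea, assuming a writer receives the div's class list and its already-rendered body. `style_from_classes` and `latex_div` are hypothetical names used for illustration; nothing here is existing pandoc behaviour.

```python
PREFIX = "style-"


def style_from_classes(classes):
    """Return the first class carrying the prefix, with the prefix stripped, or None."""
    for cls in classes:
        if cls.startswith(PREFIX) and len(cls) > len(PREFIX):
            return cls[len(PREFIX):]
    return None


def latex_div(classes, body):
    """Wrap body in an environment only when a style- class is present."""
    env = style_from_classes(classes)
    if env is None:
        return body  # ordinary divs are passed through untouched
    return "\\begin{%s}\n%s\n\\end{%s}" % (env, body, env)


print(latex_div(["style-poetry"], "My example poem, is bad."))
```

Divs without a prefixed class stay available for other purposes, which is the point of making the marker explicit.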

DaveJarvis commented Nov 7, 2017

Writing custom-style everywhere a custom style is required is redundant and differs in behaviour from HTML output. Consider:

::: {.projection}
    Executing reverse bias polarity on neurowafer 1 of 7.
    Estimated time remaining: 3h 39m 57s
:::

Thanks to #168, this produces the expected output:

<div class="projection">
<pre><code>Executing reverse bias polarity on neurowafer 1 of 7.
Estimated time remaining: 3h 39m 57s</code></pre>
</div>

However, when the output is ConTeXt, this produces the astonishing result:

\starttyping
Executing reverse bias polarity on neurowafer 1 of 7.
Estimated time remaining: 3h 39m 57s
\stoptyping

I consider it astonishing because the {.projection} class is swallowed, silently, instead of being retained in some manner, such as:

\starttyping[class=projection]
Executing reverse bias polarity on neurowafer 1 of 7.
Estimated time remaining: 3h 39m 57s
\stoptyping

Since pandoc cannot presume environments exist, it would be convenient if there was a command line argument that indicates that they do. Consider:

pandoc -t context --tex-environments --top-level-division=chapter file.md -o file.tex

Here, the --tex-environments option indicates that the developer (author) has created the required environments and that pandoc can use them. For ConTeXt, this could resemble:

\startprojection
\starttyping
Executing reverse bias polarity on neurowafer 1 of 7.
Estimated time remaining: 3h 39m 57s
\stoptyping
\stopprojection

Returning to the HTML output, if the source document contains:

::: {.style-projection}
    Executing reverse bias polarity on neurowafer 1 of 7.
    Estimated time remaining: 3h 39m 57s
:::

This would produce:

<div class="style-projection">
<pre><code>Executing reverse bias polarity on neurowafer 1 of 7.
Estimated time remaining: 3h 39m 57s</code></pre>
</div>

Clearly, the style- is superfluous and would only be necessary because there is no other way to coerce pandoc into generating LaTeX/ConTeXt environments. IMO, a command line argument would allow the Markdown document to be written as per the author's descriptive intent, without being encumbered with knowledge of the internal workings of pandoc (that is, how it treats Markdown differently depending on whether the output is HTML or TeX).

WDYT? @adityam? @jgm?
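
The opt-in behaviour proposed here can be sketched in Python as follows. The `environments` parameter stands in for the suggested --tex-environments switch, which does not exist in pandoc, and `context_div` is a hypothetical helper.

```python
def context_div(classes, body, environments=False):
    """Wrap body in one ConTeXt start/stop pair per class; the first class is outermost."""
    if not environments:
        return body  # current behaviour: the classes are silently dropped
    for cls in reversed(classes):
        body = "\\start%s\n%s\n\\stop%s" % (cls, body, cls)
    return body


typing = "\\starttyping\nEstimated time remaining: 3h 39m 57s\n\\stoptyping"
print(context_div(["projection"], typing, environments=True))
```

With the switch off, the output is unchanged, so documents that use divs for other purposes are not affected unless the author opts in.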

jcbagneris commented Nov 9, 2017

Hello there,

I do agree that it would be convenient to have a way to tell pandoc "create an environment, I know what I am doing", but instead of a general option, I guess that an extension would be more appropriate.

Something along the lines of:

pandoc -f markdown -t latex+create_envs

That would allow for the extension to be output dependent, which is probably needed.

Fenced divs are great, let's make good use of those :)

Thanks

DaveJarvis commented Nov 9, 2017

An extension would also work, though my preference would be for slightly different names:

pandoc -f markdown -t context+environments
pandoc -f markdown -t latex+environments

jcbagneris commented Nov 10, 2017

Ah, I just meant to advocate for extension vs. option. I don't really care about the name, provided it's meaningful. +environments is ok for me.

@mb21 mb21 changed the title from Custom styles in ODT, ICML (and maybe LaTeX and RST) writers to Custom styles: fenced div syntax and ODT, ICML (and maybe LaTeX and RST) writers Dec 11, 2017

Collaborator

mb21 commented Dec 11, 2017

In order to use the fenced div class syntax for custom-styles, the user has to specify his intent to use the div as such an environment - either by switching on an extension (e.g. +environment) or using a prefix like style- on the class.

The prefix is more verbose, but also more explicit: you then could have multiple divs in the same document and have only some converted to custom-styles aka environments. The writers could then strip the style- prefix when creating the style name. Do you think writing style- a few times is so bad?

Contributor

iandol commented Dec 12, 2017

On the duplicate #4139, I had asked for just a simple translation from ::: style ::: to a paragraph style in a format such as DOCX. The argument here is that to make this universal across other writers like LaTeX, an implicit conversion is not preferred by @jgm (although implicit conversion is the default for HTML presently). So the current options are to manually enable a universal mechanism (via either style- in the document or a +environment).

A third option is that this could depend on the writer: implicit for ODT/DOCX/HTML, explicit for LaTeX and friends, where it changes the actual output structure. I suspect this is also not preferred by @jgm, but pandoc already has implicit HTML support, and I think a div.class is semantically identical to a paragraph.style, which is why the implicit rule could be extended to this type of structured document output...

DaveJarvis commented Dec 12, 2017

> Do you think writing style- a few times is so bad?

Yes. It's redundant, verbose, inconsistent, and assumes pandoc-specific implementation knowledge. (Aside, there's no way to know how many times a style will be used in a given document.)

> The writers could then strip the style- prefix when creating the style name.

Seems to violate the KISS principle: it incurs extra work (post-processing) to strip the prefix. Simpler (for writers) to not use or require a prefix altogether.

Collaborator

mb21 commented Dec 13, 2017

@DaveJarvis but you agree that the toggle-by-extension-approach has the disadvantage that in one document, you cannot use some divs for custom-styling while using other divs for other purposes, right?

DaveJarvis commented Dec 13, 2017

> you cannot use some divs for custom-styling while using other divs for other purposes

There can be options for both scenarios. For example, the custom styles could be explicitly listed once on the command line (or read from an external data source), rather than repeated throughout the text, freeing the unlisted div demarcations for other purposes. Or, depending on what list is shorter, an option to list the unstyled divs may be useful.

Also, a style- prefix suggests that the class is used for style-ing, which hints at presentation logic. As you pointed out, classes can be used for more than styling. Whether and how content marked with classes is styled or otherwise manipulated belongs outside of the document.

$ pandoc --styled poetry,stanza,prose ...
$ pandoc --styled-file styled.txt ...
$ cat styled.txt
poetry
stanza
prose
$ pandoc --unstyled foo,bar ...
$ pandoc --unstyled-file unstyled.txt ...
$ cat unstyled.txt
foo
bar
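
How such lists might drive the decision can be sketched in Python. The options --styled and --unstyled are only proposals in this thread, the helper names are hypothetical, and the precedence (the allow-list wins if both lists are given) is an assumption of this sketch.

```python
def load_names(text):
    """Parse the one-name-per-line format shown for styled.txt and unstyled.txt."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def is_styled(cls, styled=None, unstyled=None):
    """Decide whether a div class should become a custom style.

    styled:   allow-list (the proposed --styled); only listed classes are styled.
    unstyled: deny-list (the proposed --unstyled); every class except these is styled.
    With neither list, nothing is styled.
    """
    if styled is not None:
        return cls in styled
    if unstyled is not None:
        return cls not in unstyled
    return False


styled = load_names("poetry\nstanza\nprose\n")
print(is_styled("poetry", styled=styled), is_styled("foo", styled=styled))  # True False
```

Either list keeps the decision outside the document, at the cost of maintaining the list alongside the source.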

Collaborator

mb21 commented Dec 14, 2017

So we have three options now (the exact naming/implementation is also still debatable):

  • mark up divs with style- prefixed class
  • +environments extension that would tell pandoc to treat all divs as custom-styled-environments
  • --styled argument that lists all divs to be styled

I'd like to get some feedback from more people on which of the three they prefer... @jgm?

@jkr

This comment has been minimized.

Show comment
Hide comment
@jkr

jkr Dec 14, 2017

Collaborator

I think I prefer (a) explicit over implict, and (b) not having to change the command-line every time you change the source file (or consult a long file as reference for the sake of constructing a command line). For those reasons, FWIW, I guess I like the first option (style-) and pretty strongly dislike the others.

  • I think +environments could have major side-effects in moving to word-processing docs, where you might not even know the names of all possible styles in your reference file.

  • Keeping a list of all environments to be made into a custom style seems overly difficult to maintain. I can't think of another place in pandoc where the user has to specify content from within the document on the command line.

So -- style- seems the best of the three. Perhaps the least elegant, but easier to use and maintain precisely for that reason.

BUT I'm not sure I see why style-* is preferable to custom-style=* (or some shorter key-value attr, style=*). It saves a few keystrokes, and replaces an = with a hyphen, at the expense of parsing values, and essentially creating an inconsistent ad-hoc syntax for certain class attributes. It's not a big deal, but it seems a bit more obscure for not that much gain.

DaveJarvis Dec 14, 2017

How would I have to mark-up a paragraph so that the same class is generated in HTML and ConTeXt without having to change the source document?

I think +environments could have major side-effects in moving to word-processing docs, where you might not even know the names of all possible styles in your reference file.

Then don't use it when generating those formats?

Consider a source document that must be styled differently depending on the output format (HTML, TeX, ODT, etc.). Changing the source document (either manually or by post-processing) for different output document formats subverts some benefits of using Markdown (output format agnostic, clean separation of content and presentation). I think we're agreed on this point.

Ideally, the source Markdown document would not include pandoc-specific instructions (mark up) to control pandoc's output behaviour, either. This is where we disagree.

Perhaps different modes could allow both types of documents? One where the styles are embedded in the document, for those for whom maintaining the list would be burdensome, and another mode where the styles are listed for those who know the list of styles rarely changes? This gives people the freedom to choose how they want to mark up their documents.

iandol Dec 17, 2017

Contributor

I also feel it is against the spirit of Pandoc to have to change the source document for different output formats, so I vote against style- or other fussy in-doc markup. I don't need to have different "purposes" for divs; I would only use divs / spans to add a structural separator (a named style in a word processing doc or a style on an HTML element) to a section of a document; wouldn't this be the major use of this feature? +environments would be the simpler option in my use case if we have to make this explicit in the pandoc settings. If we had to use --styled, I'd argue the logic is inverted: we should use --unstyled, i.e. exclude divs that we don't want converted rather than list those we do. But again, I'm coming at this from an HTML/ODT/DOCX perspective where a style is a semantic label for a container.

So this complexity only exists for LaTeX-family writers (where divs are used for multiple purposes?), so I raise the simplest option of all again: why not make other formats like word processors implicit, as HTML already is?

mb21 Dec 18, 2017

Collaborator

@DaveJarvis and @iandol, you are mentioning not having to change the source document for different output formats. While I would usually agree with you that the pandoc conversion should "just work", I can't really see this happening here.

If you want to take advantage of custom styles/environments defined in a reference.docx or custom LaTeX template, you'll always have to adjust the input to contain the names of those styles/environments. It is in the very nature of a "custom" environment. It's not a standardized element like bold or a link that pandoc can automatically translate. If we blindly take all the classes in HTML input and automatically produce docx styles with the same name, you'll have gained nothing, since those style names in all likelihood don't exist in your reference.docx.
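
Whether the style names used in a source document actually exist in a given reference.docx can be checked before converting. The following is a rough Python sketch using only the standard library; it assumes the usual docx layout with styles defined in word/styles.xml, and the helper names docx_style_names and missing_styles are made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used in word/styles.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_style_names(docx):
    """Return the set of style names defined in a .docx (path or file object)."""
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/styles.xml"))
    names = set()
    for style in root.iter(f"{{{W}}}style"):
        name = style.find(f"{{{W}}}name")
        if name is not None:
            names.add(name.get(f"{{{W}}}val"))
    return names


def missing_styles(wanted, docx):
    """Style names requested by the source but absent from the reference file."""
    return sorted(set(wanted) - docx_style_names(docx))
```

A call such as missing_styles(["poem", "Layout Notes"], "reference.docx") would then list the names that have no matching style in the reference file. The comparison is an exact string match on the displayed style name, which is an assumption of this sketch.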

iandol Dec 19, 2017

Contributor

HTML is exactly the same, you can add a class to an element which may or may not be present in the CSS. The element has been semantically tagged, but unless the CSS does something presentational with it, it will not be visualised. There is no conceptual difference in my mind to what I would expect of DOCX (except the conversion may create default styles if none exist in the reference.docx, but they will inherit from Normal and thus not be visible, that is OK). I don't believe HTML is a special case conceptually, thus don't think it should be a special case practically…

The point is the semantic tags are now present in the document. In Word you simply edit the style of one of these tagged paragraphs, and all the others with the same semantic tag will adjust automatically, just as if we add a CSS rule to the HTML... The markdown author obviously wants a tag; why else would they use a fenced div with an explicit class name?

My knowledge of LaTeX is weak, so can't comment on how environments are conceptually considered there.

mb21 Dec 19, 2017

Collaborator

the conversion may create default styles if none exist in the reference.docx, but they will inherit from Normal

Will this not lead to a huge list of styles that clutter the GUI in Word?

In LaTeX, the styles cannot (easily) be generated but need to be defined in the template before you can use them (otherwise the PDF compilation will fail).

iandol Dec 19, 2017

Contributor

No, by default Word or LibreOffice does not show all styles in its UI. Indeed, I consider that a deficit, as it encourages a mess of ad hoc formatting and leaves users ignorant of the benefit of styles... I always enable the optional style sidebar in both while I work. And both offer good tools to manage large style lists, even if a user were to add many differently named fenced divs!

More fundamentally, a user has deliberately used a fenced div and deliberately given it a class name. They are deliberately identifying this paragraph/span. It seems to me that the default behaviour should be to convert this deliberate intent, just as for more specific markup like code blocks. It aligns better with the existing HTML behaviour, the document structure, and the tools afforded by the editors for these output formats...

DaveJarvis Dec 19, 2017

or custom LaTeX template, you'll always have to adjust the input to contain the names of those styles/environments. It is in the very nature of a "custom" environment.

How does it follow that the input needs to be adjusted, or pandoc informed, of the style names?

Consider the following source document:

::: {.poem}
It was many and many a year ago,
In a kingdom by the sea,
That a maiden there lived whom you may know
By the name of Annabel Lee;
:::

This can be run through pandoc as follows:

pandoc -f markdown -t context+environments < poem.md > poem.tex

To produce the following TeX output document:

\startpoem
It was many and many a year ago,
In a kingdom by the sea,
That a maiden there lived whom you may know
By the name of Annabel Lee;
\stoppoem

Knowledge about the input document is not necessary. Rather, the -t context+environments option is the author's way of indicating that this is deliberate. The style name is appended verbatim to the \start and \stop macros. The +environments command line option denotes that the poem start/stop environment (\startpoem and \stoppoem) has been defined elsewhere. These environments don't need to be listed for each style.

For LaTeX, a similar command line invocation may produce:

\begin{poem}
It was many and many a year ago,
In a kingdom by the sea,
That a maiden there lived whom you may know
By the name of Annabel Lee;
\end{poem}

Again, it is the author's responsibility to ensure that these environments already exist. The user manual can clearly state this requirement. (Aside, I suspect that most people who use *TeX know that documents will not compile if coded incorrectly.)
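
The proposed behaviour can be approximated today as a transformation on the JSON AST that pandoc hands to JSON filters. The Python sketch below is only an illustration of the idea: +environments is not an existing pandoc extension, the function names are invented here, and using only the first class of a div is a simplifying assumption.

```python
def environment_markers(name, fmt):
    """Opening and closing raw text for an environment in the given format."""
    if fmt == "context":
        return f"\\start{name}", f"\\stop{name}"
    if fmt == "latex":
        return f"\\begin{{{name}}}", f"\\end{{{name}}}"
    raise ValueError(f"unsupported format: {fmt}")


def raw_block(fmt, text):
    """A RawBlock node in pandoc's JSON representation."""
    return {"t": "RawBlock", "c": [fmt, text]}


def wrap_divs(blocks, fmt):
    """Surround every classed Div with environment markers for its first class."""
    out = []
    for block in blocks:
        if block.get("t") == "Div":
            (ident, classes, attrs), inner = block["c"]
            # Handle nested divs first, then wrap the div itself.
            block = {"t": "Div",
                     "c": [[ident, classes, attrs], wrap_divs(inner, fmt)]}
            if classes:
                start, stop = environment_markers(classes[0], fmt)
                out += [raw_block(fmt, start), block, raw_block(fmt, stop)]
                continue
        out.append(block)
    return out
```

As in the examples above, nothing here checks that the environment exists; that remains the author's responsibility.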

mb21 Dec 20, 2017

Collaborator

I guess the concerns that jkr, jgm and I have voiced come from the feeling that throwing everything that vaguely resembles a class/role/environment/style into the same namespace is just very wrong. (That’s probably why we have the custom-style attribute instead of just using the class.)

Naming is actually a well known problem in Computer Science and is usually solved with hierarchical namespaces. But what all notions of class/role/environment/style we’re discussing here have in common, is that they are neither hierarchical, nor have a standardized vocabulary. (That’s what I meant above with ‘it is in the very nature of a "custom" environment’ to have an unknown name.)

Coming up with good class names is hard enough in HTML; there are multiple conventions and naming schemes that try to somewhat standardize this: BEM, or using only classes for CSS and only ids or even only data-attributes for JavaScript. This problem only gets worse when you consider other formats like docx, ICML and LaTeX where common naming practices might be quite different.

The following is unlikely to yield good results:

curl http://somerandomwebsite.com | pandoc -f html -o out.docx

I’m not sure where this leaves us for pandoc. It’s possible that throwing in the towel is the most practical solution: just give them a +environments extension, and tell them “you are on your own and we’re throwing all identifiers into the same namespace (namely the class attribute)”. As long as the author carefully crafts their input and knows the target environment, they’ll be fine. If not, then don’t use the extension, or customize the behaviour with a filter (e.g. one whitelisting certain class names).
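
A whitelisting filter of the kind mentioned above could look roughly like the following Python sketch, which works on pandoc's JSON AST using only the standard library. The ALLOWED set and the function name filter_classes are hypothetical, and the sketch assumes that stripping unknown classes from Div and Span elements is the desired behaviour.

```python
ALLOWED = {"poem", "whisper"}  # hypothetical whitelist of known environments


def filter_classes(node, allowed=ALLOWED):
    """Recursively drop Div/Span classes that are not on the whitelist."""
    if isinstance(node, list):
        return [filter_classes(item, allowed) for item in node]
    if isinstance(node, dict):
        if node.get("t") in ("Div", "Span"):
            (ident, classes, attrs), content = node["c"]
            kept = [c for c in classes if c in allowed]
            return {"t": node["t"],
                    "c": [[ident, kept, attrs],
                          filter_classes(content, allowed)]}
        return {key: filter_classes(value, allowed)
                for key, value in node.items()}
    return node
```

Used as a JSON filter, the script would read the document with json.load(sys.stdin), pass it through filter_classes, and write the result with json.dump to sys.stdout.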

iandol Dec 20, 2017

Contributor

@mb21 — thank you for the explanation. I agree in a formal logical sense (especially for programming) that naming can be a complicated space. But to my mind at least writing many kinds of texts in a structured manner is not equivalent to programming. It would be incredibly rare that a wordsmith would require hierarchical levels of structural content, let alone one that had to be standardised across all types of wordsmiths.

The fact that the "vocabulary" is non-standard is part of the flexibility of structured writing. Poets can use blocks named so they are relevant to poetry, scientists to science. The wordsmith-specific blocks being one level deep is not a problem: I don't think name collision is a significant issue (i.e. a block called "allegory" in one part of a document is very probably an "allegory" in another). And for DOCX/ODT/ICML, their style system will not break a document (as is possible with LaTeX) and will not clutter the UI.

So while I respect your formalist unease, for a large variety of writing it is not, in my mind, a practical issue. I would divide writers into those that depend on a custom environment (i.e. LaTeX, requiring a command-line enable) and those that don't (i.e. HTML, DOCX, default enabled, optional command-line disable). But it may be that the formalist would require +environments for both classes...

DaveJarvis Dec 20, 2017

the same namespace just feels very wrong.

From https://tex.stackexchange.com/a/37092/2148:

Apart from the experimental ExTeX program, none of the *TeX engines support namespaces...

Prefix your macros with a consistent naming scheme (e.g., jonathan@). The chance that a package uses the same prefix is small; in such a case change your macro prefix.

See also: https://www.texdev.net/2009/01/02/tex-and-namespaces/

labdsf added the format:ODT label Feb 3, 2018

jzeneto Feb 12, 2018

Hi, is there any progress on this issue, apart from the discussion about "custom-style" versus "class name"? I'm very interested in custom styles in ODT. I've tried to get this working by writing a Lua filter, but without success.

I'm interested in testing this issue on ODT.

(Personally, I had thought that .class would be cleaner than custom-style, but reading this thread showed me that it is not that simple.)

mb21 referenced this issue Feb 27, 2018 in "RFC: Verbose docx" #4299 (merged)

hadley Feb 27, 2018

I too would love to see this feature (in any of the proposed forms), and I'd be happy to supply some consulting dollars if that would help.

jkr Feb 27, 2018

Collaborator

hadley Feb 27, 2018

Ok, awesome! Thanks for the update. Just let me know if there's anything I can do to help.

marcban Feb 28, 2018

Actually, for the ODT format, the output would be quite simple.

For example, for a span style, the content.xml part of the .odt file would contain:

  this is a <text:span text:style-name="bar">custom style span</text:span>

In the styles.xml part, this should add a style (if not present) under the office:styles tag, with the value:

<style:style style:name="bar" style:family="text">
	<style:text-properties fo:font-weight="bold"/>
</style:style>
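
As a rough illustration of how little is needed, the two fragments can be generated with a few lines of Python. This is a sketch only: the function names are invented, the XML namespace declarations are assumed to come from the surrounding document, and style names containing spaces or other special characters would need additional handling that is ignored here.

```python
from xml.sax.saxutils import escape, quoteattr


def odt_span(style_name, text):
    """content.xml fragment: a text span referencing a named character style."""
    return (f"<text:span text:style-name={quoteattr(style_name)}>"
            f"{escape(text)}</text:span>")


def odt_text_style(style_name, properties=None):
    """styles.xml fragment: a character style, optionally with text properties."""
    props = "".join(f" {key}={quoteattr(value)}"
                    for key, value in (properties or {}).items())
    body = f"<style:text-properties{props}/>" if props else ""
    return (f"<style:style style:name={quoteattr(style_name)} "
            f'style:family="text">{body}</style:style>')


print(odt_span("bar", "custom style span"))
print(odt_text_style("bar", {"fo:font-weight": "bold"}))
```

Calling odt_text_style("bar") without properties yields an empty style definition, which corresponds to the "add a style if not present" case where the formatting is left to the user.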

fintelkai Mar 7, 2018

I would really really like this. I'm imagining writing

::: whisper :::
Psst! - except for those possibilities that we are properly ignoring
:::

in an md file destined to be a LaTeX beamer presentation. And then my whisper environment (defined in my custom document class) could take care of the typesetting (small, grayed out, etc.)

jgm Mar 8, 2018

Owner

@fintelkai - you don't need to wait for this to get implemented. It's really easy to make this work using lua filters (requires a recent pandoc version). Create whisper.lua:

function Div(el)
  if el.classes:includes("whisper") then
    return { pandoc.RawBlock("latex", "\\begin{whisper}"),
             el,
             pandoc.RawBlock("latex", "\\end{whisper}") }
  end
end

Then run with --lua-filter whisper.lua, and your whisper Div will be magically transformed into a whisper environment.

fintelkai Mar 8, 2018

@jgm Whoa, insert your favorite "mind blown" gif! Thanks, John.

jkr Mar 8, 2018

Collaborator

@fintelkai: note that you can also use this basic idea to make a more general (but still very short) script beyond just the whisper class. Let's say you wanted any class of the form tex-foo to produce a block of type foo. You could do this:

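-- Wrap any Div carrying a class of the form tex-NAME in a LaTeX NAME environment.
function Div(el)
  -- find_if returns the first matching class (and its index), or nil.
  local kls = el.classes:find_if(function (s) return s:match("^tex%-") end)
  if kls then
    -- Strip the tex- prefix; the parentheses discard gsub's second return value.
    local texkls = (kls:gsub("^tex%-", "", 1))
    if texkls ~= "" then
      return { pandoc.RawBlock("latex", "\\begin{" .. texkls .. "}"),
               el,
               pandoc.RawBlock("latex", "\\end{" .. texkls .. "}") }
    end
  end
end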

Call this texclass.lua. Then, given this markdown file:

::: tex-whisper :::

this

that

:::

Hello

::: shouting

yay

:::

::: tex-shouting :::

beep

boop

:::

Running pandoc input.md --lua-filter=texclass.lua -t latex would get you:

\begin{whisper}

this

that

\end{whisper}

Hello

yay

\begin{shouting}

beep

boop

\end{shouting}

bumatic Jun 17, 2018

It's great that paragraph styles can now be preserved when converting docx -> md -> docx! That is a feature I had long been waiting for. For me (and, judging from this and a bunch of other open issues, for others as well), it would be great if these custom styles could be written to icml (and odt and possibly all other output formats). I know this takes time and I appreciate the effort. Really! If this can be achieved with lua-filters for writing to icml, similar to the solution @jgm proposed to @fintelkai on March 8, 2018 (see above) for latex, I’d really appreciate any specific pointers. I tried my best to figure it out myself, but wasn't able to.

matthewlehew Jun 27, 2018

I second the issue @bumatic raised: it would be great if the custom-style key-value syntax could also be written to icml paragraph styles.

glassdimly Jul 22, 2018

If anyone finds their way here, as I did, wanting to transfer custom styles from html or md into a docx file, this has already been implemented here.

jzeneto Jul 23, 2018

For anyone who needs custom styles in ODT: I've written a couple of Lua filters, and among them is odt-custom-styles.lua, which almost accomplishes that:

https://github.com/jzeneto/pandoc-odt-filters

"Almost" because, to do that, I turned everything inside a custom-styled div or span into a raw block/inline of type opendocument (if and only if the output format is odt). For this I had to figure out and write filters for every element.

(I've put all these element filters in another Lua file, util.lua, for reuse in other filters. It needs to be in the same directory as odt-custom-styles.lua.)

jzeneto Jul 23, 2018

To add to the discussion about ODT: I don't think it's necessary to add custom styles to styles.xml, as @marcban suggested, nor to turn them into the default style if they are not defined, as @iandol suggested, because at least LibreOffice already reads undefined styles as default styles. Keeping the custom style name makes it possible to check manually whether styles were applied (by opening the odt file with a zip tool and looking at content.xml), and I think it is also easier to implement.
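
The manual check can also be scripted. A minimal sketch, assuming the odt file is a zip archive with a content.xml entry and that applied styles show up as text:style-name attributes in the OpenDocument text namespace; the function name is just illustrative:

```python
import xml.etree.ElementTree as ET
import zipfile
from collections import Counter

# OpenDocument "text" namespace, used by the style-name attribute.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def style_names_in_odt(odt_file):
    """Count every text:style-name value used in content.xml.

    odt_file may be a path or a file-like object, as accepted by
    zipfile.ZipFile.
    """
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    attr = "{%s}style-name" % TEXT_NS
    return Counter(
        el.get(attr) for el in root.iter() if el.get(attr) is not None
    )
```

Calling style_names_in_odt("output.odt") (a placeholder file name) returns a counter of style names, so a custom style that was never applied simply does not appear in it.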

glassdimly Jul 23, 2018

@jzeneto Thank you, your repo is amazing, and just what I'm looking for. For those on the thread, specifically see this filter.
