Supporting pandoc 2.11 #142

ickc · 2020-07-02T01:08:20Z

There's a change in pandoc AST since 2.10. c.f. jgm/pandoc#1024, jgm/pandoc#6277 and https://github.com/jgm/pandoc/releases/tag/2.10

The new Underline is probably easy. The change of AST regarding table is huge (jgm/pandoc#1024) including support for span.

Edit: Links on new table AST

PR
- Feature request started here: Support for table column spans, table attributes in AST jgm/pandoc#1024, Improving tables jgm/pandoc-types#65
- this is where the actual change of the AST happened: https://github.com/jgm/pandoc-types/pull/66/files?file-filters%5B%5D=.hs#diff-01f4ffe52cf097ab9ff89eebce394c86
- pandoc 2.10 release note: https://github.com/jgm/pandoc/releases/tag/2.10
- pandoc 2.11 release note (see HTML writer): https://github.com/jgm/pandoc/releases/tag/2.11
- pandoc-types 1.21 changelog: https://github.com/jgm/pandoc-types/blob/master/changelog
def
- https://hackage.haskell.org/package/pandoc-types-1.22/docs/Text-Pandoc-Definition.html
- doc on lua filters has some info on this: https://pandoc.org/lua-filters.html
- See comments at https://github.com/jgm/pandoc-types/pull/66#issuecomment-611053332 and after on normalizing tables
example ASTs:
- https://github.com/despresc/pandoc-types/blob/master/test/test-pandoc-types.hs
- https://github.com/jgm/pandoc/tree/master/test/tables

sergiocorreia · 2020-07-02T03:59:20Z

Thanks! This is indeed an important change.

First of all, I feel like I should just add support for id/attrib/classes to all elements, because we seem to be converging towards that.

For the tables, is there an actual summary of the changes of the new AST? That issue thread looks huge

ickc · 2020-07-02T23:53:59Z

More links related to the AST:

Support for table column spans, table attributes in AST jgm/pandoc#1024
Improving tables jgm/pandoc-types#65
Better tables jgm/pandoc-types#66 : this is where the actual change of the AST happened
https://hackage.haskell.org/package/pandoc

ickc · 2020-07-02T23:57:45Z

I feel like I should just add support for id/attrib/classes to all elements, because we seem to be converging towards that.

IIRC @jgm is reluctant to that idea. I don't think it will really happen, although many might ask for it. The solution seems to be just wrap whichever element one want to have attributes by a div and do it there.

jgm · 2020-07-03T18:53:39Z

If I were designing afresh, I'd have attributes on everything. But it would be quite a big change now.

sergiocorreia · 2020-07-06T14:58:26Z

@ickc by any chance do you have on hand any input file I can use to test the recent changes? I was initially thinking about having a markdown file with a complex table, but I understand that although the AST changed, the readers/writers haven't (yet).

So, are you aware of a way to receive and feed data into pandoc? One option is to try using a filter and seeing if pandoc accepts it, but seems a bit fragile.

ickc · 2020-07-08T23:18:28Z

I think may be the best way to obtain a test file would be starting from https://github.com/despresc/pandoc-types/blob/09cb4314010365abc4512c2363b83711c92ac18b/test/test-pandoc-types.hs. Here's the diff leading to this: https://github.com/jgm/pandoc-types/pull/66/files?file-filters%5B%5D=.hs#diff-01f4ffe52cf097ab9ff89eebce394c86. There's a JSON AST there that might be useful.

Other than the JSON, pandoc native AST could be a start. For short tables it is still manageable by hand.

ickc · 2020-07-08T23:29:13Z

Add some comments related to chdemko/pandoc-numbering#18 (comment)

According to https://github.com/jgm/pandoc-types/pull/66/files#diff-8c56b05df299ade82ea6265ee0b71bd0, these are added:

, Caption(..)
                               , ShortCaption
                               , RowHeadColumns(..)
                               , Alignment(..)
                               , ColWidth(..)
                               , ColSpec
                               , Row(..)
                               , TableHead(..)
                               , TableBody(..)
                               , TableFoot(..)
                               , Cell(..)
                               , RowSpan(..)
                               , ColSpan(..)

and TableCell is removed.

Then later,

		 -- | A short caption, for use in, for instance, lists of figures.
 type ShortCaption = [Inline]

  -- | The caption of a table, with an optional short caption.
 data Caption = Caption (Maybe ShortCaption) [Block]

So the ShortCaption is an Inline and Caption is block element.

dilawar · 2020-07-19T10:00:16Z

Is there a beta version for this feature? I have pandoc-2.10 installed and if possible if would rather not downgrade.

NMarkgraf · 2020-07-26T20:05:39Z

As I can not see any commits toward a solution to this problem, is there any way we could help?

aubertc · 2020-07-27T19:29:46Z

The reasons why this is more challenging than thought have been sketched by @sergiocorreia , one of panflute's author, at chdemko/pandoc-numbering#18 (comment) . If you find a good solution to the absence of test, I believe this could be extremely useful. You can also, probably, "hack" something, but there will be no guarantee that it will work on the long run (that is, when writers / readers will change too).

NMarkgraf · 2020-07-27T20:12:27Z

Okay, but so panflute is currently unusable ... I find that is also not a good idea...

ickc · 2020-07-28T03:22:35Z

To everyone who complain, Bare in mind you need to maintain your whole dependency stack. If you depends on panflute and panflute depends on pandoc<=2.9 for now, don't upgrade your pandoc and then complain. All of these open source developments are out of everyone's free time and is uncompensated.

This is really nothing out of ordinary whenever software libraries are used. Remember this whenever you want to spam any repos.

fralau · 2020-09-10T18:51:51Z

@sergiocorreia Clearly this is not your fault. The pandoc team thought they were doing something right by adding merged cells, but they did not anticipate the repercussions (IMHO this change was so significant it warranted a change of version of 3.0). Oh, well.

@ickc I did what you recommended (downgrade pandoc to 2.9). It was a bit of fuss with brew on MacOS, but that's doable (in case somebody wants indications, I kept a few notes). Plus I warned my colleagues about the issue.

I'd say we take this with philosophy. If there is something that the community can do to help, let us know.

ickc · 2020-09-15T18:42:19Z

The pandoc team thought they were doing something right by adding merged cells, but they did not anticipate the repercussions (IMHO this change was so significant it warranted a change of version of 3.0)

In fact it is a much anticipated feature for years. Pandoc's versioning model is major.major.minor.critical. So 2.9 to 2.10 is a major change that does not guarantee backward compatibility.

Many people surprised by something like this for the first time. But whichever software stack one depends on has to be managed can should not be blindly upgraded especially for major change that may break backward compatibility. Homebrew is one of the package manager that is not good at managing multiple versions (i.e. older versions.) For better control, you can use something like conda, conda create -n panflute pandoc<2.10 panflute -c conda-forge will gives you the necessary versions. OS level package managers often has this difference with those managing software stacks. In this case homebrew/macports/pacman from Arch, etc. can help you manage OS level software stacks, but other package manager such as conda can help managing your development software stacks.

While users of pandoc filters might not think that using them are "developing", but as long as they are using pandoc filters, they are developing in the sense that you have to manage the software stacks more carefully. That's partly why the idea of a pandoc package manager (pandocpm) died. In the case of this new table in AST, even the native filter framework embedded in pandoc failed. But in general the embedded framework should be the safest bet because it is officially supported and supposed to be upgraded in sync with pandoc itself.

ickc · 2020-09-15T18:50:48Z

By the way, the panflute package at conda-forge in principle can include dependence on pandoc: https://github.com/conda-forge/panflute-feedstock, maintained by @kiwi0fruit. With that it would be impossible to create a conflicted conda environment for panflute.

I know it is possible but don't know exactly how, @kiwi0fruit, any comments?

sergiocorreia · 2020-09-15T19:38:47Z

Hi Kolen, (and Laurent, Damien, etc.)

I agree with you that adding support for 2.10 (and now 2.10.1) is crucial. Due to family health issues, I am currently on leave from work and not in town, so even if this is important I have almost zero time to spend on updating.

One thing though: last time I looked my main problem was that I could not create or find an example of how the JSON is supposed to look, as no readers were currently supporting multirows and other features. Has this changed? If any of you can create a sample input that I can use that would be incredibly useful.

ickc · 2020-09-15T19:48:16Z

@sergiocorreia, I'm sorry to hear that and take your time to heal with your family! I totally understand and lately I don't have much time on these open source projects either. It's certainly not your responsibility to come up with a fix.

I know that there's PRs in pandoc on some readers/writers on the new table feature such as HTML, LaTeX, etc. And the test cases can provide some JSON to work with.

Anyone are welcome to step up to provide this or even make a PR too.

fralau · 2020-09-16T15:59:23Z

@sergiocorreia I would be willing to get samples, since that would also allow me to better understand what this evolution is about.

My only problem, right now, would be that I don't have access to a 2.10 platform on my Mac 🤷‍♂️ and I am not sure I wish to risk changing my setup (since brew is a pain for that, as far as pandoc is concerned). Anybody, any idea?

tarleb · 2020-11-06T09:51:57Z

@tarleb, is all the env var def documented somewhere?

You are right, this is only ever mentioned in the changelog, but not in any other documentation. I'll write a PR to add this to the JSON filter docs. (Edit: PR)

There is one other environment variable: PANDOC_READER_OPTIONS. This should contain the same information as the Lua global of the same name.

There is an open pandoc issue about exposing more info: jgm/pandoc#5221

ickc · 2020-11-06T23:12:35Z

Thanks for the info, I just double checked and only PANDOC_READER_OPTIONS, PANDOC_VERSION are available to JSON filters, so that's a bit different comparing to that documented in the lua filter section. PANDOC_READER_OPTIONS is going to be useful for us.

(But the info from the other 4 env var available to lua filter is either already known in JSON filters, or irrelevant for our case. So nothing's lost here.)

Looking forward to PANDOC_WRITER_OPTIONS.

ickc · 2020-11-07T01:30:28Z

pandoc 2.10 uses pandoc-types 1.21

pandoc 2.11 uses pandoc-types 1.22

IIRC something from pandoc-discuss mentioned a slight API changes is made to fix some consistency issue in pandoc 2.11.

So in short probably we'd jump to supporting 2.11 instead.

fralau · 2020-11-07T05:45:40Z

@sergiocorreia @ickc @tarleb Thanks a lot for this work for the community, this issue is getting really interesting ! Loads of things happened in this thread. After 50+ messages, perhaps someone could make a little summary of where we are all going with this issue, where we have come from, and where we are ?

sergiocorreia · 2020-11-07T13:47:14Z

@fralau

Goal is to add support 2.11 soon (within the next couple of days)
Going forward, we want to stay as close as possible to pandoc-latest , but avoid spending resources in keeping compat with older versions of pandoc. For that, people can just install an earlier version of panflute (so we will add a table indicating what version of panflute should they install if they have pandoc x.y)

Beyond that,

We want to expose a bit more metadata (the env vars that pandoc exposes)
We will probably need helper functions to make creating and modifying tables a bit easier, as 2.11 added quite a bit of complexity (still worth it though!). Right now, if you want to do something as simple as adding an extra column to a table, you would have to do a lot of changes to a lot of objects (you would need to touch every TableRow in TableHead, TableFoot, and all TableBody objects, plus adjust the ColSpec list)

fralau · 2020-11-07T16:24:30Z

@sergiocorreia Thanks a lot: this sounds very good !

Indeed, we might have an interesting problem in our project to determine (on the fly?) which version of pandoc is being used, so as to determine which version of panflute to use. Or else, ban versions of pandoc < 2.11 🤔

ickc · 2020-11-08T02:48:27Z

@fralau, regarding the lookup table between panflute and pandoc versions, for now it is just https://github.com/sergiocorreia/panflute/tree/stable#supported-pandoc-versions in the stable branch. It is an information for the end users and has to be acted upon manually.

Long term I'd like to have proper version control using conda. c.f. https://github.com/conda-forge/panflute-feedstock currently maintained by @kiwi0fruit. The pandoc dependence should be added there. And if done correctly people can do conda install panflute to have the matching version of compatible pandoc, panflute. (And say conda install pantable would have matching version of compatible pandoc, panflute, pantable and so on.)

For pip users, another option would be for panflute to bundle pandoc inside its wheel. I know that pypandoc used to do it in the past but has too much trouble that we wrote a download_pandoc function to just inform the endusers to run that function. However, it seems that they are building it from the source. Bundling the pre-compiled binaries in the wheels probably is much easier.

We probably should open another issue on this matter.

ickc · 2020-11-08T04:23:09Z

Here are the files which we currently use internally to test table output. The planets table tests colspans and rowspans, nordics checks element attributes and table footer, and students contains multiple table bodies, each with its own subheader.

nordics.txt
planets.txt
students.txt

GitHub requires a .txt ending, but all files contain pandoc JSON.

@tarleb, do you have a version of these with pandoc api 1.22 instead? Thanks.

gabyx · 2020-11-09T08:25:49Z

So good when this is implemented! :-) 👍 @ickc Do you know how much time this needs?. Thanks a lot for your effort.

ickc · 2020-11-09T12:31:25Z

@tarleb, do you have a version of these with pandoc api 1.22 instead? Thanks.

Nevermind, I converted them from pandoc 2.10.1 to 2.11.1.1:

nordics.txt
planets.txt
students.txt

Seems like only the JSON format has changed.

sergiocorreia · 2020-11-09T15:33:13Z

@tarleb, do you have a version of these with pandoc api 1.22 instead? Thanks.

Nevermind, I converted them from pandoc 2.10.1 to 2.11.1.1:

Is there a markdown source? In that way, we can generate the json each time we test; else the build might/will break next time pandoc_api changes even slightly.

tarleb · 2020-11-09T15:40:06Z

There are only .native sources: https://github.com/jgm/pandoc/tree/master/test/tables. Those can be converted to json with pandoc planets.native -o planets.json.

sergiocorreia · 2020-11-09T17:41:23Z

Thanks! Just added those files to the test cases!

@ickc

- See discussion in #142 and comments from @ickc and @tarleb - Fixed bugs revealed by the new tests

sergiocorreia · 2020-11-09T17:43:28Z

By the way, the develop branch now supports the latest version of pandoc, will be adding that to master and releasing to pypi soon.

ickc · 2020-11-10T00:40:20Z

the master branch now passes tests including the native files pointed by @tarleb. Feel free to test the master branch, e.g. by pip install git+https://github.com/sergiocorreia/panflute.git@master.

sergiocorreia · 2020-11-10T01:09:43Z

Closing it; code is now on pypi

Also, definitely let us know if you find any bugs or have any ideas for easier table constructors. For instance, most filter creators probably don't want to specify table foot, header of table body, colspec, etc., so a possibility is a SimpleTable() that then populates the full-on Table().

gabyx · 2020-11-10T08:34:48Z

@sergiocorreia , @ickc, @tarleb
Huge thanks for updating! Really appreciated!

ickc mentioned this issue Jul 2, 2020

Supporting pandoc 2.11 ickc/pantable#51

Closed

chdemko mentioned this issue Jul 7, 2020

Accomodating new tables definition chdemko/pandoc-numbering#18

Closed

mpark mentioned this issue Jul 15, 2020

Broken with Pandoc 2.10 mpark/wg21#65

Closed

marwahaha mentioned this issue Jul 16, 2020

installing from scratch boazbk/panbook#1

Closed

dhimmel mentioned this issue Jul 29, 2020

Update Pandoc to 2.10 manubot/rootstock#354

Closed

This was referenced Aug 19, 2020

runtime failure chdemko/pandoc-latex-color#7

Closed

Error running filter pandoc-numbering: Filter returned error status 1 chdemko/pandoc-numbering#20

Closed

This was referenced Sep 5, 2020

Could not find executable pandoc-beamer-block error on windows chdemko/pandoc-beamer-block#1

Closed

Throws an exception when inserting a table chdemko/pandoc-latex-tip#13

Closed

NMarkgraf mentioned this issue Sep 9, 2020

Using markdown tables leads to error "unknown tag: Caption" NMarkgraf/typography.py#5

Closed

cwgoes mentioned this issue Sep 13, 2020

Asynchronous acknowledgements cosmos/ibc#480

Merged

6 tasks

daamien mentioned this issue Sep 15, 2020

Panflute vs. Pandoc 2.10 dalibo/pandocker#184

Closed

ickc mentioned this issue Nov 6, 2020

filters.md: document Env Var jgm/pandoc#6816

Closed

ickc mentioned this issue Nov 6, 2020

Expose pandoc's Environment variable #148

Closed

ickc changed the title ~~Supporting pandoc 2.10~~ Supporting pandoc 2.11 Nov 7, 2020

ickc mentioned this issue Nov 8, 2020

Version managing pandoc #153

Closed

3 tasks

ickc added enhancement in progress labels Nov 8, 2020

ickc assigned sergiocorreia and ickc Nov 8, 2020

sergiocorreia added a commit that referenced this issue Nov 9, 2020

Added native files to test cases

243af31

- See discussion in #142 and comments from @ickc and @tarleb - Fixed bugs revealed by the new tests

sergiocorreia closed this as completed Nov 10, 2020

dhimmel mentioned this issue Nov 10, 2020

TypeError in pandocxnos.core.process_refs with Pandoc 2.11 tomduck/pandoc-fignos#85

Closed

DCsunset mentioned this issue Nov 25, 2020

Using pandoc-include results in NotImplementedError DCsunset/pandoc-include#12

Closed

ngirard mentioned this issue Jan 20, 2021

Exception: unknown tag: ColWidth entangled/filters#13

Closed

Supporting pandoc 2.11 #142

Supporting pandoc 2.11 #142

Comments

ickc commented Jul 2, 2020 • edited Loading

sergiocorreia commented Jul 2, 2020

ickc commented Jul 2, 2020

ickc commented Jul 2, 2020

jgm commented Jul 3, 2020

sergiocorreia commented Jul 6, 2020 • edited Loading

ickc commented Jul 8, 2020

ickc commented Jul 8, 2020

dilawar commented Jul 19, 2020

NMarkgraf commented Jul 26, 2020

aubertc commented Jul 27, 2020

NMarkgraf commented Jul 27, 2020

ickc commented Jul 28, 2020

fralau commented Sep 10, 2020 • edited Loading

ickc commented Sep 15, 2020

ickc commented Sep 15, 2020

sergiocorreia commented Sep 15, 2020

ickc commented Sep 15, 2020

fralau commented Sep 16, 2020 • edited Loading

tarleb commented Nov 6, 2020 • edited Loading

ickc commented Nov 6, 2020

ickc commented Nov 7, 2020

fralau commented Nov 7, 2020 • edited Loading

sergiocorreia commented Nov 7, 2020

fralau commented Nov 7, 2020 • edited Loading

ickc commented Nov 8, 2020

ickc commented Nov 8, 2020

gabyx commented Nov 9, 2020

ickc commented Nov 9, 2020

sergiocorreia commented Nov 9, 2020

tarleb commented Nov 9, 2020

sergiocorreia commented Nov 9, 2020

sergiocorreia commented Nov 9, 2020

ickc commented Nov 10, 2020

sergiocorreia commented Nov 10, 2020

gabyx commented Nov 10, 2020

ickc commented Jul 2, 2020 •

edited

Loading

sergiocorreia commented Jul 6, 2020 •

edited

Loading

fralau commented Sep 10, 2020 •

edited

Loading

fralau commented Sep 16, 2020 •

edited

Loading

tarleb commented Nov 6, 2020 •

edited

Loading

fralau commented Nov 7, 2020 •

edited

Loading

fralau commented Nov 7, 2020 •

edited

Loading