[Docx to Native] Keep columns during the conversion #8057

Open
fsoedjede opened this issue May 4, 2022 · 2 comments

@fsoedjede

Hello,
Thanks for this tool.

Describe your proposed improvement and the problem it solves.

I would like to know whether it would be possible to keep the column layout during conversion.

Currently, when converting from docx to native, the column information is lost.

In my use case, the docx editor does not have access to the Markdown file, which is why I'm asking here.

Docx example : test-columns.docx

Current output

Native output with command pandoc --from=docx --to=native --output=test-columns-current.native test-columns.docx:

Markdown output with command pandoc --from=docx --to=markdown --output=test-columns-current.md test-columns.docx:

Proposal

We could use fenced_divs to do this.

The blocks in columns would be surrounded by a div with the class "columns".

Native output will be: test-columns-expected.native

Markdown output will be: test-columns-expected.md

Describe alternatives you've considered.

There is this filter: https://github.com/jdutant/columns.
I use it when the source file is Markdown, but in my process the docx editor does not have access to the Markdown file.
Markdown is used as the reference file for conversions to other formats because it is readable in the git history.

I'm open to any suggestions, even if they require writing a custom Haskell extension.

mb21 commented May 7, 2022

How do you propose the docx reader would recognize a column in the docx file? It would be just a special style, or..?

You could also put in a special keyword in your title, and then write a custom lua-filter to do the transformation to a div..
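The keyword-in-title workaround could be sketched roughly as follows. This is written as Python over pandoc's JSON AST rather than as a Lua filter, purely for illustration. The marker `[columns]` and the function names are hypothetical, and the sketch assumes that everything between a marked header and the next header belongs in the columns div.

```python
# Hypothetical keyword that the docx author would type into a title.
MARKER = "[columns]"

def header_text(inlines):
    """Flatten the Str and Space inlines of a header into plain text."""
    parts = []
    for inline in inlines:
        if inline["t"] == "Str":
            parts.append(inline["c"])
        elif inline["t"] == "Space":
            parts.append(" ")
    return "".join(parts)

def wrap_columns(blocks):
    """Wrap the blocks that follow a marked header in a Div with class 'columns'."""
    out = []
    i = 0
    while i < len(blocks):
        block = blocks[i]
        out.append(block)
        i += 1
        # A pandoc Header is [level, attr, inlines]; the inlines are at index 2.
        if block["t"] == "Header" and MARKER in header_text(block["c"][2]):
            body = []
            # Assumption: the column region ends at the next header of any level.
            while i < len(blocks) and blocks[i]["t"] != "Header":
                body.append(blocks[i])
                i += 1
            if body:
                # A pandoc Div is [attr, blocks] with attr = [id, classes, key-values].
                out.append({"t": "Div", "c": [["", ["columns"], []], body]})
    return out

def para(text):
    return {"t": "Para", "c": [{"t": "Str", "c": text}]}

def header(level, words):
    inlines = []
    for n, word in enumerate(words):
        if n:
            inlines.append({"t": "Space"})
        inlines.append({"t": "Str", "c": word})
    return {"t": "Header", "c": [level, ["", [], []], inlines]}

demo = [header(1, ["Intro", "[columns]"]), para("a"), para("b"),
        header(1, ["Next"]), para("c")]
print([b["t"] for b in wrap_columns(demo)])
# ['Header', 'Div', 'Header', 'Para']
```

To run this as an actual pandoc JSON filter, one would read the document from stdin with `json.load`, replace its `"blocks"` list with `wrap_columns(doc["blocks"])`, and write the result to stdout with `json.dump`. The marker text is left in the title here; removing it would be a separate step.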

@fsoedjede

@mb21 sorry for the delayed reply

> How do you propose the docx reader would recognize a column in the docx file? It would be just a special style, or..?

In docx, making columns requires creating a new section (https://support.microsoft.com/en-us/office/insert-a-section-break-eef20fd8-e38c-4ba6-a027-e503bdf8375c).
From the docx file I attached to this issue, we have:

<w:sectPr w:rsidR="00704BBC" w:rsidRPr="00F964AB" w:rsidSect="00F34CE4">
    <w:pgSz w:w="11906" w:h="16838"/>
    <w:pgMar w:top="1417" w:right="1417" w:bottom="1417" w:left="1417" w:header="708" w:footer="708"
             w:gutter="0"/>
    <w:cols w:num="2" w:space="708"/>
    <w:docGrid w:linePitch="381"/>
</w:sectPr>

In the w:cols node, w:num="2" means we have two columns. When it is not defined, my understanding is that we have a normal single-column document.
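As a rough illustration of how a reader might pick this up, here is a minimal Python sketch (not pandoc's code) that groups paragraphs by section and reports each section's column count. It assumes the usual WordprocessingML layout, where a non-final section's w:sectPr sits inside the w:pPr of that section's last paragraph and the final section's w:sectPr is a direct child of w:body. The function names and the sample XML are made up for the example; tables and explicit w:col children are ignored.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# WordprocessingML main namespace, in ElementTree's "{uri}" prefix form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def num_columns(sect_pr) -> int:
    """Return w:cols/@w:num for a section, treating a missing value as 1."""
    cols = sect_pr.find(W + "cols")
    if cols is None:
        return 1
    return int(cols.get(W + "num", "1"))

def sections(document_xml: str) -> List[Tuple[int, List[str]]]:
    """Group paragraph texts by section as (column count, paragraphs) pairs."""
    body = ET.fromstring(document_xml).find(W + "body")
    result = []
    pending = []
    for child in body:
        if child.tag == W + "p":
            pending.append("".join(t.text or "" for t in child.iter(W + "t")))
            # A sectPr inside pPr closes the section ending with this paragraph.
            sect_pr = child.find(W + "pPr/" + W + "sectPr")
            if sect_pr is not None:
                result.append((num_columns(sect_pr), pending))
                pending = []
        elif child.tag == W + "sectPr":
            # The body-level sectPr describes the final section.
            result.append((num_columns(child), pending))
            pending = []
    return result

# Made-up sample: a one-column section followed by a two-column section.
SAMPLE = """<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p><w:r><w:t>Intro</w:t></w:r></w:p>
    <w:p>
      <w:pPr><w:sectPr><w:cols w:space="708"/></w:sectPr></w:pPr>
      <w:r><w:t>End of the one-column part</w:t></w:r>
    </w:p>
    <w:p><w:r><w:t>Two-column text</w:t></w:r></w:p>
    <w:sectPr><w:cols w:num="2" w:space="708"/></w:sectPr>
  </w:body>
</w:document>"""

for count, paragraphs in sections(SAMPLE):
    print(count, paragraphs)
```

A reader built along these lines could wrap each group whose count is greater than 1 in a div with the class "columns", which is what the proposal above asks for.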

In src/Text/Pandoc/Writers/Docx.hs I see some references to sectPr.
In src/Text/Pandoc/Readers/Docx.hs or any other readers I don't see references to sectPr.

> You could also put in a special keyword in your title, and then write a custom lua-filter to do the transformation to a div..

It could work, but not when the content under a single title mixes one-column and two-column text.
I will use this for now.

Thanks
