Unwanted block quotes in lists when converting ODT generated from DOCX #9505

Open
mathrick opened this issue Feb 24, 2024 · 4 comments
@mathrick

Explain the problem.
This might be related to #8836. It seems that the fixes to the native DOCX reader were not propagated to other affected formats such as ODT, even though that issue mentions .odt in a comment.

The file in question was generated by LibreOffice (using the official Ubuntu 22.04 packages) from a DOCX file (the same source file as in #9504): How-to-Become-a-Best-Selling-Author.odt

$ libreoffice --convert-to odt How-to-Become-a-Best-Selling-Author.docx
$ libreoffice --version
LibreOffice 7.3.7.2 30(Build:2)
$ pandoc How-to-Become-a-Best-Selling-Author.odt -o How-to-Become-a-Best-Selling-Author.typ

This generates the following list structure:

In this article you will learn the basics about three things:

- #quote(block: true)[
  Defining your target audience
  ]

- #quote(block: true)[
  Creating book covers, titles, and subtitles
  ]

- #quote(block: true)[
  Working with other authors
  ]

Pandoc version?

$ pandoc --version
pandoc 3.1.12.1
Features: +server +lua
Scripting engine: Lua 5.4
@mathrick mathrick added the bug label Feb 24, 2024
@jgm
Owner

jgm commented Feb 25, 2024

This has nothing to do with typst specifically; it's an odt reader issue.

@jgm changed the title from "Strange Typst markup when converting ODT generated from DOCX" to "Unwanted block quotes in lists when converting ODT generated from DOCX" on Feb 25, 2024
@mathrick
Author

@jgm: Ah, I wasn't sure about that, since I couldn't immediately see the block quotes in the native format dump, and it doesn't generate them with the Markdown target.

@jgm
Owner

jgm commented Feb 26, 2024

It's there in the native output:

, BulletList
    [ [ BlockQuote
          [ Para
              [ Str "Defining"
              , Space
              , Str "your"
              , Space
              , Str "target"
              , Space
              , Str "audience"
              ]
          ]
      ]
    , [ BlockQuote
          [ Para
              [ Str "Creating"
              , Space
              , Str "book"
              , Space
              , Str "covers,"
              , Space
              , Str "titles,"
              , Space
              , Str "and"
              , Space
              , Str "subtitles"
              ]
          ]
      ]
    , [ BlockQuote
          [ Para
              [ Str "Working"
              , Space
              , Str "with"
              , Space
              , Str "other"
              , Space
              , Str "authors"
              ]
          ]
      ]
    ]
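
For anyone who wants to check a converted document for this pattern without reading the whole dump, here is a minimal sketch that counts list items consisting of a single BlockQuote. It assumes the document has been exported with `pandoc -t json` and loaded as a Python dict; the function name `count_quoted_items` and the inline `sample` document are made up for illustration and are not part of pandoc.

```python
import json


def count_quoted_items(node):
    """Count list items whose only block is a BlockQuote (hypothetical helper)."""
    count = 0
    if isinstance(node, dict):
        tag, content = node.get("t"), node.get("c")
        if tag == "BulletList":
            items = content          # BulletList content: a list of items
        elif tag == "OrderedList":
            items = content[1]       # OrderedList content: [attributes, items]
        else:
            items = []
        for item in items:
            if len(item) == 1 and item[0].get("t") == "BlockQuote":
                count += 1
        # Recurse into every value so nested lists are also inspected.
        for value in node.values():
            count += count_quoted_items(value)
    elif isinstance(node, list):
        for child in node:
            count += count_quoted_items(child)
    return count


def para(word):
    """Build a one-word paragraph in pandoc's JSON encoding."""
    return {"t": "Para", "c": [{"t": "Str", "c": word}]}


# Hand-written stand-in for `pandoc file.odt -t json`; the version is illustrative.
sample = {
    "pandoc-api-version": [1, 23, 1],
    "meta": {},
    "blocks": [
        {"t": "BulletList", "c": [
            [{"t": "BlockQuote", "c": [para("Defining")]}],
            [{"t": "BlockQuote", "c": [para("Creating")]}],
            [para("Working")],
        ]},
    ],
}

print(count_quoted_items(sample))  # prints 2
```

On a real file, `sample` would be replaced by `json.load(sys.stdin)` with the output of `pandoc How-to-Become-a-Best-Selling-Author.odt -t json` piped in.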

@jgm
Owner

jgm commented Feb 26, 2024

I think this is happening because the list items' paragraphs are indented. The ODT reader uses indentation as a heuristic for determining when we have a block quote. It is a similar issue to #8836, but the fix will have to be different because ODT is a different format.
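
Until the reader is changed, one possible workaround is to post-process the AST and unwrap any list item that consists of a single BlockQuote. The sketch below is not the fix for the ODT reader and does not touch its indentation heuristic; it only rewrites pandoc's JSON output. The names `unwrap_item` and `unwrap_quoted_items` and the inline `sample` document are hypothetical, and the sketch assumes that a lone block quote inside a list item is never intentional in the document being converted.

```python
import json


def unwrap_item(item):
    """Replace a list item that is a lone BlockQuote with the quote's contents."""
    if len(item) == 1 and item[0]["t"] == "BlockQuote":
        return item[0]["c"]
    return item


def unwrap_quoted_items(node):
    """Return a copy of the AST with lone block quotes removed from list items."""
    if isinstance(node, list):
        return [unwrap_quoted_items(child) for child in node]
    if not isinstance(node, dict):
        return node
    # Process children first so nested lists are already cleaned up.
    node = {key: unwrap_quoted_items(value) for key, value in node.items()}
    if node.get("t") == "BulletList":
        node["c"] = [unwrap_item(item) for item in node["c"]]
    elif node.get("t") == "OrderedList":
        attrs, items = node["c"]
        node["c"] = [attrs, [unwrap_item(item) for item in items]]
    return node


# Hand-written stand-in for `pandoc file.odt -t json`; the version is illustrative.
sample = {
    "pandoc-api-version": [1, 23, 1],
    "meta": {},
    "blocks": [
        {"t": "BulletList", "c": [
            [{"t": "BlockQuote", "c": [
                {"t": "Para", "c": [{"t": "Str", "c": "Defining"}]},
            ]}],
        ]},
    ],
}

fixed = unwrap_quoted_items(sample)
print(json.dumps(fixed["blocks"][0]["c"][0][0]))
```

To use this as a JSON filter, the script would read the document with `json.load(sys.stdin)` and write the result with `json.dump(fixed, sys.stdout)`, so that it can be passed to pandoc via `--filter`. Block quotes that share a list item with other blocks are left alone, since those are more likely to be deliberate.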
