markitdown-ocr: PPTX converter emits literal \n sequences in Markdown output #2010

@MontesanoDev

Description

PptxConverterWithOCR uses "\\n" instead of "\n" in multiple output paths.
As a result, converted PPTX files contain literal backslash-n sequences rather
than Markdown line breaks.
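
For illustration, a minimal Python sketch of the difference between the two separators. The slide fragments and variable names are hypothetical and are not taken from PptxConverterWithOCR:

```python
# Hypothetical slide fragments, used only to show the separator behavior.
parts = ["# Slide 1", "First bullet", "Second bullet"]

# "\\n" is two characters: a backslash followed by the letter n.
buggy = "\\n".join(parts)

# "\n" is a single newline character.
fixed = "\n".join(parts)

print(repr(buggy))
print(repr(fixed))

# The buggy output stays on one line; the fixed output has one line per part.
print(len(buggy.splitlines()), len(fixed.splitlines()))
```

With the escaped separator, everything lands on a single line and a Markdown renderer shows the backslash-n text verbatim.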

The OCR PPTX tests currently encode this malformed output as expected behavior.
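
One possible shape for a regression check is sketched below. The helper name literal_newline_offsets is hypothetical and is not part of the existing test suite. It only scans a converted string for a backslash followed by "n":

```python
import re

# Matches a backslash followed by "n", i.e. a newline escape that was
# written into the output instead of being interpreted.
LITERAL_NEWLINE = re.compile(r"\\n")


def literal_newline_offsets(markdown: str) -> list[int]:
    """Return the start offset of every literal backslash-n in the text."""
    return [m.start() for m in LITERAL_NEWLINE.finditer(markdown)]
```

A test could assert that this list is empty for converted output. That assumes the fixture slides do not legitimately contain backslash-n text, for example in a code sample on a slide.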

Additionally, the optional LLM caption path does a relative import of
._llm_caption, but that module exists in markitdown.converters, not in
markitdown_ocr, so the import cannot resolve from inside markitdown_ocr.
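
A hedged sketch of one way to make the optional import tolerant of a missing module. The helper first_importable is hypothetical, and the absolute path markitdown.converters._llm_caption is inferred from the description above rather than verified against the package:

```python
import importlib
from types import ModuleType
from typing import Iterable, Optional


def first_importable(candidates: Iterable[str]) -> Optional[ModuleType]:
    """Return the first module that imports cleanly, or None if none do."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Covers ModuleNotFoundError as well; try the next candidate.
            continue
    return None


# Assumed module path, taken from the issue text. The result is None when
# markitdown is not installed or the module is absent.
caption_module = first_importable(["markitdown.converters._llm_caption"])
```

Since the caption path is optional, returning None lets the caller skip captioning instead of failing at import time. Whether that is the right behavior for markitdown_ocr is a design choice for the maintainers.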
