PptxConverter threw TypeError with message: '<' not supported between instances of 'NoneType' and 'Emu' #1293

Open
@lumpi101

I have a pptx file that raises an error when I try to convert it with MarkItDown().convert().

The traceback is:

Traceback (most recent call last):
  [...]
  File "/var/lang/lib/python3.12/site-packages/markitdown/_markitdown.py", line 273, in convert
    return self.convert_stream(source, stream_info=stream_info, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.12/site-packages/markitdown/_markitdown.py", line 361, in convert_stream
    return self._convert(file_stream=stream, stream_info_guesses=guesses, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lang/lib/python3.12/site-packages/markitdown/_markitdown.py", line 600, in _convert
    raise FileConversionException(attempts=failed_attempts)
markitdown._exceptions.FileConversionException: File conversion failed after 1 attempts:
 - PptxConverter threw TypeError with message: '<' not supported between instances of 'NoneType' and 'Emu'

The problematic pptx is confidential, so I cannot provide it. However, I was able to pin the bug down to this line of code. For one shape, shape.top is indeed None, so the sort fails. The problematic shape appears to be empty anyway.
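Since the file itself cannot be shared, the failure can be illustrated without it. The following is a minimal sketch, assuming the converter sorts shapes by a (top, left) key, which the error message suggests but which is not shown here. `Emu` and `FakeShape` are hypothetical stand-ins, not the python-pptx classes.

```python
from operator import attrgetter


class Emu(int):
    """Stand-in for a length type that subclasses int (assumption)."""


class FakeShape:
    """Hypothetical stand-in for a slide shape with an optional position."""

    def __init__(self, name, top, left):
        self.name = name
        self.top = top
        self.left = left


shapes = [
    FakeShape("title", Emu(100), Emu(50)),
    FakeShape("empty", None, None),  # mimics the shape whose top is None
    FakeShape("body", Emu(300), Emu(50)),
]


def try_sort(items):
    """Return the error message if sorting by position fails, else None."""
    try:
        sorted(items, key=attrgetter("top", "left"))
    except TypeError as exc:
        return str(exc)
    return None


error = try_sort(shapes)
print(error)
```

Comparing None against an int-like value with `<` is not defined in Python 3, so a single shape without a position is enough to make the whole sort fail.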

Currently, I use a very ugly monkey patch:

def _shape_filter(s):
    return not (  # if "top" and "left" attributes exist, both of them must not be None
        hasattr(s, "top") and hasattr(s, "left") and (s.top is None or s.left is None)
    )

def _mock_sorted(iterable, **kwargs):
    iterable = (it for it in iterable if _shape_filter(it))
    return sorted(iterable, **kwargs)

from markitdown.converters import _pptx_converter
_pptx_converter.sorted = _mock_sorted  # type: ignore
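A less invasive alternative to filtering might be a sort key that tolerates missing positions, so that no shape is dropped. This is only a sketch: `position_key` is a hypothetical helper, not part of markitdown, and treating a missing position as 0 (which sorts such shapes first) is an assumption about the desired ordering.

```python
from types import SimpleNamespace


def position_key(shape):
    """Sort key that treats a missing top/left as 0 instead of failing."""
    top = getattr(shape, "top", None)
    left = getattr(shape, "left", None)
    return (0 if top is None else top, 0 if left is None else left)


# Stand-in objects; real shapes would come from a parsed slide.
shapes = [
    SimpleNamespace(name="body", top=300, left=50),
    SimpleNamespace(name="empty", top=None, left=None),
    SimpleNamespace(name="title", top=100, left=50),
]

ordered = [s.name for s in sorted(shapes, key=position_key)]
print(ordered)  # ['empty', 'title', 'body']
```

Unlike the monkey patch above, this keeps the shape without a position in the output, which matters if such a shape ever carries content.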
