bug/ImportError-unstructured.partition.pdf #523

@mtb-beta

Describe the bug

Importing unstructured.partition.pdf raises an ImportError.

To Reproduce

The issue can be reproduced by running the following in a Python shell:

>>> import unstructured.partition.pdf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/mypath/venv/lib/python3.9/site-packages/unstructured/partition/pdf.py", line 6, in <module>
    from pdfminer.utils import open_filename
ImportError: cannot import name 'open_filename' from 'pdfminer.utils' (/mypath/venv/lib/python3.9/site-packages/pdfminer/utils.py)

Expected behavior

At a minimum, importing unstructured.partition.pdf should not raise an ImportError.

Screenshots

None

Desktop (please complete the following information):

  • OS: Mac
  • Python version: 3.9.16

Additional context

This issue occurs in version 0.6.2 of the unstructured package. It does not occur in version 0.6.1.
Also, pdfminer has been in maintenance mode since 2020, and its latest version does not have open_filename in pdfminer.utils.

euske/pdfminer@7ed7d4e
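
To confirm whether a given environment is affected, a small check along these lines may help. It is only a sketch: `module_provides` is a hypothetical helper name, not part of unstructured or pdfminer, and it only reports whether the installed `pdfminer.utils` exposes `open_filename`.

```python
import importlib


def module_provides(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`.

    Hypothetical helper for diagnosing this issue; not part of any library.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module (or its parent package) is not installed at all.
        return False
    return hasattr(module, attr)


# False here means the top-level import in unstructured/partition/pdf.py
# would fail in this environment.
print(module_provides("pdfminer.utils", "open_filename"))
```

Running this inside the same virtualenv as the failing import should show whether the installed pdfminer is the one missing the name.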

The issue appears to have surfaced because, as of this commit, the import statement is defined at module global scope, so it runs as soon as the module is imported:

894a190#diff-cefa2d296ae7ffcf5c28b5734d5c7d506fbdb225c05a0bc27c6b755d5424ffda
