extracting images from a document #108

@peczony

Hi,

my task is to parse some docx files, extracting plain text and images from them.

Am I right in thinking (after reading the docs and playing around with some code) that there is currently no way to extract images from a document? I know that the document has an inline_shapes property, but:

  • the run doesn’t have one, so I can’t establish the link between an image and a run
  • InlineShape doesn’t seem to have a method for saving an image to disk

I suspect the best option I have is to unzip the docx and parse the XML myself. Is that right?
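For reference, here is a rough sketch of what I mean by the unzip-and-parse route, using only the standard library. It assumes the usual docx layout, where embedded pictures sit under word/media/ and word/_rels/document.xml.rels maps relationship ids to those files. The function names extract_images and images_by_run are my own and are not part of python-docx, and I have not checked this against unusual documents (linked rather than embedded images, text boxes, headers and footers).

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path, PurePosixPath

# Namespace URIs used by WordprocessingML, DrawingML and package relationships.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
A = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
R = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}"
REL = "{http://schemas.openxmlformats.org/package/2006/relationships}"


def extract_images(docx_path, out_dir):
    """Copy every file stored under word/media/ into out_dir (hypothetical helper)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Assumption: embedded pictures live under word/media/.
            if name.startswith("word/media/") and not name.endswith("/"):
                target = out_dir / PurePosixPath(name).name
                target.write_bytes(archive.read(name))
                written.append(target)
    return written


def images_by_run(docx_path):
    """Return (run_index, relationship_target) pairs for runs that embed a picture."""
    with zipfile.ZipFile(docx_path) as archive:
        rels = ET.fromstring(archive.read("word/_rels/document.xml.rels"))
        body = ET.fromstring(archive.read("word/document.xml"))
    # Map relationship ids (e.g. "rId5") to their targets (e.g. "media/image1.png").
    targets = {rel.get("Id"): rel.get("Target") for rel in rels.iter(REL + "Relationship")}
    pairs = []
    # Runs are numbered in document order across the whole main document part.
    for index, run in enumerate(body.iter(W + "r")):
        for blip in run.iter(A + "blip"):
            rid = blip.get(R + "embed")
            if rid in targets:
                pairs.append((index, targets[rid]))
    return pairs
```

Calling extract_images("report.docx", "images") would dump the pictures into an images folder, and images_by_run("report.docx") would give the link between run positions and image files that I could not find through the run objects (report.docx is just a placeholder name).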
