How to parse embedded files (OLE objects) in pptx/docx #644
Comments
Use the following code example to get the OLE objects from the first slide of a presentation:
Hi adamshakhabov, thanks for your reply! As far as I know, the OLE object is stored in the embedded object parts (X.MainDocumentPart.EmbeddedObjectParts). I am asking for a way to parse the OLE object, not just to get it.
Hi @hong1997! I don't think the Open XML SDK has a specific method for reading an OLEObject element (parsing its properties). Can you say more precisely which feature of the OLEObject you are trying to parse? It would also help if you attached a pptx file containing this OLEObject case.
@hong1997 and @adamshakhabov, GitHub issues are not the place to ask and discuss questions about Open XML SDK library usage. You should ask usage-related questions on stackoverflow.com, where you will already find a large number of tagged questions and answers. In this specific case, another user already asked how to extract OLE-embedded files from Word documents, and I provided an accepted answer.
@ThomasBarnekow, thanks for the info, I will close the issue. However, the answer you provided handles only one kind of OLE structure. As you can see from my description, only the last kind of OLE object can be handled by the class you provided.
Some OLE objects can be shown as a WMF image, because they contain a fallback element. Here is my code that saves the fallback element to a file: https://github.com/lindexi/lindexi_gd/tree/d182ca9f0cece56d32a801923a1fdffa64f95dfd/NallwerewawchailawileeForeehakel . Some OLE objects can be converted using WinForms; see DotNet Heaven: Read OLE Object type image field in C#.net
Thanks everyone for an interesting discussion. This looks to have been resolved so I'll close the issue.
How to parse embedded files (OLE objects) in pptx/docx.
They are mostly OLE objects, such as object1.bin.
Is there a good way to parse them?
After unzipping the OLE object, there are several kinds of format:
I didn't find a good general way to handle all of them.
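For the binary `.bin` variety, one possible first step is to check whether the part is an OLE compound file and read a few header fields. The sketch below is in Python rather than C# and does not use the Open XML SDK. The function name `read_cfb_header` is hypothetical, and the field offsets are taken from the compound file binary format (signature at offset 0, versions at 24 and 26, byte order at 28, sector shift at 30). It only inspects the header and does not walk the streams inside the file.

```python
import struct

# 8-byte signature that starts every OLE compound file.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def read_cfb_header(data: bytes) -> dict:
    """Return basic header fields of an OLE compound file (hypothetical helper)."""
    if len(data) < 32 or data[:8] != CFB_SIGNATURE:
        raise ValueError("not an OLE compound file")
    # Four little-endian 16-bit fields starting at offset 24.
    minor, major, byte_order, sector_shift = struct.unpack_from("<HHHH", data, 24)
    return {
        "major_version": major,
        "minor_version": minor,
        "byte_order": byte_order,
        "sector_size": 1 << sector_shift,  # sector size is 2 ** sector_shift
    }
```

Parts that fail this check are not compound files and need a different handler.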
I checked the source code of the Tika parser; it extracts them using a rule-based method...
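A rule-based approach of that kind could start by classifying each embedded part by its leading bytes. This is a minimal Python sketch, not Tika's actual logic and not an Open XML SDK API. It assumes that embedded objects live under an `embeddings/` folder inside the package (the location Office normally uses, for example `ppt/embeddings/` or `word/embeddings/`). The names `classify` and `list_embedded_parts` are hypothetical.

```python
import zipfile

# Leading bytes used to guess the container format of an embedded part.
OLE_CFB_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file, e.g. oleObject1.bin
ZIP_MAGIC = b"PK\x03\x04"  # ZIP-based package, e.g. an embedded .xlsx or .docx


def classify(data: bytes) -> str:
    """Guess the container type from the first bytes of a part."""
    if data.startswith(OLE_CFB_MAGIC):
        return "ole-compound-file"
    if data.startswith(ZIP_MAGIC):
        return "zip-package"
    return "unknown"


def list_embedded_parts(package) -> dict:
    """Map each part under */embeddings/ to its guessed container type.

    `package` is a path or a file-like object for a .pptx/.docx file.
    """
    result = {}
    with zipfile.ZipFile(package) as zf:
        for name in zf.namelist():
            # Assumption: embedded objects are stored under an embeddings/ folder.
            if "/embeddings/" in name and not name.endswith("/"):
                with zf.open(name) as part:
                    result[name] = classify(part.read(8))
    return result
```

Each class can then be routed to its own handler: compound files to an OLE stream reader, ZIP packages back to an Office document parser, and anything unknown to a fallback.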