doc: added FAQ
decalage2 committed Jan 26, 2018
1 parent 8d04075 commit 93d0bde
Showing 2 changed files with 28 additions and 0 deletions.
27 changes: 27 additions & 0 deletions doc/FAQ.rst
@@ -0,0 +1,27 @@
==========================
Frequently Asked Questions
==========================

Can I extract all images from MS OLE2 documents with olefile?
-------------------------------------------------------------

Not directly: images are not always stored the same way, and the storage layout depends on the file format.

For example, in PowerPoint presentations you may find a stream named "Pictures"
when running "olefile yourfile.ppt". You can extract that stream by calling the
openstream() method on the OleFileIO object, but you will usually get a binary
stream containing several picture files. You can also extract it manually using
tools such as SSView (http://www.mitec.cz/ssv.html).
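
As a minimal sketch of the openstream() approach described above: the helper names read_stream and dump_pictures, the default stream name and the output path are illustrative assumptions, not part of olefile. Only OleFileIO, exists(), openstream() and close() are olefile calls.

```python
def read_stream(ole, stream_name="Pictures"):
    """Return the raw bytes of a stream, or None if the stream is absent.

    `ole` is expected to be an open olefile.OleFileIO object.
    """
    if not ole.exists(stream_name):
        return None
    # openstream() returns a file-like object positioned at the start.
    return ole.openstream(stream_name).read()


def dump_pictures(ole_path, out_path="Pictures.bin"):
    """Hypothetical helper: save the "Pictures" stream of ole_path to out_path.

    Returns True if the stream existed and was written, False otherwise.
    """
    import olefile  # third-party package: pip install olefile

    ole = olefile.OleFileIO(ole_path)
    try:
        data = read_stream(ole)
    finally:
        ole.close()
    if data is None:
        return False
    with open(out_path, "wb") as out:
        out.write(data)
    return True
```

Calling dump_pictures("yourfile.ppt") would only save the raw stream. The individual pictures inside it still have to be separated afterwards.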

Then the only way I've found so far is to use file carving tools, which can
determine where each picture begins and ends in a binary file.
These tools are not always easy to use, but if you're interested, have a look
at http://pypi.python.org/pypi/hachoir-subfile
and http://www.forensicswiki.org/wiki/Tools:Data_Recovery#Carving.
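
To give an idea of what carving means, here is a deliberately naive sketch that scans a byte string for PNG and JPEG start and end signatures. It is not how hachoir-subfile or the other tools work internally, and the function name carve_images is made up for this example. It assumes each picture ends at the first end marker found after its start, which can cut a JPEG short if it contains an embedded thumbnail, and it ignores every other image format.

```python
# Well-known file signatures (start marker, end marker).
SIGNATURES = [
    # PNG: 8-byte signature; ends with the IEND chunk type and its fixed CRC.
    ("png", b"\x89PNG\r\n\x1a\n", b"IEND\xaeB`\x82"),
    # JPEG: SOI marker followed by another marker; ends with the EOI marker.
    ("jpeg", b"\xff\xd8\xff", b"\xff\xd9"),
]


def carve_images(data):
    """Return a list of (kind, image_bytes) pairs found in data.

    Naive approach: take the earliest start signature, then cut at the
    first matching end signature that follows it.
    """
    found = []
    pos = 0
    while pos < len(data):
        candidates = []
        for kind, start, end in SIGNATURES:
            i = data.find(start, pos)
            if i != -1:
                candidates.append((i, kind, start, end))
        if not candidates:
            break
        i, kind, start, end = min(candidates)  # earliest start offset wins
        j = data.find(end, i + len(start))
        if j == -1:
            pos = i + 1  # no end marker: skip this start and keep scanning
            continue
        stop = j + len(end)
        found.append((kind, data[i:stop]))
        pos = stop
    return found
```

Each carved item could then be written to its own file, for example with a name built from its position in the list and its kind. Real carving tools validate the internal structure of each candidate instead of trusting the markers alone.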

If you really need to automate the process, then you will have to study the Microsoft
specifications (at http://www.microsoft.com/interop/docs/officebinaryformats.mspx)
and find the right way to parse MS Office documents...

A lot of people (including me) would be very interested if you find a solution! ;-)

1 change: 1 addition & 0 deletions doc/index.rst
@@ -35,6 +35,7 @@ Microscopy file formats, McAfee antivirus quarantine files, etc.
Howto
OLE_Overview
olefile
FAQ



0 comments on commit 93d0bde
