olevba : raises an error with an unknown docx #711

Closed
jcmbs opened this issue Sep 1, 2021 · 3 comments · Fixed by #716

jcmbs commented Sep 1, 2021

Hi,

olevba raises an error when it encounters a docx document that it cannot identify:

olevba 0.60 on Python 3.9.2 - http://decalage.info/python/oletools
ERROR    Unhandled exception in main: expected str, bytes or os.PathLike object, not NoneType
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 4668, in main
    curr_return_code = process_file(filename, data, container, options)
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 4477, in process_file
    vba_parser = VBA_Parser_CLI(filename, data=data, container=container,
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 4029, in __init__
    super(VBA_Parser_CLI, self).__init__(*args, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 2773, in __init__
    self.open_openxml(_file)
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 2897, in open_openxml
    self.append_subfile(filename=subfile, data=ole_data)
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 3175, in append_subfile
    self.ole_subfiles.append(VBA_Parser(filename, data, container,
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 2770, in __init__
    self.open_ppt()
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 3109, in open_ppt
    self.append_subfile(None, vba_data, container='PptParser')
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 3175, in append_subfile
    self.ole_subfiles.append(VBA_Parser(filename, data, container,
  File "/usr/local/lib/python3.9/dist-packages/oletools/olevba.py", line 2754, in __init__
    self.ftg = ftguess.FileTypeGuesser(self.filename, data=data)
  File "/usr/local/lib/python3.9/dist-packages/oletools/ftguess.py", line 658, in __init__
    if FType_Generic_OpenXML.recognize(self):
  File "/usr/local/lib/python3.9/dist-packages/oletools/ftguess.py", line 447, in recognize
    main_part_ext = os.path.splitext(main_part)[1][1:]
  File "/usr/lib/python3.9/posixpath.py", line 118, in splitext
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType

In this case, main_part was still None in FType_Generic_OpenXML.recognize when it was passed to os.path.splitext (os.path.splitext(main_part)[1][1:]).
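
For anyone who wants to see the failure in isolation, here is a minimal sketch that does not use oletools at all. It only assumes that main_part is None at the point of the call, as in the traceback above; the variable names are illustrative.

```python
import os

# Assumption: this mirrors the state inside recognize() when no main
# relationship part was found, so main_part was never assigned a path.
main_part = None

try:
    # Same expression as in the traceback: extension without the leading dot.
    main_part_ext = os.path.splitext(main_part)[1][1:]
except TypeError as exc:
    # os.path.splitext() calls os.fspath(), which rejects None.
    error_message = str(exc)
    print("TypeError:", error_message)
```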

Further up in the code there is a commented-out section that deals with this case. Since I don't know why that section is commented out, I just added a check that works around the problem for now:

--- ftguess.py.orig     2021-09-01 10:51:53.560000000 +0200
+++ ftguess.py  2021-08-31 17:43:37.404000000 +0200
@@ -414,6 +414,8 @@
         # else:
         #     # TODO: log error, raise anomaly (or maybe it's the case for XPS?)
         #     return False
+        if main_part is None:
+            return False
         # parse content types, find content type of main part
         try:
             content_types = ftg.zipfile.read('[Content_Types].xml')
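
The same guard can be sketched outside of ftguess as a small standalone helper. Note that main_part_extension is a hypothetical name used only for illustration, not a function in oletools; the real patch returns False from recognize() instead.

```python
import os
from typing import Optional


def main_part_extension(main_part: Optional[str]) -> Optional[str]:
    """Return the extension of the main part without the dot, or None.

    Hypothetical helper: it applies the same "is None" check as the patch
    before calling os.path.splitext(), so a package without a known main
    part no longer raises TypeError.
    """
    if main_part is None:
        # No main relationship part was found: report "unknown" instead of crashing.
        return None
    return os.path.splitext(main_part)[1][1:]


print(main_part_extension("word/document.xml"))
print(main_part_extension(None))
```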

Since the offending document contains confidential information, I can't share it in this issue.

Version information:

  • OS: Linux
  • OS version: Debian 11.0 - 64 bits
  • Python version: 3.9.2 64 bits
  • oletools version: 0.60

Regards,

@decalage2 decalage2 self-assigned this Sep 1, 2021
@decalage2 decalage2 added this to the oletools 0.60 milestone Sep 1, 2021
@christian-intra2net
Contributor

I can have a look at this; I encountered a similar error when running olevba on a file from the unit tests. I am currently extending ftguess, so I think I can handle this case as well.

christian-intra2net added a commit to christian-intra2net/oletools that referenced this issue Oct 1, 2021
Return False if OpenXML type has no known main relationship part.
Otherwise recognize() will raise an error a few lines later at
splitext().

This should solve issue decalage2#711 (author had suggested exactly this)
christian-intra2net added a commit to christian-intra2net/oletools that referenced this issue Oct 5, 2021
christian-intra2net added a commit to christian-intra2net/oletools that referenced this issue Oct 6, 2021
@decalage2
Owner

I get the same error with XPS files, because they have a different relationship URI.
But if I try to edit Office documents (Word, Excel, etc.) to trigger this error, the resulting documents are considered malformed by MS Office and cannot be opened.
@jcmbs could you please tell me what the file format of the document that triggered this error is?
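
To illustrate the relationship lookup discussed here, below is a rough sketch that reads _rels/.rels from a zip package and looks for a known main relationship type. It is not the ftguess implementation: find_main_part and make_package are hypothetical helpers, the packages are built in memory for the demo, and the XPS relationship URI is given as I understand it from the XPS format, so treat it as an assumption.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the package relationships file (_rels/.rels).
NS_RELS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Relationship type pointing at the main part of an Office OpenXML document.
RELTYPE_OFFICE_DOC = ("http://schemas.openxmlformats.org/officeDocument/"
                      "2006/relationships/officeDocument")
# Assumption: relationship type used by XPS packages for their root part.
RELTYPE_XPS = "http://schemas.microsoft.com/xps/2005/06/fixedrepresentation"


def find_main_part(zip_bytes, known_types=(RELTYPE_OFFICE_DOC,)):
    """Return the Target of the first known main relationship, or None."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        root = ET.fromstring(zf.read("_rels/.rels"))
    for rel in root.iter("{%s}Relationship" % NS_RELS):
        if rel.get("Type") in known_types:
            return rel.get("Target")
    return None  # unknown relationship type: callers must handle this case


def make_package(rel_type, target):
    """Build a tiny in-memory package containing only _rels/.rels (demo only)."""
    rels = ('<?xml version="1.0" encoding="UTF-8"?>'
            '<Relationships xmlns="%s">'
            '<Relationship Id="rId1" Type="%s" Target="%s"/>'
            '</Relationships>') % (NS_RELS, rel_type, target)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("_rels/.rels", rels)
    return buf.getvalue()


docx_like = make_package(RELTYPE_OFFICE_DOC, "word/document.xml")
xps_like = make_package(RELTYPE_XPS, "/FixedDocumentSequence.fdseq")
print(find_main_part(docx_like))
print(find_main_part(xps_like))
```

With only the Office relationship type in known_types, the XPS-like package yields None, which is the situation that led to the TypeError in this issue.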

@decalage2 decalage2 linked a pull request Nov 2, 2021 that will close this issue
@decalage2
Owner

Also, @jcmbs, could you please run ftguess -l debug on the document, so that I can see what the relationships look like?

c-rosenberg pushed a commit to HeinleinSupport/oletools that referenced this issue Dec 2, 2021