Don't recurse into every container type #73

Merged (1 commit) on May 17, 2016

mistydemeo

1b5698a updated self.container_type() to recognize OLE as an additional format, but this broke the -zip switch. The method was being used to identify two different categories of formats:

  1. Container formats which need to be matched against the PRONOM container signatures in order to get more precise matches; and
  2. Container formats which can be recursed into via the -zip switch in order to identify the formats of their contents.

FIDO supports OLE for the former but not the latter. Since the formats of an OLE file's contents are usually not of interest, it doesn't make sense to support recursing into it.

This commit adds a new method that reports whether FIDO should recurse into a format, not merely whether it is a container format, and updates the -zip path to use it for its check.
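
For readers who want the shape of the change without opening the diff, here is a minimal sketch of the split between the two questions. It is not the code in this commit: the names `CONTAINER_TYPES`, `RECURSABLE_TYPES`, `can_recurse_into`, and `plan` are hypothetical, `container_type` is simplified to take a PUID string, and the PUID table is an illustrative assumption (only fmt/111 for OLE2 appears in this thread).

```python
from typing import Dict, Optional

# Assumed mapping from PRONOM PUID to container kind (illustrative only).
CONTAINER_TYPES: Dict[str, str] = {
    "x-fmt/263": "zip",
    "x-fmt/265": "tar",
    "fmt/111": "ole",
}

# Hypothetical subset of container kinds that -zip should descend into.
RECURSABLE_TYPES = frozenset({"zip", "tar"})


def container_type(puid: str) -> Optional[str]:
    """Category 1: is this a container that has container signatures?"""
    return CONTAINER_TYPES.get(puid)


def can_recurse_into(kind: Optional[str]) -> bool:
    """Category 2: should the -zip switch look inside this container?"""
    return kind in RECURSABLE_TYPES


def plan(puid: str, zip_switch: bool) -> Dict[str, bool]:
    """Decide both questions independently for one identified file."""
    kind = container_type(puid)
    return {
        "match_container_signatures": kind is not None,
        "recurse": zip_switch and can_recurse_into(kind),
    }
```

With this split, an OLE file still gets container signature matching, but `plan("fmt/111", True)["recurse"]` is false, so the -zip path never tries to open it as an archive.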

Fixes #72.

sevein commented May 16, 2016

👍

@jhsimpson

I tested with the same .doc file, looks good:

fido -zip ~/Downloads/embedded_video_quicktime.doc
FIDO v1.3.4 (formats-v84.xml, container-signature-20160121.xml, format_extensions.xml)
bad repeat interval
bad repeat interval
OK,171,fmt/111,"OLE2 Compound Document Format","OLE2 Compound Document Format",26624,"/home/jhs/Downloads/embedded_video_quicktime.doc","None","signature"
FIDO: Processed 1 files in 233.11 msec, 4 files/sec

@jhsimpson jhsimpson merged commit 217ab26 into master May 17, 2016
@Hwesta Hwesta deleted the dont_recurse_into_ole branch October 25, 2016 20:27
@Hwesta Hwesta added this to the 1.3.4 milestone Oct 27, 2016