
Metadata Inspection Database Alerting System

This project automates the inspection and databasing of the metadata contained in files destined for an organization. Files are generally collected by dumping email attachments with YARA, but collection could also be automated with NetWitness or another full-pcap tool, or by iterating through file servers looking for suspicious files.

Alternatively, this can be used to look for heuristic anomalies in existing collections of files both malicious and benign.

MIDAS Requires:

Yara 1.6
Yara Python 1.6
MongoDB 2.0+
PyMongo 2.2+
Python 2.7
Exiftool 9.0+

Optional, if you want SSDeep fuzzy hashing: ssdeep / pyssdeep

This program uses PyExifData to extract metadata and PyMongo to interface with a local MongoDB instance, which stores the extracted data for later queries and tracking. Files and extracted metadata are also scanned by Yara. Alerts are written to the logs and, along with an MD5 hash and an SSDeep fuzzy hash, are placed in the JSON that is sent to the database.

Latest Changes

Version .11a
Added midas-settings.cfg file for database config and yararules/log file config, that way I can keep it in one place to make a tool to search that DB later, and it keeps the user out of the source of
Version .10a
Added full-file Yara scanning (-f or --fullyara). This can be resource intensive if you have a lot of rules. It alerts to the logs at WARNING level and pushes all alerts for a file into the DB in the JSON.
Version .09a
Added SSDeep fuzzy hashing with the (-S or --SSDeep) flag, saved in the JSON under the ['SSDeep'] key. New dependencies: ssdeep/pyssdeep (if you don't want to use this you can just never use the flag and delete the include ssdeep from the head of
Version .07a
Added ['YaraAlerts'] Key to Metadata JSON which will save the yara rule hits to the database entry for each file.


Install all of the prereqs listed above.
Place, midasdb.cfg, and midasyararules.yar in a directory which is NOT the path to be scanned.
Configure your DB server / DB / collection info inside midas-settings.cfg (note: it comes set up to connect to localhost:27017, DB = test, Collection = metadata).
You can also set a Yara rules file and designate a log file in midas-settings.cfg (defaults are midasyararules.yar and midas.log).
PROFIT!

  • Currently the program works to extract exif data from all files in a given directory.
  • It computes an MD5 hash and a time stamp for each file and adds them to the JSON.
  • It then adds the metadata in JSON format to a MongoDB collection of your choosing.
  • Then it uses Yara to perform detections on the metadata (or on whole files with -f). Detections are logged at WARNING level for easy identification and are also added to the JSON data under the key 'YaraAlerts'.
  • It then has the ability to either (-d) delete or (-m) move files once scanned to a configurable destination.
  • It will then pause 15 seconds (configurable with -s) and repeat this process with no further interaction, logging all DB Submissions, and file moves/deletes
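One pass of the steps above can be sketched roughly as follows. This is a minimal illustration and not the MIDAS source: the function names md5_of_file and scan_once are hypothetical, only the standard library is used, and the metadata extraction, Yara scan and database insert are left as a comment. The record keys mirror the ones shown in the database example in this README.

```python
import hashlib
import os
import shutil
import time


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_once(root, move_to=None, delete=False):
    """Walk root once and return one record (dict) per file found.

    Hypothetical helper: assumes move_to, if given, lies outside root.
    """
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            record = {
                "_id": md5_of_file(path),
                "File:FileName": name,
                # Key spelling follows the sample record in this README.
                "File:DateTimeRecieved": time.strftime("%Y:%m:%d %H:%M:%S"),
            }
            records.append(record)
            # Metadata extraction, Yara scanning and the DB insert would go here.
            if delete:
                os.remove(path)
            elif move_to is not None:
                shutil.move(path, os.path.join(move_to, name))
    return records
```

A caller could then repeat scan_once in a loop with time.sleep(15) between passes to approximate the behaviour described in the last bullet.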

Please contact me at with any questions.


MIDAS will create the collection if it doesn't exist
Enter DB info below:
server: localhost
port: 27017
db: test
collection: metadata
General Settings
logs: midas.log
yararules: midasyararules.yar
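
Settings like these can be read with Python's standard config parser. The sketch below is an assumption-heavy illustration: the section names [Database] and [General] are hypothetical (the listing above does not show the file's section headers), load_settings is not a MIDAS function, and the module is spelled configparser as in Python 3 (it is ConfigParser in Python 2.7).

```python
import configparser

# Hypothetical layout: the section names are assumed, the keys and values
# are the defaults listed in this README.
SAMPLE = """\
[Database]
server: localhost
port: 27017
db: test
collection: metadata

[General]
logs: midas.log
yararules: midasyararules.yar
"""


def load_settings(text):
    """Parse the settings text and return a flat dict of the values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "server": parser.get("Database", "server"),
        "port": parser.getint("Database", "port"),  # port is needed as an int
        "db": parser.get("Database", "db"),
        "collection": parser.get("Database", "collection"),
        "logs": parser.get("General", "logs"),
        "yararules": parser.get("General", "yararules"),
    }


settings = load_settings(SAMPLE)
```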

USAGE Example:

usage: [-h] [-S] [-d] [-m MOVE] [-s SLEEP] Path

Metadata Inspection Database Alerting System

positional arguments:
Path Path to directory of files to be scanned (Required)

optional arguments:
-h, --help show this help message and exit
-d, --delete Deletes files after extracting metadata (Default: False)
-S, --SSDeep Perform ssdeep fuzzy hashing of files and store in DB (Default: False)
-f, --fullyara Scan the entirety of each file with Yara (Default: Only Metadata is scanned)
-m MOVE, --move MOVE Where to move files to once scanned (Default: Files are Not Moved)
-s SLEEP, --sleep SLEEP Time in seconds to sleep between scans (Default: 15 sec)
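
For readers who want to script around these options, the help text above could be produced by an argparse setup along these lines. This is a reconstruction from the help output, not the actual MIDAS code, and build_parser is a hypothetical name. Note that -S (SSDeep) and -s (sleep) differ only in case.

```python
import argparse


def build_parser():
    """Build a parser matching the options listed in the help text above."""
    parser = argparse.ArgumentParser(
        description="Metadata Inspection Database Alerting System")
    parser.add_argument(
        "Path", help="Path to directory of files to be scanned (Required)")
    parser.add_argument(
        "-d", "--delete", action="store_true",
        help="Deletes files after extracting metadata (Default: False)")
    parser.add_argument(
        "-S", "--SSDeep", action="store_true",
        help="Perform ssdeep fuzzy hashing of files and store in DB")
    parser.add_argument(
        "-f", "--fullyara", action="store_true",
        help="Scan the entirety of each file with Yara")
    parser.add_argument(
        "-m", "--move", default=None,
        help="Where to move files to once scanned")
    parser.add_argument(
        "-s", "--sleep", type=int, default=15,
        help="Time in seconds to sleep between scans")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(
    ["-S", "-f", "-m", "../2", "-s", "30", "../testmidas/"])
```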

What you see at the CLI Upon Execute:

~/MIDAS$ python -S -f -m ../2 -s 30 ../testmidas/

Scanning all files recursively from here: ../testmidas/
Logging all information to: midas.log
Using Yara Rule file: midasyararules.yar
Sleeping for: 30 seconds between iterations
All files will be moved to: ../2 once scanned
SSDeep fuzzy hashing is set to: True
Full file Yara scanning is set to: True
Delete after scanning is set to: False

This program will not terminate until you stop it. Enjoy!

LOGS Example:

INFO:root:Starting Midas with the following args: {'yararules': './midasyararules.yar', 'logs': './midas.log', 'move': None, 'sleep': 15, 'Path': '../testmidas/', 'delete': True}
INFO:root:2012:09:10 16:45:49: Metadata for july.swf MD5: ac97a9244a331ffd1f695d1a99485e5d added to database
INFO:root:2012:09:10 16:45:49:../testmidas/july.swf has been deleted.
INFO:root:2012:09:10 16:45:49: Metadata for 2.pdf MD5: 101c15e96c05c6ef289962f49f6dae87 added to database
WARNING:root:2012:09:10 16:45:49: Yara Matches for 2.pdf: [MetaData_PDF_Test] MD5: 101c15e96c05c6ef289962f49f6dae87
INFO:root:2012:09:10 16:45:49:../testmidas/2.pdf has been deleted.
INFO:root:2012:09:10 16:45:49: Metadata for 1.pdf MD5: 32d29ee5d36373a775c8f0776b2395bc added to database
WARNING:root:2012:09:10 16:45:49: Yara Matches for 1.pdf: [MetaData_PDF_Test, MetaData_Author_OracleReports_Test] MD5: 32d29ee5d36373a775c8f0776b2395bc
INFO:root:2012:09:10 16:45:49:../testmidas/1.pdf has been deleted.
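
Because Yara hits are the only WARNING lines, they are easy to pull back out of the log. The sketch below assumes the line layout shown in the example above stays stable; parse_alerts and ALERT_RE are hypothetical names and are not part of MIDAS.

```python
import re

# Matches lines such as:
# WARNING:root:2012:09:10 16:45:49: Yara Matches for 2.pdf: [Rule_A, Rule_B] MD5: <32 hex chars>
ALERT_RE = re.compile(
    r"^WARNING:root:(?P<when>\d{4}:\d{2}:\d{2} \d{2}:\d{2}:\d{2}): "
    r"Yara Matches for (?P<name>.+?): \[(?P<rules>[^\]]*)\] "
    r"MD5: (?P<md5>[0-9a-f]{32})$"
)


def parse_alerts(lines):
    """Return one dict per Yara alert line; all other lines are ignored."""
    alerts = []
    for line in lines:
        match = ALERT_RE.match(line.strip())
        if match is None:
            continue
        rules = [r.strip() for r in match.group("rules").split(",") if r.strip()]
        alerts.append({
            "when": match.group("when"),
            "file": match.group("name"),
            "md5": match.group("md5"),
            "rules": rules,
        })
    return alerts
```

Typical use would be parse_alerts(open("midas.log")), with the log path taken from midas-settings.cfg.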

Info Inserted into database:

[_id] => 32d29ee5d36373a775c8f0776b2395bc
[SSDeep] => 3072:TlijdBnn/V8zhltU+AqblNIrrN2Ywzmr35DUQKn:ynihrrRNIXN2YwzmzU
[File:FileType] => PDF
[File:FileSize] => 107474
[File:DateTimeRecieved] => 2012:09:10 15:24:08
[PDF:PageCount] => 1
[PDF:Title] => ntlwr_folio_logo_mpg3153683.pdf
[PDF:Creator] => Oracle10gR2 AS Reports Services
[File:MIMEType] => application/pdf
[PDF:Author] => Oracle Reports
[PDF:PDFVersion] => 1.4
[PDF:Producer] => Oracle PDF driver
[YaraAlerts] => [MetaData_PDF_Test, MetaData_Author_OracleReports_Test]
[File:FileModifyDate] => 2012:09:10 14:41:14-04:00
[PDF:ModifyDate] => 2012:07:10 07:39:29
[PDF:CreateDate] => 2012:07:10 07:39:29
[File:FileName] => 221.pdf
[PDF:Linearized] =>

[_id] => ac97a9244a331ffd1f695d1a99485e5d
[SSDeep] => 3072:QeORGrBzIqh1olop2dqvsQuiatQq+SnDwURYjcaY3o/GKZRDwcQ:5ORGrBzXQqvsQuztQq+qkjJY3o/3zMcQ
[File:MIMEType] => application/x-shockwave-flash
[File:DateTimeRecieved] => 2012:09:10 15:24:08
[Flash:FileAttributes] => 25
[XMP:Creator] => unknown
[File:FileModifyDate] => 2012:09:10 14:41:02-04:00
[XMP:Format] => application/x-shockwave-flash
[Flash:Compressed] => 1
[Flash:FlashVersion] => 14
[File:FileSize] => 156778
[XMP:Publisher] => unknown
[Flash:ImageWidth] => 500
[Flash:FrameCount] => 1
[File:FileType] => SWF
[File:FileName] => 22july.swf
[Flash:ImageHeight] => 375
[XMP:Date] => 2012:8:15
[YaraAlerts] => None
[XMP:Description] =>
[XMP:Title] => Adobe Flex 4 Application
[Flash:Duration] => 0.041666666666667
[Composite:ImageSize] => 500x375
[Flash:FrameRate] => 24
[XMP:Language] => EN
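
Records like these can be queried back out with PyMongo. The following is a sketch under several assumptions: that a file with no hits is stored with a null YaraAlerts value (as the "None" above suggests), that the defaults from midas-settings.cfg are in use, and that a PyMongo version providing MongoClient is installed (older releases such as 2.2 used Connection instead). The function names are hypothetical.

```python
def alerted_query():
    """MongoDB filter for documents whose YaraAlerts field exists and is not null."""
    return {"YaraAlerts": {"$ne": None}}


def find_alerted(collection):
    """Return (_id, YaraAlerts) pairs for every document with Yara hits.

    `collection` is anything with a PyMongo-style find(filter, projection).
    """
    cursor = collection.find(alerted_query(), {"YaraAlerts": 1})
    return [(doc["_id"], doc["YaraAlerts"]) for doc in cursor]


def connect(server="localhost", port=27017, db="test", collection="metadata"):
    """Open the collection named in midas-settings.cfg (defaults shown)."""
    # Imported lazily so the rest of the sketch loads without PyMongo installed.
    from pymongo import MongoClient
    return MongoClient(server, port)[db][collection]
```

With a running MongoDB instance, find_alerted(connect()) would list the MD5 of each flagged file together with the rules it matched.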

Copyright & License Info:

MIDAS is copyrighted by Chris Clark 2012. Contact me at

MIDAS is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

MIDAS is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with MIDAS. If not, see
