dkovar edited this page Sep 7, 2012 · 2 revisions

Purpose:

analyzeMFT.py is designed to fully parse the MFT file from an NTFS filesystem and present the results as accurately as possible in a format that allows further analysis with other tools. At present, it reads an entire MFT through to the end without error, but it does not yet parse some of the attributes. These will be filled in as time permits.
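As a rough illustration of what "reading an entire MFT through to the end" involves, here is a minimal sketch. It is not taken from analyzeMFT.py itself. It assumes 1024-byte records (a common default, but not guaranteed on every volume), and `iter_mft_records` is a hypothetical helper name. Records normally begin with the signature `FILE`; the sketch only flags whether that signature is present and does not parse any attributes.

```python
import io

RECORD_SIZE = 1024  # assumption: common default record size; real volumes may differ


def iter_mft_records(stream, record_size=RECORD_SIZE):
    """Yield (index, raw_bytes, has_file_signature) for each fixed-size record.

    Hypothetical helper for illustration only.
    """
    index = 0
    while True:
        raw = stream.read(record_size)
        if len(raw) < record_size:
            break  # end of file, or a trailing partial record
        yield index, raw, raw[:4] == b"FILE"
        index += 1


# Usage with an in-memory stand-in for an extracted MFT file:
sample = io.BytesIO(b"FILE" + b"\x00" * 1020 + b"\x00" * 1024)
for idx, raw, ok in iter_mft_records(sample):
    print(idx, "FILE signature" if ok else "no FILE signature")
```

Records without the signature are reported rather than skipped, so a reader can cross-check them against another tool.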

Caution:

This code is very much under development. You should not depend on its results without double checking them against at least one other tool.

Output:

The output is currently written in CSV format. Because Excel automatically guesses the type of data in each column, it is recommended that you write the output to a file without the .csv extension, open it in Excel, and set all the columns to "Text" rather than "General" when the import wizard starts. Otherwise Excel will format the columns in a way that misrepresents the data.

I could pad the data in such a way that forces Excel to set the column type correctly but this might break other tools.
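One way such padding might look is sketched below. This is an illustration, not what analyzeMFT.py does. It relies on the spreadsheet convention that a cell written as `="..."` is evaluated as a formula returning a text string. `excel_text` and `write_rows` are hypothetical names, and the `excel_safe` switch is an assumed option, not an existing flag.

```python
import csv
import io


def excel_text(value):
    """Wrap a value as ="..." so Excel evaluates it to a text string."""
    escaped = str(value).replace('"', '""')  # double any embedded quotes
    return '="{}"'.format(escaped)


def write_rows(stream, rows, excel_safe=False):
    """Write rows as CSV, optionally padding every field for Excel."""
    writer = csv.writer(stream)
    for row in rows:
        writer.writerow([excel_text(v) for v in row] if excel_safe else row)


# Usage: compare the plain and padded forms of the same row.
row = ["42", "2012-09-07 10:15:00"]
for safe in (False, True):
    buf = io.StringIO()
    write_rows(buf, [row], excel_safe=safe)
    print(buf.getvalue().strip())
```

Any other CSV consumer sees the literal `="..."` wrapper in every field, which is exactly how this approach could break other tools. Keeping it behind an opt-in switch leaves the default output unchanged.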

Inspiration:

My original inspiration was a combination of MFT Ripper (thus the current output format) and the SANS 508.1 study guide. I couldn't bear to read about NTFS structures again, particularly since the information didn't "stick". I also wanted to learn Python so I figured that using it to tear apart the MFT file was a reasonably sized project.

Many of the variable names are taken directly from Brian Carrier's The Sleuth Kit. His code, plus his book "File System Forensic Analysis", was very helpful in my efforts to write this code.

The output format is almost identical to Mark Menz's MFT Ripper. His tool really inspired me to learn more about the structure of the MFT and to learn what additional information I could glean from the data.

I am also becoming much more interested in timeline analysis, and figured that really understanding the MFT and having a tool that could parse it might serve as a good foundation for further research in that area.

Future work:

(Move these to issues.)

  1. Figure out how to write the CSV file in a manner that forces Excel to interpret the date/time fields as text. If you add the .csv extension, Excel opens the file without invoking the import wizard, the date fields are treated as "General", and the date is truncated, leaving just the time.
  2. Add "extract" switch - extract or work on live MFT file
  3. Finish parsing all possible attributes
  4. Look into doing more timeline analysis with the information
  5. Improve the documentation so I can use the structures as a reference and reuse the code more effectively
  6. Clean up the code and, in particular, follow standard naming conventions. (Scratch that, needs deep rewrite.)
  7. Two MFT entry flags appear whose significance I can't determine. They are reported in the output as Unknown1 and Unknown2.
  8. Parse filename based on 'nspace' value in FN structure
  9. Output HTML as well as CSV
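
For item 7, the sketch below shows how the entry flags might be decoded. It assumes the standard MFT record header layout, in which a 2-byte little-endian flags field sits at byte offset 22, with 0x0001 meaning "in use" and 0x0002 meaning "directory". Mapping the labels Unknown1 and Unknown2 to bits 0x0004 and 0x0008 is an assumption made for illustration, and `decode_entry_flags` is a hypothetical helper, not a function from analyzeMFT.py.

```python
import struct

FLAGS_OFFSET = 22  # 2-byte little-endian flags field in the record header

FLAG_NAMES = {
    0x0001: "InUse",
    0x0002: "Directory",
    0x0004: "Unknown1",  # assumption: bit chosen for illustration
    0x0008: "Unknown2",  # assumption: bit chosen for illustration
}


def decode_entry_flags(record):
    """Return the names of the flag bits set in one raw MFT record."""
    (flags,) = struct.unpack_from("<H", record, FLAGS_OFFSET)
    return [name for bit, name in FLAG_NAMES.items() if flags & bit]


# Usage with a synthetic record that has the first two bits set:
record = bytearray(1024)
struct.pack_into("<H", record, FLAGS_OFFSET, 0x0003)
print(decode_entry_flags(record))
```

Keeping the undetermined bits in the table as named placeholders means they still show up in the output and can be compared against other tools.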