Parser for $LogFile on NTFS
jschicht Version
Added missing usn reason and source codes.
Added missing reparse tags.
Fix bug in decode of $REPARSE_POINT attribute with type WCI.
Fixed bug in $EA attribute handling.
Latest commit 12fde5b May 20, 2017


Decode and dump $LogFile records and transaction entries.
Decode NTFS attribute changes.
Optionally resolve all datarun list information available in $LogFile. Option: "Reconstruct data runs".
Recover transactions from slack space within $LogFile.
Choose to reconstruct missing or damaged headers of transactions found in slack. Option: "Rebuild header". 
Optionally also fine-tune the result with an LSN error level value. Option: "LSN error level".
Logs to csv and imports to sqlite database with several tables.
Optionally import csv output of mft2csv into db.
Choose among 6 different timestamp formats.
Choose timestamp precision: None, MilliSec and NanoSec.
Choose Precision separator at millisec.
Choose Precision separator at nanosec.
Choose region adjustment for timestamps. Default is to present timestamps in UTC 0.0.
Choose output separator. Option: "Set separator".
Configurable UNICODE or ANSI output. Option "Unicode".
Configurable MFT record size (1024 or 4096). Option "MFT record size".
Optionally decode individual transactions or partial transactions (fragment).
Option to reconstruct RCRD's from single or multiple transactions (fragments).
Option to skip fixups (for broken $LogFile).
Detailed verbose output into debug.log.
Configurable comma separated list of lsn's to trigger ultra verbose information about specific transactions into debug.log.
Configuration for 32-bit OS.
Configuration for binary data extraction of resident data updates.
Autogenerated sql for importing output into MySql database.
Option to skip all sqlite3 stuff to speed up total parsing.
Optional command line mode. Supports errorlevel suitable for batch scripting. 

NTFS is designed as a recoverable filesystem. This is done through logging of all transactions that alter the volume structure. Any change to a file on the volume therefore requires something to be logged to the $LogFile too, so that the change can be reversed in case of system failure at any time. A lot of information is thus written to this file, and since it is circular, new transactions overwrite older records in the file. This limits how much historical data can be retrieved from it. How much depends on the type of volume and the size of the $LogFile. On the system drive of a frequently used system you will likely only get a few hours of history, whereas an external/secondary disk holding backup files would likely contain more historical information. A 2 MB file will also contain far less history than a 256 MB one. So in what size range can this file be configured? Anything from 256 KB and up. The value given to chkdsk is in KB, so configuring the size to 2 GB can be done like this: "chkdsk D: /L:2097152". How a large logfile impacts performance is beyond the scope of this text. Setting it lower than 2048 KB is normally not possible. However, it is possible by patching untfs.dll.

This parser will decode and dump lots of transaction information from the $LogFile on NTFS. Several csv's are generated, as well as an sqlite database named ntfs.db containing all relevant information. The output is extremely detailed and very low level, meaning it requires some decent NTFS knowledge in order to understand it. The currently handled Redo transaction types with meaningful output decode are:


The list of currently supported attributes:

So basically all attributes are supported.

Explanation of the different output generated:

The main csv generated from the parser.

The input information needed for reconstructing dataruns

The final output of reconstructed dataruns

All dumped and decoded index records (IndexRoot/IndexAllocation)

Same as LogFile.csv, but with filename information joined in from the $UsnJrnl or the csv of mft2csv.

Dummy $MFT recreated based on found MFT records in InitializeFileRecordSegment transactions. Can use mft2csv on this one (remember to configure "broken MFT" and "Fixups" properly).

Records for the $UsnJrnl that have been decoded within $LogFile.

All undo operations for clearing of directory indexes (INDX).

All headers of decoded transactions.

All decoded SetBitsInNonresidentBitMap operations.

LogFile_DirtyPageTable32bit.csv and LogFile_DirtyPageTable64bit.csv
All entries in every decoded DirtyPageTableDump operation for both 32bit and 64bit OS.

Decoded $ObjectId attributes.

All decodes from system file $ObjId:$O.

All entries in every decoded OpenAttributeTableDump operation.

All decodes from system file $Quota:$O.

All decodes from system file $Quota:$Q.

All headers of decoded RCRD records.

All decodes from system file $Reparse:$R.

All decodes from system file $Secure:$SDH.

All decodes from system file $Secure:$SII.

Decoded security descriptors. Source can be from $SECURITY_DESCRIPTOR or $Secure:$SDS.

All entries from decoded AttributeNamesDump transactions found in slack space.

All entries from decoded OpenAttributeTableDump transactions found in slack space.

Decoded TransactionTableDump transactions.

All resolved filenames with MftRef, MftRefSeqNo and Lsn.


All decodes of UpdateFileNameRoot and UpdateFileNameAllocation for both redo and undo operations.

All decodes of CompensationlogRecord. Not relevant for nt5.x.

An sqlite database file with tables almost equivalent to the above csv's. The database contains 5 tables:
LogFileTmp (temp table used when recreating dataruns).

Defaults are presented in UTC 0.00, and with nanosecond precision. The default format is YYYY-MM-DD HH:MM:SS:MSMSMS:NSNSNSNS. These can be configured. The different timestamps refer to:
CTime means File Create Time. 
ATime means File Modified Time. 
MTime means MFT Entry modified Time. 
RTime means File Last Access Time. 
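
To illustrate the default layout, here is a minimal sketch that renders a raw NTFS timestamp as text. It assumes the raw value is a standard 64-bit Windows FILETIME (a count of 100-nanosecond ticks since 1601-01-01 UTC). The function name and the two separator parameters are hypothetical and are not taken from the parser itself.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks from this instant (UTC).
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def format_filetime(ticks, ms_sep=":", ns_sep=":"):
    """Render a FILETIME tick count as YYYY-MM-DD HH:MM:SS<ms_sep>mmm<ns_sep>nnnn in UTC."""
    seconds, sub_ticks = divmod(ticks, 10_000_000)  # 10^7 ticks per second
    millis, rest = divmod(sub_ticks, 10_000)        # 10^4 ticks per millisecond
    stamp = FILETIME_EPOCH + timedelta(seconds=seconds)
    # "rest" is the remaining number of 100 ns units, shown as four digits.
    return stamp.strftime("%Y-%m-%d %H:%M:%S") + f"{ms_sep}{millis:03d}{ns_sep}{rest:04d}"

if __name__ == "__main__":
    print(format_filetime(116444736001234567))
```

Passing other separator strings mimics the "Precision separator" settings; dropping the last two parts corresponds to a precision of None.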

Reconstructing dataruns.

Many operations on the filesystem will trigger a transaction into the $LogFile. Those relating to the $DATA attribute, i.e. a file's content, are so far identified as InitializeFileRecordSegment, CreateAttribute, UpdateMappingPairs and SetNewAttributeSizes.


They all leave different information in the $LogFile. Resident data modifications behave differently and can not be reconstructed just like that, at least on NTFS volumes originating from modern Windows versions. 

InitializeFileRecordSegment is when a new file is created. Thus it will have the $FILE_NAME attribute, as well as the original $DATA attribute content, including dataruns. Since the $LogFile is circular, and older events get overwritten by newer ones, the challenge with the $LogFile is to get information from far enough back in time. However, if InitializeFileRecordSegment is present, then we should be able to reconstruct everything, since all records written after that will also be available. We will also have information about the offset to the datarun list. This is a relative offset calculated from the beginning of the $DATA attribute. This is important information to have when calculating where in the datarun list UpdateMappingPairs has done its modification.

CreateAttribute is the original attribute when it was first created (if not written as part of InitializeFileRecordSegment). With this one too, we should be able to reconstruct dataruns since we have all transactions available to us. However, this one will not in itself provide us with the file name. Here too, we have the offset to the datarun list available which is extremely useful when solving UpdateMappingPairs.

UpdateMappingPairs is a transaction when modifications to the $DATA/dataruns are performed (file content has changed). The information found in this transaction is not complete, and it contains just the new values added to the existing datarun list. It also contains a relative offset that tells us where in the datarun list the changes have been written. This offset is used in combination with the offset to datarun as found in InitializeFileRecordSegment and CreateAttribute. 
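
The offset arithmetic described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the parser's actual code: both offsets are taken to be relative to the start of the $DATA attribute, the update is assumed to replace everything from its position to the end of the list, and bytes that were never seen in the $LogFile are kept as unknown. All function names are hypothetical.

```python
def apply_mapping_pairs_update(runlist, datarun_offset, update_offset, patch):
    """Return a new run list with `patch` written at the position implied by the offsets.

    runlist is a list of ints (known bytes) or None (bytes never seen in the log).
    Assumption: the update rewrites the list from its position to the end.
    """
    position = update_offset - datarun_offset
    if position < 0:
        raise ValueError("update lies before the start of the datarun list")
    kept = list(runlist[:position])
    # Bytes between the known part and the patch were lost to the circular log.
    kept.extend([None] * (position - len(kept)))
    return kept + list(patch)

def render(runlist):
    """Hex string of the run list, with ** standing in for every unknown byte."""
    return "".join("**" if b is None else f"{b:02X}" for b in runlist)

if __name__ == "__main__":
    original = list(bytes.fromhex("2108341200"))  # as seen in CreateAttribute
    patch = bytes.fromhex("2110000100")           # as seen in UpdateMappingPairs
    print(render(apply_mapping_pairs_update(original, 0x40, 0x44, patch)))
    print(render(apply_mapping_pairs_update([], 0x40, 0x44, patch)))
```

The second call shows the case where the original attribute was never seen: the first four bytes stay unknown, so only a partial list can be produced.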

SetNewAttributeSizes is a transaction that contains information about any size value related modifications done to the $DATA attribute. This is tightly connected to UpdateMappingPairs which only contains datarun changes.

With the above 4 redo operations we can, for a given reference number (distinct file), reconstruct some of the filesystem change history of the file. Because of the circularity of the $LogFile, we only have part of the history, the most recent part. The extent of the history we can retrieve depends highly on what kind of volume the target is. If it is a system volume, then a week of history is probably more than you could expect, while a removable, external or secondary disk will contain far more history. Thus we may reconstruct the full history of a deleted file (having its $MFT record overwritten), and in turn recreate the datarun list to perform recovery upon. In other cases we may not be able to reconstruct the full history, so only a partial datarun list can be reconstructed. The final csv file with the adjusted dataruns, LogFile_DatarunsModified.csv, will have the dataruns displayed differently.

Those starting with a "!" indicate that the complete datarun list has been recreated.
Those starting with a "?" indicate a partial recovery, with the number of "**" representing the missing bytes of the original datarun list.
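
As a rough sketch of how such a value could be consumed, the snippet below classifies a datarun string by its marker and decodes a complete list into cluster extents. The decoding follows the standard NTFS mapping-pairs layout (header byte with the length-field size in the low nibble and the offset-field size in the high nibble, signed offsets relative to the previous run, terminated by 0x00). The exact text layout of the csv field beyond the "!" / "?" markers is an assumption, and the function names are hypothetical.

```python
def classify_datarun(text):
    """Return (status, missing_bytes) based on the leading marker."""
    marker, body = text[:1], text[1:]
    if marker == "!":
        return "complete", 0
    if marker == "?":
        return "partial", body.count("**")  # one ** per missing byte
    return "unmarked", 0

def decode_dataruns(raw):
    """Decode mapping pairs into (start_lcn, cluster_count); start_lcn is None for sparse runs."""
    runs, pos, lcn = [], 0, 0
    while pos < len(raw) and raw[pos] != 0:
        len_size, off_size = raw[pos] & 0x0F, raw[pos] >> 4
        pos += 1
        if len_size == 0 or pos + len_size + off_size > len(raw):
            raise ValueError("truncated or corrupt run")
        length = int.from_bytes(raw[pos:pos + len_size], "little")
        pos += len_size
        if off_size == 0:
            runs.append((None, length))  # sparse run, no clusters on disk
        else:
            # Offsets are signed and relative to the previous run's start.
            lcn += int.from_bytes(raw[pos:pos + off_size], "little", signed=True)
            runs.append((lcn, length))
        pos += off_size
    return runs

if __name__ == "__main__":
    print(classify_datarun("?****2110000100"))
    print(decode_dataruns(bytes.fromhex("210834122110000100")))
```

Only a list classified as complete is worth decoding this way, since a partial list lacks the bytes needed to resolve the absolute cluster numbers.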

To simplify recovery of a file based on a datarun list, one can use the attached PoC called ExtractFromDataRuns. It is quite self-explanatory. Just fill in the complete datarun list, the real size and init size, and a name for the output file. Optionally choose to process image files (disk or partition). Also tick off if any compressed/sparse flag has been detected. Feeding it a partially reconstructed datarun list will not work! Please note that alternate data streams can be distinguished by the presence of a dataname and also by a differing OffsetInMft value for a given fileref.

The separate download package "" has a partition image with a small NTFS volume to test it on. On the volume there are 2 deleted non-resident files which both have had their MFT record overwritten by new files. Any decent recovery software based on signature searching should be able to recover the jpg file (Tulips.jpg), because it is contiguous. However, such tools will most likely not identify its filename, or anything else about the file. The second file will likely not be recoverable using standard tools, because it is fragmented (compressed), and with its MFT record overwritten, resolving the file without the datarun list is impossible. Using the PoC we can identify the filename and extract it in perfect shape (don't worry, it only contains one of the sample images shipped with Windows). Read the file readme.DataRunsResolved.txt for the details about the sample image and how to interpret the output and recover the files.

What we have achieved with this is to recover fragmented files which have their MFT record overwritten. Since we have reconstructed partial/complete datarun history, we can with certainty (at least if full history is reconstructed) determine whether file slack data have belonged to the given file or not.

Datarun reconstruction is broken with UNICODE (ANSI is ok).
Importing of Mft2Csv output is broken if csv is UNICODE (ANSI is ok).
Partial updates to IndexRecords (IndexRoot/IndexAllocation) are very hard to interpret, as we likely do not have knowledge of the original index. Complete records are OK though.
The circularity of a 65 MB file poses an inherent and absolute restriction on how many historical FS transactions are retained. System drives thus have limited history in $LogFile, whereas external/secondary drives have more historical transactions stored. The size of $LogFile can be increased with chkdsk (chkdsk c: /L:262144).
Changes to the data of resident files are not stored within $LogFile, only information that a change was done is stored.

The $UsnJrnl contains information in a more human friendly way. For instance each record contains fileref, filename, timestamp and explanation of what occurred. It also contains far more historical information than $LogFile, though without a lot of details. If $UsnJrnl is active, then all transactions written to it during the recycle life of the $LogFile are also present within $LogFile. This means that there is no reason to decode the $UsnJrnl in order to understand $LogFile any better.

Slack space
In this context slack space means the space within an RCRD record that is left over beyond the last transaction. I don't think this has been described before, so let me explain. Volume slack is the unused space between the end of the file system and the end of the partition where the file system resides. MFT record slack is kind of the same, but refers to the space found after the record end signature (0xFFFFFFFF) up to the physical record end (0x400 or 0x1000). Slack space within the $LogFile is thus the space found beyond the last transaction and up to the RCRD record end (usually 0x1000). The transactions found in slack are there from before the $LogFile was recycled (overwritten). There is also an algorithm for identifying valid transactions in slack space. In addition, there may exist several layers of such slack space.
Let's say the last transaction in a given RCRD record ended at offset 0x00007D27. From this offset and up to 0x00007FFF we have 0x2D8 bytes of slack space. It could then be that the bytes starting at 0x00007D28 are not a valid transaction header, because they are in the middle of a transaction. The program will then (if configured to) attempt to rebuild a pseudo header with valid values in order to decode the transaction. If it fails to rebuild a valid header, it considers too much information from the original header to be lost, treats these bytes as lost, and continues scanning the rest of the slack space for any valid transaction header. The lost bytes will be logged to debug.log for you to investigate. On the other hand, if it was able to rebuild a valid header, the information about this would be found in the lf_TextInformation field.
Let's say it identified a transaction starting at offset 0x00007D48 with size 0xB0. Then another good transaction was identified immediately following at offset 0x00007DF8 with size 0xE0. However, at offset 0x00007ED8 there was no valid transaction header. This means we are now at the second layer of slack space within that RCRD record. Now let's say the program, after re-scanning, was able to identify a valid transaction header at offset 0x00007F18. This transaction would be marked in the csv with a value of 2 in the FromRcrdSlack field. However, consider the case where the transaction's total size pushes the offset beyond the RCRD record size. In essence, this would be a partially recovered transaction, which could decode fine, but will be found with a value of 1 in the IncompleteTransaction field. Remember that debug.log is very detailed and will help in understanding the decoded output, especially what comes from slack space.
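
The worked example can be replayed with a small sketch. This reflects one reading of the example rather than the parser's real algorithm: a new slack layer is assumed to begin whenever a gap follows an already decoded slack transaction, and a transaction is flagged incomplete when its end passes the record end. The function name, the dictionary keys and the size of the third transaction (0x120) are made up for illustration.

```python
def label_slack_transactions(found, slack_start, record_end):
    """Label transactions decoded from slack.

    found: list of (offset, size) tuples sorted by offset.
    slack_start: first byte after the last live transaction.
    record_end: first byte after the RCRD record (exclusive).
    """
    labelled, layer, cursor, seen_any = [], 1, slack_start, False
    for offset, size in found:
        if offset > cursor and seen_any:
            layer += 1  # a gap after decoded data is taken as a deeper layer
        end = offset + size
        labelled.append({
            "offset": offset,
            "layer": layer,                  # compare with FromRcrdSlack
            "incomplete": end > record_end,  # compare with IncompleteTransaction
        })
        cursor, seen_any = end, True
    return labelled

if __name__ == "__main__":
    example = [(0x7D48, 0xB0), (0x7DF8, 0xE0), (0x7F18, 0x120)]
    for entry in label_slack_transactions(example, 0x7D28, 0x8000):
        print(hex(entry["offset"]), entry["layer"], entry["incomplete"])
```

With these inputs the first two transactions land in layer 1, and the third lands in layer 2 and is flagged incomplete, which matches the narrative above.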

32-bit vs 64-bit configuration.
This setting is important to set correctly. It refers to which OS has handled the target volume. The point is that the handling of OpenAttributeTable differs between 32-bit and 64-bit OS. There might of course be cases (for instance a usb disk) where the volume has been handled by several different OS's, in which case it might be tricky to get this setting 100% correct. From version an autodetection mechanism was implemented. However, it is still recommended to attempt to set this configuration correctly. In the TextInformation field a "Mixed OS detected" message will be printed when the parser detects an OpenAttributeTableDump format that differs from the configuration, or when both types are detected. In any case, it may be useful to look into LogFile_OpenAttributeTable.csv to evaluate the output. If you see entries with columns containing strange values, then this particular setting might be wrong. If so, most values are way off. For instance most AttributeType fields are UNKNOWN, Lsn is not within the current range, MftRef is too high and MftRefSeqNo is 0. Beware that AttributeType is resolved as UNKNOWN for uninitialized entries in the table, but these are easy to spot, as all values after AllocatedOrNextFree are 0, which is perfectly valid. These challenges seem fixed with the autodetection implemented in version

Extraction of resident attribute updates (UpdateResidentValue).
The UpdateResidentValue operation is for updates to the content of resident attributes. The configuration of "Extract resident updates of min size" will let you extract the binary modification to the resident attribute. The input field is for the minimum size in bytes to extract. Likely the most interesting use of this feature is with volumes handled by Nt5.x (XP, 2003), where the complete updates to the $DATA attribute (normal file content) are stored in the redo and undo fields with UpdateResidentValue. Files with resident $DATA content are small files, at most 744 bytes (with an MFT record size of 1024) but usually less. The extracted data is written to a subfolder named ResidentExtract. The output files are named with a logic like this: MFT($MFTRef)_$OffsetInMft_$AttributeOffset_LSN($Lsn)$Operation.bin. For example, MFT(1643)_0x0098_0x00B8_LSN(1415242628)redo.bin would mean MFT record number 1643, the offset of the target attribute in the MFT record is 0x98, the offset of the modification within the target attribute is 0xB8, the LSN of the transaction is 1415242628, and this was for a redo operation. The extracts for undo operations thus contain the data at that offset before the modification. Activating this feature will trigger some irrelevant and uninteresting output. Most false positives are automatically filtered, but some are unavoidable. For instance, updates of $INDEX_ROOT, $ATTRIBUTE_LIST and $BITMAP may be included. It is possible though to manually trace back the attributes to filter out non-$DATA ones by comparing the OffsetInMft with what is found in the relevant InitializeFileRecordSegment or, if applicable, in $MFT itself. From version etraction of $EA was added. See note about $EA attributes.
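
When post-processing the ResidentExtract folder, the naming scheme can be split back into its fields. The sketch below is based only on the pattern and example given above; the optional underscore before the operation name is a defensive assumption, and the function and key names are hypothetical.

```python
import re

# Pattern: MFT($MFTRef)_$OffsetInMft_$AttributeOffset_LSN($Lsn)$Operation.bin
EXTRACT_NAME = re.compile(
    r"^MFT\((?P<mft_ref>\d+)\)"
    r"_(?P<offset_in_mft>0x[0-9A-Fa-f]+)"
    r"_(?P<attribute_offset>0x[0-9A-Fa-f]+)"
    r"_LSN\((?P<lsn>\d+)\)_?(?P<operation>redo|undo)\.bin$"
)

def parse_extract_name(name):
    """Return the fields encoded in an extract file name, or None if it does not match."""
    match = EXTRACT_NAME.match(name)
    if match is None:
        return None
    return {
        "mft_ref": int(match["mft_ref"]),
        "offset_in_mft": int(match["offset_in_mft"], 16),
        "attribute_offset": int(match["attribute_offset"], 16),
        "lsn": int(match["lsn"]),
        "operation": match["operation"],
    }

if __name__ == "__main__":
    print(parse_extract_name("MFT(1643)_0x0098_0x00B8_LSN(1415242628)redo.bin"))
```

Grouping the parsed entries by mft_ref and offset_in_mft and sorting by lsn gives the sequence of redo/undo fragments for one attribute.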

Filenames csv
From version there was implemented a new feature to dump all identified filenames. The source of these entries is InitializeFileRecordSegment, UpdateNonResidentValue, AddIndexEntryRoot, DeleteIndexEntryRoot, AddIndexEntryAllocation, DeleteIndexEntryAllocation and WriteEndOfIndexBuffer. The csv with these filenames, LogFile_FileNames.csv, thus contains a rebuilt history of all filenames, MftRef and MftRefSeqNo for the duration of the $LogFile history. You will thus be able to see all the various filenames a given MFT record has had during the timespan that the $LogFile covers. When a file is renamed, the MftRefSeqNo is not incremented. When an MFT record is marked as deleted, and later reused, the MftRefSeqNo is incremented by one with the new initialization.
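
A rough sketch of how the filename history could be grouped per MFT record follows. The column headers used here (MftRef, MftRefSeqNo, Lsn, FileName) and the "|" separator are assumptions based on the field names and defaults mentioned in this text; check the header row of the real LogFile_FileNames.csv before relying on them.

```python
import csv
from collections import defaultdict

def filename_history(lines, delimiter="|"):
    """Map (MftRef, MftRefSeqNo) to the list of filenames seen, ordered by Lsn."""
    history = defaultdict(list)
    for row in csv.DictReader(lines, delimiter=delimiter):
        key = (int(row["MftRef"]), int(row["MftRefSeqNo"]))
        history[key].append((int(row["Lsn"]), row["FileName"]))
    # Sorting by Lsn puts renames in the order they were logged.
    return {key: [name for _, name in sorted(entries)]
            for key, entries in history.items()}

if __name__ == "__main__":
    sample = [
        "MftRef|MftRefSeqNo|Lsn|FileName",
        "40|3|200|report_final.docx",
        "40|3|100|report.docx",
        "40|4|300|notes.txt",
    ]
    for key, names in filename_history(sample).items():
        print(key, names)
```

In the sample, record 40 with sequence number 3 shows a rename, while sequence number 4 is the same record reused for a new file. For a real file, pass an open file handle instead of the list, for example `open("LogFile_FileNames.csv", newline="")`.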

With Transactional NTFS (TxF), there will be occurrences of the named stream $TXF_DATA in the attribute $LOGGED_UTILITY_STREAM. It is always resident, and there can be several per file. The usage is not widespread, and Microsoft actually encourages alternative methods: "Microsoft strongly recommends developers utilize alternative means to achieve your application's needs." It seems only a few software update mechanisms use it. Every file/folder created with it gets a unique fileref (not to be confused with MFT record numbers). This unique fileref is what the file will be named when it is deleted later on (and moved into the \$Extend\$RmMetadata\$Txf folder). The $Tops file's standard unnamed $DATA attribute contains information about where in the $TxfLogContainer00000000000000000001 the next transaction would go. The $Tops named $DATA stream $T contains actual data (from transactional file operations) that is recycled. Within $TXF_DATA there is also a field called LsnUserData that is an offset into the $TxfLogContainer00000000000000000001 where a lot more transaction details are found. LsnNtfsMetadata also contains an offset into the same $TxfLogContainer00000000000000000001 file. Note that these offsets may be changed at a later stage, and the change is then referred to in an UpdateResidentValue operation in the $LogFile. The MftRef_RM_Root field is the file record number of the root of the resource manager responsible for the transaction associated with this file (default is 5, which is the root directory). For updates to $TXF_DATA through UpdateResidentValue, the text "Partial update" is appended in the TextInformation field. For this reason, there may also exist values like 0x0000000000007E-- in LsnUserData. This is because an UpdateResidentValue of size 31 bytes means that the first 3 members of the complete structure are missing, and that 1 byte of LsnUserData is missing as well. The missing byte is the "low byte", thus 2 "-" characters (one per nibble) are used as replacement to indicate the missing unknown byte (actually just the existing byte). If the UpdateResidentValue was 32 bytes, then no replacement characters/nibbles would be needed, as the full LsnUserData was provided.
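
The placeholder notation can be reproduced with a few lines. This sketch assumes the field is an 8-byte little-endian value and that a partial update supplies only its trailing (high-order) bytes in memory order; the function name is hypothetical.

```python
def format_partial_le(known_high_bytes, field_size=8):
    """Render a little-endian field as hex, using -- for each missing low-order byte.

    known_high_bytes: the bytes that were present in the update, in memory order.
    """
    missing = field_size - len(known_high_bytes)
    if missing < 0:
        raise ValueError("more bytes than the field can hold")
    # Reverse to print most significant byte first, then pad the low end.
    return "0x" + known_high_bytes[::-1].hex().upper() + "--" * missing

if __name__ == "__main__":
    # 7 of the 8 bytes of LsnUserData were present; the low byte is unknown.
    print(format_partial_le(bytes([0x7E, 0, 0, 0, 0, 0, 0])))
```

With all 8 bytes supplied the function prints a plain 16-digit hex value without any placeholders.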

Error messages and verbose information are written to debug.log. If strange values or comments are found in the output, search debug.log for the lsn. Usually the whole transaction is dumped, which might help in understanding it. To aid investigation of transactions, it might be helpful to fill out a comma separated list of lsn's in the input field. Then verbose information for these lsn's will be printed.

The $EA attribute is a set of name/value pairs and is thought to be there for compatibility with OS/2. It is rarely seen in use, though some malware has used it. There can be several pairs per attribute, but the maximum size of all of them together is 65535 bytes. There can only be 1 $EA attribute per file. Larger content can be spread across several files as shown in the EaTools PoC.
Existing $EA attributes cannot be directly modified. Additional name/value pairs can be added at any time, as long as the size of all pairs is below 0xFFFF bytes. The strange behaviour of this attribute is that the content of the entire $EA is written to $LogFile for any new pair. That goes for both resident and nonresident $EA content. That means if a third name/value pair is added to an $EA, all 3 pairs are written to $LogFile through either UpdateResidentValue or UpdateNonResidentValue.

Implement more analysis of data present in ntfs.db. Currently it will require a certain level of NTFS knowledge in order to understand the output.
Dump non-resident $EA content.

Command line use
If no parameters are supplied, the GUI will by default launch. Valid switches are:

Input $LogFile extracted. Required unless /LogFileFragmentFile: is used.
Optionally input a $LogFile fragment. Can be any broken fragment with at least 1 transaction in.
The output csv of latest Mft2Csv. Optional.
The output path of all parser output. Defaults to program directory.
A string value for the timezone. See notes further down for valid values.
The output format of csv. Valid values can be l2t, BodyFile, all.
Boolean value for handling MFT's with bad format. Default is 0. Can be 0 or 1.
Boolean value to skip fixups. Primarily used with memory dumps. Default is 0. Can be 0 or 1.
The separator to use in the csv. Default is |
Boolean value for decoding unicode strings. Default is 0. Can be 0 or 1.
An integer from 1 - 6 for specifying the timestamp format. Start the gui to see what they mean. Default is 6.
What precision to use in the timestamp. Valid values are None, MilliSec and NanoSec. Default is NanoSec.
The separator to put in the separation of the precision. Default is ".". Start the gui to see what it means.
The separator to put in between MilliSec and NanoSec in the precision of timestamp. Default is empty/nothing. Start the gui to see what it means.
A custom error value to put with errors in timestamp decode. Default value is '0000-00-00 00:00:00', which is compatible with MySql, and represents an invalid timestamp value for NTFS.
Boolean value for specifying if reconstruction of dataruns should be performed. Default is 0. Can be 0 or 1.
The size of the MFT records. Valid values are 1024 and 4096. Default is 1024.
Boolean value for specifying if attempting to rebuild header from broken transactions recovered from slack. Default is 0. Can be 0 or 1.
Number of sectors per cluster. Default is 8. Can be 1,2,4,8,16,32,64 and 128.
A value between 0 and 1 (100%), as threshold for identifying valid/invalid transaction from slack, as based on the previous lsn. Default is 0.1 (10%).
Boolean value for specifying if the volume is originating from a 32-bit system. Default is 0 (x64). Can be 0 or 1.
Boolean value for specifying if extraction of resident data content should be performed. Default is 0. Can be 0 or 1.
Value for setting the minimum size in bytes for what to extract. Only used with ExtractDataUpdates setting. Default is 2 (bytes) as minimum.
A comma separated list of lsn's that will trigger verbose mode. Verbose logging information is found in debug.log.
Boolean value for specifying if parser should omit all sqlite3 operations. If the generated ntfs.db (reconstruct of dataruns, import of mft csv) are not used, this can safely be ignored/omitted. Default is 0. Can be 0 or 1.
Boolean value for activating a simple validation on a fragment only, and not full parser. Can be 0 or 1. Will by default write fixed fragment to OutFragment.bin unless otherwise specified in /OutFragmentName:
The output filename to write the fixed fragment to, if /VerifyFragment: is set to 1. If omitted, the default filename is OutFragment.bin.
Boolean value to skip fixups. Used with reconstructed fragments. See examples. Default is 0. Can be 0 or 1.
Boolean value to treat $LogFile as broken. Used with reconstructed RCRD's and will bypass several validation checks. Default is 0. Can be 0 or 1.

The available TimeZone's to use are:

Error levels
The current exit (error) codes have been implemented in commandline mode, which makes it more suited for batch scripting.
1. No valid transaction could be decoded. Empty output.
2. A likely incorrect input parameter was detected. Most often this will be SectorsPerCluster, and more rarely MftRecordSize. The default is SectorsPerCluster=8 and MftRecordSize=1024.
3. Fragment failed validation.
4. Failure in writing fixed fragment to output. Validation of fragment succeeded though.

Thus if you get %ERRORLEVEL% == 1 it means nothing was decoded, and if you get %ERRORLEVEL% == 2 then most likely the SectorsPerCluster param was set incorrectly.
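
For scripting outside of batch files, the exit codes can be handled along these lines. This is only a sketch: the helper names are hypothetical, the switch names are the ones that appear in the command line examples in this text, and treating exit code 0 as "no error" is an assumption, since only codes 1 to 4 are documented here.

```python
import subprocess

# Meanings of the documented exit codes.
EXIT_MEANINGS = {
    1: "No valid transaction could be decoded; output is empty.",
    2: "Likely incorrect input parameter (often SectorsPerCluster, sometimes MftRecordSize).",
    3: "Fragment failed validation.",
    4: "Fragment validated, but writing the fixed fragment failed.",
}

def build_command(exe="LogFileParser.exe", **switches):
    """Turn keyword arguments into /Name:value switches."""
    return [exe] + [f"/{name}:{value}" for name, value in switches.items()]

def describe_exit_code(code):
    if code == 0:
        return "No error reported."  # assumption: 0 means success
    return EXIT_MEANINGS.get(code, "Unrecognised exit code.")

def run_parser(**switches):
    """Run the parser and return (exit_code, description)."""
    code = subprocess.run(build_command(**switches)).returncode
    return code, describe_exit_code(code)

if __name__ == "__main__":
    print(build_command(LogFileFile=r"c:\temp\$LogFile", SectorsPerCluster=8))
```

A script could for instance retry with a different SectorsPerCluster value whenever run_parser returns code 2.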

LogFileParser.exe /LogFileFile:c:\temp\$LogFile /TimeZone:2.00 /MftRecordSize:4096 /ExtractDataUpdates:1 /ExtractDataUpdatesSize:8 /SectorsPerCluster:64 /TSFormat:1 /TSPrecision:NanoSec /Unicode:1
LogFileParser.exe /LogFileFile:c:\temp\$LogFile /TimeZone:-5.00 /MftRecordSize:1024 /SectorsPerCluster:1 /TSFormat:1 /TSPrecision:MilliSec /Unicode:0
LogFileParser.exe /LogFileFile:c:\temp\$LogFile /TSFormat:3 /ReconstructDataruns:1
LogFileParser.exe /LogFileFile:c:\temp\$LogFile /TimeZone:7.00 /MftRecordSize:1024 /TSFormat:2 /TSPrecision:None /SourceIs32bit:1
LogFileParser.exe /LogFileFile:c:\temp\$LogFile /TimeZone:7.00 /MftRecordSize:1024 /TSFormat:2 /TSPrecision:None /SourceIs32bit:1 /SkipSqlite3:1
LogFileParser.exe /LogFileFile:c:\temp\$LogFile /OutputPath:E:\LFP-Output
LogFileParser.exe /LogFileFile:c:\temp\$LogFile /MftCsvFile:c:\temp\$MFT
LogFileParser.exe /LogFileFragmentFile:c:\temp\fragment.bin /OutputPath:E:\LFP-Output
LogFileParser.exe /LogFileFragmentFile:c:\temp\fragment.bin /OutputPath:E:\LFP-Output /VerifyFragment:1
LogFileParser.exe /LogFileFragmentFile:c:\temp\fragment.bin /OutputPath:E:\LFP-Output /VerifyFragment:1 /OutFragmentName:FragmentsRCRDCollection.bin
LogFileParser.exe /LogFileFile:E:\LFP-Output\FragmentsRCRDCollection.bin /OutputPath:E:\LFP-Output /SkipFixups:1 /BrokenLogFile:1

The last example is a basic one that uses common defaults that work out just fine in many cases. It is also compatible with MySql imports.

Windows Internals 6th Edition