Using attribute length instead of allocated size for nonresident $DATA runlist? #92

Open
crotron opened this issue Jun 28, 2021 · 2 comments

crotron commented Jun 28, 2021

In the definition of attr_nonresident_fmt, I notice that the runlist is specified to depend on the runlist_offset and the allocated_size. Is there any reason, at least in theory, why it shouldn't also be possible to use the attribute's length (defined in attr_header_fmt) for this as an alternative?

I ask because, for some very large files, I have MFT records that look like this:

image

The record's first attribute is a $DATA attribute that starts at offset 0x38. Its length field is 0xF0, i.e. 240 bytes, or 15 rows of 16 bytes. Counting 15 rows down from the start of the attribute, you end up at the MFT record terminator (FF FF FF FF) just past the end of the runlist. So it appears to me that the attribute length could be used to bound the runlist, at least in this specific scenario.
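The arithmetic above can be checked mechanically. The following is a minimal sketch, not RecuperaBit code: it takes the first 0x40 bytes of the record from the hexdump posted further down in this thread and reads the standard header fields (first attribute offset at 0x14, used size at 0x18). The helper name `attribute_bounds` is made up for this illustration.

```python
import struct

# First 0x40 bytes of the MFT record, copied from the hexdump in this thread.
RECORD_HEAD = bytes.fromhex(
    "46 49 4C 45 30 00 03 00 00 00 00 00 00 00 00 00 "
    "01 00 00 00 38 00 01 00 30 01 00 00 00 04 00 00 "
    "6F 87 42 00 00 00 02 00 01 00 00 00 02 90 42 00 "
    "C6 14 00 00 00 00 00 00 80 00 00 00 F0 00 00 00 "
)


def attribute_bounds(record):
    """Return (start, length, end) of the first attribute in an MFT record.

    Hypothetical helper: reads the first-attribute offset from the record
    header, then the attribute's own length field (4 bytes after its type).
    """
    (first_attr_offset,) = struct.unpack_from("<H", record, 0x14)
    _attr_type, attr_length = struct.unpack_from("<II", record, first_attr_offset)
    return first_attr_offset, attr_length, first_attr_offset + attr_length


start, length, end = attribute_bounds(RECORD_HEAD)
(used_size,) = struct.unpack_from("<I", RECORD_HEAD, 0x18)

# prints: 0x38 0xf0 0x128 15 0x130
print(hex(start), hex(length), hex(end), length // 16, hex(used_size))
```

The attribute ends at 0x128, and the record's used size is 0x130, which leaves exactly 8 bytes for the end marker and its padding.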

One other thing worth noting is that the last three 8-byte words before the start of the runlist are all zeros (corresponding to allocated_size, real_size, and initialized_size). I can't explain why this is the case. I doubt it is random disk corruption, since I see this specific thing in several other MFT records that otherwise look perfectly normal. Maybe it is a software bug, or maybe it is supposed to be this way given the unusual circumstances in which I find it occurring (namely, inside an MFT record that is referenced by a nonresident $ATTRIBUTE_LIST from another MFT record). In any case, it seems that RecuperaBit looks at one or more of those sizes when parsing the format, decides from them that the runlist has length 0, and stops without processing any of the runlist. As a result, when it is time to reconstruct the file, some sections of data are missing, which can leave the output file corrupted and/or blank (ERROR:root:Cannot restore $DATA attribute(s) for File(...)).
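To make the zeroed sizes concrete, here is a small sketch that unpacks the 0x40-byte non-resident header of this $DATA attribute (record offsets 0x38 to 0x77 in the hexdump posted further down). It does not use RecuperaBit's attr_nonresident_fmt; the struct format strings and the `start_vcn`/`end_vcn` field names are assumptions based on the commonly documented NTFS layout.

```python
import struct
from collections import namedtuple

# Header of the $DATA attribute, record offsets 0x38-0x77 of the hexdump.
ATTR_HEADER = bytes.fromhex(
    "80 00 00 00 F0 00 00 00 01 00 40 00 00 00 00 00 "
    "81 2C 09 00 00 00 00 00 42 41 0A 00 00 00 00 00 "
    "40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
)

# Field names are illustrative; only the last four appear in this thread.
NonResident = namedtuple(
    "NonResident",
    "start_vcn end_vcn runlist_offset compression_unit "
    "allocated_size real_size initialized_size",
)

# Common attribute header: type, length, non-resident flag, name length,
# name offset, flags, attribute id (16 bytes in total).
attr_type, attr_length, non_resident, *_ = struct.unpack_from(
    "<IIBBHHH", ATTR_HEADER, 0
)
# Non-resident part: two VCNs, runlist offset, compression unit,
# 4 padding bytes, then the three 8-byte sizes (48 bytes in total).
nr = NonResident(*struct.unpack_from("<QQHH4xQQQ", ATTR_HEADER, 16))

# prints: 0x92c81 0xa4142 0x40
print(hex(nr.start_vcn), hex(nr.end_vcn), hex(nr.runlist_offset))
# prints: 0 0 0
print(nr.allocated_size, nr.real_size, nr.initialized_size)
```

Note that the starting VCN is not zero, so this attribute seems to describe a later extent of the file rather than its beginning. That would fit with the record being reached through an $ATTRIBUTE_LIST, although whether it is the reason the sizes are zero is only a guess here.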

I think that if RecuperaBit used the attribute length instead (assuming that is even a valid thing to do), it might be possible to reconstruct the file in situations where the sizes cannot be relied on.

Lazza (Owner) commented Jul 13, 2021

Thank you for taking the time to report these findings. Could you include a hexdump of that specific sector only and in text form?

I might try to see if something can be done. The important thing is not to introduce "smart" heuristics that could potentially break valid $DATA attributes.

crotron (Author) commented Jul 29, 2021

Sorry, I hadn't been paying attention and missed this. I assumed GitHub would send me a notification when I got a reply, but I never got one.

Here is the hexdump:

46 49 4C 45 30 00 03 00 00 00 00 00 00 00 00 00 01 00 00 00 38 00 01 00 30 01 00 00 00 04 00 00 6F 87 42 00 00 00 02 00 01 00 00 00 02 90 42 00 C6 14 00 00 00 00 00 00 80 00 00 00 F0 00 00 00 01 00 40 00 00 00 00 00 81 2C 09 00 00 00 00 00 42 41 0A 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 42 29 0A 42 43 1D 2F 32 70 09 AC F0 00 32 0D 09 4B 50 01 32 65 09 C6 0F 02 32 BE 09 90 9F 01 32 29 0A CB EF 01 32 DA 09 72 00 02 32 EA 09 77 BE 01 32 E4 09 33 61 01 32 E3 09 AF FF 00 32 C7 09 45 30 01 32 83 09 51 B0 01 32 89 09 F6 0F 01 32 3D 09 51 60 01 32 04 09 1C 00 01 32 8D 09 A6 8F 01 32 B7 09 E3 8F 01 22 F8 09 AB 3F 32 C7 09 81 20 02 32 07 0A 74 CF 01 32 2C 0A 21 60 02 32 0A 0A 06 60 01 32 93 09 4D 40 01 32 56 09 FB FF 00 32 E4 08 04 21 01 32 C8 08 D7 2F 01 32 3F 09 74 FF 00 32 5D 09 B4 4F 01 32 C0 07 B2 4F 01 00 00 FF FF FF FF 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 C6 14

Let me know if this format isn't ideal.

I ended up being able to recover everything on my drive (including these files; as far as I can tell, there is no data corruption) by putting in a small hack. Inside _attributes_reader, after running parse_mft_attr, if the attribute is $DATA, is non-resident, and has a runlist of [], I manually slice out the bytes using the current offset value, the attribute's runlist_offset, and the attribute's length, call runlist_unpack on those bytes, and set the attribute's runlist to whatever that function returns. It doesn't seem like an ideal way of doing things, but at the very least it let me handle this specific case without affecting other files.
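The fallback described above can be sketched outside RecuperaBit roughly as follows. This is an illustration under stated assumptions, not the actual patch: `decode_runlist` is a stand-in written for this sketch (it is not RecuperaBit's runlist_unpack), `runlist_by_attr_length` is a hypothetical helper, and the field offsets (length at +4, VCNs at +0x10, runlist offset at +0x20) follow the commonly documented NTFS layout. The data is the used part of the record (0x130 bytes) from the hexdump above.

```python
import struct

# Used portion of the MFT record (0x130 bytes), copied from the hexdump.
RECORD = bytes.fromhex(
    "46 49 4C 45 30 00 03 00 00 00 00 00 00 00 00 00 "
    "01 00 00 00 38 00 01 00 30 01 00 00 00 04 00 00 "
    "6F 87 42 00 00 00 02 00 01 00 00 00 02 90 42 00 "
    "C6 14 00 00 00 00 00 00 80 00 00 00 F0 00 00 00 "
    "01 00 40 00 00 00 00 00 81 2C 09 00 00 00 00 00 "
    "42 41 0A 00 00 00 00 00 40 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 42 29 0A 42 43 1D 2F 32 "
    "70 09 AC F0 00 32 0D 09 4B 50 01 32 65 09 C6 0F "
    "02 32 BE 09 90 9F 01 32 29 0A CB EF 01 32 DA 09 "
    "72 00 02 32 EA 09 77 BE 01 32 E4 09 33 61 01 32 "
    "E3 09 AF FF 00 32 C7 09 45 30 01 32 83 09 51 B0 "
    "01 32 89 09 F6 0F 01 32 3D 09 51 60 01 32 04 09 "
    "1C 00 01 32 8D 09 A6 8F 01 32 B7 09 E3 8F 01 22 "
    "F8 09 AB 3F 32 C7 09 81 20 02 32 07 0A 74 CF 01 "
    "32 2C 0A 21 60 02 32 0A 0A 06 60 01 32 93 09 4D "
    "40 01 32 56 09 FB FF 00 32 E4 08 04 21 01 32 C8 "
    "08 D7 2F 01 32 3F 09 74 FF 00 32 5D 09 B4 4F 01 "
    "32 C0 07 B2 4F 01 00 00 FF FF FF FF 00 00 00 00 "
)


def decode_runlist(data):
    """Decode NTFS data runs into (length_in_clusters, absolute_lcn) pairs."""
    runs, pos, lcn = [], 0, 0
    while pos < len(data) and data[pos] != 0:  # a 0x00 header ends the list
        length_size = data[pos] & 0x0F  # low nibble: bytes in the length field
        offset_size = data[pos] >> 4    # high nibble: bytes in the offset field
        pos += 1
        if pos + length_size + offset_size > len(data):
            break  # truncated run: stop rather than guess
        length = int.from_bytes(data[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            runs.append((length, None))  # sparse run, no clusters on disk
        else:
            # Offsets are signed and relative to the previous run's LCN.
            lcn += int.from_bytes(
                data[pos:pos + offset_size], "little", signed=True
            )
            runs.append((length, lcn))
        pos += offset_size
    return runs


def runlist_by_attr_length(record, attr_offset):
    """Slice the runlist using the attribute length, ignoring the sizes."""
    (attr_length,) = struct.unpack_from("<I", record, attr_offset + 4)
    (runlist_offset,) = struct.unpack_from("<H", record, attr_offset + 0x20)
    return record[attr_offset + runlist_offset:attr_offset + attr_length]


attr_offset = 0x38
runs = decode_runlist(runlist_by_attr_length(RECORD, attr_offset))
start_vcn, end_vcn = struct.unpack_from("<QQ", RECORD, attr_offset + 0x10)

# prints: 29 70850 70850
print(len(runs), sum(length for length, _ in runs), end_vcn - start_vcn + 1)
```

For this record, the 29 decoded runs add up to exactly the number of clusters implied by the attribute's VCN range. A check like that might serve as a guard before accepting a runlist recovered this way, so that valid $DATA attributes are left alone.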
