[CMLIB] Registry transactional writes. #3932

mrmks04 · 2021-08-28T21:54:40Z

Purpose

Implemented restoring the registry from a log file.

Proposed changes

Changed create flags for log files
Changed log write function

HBelusca · 2021-08-28T21:58:57Z

My general comment for now is: please fix the code formatting so that it sticks to the one used in the existing files (currently there are inconsistencies in: the indentation -- of parameters, etc. -- and the style of comments -- within the functions, it should be /* xxx */ ).

Question: Is this log format the one compatible with Windows? (which is also described in https://web.archive.org/web/20201109000715/http://amnesia.gtisc.gatech.edu/~moyix/suzibandit.ltd.uk/MSc/ )

mrmks04 · 2021-08-28T22:21:19Z

> Question: Is this log format the one compatible with Windows? (which is also described in https://web.archive.org/web/20201109000715/http://amnesia.gtisc.gatech.edu/~moyix/suzibandit.ltd.uk/MSc/ )

I used this description of the log file format: https://github.com/msuhanov/regf/blob/master/Windows%20registry%20file%20format%20specification.md#format-of-transaction-log-files
I also analyzed several log dumps from Windows Server 2003.

HBelusca · 2021-08-28T22:29:29Z

> > Question: Is this log format the one compatible with Windows? (which is also described in https://web.archive.org/web/20201109000715/http://amnesia.gtisc.gatech.edu/~moyix/suzibandit.ltd.uk/MSc/ )
>
> I used this information about log files https://github.com/msuhanov/regf/blob/master/Windows%20registry%20file%20format%20specification.md#format-of-transaction-log-files
> And several log dumps from Windows server 2003 analyzed

OK; it would be nice to know whether all of this also confirms what was found independently in the master's thesis I linked above, so that we can be sure we have a good understanding of what's going on.

HBelusca · 2021-08-29T00:34:46Z

sdk/lib/cmlib/hiveinit.c

+    // Validate log header
+    if (!HvpVerifyHiveHeader(&logBaseBlock, HFILE_TYPE_LOG))
+    {
+        DPRINT1("LOG header corrupted\n");


TODO: Log can self-heal (depending on the value of the CmpSelfHeal variable, and on some flags set in the CmpBootType variable).

HBelusca · 2021-08-29T00:35:08Z

sdk/lib/cmlib/hiveinit.c

+
+    if (!isSuccess)
+    {
+        DPRINT1("Read LOG file failed\n");


Self-healing possible? (see below)

HBelusca · 2021-08-29T00:40:17Z

sdk/lib/cmlib/hiveinit.c

+    {
+        DPRINT1("LOG header corrupted\n");
+        return FALSE;
+    }


Everything up to this point restores the header, i.e. it handles the RecoverHeader error case during loading.
What follows below handles the RecoverData error case during loading.
It would be nice to have these in two separate recovery functions.

HBelusca · 2021-08-29T20:43:10Z

ntoskrnl/config/cminit.c

@@ -545,14 +545,14 @@ CmpOpenHiveFiles(IN PCUNICODE_STRING BaseName,

    /* Now create the file */
    Status = ZwCreateFile(Log,
-                          DesiredAccess,
+                          DesiredAccess | SYNCHRONIZE,


Why is this needed? (same question for the IoFlags too)

Without this flag, the stack gets corrupted. It happens when ZwWriteFile returns STATUS_PENDING inside the RegistryHive->FileWrite function, or when ZwFlushBuffersFile returns STATUS_PENDING inside the RegistryHive->FileFlush function.

Does that claim still stand, @mrmks04, with today's master head?

With the latest code from master, the problem is gone.

HBelusca · 2022-02-08T00:29:08Z

PR is in review.

HBelusca · 2022-02-08T20:34:06Z

ntoskrnl/config/cminit.c

                          &ObjectAttributes,
                          &IoStatusBlock,
                          NULL,
                          AttributeFlags,
                          ShareMode,
                          CreateDisposition,
-                          IoFlags,
+                          IoFlags | FILE_SYNCHRONOUS_IO_NONALERT,


Note: The primary hive file is also opened with this flag, and with the sync attributes set too. Do we want this? What does Windows do?

Windows Server 2003 opens "ntuser.dat.log" with flags 0x40008.

Then this extra flag is not set by Win2k3.
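
The conclusion above can be checked with a few bit tests. This is only a sketch: it assumes that the reported value 0x40008 is the CreateOptions argument of the create call, and the helper name OpensSynchronously is hypothetical. The flag values are the ones defined in the Windows driver headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Create option values as defined in the Windows driver headers. */
#define FILE_NO_INTERMEDIATE_BUFFERING 0x00000008u
#define FILE_SYNCHRONOUS_IO_ALERT      0x00000010u
#define FILE_SYNCHRONOUS_IO_NONALERT   0x00000020u

/* Hypothetical helper: returns true if the create options request
 * synchronous I/O in either of its two variants. */
static bool OpensSynchronously(uint32_t CreateOptions)
{
    return (CreateOptions & (FILE_SYNCHRONOUS_IO_ALERT |
                             FILE_SYNCHRONOUS_IO_NONALERT)) != 0;
}
```

With the value reported for Win2k3, OpensSynchronously(0x40008) is false, while the FILE_NO_INTERMEDIATE_BUFFERING bit (0x8) is set.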

HBelusca · 2022-02-08T23:15:06Z

sdk/lib/cmlib/hiveinit.c

+
+    *BaseBlock = LogBaseBlock;
+
+    return TRUE;


Here this would return HiveSuccess.

HBelusca · 2022-02-08T23:15:55Z

sdk/lib/cmlib/hiveinit.c

+    if (!IsSuccess)
+    {
+        DPRINT1("Read LOG file failed\n");
+        return FALSE;


Here and below, all the failure paths would return Fail.

HBelusca · 2022-02-08T23:17:11Z

sdk/lib/cmlib/hiveinit.c

+ *
+ * Function to recover hive data from log.
+ */
+BOOLEAN CMAPI


Same suggestion here as for the previous Recover function.

Not really: one is "recover hive data" and the other is "recover hive header". :smiley:

HBelusca · 2022-02-08T23:30:36Z

sdk/lib/cmlib/hiveinit.c

 {
    if (BaseBlock->Signature != HV_HBLOCK_SIGNATURE ||
        BaseBlock->Major != HSYS_MAJOR ||
        BaseBlock->Minor < HSYS_MINOR ||
-        BaseBlock->Type != HFILE_TYPE_PRIMARY ||
+        BaseBlock->Type != HfileType ||


Since you generalized this function so that it can also be used for HFILE_TYPE_LOG types 👍 :
have you checked whether the base block of Windows registry LOG files has all the Major and Minor fields set?

The base block in the log file is a copy of the original registry base block.

OK so it's better to check for this.
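
A minimal sketch of the generalized check discussed in this thread, limited to the four fields visible in the hunk above. The structure SKETCH_BASE_BLOCK and the function SketchVerifyHeader are hypothetical stand-ins rather than the CMLIB definitions, and the constant values are assumptions based on the registry file format specification linked earlier in this conversation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, taken from the regf format specification. */
#define HV_HBLOCK_SIGNATURE 0x66676572u /* "regf" read as a little-endian dword */
#define HSYS_MAJOR          1u
#define HSYS_MINOR          3u
#define HFILE_TYPE_PRIMARY  0u
#define HFILE_TYPE_LOG      1u

/* Hypothetical, reduced stand-in for the base block header. */
typedef struct {
    uint32_t Signature;
    uint32_t Major;
    uint32_t Minor;
    uint32_t Type;
} SKETCH_BASE_BLOCK;

/* The caller passes the file type it expects, so the same check
 * serves both the primary hive file and the LOG file. */
static bool SketchVerifyHeader(const SKETCH_BASE_BLOCK *BaseBlock,
                               uint32_t ExpectedType)
{
    if (BaseBlock->Signature != HV_HBLOCK_SIGNATURE ||
        BaseBlock->Major != HSYS_MAJOR ||
        BaseBlock->Minor < HSYS_MINOR ||
        BaseBlock->Type != ExpectedType)
    {
        return false;
    }
    return true;
}
```

Because the LOG base block is a copy of the registry base block, the Major and Minor comparisons apply to it unchanged and only the expected Type differs.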

=== DOCUMENTATION REMARKS ===

This implements the transacted writing of the registry (and also enables some parts of the code that had been decaying for years). Transacted writing (writing to the registry in a transactional way) is an operation whose success can be verified by monitoring two main points. In CMLIB, these points are what we internally call the primary and secondary sequences. A sequence is a numeric field that is incremented each time a write operation (namely one done with the FileWrite function and the like) has completed successfully.

The primary sequence is incremented to signal that the initial work of syncing the registry is in progress. During this phase, the base block header is written into the primary hive file and the registry data is written to that file in the form of blocks. Afterwards, the secondary sequence is incremented to report completion of the transactional write. This operation occurs in the HvpWriteHive function (invoked by HvSyncHive for syncing).

If the transactional write fails, or if the lazy flushing of the registry fails, LOG files come into play. Just as HvpWriteHive updates the primary hive, HvpWriteLog updates the LOG by writing the dirty data (base block header included) to it. These files serve recovery and emergency purposes in case the primary machine hive has been damaged by a previous forced interruption of a write to the registry hive. With specific recovery algorithms, the data gathered from a LOG is applied to the primary hive, salvaging it. If the LOG file is corrupt as well, the system performs resuscitation techniques: it reconstructs the base block header with reasonable values, resets the registry signature, and so on.

This work is inspired by PR reactos#3932 by mrmks04 (aka Max Korostil). I have continued his work with some further tweaks. In addition, the whole transactional writing code is now documented.

=== IMPORTANT NOTES ===

HvpWriteLog -- Currently this function lacks the ability to grow the log file, since we lack the necessary code that deals with hive shrinking and with log shrinking/growing. This part is not critical for us, so it is left as a TODO for the future.

HvLoadHive -- Currently there is a hack that prevents us from refactoring this function properly. We should not read the whole hive and prepare the hive storage using HvpInitializeMemoryHive, which is strictly meant for HINIT_MEMORY. Instead, we must read the hive file block by block and deconstruct the read buffer so that we get the bins that were read from the file. The hive storage is then prepared from those bins. If one of the bins is corrupt, self-healing is applied. Because of this, if the hive we are about to read is corrupt, we could read corrupt data and lead the system into failure, so we have to perform header and data recovery before reading the whole hive.

Another important note is that the added code increased the binary size of the x64 FreeLdr, which makes a PE image check fail because the bootloader is too large. Currently this code is disabled for AMD64 until a real fix is in place.
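
The two-sequence scheme described in the documentation remarks can be sketched as a small simulation. This is not the CMLIB code: SKETCH_HIVE_FILE, SketchWriteHive and SketchNeedsRecovery are hypothetical names, the single Data field stands in for the hive blocks, and treating mismatched sequences as the trigger for recovery is an assumption based on the registry file format specification linked earlier in this conversation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for a hive file on disk. */
typedef struct {
    uint32_t Sequence1; /* primary sequence: bumped when a sync starts    */
    uint32_t Sequence2; /* secondary sequence: bumped when a sync is done */
    uint32_t Data;      /* stands in for the registry data blocks         */
} SKETCH_HIVE_FILE;

/* Simulates one transacted write. DataWriteSucceeds models whether the
 * data phase completes or is interrupted (power loss, I/O error). */
static bool SketchWriteHive(SKETCH_HIVE_FILE *File,
                            uint32_t NewData,
                            bool DataWriteSucceeds)
{
    /* Phase 1: mark the sync as in progress. */
    File->Sequence1++;

    /* Phase 2: write the data; an interruption leaves the sequences unequal. */
    if (!DataWriteSucceeds)
        return false;
    File->Data = NewData;

    /* Phase 3: report completion. */
    File->Sequence2++;
    return true;
}

/* Assumption: unequal sequences mean the last write did not complete,
 * so the loader should turn to the LOG file. */
static bool SketchNeedsRecovery(const SKETCH_HIVE_FILE *File)
{
    return File->Sequence1 != File->Sequence2;
}
```

After a successful SketchWriteHive call both sequences are equal and SketchNeedsRecovery returns false. After an interrupted call the primary sequence is ahead by one, which is the state in which the LOG would be consulted.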

GeoB99 · 2023-11-19T20:08:42Z

Superseded by #4571. Thank you for contributing nonetheless!

mrmks04 requested review from HeisSpiter, ThFabba and tkreuzer as code owners August 28, 2021 21:54

github-actions bot added drivers Kernel mode drivers and frameworks kernel&hal Code changes to the ntoskrnl and HAL labels Aug 28, 2021

HBelusca requested review from HBelusca and removed request for HeisSpiter August 28, 2021 21:56

HBelusca added the review pending For PRs undergoing review. label Aug 28, 2021

HBelusca added the enhancement For PRs with an enhancement/new feature. label Aug 28, 2021

HBelusca reviewed Aug 29, 2021

mrmks04 force-pushed the registry_transaction branch from 8cd2d15 to 724986b Compare August 29, 2021 20:32

HBelusca reviewed Aug 29, 2021

HBelusca reviewed Aug 29, 2021

mrmks04 force-pushed the registry_transaction branch from 724986b to a5649a6 Compare August 29, 2021 21:03

[CMLIB] Registry transactional writes.

2e0e7f2

mrmks04 force-pushed the registry_transaction branch from a5649a6 to 2e0e7f2 Compare September 2, 2021 20:58

Extravert-ir reviewed Jan 16, 2022

Recover function splitted to recover header and recover data

2d3b888

mrmks04 force-pushed the registry_transaction branch from 94a48dd to 2d3b888 Compare February 4, 2022 20:23

HBelusca reviewed Feb 8, 2022

Update drivers/filesystems/fastfat_new/create.c

8ac4a9d

Co-authored-by: Hermès BÉLUSCA - MAÏTO <hermes.belusca-maito@reactos.org>

github-actions bot removed the drivers Kernel mode drivers and frameworks label Feb 9, 2022

mrmks04 and others added 2 commits February 9, 2022 23:20

Update sdk/lib/cmlib/hivewrt.c

1fad5d5

Co-authored-by: Hermès BÉLUSCA - MAÏTO <hermes.belusca-maito@reactos.org>

Review fixes

b956b00

HBelusca self-assigned this Apr 3, 2022

GeoB99 closed this Nov 19, 2023
