[dev] PBash Patching Procedure

Short, high-level summary

Importing

The Bashed Patch (BP) consists of multiple patchers, which are split into two categories: preservers and mergers. Each patcher manages one or more tags, and each tag corresponds to one or more fields (i.e. full subrecords or parts of subrecords). A preserver simply forwards values from tagged mods. For example, Import Keywords, which manages the Keywords tag, forwards the KWDA/KSIZ subrecords. A merger merges values from all tagged mods.
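The difference between the two categories can be illustrated with a toy model. This is only a sketch, not Wrye Bash code: records are plain dicts, the function names are made up, and two assumptions go beyond the text above, namely that a preserver lets the last tagged plugin win and that a merger takes the union of list entries.

```python
# Toy model, not Wrye Bash code: a record is a dict of field name -> value.

def preserve(tagged_versions, field):
    """Preserver: forward the field from the last tagged plugin that sets it.

    Assumption: 'last tagged plugin wins'; the text only says 'forwards'.
    """
    value = None
    for record in tagged_versions:  # iterated in load order
        if field in record:
            value = record[field]
    return value


def merge(tagged_versions, field):
    """Merger: combine the field's entries from every tagged plugin.

    Assumption: merging is modelled as a plain set union.
    """
    merged = set()
    for record in tagged_versions:
        merged |= set(record.get(field, ()))
    return merged


plugin_a = {"keywords": ["A", "B", "C"]}
plugin_b = {"keywords": ["A", "D"]}
forwarded = preserve([plugin_a, plugin_b], "keywords")
combined = merge([plugin_a, plugin_b], "keywords")
```

Here the preserver ends up with the value from `plugin_b` only, while the merger ends up with entries from both plugins.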

Any subrecord that is not subject to a tag or comes from plugins that don't have the corresponding tag applied will not be touched by the BP - it will obey the rule of one, same as always.

Merging

Merging (using the Merge Patches patcher) works differently. It imitates the game engine's handling of record overrides exactly. That is, if you have a mergeable plugin and you merge it into the BP, the winning records in-game will be exactly the same as if the merged plugin were active in your load order. Any records that are in the merged plugin and don't get overridden by a later plugin will be copied wholesale into the BP.

If records in the merged plugin do get overridden by later-loading plugins, then the BP won't include those records at all. After that, the regular import logic takes place (if you have any tags on the merged plugin or later-loading ones that override its records).

That's the reason why you can use the BP to merge 'patch plugins' created via xEdit without breaking manual conflict resolution, as long as you disable all import patchers and the Leveled List patcher (which should really be called Import Leveled Lists).
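The override rule described in this section can be sketched in a few lines. This is a simplified illustration, assuming plugins are modelled as dicts mapping a FormID to a record; `records_to_copy` is a hypothetical name, not a function from the Wrye Bash code base.

```python
# Toy model, not Wrye Bash code: a plugin is a dict of FormID -> record.

def records_to_copy(merged_plugin, later_plugins):
    """Return the merged plugin's records that no later plugin overrides."""
    overridden = set()
    for plugin in later_plugins:
        overridden |= plugin.keys()
    return {fid: rec for fid, rec in merged_plugin.items()
            if fid not in overridden}


merged = {0x01: "sword v2", 0x02: "shield v2"}
later = [{0x02: "shield v3"}]
copied = records_to_copy(merged, later)
```

Only the sword record is copied into the patch. The shield record is left out, so the later plugin's version wins in-game, just as it would if the merged plugin were still active.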

Technical summary

1. The user clicks Build Patch.
2. PatchDialog.PatchExecute is called. This is the central method.
3. It calls init_patchers_data, which calls initData on each patcher.
4. What happens now differs from patcher to patcher. Some use initData to read a lot of mod files to gather information, some don't use it at all (because they don't need it).
5. Next, PatchExecute calls initFactories. This asks every patcher, 'hey, which record types do you actually need to read and write?'. Specifically, getReadClasses() and getWriteClasses() are called on each patcher.
6. Every patcher answers, and we make a union of the results.
7. Next, PatchExecute calls scanLoadMods. This loads every file in the load order, skipping all record types that no patcher wanted. Most patchers stick every record they could potentially patch into the BP at this point (you'll see why in a bit).
8. This is where the BP merges mods, resolves aliases, applies Filter and IIM tags, etc.
9. Then, it passes the loaded ModFile instances to the patchers (via scanModFile), and they read the information they're interested in from those files.
10. Next, PatchExecute calls buildPatch. This calls buildPatch on each patcher. The name is quite misleading: this is where the actual patcher logic happens.
11. Each patcher uses the information it has gathered in the previous stage to do its importing, merging, tweaking, whatever. They operate on the records that were placed in the BP in step 7 here - meaning that different patchers' changes merge together automatically! That's the reason patchers add every record they could patch to the BP.
12. The BP would be gigantic if we kept all those records. So the patchers make keepers - whenever they change a record, they grab such a keeper and pass the record's fid (that's Wrye Bash's name for FormIDs) to it.
13. Finally, PatchExecute calls _save_pbash. This writes out the finished BP, but only the records which were passed to a keeper in step 12.
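The sequence above can be condensed into a small, self-contained sketch. It is a toy model under loud assumptions: `ToyPatcher` and `run_patch` are hypothetical names, records are plain dicts, and each toy patcher handles a single field of a single record type. It shows three ideas from the list: the union of requested record types, changes from different patchers accumulating on the same in-patch record, and keepers deciding what gets written out.

```python
# Toy model, not Wrye Bash code. All names here are hypothetical.

class ToyPatcher:
    def __init__(self, record_type, field, new_values):
        self.record_type = record_type
        self.field = field
        self.new_values = new_values  # FormID -> value this patcher wants

    def get_read_classes(self):
        return {self.record_type}

    def build_patch(self, patch_records, keep):
        for fid, value in self.new_values.items():
            record = patch_records.get(fid)
            if record is not None and record.get(self.field) != value:
                record[self.field] = value
                keep(fid)  # tell the keeper this record was changed


def run_patch(patchers, load_order):
    # Union of every record type some patcher asked for.
    wanted = set().union(*(p.get_read_classes() for p in patchers))
    # Scan the load order; later plugins overwrite earlier ones in the patch.
    patch_records = {}
    for plugin in load_order:
        for fid, record in plugin.items():
            if record["type"] in wanted:
                patch_records[fid] = dict(record)
    # Every patcher works on the same in-patch records.
    kept = set()
    for patcher in patchers:
        patcher.build_patch(patch_records, kept.add)
    # Save only what was passed to a keeper.
    return {fid: rec for fid, rec in patch_records.items() if fid in kept}


load_order = [{
    1: {"type": "WEAP", "damage": 10, "weight": 5},
    2: {"type": "WEAP", "damage": 7, "weight": 3},
    3: {"type": "NPC_", "level": 1},
}]
patchers = [ToyPatcher("WEAP", "damage", {1: 15}),
            ToyPatcher("WEAP", "weight", {1: 2})]
result = run_patch(patchers, load_order)
```

Record 1 ends up with both the new damage and the new weight, record 2 is scanned but dropped because no keeper saw it, and record 3 is never loaded because no patcher asked for its type.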

I skipped over a whole ton of complexity - long fids vs short fids, mappers, swappers, the interaction with records code, etc.

Deep dive into the three main phases

`initData`

Run through every source for this patcher, load them with a LoadFactory that accepts all record types that this patcher could possibly patch. This is also where we could get a massive performance boost - basically just by mirroring what the scanLoadMods phase does, i.e. collecting all the record types and sources from each patcher, then loading all the sources once for all collected record types and passing the loaded files to each patcher.

After we've loaded the sources, we then check which record types the sources actually have. So a patcher might be able to patch, say, fifty different record types, but if its sources only have an intersection of ten record types, then loading the other forty for the rest of the load order is a waste. We store those in srcClasses. What classestemp does, I don't know - seems to just duplicate what srcClasses is doing for some reason.
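The narrowing described above amounts to a set intersection. A minimal sketch, assuming a source is modelled as a dict mapping a record type to its records; `find_src_classes` is a hypothetical helper, not the actual implementation:

```python
# Toy model, not Wrye Bash code: a source is a dict of record type -> records.

def find_src_classes(patchable_types, sources):
    """Return the patchable record types that at least one source contains."""
    present = set()
    for source in sources:
        present |= patchable_types & set(source)
    return present


patchable = {"WEAP", "ARMO", "BOOK", "NPC_"}
sources = [{"WEAP": ["..."], "CELL": ["..."]}, {"BOOK": ["..."]}]
src_classes = find_src_classes(patchable, sources)
```

Only the types in `src_classes` need to be loaded for the rest of the load order. In this example that is two types instead of four.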

Then the actual work of initData is done - collecting the data from the sources based on recAttrs_class, comparing it to the source's masters to skip ITMs and storing it in id_data for usage in the next phase.
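A rough sketch of that collection step follows. It rests on several assumptions that the text does not confirm: records and masters are plain dicts, all masters are flattened into one lookup, and when any tracked attribute differs from the master, all tracked attributes are stored. `collect_id_data` is a hypothetical name.

```python
# Toy model, not Wrye Bash code. A source is a dict of FormID -> record dict.

def collect_id_data(sources, master_records, attrs):
    """Collect tracked attribute values from sources, skipping ITM records."""
    id_data = {}
    for source in sources:  # iterated in load order, later sources win
        for fid, record in source.items():
            values = {attr: record.get(attr) for attr in attrs}
            master = master_records.get(fid)
            if master is not None and all(
                    master.get(attr) == value
                    for attr, value in values.items()):
                continue  # identical to master: nothing worth importing
            id_data[fid] = values
    return id_data


masters = {1: {"damage": 10, "weight": 5}, 2: {"damage": 7, "weight": 3}}
source = {1: {"damage": 12, "weight": 5}, 2: {"damage": 7, "weight": 3}}
id_data = collect_id_data([source], masters, ("damage", "weight"))
```

Record 2 is identical to its master in every tracked attribute, so it never enters `id_data`.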

`scanLoadMods`

Now we have the changed data from each source. We now collect all the srcClasses of the patchers, i.e. the record types that are actually present, and load every mod in the load order with them.

Do note that we have to do this step by step, since merging a mod into the BP alters the factories we use for loading. For example, if we have our previous ten record types and then merge a mod that has three new ones, we now have to load all subsequent mods with these thirteen record types. The reason is that merging should behave exactly like the game's conflict resolution: if a merged file has records that are overridden by a later-loading plugin, we need to load those records in that plugin and let them win in the BP (this might kick the merged record(s) out, of course).
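The reason the scan has to proceed plugin by plugin can be seen in a small sketch. It assumes a plugin is modelled as a dict mapping a record type to its records, and that the set of active record types plays the role of the load factory; `scan_load_order` is a hypothetical name.

```python
# Toy model, not Wrye Bash code: a plugin is a dict of record type -> records.

def scan_load_order(load_order, initial_types, mergeable):
    """Return, per plugin, the record types that would be read from it."""
    active_types = set(initial_types)
    loaded = []
    for name, plugin in load_order:
        if name in mergeable:
            # A merged plugin is read in full, and every type it contains
            # must also be read from all plugins that load after it.
            active_types |= set(plugin)
            loaded.append((name, set(plugin)))
        else:
            loaded.append((name, active_types & set(plugin)))
    return loaded


load_order = [
    ("A.esp", {"WEAP": [], "BOOK": []}),
    ("Merge.esp", {"BOOK": []}),
    ("C.esp", {"WEAP": [], "BOOK": []}),
]
loaded = scan_load_order(load_order, {"WEAP"}, {"Merge.esp"})
```

BOOK records are skipped in `A.esp` but read from `C.esp`, because by then the merged plugin has added BOOK to the active types and a later override must be able to win.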

We then check for every mod in the load order if it has any records that aren't in the BP already, but that are in the id_data (i.e. one of the sources wants to change the record) and where the data from id_data isn't equivalent to the data in the mod's record. We don't have to worry about updating this record to match later-loading mods, that's taken care of by update_patch_records_from_mod.
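The three-part condition in that paragraph can be written as a single predicate. This is a sketch only, assuming dict-based records and an `id_data` mapping from FormID to the wanted attribute values; `should_add_to_patch` is a hypothetical name.

```python
# Toy model, not Wrye Bash code.

def should_add_to_patch(fid, record, patch_records, id_data):
    """Decide whether a scanned record needs to be copied into the patch."""
    if fid in patch_records:
        return False  # already in the BP
    wanted = id_data.get(fid)
    if wanted is None:
        return False  # no source wants to change this record
    # Copy only if the record differs from what the sources want.
    return any(record.get(attr) != value for attr, value in wanted.items())


id_data = {1: {"damage": 12}}
needs_copy = should_add_to_patch(1, {"damage": 10}, {}, id_data)
already_matches = should_add_to_patch(1, {"damage": 12}, {}, id_data)
```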

`buildPatch`

Once again, we iterate through all the record types that are actually present (i.e. in srcClasses). We check if we've forwarded any records for each of those record types into the BP and, if so, we check if we want to keep them in the BP. This is done in _inner_loop. We first check if the last forwarded version of the record has identical data to what the sources want to set it to. If it does, we skip the record. If it doesn't, we forward the data from the sources and keep the record in the BP.
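The compare-then-forward logic can be sketched as follows. It is a toy stand-in for what `_inner_loop` is described as doing, not the real method: records are dicts, `keep` is any callable that receives a FormID, and `inner_loop` is a hypothetical name.

```python
# Toy model, not Wrye Bash code.

def inner_loop(patch_records, id_data, keep):
    """Forward source data onto in-patch records and keep the changed ones."""
    for fid, record in patch_records.items():
        wanted = id_data.get(fid)
        if wanted is None:
            continue  # no source has data for this record
        if all(record.get(attr) == value for attr, value in wanted.items()):
            continue  # already identical to what the sources want: skip
        record.update(wanted)  # forward the data from the sources
        keep(fid)              # and keep the record in the BP


patch_records = {1: {"damage": 10}, 2: {"damage": 7}}
id_data = {1: {"damage": 12}, 2: {"damage": 7}}
kept = set()
inner_loop(patch_records, id_data, kept.add)
```

Record 1 receives the new value and is kept. Record 2 already matches, so it is skipped and would not be written to the finished patch.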
