Deterministic MVID
Until now the deterministic MVID was computed as a hash of the IL metadata only; it didn't include the PE headers.
This means that two assemblies which differ only in the PE architecture would get the same MVID.
This can happen relatively easily for facades (assemblies with only type forwarders) as those are typically
identical across platforms, possibly except for the architecture.
The correct way to calculate the MVID is to hash the content of the entire file with the MVID itself and
the strong name signature (if present) zeroed out.
This change modifies the MVID computation to work that way: write the entire image with the MVID set to all zeroes,
compute the hash, write the MVID and then compute the strong name signature.
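
The steps above can be illustrated with a minimal Python sketch. It is not the code from this change: the function names, the idea that the caller already knows the byte offsets of the MVID and the signature, and the use of SHA-256 truncated to 16 bytes are all assumptions made for illustration.

```python
import hashlib
from typing import Optional, Tuple

MVID_SIZE = 16  # an MVID is a GUID, i.e. 16 bytes


def compute_deterministic_mvid(
    image: bytes,
    mvid_offset: int,
    signature_range: Optional[Tuple[int, int]] = None,
) -> bytes:
    """Hash the whole image with the MVID (and signature, if any) zeroed.

    mvid_offset and signature_range=(offset, length) are hypothetical
    inputs; a real writer would take them from the image layout.
    """
    buf = bytearray(image)
    buf[mvid_offset:mvid_offset + MVID_SIZE] = bytes(MVID_SIZE)
    if signature_range is not None:
        start, length = signature_range
        buf[start:start + length] = bytes(length)
    # Assumption: SHA-256, with the first 16 bytes used as the MVID.
    return hashlib.sha256(buf).digest()[:MVID_SIZE]


def patch_mvid(image: bytes, mvid_offset: int, mvid: bytes) -> bytes:
    """Return a copy of the image with the computed MVID written in."""
    buf = bytearray(image)
    buf[mvid_offset:mvid_offset + MVID_SIZE] = mvid
    return bytes(buf)
```

Because the MVID slot is zeroed before hashing, recomputing the MVID of an already patched image gives the same value, while any change in the headers (for example the machine field) gives a different one.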
PdbChecksum
In order for the produced PDB to be fully verifiable, the assembly must contain a PdbChecksum
debug header entry which holds a cryptographic hash of the associated PDB. The way to calculate the checksum
is prescribed, as other programs must be able to validate it. See https://github.com/dotnet/runtime/blob/main/docs/design/specs/PE-COFF.md#pdb-checksum-debug-directory-entry-type-19.
This change implements that behavior. Symbols are now written explicitly once all metadata is processed
but before the final assembly image is written. For portable PDBs the full file is written with the PDB ID
set to all zeroes, and then the cryptographic hash is calculated (using SHA256).
The hash is then written into the PdbChecksum debug header entry of the assembly image.
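
As a rough illustration of the checksum calculation, here is a minimal Python sketch. The function names and the `pdb_id_offset` parameter are hypothetical, and the entry layout (null-terminated algorithm name followed by the hash bytes) is how I read the spec linked above, so treat it as an assumption rather than a reference implementation.

```python
import hashlib

PDB_ID_SIZE = 20  # the portable PDB ID occupies 20 bytes


def compute_pdb_checksum(pdb: bytes, pdb_id_offset: int) -> bytes:
    """SHA-256 of the full PDB content with the PDB ID zeroed out.

    pdb_id_offset is a hypothetical input: the position of the
    20-byte PDB ID within the PDB blob.
    """
    buf = bytearray(pdb)
    buf[pdb_id_offset:pdb_id_offset + PDB_ID_SIZE] = bytes(PDB_ID_SIZE)
    return hashlib.sha256(buf).digest()


def build_pdb_checksum_entry_data(checksum: bytes,
                                  algorithm: str = "SHA256") -> bytes:
    """Assumed entry payload: UTF-8 algorithm name, NUL, then the hash."""
    return algorithm.encode("utf-8") + b"\x00" + checksum
```

A validator can repeat the same steps on the PDB it finds (zero the ID, hash, compare), which is why the checksum does not depend on the ID that is written later.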
Deterministic Portable PDB ID
Until now the PDB ID was set to the same value as the MVID. With the fix for MVID described above, this doesn't
work anymore. In the case of an embedded PDB, the MVID is a hash of an image which already contains the PDB,
so the PDB cannot in turn contain the MVID (that would be a cycle).
The recommended way to calculate the PDB ID is to use the same mechanism as for calculating the PDB checksum
and use the first 20 bytes of the hash as the PDB ID.
This change implements this by taking the first 20 bytes of the hash and writing the PDB ID as the last
modification made to the portable PDB.
The calculated PDB ID is then also written into the CodeView debug header entry of the assembly.
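
The PDB ID derivation can be sketched in Python as follows. All names are hypothetical, and the split of the 20 bytes into a 16-byte GUID plus a 4-byte stamp is my reading of the Portable PDB format rather than something stated in this description.

```python
import hashlib
import uuid
from typing import Tuple

PDB_ID_SIZE = 20


def compute_pdb_id(pdb: bytes, pdb_id_offset: int) -> bytes:
    """First 20 bytes of the SHA-256 of the PDB with its ID zeroed."""
    buf = bytearray(pdb)
    buf[pdb_id_offset:pdb_id_offset + PDB_ID_SIZE] = bytes(PDB_ID_SIZE)
    return hashlib.sha256(buf).digest()[:PDB_ID_SIZE]


def write_pdb_id(pdb: bytes, pdb_id_offset: int, pdb_id: bytes) -> bytes:
    """Patch the ID in as the final modification of the PDB blob."""
    buf = bytearray(pdb)
    buf[pdb_id_offset:pdb_id_offset + PDB_ID_SIZE] = pdb_id
    return bytes(buf)


def split_pdb_id(pdb_id: bytes) -> Tuple[uuid.UUID, int]:
    """Assumed layout: 16-byte GUID followed by a 4-byte stamp."""
    guid = uuid.UUID(bytes_le=pdb_id[:16])
    stamp = int.from_bytes(pdb_id[16:20], "little")
    return guid, stamp
```

Since the ID slot is zeroed before hashing, writing the ID afterwards does not change the value obtained by recomputing it, which is what makes the ID stable across multiple writes.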
Added tests for reading and writing both portable and embedded portable PDBs with checksums.
Added test to validate stability of PDB ID across multiple writes.
Added test to validate stability of MVID across multiple writes and the fact that it changes
when just the PE architecture is changed.