-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PE loader/image cleanups. No-copy mapping of R2R PEs on Windows. #61938
Conversation
94c38cc
to
b346156
Compare
afa8bdf
to
5e53f38
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me and a great cleanup too, thanks Vlad! I have a couple of minor comments and observations but it would be great if the other reviewers could also take a look, most notably I'm thinking about JanV as another pair of eyes to audit the page protection logic.
{ | ||
// To minimize wasted VA space on 32-bit systems, align file to page boundaries (presumed to be 4K) | ||
// | ||
// On Windows we use 4K file alignment, per requirements of memory mapping API (MapViewOfFile3, et al). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - according to the above condition this section is used both on Windows and on 32-bit Unix (most notably Linux on arm I guess); please consider making the comment more precise.
{ | ||
// To minimize wasted VA space on 32-bit systems, align file to page boundaries (presumed to be 4K) | ||
// | ||
// On Windows we use 4K file alignment, per requirements of memory mapping API (MapViewOfFile3, et al). | ||
// We also want files always accepted by the native loader, thus we can't use the same approach as on Unix. | ||
fileAlignment = 0x1000; | ||
} | ||
|
||
int sectionAlignment = 0x1000; | ||
if (!target.IsWindows && is64BitTarget) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition !target.IsWindows && is64BitTarget
is apparently a negation of the condition target.IsWindows || !is64BitTarget
but it takes a bit of Boolean algebra to double-check that. Could we use an if-else construct or a temporary boolean variable with a sufficiently descriptive name, e.g. isWindowsOr32Bit?
src/coreclr/utilcode/pedecoder.cpp
Outdated
@@ -372,7 +372,10 @@ CHECK PEDecoder::CheckSection(COUNT_T previousAddressEnd, COUNT_T addressStart, | |||
CHECK(alignedSize >= VAL32(pNT->OptionalHeader.SizeOfImage)); | |||
|
|||
// Check expected alignments | |||
#if TARGET_WINDOWS | |||
// R2R files on Unix do not align section RVAs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo - should this be 'R2R files on Windows'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, in a way the comment does hold on Unix but it's strange that it's in a TARGET_WINDOWS conditional compilation block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Windows we make section RVAs always a multiple of SectionAlignment
. That is not an explicit requirement in the documentation, but it looks like OS loader expects that and our own validation here expects that too.
That is not the case on Unix where we keep lower 16bit of the RVA intact. I do not know why this asserts is not triggered on Unix. It should. When I tried using the same approach on Windows as on Unix, this assert was triggered.
Perhaps this style of assert does not run on Unix? (that would be mildly disturbing).
Anyways - I think the assert should be only enabled on Windows, where the condition is expected. At least as a matter of documenting the expectations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I should use a better terminology to make it more clear.
I.E. on Windows section start RVAs are multiples of SectionAlignment. On Unix this is not true.
Any suggestions?
(same applies to having gaps between sections when loaded - Windows PEs are not expected to have gaps, Unix PEs may have gaps)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I guess that probably the only cause of my confusion was the wording - having a comment about Unix behavior under a conditional compilation section TARGET_WINDOWS
at the very least looks somewhat surprising. Perhaps changing the wording to something like 'On Windows the sections are aligned blah blah blah (in contrast to Unix where they're not blah blah blah)' would be more than sufficient to address my suggestion.
src/coreclr/vm/peimagelayout.cpp
Outdated
|
||
static bool HavePlaceholderAPI() | ||
{ | ||
const MapViewOfFile3Fn INVALID_ADDRESS_CENTINEL = (MapViewOfFile3Fn)1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit - typo, should be SENTINEL I believe.
@@ -2228,7 +2228,15 @@ public static int Main(string[] args) | |||
RunTest("ExplicitlySizedClassTest", ExplicitlySizedClassTest()); | |||
RunTest("GenericLdtokenTest", GenericLdtokenTest()); | |||
RunTest("ArrayLdtokenTests", ArrayLdtokenTests()); | |||
RunTest("TestGenericMDArrayBehavior", TestGenericMDArrayBehavior()); | |||
|
|||
// TODO: BUG BUG: allocation of MD arrays is failing on OSX/ARM64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a GitHub bug we could reference here? Or should we create one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will create a bug for this and add a link to this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -2043,9 +2043,9 @@ MAPmmapAndRecord( | |||
|
|||
// Set the requested mapping with forced PROT_WRITE to ensure data from the file can be read there, | |||
// read the data in and finally remove the forced PROT_WRITE | |||
if ((mprotect(pvBaseAddress, len + adjust, prot | PROT_WRITE) == -1) || | |||
if ((mprotect(pvBaseAddress, len + adjust, PROT_WRITE) == -1) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just curious - why was this change necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this change if prot
is executable, the prot | PROT_WRITE
becomes WX. mprotect
will accept that, but pread
will fail and we would bail from this method. This would always happen in an R2R PE, since .text
is executable.
We then would go on the path of doing layout via copying, which would work, but often end up not in the desired location and later fail when applying relocations, where we had similar WX issue. Failed relocations would result in R2R disabled. The code would still run - in pure IL/JIT mode.
Ultimately we would have R2R enabled on OSX only when we are very lucky.
@@ -208,15 +214,16 @@ void Assembly::Init(AllocMemTracker *pamTracker, LoaderAllocator *pLoaderAllocat | |||
// loading it entirely. | |||
//CacheFriendAssemblyInfo(); | |||
|
|||
if (IsCollectible()) | |||
if (IsCollectible() && !pPEAssembly->IsDynamic()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are dynamic assemblies excluded here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dynamic assemblies do not have PE layouts. The GetLoadedImageContents
below would return NULL
.
src/coreclr/vm/peimage.cpp
Outdated
@@ -1192,7 +1004,12 @@ HRESULT PEImage::TryOpenFile() | |||
{ | |||
ErrorModeHolder mode(SEM_NOOPENFILEERRORBOX | SEM_FAILCRITICALERRORS); | |||
m_hFile=WszCreateFile((LPCWSTR)GetPathToLoad(), | |||
GENERIC_READ, | |||
GENERIC_READ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit - this function shares 90% with the GetFileHandle. Since you are touching this code, it seems it would make sense to call TryOpenFile from the GetFileHandle to get rid of the code duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The difference is in taking a lock and in hadling the failure. I will just make one call another.
src/coreclr/vm/peimage.h
Outdated
// PE relocations are applied | ||
// native exception handlers are registered with OS. | ||
// | ||
// Flat layouts are sufficient for operatons that do not require running native code, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Flat layouts are sufficient for operatons that do not require running native code, | |
// Flat layouts are sufficient for operations that do not require running native code, |
#if defined(TARGET_UNIX) | ||
if (((pSection->Characteristics & VAL32(IMAGE_SCN_MEM_EXECUTE)) != 0)) | ||
{ | ||
#ifdef __APPLE__ | ||
dwNewProtection = PAGE_READWRITE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this the macOS issue that you've mentioned in the PR description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, one of two issues. The other one is WX case in MAPmmapAndRecord
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
Thanks!! |
A few changes with the goal of rationalizing/simplifying handling of PE image layouts. Also adds support for no-copy mapping of R2R assemblies when need to load manually (i.e. when loading from a singlefile bundle).
PEImage
to 2 - FLAT and LOADED.(MAPPED was functionally the same as LOADED, the original reason for it was likely support for native resources)
PEImageLayout
types. There was quite a bit of overlap between them, mostly because of MAPPED.(this makes startup of composite r2r singlefile apps the same as non-singlefile, which is up to 2x improvement, depending on scenario,this opens the path to making composite the default mode for singlefile r2r now)
PEImage
/PEImageLayout
machinery.NOTE: requirements of memory mapping APIs on Windows in combination with desire to always produce PE files compatible with OS loader require the following:
(this is only a requirement for R2R files, but done in all cases for simplicity for now, this can change in the future)
Files that do not conform with above will work as before - via making a copy when loaded in the context of a singlefile bundle.
The tradeoff is slight increase in file sizes. As an example - in case of untrimmed composite r2r singlefile HelloWrold app, the overall increase is 65841K -> 68036K, which is ~3%
Another effect of this change is making R2R to work reliably on OSX. Turns out we had issues with doing manual mapping of executable code on OSX and also with performing relocations. As a result R2R on OSX would only work in cases where doing layout by copying would get the virtual address that PE specifies. That is a fragile condition and R2R worked rarely (if at all).
With this change performing manual layouts is more straightforward and failures would cause asserts, so the issues were noticed and fixed.