Skip to content

Replace ReadFile/WriteFile usages with stdio #116203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 31 commits into
base: main
Choose a base branch
from

Conversation

huoyaoyuan
Copy link
Member

Replaces the majority of ReadFile/WriteFile instances with stdio, and fstream for some superpmi cases.
Remaining instances are StgIO and superpmi/fileio, which uses the same file handles for memory map.

This PR aims to remove as much ReadFile/WriteFile instances with as little change. Opening earlier for some questions.

@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 2, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jun 2, 2025
@huoyaoyuan huoyaoyuan added area-PAL-coreclr and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Jun 2, 2025
@@ -342,6 +343,18 @@ CreateFileWrapper(
return ret;
}

int fopen_u16_wrapper(FILE** stream, const WCHAR* filename, const WCHAR* mode)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The long path normalization should be retained in long-term, because it's required for production code like assembly loading. Forward linking doesn't work for superpmi because utilcode is not linked to it. Moving all the normalization logic to coreclrminipal would require moving SString down. What's the recommended approach for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the path normalization needed with CRT methods like fopen?

Copy link
Member Author

@huoyaoyuan huoyaoyuan Jun 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed for both fopen and fstream. I skipped fopen which is only used in development tools.

@@ -138,7 +138,7 @@ unsigned int MethodContext::calculateRawFileSize()
2 /* end canary '4', '2' */;
}

unsigned int MethodContext::saveToFile(HANDLE hFile)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to use STL in all superpmi code, including the shims/driver that executes in the same process?

If so, it can be refactored and simplified a lot.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superpmi doesn't have the same servicing concerns as coreclr.dll, so I believe it's okay to use the STL throughout. @jkotas do you agree?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to seperate superpmi out, to ease review and avoid unnecessary conversion of CreateFile->fopen->fstream.

@@ -632,26 +633,26 @@ HRESULT PEWriter::setDirectoryEntry(PEWriterSection *section, ULONG entry, ULONG
// across a dll boundary and use it.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still a concern? How we have dynamically linked UCRT on Windows.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC Debug builds use static linked CRT

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, debug builds use a static CRT when possible (C++/CLI tests are the only cases that don't).

Copy link
Member

@jkotas jkotas Jun 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specific comment is not a concern anymore. mscorpe.dll is not a separate library.

LONG _cRef;
HANDLE _hFile;
LONG _cRef;
FILE* _fp;

};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this file be deleted and FILE used instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It implements the COM IStream interface, but not passed over in most usages. Event pipe usage can be replaced with conditional compilation on OS-native handle like AOT.

@@ -5794,7 +5794,7 @@ void DumpHeaderDetails(IMAGE_COR20_HEADER *CORHeader, void* GUICookie)

void WritePerfData(const char *KeyDesc, const char *KeyName, const char *UnitDesc, const char *UnitName, void* Value, BOOL IsInt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this whole tracing can be deleted.

#define ERROR_ALREADY_EXISTS 183L
#define ERROR_FILENAME_EXCED_RANGE 206L

HRESULT HRESULT_FROM_LAST_STDIO()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the errno to HRESULT conversion really needed? Where are the HRESULTs used?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in ilasm and pewriter, which is also used by ilasm. Ilasm uses the HRESULTs in IfFailGo.


int fopen_u16(FILE** stream, const WCHAR* path, const WCHAR* mode)
{
return _wfopen_s(stream, path, mode);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_wfopen is warning by msvc, but fopen is not. Is it safely allowed by SDL? Should the shape of wrapper API be fopen instead of fopen_s?

@huoyaoyuan huoyaoyuan marked this pull request as ready for review June 7, 2025 11:42
@huoyaoyuan
Copy link
Member Author

Tests are passing now. This is ready for review.

@mangod9
Copy link
Member

mangod9 commented Jun 30, 2025

There is a merge conflict. Assume this is still ready to review?


HRESULT HRESULT_FROM_LAST_STDIO()
{
return HRESULT_FROM_WIN32(::GetLastError());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is GetLastError correct here? C runtime methods are documented to return error in errno.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find any better mapping from errno, the original approach was using GetLastError when calling Windows API directly. CRT functions should also be preserving the LastError when calling Windows API. It's not used in important paths (only ilasm, ildasm and multicorejit).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRT functions should also be preserving the LastError when calling Windows API

They happen to not stomp on the last error most of the time. I do not think they are preserving it intentionally 100%.

The errno mapping and the long path support parts of this PR are not pretty. I have not figured out what to do about it yet...


FILE* result = fopen(pathU8, modeU8);

delete[] pathU8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these delete[]s overwrite errno?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, if the implementation calls free etc.


int fsetpos_64(FILE* stream, int64_t pos)
{
fpos_t fpos = (fpos_t)pos;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need fpos_t cast here? Is there a risk of 64-bit -> 32-bit truncation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fpos_t is always defined as long long on Windows so there's no risk. I don't remember why it was here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-PAL-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants