CA1838 Avoid 'StringBuilder' parameters for P/Invokes #7186

Merged
merged 25 commits into from Feb 15, 2022

elachlan
Contributor

@elachlan elachlan commented Dec 30, 2021

@elachlan elachlan marked this pull request as draft January 8, 2022 00:09
@elachlan
Contributor Author

elachlan commented Jan 8, 2022

There are 21 violations of this rule. The fixes seem complex and I don't think I can sort them out.

Helpful resources:
dotnet/runtime#47735
https://docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1838

@Forgind
Member

Forgind commented Jan 10, 2022

There are 21 violations of this rule. The fixes seem complex and I don't think I can sort them out.

Helpful resources: dotnet/runtime#47735 https://docs.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1838

Do you want me to mark this up-for-grabs? I'm not sure if someone else will have a chance to sort through it, but I wouldn't have high expectations from maintainers.

@elachlan
Contributor Author

As far as I can tell, instead of using StringBuilder you are supposed to use a char buffer when making calls via P/Invoke?

I will give it a go and someone can review it and tell me if I am doing it wrong.

@elachlan
Contributor Author

@Forgind in my manic googling to understand how this all works I stumbled into these:

They might be useful in replacing the msbuild maintained pinvokes.

Member

@ladipro ladipro left a comment

A bunch of issues here are caused by the semantic difference between StringBuilder and char[] marshaling. StringBuilder automatically sets its Length on the way out by scanning the unmanaged buffer for \0. With a char array you have to do it manually.
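The manual step described above can be sketched as follows. This is a minimal illustration rather than code from the PR: `BufferToString` is a hypothetical helper name, and in real use the buffer would have been filled by a native call.

```csharp
using System;

static class CharBufferSketch
{
    // Hypothetical helper: turns a UTF-16 buffer filled by native code into a
    // string, stopping at the first '\0' (what StringBuilder marshaling does
    // implicitly when it sets Length on the way out).
    public static string BufferToString(char[] buffer)
    {
        int terminator = Array.IndexOf(buffer, '\0');
        // If no terminator is found, fall back to the whole buffer.
        int length = terminator >= 0 ? terminator : buffer.Length;
        return new string(buffer, 0, length);
    }
}
```

Without the scan, `new string(buffer)` would keep every trailing `\0` in the buffer, so the resulting string would have the buffer's length rather than the length of the text.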

@elachlan elachlan marked this pull request as ready for review January 28, 2022 02:56
@stephentoub
Member

I tested it with Span and the P/Invoke threw an exception.

You can't pass a span directly to a DllImport method (you'll be able to with the new GeneratedDllImport support coming in .NET 7 that builds out the marshaling stubs at compile time). But you can pass either a ref or a pointer. So, for example, if the DllImport signature is:

internal static extern int GetLongPathName(string path, ref char fullpath, int length);

you can pass ref MemoryMarshal.GetReference(span) as that fullpath argument.
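A hedged sketch of that suggestion, assuming the Win32 `GetLongPathNameW` export from kernel32.dll and a fixed 260-character stack buffer. The `LongPathSketch` class, the `TryGetLongPath` helper, and the `DllImport` attribute settings are illustrative choices, not the declaration used in the PR.

```csharp
using System;
using System.Runtime.InteropServices;

static class LongPathSketch
{
    // Signature shaped like the one quoted above: the output buffer is a ref char.
    [DllImport("kernel32.dll", EntryPoint = "GetLongPathNameW", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern int GetLongPathName(string path, ref char fullpath, int length);

    // Hypothetical helper: returns false off Windows, on failure, or when the
    // fixed-size buffer is too small.
    public static bool TryGetLongPath(string path, out string result)
    {
        result = string.Empty;
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            return false;
        }

        Span<char> buffer = stackalloc char[260];
        // Pass a reference to the first element of the span as the ref char argument.
        int written = GetLongPathName(path, ref MemoryMarshal.GetReference(buffer), buffer.Length);

        // 0 means failure; a value >= the buffer length means the buffer was too small.
        if (written <= 0 || written >= buffer.Length)
        {
            return false;
        }

        // The caller trims to the returned length; nothing sets it automatically.
        result = buffer.Slice(0, written).ToString();
        return true;
    }
}
```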

@ladipro
Member

ladipro commented Feb 2, 2022

[curious] @stephentoub do you think the runtime can at some point introduce zero-copy marshaling of output string buffers? Something like:

extern static int GetStringAndYesIKnowTheExactLength([MarshalAs(UnmanagedType.CreateAndPinNewString, LengthIsPassedInParameterNumber=2)] out string s, int length);

Where the stub would create a new string object of size length, pin it, and pass the pointer to unmanaged code. Technically it mutates an existing string object but the object is freshly created so if you squint you could say this is just a fancy string constructor.

@stephentoub
Member

stephentoub commented Feb 2, 2022

Do you have an example Win32 API that looks like that, where the exact length of the output is known in advance? Typically it's the API that tells the caller how much it wrote to the caller-supplied buffer. (And I say Win32 because on Unix the lingua-franca is UTF8 which couldn't write directly into the string buffer anyway.)

That said, you can already do that if you really want to. Just make the call inside of a string.Create callback and hand the span for the string buffer off to the native call (either pinning it and passing a pointer or using the new source gen marshaling support for spans).
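A minimal sketch of the `string.Create` approach, assuming a runtime that has `string.Create` (.NET Core 2.1 or later). `FillBuffer` is a hypothetical managed stand-in for the native call so that the example runs anywhere; a real call would pin the span or use span marshaling as described above.

```csharp
using System;

static class StringCreateSketch
{
    // Hypothetical stand-in for a native API that fills a caller-supplied
    // UTF-16 buffer and returns the number of characters it wrote.
    private static int FillBuffer(Span<char> destination)
    {
        const string value = "v4.0.30319";
        value.AsSpan().CopyTo(destination);
        return value.Length;
    }

    // Builds the string in place: the callback writes directly into the new
    // string's storage, so there is no intermediate buffer to copy from.
    public static string GetValue(int exactLength)
    {
        return string.Create(exactLength, 0, (span, _) =>
        {
            int written = FillBuffer(span);
            if (written != span.Length)
            {
                // The length must be exact; a partially filled string would
                // otherwise contain trailing '\0' characters.
                throw new InvalidOperationException("The API did not fill the whole buffer.");
            }
        });
    }
}
```

Here `StringCreateSketch.GetValue(10)` returns the ten-character string written by the stand-in. The explicit length check is the part the caller stays responsible for.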

@ladipro
Member

ladipro commented Feb 2, 2022

This very PR has three occurrences of the pattern:

  1. Call an API with null buffer to get the length (kernel32!GetShortPathName, kernel32!GetLongPathName, fusion!GetCachePath) or call it with a reasonably sized buffer and hope it will fit (mscoree!GetFileVersion).
  2. Allocate a buffer of the actual returned length (only if the reasonably sized buffer was not enough in case of GetFileVersion).
  3. Call the API again passing the allocated buffer to obtain the string.

The allocation and copying in steps 2 and 3 could be avoided.
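The three steps can be sketched as below. `QueryValue` is a hypothetical managed stand-in for the native APIs listed above, and its size convention (required length without a terminator) is an assumption; the real APIs differ on whether the terminator is counted.

```csharp
using System;

static class TwoCallSketch
{
    private const string Stored = "example-value";

    // Hypothetical stand-in for a native API: with a null or too-small buffer
    // it returns the required length; otherwise it fills the buffer and
    // returns the number of characters written.
    private static int QueryValue(char[] buffer, int bufferLength)
    {
        if (buffer == null || bufferLength < Stored.Length)
        {
            return Stored.Length;
        }

        Stored.CopyTo(0, buffer, 0, Stored.Length);
        return Stored.Length;
    }

    public static string GetValue()
    {
        // Step 1: ask for the required length.
        int required = QueryValue(null, 0);

        // Step 2: allocate a buffer of that length.
        char[] buffer = new char[required];

        // Step 3: call again to obtain the characters.
        int written = QueryValue(buffer, buffer.Length);
        if (written > buffer.Length)
        {
            throw new InvalidOperationException("The required length changed between calls.");
        }

        // The final copy from buffer to string is the cost under discussion.
        return new string(buffer, 0, written);
    }
}
```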

I didn't know about string.Create, that's an awesome API!

@stephentoub
Member

Call the API again passing the allocated buffer to obtain the string.

I don't see how the runtime could depend on that, though. This pattern involves trusting that the API will succeed in filling the whole space because you gave it what it previously told you was required.

@elachlan elachlan requested a review from ladipro February 2, 2022 21:57
@ladipro
Member

ladipro commented Feb 4, 2022

The second call still takes the buffer size so it doesn't overrun, and it returns the actual size so the caller has to make sure that it filled the whole space and the string length is correct. I see how convoluted it is and runtime support is probably not a good idea. Especially now that I know string.Create exists.

Member

@ladipro ladipro left a comment

Thank you!

@ladipro ladipro requested a review from Forgind February 4, 2022 09:18
Member

@Forgind Forgind left a comment

I haven't read all the conversation, but I will come back to this. There's a lot to learn here!

if (hresult == NativeMethodsShared.ERROR_INSUFFICIENT_BUFFER)
{
// Allocate new buffer based on the returned length.
char* runtimeVersion2 = stackalloc char[dwLength];
Member

Is there an unstackalloc? And maybe check what dwLength is?

If dwLength is big, it would be good not to overrun the stack. I imagine we'd have a little more space if we can un-allocate the first stack before allocating the second.

Contributor Author

No, it's scoped to the current method.

A stack allocated memory block created during the method execution is automatically discarded when that method returns. You cannot explicitly free the memory allocated with stackalloc.

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/operators/stackalloc
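One common way to address the size concern is to stack-allocate only below a threshold and fall back to the heap otherwise. The sketch below is an illustration, not the change made in the PR; the 256-character threshold and the `Repeat` helper are assumptions chosen for the example.

```csharp
using System;

static class StackallocSketch
{
    // Assumed threshold for illustration; pick a value that suits the call site.
    private const int MaxStackChars = 256;

    public static string Repeat(char c, int length)
    {
        if (length < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(length));
        }

        // Small buffers live on the stack and are released when the method
        // returns; large ones go to the heap so the stack cannot be overrun.
        Span<char> buffer = length <= MaxStackChars
            ? stackalloc char[length]
            : new char[length];

        buffer.Fill(c);
        return buffer.ToString();
    }
}
```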

do
{
runtimeVersion = new StringBuilder(bufferLength);
hresult = NativeMethods.GetFileVersion(path, runtimeVersion, bufferLength, out _);
Member

That said, I'm not a fan of this code, so I'm glad it's gone 😁

/// <summary>
/// Lazy loaded cached root path of the GAC.
/// </summary>
private static readonly Lazy<string> _gacPath = new(() => GetGacPath());
Member

Is it worth making this lazy vs. just leaving it as a static call? It looks like it's only used once per ResolveComReference call, which seems like not very much to me.

Contributor Author

My thought process was to hold a cached static value for it so we only have to call once per global run. I am unsure if that is how it works in practice.
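How a static `Lazy<string>` behaves can be checked with a small sketch. `ComputeValue` is a hypothetical stand-in for an expensive lookup such as `GetGacPath()`, and the counter exists only to make the single evaluation observable.

```csharp
using System;

static class LazySketch
{
    // Counts how often the factory ran; for demonstration only.
    public static int FactoryCalls;

    // Hypothetical stand-in for an expensive lookup.
    private static string ComputeValue()
    {
        FactoryCalls++;
        return "computed";
    }

    // The factory runs on first access to Value, and the result is then
    // cached for the lifetime of the process.
    private static readonly Lazy<string> _cached = new Lazy<string>(() => ComputeValue());

    public static string Value => _cached.Value;
}
```

Reading `LazySketch.Value` any number of times runs the factory once, so the cached value lasts as long as the hosting process does.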


// Try increased buffer sizes if on longpath-enabled Windows
for (int bufferSize = NativeMethodsShared.MAX_PATH; !success && bufferSize <= NativeMethodsShared.MaxPath; bufferSize *= 2)
for (int bufferSize = NativeMethodsShared.MAX_PATH; bufferSize <= NativeMethodsShared.MaxPath; bufferSize *= 2)
Member

I don't think it's relevant for this PR, but MaxPath can be as large as int.MaxValue; since that isn't exactly a power of 2, doesn't that mean it could theoretically (if we keep getting ERROR_INSUFFICIENT_BUFFER or pathLength is 0) reach the top, overflow, and throw an exception?

Contributor Author

@elachlan elachlan Feb 4, 2022

There are 23 doublings going from MAX_PATH (260) to int.MaxValue. So it's not a trivial risk for long-path-enabled Windows. Interestingly, NTFS has a 65,535 character limit, and the Windows API has many functions that also have Unicode versions to permit an extended-length path for a maximum total path length of 32,767 characters.

So I think we would run into file system/WinAPI limitations before hitting overflows.

I think this would work.
for (int bufferSize = NativeMethodsShared.MAX_PATH; bufferSize <= NativeMethodsShared.MaxPath && bufferSize <= int.MaxValue/2; bufferSize *= 2)
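The arithmetic behind the guard can be checked with a standalone sketch. The loop bodies are empty, `maxPath` is a parameter standing in for NativeMethodsShared.MaxPath, and the `checked` block is an assumption added to make the overflow observable; C# `int` arithmetic on non-constant values wraps silently unless a checked context is enabled.

```csharp
using System;

static class DoublingSketch
{
    // MAX_PATH, as stated in the thread.
    public const int StartSize = 260;

    // Counts how many times the loop body runs with the proposed guard.
    public static int CountGuardedIterations(int maxPath)
    {
        int iterations = 0;
        for (int bufferSize = StartSize;
             bufferSize <= maxPath && bufferSize <= int.MaxValue / 2;
             bufferSize *= 2)
        {
            iterations++;
        }

        return iterations;
    }

    // Runs the unguarded loop in a checked context and reports whether the
    // doubling overflows before the loop condition ends it.
    public static bool UnguardedLoopOverflows(int maxPath)
    {
        try
        {
            checked
            {
                for (int bufferSize = StartSize; bufferSize <= maxPath; bufferSize *= 2)
                {
                }
            }

            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }
}
```

With `maxPath` set to `int.MaxValue`, the guarded loop body runs 22 times and stops at 260 × 2^22 without overflowing, while the unguarded loop overflows on the next doubling. With a limit of 32,767 neither loop gets anywhere near overflow.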

Contributor Author

@Forgind let me know if you think the additional check is helpful or not.

Member

I think that would work. I'd be mildly in favor of adding it, but I don't care too much. It hasn't been an important case up to this point, so I doubt it'll be an important case in the future.

It isn't really the point of this PR, but a VerifyThrow at the end of the loop might be a nicer solution? It's almost certainly a bug if we get close to int.MaxValue, and it would be good to make the bug as visible as possible.

Contributor Author

Added it.

@Forgind Forgind added the merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now. label Feb 8, 2022
@elachlan elachlan requested a review from Forgind February 8, 2022 21:56
Member

@Forgind Forgind left a comment

Looks good!

@Forgind Forgind merged commit b8d493a into dotnet:main Feb 15, 2022
@Forgind
Member

Forgind commented Feb 15, 2022

Thanks @elachlan!
