Cleanup some string operating functions#96099
Cleanup some string operating functions#96099AaronRobinsonMSFT merged 16 commits intodotnet:mainfrom
Conversation
src/coreclr/inc/corhlprpriv.h
Outdated
| size_t destLen = minipal_get_length_utf8_to_utf16(utf8str, sourceLen, 0); | ||
|
|
||
| HRESULT hr = FString::Utf8_Unicode_Length(utf8str, & allAscii, & length); | ||
| LPWSTR buffer = (LPWSTR) AllocThrows((length + 1) * sizeof(WCHAR)); |
There was a problem hiding this comment.
This statement is unchanged (with whitespace off).
There was a problem hiding this comment.
| LPWSTR buffer = (LPWSTR) AllocThrows((length + 1) * sizeof(WCHAR)); | |
| CHAR16_T* buffer = (CHAR16_T*) AllocThrows((length + 1) * sizeof(CHAR16_T)); |
There was a problem hiding this comment.
dn-u16.h uses WCHAR while minipal/utf-8 uses CHAR16_T. They are not treated as compatible although both typedef-d from wchar_t. Is any reason that compiler supported char16_t can't be used?
There was a problem hiding this comment.
Some platforms do not support char16_t, which is why we went with the least supported type to handle coreclr and mono simultaneously.
runtime/src/native/minipal/utf8.h
Lines 21 to 25 in 17594ec
During this refactoring, it's better to directly allocate the target type; CHAR16_T for UTF16 or char for UTF8 to avoid further confusion.
There was a problem hiding this comment.
Some platforms do not support
char16_t,
I was asking for the exact platforms and GCC/Clang versions we need to target.
There was a problem hiding this comment.
macOS doesn't support char16_t so you can cross the whole IsApplePlatform series off the list. Our least supported versions support C11 and hence char16_t on Linux, so only OS headers are relevant here.
There was a problem hiding this comment.
macOS doesn't support
char16_t
Asked a friend to confirm that char16_t is supported with latest build tool on macOS. The mentions of macOS doesn't support char16_t are quite old. On cppreference, char16_t is marked as supported by Apple clang. There's also documentation about CChar16 in Swift doc.
Can you confirm again about this?
There was a problem hiding this comment.
I think in C mode, we had some problems. I don't remember the details offhand. This is the quick way to find out what doesn't work in the CI:
#ifdef TARGET_WINDOWS
typedef wchar_t CHAR16_T;
#else
- typedef unsigned short CHAR16_T;
+ typedef char16_t CHAR16_T;
#endif (note that we'd still need CHAR16_T for Windows, so perhaps it's better to keep this investigation separate from this PR)
|
Thanks for starting this work. Ideally we should remove the intermediate wrappers for string conversion and directly use minipal/utf8.h in coreclr/mono/corehost, including windows targets; but it will take a while to get there. (I started the grand change in a branch, but realized that it's best to do in batches to ease up the review like you are doing here) |
Co-authored-by: Adeel Mujahid <3840695+am11@users.noreply.github.com>
|
Now tests are passing. I need some assist about link failure on macos. |
|
|
Hmm it is linked in Unix pal, but not minipal. Will it cause duplicated symbols on Linux? |
Co-authored-by: Aaron Robinson <arobins@microsoft.com>
Co-authored-by: Adeel Mujahid <3840695+am11@users.noreply.github.com>
|
Since the buffer will always be sufficient, the resultant error code could only be NO_UNICODE_TRANSLATION. |
AaronRobinsonMSFT
left a comment
There was a problem hiding this comment.
Thanks!
@am11 anything else?
Co-authored-by: Adeel Mujahid <3840695+am11@users.noreply.github.com>
This reverts commit bc81c55.
* Revert "Forward minipal_get_length_utf16_to_utf8 and minipal_convert_utf16_to_utf8 from DAC (dotnet#97055)" This reverts commit 1263107. * Revert "Cleanup some string operating functions (dotnet#96099)" This reverts commit bc81c55.
Replace with standard or shared ones.
Please confirm their behavior and performance characteristics are desired.