Uniform char8_t and char basic_string_view punning
Pre-release
This release gives the m/strings/punning.h header uniform type punning between char8_t-based and char-based basic_string_view<> objects.
This mainly supports consuming OSS packages that tacitly assume char encodes UTF-8, which is a poor assumption.
With m::as_u8string_view(), you can take a std::string, treat it as a std::u8string_view, and then work with it as UTF-8 data in a type-safe fashion.
When you have to hand UTF-8 data back to a library that demands std::string_view, call m::as_string_view() on your std::u8string data and pass the result to the OSS library.
Caveat Programmer! These functions only pun the types; they do nothing to extend lifetimes. If you want data with a safe lifetime, write something like:
auto my_string = std::u8string{m::as_u8string_view(some_oss_function(x, y, z))};
Assuming that some_oss_function() returns a std::string, the compiler implicitly converts it to a std::string_view to pass to m::as_u8string_view(), which simply remaps the pointer and length into a std::u8string_view and returns it. The std::u8string constructor then copies the bytes before the temporary std::string is destroyed at the end of the statement.
It is tempting to imagine that the entire basic_string<> object could be punned. However, standard library implementations can and do specialize heavily for the 'common' character types char and wchar_t, and those specializations may or may not extend to the less common char8_t, char16_t, and char32_t. Some people also hold that basic_string<mytype> is a sensible thing to write, so standard library maintainers must cater to that usage as well.
Punning of the entire string object is therefore not provided: it may or may not work on any given standard library implementation, but it is almost certainly not portable and is a proverbial ticking time bomb.
If you want to fix this problem properly, get your open source dependencies to provide real UTF-8 support by accepting std::u8string, but that is probably a decade-long 'windmill tilt', if not a multi-decade one. char8_t isn't brand new; it has been in the standard since C++20.