code: conform to c++23 #184

Merged: 1 commit, Jul 2, 2023

@lazan lazan commented Apr 25, 2023

This patch still builds with c++17.

@lazan lazan requested a review from ahhud April 25, 2023 09:45
@@ -100,7 +100,7 @@ namespace casual
 if( transcode::utf8::exist( "ISO-8859-15"))
 {
 const std::string source = { static_cast<std::string::value_type>(0xA4)};
-const std::string expect( u8"€");
+const std::string expect( "€");
 const std::string result = transcode::utf8::encode( source, "ISO-8859-15");
Contributor

Is this good? It relies on the euro symbol in expect happening to be a UTF-8 encoded variant of the symbol, i.e. that the source code was edited in an environment with a UTF-8 locale, and that the compiler accepts this byte stream in the locale in effect at build time. Should we use "hex notation" in expected? I haven't read up on the new UTF-8 support in C++20/23 yet, but I assume that is what forced the code to change in order to work with both C++17 and C++23... I have bought Josuttis' "C++20 - The Complete Guide" (700 pages on C++20, version dated 2022-11-14) and see that it has a section on the changes to the UTF-8 support.
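A minimal sketch of the "hex notation" alternative raised above, assuming the test only needs the three bytes that encode U+20AC (EURO SIGN) in UTF-8. The helper name `euro_utf8` is hypothetical and not part of the casual code base. Because the bytes are spelled out as escapes, the result does not depend on the encoding of the source file or on the compiler's input character set, and it should build unchanged as C++17 and as C++23.

```cpp
#include <string>

// Hypothetical helper (not part of casual): the UTF-8 encoding of
// U+20AC (EURO SIGN) written as explicit byte escapes, so the value is
// independent of how the source file itself is encoded.
inline std::string euro_utf8()
{
    return std::string{ "\xE2\x82\xAC"};
}
```

In the test, `const std::string expect = euro_utf8();` would then replace the literal `"€"`, at the cost of being less readable than the symbol itself.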

Collaborator Author

This is not that great... But since C++20 the u8 prefix creates char8_t data (i.e. what a std::u8string holds), which we don't have any knowledge of in our code base, yet. The source code is encoded in UTF-8, hence the euro sign will be a UTF-8 encoded string, as far as I can understand. The whole u8 prefix is rather confusing, at least to me: https://stackoverflow.com/questions/23471935/how-are-u8-literals-supposed-to-work

It makes more sense now that it yields char8_t, the element type of std::u8string.
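The type change discussed here can be sketched as follows. This is an illustration only, assuming a compiler that defines the `__cpp_char8_t` feature-test macro when `char8_t` is enabled. The alias `u8_element` and the helper `narrow` are hypothetical names, not part of casual. The sketch shows why `std::string expect( u8"€")` stops compiling in C++20 and later, and one way to keep a u8 literal while still producing a `std::string`.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Element type of a u8 literal: char up to C++17, char8_t from C++20.
using u8_element = std::remove_cv_t< std::remove_reference_t< decltype( u8"\u20AC"[ 0])>>;

#if defined( __cpp_char8_t)
static_assert( std::is_same_v< u8_element, char8_t>, "C++20 and later: u8 literals hold char8_t");
#else
static_assert( std::is_same_v< u8_element, char>, "C++17: u8 literals hold char");
#endif

// Hypothetical helper (not part of casual): copy the bytes of a u8 literal
// into a std::string, dropping the terminating null. Reading the storage
// through const char* is permitted for any object type.
template< std::size_t N>
std::string narrow( const u8_element (&literal)[ N])
{
    return std::string( reinterpret_cast< const char*>( literal), N - 1);
}
```

With this, `narrow( u8"\u20AC")` gives the same three UTF-8 bytes under either standard, whereas the plain `"€"` literal chosen in the patch depends on the source and execution character sets.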

Contributor

http://wg21.link/p1423 may be of interest.
According to the man page for c++, g++ takes the default input character set (-finput-charset=charset) from the locale; if the locale does not specify one, UTF-8 is assumed. It can also be overridden on the command line.
For the execution character set (-fexec-charset=charset) the default is UTF-8.
What happens when the input charset is UTF-8 but the actual input is not valid UTF-8 (e.g. it is really ISO-8859-1 with non-ASCII characters) is probably implementation-defined. According to the Stack Overflow discussion, Clang gives a warning, while g++ just preserves the input bytes in this case.
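One way to make the character-set assumption explicit is a compile-time guard next to the test. This is a sketch only: the function name `euro_literal_is_utf8` is hypothetical, and it assumes the file containing it is saved as UTF-8. If the compiler reads the file with a different input character set, or uses a non-UTF-8 execution character set, the literal no longer holds the bytes E2 82 AC and the build fails instead of the test silently comparing against something else.

```cpp
// Hypothetical guard (not part of casual): verifies at compile time that
// the plain literal "€" ended up as the UTF-8 bytes E2 82 AC plus the
// terminating null.
constexpr bool euro_literal_is_utf8()
{
    constexpr char euro[] = "€";
    return sizeof( euro) == 4
        && static_cast< unsigned char>( euro[ 0]) == 0xE2
        && static_cast< unsigned char>( euro[ 1]) == 0x82
        && static_cast< unsigned char>( euro[ 2]) == 0xAC;
}

static_assert( euro_literal_is_utf8(), "source or execution character set is not UTF-8");
```

The guard does not detect every mismatch (a file that is not UTF-8 but is compiled as if it were may already fail or warn earlier, depending on the compiler), so it complements the flags above and does not replace them.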

@lazan lazan merged commit d68d3b7 into feature/1.7/main Jul 2, 2023
@lazan lazan deleted the feature/1.7/c++23-conformance branch July 24, 2023 09:20