I can also work around this problem by adding a manifest XML that sets my app's code page to CP_UTF8 on supported versions of Windows.
In CMake I wrapped this in a function:
```cmake
# target_add_manifest(<target> <manifest file>)
#
# You probably want to use ${MANIFEST_FILE_UTF8} defined below this function
#
# Adds a manifest file (https://learn.microsoft.com/en-us/windows/win32/sbscs/application-manifests)
# to an EXE
function(target_add_manifest TARGET_NAME MANIFEST_FILE)
  if(NOT TARGET_NAME)
    message(FATAL_ERROR "You must provide a target")
  endif()
  if(NOT MANIFEST_FILE)
    message(FATAL_ERROR "You must provide a manifest file")
  endif()
  # After linking, embed the manifest into the built executable with mt.exe
  add_custom_command(
    TARGET ${TARGET_NAME}
    POST_BUILD
    COMMAND "mt.exe" -manifest \"${MANIFEST_FILE}\" \"-updateresource:$<TARGET_FILE:${TARGET_NAME}>\"
  )
endfunction()
```
which is used like this (probably want to wrap in a platform check):
This solves the problem if the app is running on Windows version 1903 or later. It's still a bug, but I wanted to share this workaround because it's useful for the many libraries that have the same issue.
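Where a manifest is not an option (for example on Windows versions older than 1903), converting each path explicitly is another route. The following is only a minimal sketch: it assumes `std::filesystem::path::u8string()` is acceptable for the use case, and `to_utf8_string` is a hypothetical helper name, not something nlohmann/json provides.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper (not part of nlohmann/json): returns the path as UTF-8
// bytes in a std::string, independent of the process code page.
std::string to_utf8_string(const std::filesystem::path& p) {
    // u8string() yields UTF-8. It returns std::string in C++17 and
    // std::u8string (char8_t elements) from C++20 onward.
    const auto u8 = p.u8string();
    // Copy the bytes unchanged into a char-based string.
    return std::string(reinterpret_cast<const char*>(u8.data()), u8.size());
}
```

The result can then be stored in the JSON value instead of the path itself, for example `j["file"] = to_utf8_string(p);`.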
Description

This conversion function:
https://github.com/nlohmann/json/blob/7efe875495a3ed7d805ddbb01af0c7725f50c88b/include/nlohmann/detail/conversions/to_json.hpp#L416C1-L420C2
uses `p.string()`, which does not give a UTF-8-encoded string on Windows (in some cases, maybe?). Trying to `dump()` the resultant JSON throws an "invalid UTF-8 byte" exception.

Reproduction steps

Convert a `std::filesystem::path`, which contains a Unicode "Right Single Quotation Mark" character (U+2019), to a `json` implicitly or with `to_json`.
Inspect the new `json` (`string_t`)'s bytes, either by `dump()`ing, or converting to BSON.

Expected vs. actual results

Expected: "Strings are stored in UTF-8 encoding." per https://json.nlohmann.me/api/basic_json/string_t/
Actual: The string gets converted by `std::filesystem::path::string()`, which appears to convert it to Windows-1252 encoding. Its bytes end up as `\x92` rather than `\xe2\x80\x99`.

Minimal code example

Workaround I'm using is to use `WideCharToMultiByte` + `.native()` to get the string in UTF-8 before passing to nlohmann:

Error messages

[json.exception.type_error.316] invalid UTF-8 byte at index 0: 0x92

Compiler and operating system

MSVC 2022 Professional, C++ 20

Library version

develop - a259ecc

Validation

`develop` branch is used.
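The `WideCharToMultiByte` + `.native()` workaround mentioned under "Minimal code example" could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from the report: `native_path_to_utf8` is a hypothetical name, errors are reported via `std::runtime_error` by choice, and on non-Windows platforms the native narrow string is returned unchanged on the assumption that it is already UTF-8.

```cpp
#include <cstddef>
#include <filesystem>
#include <stdexcept>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: converts a path to a UTF-8 encoded std::string.
std::string native_path_to_utf8(const std::filesystem::path& p) {
#ifdef _WIN32
    // On Windows, native() is a UTF-16 std::wstring.
    const std::wstring& wide = p.native();
    if (wide.empty()) {
        return {};  // WideCharToMultiByte fails on zero-length input
    }
    const int wide_len = static_cast<int>(wide.size());
    // First call: ask how many bytes the UTF-8 output needs.
    const int needed = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wide_len,
                                           nullptr, 0, nullptr, nullptr);
    if (needed <= 0) {
        throw std::runtime_error("WideCharToMultiByte failed");
    }
    std::string utf8(static_cast<std::size_t>(needed), '\0');
    // Second call: perform the conversion into the sized buffer.
    const int written = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wide_len,
                                            utf8.data(), needed, nullptr, nullptr);
    if (written != needed) {
        throw std::runtime_error("WideCharToMultiByte failed");
    }
    return utf8;
#else
    // Assumption: the native narrow encoding is already UTF-8 here.
    return p.native();
#endif
}
```

Passing the returned string to the JSON value, rather than the path, avoids the implicit `p.string()` conversion described above.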