When outputting file paths, emojis will not be displayed if they are included in the path, here are my thoughts. #709

Open
DamonGX opened this issue Aug 16, 2023 · 2 comments

Comments

@DamonGX

DamonGX commented Aug 16, 2023

//#ifdef _MSC_VER
//std::wstring path::wstring() const
//{
// std::wstring_convert<std::codecvt_utf8<wchar_t>> convert;
// return convert.from_bytes(string());
//}
//#endif

#ifdef _MSC_VER
#include <stringapiset.h>
std::wstring path::wstring() const
{
	const char *str{string().c_str()};
	size_t strSize = strlen(str);
	int unicodeSize = MultiByteToWideChar(CP_UTF8, 0, str, strSize, NULL, 0);
	wchar_t *unicodeStr = new wchar_t[unicodeSize]{L'\0'};
	MultiByteToWideChar(CP_UTF8, 0, str, strSize, unicodeStr, unicodeSize);
	return {unicodeStr};
}
#endif

@doomlaur
Contributor

I'm not sure which version of xlnt you are using, but if it is version 1.5.0, you may want to try the latest commit from the master branch instead. The issue you describe was fixed in pull request #607 by switching to std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>>. As explained on cppreference, the previous code (std::codecvt_utf8<wchar_t>) converted UTF-8 to UCS-2 (the predecessor of UTF-16) on Windows, so code points outside the Basic Multilingual Plane (like emojis), which take 4 bytes in UTF-8 and a surrogate pair in UTF-16, failed to convert. Unfortunately, the fix has not been released in a stable version of xlnt yet, but in my experience the master branch is even more stable than version 1.5.0, as it contains many bugfixes, so you should definitely give it a go.
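
To illustrate why UCS-2 cannot hold an emoji, here is a minimal, portable sketch of the UTF-16 encoding step. The helper name `encode_utf16` is made up for this example and is not part of xlnt or the standard library. Code points up to U+FFFF fit into a single 16-bit unit, which is all UCS-2 offers; anything above that has to be split into a surrogate pair.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration only (not xlnt API):
// encode a single Unicode code point as UTF-16 code units.
std::u16string encode_utf16(char32_t cp)
{
    // Only valid scalar values are accepted: no surrogates, nothing above U+10FFFF.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::u16string out;
    if (cp < 0x10000)
    {
        // Basic Multilingual Plane: one 16-bit unit, representable in UCS-2.
        out.push_back(static_cast<char16_t>(cp));
    }
    else
    {
        // Supplementary planes (e.g. most emojis): split the 20-bit offset
        // into a high and a low surrogate. UCS-2 has no equivalent of this.
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
    }
    return out;
}
```

For example, U+1F600 (😀) comes out as the two units 0xD83D 0xDE00, while U+00E9 (é) stays a single unit.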

For the record: std::wstring_convert and the <codecvt> conversion facets provided by the C++ standard library have been deprecated in C++17. Since wide strings are only used on Windows, your solution is a very good alternative 👍 In fact, I'm already using that in some projects. To avoid the memory leak at the end of your code snippet (the wchar_t array never gets deleted) and to avoid copying unnecessarily, the alternative to std::wstring_convert could be the following (slightly adapted version of your code):

#ifdef _MSC_VER

#include <stringapiset.h>

std::wstring path::wstring() const
{
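	const std::string & path_str = string();
	// First call: ask for the required number of UTF-16 code units (no output buffer).
	int size = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, path_str.c_str(), static_cast<int>(path_str.length()), nullptr, 0);

	if (size > 0)
	{
		std::wstring path_converted(size, L'\0');
		// Second call: convert directly into the string's buffer.
		// &path_converted[0] is used because data() only returns a non-const pointer since C++17.
		size = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, path_str.c_str(), static_cast<int>(path_str.length()), &path_converted[0], size);
		return path_converted;
	}
	else
	{
		// Empty input or invalid UTF-8: return an empty wide string.
		return {};
	}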
}
#endif
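
If you want to check the expected result on a machine without the Win32 API, a portable reference decoder can help. The following is only a sketch under my own assumptions: `utf8_to_utf16` is a hypothetical name (not part of xlnt), it returns std::u16string instead of std::wstring, and it returns an empty string on malformed input, which is similar in spirit to MB_ERR_INVALID_CHARS but not guaranteed to match its validation rules exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference helper (not xlnt API): decode UTF-8 into UTF-16 code units.
// Returns an empty string if the input is malformed.
std::u16string utf8_to_utf16(const std::string &in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes that must follow

        if (lead < 0x80) { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { return {}; } // stray continuation byte or invalid lead byte

        if (in.size() - i <= extra) { return {}; } // sequence is truncated

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { return {}; } // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject overlong forms, surrogate code points and values above U+10FFFF.
        static const char32_t min_cp[] = {0, 0x80, 0x800, 0x10000};
        assert(extra <= 3);
        if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        {
            return {};
        }

        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

A path such as "a😀.xlsx" (bytes 61 F0 9F 98 80 2E 78 6C 73 78) should decode to 8 code units, with the emoji occupying two of them. On Windows, where wchar_t is 16 bits, the MultiByteToWideChar version above should produce the same units.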

@DamonGX
Author

DamonGX commented Aug 19, 2023

Wow. Thank you for your reply. Your code takes care of a memory leak that I had not considered, and I learned a lot from your explanation. Thank you again.
