print() output in console is garbled when using non-ASCII characters on Windows #87591

Closed
Calinou opened this issue Jan 25, 2024 · 11 comments · Fixed by #91147

Comments

@Calinou
Member

Calinou commented Jan 25, 2024

Tested versions

  • Reproducible in: 4.2.1.stable

System information

Windows 10 22H2

Issue description

Running the following code on Windows with the command prompt visible:

func _ready():
	print("Héllo world")
	print_rich("[b]Héllo world")
	print_verbose("Héllo world")
	prints("Héllo", "world")
	printt("Héllo", "world")
	printraw("Héllo world\n")
	print_debug("Héllo world")
	push_error("Héllo world")
	push_warning("Héllo world")

Results in:

HÚllo world
HÚllo world
HÚllo world
HÚllo   world
HÚllo world
HÚllo world
   At: res://Node2D.gd:10:_ready()
ERROR: HÚllo world
   at: push_error (core/variant/variant_utility.cpp:1091)
WARNING: HÚllo world
     at: push_warning (core/variant/variant_utility.cpp:1111)

Text appears correctly in the editor Output panel.

This occurs in both cmd.exe and PowerShell, with both the standard executable and the console wrapper. I haven't tested Windows Terminal or Windows 11 yet.
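One possible reading of the "HÚllo" output, offered as an assumption rather than a confirmed diagnosis: the byte 0xE9 (é in Windows-1252) reaches a console that decodes it with OEM code page 850, where 0xE9 is Ú. The thread does not state which code pages are involved, and the helper names below are hypothetical. A minimal sketch of that mismatch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: encode a code point in U+00A0..U+00FF as Windows-1252.
// In that range Windows-1252 matches Latin-1, so the byte equals the code point.
std::uint8_t to_windows_1252(char32_t cp) {
    assert(cp >= 0xA0 && cp <= 0xFF);
    return static_cast<std::uint8_t>(cp);
}

// Hypothetical helper: decode one byte as code page 850.
// Only the two entries needed for this example are filled in.
char32_t from_cp850(std::uint8_t byte) {
    switch (byte) {
        case 0x82: return 0x00E9; // é
        case 0xE9: return 0x00DA; // Ú
        default:   return byte < 0x80 ? static_cast<char32_t>(byte) : 0xFFFD;
    }
}
```

Under these assumptions, `from_cp850(to_windows_1252(0x00E9))` yields U+00DA (Ú), which matches the garbled output in the report.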

Steps to reproduce

Use print() or any derivatives with text that contains non-ASCII characters like é, ×, ©, and so on. Also try emoji for good measure: 🙂 👋🏻 (note that only Windows Terminal can display those correctly, at least with colors)

Minimal reproduction project (MRP)

test_unicode_print.zip

@AThousandShips
Member

Did some digging but didn't find an immediate solution. Unsure what's causing this quirk; it's probably something to do with how redirection or wide characters are handled.

@bruvzg bruvzg self-assigned this Jan 25, 2024
@bs-mwoerner
Contributor

bs-mwoerner commented Jan 25, 2024

You can fix the é by just doing

SetConsoleOutputCP(CP_UTF8);
WriteConsoleA(GetStdHandle(STD_OUTPUT_HANDLE), buf, len, NULL, NULL);

(no need for MultiByteToWideChar). I didn't manage to get this to print emoji, though. I also can't enter or paste emoji into a console window. Maybe the console just doesn't support characters beyond U+FFFF. 🤷‍♂️
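For context on the 0xFFFF boundary: code points above it, such as 🙂 (U+1F642), take four bytes in UTF-8 and a surrogate pair in UTF-16, whereas é (U+00E9) takes two bytes and a single UTF-16 unit. Whether that is why the console fails here is not established in the thread. The sketch below uses hypothetical helper names and assumes the input is a valid Unicode scalar value:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encode one code point as UTF-16.
// Assumes cp is a valid scalar value (not a surrogate, at most U+10FFFF).
std::u16string to_utf16(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));   // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
    }
    return out;
}

// Hypothetical helper: encode one code point as UTF-8 (same validity assumption).
std::string to_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
    return out;
}
```

For U+1F642, `to_utf16` returns the two units 0xD83D 0xDE42 and `to_utf8` returns the bytes F0 9F 99 82.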

Could do a pull request, but I think @bruvzg is already on it?

Edit: Oh, and I'm talking about WindowsTerminalLogger::logv() by the way.

@bruvzg
Member

bruvzg commented Jan 25, 2024

SetConsoleOutputCP is the most obvious solution to the issue, but there might be problems with the way Godot prints strings and with the underlying CRT code (which differs between MinGW and MSVC), so it should be tested in various conditions. I have not done anything with it so far, just added it to my to-do list.

@AThousandShips
Member

There are various prints that don't necessarily go through the logger, so the fix has to be compatible with those. They are largely crash-related or extreme error cases that use printf, but the fix still needs to be a catch-all. I don't know the deep details on that, though.

@bs-mwoerner
Contributor

Okay, then I'd better leave that to someone more versed in the inner workings. In case it helps, here's how far I got:

WindowsTerminalLogger::WindowsTerminalLogger() {
	// Ask the console to interpret the bytes written to it as UTF-8.
	SetConsoleOutputCP(CP_UTF8);
}

void WindowsTerminalLogger::logv(const char *p_format, va_list p_list, bool p_err) {
	if (!should_log(p_err)) {
		return;
	}

	const unsigned int BUFFER_SIZE = 16384;
	char buf[BUFFER_SIZE + 1]; // +1 for the terminating character
	int len = vsnprintf(buf, BUFFER_SIZE, p_format, p_list);
	if (len <= 0) {
		return;
	}
	if ((unsigned int)len >= BUFFER_SIZE) {
		// Output is too big and was truncated; a conforming vsnprintf() keeps
		// at most BUFFER_SIZE - 1 characters plus the terminator.
		len = BUFFER_SIZE - 1;
	}
	buf[len] = 0;

	// WriteFile() requires a byte-count pointer when no OVERLAPPED structure is passed.
	HANDLE handle = GetStdHandle(p_err ? STD_ERROR_HANDLE : STD_OUTPUT_HANDLE);
	DWORD written = 0;
	WriteFile(handle, buf, len, &written, NULL);

#ifdef DEBUG_ENABLED
	FlushFileBuffers(handle);
#endif
}

(WriteFile may be better than WriteConsole in case someone wants to redirect stdout...)
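The buffer handling in logv() relies on how vsnprintf reports truncation. As a portable illustration, assuming a C99-conforming vsnprintf (the CRT differences mentioned earlier in the thread may matter here), the hypothetical helper below returns the number of characters actually stored:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not from the Godot sources): format into buf and
// return how many characters were stored, excluding the terminator.
std::size_t format_clamped(char *buf, std::size_t size, const char *fmt, ...) {
    assert(size > 0);
    va_list args;
    va_start(args, fmt);
    int len = std::vsnprintf(buf, size, fmt, args);
    va_end(args);
    if (len <= 0) {
        buf[0] = '\0'; // Encoding error or empty output.
        return 0;
    }
    if (static_cast<std::size_t>(len) >= size) {
        // vsnprintf returned the length it wanted to write, but it stored
        // only size - 1 characters followed by the terminator.
        return size - 1;
    }
    return static_cast<std::size_t>(len);
}
```

With an 8-byte buffer and the text "Hello world", the helper returns 7 and the buffer holds "Hello w".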

[image]

@bruvzg
Member

bruvzg commented Feb 2, 2024

> In case it helps, here's how far I got:

This works with the main executable, but not with a wrapper. But a bit more complex variant using both WriteFile and WriteConsoleW seems to be working OK.

@bruvzg
Member

bruvzg commented Feb 2, 2024

> This works with the main executable, but not with a wrapper. But a bit more complex variant using both WriteFile and WriteConsoleW seems to be working OK.

Unfortunately, it does not work with redirects.
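For readers unfamiliar with the trade-off, one common pattern is to use WriteConsoleW only when the handle is a real console and to write raw UTF-8 bytes otherwise. This is a sketch of that general pattern under stated assumptions, not the variant described above and not necessarily what the eventual fix does. The helper name is hypothetical, and the Windows branch is guarded so the block also compiles elsewhere:

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not from the Godot sources): write UTF-8 text to
// stdout or stderr, choosing the API based on where the output is going.
bool write_utf8(bool to_stderr, const std::string &utf8) {
    if (utf8.empty()) {
        return true;
    }
#ifdef _WIN32
    HANDLE handle = GetStdHandle(to_stderr ? STD_ERROR_HANDLE : STD_OUTPUT_HANDLE);
    DWORD mode = 0;
    // GetConsoleMode() succeeds only for a console handle, not a file or pipe.
    if (handle != NULL && handle != INVALID_HANDLE_VALUE && GetConsoleMode(handle, &mode)) {
        int wide_len = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), (int)utf8.size(), NULL, 0);
        if (wide_len <= 0) {
            return false;
        }
        std::wstring wide(wide_len, L'\0');
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(), (int)utf8.size(), &wide[0], wide_len);
        DWORD written = 0;
        return WriteConsoleW(handle, wide.data(), (DWORD)wide.size(), &written, NULL) != 0;
    }
#endif
    // Redirected output (or a non-Windows platform): pass the bytes through.
    FILE *stream = to_stderr ? stderr : stdout;
    bool ok = std::fwrite(utf8.data(), 1, utf8.size(), stream) == utf8.size();
    std::fflush(stream);
    return ok;
}
```

Calling `write_utf8(false, "H\xC3\xA9llo world\n")` would take the WriteConsoleW path in an interactive console and the byte path when stdout is redirected to a file.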

@DidierMorandi

I did this: opened Language settings, selected Administrative language settings, clicked Change system locale..., checked the "Beta: Use Unicode UTF-8 for worldwide language support" box, and then restarted my PC.
When I run CHCP in the console, I get 65001. So far so good (as Laetizia Buonaparte would say).
But when I start Godot, I get this:
[image: bug_godot]
The accented character "é" has simply disappeared from "vidéo".

@DidierMorandi

I found this in the sources (godot/platform/windows/display_server_windows.cpp):
String msg = "Error " + itos(id) + ": " + String::utf16((const char16_t *)messageBuffer, size);

Maybe the use of utf16 (more or less as often as utf8 in the Godot sources for Windows) could be the cause of the problem?
