Can't load .cs files containing "special" characters (like ç è ì) in comments or strings. #27083
Comments
Could you attach an example containing one of your problematic source files, so that someone can verify whether it really is valid UTF-8 or not?
Sorry for the late reply. I've been trying to understand what was happening, because when I tried using an empty project to create those files, I was getting different results from before.
Saying that Godot doesn't save files as UTF-8 unless they contain non-ASCII characters isn't quite right, since a file containing only ASCII characters is a valid UTF-8 file. The problem is that VS doesn't much like UTF-8 files without a BOM and assumes that they are in the system encoding; at least it used to do so, probably for backwards-compatibility reasons. That is somewhat annoying, as most other code editors (from personal experience) just assume that a file is in UTF-8, and use of UTF-8 with a BOM is discouraged. For a mixed Godot/VS workflow you can try a VS extension like this
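To make the BOM point concrete, here is a minimal sketch of the two checks involved: whether a byte buffer starts with the UTF-8 byte order mark (EF BB BF), and whether it is pure ASCII. The function names `has_utf8_bom` and `is_pure_ascii` are hypothetical helpers for illustration. They are not part of Godot or Visual Studio.

```cpp
#include <string>

// Hypothetical helper: true if the buffer starts with the UTF-8 BOM (EF BB BF).
bool has_utf8_bom(const std::string &bytes) {
    return bytes.size() >= 3 &&
           static_cast<unsigned char>(bytes[0]) == 0xEF &&
           static_cast<unsigned char>(bytes[1]) == 0xBB &&
           static_cast<unsigned char>(bytes[2]) == 0xBF;
}

// Hypothetical helper: true if every byte is below 0x80, i.e. the data is
// pure ASCII and therefore also valid UTF-8.
bool is_pure_ascii(const std::string &bytes) {
    for (unsigned char c : bytes) {
        if (c >= 0x80) {
            return false;
        }
    }
    return true;
}
```

A file that is pure ASCII and has no BOM carries no hint of its encoding, so an editor has to guess. That is the situation described above.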
Thank you, karliss!
Is there anything we can do to ease the loading of UTF-8 files with BOM? Edit: It seems
Hi, about this error: when I use VS Code, it detects UTF-8.
Can anyone still reproduce this issue in Godot 3.5 or any later release? I was unable to reproduce this on Linux using VSCode to create a file with UTF8-BOM encoding.
Hi, I've reproduced it independently and found this issue whilst I was checking before posting my own. Affected demo project attached. This project was built in Godot 4 Beta 4. I'm running Windows 11, and the .cs file was written using VS2022 Pro. The game opens a window but immediately closes and no game starts. If I remove the special character (© in this case), the game launches normally.
Oh, ouch - I just re-opened my demo project and the problem manifests differently. If you open the demo project I supplied in Godot, it actually reports that So something in the guts of Godot is simply ignoring the C# file entirely.
Final update -> re-saving the file in VS using Save As -> with encoding -> UTF-8 (65001) seems to fix the file. This ties in with the comments above.
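The re-save described above amounts to re-encoding the file's bytes. A minimal sketch, assuming the file was saved in code page 1252 and uses only ASCII plus characters in the 0xA0-0xFF range (which covers ç, è, ì and ©). In that range code page 1252 coincides with ISO-8859-1, so each byte value equals its Unicode code point. The name `latin1_to_utf8` is hypothetical, and bytes 0x80-0x9F, where code page 1252 differs from ISO-8859-1, are not handled correctly by this sketch.

```cpp
#include <string>

// Hypothetical helper: re-encode ISO-8859-1 bytes as UTF-8.
// Assumption: input bytes are ASCII or in 0xA0-0xFF; bytes in 0x80-0x9F
// would need a code page 1252 lookup table, which is omitted here.
std::string latin1_to_utf8(const std::string &in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        if (c < 0x80) {
            // ASCII bytes are identical in UTF-8.
            out.push_back(static_cast<char>(c));
        } else {
            // Code points 0x80-0xFF become a two-byte UTF-8 sequence.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

For example, the single byte 0xE7 (ç in code page 1252) becomes the two bytes 0xC3 0xA7, and 0xA9 (©) becomes 0xC2 0xA9.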
@RobTF I tried your project, this is the console output:
Yes, as described by the error, the file was not loaded because it contains invalid UTF-8 characters, so later, when the scene tries to use it, it can't be found.
Files should be encoded in UTF-8. If your file doesn't contain valid UTF-8 characters, I think failing to load the file is expected and not a bug. Not sure why VS2022 Pro would write invalid characters to a UTF-8 encoded file though; could it be that you had selected a different encoding? VSCode detects the file encoding as UTF-8 though.
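To illustrate why such a file is rejected, here is a minimal sketch of a UTF-8 well-formedness check. It is not Godot's actual implementation, and `is_valid_utf8` is a hypothetical name. It rejects stray continuation bytes, truncated sequences, overlong encodings, surrogates and code points above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if the byte string is well-formed UTF-8.
bool is_valid_utf8(const std::string &s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0; // number of continuation bytes expected
        unsigned int cp = 0;   // decoded code point
        if (c < 0x80) {
            cp = c;
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1;
            cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2;
            cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3;
            cp = c & 0x07;
        } else {
            return false; // stray continuation byte or invalid lead byte
        }
        if (extra > n - i - 1) {
            return false; // sequence is cut off by the end of the data
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) {
                return false; // expected a continuation byte (10xxxxxx)
            }
            cp = (cp << 6) | (cc & 0x3F);
        }
        // Reject overlong encodings, surrogates and out-of-range values.
        if ((extra == 1 && cp < 0x80) || (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) || (cp >= 0xD800 && cp <= 0xDFFF) ||
            cp > 0x10FFFF) {
            return false;
        }
        i += extra + 1;
    }
    return true;
}
```

With this check, a comment containing ç saved as the single code page 1252 byte 0xE7 is rejected, because 0xE7 announces a three-byte sequence that never follows. The same character saved as 0xC3 0xA7 passes.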
Ah, got it, so I think the issue is that I didn't create the By default, VS seems to maintain the character set it finds when it opens the file, but I can use VS to "fix" a Godot-generated .cs file by forcing it to save specifically as UTF-8.
Hmm. Why does Godot not use UTF-8 when creating the file then?
I think that's the crux of it, at least it appears to be on my end. I get the feeling that full-fat Visual Studio handles these non-UTF-8 encodings perhaps more completely/correctly than Godot or VSCode; maybe if you open the Godot-generated .cs file in one of the latter two and re-save it, it silently converts to UTF-8, as that's all the tool understands, which in turn makes the problem less clear to some users. This also might be a Windows-only problem, as the
If I create the file with VS 2022, it works, thanks RobTF!
I had this issue in Godot 4.2 and VS 2022. I created a .cs file from inside Godot, and when I used Swedish special characters in a comment in the code, Godot could no longer load the script file nor the scene it was attached to.
Godot should always use UTF-8, but it's possible that VS expects a byte order mark to identify a UTF-8 file (and Godot is not adding it).
Yep, VS 2022 still assumes that a file without a BOM and without any Unicode characters is in code page 1252, and saves it as 1252 if you add characters from that encoding. At least it asks to convert the file to UTF-8 if you add other characters, and it seems to detect files that already contain Unicode characters. But we should probably add a BOM; Godot (and most other software) does not care whether it's there, so it should not cause any issues. Adding it for
diff --git a/modules/mono/csharp_script.cpp b/modules/mono/csharp_script.cpp
index 33fef2d58c..21826f7b3f 100644
--- a/modules/mono/csharp_script.cpp
+++ b/modules/mono/csharp_script.cpp
@@ -2950,6 +2950,9 @@ Error ResourceFormatSaverCSharpScript::save(const Ref<Resource> &p_resource, con
Ref<FileAccess> file = FileAccess::open(p_path, FileAccess::WRITE, &err);
ERR_FAIL_COND_V_MSG(err != OK, err, "Cannot save C# script file '" + p_path + "'.");
+ file->store_8(0xEF); // Store UTF-8 BOM.
+ file->store_8(0xBB);
+ file->store_8(0xBF);
file->store_string(source);
if (file->get_error() != OK && file->get_error() != ERR_FILE_EOF) {
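One detail the patch above leaves open is a script whose source already starts with a BOM, which would then be saved with two. A minimal sketch of a guard in standard C++, assuming the source is held as a UTF-8 byte string. `with_utf8_bom` is a hypothetical helper, not a Godot API.

```cpp
#include <string>

// Hypothetical helper: return the source prefixed with the UTF-8 BOM,
// unless it already starts with one.
std::string with_utf8_bom(const std::string &source) {
    static const std::string bom = "\xEF\xBB\xBF";
    if (source.compare(0, bom.size(), bom) == 0) {
        return source; // already has a BOM, leave it unchanged
    }
    return bom + source;
}
```

Applying the helper twice gives the same result as applying it once, so repeated saves do not stack BOMs.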
Hi there! Here to say that this was exactly my problem too, and it was solved exactly as mentioned :). I don't know anything about this matter, but I suppose it's not as easy as making Godot force the conversion of scripts to UTF-8 :/
Godot version:
3.1 Stable Mono (at least since 3.1 Beta 3)
Official builds, x64 versions.
OS/device including version:
Windows 7 Ultimate x64
Issue description:
Can't load .cs files containing "special" characters in comments or strings.
Example of such characters: ç è ì.
Unicode error: invalid utf8
modules/mono/utils/string_utils.cpp:181 - Method/Function Failed, returning: ERR_INVALID_DATA
Script 'res://Button03.cs' contains invalid unicode (utf-8), so it was not loaded. Please ensure that scripts are saved in valid utf-8 unicode.
modules/mono/csharp_script.cpp:2898 - Condition ' err != OK ' is true. returned: RES()
Failed loading resource: res://Button03.cs
scene/resources/resource_format_text.cpp:175 - Couldn't load external resource: res://Button03.cs
editor/editor_data.cpp:564 - Index p_idx=1 out of size (edited_scene.size()=1)
I'm using Visual Studio 2017 with default settings for encoding, those characters are valid UTF-8 (even ANSI).
Steps to reproduce:
Open any .cs script,
add any of those characters to a string or a comment,
try to build/run the game or load the scene containing said script.
Minimal reproduction project: