New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
COMMON: Skip possible UTF-8 BOM when reading INI files #5435
Conversation
WalkthroughThe updates across multiple files introduce a new feature to handle the UTF-8 byte-order mark (BOM) in configuration and INI file parsing. When loading from a stream, the system now checks for and removes the BOM to ensure the subsequent data is interpreted correctly. This change is consistent across the Changes
TipsChat with CodeRabbit Bot (
|
How do these characters appear in the file in the first place? After editing with an external editor? |
Yes, it was added with an external editor. |
This is a good idea and I think the simple approach is appropriate. I think the comment should include that external editors can be the source, and maybe the consequences of not stripping it. Otherwise the code will keep raising those questions. I expected coderabbit to suggest sizeof(UTF8_BOM) instead of 3! =) |
To be fair, I was specifically testing UTF-8 with BOM when trying to determine why u8.ini wasn't being read from ultima8.dat. I just did not expect to be prompted to replace an invalid scummvm.ini or that exiting instead of specifically clicking cancel would replace it. In a previous job where we implemented UTF-8 reading of INI configuration files, we relied on the BOM to indicate that it should not be read with the system codepage. My understanding is this has become uncommon as UTF-8 BOM is discouraged. |
5fe7808
to
0f45263
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (2)
- common/config-manager.cpp (2 hunks)
- common/formats/ini-file.cpp (2 hunks)
Files skipped from review due to trivial changes (2)
- common/config-manager.cpp
- common/formats/ini-file.cpp
Okay, thank you! |
After briefly considering a more complex solution including detection additional encoding byte-order marks, I thought it would be best to simply read and skip UTF-8 BOM when reading INI files. This would at least prevent someone like me from accidentally wiping out their scummvm.ini files.
Summary by CodeRabbit