Inconsistent character encoding detection on open vs reload #2843
Hi, using version 7.3.1 (the bug was also present in a previous version), I've been working on some UTF-8 XML. The XML class I'm working with only accepts UTF-8 input and will crash otherwise, and I'm also putting the file through PHP's libxml error detection to confirm whether it's UTF-8, which it is when this error occurs.
When I open a newly created file, the document displays as UTF-8 as expected. However, when Notepad++ automatically refreshes the file, or I reload it after it has been updated, the file encoding changes from UTF-8 to TIS-620. The character that mainly seems to trigger it is the a with an acute accent, "á" (in my testing at least).
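A small Python sketch of why this misdetection is possible at all: the two bytes that encode "á" in UTF-8 are also a valid byte sequence in TIS-620, so both readings succeed. This only illustrates the ambiguity; it says nothing about how Notepad++ actually picks an encoding. It assumes Python's standard `tis_620` codec.

```python
# Illustration only: the UTF-8 bytes for "á" also decode cleanly as TIS-620.
utf8_bytes = "\u00e1".encode("utf-8")      # "á" becomes the two bytes 0xC3 0xA1

as_utf8 = utf8_bytes.decode("utf-8")       # read back as UTF-8: one character
as_tis620 = utf8_bytes.decode("tis_620")   # same bytes read as Thai: two characters

print(utf8_bytes.hex(), repr(as_utf8), repr(as_tis620))
```

Because neither decode raises an error, a detector that guesses from byte statistics has no hard evidence against TIS-620 here.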
I took a look at a couple of the other issues raised and tried turning off Preferences -> Misc -> Autodetect character encoding. That worked after closing the original file, opening it again, changing it externally and then reloading it. It did not fix the problem when I only changed the option and then reloaded the file after an external change.
I also have Preferences -> New Document -> Encoding -> UTF-8 -> Apply to opened ANSI files [ON]. Altering this didn't seem to do anything for me, but I may not have tested that properly.
*When I reference UTF-8, it's without BOM.
Notepad++ v7.3.1 (64-bit)
I don't know how Notepad++ attempts to detect UTF-8 files, but usually the only thing you can do is rule out UTF-8, which is possible when the file contains bytes or sequences that are illegal in UTF-8.
(By the way, your XML class will only crash if you happen to use illegal UTF-8 characters or sequences.)
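The "rule out UTF-8" check described above can be sketched in a few lines of Python. This is a minimal illustration using a strict decode, not Notepad++'s detection code; the helper name `is_valid_utf8` is made up for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return False if the bytes contain anything illegal in UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample inputs for illustration.
good = "<name>\u00e1</name>".encode("utf-8")   # "á" encoded as 0xC3 0xA1
latin1 = "<name>\u00e1</name>".encode("latin-1")  # "á" as the lone byte 0xE1
truncated = b"<name>\xc3"                       # lead byte without continuation
overlong = b"\xc0\xaf"                          # overlong form, illegal in UTF-8

for label, sample in [("good", good), ("latin1", latin1),
                      ("truncated", truncated), ("overlong", overlong)]:
    print(label, is_valid_utf8(sample))
```

A `True` result only means the bytes are not provably something else; as the TIS-620 case in this issue shows, the same bytes can still be valid in another encoding.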