Searching files without UTF-8 BOM? #51859
Answered by look
Liturgist asked this question in Code Search and Navigation
Documentation says that only UTF-8 encoded files are searched. Does that mean that a UTF-8 BOM (Byte Order Mark) is required at the beginning of the file? It does not appear so. Is a file omitted from search if a non-UTF-8 character sequence is found?
Answered by
look
Apr 3, 2023
That's correct. We convert the raw bytes to UTF-8 and if the conversion is not successful, the blob is excluded.
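For anyone who wants to check their own files, here is a rough local approximation of that rule. It is only a sketch, not GitHub's actual code: it assumes the check amounts to a strict UTF-8 decode, and the helper names `decodes_as_utf8` and `has_utf8_bom` are made up for illustration. The real indexer may also detect and transcode other encodings, which this sketch does not attempt.

```python
import codecs


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if the bytes are valid UTF-8 (hypothetical helper).

    Python's "utf-8" codec is strict by default, so any invalid byte
    sequence raises UnicodeDecodeError. A leading BOM is neither required
    nor rejected: it simply decodes to the character U+FEFF.
    """
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def has_utf8_bom(raw: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return raw.startswith(codecs.BOM_UTF8)


if __name__ == "__main__":
    samples = {
        "plain UTF-8, no BOM": "héllo".encode("utf-8"),
        "UTF-8 with BOM": codecs.BOM_UTF8 + "héllo".encode("utf-8"),
        "Latin-1 bytes": "héllo".encode("latin-1"),
    }
    for label, raw in samples.items():
        print(label, "| BOM:", has_utf8_bom(raw), "| valid UTF-8:", decodes_as_utf8(raw))
```

To test a real file, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to `decodes_as_utf8`. A file that fails this check is at least a candidate for being left out of search under the rule described above.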
Answer selected by
look