Fix #36951: Add the "file.encoding_candidate" attribute for selecting encoding to open file. #117053

a45s67 · 2021-02-19T16:03:05Z

TLDR

I try to solve #36951 in a way similar to "set fileencodings=xxx,xxx" in Vim.

How to use:

To have the editor try to open a file as "utf-8", then "big5", then "...(some codepages you want)", in that order, add the following to settings.json:

"file.encoding_candidate": ["utf-8", "big5", "....(some codepages you want)"]

The editor will then try to open the file as "utf-8"; if that fails, it tries "big5", and so on.

Note:

I use "TextDecoder" to check whether the file can be opened with a given encoding.
Please make sure that every encoding you list is supported by "TextDecoder" before using it.
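
The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `pickEncoding` is hypothetical, and it assumes a runtime whose `TextDecoder` supports the `fatal` option.

```typescript
// Hypothetical helper, not the PR's actual code: returns the first candidate
// label that decodes `bytes` without a decoding error, or undefined if none does.
function pickEncoding(bytes: Uint8Array, candidates: string[]): string | undefined {
  for (const label of candidates) {
    let decoder: TextDecoder;
    try {
      // The constructor throws a RangeError when the runtime does not know the label.
      decoder = new TextDecoder(label, { fatal: true });
    } catch {
      continue; // unsupported encoding: skip this candidate
    }
    try {
      // With fatal: true, decode() throws a TypeError on malformed input.
      decoder.decode(bytes);
      return label;
    } catch {
      // Malformed for this encoding: fall through to the next candidate.
    }
  }
  return undefined;
}

// Valid UTF-8 bytes are accepted by the first candidate.
const picked = pickEncoding(new Uint8Array([0x68, 0x69]), ["utf-8", "big5"]);
```

Skipping unknown labels instead of failing is a design choice of this sketch; whether an unsupported entry in the setting should be silently ignored or reported is left open.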

Priority:

auto-detected encoding < file.encoding_candidate (the candidate list takes precedence over auto-detection)

For #36951.

ghost · 2021-02-19T16:03:20Z

All CLA requirements met.

bpasero · 2021-02-19T16:30:45Z

Thanks for the PR. A quick note on expectations: I will probably not have time to review this until we get to our yearly issue grooming iteration in October.

bpasero · 2021-09-18T09:26:52Z

This seems to be building around the fatal flag on TextDecoder (which I was not aware of). But are we sure this works as we think it does, signalling whether the encoding is valid for a buffer?

https://encoding.spec.whatwg.org/#error-mode
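
To make the question concrete, here is a small sketch of what the fatal flag reports. The helper name `decodesCleanly` is hypothetical, and the sketch assumes a runtime that supports `fatal` and the "utf-16le" label. It suggests that a decode without errors shows only that the bytes are well-formed in that encoding, not that the encoding is the right one.

```typescript
// Hypothetical helper: true when `bytes` decode under `label` without an error.
// It also returns false when the runtime does not support `label` at all.
function decodesCleanly(bytes: Uint8Array, label: string): boolean {
  try {
    new TextDecoder(label, { fatal: true }).decode(bytes);
    return true;
  } catch {
    return false;
  }
}

const ascii = new Uint8Array([0x68, 0x69, 0x21, 0x21]); // "hi!!" as ASCII / UTF-8
const broken = new Uint8Array([0xc3, 0x28]); // 0xC3 must be followed by a continuation byte

const asciiAsUtf8 = decodesCleanly(ascii, "utf-8"); // well-formed UTF-8
const brokenAsUtf8 = decodesCleanly(broken, "utf-8"); // malformed UTF-8
// The same ASCII bytes are also two well-formed UTF-16LE code units,
// even though the text was never UTF-16.
const asciiAsUtf16 = decodesCleanly(ascii, "utf-16le");
```

Under these assumptions the first and third checks succeed and only the second fails, so a permissive encoding placed early in a candidate list could mask the correct one.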

bpasero · 2021-09-18T09:31:12Z

Besides, for performance and memory reasons we cannot really decode the entire document at once just to check for errors, and we cannot decode a single chunk because we don't really know the byte size...

bpasero · 2021-09-18T12:01:55Z

Also, TextDecoder with an encoding other than UTF-8 will not work in browsers, so I am inclined to close this PR.

a45s67 · 2021-10-01T18:07:53Z

Thanks @bpasero for the comments!

Yes, I did not consider the details you mentioned (fatal flag, byte size) when implementing this feature.
Some of the code in my commits needs changes to be more reliable.

To make it work in browsers, I may try to find another text decoder to use instead of TextDecoder.

But is there a way to check the encoding of a file without decoding all of its content? I thought decoding everything was necessary to get an accurate result.
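
One possible direction, sketched under assumptions and not taken from this PR: `TextDecoder.decode` accepts a `stream` option that carries an incomplete trailing byte sequence over to the next call, so chunks can be validated one at a time without knowing where characters end. The helper name `chunksDecodeCleanly` is hypothetical. Note that this still reads every byte; it only avoids holding the whole decoded text in memory.

```typescript
// Hypothetical helper: validates chunks one by one with a fatal, streaming
// decoder, so the file never has to be decoded in a single call.
function chunksDecodeCleanly(chunks: Iterable<Uint8Array>, label: string): boolean {
  try {
    const decoder = new TextDecoder(label, { fatal: true });
    for (const chunk of chunks) {
      // stream: true keeps an incomplete trailing sequence buffered until the
      // next call; the decoded string is discarded immediately.
      decoder.decode(chunk, { stream: true });
    }
    // Final flush; per the Encoding Standard this reports input that ends mid-sequence.
    decoder.decode();
    return true;
  } catch {
    return false;
  }
}

// "aéb" in UTF-8, with the two bytes of "é" (0xC3 0xA9) split across chunks.
const splitOk = chunksDecodeCleanly(
  [new Uint8Array([0x61, 0xc3]), new Uint8Array([0xa9, 0x62])],
  "utf-8"
);
// 0xFF never appears in well-formed UTF-8.
const invalid = chunksDecodeCleanly(
  [new Uint8Array([0x61]), new Uint8Array([0xff])],
  "utf-8"
);
```

Stopping after the first N chunks would make this a heuristic instead of a full check, with the accuracy trade-off raised in the question above.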

a45s67 added 3 commits February 1, 2021 22:25

Add attribute 'file.encoding_candidate'

aba2e4e

chore: do some work on comments

7327a77

chore: restore some default setting and remove debug.log

0dcae41

vscode-triage-bot assigned bpasero Feb 19, 2021

bpasero added this to the Backlog milestone Feb 19, 2021

bpasero marked this pull request as draft February 19, 2021 16:30

fixed: handle the case when "files.encoding_candidate" is not exist

5f30f41

a45s67 marked this pull request as ready for review February 20, 2021 18:56

fixed: move the candidate selection function to getReadEncoding()

7a71fbd

bpasero mentioned this pull request Feb 22, 2021

Allow to configure a list of encodings to use when guessing #36951

Closed

bpasero marked this pull request as draft March 8, 2021 06:57

bpasero added the info-needed Issue requires more information from poster label Sep 18, 2021

github-actions bot locked and limited conversation to collaborators Nov 10, 2021
