Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

json schema parsing should not complain about UTF-8 BOM in schemas #163683

Closed
AArnott opened this issue Oct 14, 2022 · 8 comments
Closed

json schema parsing should not complain about UTF-8 BOM in schemas #163683

AArnott opened this issue Oct 14, 2022 · 8 comments
Assignees
Labels
info-needed Issue requires more information from poster

Comments

@AArnott
Copy link
Member

AArnott commented Oct 14, 2022

Type: Bug

The JSON editor supports the $schema property, which is great. And although it apparently works when those schemas start with a UTF-8 BOM, it complains when doing so.

image

UTF-8 BOM at the beginning of files is not a defect. Please remove the warning about not supporting them (and continue to actually work when they are present).

VS Code version: Code 1.72.1 (129500e, 2022-10-10T17:22:48.346Z)
OS version: Windows_NT x64 10.0.22621
Modes:
Sandboxed: No

System Info
Item Value
CPUs Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz (8 x 2793)
GPU Status 2d_canvas: unavailable_software
canvas_oop_rasterization: disabled_off
direct_rendering_display_compositor: disabled_off_ok
gpu_compositing: disabled_software
multiple_raster_threads: enabled_on
opengl: disabled_off
rasterization: disabled_software
raw_draw: disabled_off_ok
skia_renderer: enabled_on
video_decode: disabled_software
video_encode: disabled_software
vulkan: disabled_off
webgl: unavailable_software
webgl2: unavailable_software
webgpu: disabled_off
Load (avg) undefined
Memory (System) 31.95GB (16.42GB free)
Process Argv --folder-uri file:///c%3A/Users/andarno/source/repos/visual-studio-green --crash-reporter-id 584b0dd9-24bd-43e4-935a-f762d9a54323
Screen Reader no
VM 0%
Extensions (68)
Extension Author (truncated) Version
vscode-sidebar Acr 1.2.2
tsl-problem-matcher amo 0.6.2
vscode-zipfs arc 3.0.0
azurite Azu 3.19.0
search-crates-io bel 1.2.1
LinkCheckMD bla 0.3.1
better-toml bun 0.3.2
npm-intellisense chr 1.4.2
path-intellisense chr 2.8.1
jshint dba 0.11.0
vscode-eslint dba 2.2.6
githistory don 0.6.19
xml Dot 2.5.1
es7-react-js-snippets dsz 4.4.3
gitlens eam 12.2.2
EditorConfig Edi 0.16.4
vscode-npm-script eg2 0.3.29
prettier-vscode esb 9.9.0
json-pretty-printer eus 1.1.0
git-project-manager fel 1.8.2
dotnet-test-explorer for 0.7.8
vscode-pull-request-github Git 0.50.0
vscode-jasmine-test-adapter hbe 1.8.2
vscode-test-explorer hbe 2.21.1
docker-linter hen 0.5.0
crates-completer jed 1.2.1
docomment k-- 0.1.31
vscode-jest-test-adapter kav 0.8.1
azure-pipelines ms- 1.208.0
vscode-docker ms- 1.22.1
csharp ms- 1.25.0
vscode-dotnet-runtime ms- 1.5.0
python ms- 2022.16.1
vscode-pylance ms- 2022.10.20
jupyter ms- 2022.9.1202862440
jupyter-keymap ms- 1.0.0
jupyter-renderers ms- 1.0.10
vscode-jupyter-cell-tags ms- 0.1.6
vscode-jupyter-slideshow ms- 0.1.5
remote-containers ms- 0.255.3
remote-ssh ms- 0.90.1
remote-ssh-edit ms- 0.84.0
remote-wsl ms- 0.72.0
vscode-remote-extensionpack ms- 0.21.0
azure-account ms- 0.11.2
cpptools ms- 1.12.4
powershell ms- 2022.8.5
test-adapter-converter ms- 0.1.6
vsliveshare ms- 1.0.5735
vscode-versionlens pfl 1.0.10
java red 1.11.0
vscode-yaml red 1.10.1
rust-analyzer rus 0.3.1238
crates ser 0.5.10
gitconfig sid 2.0.1
vscode-rust-test-adapter Swe 0.11.0
vscode-lldb vad 1.8.1
intellicode-api-usage-examples Vis 0.2.5
vscodeintellicode Vis 1.2.28
vscode-java-debug vsc 0.45.0
vscode-java-dependency vsc 0.21.0
vscode-java-pack vsc 0.25.3
vscode-java-test vsc 0.37.1
vscode-maven vsc 0.39.0
vscode-icons vsc 11.20.0
php-debug xde 1.29.0
vsc-docker Zim 0.34.0
php-intellisense zob 1.0.11

(1 theme extensions excluded)

A/B Experiments
vsliv368:30146709
vsreu685:30147344
python383:30185418
vspor879:30202332
vspor708:30202333
vspor363:30204092
vswsl492:30256859
vslsvsres303:30308271
pythonvspyl392:30443607
vserr242cf:30382550
pythontb:30283811
vsjup518:30340749
pythonptprofiler:30281270
vshan820:30294714
vstes263:30335439
vscoreces:30445986
pythondataviewer:30285071
vscod805:30301674
binariesv615:30325510
bridge0708:30335490
bridge0723:30353136
cmake_vspar411:30581797
vsaa593:30376534
pythonvs932:30410667
cppdebug:30492333
vscaat:30438848
vsclangdf:30486550
c4g48928:30535728
dsvsc012:30540252
azure-dev_surveyone:30548225
2144e591:30553903
vsccc:30566497
pyindex848:30577860
nodejswelcome1cf:30587006
40g7c324:30573242

@VSCodeTriageBot
Copy link
Collaborator

Thanks for creating this issue! It looks like you may be using an old version of VS Code, the latest stable release is 1.72.2. Please try upgrading to the latest version and checking whether this issue remains.

Happy Coding!

@AArnott
Copy link
Member Author

AArnott commented Oct 14, 2022

It still repros on the latest.

And by the way, you can repro the problem yourself with this JSON file:

{
	"$schema": "https://raw.githubusercontent.com/dotnet/Nerdbank.GitVersioning/dc0a8cfe732e7e589836225666b81e1b7ef58d82/src/NerdBank.GitVersioning/version.schema.json",
	"version": "1.0"
}

@aeschli
Copy link
Contributor

aeschli commented Oct 17, 2022

See https://www.rfc-editor.org/rfc/rfc8259#section-8.1

Implementations MUST NOT add a byte order mark (U+FEFF) to the
   beginning of a networked-transmitted JSON text.  In the interests of
   interoperability, implementations that parse JSON texts MAY ignore
   the presence of a byte order mark rather than treating it as an
   error.

I assume that can be applied when stored in files.

@aeschli aeschli closed this as completed Oct 17, 2022
@aeschli aeschli reopened this Oct 17, 2022
@aeschli
Copy link
Contributor

aeschli commented Oct 17, 2022

I can remove or reword the warning if you have pointers to specs that clarify that. From what I've seen many consumers of JSON struggle with it, so I think the warning is useful.

@aeschli aeschli added the info-needed Issue requires more information from poster label Oct 17, 2022
@AArnott
Copy link
Member Author

AArnott commented Oct 17, 2022

Thank you for responding quickly and for the RFC link. It makes sense that JSON shouldn't include the BOM when it's network transmitted. Such cases like HTTP include headers that already identify the encoding anyway.
It's an interesting case though for JSON schemas where the schemas so commonly come from github, and the web server doesn't know it's JSON and the file on disk doesn't know it's being transmitted. It seems like code editors should be more forgiving in the JSON schema case.
Alternatively, it seems that code editors should strongly resist using a BOM in the encoding when saving .json files to help developers such as myself to conform with this requirement. Maybe a .gitattributes or .editorconfig rule to help enforce it. Are you aware of any such mitigations? I'd happily apply them.

@aeschli
Copy link
Contributor

aeschli commented Oct 18, 2022

I'm not aware of such a feature. In VS Code you can define the encoding to be used for by default for writing files: files.encoding, but we always persist the encoding.
#47089 requests to do this per language.

So ok to close this issue? I think issuing the warning when loading the schema from a URL is proper.

@sharwell
Copy link
Member

sharwell commented Nov 9, 2022

@aeschli This issue presents a case where assuming the input strictly adheres to this section of the JSON specification is causing a problem for some users. In addition, the JSON specification explicitly allows a JSON processor (e.g. VS Code) to accept input in this form in order to avoid causing this pain for users. I believe it would be net-beneficial to update VS Code accordingly, even if the input is not supposed to be in this form.

In the interests of interoperability, implementations that parse JSON texts MAY ignore the presence of a byte order mark rather than treating it as an error.

Note that we don't necessarily need to remove this warning from the current JSON document during editing, but for the case of JSON Schema the user is not actively editing the schema and just wants to get work done. I'm specifically requesting the BOM be ignored in the linked schema file.

@aeschli
Copy link
Contributor

aeschli commented Nov 10, 2022

What you see is just a warning (not an error). We just ignore the BOM and and continue to process the schema.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 2, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
info-needed Issue requires more information from poster
Projects
None yet
Development

No branches or pull requests

4 participants