Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Scintilla's JSON lexer for JSON instead of the JavaScript/C++ lexer #2454

Closed
wants to merge 1 commit into from
Closed

Conversation

d5l6
Copy link

@d5l6 d5l6 commented Mar 17, 2020

This commit changes JSON from a custom filetype to one with its own lexer, as requested in
#384 (comment). The lexer is taken from Scintilla's upstream
(version 3.10.4).

This has the following effects:

  • JavaScript comments (//, /* */) are no longer highlighted as comments, since JSON doesn't allow them
  • Text outside of quotes, except keywords and numbers, is now considered an error, and highlighted as such (in red)

It should also allow keys and string values to be highlighted in different colours, as in many other editors, but I couldn't get that to work.

Stopping comments from being highlighted is the main motivation for this - highlighting them misleads the user into thinking that JSON allows them.

This commit changes JSON from a custom filetype to one with its own lexer, as requested in
#384 (comment). The lexer is taken from Scintilla's upstream
(version 3.10.4).

This has the following effects:
 - JavaScript comments (//, /* */) are no longer highlighted as comments, since JSON doesn't allow them
 - Text outside of quotes, except keywords and numbers, is now considered an error, and highlighted as such (in red)

It should also allow keys and string values to be highlighted in different colours, as in many other editors, but I couldn't get that to work.

Stopping comments from being highlighted is the main motivation for this - highlighting them misleads the user into thinking that JSON allows them.
@ell1e
Copy link

ell1e commented May 12, 2020

Isn't it common that JSON parsers accept comments as an extension? I know many also don't, but unless you want to split up into two separate JSON-pure and JSON-with-comments file types it might be better to keep in the comment highlighting

@eht16
Copy link
Member

eht16 commented May 15, 2020

@ell1e what is "common"? This is hard to define in a general way and the JSON standard does not allow comments.

@d5l6 I wonder if it wouldn't be better to keep the custom filetype and just add the Scintilla JSON lexer to Geany and update the custom filetype definition to use it.
Otherwise we need also a migration path for existing users or they will end up with two JSON filetypes. Not sure which one would be used then.

@ell1e
Copy link

ell1e commented May 16, 2020

@eht16 just google "JSON for comments", apart from posts telling you it's not allowed you'll find quite a bunch of different parsers to enable this. And Visual Studio Code (which is pretty popular) uses this style for all its configs, too. I don't have proper statistics, but I think it should be easy enough to see that a non-trivial amount of people use this

@eht16
Copy link
Member

eht16 commented May 16, 2020

@ell1e I still think a dedicated JSON filetype should (at least by default) not allow comments.
But we don't even need to make a hard decision here as the Scintilla JSON lexer has a property to toggle highlighting of comments: lexer.json.allow.comments.

So, one can add:

[lexer_properties]
lexer.json.allow.comments=1

to the filetype definition and so enable highlighting of comments if desired.
@d5l6 this could go as commented and documented option in the filetype definition (for reference see https://sourceforge.net/p/scintilla/code/ci/default/tree/lexers/LexJSON.cxx#l121).

@ell1e
Copy link

ell1e commented May 16, 2020

I think the best way could be to have a JSON with comments file type for the lenient parsing if the core JSON type doesn't allow it. A global option as you suggest will basically force anyone to disable strict mode if they have just have just 1+ files that rely on the expanded syntax, which IMHO turns the option useless. It's not like usually a dev's choice, really, what files they have to deal with. So with a global option alone you might as well just leave things as they are right now (which I think wouldn't be the worst thing to do, actually) - edit: although as an addition it could be a nice to have, beyond two separate file types

@elextr
Copy link
Member

elextr commented May 16, 2020

The JSON lexer setting is global for the filetype, so there would need to be two filetypes to handle both commentable JSON and uncommentable JSON. Changing a filetype setting during editing is simply not usable. But AFAIK there is nothing to distinguish between the variants so different filetypes can be selected. The whole file would have to be searched for comments before it was distinguished if it the file is a JSON with comments or a JSON without comments.

And then just because a file has no comments when opened doesn't say the user doesn't want to add them if the tool they are feeding the JSON to allows, and Geany doesn't know that tool.

As a general principle the highlighting lexer and even the tags parser are not meant to be syntax checkers, they should be as inclusive as possible unless its possible to distinguish which variant is intended. For example the C++ lexer and parser don't distinguish if the file is C++98 or C++20 nor the C lexer/parser C89 to C21.

But the use the specific JSON lexer is still useful if it has other benefits, but it should be more inclusive by default, not less.

@ell1e
Copy link

ell1e commented May 16, 2020

Changing a filetype setting during editing is simply not usable.

For what it's worth, I think that is what VS Code does: it attempts to auto-detect (not sure if just based on file extension or actually content) but t the docs suggest it shows the resulting filetype somewhere prominent, I think on the statusbar, where it can easily be changed. So it kind of expects you to manually switch during editing if it gets it wrong I believe. But I personally also prefer the JSON highlighter to just not be as strict in general, so that I don't need to do manual toggling if it gets it wrong

@codebrainz
Copy link
Member

@ell1e I still think a dedicated JSON filetype should (at least by default) not allow comments.
But we don't even need to make a hard decision here as the Scintilla JSON lexer has a property to toggle highlighting of comments: lexer.json.allow.comments...

+1

The JSON lexer setting is global for the filetype, so there would need to be two filetypes to handle both commentable JSON and uncommentable JSON.

This seems OK. We could ship the proper JSON filetype and any users who want to derive some customized version of the language, could simply define a custom filetype inheriting from the real JSON filetype but with their overrides (such as the property @eht16 mentioned).

It's easy enough to switch filetypes at runtime, or add a new mapping for the custom JSON-like filetype to filetype_mappings.conf to have it auto-detected.

@ell1e
Copy link

ell1e commented May 16, 2020

I would recommend to ship a JSON with comments filetype from the start because it does seem to be somewhat common to me (mostly due to MS's VS Code which popularized). Being able to extend it is always cool, but obviously the user experience is better if the right file mode is already available in the built-in list. With that given, it seems reasonable to then make the "regular" JSON type be strict about comments being not allowed and all that

Fwiw, "JSON with comments" usually means allowing trailing commas in objects and lists, comment type // line comment, comment type /* block comment */, and sometimes comment type # alternate line comment. I would recommend accepting these all as valid in a JSON with comments file type if you choose to provide one

@codebrainz
Copy link
Member

would recommend to ship a JSON with comments filetype from the start because it does seem to be somewhat common to me (mostly due to MS's VS Code which popularized)...

Sure, like "JSON (strict)" filetype using the real JSON lexer, and then a custom filetype "JSON (loose)" inheriting from the Javascript lexer like the current JSON custom filetype does.

@ell1e
Copy link

ell1e commented May 16, 2020

Sounds good! Although I'd call it "JSON (strict)" & "JSON (with comments)", because that is how most people seem to refer to the looser interpretation on the web. But that's just my personal preference, obviously

@Skif-off
Copy link
Contributor

@d5l6 , update LexJSON.cxx, please: Scintilla 3.21.1 has LexJSON.cxx with small changes.

@ell1e
Copy link

ell1e commented Oct 20, 2020

For what it's worth, I've now seen JSONC files (with comments) stored in .json multiple times, e.g. the launch settings in any VS Code project. So while technically a different file type, I do wonder how you'd want to detect it based on the file ending, making the idea of two different parsers for this IMHO quite impractical.

The thing is, I want the syntax highlighting to mostly make my file readable, not to teach me how "wrong" it is when it is actually correct given the actual file type, and I would assume that applies to many users. And it seems like there is no way to safely detect here if a comment in a .json file is an actual mistake, or just a (common!) quirk of whatever tool wrote/reads that file. So I don't think it makes sense to attempt to teach anyone any lessons here in the syntax highlighter/to mark it as "mistake" by default when it just might not be.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants