New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Go] syntax rework #1662

Merged
merged 30 commits into from Nov 29, 2018

Conversation

Projects
None yet
7 participants
@mitranim
Contributor

mitranim commented Jul 14, 2018

This PR is a complete rework of the Go syntax.

Improvements:

  • Contextual types. Types are scoped where types are expected. There's no whitelist of "known" types.

  • Better understands the language and its syntax. Better at handling whitespace and comments inside complex forms. Better at handling other nuances. Unlikely to be confused by unusual code style.

  • Vars and consts in the root scope are added to the symbol index and searchable by ⌘R, ⌘⇪R, and goto_definition.

  • Supports parenthesized type (...) groups, not just const (...) and var (...).

  • The first argument to new and make is scoped as a type.

  • Methods declared inside an interface are added to the symbol index, reflecting the fact that they actually exist as functions.

  • Supports embedded structs.

  • Supports inherited interfaces.

  • Supports imaginary number literals.

  • Variable declarations are scoped differently from other variable occurrences.

  • iota is scoped only in constant initialization expressions.

  • Symbols in ⌘R are annotated with their origin: var, const, type or func.

  • Auto-pairing of backticks.

  • Much more comprehensive tests.

  • All keywords and predeclared identifiers are added to auto-completions by default.

  • Better snippets that don't mess with auto-completion.

  • Slightly better indentation rules that more closely match gofmt.

Current shortcomings (?):

  • Fewer meta scopes: needs feedback.

  • No support for block labels: needs feedback.

While this syntax is ready-to-use and fairly complete, I expect the current maintainers, particularly @wbond who's responsible for the current version of the Go syntax, to have many qualms and request many changes. Arguably, this version has some "regressions", particularly where meta scopes are concerned. This needs feedback and discussion, as I don't fully understand what role many of those meta scopes were intended to serve.

While the PR is pending, the syntax is available standalone: https://github.com/Mitranim/sublime-gox

Screenshot before-after:

screen shot 2018-07-28 at 10 54 50

Screenshot of symbol index before-after:

screen shot 2018-07-28 at 10 54 55

(Many other differences didn't fit into the screenshots. This uses my Cloud color scheme, available at https://github.com/Mitranim/sublime-themes.)

[Go] syntax rework
  * Contextual types. Types are scoped where types are expected. There's no whitelist of "known" types.

  * Better understands the language and its syntax. Better at handling whitespace and comments inside complex forms. Better at handling other nuances. Unlikely to be confused by unusual code style.

  * Vars and consts in the root scope are added to the symbol index and searchable by ⌘R, ⌘⇪R, and `goto_definition`.

  * Supports parenthesized `type (...)` groups, not just `const (...)` and `var (...)`.

  * The first argument to `new` and `make` is scoped as a type.

  * Methods declared inside an `interface` are added to the symbol index, reflecting the fact that they actually exist as functions.

  * Supports embedded structs.

  * Supports inherited interfaces.

  * Supports imaginary number literals.

  * Variable declarations are scoped differently from other variable occurrences.

  * `iota` is scoped only in constant initialization expressions.

  * Symbols in ⌘R are annotated with their origin: `var`, `const`, `type` or `func`.

  * Auto-pairing of backticks.

  * Much more comprehensive tests.

  * All keywords and predeclared identifiers are added to auto-completions by default.

  * Better snippets that don't mess with auto-completion.

  * Slightly better indentation rules that more closely match `gofmt`.

Current shortcomings (?):

  * Fewer meta scopes: needs feedback.

  * No support for block labels: needs feedback.

@mitranim mitranim force-pushed the mitranim:go branch from f543171 to 984f814 Jul 14, 2018

@FichteFoll

Did a small review of everything that's not the actual syntax definition or test file(s).

},
{ "keys": ["backspace"], "command": "run_macro_file", "args": {"file": "res://Packages/Default/Delete Left Right.sublime-macro"}, "context":
[
{ "key": "setting.auto_match_enabled", "operator": "equal", "operand": true },

This comment has been minimized.

@FichteFoll

FichteFoll Jul 15, 2018

Contributor

Doesn't have a selector context specifier.

This comment has been minimized.

@mitranim

mitranim Jul 16, 2018

Contributor

Thanks, added the missing selector.

{ "keys": ["enter"], "command": "run_macro_file", "args": {"file": "res://Packages/Default/Add Line in Braces.sublime-macro"}, "context":
[
{ "key": "setting.auto_indent", "operator": "equal", "operand": true },

This comment has been minimized.

@FichteFoll

FichteFoll Jul 15, 2018

Contributor

Also doesn't have a selector context specifier.

This comment has been minimized.

@mitranim

mitranim Jul 16, 2018

Contributor

Thanks, added the missing selector.

"completions": [
// https://golang.org/ref/spec#Keywords
"break",

This comment has been minimized.

@FichteFoll

FichteFoll Jul 15, 2018

Contributor

I suggest adding a description for these completions, i.e. keyword. See http://docs.sublimetext.info/en/latest/reference/completions.html.

This comment has been minimized.

@mitranim

mitranim Jul 16, 2018

Contributor

Hmm the docs don't mention any descriptions.

Apparently, some completions in the standard packages use tab-separated triggers: word\tdescription. Tried this; it looks cool in the completion list, but suppresses the normal buffer completions, which is exactly what you don't want, especially in a standard package. If we could add descriptions without messing things up, I'd love to.

This comment has been minimized.

@mitranim

mitranim Jul 16, 2018

Contributor

Played with generating sublime-snippet files for these completions. Snippets have descriptions and don't suppress normal buffer completions. They also don't appear when you just hit ^Space or equivalent to bring up the completion list. The downside is that they clog up the Command Palette.

Really the problem is that even one sublime-completion with a tab-separated description completely suppresses buffer completion. It's ridiculous and wrong.

This comment has been minimized.

@FichteFoll

FichteFoll Jul 16, 2018

Contributor

Meh, sounds like SublimeTextIssues/Core#1061 is striking again. I thought tab triggers wouldn't affect it.

Or maybe it is actually a different issue, but that would be surprising.

@@ -0,0 +1,11 @@
{
// Prefer tabs because gofmt enforces tabs.
"translate_tabs_to_spaces": false,

This comment has been minimized.

@FichteFoll

FichteFoll Jul 15, 2018

Contributor

This is certainly debatable, but I don't have insight on whether people write Go without gofmt.

// Prefer tabs because gofmt enforces tabs.
"translate_tabs_to_spaces": false,
// Trigger autocompletion on dot access.
"auto_complete_triggers": [

This comment has been minimized.

@FichteFoll

FichteFoll Jul 15, 2018

Contributor

Many packages use dots for attribute access, however none of them defines auto_complete_triggers. I'm not sure whether they should or should not, but this is something we should discuss, imo.

This comment has been minimized.

@mitranim

mitranim Jul 16, 2018

Contributor

This certainly could be removed. I'll run without this setting for a few days to get a feel.

This comment has been minimized.

@mitranim

mitranim Jul 17, 2018

Contributor

Removed the trigger setting for now.

<dict>
<key>scope</key>
<string>
variable.other.constant.declaration.go - meta.block.go

This comment has been minimized.

@FichteFoll

FichteFoll Jul 15, 2018

Contributor

entity.name.constant.go would be more appropriate, assuming the syntax definition is updated as well.

This comment has been minimized.

@mitranim

mitranim Jul 16, 2018

Contributor

Agreed. The problem is that applying different const/var scopes depending on whether we're inside a block seems to require lots of duplication in the syntax; not sure it's worth it.

@@ -0,0 +1,16 @@
<?xml version="1.0" encoding="UTF-8"?>

This comment has been minimized.

@FichteFoll

FichteFoll Jul 15, 2018

Contributor

You can define both, the indexed and non-indexed symbols, in the same file, assuming their scope selectors are equal. This applies to the other files as well.

This comment has been minimized.

@mitranim

mitranim Jul 16, 2018

Contributor

These are defined in separate files to keep symbolTransformation from applying to the indexed symbols and messing up "goto_definition". I'd love to be able to merge the files, but it doesn't seem possible while keeping the current rule logic.

This comment has been minimized.

@FichteFoll

FichteFoll Jul 16, 2018

Contributor

Can you confirm that symbolTransformation does indeed affect the symbol that is used for the index if showInIndexedSymbolList is set? Previous experience doesn't indicate so (and neither to the docs derived from that).

Even with two different files, since the scope selectors are the same it would have the exact same effect as all the meta settings will be set on any token that is matched.

This comment has been minimized.

@mitranim

mitranim Jul 17, 2018

Contributor

Interesting! I double checked, and indeed, it looks like symbolTransformation doesn't affect indexing. Must have mixed it up with something else when I was writing this.

Merged some of the symbol files. Thanks!

@FichteFoll

This comment has been minimized.

Contributor

FichteFoll commented Jul 26, 2018

Something I did for my YAML rework was to have a sample file containing lots of examples with bad highlighting in the original and then providing a Screenshot of that file being rendered with and without the pr to give a better before-after comparison (#90).

@mitranim

This comment has been minimized.

Contributor

mitranim commented Jul 28, 2018

@FichteFoll Nice idea. Added comparison screenshots.

@cskujbus

This comment has been minimized.

cskujbus commented Oct 25, 2018

Hi,
I'd like to ask why type, struct, func, chan etc. don't have keyword scope? They have storage.type which means they are styled differently than keywords.

@mitranim

This comment has been minimized.

Contributor

mitranim commented Oct 25, 2018

@cskujbus An excellent point. I followed the established convention of scoping type keywords the same as type names, but frankly, it makes no sense whatsoever. Perhaps we should break with the convention, and ask the rest of the syntax developers to follow suit.

@wbond @keith-hall

@FichteFoll

This comment has been minimized.

Contributor

FichteFoll commented Oct 25, 2018

#582

@mitranim

This comment has been minimized.

Contributor

mitranim commented Oct 25, 2018

As an experiment, I've scoped these as keywords, and the keyword coloring started "popping out" in the middle of type signatures. It feels weird when you're used to the "old" way. I've come to associate keyword coloring with control flow.

image
image

Thoughts?

@wbond

This comment has been minimized.

Collaborator

wbond commented Oct 25, 2018

I’m not inclined to revamp the scope names for function and class keywords. Most languages that use those have the ability to treat such as first class citizens, so I think it makes sense the keywords are scoped as storage.type. The keyword is describing the type of the identifier, in this case a function, class, etc.

@FichteFoll

This comment has been minimized.

Contributor

FichteFoll commented Oct 25, 2018

For now, I suppose the best action forward is to scope them as storage.type.class and storage.type.function (and not func!) and storage.type.struct, so that people may colorize them differently if they wish.

@mitranim

This comment has been minimized.

Contributor

mitranim commented Oct 26, 2018

My inner language-author/syntax-author/Lisper feels that all language keywords should have a unified scope, so that a simplified color scheme can consistently mark them as keywords, without special-casing the various storage. variants and without affecting the actual types names. However, the screenshots above convince me that for type keywords that may occur inside type signatures, giving them a coloring similar or identical to type names is a nice default. This is particularly useful for languages with anonymous inline types, such as Go, where a type signature may mix type names with keywords.

Satisfying both use cases requires the scope on keywords like struct or interface to be distinct from storage.type so we can distinguish type keywords from type names. The current convention is to scope them identically, which is really dubious.

I would propose keyword.storage if we had the luxury of wide breaking changes. Otherwise, storage.keyword appears to be the most sensible option. Probably not storage.modifier since it more commonly refers to mode of mutation rather than the data layout itself.

Long story short: what if we scope all type keywords, such as func, struct, interface, as storage.keyword?

Addendum for consistency with the argument above. I don't see a reason why type, var, const and the like shouldn't also be scoped as storage.keyword.X. The same logic applies: they're still keywords; a simpler color scheme may want to mark them as such; a plugin may want to use this scope to find all keywords, regardless of their fiddly little differences.

@cskujbus

This comment has been minimized.

cskujbus commented Oct 26, 2018

@mitranim thank you for this detailed explanation, I just wanted to know the reasoning behind classifying some keywords as storage types (VS Code for example seems classifying all Go keywords in the same scope).
I tested also my color scheme (with keywords in bold) with a very simple syntax with all Go keywords in the same scope and, I have to admit, the text looks a bit heavier than before but also keywords are more distinguishable from simple types like int, string etc.

Show resolved Hide resolved Go/Go.sublime-syntax Outdated

deathaxe and others added some commits Oct 29, 2018

[Go] cosmetic cleanup in noise regexp
Co-Authored-By: Mitranim <me@mitranim.com>
[Go] storage scope overhaul
Type keywords now use "storage.keyword" rather than "storage.type". This
distinguishes them from type names while keeping the "storage" prefix. This is
consistent with the default coloring for most other languages, but a color
scheme can also choose to treat them like regular keywords.

This commit also changes "var" and "const" from "storage.modifier" to
"storage.keyword", for logical consistency with the change above.
[Go] better comment scoping
Added punctuation scopes for comment start and comment end. Added more
detailed annotation meta scopes.

Also by @deathaxe, see mitranim/sublime-gox#1
@deathaxe

This comment has been minimized.

deathaxe commented on Go/Go.sublime-syntax in 7f44c59 Oct 31, 2018

This line will not pop off at the eof as you expect exactly one whitespace before the eol. In other syntaxes we therefore got used to use the $\n? to pop off at eol and optionally include the newline if available.

The basic question may be: Are trailing whitespace characters are part of the annotation or the comment only?

This comment has been minimized.

Owner

mitranim replied Oct 31, 2018

Yeah your version had $\n?. Looks like I replaced it while experimenting and forgot. The \s$ appears to be doing exactly the same thing; it seems to work even without any whitespace at the end of the line. The comment above the rule states as much.

I suppose $\n? is more explicit, but shouldn't it be $\r?\n?? Sublime uses unix line endings by default, but we want to be able to parse any text correctly.

This comment has been minimized.

deathaxe replied Oct 31, 2018

ST converts the eol character internally. Both python and the lexer always see the \n only. Hence the \r is not needed.

Yes \s includes the \n and therefore works as well. But it would require a space or tab at the eol if the caret is at the last line of a file, which does not end with newline. We can argue, whether popping off is useful anyway, but the $\n? is just what is used in many situations you want to optionally scope the whole space after the eol as well.

You may understand it as a suggestion in "coding style".

This comment has been minimized.

Owner

mitranim replied Oct 31, 2018

Yeah it makes more sense. Pushed a correction. Thanks! 🙂

@wbond

This comment has been minimized.

Collaborator

wbond commented Nov 27, 2018

I really wanted to merge this for the next dev build, but it is extensively using storage.keyword, which is not part of the guidelines, and has chosen to change them from scopes that are currently the standard.

I don't want to break a bunch of syntax coloring for users, and thus am not inclined to merge this since it takes scope names in a new direction that is different (in a major way) from all of the syntaxes that have been working on over the past couple of years. For consistency sake, I want new syntax work to be consistent with other actively maintained syntaxes. So to merge this, we would have to revamp scope names in a bunch of syntaxes, change the standard and revamp color schemes.

@mitranim

This comment has been minimized.

Contributor

mitranim commented Nov 28, 2018

Glad to see you back in action, @wbond!

I've updated the storage.keyword.X scopes to storage.type.keyword.X. Also pushed a few other minor changes. Let me know if there are any other issues.

@FichteFoll

This comment has been minimized.

Contributor

FichteFoll commented Nov 28, 2018

I suppose replacing storage.type.keyword.function with storage.type.function would be appropriate, which is used across multiple syntaxes iirc. And similar occurances.

Not sure about interface and const, however.

@wbond

This comment has been minimized.

Collaborator

wbond commented Nov 28, 2018

I am more inclined for that format @FichteFoll. The impression I get is that @mitranim is trying to make it possible to color storage.type.keyword the same as keyword, and so having to enumerate all of the different possible storage.type. specializations would not be viable.

@wbond

This comment has been minimized.

Collaborator

wbond commented Nov 29, 2018

I'd rather get this merged and we can discuss subtle changes to scope later.

@wbond wbond merged commit 14a35f6 into sublimehq:master Nov 29, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@ccampbell

This comment has been minimized.

Contributor

ccampbell commented Nov 30, 2018

Just updated my Sublime build. One thing I have noticed is the new highlighting used for types within a package.

For example:

From what I can tell, both the native type string and the Time from time.Time are using storage.type.go as the scope. Would it be possible to use something like storage.type.go.package or storage.type.go.imported to be able to differentiate the types for theming? It is also misleading to call it storage.type.go since it is not a native go type.

It may just be that I am not used to the change, but I think I would prefer to be able to highlight them differently. Previously time.Time was not highlighted at all (IIRC).

@deathaxe

This comment has been minimized.

Contributor

deathaxe commented Nov 30, 2018

Time would need to be scoped using support.type.go if it is part of a "basic" framework library. storage.type.go is meant for language types only.

(This is what I know from other syntaxes and is how I understand the scope names, which may be wrong)

@mitranim

This comment has been minimized.

Contributor

mitranim commented Dec 3, 2018

Thanks for asking. This is an important difference between my syntaxes and most other ST built-ins. I've been meaning to raise this topic for a while now.

For the language user, there's no functional difference between predefined and user-defined types. Standard library types like time.Time are user-defined as far as the language is concerned. A type is just a type.

The idea of treating types differently is so weird, I can't even comprehend where it's coming from. Maybe it's inertia from syntaxes that wanted but couldn't scope types properly?

Modern general-purpose languages are explicitly designed to be extensible. Complicated programs tend to develop their own DSLs, speaking mostly in their own "language", often supplanting built-in types and idioms. Treating all types equally, exactly the way the language does, supports this notion.

A syntax's job is to reflect the language. When the language treats all types equally, so does the syntax.

Yes, many built-in ST syntaxes have these weird ideas about "built-in" and "support" types. This doesn't make them any less wrong. I plan to convince other syntax developers to revise this notion across all standard packages.

@deathaxe

This comment has been minimized.

Contributor

deathaxe commented Dec 3, 2018

Not sure about the origin, but this is just the existing way to achieve @ccampbell 's request.

I honestly find it helpful to see language defined functions (support.function) in another color than user defined (variable.function) ones. Same might be helpful with basic/builtin/atomic language types like ints/floats/... and complex types like classes/structs/... which are a collection of the atomic types extended by functions/methods/... .

You are right with support. describing basic framework functions/types being less suitable for use with user defined types. This is indeed weird. It's even not possible to distinguish them in most languages.

But the possibility to distinguish between atomic and complex types should be applied anyway. Maybe something like storage.type.go for simple types and storage.class.go for classes (instead of support.class.go) or all in the 3rd level (storage.type.simple.go vs. storage.type.class.go)?

@Thom1729

This comment has been minimized.

Contributor

Thom1729 commented Dec 5, 2018

I'm in favor of scoping builtin types and functions.

It must be admitted that scoping builtins is not a truly syntactic concern. In most languages, a built-in function or type is parsed exactly the same as any other type. You can have a completely accurate syntax definition that doesn't scope any builtins at all.

On the other hand, to the extent that builtins are defined by the language itself, I think that it's useful to use scopes to distinguish them. One reason is that builtins generally don't have lexical declarations, so when I see something highlighted as a builtin, I know not to look for one. In the absence of full language-specific intellisense, builtin highlighting also serves as a first line of defense against typos.

I think that support scopes should be purely additive. That is, a builtin function should be scoped just like any other function, plus whatever extra scope builtins get. The core syntaxes don't all do this, and I think they should. For one thing, this makes it easier for users who don't want builtin highlighting to modify their color scheme to disable it.

What builtins deserve highlighting? The answer surely depends on the language. In JavaScript, there is a fixed set of global types and functions that do not need to be included and are available in any environment. These, I think, should be scoped. There are also many other global builtins available in other environments, such as web browsers. I'm not convinced that these should be scoped in the core language. For JavaScript, I'm planning to write an extension for JS Custom to allow users to choose between sets of builtins (like browser/node) and to add their own.

@FichteFoll

This comment has been minimized.

Contributor

FichteFoll commented Dec 6, 2018

This is becoming somewhat off-topic and approaching RFC levels. I suggest to summarize a few options and formulate those into a proposal in a several issue. I also have a thing or two to say about this, later.

After all, this doesn't just affect Go.

@mitranim

This comment has been minimized.

Contributor

mitranim commented Dec 7, 2018

@Thom1729 thanks for the explanation. These points make a lot of sense.

I like the idea of built-ins scoping being purely additive. Reserving it for a small, rarely-expanded set of language built-ins, rather than any "library" APIs like the DOM, also seems like a no-brainer.

Getting back to Go. Unlike most languages, it lets you tell apart, on a glance, local/external and private/public definitions:

  • builtin
  • privateLocal
  • PublicLocal
  • external.Public

Syntactically, built-ins overlap only with private local definitions. In addition, the number of built-in types and functions is fairly small and easy to remember. Since I still don't understand the benefit of additional "builtin" or "support" scoping, my default preference would be to stick with storage.type for simplicity and consistency. Obviously, this is up to consensus.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment