Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lower-case-ing the heredoc language #554

Open
wants to merge 10 commits into
base: master
Choose a base branch
from
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,8 @@

## [Unreleased]

- Fixed default tokenization of heredocs in Ruby (should be tokenized as string)
- Added a modern implementation of Tree-sitter grammars behind an experimental flag. Enable the “Use Modern Tree-Sitter Implementation” in the Core settings to try it out.
## 1.106.0

- Fixed bug that happens on some systems when trying to launch Pulsar using the Cinnamon desktop environment
Expand Down
35 changes: 18 additions & 17 deletions packages/language-ruby/grammars/ts/highlights.scm
Original file line number Diff line number Diff line change
Expand Up @@ -90,8 +90,7 @@

; "Foo" in `Foo::Bar`.
(scope_resolution
scope: (constant) @support.other.namespace.ruby
(#set! test.final "true"))
scope: (constant) @support.other.namespace.ruby)

; "::" in `Foo::Bar`.
(scope_resolution
Expand All @@ -100,8 +99,7 @@

; "Bar" in `Foo::Bar`.
(scope_resolution
name: (constant) @support.other.class.ruby
(#set! test.final "true"))
name: (constant) @support.other.class.ruby)
Copy link
Sponsor Contributor

@savetheclocktower savetheclocktower Jun 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I had for doing test.final here — and it's more of a feeling than anything — is that I felt we didn't really gain anything from tagging one range as both support.other.class.ruby and constant.ruby.

There are places where it makes sense to do stuff like that, but if we apply both scopes, we end up leaving it to the syntax theme to decide which styles win out, and often that's not an intentional decision by the theme author as much as it's just an accident of CSS specificity and ordering of selectors. It also brings ordering in the SCM file into play — since constant.ruby is declared later in the file, that scope name will be in the inner span, and that also has an effect on which styles win out.

Technically, lots of things are marked as constant in tree-sitter-ruby, just as lots of things are marked as identifier, and I'm not certain that we give the user much useful information by scoping each one as constant even when it serves a more specific function.

I've tended to assign a constant scope only for (a) number literals, (b) built-in language constants like booleans and null, and (c) declarations that LOOK_LIKE_CONSTANTS and don't serve other functions. Lots of TM-style themes go even more conservative than that — using constant for A and B but using the variable scope for C.

I'm not arguing that any of this is necessarily wrong, and I'm happy to see how it shakes out, but that was the thinking behind the original decision.




Expand Down Expand Up @@ -145,6 +143,8 @@
(assignment
left: (identifier) @variable.other.assignment.ruby)

(constant) @constant.ruby

(element_reference
(constant) @support.class.ruby
(#match? @support.class.ruby "^(Set)$"))
Expand All @@ -164,6 +164,7 @@
(#set! test.final "true"))

(call "." @keyword.operator.accessor.ruby (#set! test.final "true"))
(call "&." @keyword.operator.accessor.ruby (#set! test.final "true"))

((identifier) @constant.builtin.ruby
(#match? @constant.builtin.ruby "^__(FILE|LINE|ENCODING)__$"))
Expand Down Expand Up @@ -202,7 +203,10 @@
(#set! test.final true))

(pair
key: (hash_key_symbol)
key: (simple_symbol) @constant.other.symbol.hashkey.ruby)

(pair
key: (hash_key_symbol) @constant.other.symbol.hashkey.ruby
; In `{ foo: 'bar' }` syntax, the `:` both marks the key as a symbol and
; separates it from its value.
":" @punctuation.definition.constant.hashkey.ruby
Expand Down Expand Up @@ -263,16 +267,6 @@
(string_content)?
"\"" @punctuation.definition.string.end.ruby)
@string.quoted.other.interpolated.ruby
(#match? @string.quoted.other.interpolated.ruby "^%Q")
(#set! test.final true))


(
(string
"\"" @punctuation.definition.string.begin.ruby
(string_content)?
"\"" @punctuation.definition.string.end.ruby)
@string.quoted.other.ruby
Comment on lines -266 to -275
Copy link
Sponsor Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this section not needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were some misbehaviors on the string matching - some interpolated strings were being detected as non-interpolated, and vice-versa. When I fixed these, I extracted most of the changes into specific matches.

In this specific case, I did not remove the match of double-quoted strings - I just removed the restriction on the previous rule (https://github.com/pulsar-edit/pulsar/pull/554/files#diff-82574a2099439cbee59d5f974dbe593145214402078d948cf9f49378d9637e40R264-L266) and that basically became the exact same match that we have here :D

(#set! test.final true))

; Highlight the interpolation inside of a string.
Expand Down Expand Up @@ -446,7 +440,7 @@
(#set! test.final true))

(singleton_class
"<<" @keyword.operator.assigment.ruby)
"<<" @keyword.operator.assignment.ruby)

(binary
"|" @keyword.operator.bitwise.ruby)
Expand All @@ -469,7 +463,7 @@
"+="
"-="
"<<"
] @keyword.operator.assigment.ruby
] @keyword.operator.assignment.ruby

[
"||"
Expand All @@ -496,6 +490,13 @@
; PUNCTUATION
; ===========

(symbol_array
"%i(" @punctuation.definition.begin.array.other.ruby
")" @punctuation.definition.end.array.other.ruby)
(string_array
"%w(" @punctuation.definition.begin.array.other.ruby
")" @punctuation.definition.end.array.other.ruby)

"[" @punctuation.definition.begin.array.bracket.square.ruby
"]" @punctuation.definition.end.array.bracket.square.ruby

Expand Down
2 changes: 1 addition & 1 deletion packages/language-ruby/lib/main.js
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ exports.activate = function() {
atom.grammars.addInjectionPoint('source.ruby', {
type: 'heredoc_body',
language(node) {
return node.lastChild.text;
return node.lastChild.text?.toLowerCase();
},
content(node) {
return node.descendantsOfType('heredoc_content')
Expand Down
120 changes: 64 additions & 56 deletions packages/language-ruby/spec/fixtures/tree-sitter-grammar.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,13 @@

class A::B < C
# <- keyword.control.class
# ^ entity.name.type.class
# ^ punctuation.separator.namespace
# ^ entity.name.type.class
# ^ constant
# ^ support.other.namespace.ruby
# ^ keyword.operator.namespace
# ^ constant
# ^ support.other.class.ruby
# ^ punctuation.separator.inheritance
# ^ entity.other.inherited-class
# ^ entity.name.type.class
public
# <- keyword.other.special-method
protected
Expand All @@ -34,16 +35,16 @@ class A::B < C
end

thing&.call
# ^ punctuation.separator.method
# ^ source
# ^ keyword.operator.accessor
# ^ support.other.function

VAR1 = 100
# <- variable.other.constant
# <- constant
# ^ keyword.operator.assignment
# ^ constant.numeric

_VAR1 = 100_000
# <- variable.other.constant
# <- !constant
# ^ constant.numeric

# This dot will not be tokenized as a separator
Expand All @@ -52,7 +53,7 @@ class A::B < C

# But this will
a.b
#^ punctuation.separator.method
#^ keyword.operator.accessor

# These are all also numbers:
1.23e-4
Expand All @@ -78,69 +79,56 @@ class A::B < C
# <- constant.other.symbol
:<=>
# <- constant.other.symbol
%s(foo)
# <- constant.other.symbol.delimited

# Yes, these are ALL arrays
%i(foo)
# <- punctuation.section.array.begin
# <- punctuation.definition.begin.array.other
# ^ constant.other.symbol
# ^ punctuation.definition.end.array.other
%i!foo!
# <- punctuation.definition.begin.array.other
# ^ constant.other.symbol
# ^ punctuation.definition.end.array.other

%w(foo)
# <- punctuation.definition.begin.array.other
# ^ string.unquoted
# ^ punctuation.definition.end.array.other

[:foo]
# <- punctuation.definition.begin.array.bracket.square
# ^ constant.other.symbol
# ^ punctuation.definition.end.array.bracket.square

{foo: 1}
# <- punctuation.section.scope.begin
# <- punctuation.definition.hash.begin.bracket.curly
# ^ constant.other.symbol.hashkey
# ^ punctuation.definition.hash.end.bracket.curly
{:foo => 1}
# ^ constant.other.symbol.hashkey

class << A::B
# <- keyword.control.class
# ^ punctuation.definition.variable
# ^ entity.name.type.class
# ^ punctuation.separator.namespace
end

def a.b(*args)
# <- meta.function.method.with-arguments
# <- keyword.control.def
# ^ entity.name.function
# ^ entity.name.function
# ^ punctuation.separator.method
# ^ meta.function.method.with-arguments
# ^ punctuation.definition.parameters
# ^ storage.type.variable
# ^ variable.parameter.function
# ^ punctuation.definition.parameters
end
# <- keyword.control

# Strings.
"te\ste"
# <- punctuation.definition.string.begin
# ^ string.quoted.other.interpolated
# ^ string.quoted.double.interpolated
# ^ constant.character.escape
# ^ string.quoted.other.interpolated
# ^ string.quoted.double.interpolated
'te\ste'
# <- punctuation.definition.string.begin
# ^ string.quoted.single
# ^ !string.quoted.single.interpolated
# ^ !constant.character.escape

%(te(s)t)
# <- punctuation.definition.string.begin
# ^ string.quoted.other.interpolated
# ^ punctuation.section.scope
# ^ string.quoted.other.interpolated
# ^ punctuation.definition.string.end

%[te[s]t]
# <- punctuation.definition.string.begin
# ^ string.quoted.other.interpolated
# ^ punctuation.section.scope
# ^ string.quoted.other.interpolated
# ^ punctuation.definition.string.end

%{te{s}t}
# <- punctuation.definition.string.begin
# ^ string.quoted.other.interpolated
# ^ punctuation.section.scope
# ^ string.quoted.other.interpolated
# ^ punctuation.definition.string.end

%<te<s>t>
# <- punctuation.definition.string.begin
# ^ string.quoted.other.interpolated
# ^ punctuation.section.scope
# ^ string.quoted.other.interpolated
# ^ punctuation.definition.string.end

%~te\~s~
Expand All @@ -153,8 +141,6 @@ def a.b(*args)
%Q(te(s)t)
# <- punctuation.definition.string.begin
# ^ string.quoted.other.interpolated
# ^ punctuation.section.scope
# ^ string.quoted.other.interpolated
# ^ punctuation.definition.string.end

%x!#{"l" + "s"}!
Expand All @@ -166,6 +152,28 @@ def a.b(*args)
# ^ punctuation.definition.string.end

/test/
# <- punctuation.section.regexp
# <- punctuation.definition.begin.regexp
# ^ string.regexp
# ^ punctuation.section.regexp
# ^ punctuation.definition.end.regexp

%r(foo)
# <- punctuation.definition.begin.regexp
# ^ string.regexp
# ^ punctuation.definition.end.regexp


# HEREDOCs
<<~RUBY
# <- string.unquoted
:keyword
# ^ constant.other.symbol
RUBY
# <- string.unquoted

<<SOMESTRING
:keyword
# ^ !constant.other.symbol
# ^ constant.other.symbol
# Making two conflicting assetions because we're inside a string now
SOMESTRING
# <- string.unquoted
6 changes: 3 additions & 3 deletions packages/language-ruby/spec/tokenizer-spec.js
Original file line number Diff line number Diff line change
Expand Up @@ -14,9 +14,9 @@ describe('Ruby grammars', () => {
await runGrammarTests(path.join(__dirname, 'fixtures', 'textmate-grammar.rb'), /#/)
});

xit('tokenizes the editor using node tree-sitter parser', async () => {
it('tokenizes the editor using modern tree-sitter parser', async () => {
atom.config.set('core.useTreeSitterParsers', true);
atom.config.set('core.useExperimentalModernTreeSitter', false);
await runGrammarTests(path.join(__dirname, 'fixtures', 'textmate-grammar.rb'), /#/)
atom.config.set('core.useExperimentalModernTreeSitter', true);
await runGrammarTests(path.join(__dirname, 'fixtures', 'tree-sitter-grammar.rb'), /#/)
});
});