Add Tree-sitter grammar injections #17551

Merged
merged 53 commits into from Jul 16, 2018

5 participants
@maxbrunsfeld
Contributor

maxbrunsfeld commented Jun 22, 2018

Motivation

The TextMate parsing system allows grammars to be composed in order to syntax-highlight things like:

  • JavaScript inside of script tags in an HTML file
  • HTML inside of template strings in JavaScript
  • SQL inside of a string literal in Python

Currently, none of these cases work when Tree-sitter is enabled.

Solution

This PR adds the concept of 'grammar injection' to Tree-sitter grammars. Specifically, it adds two new APIs associated with Tree-sitter grammars. These APIs might be revised before this PR is merged.

1. Adding Injection Points

atom.grammars.addInjectionPoint(
  languageId,
  {
    type: syntaxNodeType,
    language: languageCallback,
    content: contentCallback
  }
)

This method allows you to express ideas like: "In JavaScript, tagged template strings are injection points. For each tagged template string, try to identify its language by looking at the name of its tag function. Parse the content between the backticks, omitting any template substitutions."

atom.grammars.addInjectionPoint(
  'javascript',
  {
    // Tagged template literals are simply parsed as call expressions
    // with a template string instead of an argument list
    type: 'call_expression',

    // The language name can be found in the template string's "tag"
    language (callNode) {
      if (callNode.lastChild.type === 'template_string') {
        return callNode.firstChild.text
      }
    },

    // Parse the content inside of the template string
    content (callNode) {
      return callNode.lastChild
    }
  }
)
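To see what the two callbacks return, they can be run outside of Atom against stand-in nodes. The sketch below is only an illustration: `makeCall` and the plain-object "nodes" are hypothetical and carry just the fields the callbacks read (`type`, `text`, `firstChild`, `lastChild`); real Tree-sitter syntax nodes have many more. It also assumes that a `language` callback returning `undefined` means "no injection here", which the description above implies but does not state.

```javascript
// Hypothetical stand-ins for Tree-sitter syntax nodes: plain objects with only
// the fields that the callbacks read (type, text, firstChild, lastChild).
function makeCall (calleeText, lastChildType, lastChildText) {
  const callee = { type: 'identifier', text: calleeText }
  const last = { type: lastChildType, text: lastChildText }
  return { type: 'call_expression', firstChild: callee, lastChild: last }
}

// The same callbacks as in the JavaScript injection point from the description.
const injectionPoint = {
  type: 'call_expression',
  language (callNode) {
    if (callNode.lastChild.type === 'template_string') {
      return callNode.firstChild.text
    }
  },
  content (callNode) {
    return callNode.lastChild
  }
}

// html`<p>hi</p>` is a tagged template, so it yields a language name.
const tagged = makeCall('html', 'template_string', '`<p>hi</p>`')
// render(x) is an ordinary call, so language() falls through and returns undefined.
const ordinary = makeCall('render', 'arguments', '(x)')

const taggedLanguage = injectionPoint.language(tagged)
const ordinaryLanguage = injectionPoint.language(ordinary)
const taggedContent = injectionPoint.content(tagged)

console.log(taggedLanguage)      // html
console.log(ordinaryLanguage)    // undefined
console.log(taggedContent.text)  // `<p>hi</p>`
```

Because the injection point's `type` is `call_expression`, the callbacks see every call in the file, which is why `language` has to check for a `template_string` child before returning a name.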

Note that this API does not indicate which grammar to use when parsing the content. That information can be provided by other packages, using the second API:

2. Specifying Injection Patterns

Grammars that use Tree-sitter have a new field called injectionRegExp. This field allows you to express ideas like: "The HTML language can be injected. Whenever there is an injection point whose language name includes the string 'html', parse that injection point's content using the HTML parser."

id: 'html'

injectionRegExp: 'html|HTML|Html$'
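As a quick check of which language names such a pattern accepts, it can be compiled as an ordinary JavaScript regular expression. This is a sketch that assumes the string is compiled without flags; the candidate names are made up for illustration. Note that in this pattern the `$` anchor applies only to the last alternative, `Html`.

```javascript
// The pattern from the grammar snippet, compiled as a plain JavaScript RegExp.
// Assumption: the field's string is compiled without any flags.
const injectionRegExp = new RegExp('html|HTML|Html$')

// Made-up language names, such as a template tag might produce.
const candidates = ['html', 'xhtml', 'HTML5', 'Html', 'Html5', 'sql']
const matches = candidates.filter(name => injectionRegExp.test(name))

console.log(matches) // [ 'html', 'xhtml', 'HTML5', 'Html' ]
```

`Html5` is rejected because `Html` must appear at the end of the name, while `html` and `HTML` may appear anywhere in it.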

This two-part API allows languages to be embedded within one another without every grammar having to know about every other grammar.
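The decoupling can be sketched as a lookup: the injection point produces a language name, and the registry picks a grammar whose pattern matches that name. The code below is a minimal illustration, not Atom's implementation. `grammarForLanguageName`, the `sql` grammar record, and the first-match rule are all assumptions made for the example; only the `id` and `injectionRegExp` fields come from the description above.

```javascript
// Hypothetical grammar records. Only the `id` and `injectionRegExp` fields
// are taken from the PR description; the rest is made up for illustration.
const grammars = [
  { id: 'html', injectionRegExp: 'html|HTML|Html$' },
  { id: 'sql', injectionRegExp: 'sql|SQL$' }, // hypothetical second grammar
  { id: 'javascript' } // no injectionRegExp, so it is never chosen for injection
]

// Hypothetical helper: return the first grammar whose pattern matches the
// language name, or null. First-match is a simplification chosen for this
// sketch; the real registry may rank candidates differently.
function grammarForLanguageName (languageName, registry) {
  if (typeof languageName !== 'string') return null
  for (const grammar of registry) {
    if (!grammar.injectionRegExp) continue
    if (new RegExp(grammar.injectionRegExp).test(languageName)) return grammar
  }
  return null
}

console.log(grammarForLanguageName('html', grammars).id) // html
console.log(grammarForLanguageName('SQL', grammars).id)  // sql
console.log(grammarForLanguageName('css', grammars))     // null
```

In this arrangement the JavaScript grammar never mentions HTML or SQL; supporting a new embedded language only requires registering a grammar with a suitable pattern.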

Example

EJS is a popular JavaScript templating system in which JavaScript code is interspersed with HTML markup using the delimiters <% and %>. The HTML can in turn contain more JavaScript code inside of script tags, and that JavaScript code can contain HTML inside of string literals.

[Screenshot: html-in-js-in-html-in-ejs]

Tasks

  • Add tests and documentation for the new GrammarRegistry APIs
  • When the buffer changes, only update the affected injections, not all of them
  • Update highlighting when grammars with injectionRegExps are added
  • Allow injections that are spread across many child nodes (needed for PHP, ERB, EJS, mustache, etc)
  • Use injections for the expand selection command
  • Use injections for folding specific lines
  • Fix syntax highlighting bugs
  • Use injections for getting scope descriptors
  • Use injections for folding at a given nesting level

Related Issues / PRs

Closes #17392
Depends on tree-sitter/tree-sitter#181
Depends on tree-sitter/node-tree-sitter#14
Depends on tree-sitter/node-tree-sitter#18

wip

@maxbrunsfeld maxbrunsfeld changed the title Start work on Tree-sitter grammar injections Add Tree-sitter grammar injections Jun 22, 2018

@maxbrunsfeld maxbrunsfeld force-pushed the tree-sitter-injections branch 2 times, most recently from 3542f3c to b944e24 Jun 22, 2018

Start work on Tree-sitter grammar injections
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>

@maxbrunsfeld maxbrunsfeld force-pushed the tree-sitter-injections branch from b944e24 to 37a3ae1 Jun 22, 2018

@daviwil daviwil referenced this pull request Jun 25, 2018

Closed

Iteration Plan: June 25 - July 6, 2018 #17579

7 of 12 tasks complete

maxbrunsfeld and others added some commits Jun 26, 2018

Get first test for grammar injections passing
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>
Fix syntax highlighting problems with injected languages
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>
@thomasjo
Member

thomasjo left a comment

Stupid style comment; any specific reason for prefixing some of these functions with an underscore? I'm assuming they are "private", but I don't think we do this anywhere else in the core code? This wasn't meant to be a review, just a line comment...

}
}

async _performUpdate (containingNode) {


@thomasjo

thomasjo Jun 27, 2018

Member

Stupid style comment; any specific reason for prefixing some of these functions with an underscore? I'm assuming they are "private", but I don't think we do this anywhere else in the core code?

maxbrunsfeld and others added some commits Jun 27, 2018

Add tests and docs for addInjectionPoint
Also, replace `addInjectionPattern` API with a single `injectionRegExp` 
field on the grammar.

Co-Authored-By: Ashi Krishnan <queerviolet@github.com>
Add `text` getter to SyntaxNode
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>

@maxbrunsfeld maxbrunsfeld force-pushed the tree-sitter-injections branch from 095152c to c014f5e Jun 27, 2018

Start parsing right away when constructing a TreeSitterLanguageMode
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>

@maxbrunsfeld maxbrunsfeld force-pushed the tree-sitter-injections branch from c014f5e to 6c85ff8 Jun 27, 2018

Emit highlight change events when removing injections
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>

@maxbrunsfeld maxbrunsfeld force-pushed the tree-sitter-injections branch 2 times, most recently from 1f212f8 to d6bb7b2 Jun 28, 2018

Define highlight iter's position in terms of tree cursor position
Co-Authored-By: Ashi Krishnan <queerviolet@github.com>

@maxbrunsfeld maxbrunsfeld force-pushed the tree-sitter-injections branch from d6bb7b2 to 41c124c Jun 29, 2018

@maxbrunsfeld maxbrunsfeld force-pushed the tree-sitter-injections branch from d5f53fd to e16e680 Jun 29, 2018

@queerviolet queerviolet force-pushed the tree-sitter-injections branch from 4ff747f to ca854cc Jul 9, 2018

@daviwil daviwil referenced this pull request Jul 9, 2018

Closed

Iteration Plan: July 9 - July 20, 2018 #17660

8 of 16 tasks complete

@maxbrunsfeld maxbrunsfeld force-pushed the tree-sitter-injections branch from cd8c8af to 5b0ae5a Jul 13, 2018

@maxbrunsfeld maxbrunsfeld force-pushed the tree-sitter-injections branch from 5dfc7c1 to 8fa9a45 Jul 16, 2018

@maxbrunsfeld maxbrunsfeld merged commit 65201b7 into master Jul 16, 2018

0 of 3 checks passed

ci/circleci CircleCI is running your tests
Details
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details

@maxbrunsfeld maxbrunsfeld deleted the tree-sitter-injections branch Jul 16, 2018

@banacorn banacorn referenced this pull request Aug 3, 2018

Open

.lagda.md support #65
