CST location information #932

bd82 · 2019-03-28T14:40:49Z

Released in 4.7.0
See: https://sap.github.io/chevrotain/docs/guide/concrete_syntax_tree.html#cstnodes-location

A CstNode currently contains all the data needed to compute its location information, as each terminal (IToken) contains its own location information.

However, the "aggregated" information is not available on the CstNode itself.
For example, to find the startOffset of a specific CstNode we must:

Traverse all its children.
Filter out only the Terminal children.
Find the smallest startOffset of the Terminal children.
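
The steps above can be sketched roughly as follows. This is only an illustration: `TokenLike`, `CstNodeLike`, `isCstNode` and `findStartOffset` are hypothetical names, the two interfaces are minimal stand-ins for Chevrotain's IToken and CstNode, and the sketch assumes that nested non-terminal children should be searched recursively as well.

```typescript
// Minimal stand-ins for Chevrotain's IToken / CstNode shapes (assumption:
// only the fields needed for this example are modeled).
interface TokenLike {
    image: string
    startOffset: number
}

interface CstNodeLike {
    name: string
    children: { [key: string]: Array<TokenLike | CstNodeLike> }
}

// A child is treated as a non-terminal if it carries a `children` dictionary.
function isCstNode(elem: TokenLike | CstNodeLike): elem is CstNodeLike {
    return (elem as CstNodeLike).children !== undefined
}

// Hypothetical helper: smallest startOffset among all terminals under `node`.
// Returns NaN when the subtree holds no terminal with a valid offset.
function findStartOffset(node: CstNodeLike): number {
    let min = NaN
    for (const key of Object.keys(node.children)) {
        for (const child of node.children[key]) {
            const offset = isCstNode(child)
                ? findStartOffset(child)
                : child.startOffset
            if (!isNaN(offset) && (isNaN(min) || offset < min)) {
                min = offset
            }
        }
    }
    return min
}

// Usage: the Identifier terminal (offset 4) precedes the nested Integer (offset 8).
const exampleTree: CstNodeLike = {
    name: "assignment",
    children: {
        Identifier: [{ image: "x", startOffset: 4 }],
        expression: [
            {
                name: "expression",
                children: { Integer: [{ image: "1", startOffset: 8 }] }
            }
        ]
    }
}
const exampleStart = findStartOffset(exampleTree) // 4
```

Doing this on every lookup repeats the traversal each time, which is the cost a built-in feature would avoid.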

Perhaps Chevrotain could provide a toggle-able feature to perform this location collection on the fly.

export interface CstNode {
    readonly name: string
    readonly children: CstChildrenDictionary
    readonly recoveredNode?: boolean
    /**
     * Only relevant for [in-lined](http://sap.github.io/chevrotain/docs/guide/concrete_syntax_tree.html#in-lined-rules) rules.
     * the fullName will **also** include the name of the top level rule containing this nested rule.
     */
    readonly fullName?: string
    readonly location: {
        startOffset: number
        startLine: number
        // ...
    }
}

Elements to Consider:

Performance impact both when the feature is enabled and disabled.
Should the location information collected match the Lexer positionTracking option?

kristianmandrup · 2019-03-28T15:04:39Z

Awesome!! I'll gladly help implement or try out this feature shortly :)

kristianmandrup · 2019-03-29T08:04:25Z

I would think the design could be something like this:

class Parser {
    // ...
    constructor(
        tokenVocabulary: TokenVocabulary,
        config: IParserConfig = DEFAULT_PARSER_CONFIG
    ) {
        const that: MixedInParser = this as any
        that.initErrorHandler(config)
        that.initLexerAdapter()
        that.initLooksAhead(config)
        that.initRecognizerEngine(tokenVocabulary, config)
        that.initRecoverable(config)
        that.initTreeBuilder(config)
        that.initContentAssist()

        // location info trait (returns immediately if locationInfo is NONE)
        that.initLocationInfoDecorator(config)

        this.ignoredIssues = has(config, "ignoredIssues")
            ? config.ignoredIssues
            : DEFAULT_PARSER_CONFIG.ignoredIssues
    }
  // ...
}

enum LocationInfo {
  NONE,
  FULL
}

export interface IParserConfig {
  // ...
  locationInfo: LocationInfo
}

export class LocationInfoDecorator {
    initLocationInfoDecorator(this: MixedInParser, config: IParserConfig) {
        if (config.locationInfo === LocationInfo.NONE) return

        // initialize
    }

    public computeLocationInfo() {
        // ... smart compute
        // using memoized (cached) locations to achieve linear visits O(n) and thus good performance
    }
}
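
For illustration only, the "memoized" computation mentioned in the comment above might look something like the sketch below. It is not Chevrotain code: `OffsetRange`, `RangeToken`, `RangeNode`, `rangeCache` and `computeRange` are all hypothetical names, and the sketch assumes a WeakMap keyed by node is an acceptable cache so that each node's range is computed at most once.

```typescript
interface OffsetRange {
    startOffset: number
    endOffset: number
}

// Hypothetical stand-ins for a terminal and a CST node.
interface RangeToken extends OffsetRange {
    image: string
}

interface RangeNode {
    name: string
    children: { [key: string]: Array<RangeToken | RangeNode> }
}

// Cache of already computed ranges; entries vanish when the node is collected.
const rangeCache = new WeakMap<RangeNode, OffsetRange>()

function computeRange(node: RangeNode): OffsetRange {
    const cached = rangeCache.get(node)
    if (cached !== undefined) {
        return cached
    }
    const range: OffsetRange = { startOffset: NaN, endOffset: NaN }
    for (const key of Object.keys(node.children)) {
        for (const child of node.children[key]) {
            // Non-terminals contribute their (cached) aggregate range,
            // terminals contribute their own offsets.
            const childRange: OffsetRange =
                "children" in child ? computeRange(child) : child
            if (isNaN(childRange.startOffset) || isNaN(childRange.endOffset)) {
                continue
            }
            if (isNaN(range.startOffset) || childRange.startOffset < range.startOffset) {
                range.startOffset = childRange.startOffset
            }
            if (isNaN(range.endOffset) || childRange.endOffset > range.endOffset) {
                range.endOffset = childRange.endOffset
            }
        }
    }
    rangeCache.set(node, range)
    return range
}
```

With the cache in place, asking for the range of a parent and then of each of its descendants visits every node only once overall.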

bd82 · 2019-03-29T19:55:15Z

https://gist.github.com/kristianmandrup/afbd8382e7a172aa8945b6023e3b3e8e

bd82 · 2019-03-29T21:15:55Z

I don't think we need a new trait to compute the location info.
We can introduce this code in the cstPostTerminal method, because we can access the current CstNode being built there:
https://github.com/SAP/chevrotain/blob/master/packages/chevrotain/src/parse/parser/traits/tree_builder.ts#L83-L91

Notes:

We should only do this computation when the node location feature is enabled.
- See how we "assemble" the parser with the relevant methods to avoid conditionals in "hot spots" in the code when certain features are disabled.
We may want to enable this feature by default, depending on its performance impact.
We need to make sure it does not get "broken" when error recovery is active.
- The SingleTokenInsertion heuristic could insert a virtual token whose location information is "NaN".
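
A minimal sketch of how these notes could fit together, under assumptions: `OffsetLocation`, `TokenOffsets`, `updateLocationWithToken` and `selectLocationUpdater` are hypothetical names, not the actual Chevrotain implementation. The updater is picked once, up front, so the per-token path has no "is the feature enabled?" check, and tokens with NaN offsets (such as ones inserted during error recovery) are ignored.

```typescript
interface OffsetLocation {
    startOffset: number
    endOffset: number
}

interface TokenOffsets {
    startOffset: number
    endOffset: number
}

type LocationUpdater = (location: OffsetLocation, token: TokenOffsets) => void

// Hypothetical helper, meant to run each time a terminal is attached
// to the CstNode currently being built.
const updateLocationWithToken: LocationUpdater = (location, token) => {
    // Virtual tokens created by error recovery may carry NaN offsets: skip them.
    if (isNaN(token.startOffset) || isNaN(token.endOffset)) {
        return
    }
    // The first valid token fixes the start of the node.
    if (isNaN(location.startOffset)) {
        location.startOffset = token.startOffset
    }
    // Any valid token ending further right extends the end of the node.
    if (isNaN(location.endOffset) || token.endOffset > location.endOffset) {
        location.endOffset = token.endOffset
    }
}

// Used when the feature is disabled: does nothing.
const noopLocationUpdater: LocationUpdater = () => {}

// Chosen once (e.g. while constructing the parser), not on every token.
function selectLocationUpdater(enabled: boolean): LocationUpdater {
    return enabled ? updateLocationWithToken : noopLocationUpdater
}
```

A node's location would start out as `{ startOffset: NaN, endOffset: NaN }` and only become defined once the first real token is attached.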

kristianmandrup · 2019-03-30T12:04:47Z

Thanks. I will have another go next week using your suggestions, unless you beat me to it. What is your estimate of how complicated or time-consuming it would be to add this? 2-3 working days?

bd82 · 2019-03-30T13:57:47Z

What is your estimate of how complicated or time-consuming it would be to add this? 2-3 working days?

I think the base logic is pretty simple, but there would be some work around
performance and documentation.

bd82 · 2019-06-13T19:17:06Z

Released in 4.7.0
See: https://sap.github.io/chevrotain/docs/guide/concrete_syntax_tree.html#cstnodes-location

bd82 mentioned this issue Mar 28, 2019

SourceMap support #931

Closed

bd82 added the New Feature label Mar 28, 2019

This was referenced Apr 1, 2019

add start and end offsets on nodes jhipster/prettier-java#165

Closed

Handling users' blank lines jhipster/prettier-java#164

Closed

clementdessoude mentioned this issue Apr 11, 2019

implement logic to compute node location information #943

Merged

5 tasks

bd82 mentioned this issue Apr 14, 2019

Evaluate Flag to support children ordered array in a CST #863

Closed

4 tasks

bd82 added Good First Issue Help Wanted labels Apr 15, 2019

bd82 closed this as completed in #943 Jun 5, 2019

bd82 reopened this Jun 7, 2019

bd82 mentioned this issue Jun 7, 2019

Continued work on CST Location tracking. #972

Merged

bd82 closed this as completed Jun 8, 2019

bd82 mentioned this issue Jun 8, 2019

Handle blank lines jhipster/prettier-java#202

Merged

Shaolans mentioned this issue Jun 13, 2019

Handle prettierignore jhipster/prettier-java#209

Closed
