Document how to maintain line and column information with custom lexers #59

Open
fitzgen opened this Issue · 8 comments

5 participants

@fitzgen

I just ran into this while hacking on CoffeeScript, whose lexer doesn't expose the proper information. It would be nice if the interface were documented. I'm going to dig into the code now, but I can't guarantee a pull request.

@zaach
Owner

It's mentioned here: http://zaach.github.com/jison/docs/#tracking-locations

Basically, the lexer sets a yylloc property with a value that follows the API. How the lexer internally maintains line/column information is not specified; when lex() returns a token, the parser will use whatever value lexer.yylloc holds at that point.
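
A minimal sketch of that contract, assuming a hand-written lexer object. As far as I can tell, `setInput` and `lex` are the two methods Jison asks of a custom scanner; the `advance` helper, the `pos`/`line`/`column` fields, and the token names `NUMBER` and `EOF` are made up for illustration. Lines are counted from 1 and columns from 0 here, which is an assumption about what the generated lexer does.

```javascript
// Hypothetical hand-written lexer. Only setInput() and lex() are meant to be
// called by the parser; everything else is internal bookkeeping.
var lexer = {
    setInput: function (input) {
        this.input = input;
        this.pos = 0;
        this.line = 1;    // assumed 1-based
        this.column = 0;  // assumed 0-based
        this.yytext = "";
        this.yylloc = { first_line: 1, first_column: 0, last_line: 1, last_column: 0 };
        return this;
    },

    // Consume n characters, updating line and column as we go.
    advance: function (n) {
        for (var i = 0; i < n; i++) {
            if (this.input.charAt(this.pos) === "\n") {
                this.line += 1;
                this.column = 0;
            } else {
                this.column += 1;
            }
            this.pos += 1;
        }
    },

    lex: function () {
        // Skip whitespace without producing a token.
        while (this.pos < this.input.length && /\s/.test(this.input.charAt(this.pos))) {
            this.advance(1);
        }
        if (this.pos >= this.input.length) return "EOF";

        // Match a run of digits, or fall back to a single character.
        var rest = this.input.slice(this.pos);
        var match = /^[0-9]+/.exec(rest) || [rest.charAt(0)];
        var startLine = this.line;
        var startColumn = this.column;

        this.yytext = match[0];
        this.advance(match[0].length);

        // Set yylloc before returning; the parser reads it after lex() returns.
        this.yylloc = {
            first_line: startLine,
            first_column: startColumn,
            last_line: this.line,
            last_column: this.column
        };
        return /^[0-9]/.test(this.yytext) ? "NUMBER" : this.yytext;
    }
};
```

With something like this assigned to `parser.lexer`, the parser only ever looks at `yytext` and `yylloc` after each `lex()` call; how `line` and `column` are maintained is the lexer's own business.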

@fitzgen

That just mentions how to use @1.first_line, etc., not how to set lexer.yylloc from a custom lexer, which is what was throwing me off. Just thought it would be nice.

@zaach
Owner

Oh, I see. It would really depend on the lexer, though a suggestive guide or concrete example could help out, for sure. So far, CoffeeScript is the only implementation with a custom lexer that I've seen.

@showell

Hi, just following up on this. Like @fitzgen, I ran across this issue while trying to modify CoffeeScript.

@saikobee

I'm working on a project that uses a custom lexer, so I agree that better documentation on setting this would be useful. An example would be fantastic, but I think I can probably figure it out based on looking at the generated lexer.

@zaach
Owner

The relevant lines are here, when the lexer sets yylloc. Just make sure to set it whenever the lexer finds a match, as you would with yytext. You can see here how jison pushes yylloc onto a stack, same as with yytext.
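
To illustrate the stack idea, here is a simplified sketch of what a parser might do with those values. It is not Jison's actual code; the names `vstack`, `lstack`, `shift`, and `reduce` are chosen for illustration, and `reduce` assumes a rule with at least one symbol on its right-hand side.

```javascript
// Parallel stacks: one for semantic values, one for locations.
var vstack = [];
var lstack = [];

// On a shift, the parser snapshots whatever the lexer currently exposes.
function shift(lexer) {
    vstack.push(lexer.yytext);
    lstack.push(lexer.yylloc);
}

// On a reduce of `len` symbols, the new location spans from the start of the
// first symbol to the end of the last one.
function reduce(len, action) {
    var first = lstack[lstack.length - len];
    var last = lstack[lstack.length - 1];
    var loc = {
        first_line: first.first_line,
        first_column: first.first_column,
        last_line: last.last_line,
        last_column: last.last_column
    };

    // Pop the right-hand side off both stacks, then push the result.
    var values = vstack.splice(vstack.length - len, len);
    lstack.splice(lstack.length - len, len);
    vstack.push(action(values, loc));
    lstack.push(loc);
    return loc;
}
```

One consequence of a design like this: the stack holds a reference to the `yylloc` object, so a custom lexer is safer assigning a fresh object on every match than mutating the previous one in place.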

@saikobee

Thanks!

@aaditmshah

I've written a custom scanner called Lexer which can be integrated with Jison. Here's how you can expose the line and column information from Lexer to Jison:

var Parser = require("jison").Parser;
var Lexer = require("lex");

var grammar = {
    "bnf": {
        // ...
    }
};

var parser = new Parser(grammar);
var lexer = parser.lexer = new Lexer();

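// Lines are 1-based and columns 0-based, so that the column after a
// newline (reset to the length of the trailing text) stays consistent.
var row = 1;
var col = 0;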

function track(callback) {
    return function (lexeme) {
        var first_line = row;
        var first_column = col;

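        // Everything after the last newline determines the new column.
        var lines = lexeme.split("\n");
        var length = lines.pop().length;
        var newlines = lines.length;

        // A lexeme containing newlines resets the column; otherwise it advances.
        row += newlines;
        if (newlines > 0) col = length;
        else col += length;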

        this.yylloc = {
            first_line: first_line,
            first_column: first_column,
            last_line: row,
            last_column: col
        };

        this.yytext = lexeme;

        return callback.call(this, lexeme);
    };
}

lexer.addRule(/\s+/, track(function () {}));

lexer.addRule(/[0-9]+(?:\.[0-9]+)?\b/, track(function (lexeme) {
    return "NUMBER";
}));

lexer.addRule(/$/, function () {
    return "EOF";
});

Hope that helps.
