Document how to maintain line and column information with custom lexers #59

fitzgen opened this Issue · 8 comments

5 participants


I just ran into this while hacking on CoffeeScript, whose lexer doesn't expose the proper information. It would be nice if the interface were documented. I'm going to dig into the code now, but I can't guarantee a pull request.


It's mentioned here:

Basically, the lexer sets a yylloc property whose value follows the API. How the lexer internally maintains line/column information is not specified; when lex() returns a token, the parser uses whatever value lexer.yylloc holds at that moment.
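To make that concrete, here is a minimal sketch of a hand-written lexer that keeps yylloc up to date. It assumes the usual custom scanner shape (an object with setInput() and lex() that exposes yytext and yylloc); makeLexer and the token names NUMBER and INVALID are made up for illustration, and columns are 0-based as in Jison's generated lexers.

```javascript
// Hypothetical hand-written lexer. Only setInput(), lex(), yytext and yylloc
// are names the parser reads; everything else here is illustrative.
function makeLexer() {
    var input = "";
    var pos = 0;
    var line = 1;   // 1-based line number
    var column = 0; // 0-based column

    return {
        yytext: "",
        yylloc: { first_line: 1, first_column: 0, last_line: 1, last_column: 0 },

        setInput: function (text) {
            input = text;
            pos = 0;
            line = 1;
            column = 0;
            this.yylloc = { first_line: 1, first_column: 0, last_line: 1, last_column: 0 };
            return this;
        },

        lex: function () {
            // Skip whitespace, but keep counting lines and columns.
            while (pos < input.length && /\s/.test(input[pos])) {
                if (input[pos] === "\n") { line += 1; column = 0; }
                else { column += 1; }
                pos += 1;
            }
            if (pos >= input.length) return "EOF";

            // A run of digits is a NUMBER; any other single character is INVALID.
            var match = /^[0-9]+/.exec(input.slice(pos));
            var text = match ? match[0] : input[pos];

            // Set yytext and a fresh yylloc object before returning the token.
            this.yytext = text;
            this.yylloc = {
                first_line: line,
                first_column: column,
                last_line: line,
                last_column: column + text.length
            };
            pos += text.length;
            column += text.length;
            return match ? "NUMBER" : "INVALID";
        }
    };
}
```

Assigning an object like this to parser.lexer should be enough for @1.first_line and friends to pick up the values in grammar actions.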


That just mentions how to use @1.first_line, etc., not how to set lexer.yylloc from a custom lexer, which is what was throwing me off. Just thought it would be nice.


Oh, I see. It would really depend on the lexer, though a suggestive guide or concrete example could help out, for sure. So far, CoffeeScript is the only implementation with a custom lexer that I've seen.


Hi, just following up on this. Like @fitzgen, I ran across this issue while trying to modify CoffeeScript.


I'm working on a project that uses a custom lexer, so I agree that better documentation on setting this would be useful. An example would be fantastic, but I think I can probably figure it out based on looking at the generated lexer.


The relevant lines are here, when the lexer sets yylloc. Just make sure to set it whenever the lexer finds a match, as you would with yytext. You can see here how jison pushes yylloc onto a stack, same as with yytext.
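Since the parser keeps a reference to whatever object lexer.yylloc points at, it is probably safest to build a new location object for every match rather than mutate one in place. A rough sketch of a helper for that follows; nextLocation is a made-up name, not part of Jison, and it assumes 1-based lines and 0-based columns.

```javascript
// Hypothetical helper: returns a NEW location object covering `text`,
// starting where the previous location `prev` ended.
function nextLocation(prev, text) {
    var lines = text.split("\n");
    var newlines = lines.length - 1;
    var lastLength = lines[lines.length - 1].length;

    return {
        first_line: prev.last_line,
        first_column: prev.last_column,
        last_line: prev.last_line + newlines,
        // After a newline the column restarts; otherwise it keeps advancing.
        last_column: newlines > 0 ? lastLength : prev.last_column + lastLength
    };
}

var start = { first_line: 1, first_column: 0, last_line: 1, last_column: 0 };
var a = nextLocation(start, "foo");  // covers line 1, columns 0 to 3
var b = nextLocation(a, "\n  ");     // starts at 1:3, ends at 2:2
```

In a custom lexer you would then do something like this.yylloc = nextLocation(this.yylloc, match) right where you set yytext.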




I've written a custom scanner called Lexer which can be integrated with Jison. Here's how you can expose the line and column information from Lexer to Jison:

var Parser = require("jison").Parser;
var Lexer = require("lex");

var grammar = {
    "bnf": {
        // ...
    }
};

var parser = new Parser(grammar);
var lexer = parser.lexer = new Lexer;

var row = 1; // lines are 1-based
var col = 0; // columns are 0-based, as in Jison's generated lexers

// Wraps a rule callback so that yylloc and yytext are set for every match.
function track(callback) {
    return function (lexeme) {
        var first_line = row;
        var first_column = col;

        // Count the newlines in the match and measure the text after the last one.
        var lines = lexeme.split("\n");
        var length = lines.pop().length;
        var newlines = lines.length;

        row += newlines;
        if (newlines > 0) col = length;
        else col += length;

        this.yylloc = {
            first_line: first_line,
            first_column: first_column,
            last_line: row,
            last_column: col
        };

        this.yytext = lexeme;

        return callback.call(this, lexeme);
    };
}

// Whitespace yields no token, but still advances row and col.
lexer.addRule(/\s+/, track(function () {}));

lexer.addRule(/[0-9]+(?:\.[0-9]+)?\b/, track(function (lexeme) {
    return "NUMBER";
}));

lexer.addRule(/$/, function () {
    return "EOF";
});

Hope that helps.
