CLI improvements + potpourri #143

Closed
wants to merge 60 commits into
from

Conversation

Projects
None yet
4 participants
@GerHobbelt
Contributor

GerHobbelt commented Feb 16, 2013

Though instead of really pulling this one you might just want to take a quick look at it all and cherrypick, say, commit 79ac482 instead:

  • added a sanity check to the CLI --module-type parameter
  • made sure the CLI --debug parameter is treated as the boolean flag it is

I HOPE, next to that, you go for the git submodules way like I did[*] because node/npm-based 'tracking the latest work in the libraries used by the jison tool' isn't exactly exhilarating when you're working on the [jison] tool itself (and since jison is a development tool, not a production-critical-path component, I think using submodules is a better way than going happy happy joy joy in the node/npm dept.)

[*] I know that bit is graded It Works For Me(tm) so tread carefully. The idea is more important than the actual code here.


Anyway, this should be about the last of the pull req's tonight. Been sitting on the stuff long enough -- at least it got (& gets) tested with a complex production grammar so the most glaring bugs have been removed from the mix.

atrniv and others added some commits Oct 1, 2012

patched the code generator as it broke on the offending line (where t…
…he action code is poured into a Function()) for action blocks such as this one:

    ... rule ...
    | TAN '(' e ')'
        {
            $$ = {
                token:      FKW_FUNCTION,
                token_attr: FKA_ABSOLUTE,
                childs:     [null, $3],
                loc_info:   @$,
                type:       FT_NUMBER,
                unit:       $3.unit,
                value:      Math.abs($3.value)
            };
        }

The object creation like that would produce a 'Unexpected symbol' error.
and the same 'barf hairball on good action code' problem in the lexer…
…, fixed now. Unfortunately it also means that errorneous generated action code is only detected when you try to run the generated parser.

(I ran into the issue that jison is very whitespace sensitive: the lexer b0rked because a line separating two lexer rules contained a series of spaces and jison then decided that the second rule was simply some more action code of the first :-( )
fix: ensure that ALL tokens are converted to their numeric ID in the …
…lexer actions. Of course the lexer MAY return strings but it is cleaner and faster to return numeric IDs every time.

This resolves the lexer output oddness for larger lexer action blocks where multiple 'return' statements can be placed in conditional sections inside the action block. for example (simplified):

([A-Za-z0-9_][A-Za-z0-9_]*         %{
    console.log("looking up row/column/named identifier token in symbol table: ", yytext, this, this.matches);
    s = yytext.toUpperCase();
    if (yy.symbolTable[s]) {
        rv = yy.symbolTable[s];
        yytext = {
            text: yytext,
            col: function() {
                throw new Exception("I'll be buggered!");
            }
        };
        return rv.token;
    }

    // Would it be a viable basic column identifier?
    match = s.match(/^[A-Z]+$/);
    if (match) {
        // check if this is a legal column id:
        return 'COLUMN_ID';
    }
    // Would it be a viable basic row identifier?
    match = s.match(/^[0-9]+$/);
    if (match) {
        return 'ROW_ID';
    }
    return 'error';
%}
http://api.jquery.com/live/ : .live() is long gone deprecated. Now i…
…t is finally removed and the code had to be changed to use .on() like it should.
add some basic documentation to the generated parser for easier maint…
…ainability / usability of the generated code.
NONASSOC constant defined twice + some code beautification (which als…
…o makes the generated code a bit more palatable for me) + minimal sync between base code and 'web/assets' copies.
copied the 'relative path' code from gh-pages back to the nanoc sourc…
…es (TBD: get the layout to work for a/b/ subdirs again)

This copying includes the USF page updates (style, content and code) and the TRY page updates (linking to the USF page as that one is pretty useful during development)
make sure the parseError member function is reset to the default once…
… a previous run of the same instance has temporarily overridden the parseError handler via the parser.yy.parseError override.

sample code (shortened):
  parser.yy.parseError = function local_override() { ... };
  parser.parse(input_a);
  // the next run should use the default error handler again:
  parser.yy.parseError = null;
  parser.parse(input_b);
- generated code now includes the lexer methods' descriptions for imp…
…roved readability of the generated code.

- fixes the prototype bug in the parseError reset code: this fix is browser-independent.
added support for a 'backtracking' lexer: this is a standard (non-fle…
…x) lexer which for each matching regex will invoke the corresponding action code: when this returns a TOKEN (~ truthy value), then that token is produced (so far the behaviour is identical to the regular lexer); when no token is produced, then the context is rewound and the next matching regex is tried instead, until either a token has been produced or the lexer rules set (regex set) has been exhausted, in which case an ERROR exception is thrown (this last bit is identical to the default lexer too).

This behaviour can be controlled in the generated run-time by the options.backtrack_lexer boolean.
- added support for more() in the backtracking lexer: now when a lexe…
…r rule invokes more() it is assumed to have succeeded, just as when it would have returned a token.

- The parser.lex() method has been corrected to correctly support the new lexer code, which will return boolean FALSE when no token was produced (so that .lexer.js() will recurse until a token has been produced by the lexer action code after all) -- I kept the recursion in there instead of turning it into a loop as it would rather quickly cause a fatal error in the JavaScript engine when the lexer action code is broken (and never produces a token): replacing this recursion by a loop would hang the lexer more or less indefinitely, making debugging such a situation much harder.

- added comments for the lexer condition state methods.

- fixed the topState method similar to commit 32bf307 from @eliasdorneles

- augmented the lexer condition state access methods to improve robustness (e.g. against overzealous use of popState() in user action code)

- topState() now can produce one of the pushed lexer condition states when a (positive, non-zero) numeric index is passed as a parameter: topState(0) is identical to topState() and produces the currently active lexer condition state, while topState(1) produces the previously active lexer condition state, etc.

- minor legibility tweak of the generated lexer code
- lexer: corrected the backtracking feature: the user must call the l…
…exer method .reject() to signal that the current lexer rule FAILED to match a suitable token. (This fixes the scenario where a lexer rule is used to skip whitespace: such a tule won't return a token, nor call more() and should still 'succeed'.)

- lexer: ._less was unused

- lexer: added the ._backtrack boolean member variable to track the invocation of reject() in the lexer action code.
Merge remote-tracking branch 'remotes/atrniv/master': topState fix, d…
…one another way...

Conflicts:
	lib/jison/lexer.js
remove exception throwing code when rule does not produce a token (th…
…ere are whitespace consuming lexer rules after all which INTENTIONALLY don't produce a token nor do such lexer rules call more() or reject().
fixes error recovery logic in the parser run-time (tested & verified …
…using the errorlab.js test file): previously 5 tests would fail, but after this fix, all pass.

The loop which looks for a matching error rule has been abstracted out into the function locateNearestErrorRecoveryRule() because the first cause for failed the tests was the parseError() handler firing before the erorr recovery could kick in: this (and user-defined) parseError handlers need a way to detect whether an error recovery rule is available (via the hash.recoverable boolean).

The tests also uncovered an infinitely loop in error recovery in the new code when the lexer hits EOF. This bug has been fixed.
adjusted the code to remove the error recovery from the run-time when…
… no error recovery rules are included in the user-provided grammar. (The correction was necessary as the run-time has been altered in the previous commits.)
- lexer.unput(): this.yyleng got out of sync with this.yytext.length …
…as it was not updated after the location range has been updated using its previous value

- ran 'make' and 'make site' to build and test the new code: zero failures.
- corrected the backtrack_lexer option name in the run-time exception…
… when reject() is invoked in the action code inside a non-backtracking lexer

- added lexer unit tests to verify proper backtrack_lexer operation
- corrected bugs in the backtracking lexer run-time code
- adjusted the backtracking lexer unit tests to ensure the correct behaviour is exhibited for both regular and backtracking lexers (i.e. backtracking errors due to reject() usage in regular lexers must be thrown)

- assert.throw instead of assert["throw"], just like the other assert.xyz NodeJS methods.
Added fix for when grammar file leaves some (invisible) whitespace on…
… the empty lines between lexer rules: previously, the lexer would, depending on the exact input, barf an inexplicable hairball or (worse!) fail silently. Unit tests to verify the success of the fix have been included.

GerHobbelt and others added some commits Jan 10, 2013

fixed bug in run-time lexer engine when backtracking is enabled but l…
…exer rule did not invoke reject() and did not 'return' a valid token ID: then the rule should be exactly like in the regular lexer: no further rules should be tested and the lexer should simply return false in order for the next bit of input to be lexed. (This happens, for instance, when skipping over whitespace in a backtracking lexer.)
missed one spot to update typo yyloc --> yylloc -- probably due to sc…
…rewup in git push/pull as I was working on two platforms at the same time :-( )
moved the USF images to the pngs/ asset directory + corrected the nan…
…oc rules to ensure the try/usf page is properly regenerated -- that's the page where you can see the diagnostics of the jison run (state machine), so it's quite handy to have around.
the lexer condition state stack was growing indefinitely (2 states pe…
…r rule with action code): the new code keeps the lexer condition state stack sane as now the action state is correctly POPped off the stack when a action code block has been lexed.
updated Makefile to use the gh-pages self-referential submodule: now …
…the gh-pages are updated immediately when invoking 'make deploy'.
updated the Makefile to ensure it does NOT FAIL when the gh-pages git…
… commands find there's nothing to update -- this is important when this Makefile is invoked by other Makefiles: 'nothing to update' is not really an error, after all.
- updated CLI code to work with recent nomnom releases (jison would l…
…ock up when used with a recent nonom version)

- updated the npm install package file: the listed versions are the ones used for testing.

- removed the npm install package dependency on an antiauated copy of jison, which was used to bootstrap this stuff.
Merge remote-tracking branch 'remotes/petkaantonov/master'
Conflicts:
	lib/jison/lexer.js
	tests/lexer/regexplexer.js
Merge remote-tracking branch 'remotes/zaach-original/master'; also re…
…generated the files and site.

Conflicts:
	Makefile
	lib/jison.js
	lib/jison/cli-wrapper.js
	lib/jison/ebnf.js
	lib/jison/lexer.js
	lib/jison/util/bnf-parser.js
	lib/jison/util/lex-parser.js
	package.json
	src/bnf.l
	src/bnf.y
	src/jisonlex.l
	src/jisonlex.y
	tests/all-tests.js
	tests/grammar/ebnf.js
	tests/grammar/lex_parse.js
	tests/lexer/regexplexer.js
	web/content/assets/js/jison.js
- added a sanity check to the CLI --module-type parameter
- made sure the CLI --debug parameter is treated as the boolean flag it is

- updated the submodules representing the various parts of jison: hopefully the migration of my patches into those buggers went all right...
@zaach

This comment has been minimized.

Show comment Hide comment
@zaach

zaach Feb 16, 2013

Owner

npm allows you to link to local versions of packages, which makes developing jison itself pretty seamless. No need to publish or even commit anything before you test changes made to any of the linked dependencies. Much better than submodules, IMO.

Have all of these changes been split into the other PR requests, or are there new things in here as well?

Owner

zaach commented Feb 16, 2013

npm allows you to link to local versions of packages, which makes developing jison itself pretty seamless. No need to publish or even commit anything before you test changes made to any of the linked dependencies. Much better than submodules, IMO.

Have all of these changes been split into the other PR requests, or are there new things in here as well?

@GerHobbelt

This comment has been minimized.

Show comment Hide comment
@GerHobbelt

GerHobbelt Feb 18, 2013

Contributor
  1. well, that means I have to check out that way of npm then. (I'm not
    particularly fond of node for other reasons, so that's holding me back too
    much regarding this particular bit, I see now.)

  2. most of the stuff; IIRC only the Makefile stuff hasn't made it into a
    'manual extract', but that one is mostly about moving ../pages to
    ./gh-pages and has some edits that depends on the way I have things laid
    out. My verdict: let's see how the other pullreqs fare when I pull them
    back in (merge back); if I think there's any valuables still lingering we
    can always do another attempt at bloating your pullreq list. ;-)

Thanks for all your effort!

Met vriendelijke groeten / Best regards,

Ger Hobbelt


web: http://www.hobbelt.com/
http://www.hebbut.net/
mail: ger@hobbelt.com

mobile: +31-6-11 120 978

On Sat, Feb 16, 2013 at 7:38 PM, Zach Carter notifications@github.comwrote:

npm allows you to link to local versions of packages, which makes
developing jison itself pretty seamless. No need to publish or even commit
anything before you test changes made to any of the linked dependencies.
Much better than submodules, IMO.

Have all of these changes been split into the other PR requests, or are
there new things in here as well?


Reply to this email directly or view it on GitHubhttps://github.com/zaach/jison/pull/143#issuecomment-13665400.

Contributor

GerHobbelt commented Feb 18, 2013

  1. well, that means I have to check out that way of npm then. (I'm not
    particularly fond of node for other reasons, so that's holding me back too
    much regarding this particular bit, I see now.)

  2. most of the stuff; IIRC only the Makefile stuff hasn't made it into a
    'manual extract', but that one is mostly about moving ../pages to
    ./gh-pages and has some edits that depends on the way I have things laid
    out. My verdict: let's see how the other pullreqs fare when I pull them
    back in (merge back); if I think there's any valuables still lingering we
    can always do another attempt at bloating your pullreq list. ;-)

Thanks for all your effort!

Met vriendelijke groeten / Best regards,

Ger Hobbelt


web: http://www.hobbelt.com/
http://www.hebbut.net/
mail: ger@hobbelt.com

mobile: +31-6-11 120 978

On Sat, Feb 16, 2013 at 7:38 PM, Zach Carter notifications@github.comwrote:

npm allows you to link to local versions of packages, which makes
developing jison itself pretty seamless. No need to publish or even commit
anything before you test changes made to any of the linked dependencies.
Much better than submodules, IMO.

Have all of these changes been split into the other PR requests, or are
there new things in here as well?


Reply to this email directly or view it on GitHubhttps://github.com/zaach/jison/pull/143#issuecomment-13665400.

@GerHobbelt

This comment has been minimized.

Show comment Hide comment
@GerHobbelt

GerHobbelt Dec 29, 2013

Contributor

Considered closed; another series of edits is due shortly and this has been dealt with via further pull requests, none of which are pending anymore. So we assign this one to the dustbin. ;-)

Contributor

GerHobbelt commented Dec 29, 2013

Considered closed; another series of edits is due shortly and this has been dealt with via further pull requests, none of which are pending anymore. So we assign this one to the dustbin. ;-)

@GerHobbelt GerHobbelt closed this Dec 29, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment