Parametrize the grammar by externally-supplied variables #36

Open
ceymard opened this issue Aug 15, 2011 · 7 comments

@ceymard

ceymard commented Aug 15, 2011

I'll take the example of Jinja (I created a JS clone of it with PEG.js):

You can write {% block my_block %}{% endblock %}, which is the usual syntax.

It is however possible in Jinja to redefine {% and %} to other tokens. I would like to be able to have a rule like tag = tk_open "block" tk_close contents tk_open "endblock" tk_close {...}

Maybe with another syntax? Like tag = $tk_open "block" $tk_close to indicate that these are terminals held in a variable?

The variable could then be declared in the leading { ... } section. This is best used in conjunction with another ticket I opened: "Create an optional argument 'options' in parse()"
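
To make the request concrete, here is a minimal hand-written sketch in plain JavaScript (not PEG.js output) of a block-tag matcher whose delimiters are supplied at parse time. The function name `parseBlock` and the option names `tkOpen` and `tkClose` are made up for illustration; they are not part of PEG.js or Jinja.

```javascript
// Hypothetical sketch: the delimiters are data supplied by the caller,
// not literals baked into the grammar.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function parseBlock(input, options) {
  // Fall back to the usual Jinja delimiters when none are given.
  var tkOpen = escapeRegExp((options && options.tkOpen) || "{%");
  var tkClose = escapeRegExp((options && options.tkClose) || "%}");

  // tag = tk_open "block" name tk_close contents tk_open "endblock" tk_close
  var pattern = new RegExp(
    "^" + tkOpen + "\\s*block\\s+(\\w+)\\s*" + tkClose +
    "([\\s\\S]*?)" +
    tkOpen + "\\s*endblock\\s*" + tkClose + "$"
  );

  var match = pattern.exec(input);
  return match ? { name: match[1], contents: match[2] } : null;
}

// Usage: the same function handles redefined delimiters.
var standard = parseBlock("{% block my_block %}hi{% endblock %}");
var custom = parseBlock("<% block a %>x<% endblock %>", { tkOpen: "<%", tkClose: "%>" });
```

A regular expression is enough here only because the example is a single flat tag; nested blocks would need a real grammar, which is the point of the feature request.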

@dmajda
Contributor

dmajda commented Aug 20, 2011

It is however possible in Jinja to redefine {% and %} to other tokens.

You mean in the template itself? Or is this specified before the template parsing starts? Can you give me an example and/or a pointer to the documentation?

@ceymard
Author

ceymard commented Aug 20, 2011

It is specified before the parsing starts. See http://jinja.pocoo.org/docs/api/#high-level-api with the block_start_* options.

@guiprav

guiprav commented Oct 26, 2011

That would be a really handy feature that I'd appreciate seeing in PEG.js. Since I'm not familiar with the library's internals, I was considering writing a preprocessor to do that.

Not only variables would be needed in my case, though. I find myself writing this a lot:

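CommaSeparatedIdentifierSequence
  = left:Identifier opt:(_* "," _* CommaSeparatedIdentifierSequence)?
    {
        var sequence = [left];
        // When the optional group matched, opt is [ws, ",", ws, rest];
        // guard against an unmatched group before indexing into it.
        var more = opt ? opt[3] : undefined;

        if (more !== undefined)
            sequence = sequence.concat(more);

        return sequence;
    }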

When I could be doing something like:

CommaSeparatedIdentifierSequence
  = @Sequence(Identifier, _* "," _*)

@Sequence being a rule template accepting the sequence element and separator as first and second "parameter", respectively:

@Sequence (Element, Separator)
  = left:Element opt:(Separator @Sequence(Element, Separator))? { ... }
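
As a rough illustration of what such a rule template amounts to, here is a sketch using hand-written parser combinators in plain JavaScript. Nothing here is PEG.js API: `regex`, `sequence`, and the `(input, pos)` convention are assumptions made for the example, and the template is written as a loop rather than the recursive form above.

```javascript
// Hypothetical mini-combinators. Each parser takes (input, pos) and
// returns { value, pos } on success or null on failure.
function regex(re) {
  // re is expected to start with ^ so it only matches at the slice start.
  return function (input, pos) {
    var m = re.exec(input.slice(pos));
    return m ? { value: m[0], pos: pos + m[0].length } : null;
  };
}

// The "rule template": Sequence(Element, Separator).
function sequence(element, separator) {
  return function (input, pos) {
    var first = element(input, pos);
    if (!first) return null;

    var values = [first.value];
    var end = first.pos;
    for (;;) {
      var sep = separator(input, end);
      if (!sep) break;
      var next = element(input, sep.pos);
      if (!next) break; // a trailing separator is left unconsumed
      values.push(next.value);
      end = next.pos;
    }
    return { value: values, pos: end };
  };
}

// Instantiating the template, as @Sequence(Identifier, _* "," _*) would.
var identifier = regex(/^[A-Za-z_]\w*/);
var comma = regex(/^\s*,\s*/);
var identifierList = sequence(identifier, comma);

var result = identifierList("a, b ,c", 0);
```

The template is just a function from parsers to a parser, so the same `sequence` could be reused with any element and separator without duplicating the rule body.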

@Skalman

Skalman commented Dec 10, 2012

I have another, related use case, when parsing a freely formed document (wikitext from Wiktionary):

There are special keywords (the article title) which vary per parsed document:

// PEG:
title_line
  = "'''" %article_title "'''\n"
// JS:
parser.parse("...\n'''door'''\n...", { values: { article_title: 'door' } });

A further extension would be to also allow for arrays of values, in cases where there might be a long list of possible keyword values:

// PEG:
lang_header
  = '==' %lang '==\n'
// JS:
parser.parse("...\n==English==\n...\n==French==\n...", { values: { lang: ['English', 'French', 'Swedish', ... ] } });

Of course, it is possible to generate the PEG grammar for each document instead, but built-in support seems doable and would be a lot more efficient.
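
For comparison, the grammar-generation workaround could look roughly like the sketch below: a small preprocessor that expands `%name` placeholders in the grammar source before it is handed to the parser generator. `expandGrammar` and `quoteLiteral` are hypothetical helpers, and using `JSON.stringify` to produce a PEG string literal is an assumption that holds for ordinary text but has not been checked against every escape sequence.

```javascript
// Hypothetical preprocessor: expands %name placeholders in grammar source
// into PEG string literals, or into an ordered choice for arrays.
function quoteLiteral(text) {
  return JSON.stringify(String(text));
}

function expandGrammar(source, values) {
  return source.replace(/%([A-Za-z_]\w*)/g, function (whole, name) {
    if (!Object.prototype.hasOwnProperty.call(values, name)) {
      throw new Error("No value supplied for %" + name);
    }
    var value = values[name];
    if (Array.isArray(value)) {
      // Longest first, so a shorter alternative cannot shadow a longer one
      // in the ordered choice.
      var sorted = value.slice().sort(function (a, b) {
        return b.length - a.length;
      });
      return "(" + sorted.map(quoteLiteral).join(" / ") + ")";
    }
    return quoteLiteral(value);
  });
}

var single = expandGrammar("title_line = \"'''\" %article_title \"'''\\n\"", {
  article_title: "door"
});
var choice = expandGrammar("lang_header = '==' %lang '==\\n'", {
  lang: ["En", "English"]
});
```

This naive version also rewrites any `%name` that happens to appear inside an existing string literal or action, and the parser has to be regenerated whenever the values change, which is the cost that built-in support would avoid.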

@fusepilot

This is something I'd like to see implemented. My current project requires a dynamic rule similar to Skalman's example.

@andreineculau
Contributor

@n2liquid I see myself doing this, with no annoyance:

CommaSeparatedIdentifierSequence
  = left:Identifier opt:CommaSeparatedIdentifier*
  { return [left].concat(opt) }
CommaSeparatedIdentifier
  = _* "," _* Identifier:Identifier
  { return Identifier }

@ceymard I was about to write that what you're suggesting is possible now (maybe it wasn't in 2011), like this:

// taken from your PR's test
{
  var patterns = {
    lit: 'okay',
    yes: '%%',
    fun: function() {return 'bop'}
  }
}

start
  = &{return patterns.lit} {return 'done'}
  / &{return patterns.yes} {return 'other'}
  / &{return patterns.fun()} {return 'fun'}

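but it's not, since &{...} is a semantic predicate: it only checks whether the code returns a truthy value, so it neither matches the returned string against the input nor advances the parser position. Pity; maybe there should be new operators for matching as well.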
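
The difference can be sketched with two hand-written parser functions in plain JavaScript. Neither is PEG.js API: `predicate` mimics what &{...} does, while `dynamicLiteral` is a hypothetical version of the missing "match what the code returns" operator.

```javascript
// Parsers take (input, pos) and return { value, pos } or null.

// Like &{ ... }: succeeds when fn() is truthy and never consumes input.
function predicate(fn) {
  return function (input, pos) {
    return fn() ? { value: undefined, pos: pos } : null;
  };
}

// Hypothetical operator: matches the text returned by fn() at the current
// position and advances past it.
function dynamicLiteral(fn) {
  return function (input, pos) {
    var text = String(fn());
    return input.startsWith(text, pos)
      ? { value: text, pos: pos + text.length }
      : null;
  };
}

var patterns = { lit: "okay", yes: "%%" };
var asPredicate = predicate(function () { return patterns.lit; });
var asLiteral = dynamicLiteral(function () { return patterns.lit; });

// The predicate succeeds on any input because "okay" is truthy;
// the dynamic literal succeeds only when the input really starts with it.
var predicateResult = asPredicate("nope", 0);
var literalResult = asLiteral("okay!", 0);
```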

@guiprav

guiprav commented Oct 15, 2014

@andreineculau, that's indeed better since it avoids the clunky "magic index" my code introduces, and writing that is indeed pretty simple. The ability to define and instantiate "template rules" might still be an important addition, though, to allow for more complex grammars without resorting to duplication.

I haven't been using PEG.js lately, but I know when I was, there were some recurring rule patterns that were not so trivial to factor into simpler reusable ones like you did; and, dare I say, some seemed simply impossible. So duplication was unavoidable.

Unfortunately, I don't have examples of those handy anymore. Anyways, I'm not asking anything here, I'm just saying, really, FWIW.

dmajda changed the title from "Add the ability to use variables in rules" to "Parametrize the grammar by externally-supplied variables" on Aug 14, 2015