-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature request: make optional rules mandatory #21
Comments
Update: I'm able to identify the rules which include setupcat <<EOF > ./temp.g4
grammar temp;
foo: bar | opt_baz | quux;
opt_baz: baz | ; // <-- I'd like to delete `| ;`
bar: 'bar';
baz: 'baz';
quux: 'quux';
EOF #!/bin/bash
xpath='//parserRuleSpec//alternative[not(element)]'
trparse ./temp.g4 | trxgrep "$xpath" | trxml
# Result size 1
# <alternative>
# </alternative> and the names of rules which have an alternate branch xpath='//parserRuleSpec//alternative[not(element)]/ancestor::parserRuleSpec/RULE_REF'
trparse ./temp.g4 | trxgrep "$xpath" | trxml
# Result size 1
# opt_baz EDIT: I'm working with the following bash function: # an xpath query to seek and destroy `| ;` tokens that weren't grouped into
# populated `<alternative>` nodes
xpath_del="//parserRuleSpec[RULE_REF/text() = '<RULE_NAME>']//alternative[not(element)]/../../*[text() = '|']"
# an XPath query to find all relevant references to a rule
xpath_rename="//parserRuleSpec//ruleBlock//RULE_REF[text()='<RULE_NAME>']"
make_mandatory() {
local rule_name="${1:?rule name required}"
# read IR json from stdin
trdelete "${xpath_del//<RULE_NAME>/$rule_name}" | # delete the offending `| ;`
trreplace "${xpath_rename//<RULE_NAME>/$rule_name}" "$rule_name?" # rename references to `rule_name` to `rule_name?`
}
# which works as follows:
trparse ./temp.g4 | make_mandatory "opt_baz" | trprint
# grammar temp;
# foo: bar | opt_baz? | quux;
# opt_baz: baz ;
# bar: 'bar';
# baz: 'baz';
# quux: 'quux'; |
I think one way to do this would be to (1) find a LHS symbol of a rule that contains an empty top-level alt. (2) Construct an xpath expression that finds all uses of any of the symbol names found in step 1. Then, use trinsert to add a "?" after the symbol name. I don't think I have an "insert after" option for trinsert, so this would need to be added. (3) After you made the inserts, go back to find all the rules with empty top-level alts, and use trdelete to nuke those empty alts. So, there would be a little bit of work fixing trinsert. I already added some function to the "TrashBase" to have InsertBefore() and InsertAfter(). I'm not sure what the best xpath expression would be for step (1). |
Hi @SKalt , I finally figured out how to do this, and wrote this Bash script to do it.
In order to follow along at what this does, you have to really look at the parse tree (
The first line of this script parses the grammar then finds the names of rules that have an alternative that is empty. One of the problems was that I forgot what The second line is fairly grungy because I don't know of a great way in Bash to take a string of rule names separated by spaces, and turn it into an xpath test. E.g., in the grammar The last line finds all occurrences of the symbol To get this to work, I had to modify trinsert. I noticed that some of the tools were crashing because dependent assemblies where not loading, so I checked and fixed all the tools, and have a new release tomorrow (v0.11.3). This is a wonderful example of what the tool chain can do. Thanks so much for bringing it up. |
Nice use of
Do you think it's worth writing this transform out in C# rather than bash? In the meantime I found a new, interesting variation of this problem: how to
So grammar temp
foo: bar opt_baz quux;
opt_baz: bar | ;
bar: 'bar';
quux: 'quux'; would be transformed like +foo: bar baz? quux;
-foo: bar opt_baz quux;
-opt_baz: bar | ; I solved that with # an xpath to get the text of the non-optional branch of a trivially optional rule
populated="//parserRuleSpec[RULE_REF/text()='<RULE_NAME>']" # find a specific rule
populated="${populated}/ruleBlock/ruleAltList" # with a top-level list of alternatives
populated="${populated}[count(labeledAlt) = 2]" # check the alt list has only 2 branches
populated="${populated}/labeledAlt[alternative[element]]" # select the populated branch
# ^ this is largely paranoia; in my pipeline I've already used a variation of this
# query to find the names of the relevant rules.
trivial_ref="//parserRuleSpec//ruleref[RULE_REF/text()='<RULE_NAME>']"
trivial_def="//parserRuleSpec[RULE_REF/text()='<RULE_NAME>']"
dissolve_trivially_opt_rule() {
local rule_name="${1:?rule name required}"
local buffer
local text
buffer="$(cat -)"
text="$(
echo "$buffer" |
trxgrep "${populated//<RULE_NAME>/$rule_name}" |
trtext
)"
echo "$buffer" |
trreplace "${trivial_ref//<RULE_NAME>/$rule_name}" "($text)?" |
trdelete "${trivial_def//<RULE_NAME>/$rule_name}"
echo "done: $rule_name" >&2
} |
I've got a grammar where some rules match the empty string. I'd like to transform those rules such that they no longer match the empty string and have all references to the rules made optional. Example:
More difficult example:
If you point me to where this feature should go, I'd be happy to take a stab at it.
The text was updated successfully, but these errors were encountered: