Predictor within `Any` breaks if one parser is a substring of another #2

duncanmorris · 2018-05-04T14:29:07Z

The predictor feature of the Any function breaks if one parser is a substring of another. I've got code to produce a minimal version of the issue.

The parsing always works if the most specific (longest) parser is matched. If the least specific (shortest) parser matches, the predictor returns this template the next time it is called which means Any fails to match the more specific parser. We call Any with the most specific parsers first leading to the least specific parser, but the predictor means the order isn't always enforced.

We have solved this by creating a function called NonPredictiveAny that removes the predictive parts of the function. This comes at the expense of not being able to return the longest error (since Error.pos isn't exported). I thought you might have a better solution and/or might want to add an equivalent of NonPredictiveAny for others who have the same issue.

Code to reproduce.

package main

import (
	"os"
	"strings"

	parsify "github.com/vektah/goparsify"
)

func makeTemplateParser() parsify.Parser {
	var template parsify.Parser

	parsify.EnableLogging(os.Stdout)

	a := parsify.Exact(`a`).Map(func(n *parsify.Result) {
		n.Result = `a`
	})
	ab := parsify.Exact(`ab`).Map(func(n *parsify.Result) {
		n.Result = `ab`
	})

	aab := parsify.Any(ab, a)

	template = parsify.Many(aab).Map(func(node *parsify.Result) {
		var xs []string
		for _, child := range node.Child {
			xs = append(xs, child.Result.(string))
		}
		node.Result = strings.Join(xs, ", ")
	})

	return template

}

func main() {}

And a test suite.

package main

import (
	"testing"

	"github.com/stretchr/testify/assert"
	parsify "github.com/vektah/goparsify"
)

func TestRender1(t *testing.T) {
	input := `a`

	var template = makeTemplateParser()
	rendered, err := parsify.Run(template, input)

	assert.Nil(t, err)
	assert.Equal(t, "a", rendered)

}

func TestRender2(t *testing.T) {
	input := `ab`
	var template = makeTemplateParser()
	rendered, err := parsify.Run(template, input)
	assert.Nil(t, err)
	assert.Equal(t, "ab", rendered)

}

func TestRender3(t *testing.T) {
	input := `ab a`
	var template = makeTemplateParser()
	rendered, err := parsify.Run(template, input)

	assert.Nil(t, err)
	assert.Equal(t, "ab, a", rendered)

}

func TestRender4(t *testing.T) {
	input := `a ab`
	var template = makeTemplateParser()
	rendered, err := parsify.Run(template, input)

	assert.Nil(t, err)
	assert.Equal(t, "a, ab", rendered)

}

The text was updated successfully, but these errors were encountered:

vektah · 2018-06-11T02:00:07Z

I think to do this properly the parsers would need to statically expose the first chars they might accept so collisions could be detected and not put in the predictor. It would also mean the predictor table could be built during setup, rather then dynamically at runtime.

I don't really have time to dig deep on this, and this is definitely going to be a confusing bug for anyone hitting it so I've dropped the predictor from the Any parser. It wasn't shaving that much time off anyway.

Cleanup the README, add benchmark to Makefile original hash: d7d8466 Co-authored-by: Damien Stanton <damien.stanton@gmail.com>

vektah closed this as completed in 8a5260a Jun 11, 2018

austin-searchpilot mentioned this issue Jun 29, 2018

Possible bug with Any (or Seq)? #3

Open

ajitid pushed a commit to ajitid/goparsify that referenced this issue Apr 11, 2023

Merge pull request vektah#2 from damienstanton/next

e254fc0

Cleanup the README, add benchmark to Makefile original hash: d7d8466 Co-authored-by: Damien Stanton <damien.stanton@gmail.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Predictor within `Any` breaks if one parser is a substring of another #2

Predictor within `Any` breaks if one parser is a substring of another #2

duncanmorris commented May 4, 2018

vektah commented Jun 11, 2018 •

edited

Loading

Predictor within Any breaks if one parser is a substring of another #2

Predictor within Any breaks if one parser is a substring of another #2

Comments

duncanmorris commented May 4, 2018

vektah commented Jun 11, 2018 • edited Loading

Predictor within `Any` breaks if one parser is a substring of another #2

Predictor within `Any` breaks if one parser is a substring of another #2

vektah commented Jun 11, 2018 •

edited

Loading