Support for parser combinators #27
Comments
Hi @axman6, You bring up an interesting question. I've been thinking about this and I have some ideas, but I will need a bit of time to express them in more detail. The TL;DR is:
I hope to elaborate on 2) and 3) more in the future; this is just a sketch. Anyway, thanks for thinking about this. If you have ideas, we're happy to hear them. |
What is an NFA? |
NFA stands for nondeterministic finite automaton: https://en.wikipedia.org/wiki/Nondeterministic_finite_automaton |
Would you consider supporting both regexes and parser combinators? I take your point in the simple cases of URLs and emails. However this machinery and contribution model could easily encompass, say, street addresses. I find many parallels between Duckling's and libpostal's architecture, but I find it unlikely we could achieve a sufficiently good street address parser using regex, even with community support. This might be a non-goal, but it is a worthy one. |
@dmvianna I wouldn't discard the idea, though the added value should outweigh the complexity cost. It seems like libpostal is a good candidate for that, so I wouldn't jump into reinventing the wheel. What are the limitations that could be overcome by Duckling? |
Recursion. As far as I understand, What naïve code won't do is to assign |
@dmvianna I'm open to looking at a prototype for US postal addresses to see if we can get something useful even without the resolution part. |
Many of the regex examples have reached the point where a parser combinator library would be a much better option. A prime example is the URL matcher, which can easily be defined precisely using parser combinators, while at the moment it's fairly ad hoc and loses a lot of information (the path doesn't work for URLs which contain usernames and passwords, something users might want to match on in order to forbid or warn users who are posting URLs they shouldn't):
(For this specific example, the Network.URI module from the network-uri package already provides
parseURI :: String -> Maybe URI
)
I don't have an implementation for this yet (nor a preference for a combinator library) because I don't fully understand how Duckling all fits together, and I wanted to open this issue to start a discussion about it.
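To make the idea concrete, here is a minimal sketch of what a combinator-based URL matcher could look like. It uses ReadP from base only because that avoids picking a library; the Url record, the field names, and parseUrl are hypothetical and not part of Duckling. It assumes the input is an already isolated token and handles only a simplified grammar, not the full URI syntax.

```haskell
import Data.Char (isAlphaNum, isDigit)
import Data.Maybe (listToMaybe)
import Text.ParserCombinators.ReadP

-- Hypothetical result type for illustration; not a Duckling type.
data Url = Url
  { urlScheme   :: String
  , urlUserInfo :: Maybe (String, Maybe String) -- user, optional password
  , urlHost     :: String
  , urlPort     :: Maybe Int
  , urlPath     :: String
  } deriving (Eq, Show)

-- user [":" password] "@"
userInfo :: ReadP (String, Maybe String)
userInfo = do
  user <- munch1 isUserChar
  pass <- option Nothing (Just <$> (char ':' *> munch isUserChar))
  _    <- char '@'
  pure (user, pass)
  where
    isUserChar c = c `notElem` ":@/"

-- Simplified grammar (an assumption, not the full URI syntax):
-- scheme "://" [userinfo] host [":" port] ["/" rest]
url :: ReadP Url
url = do
  scheme <- munch1 isAlphaNum
  _      <- string "://"
  info   <- option Nothing (Just <$> userInfo)
  host   <- munch1 (\c -> isAlphaNum c || c `elem` "-.")
  port   <- option Nothing (Just . read <$> (char ':' *> munch1 isDigit))
  path   <- option "" ((:) <$> char '/' <*> munch (const True))
  eof
  pure (Url scheme info host port path)

-- Keep the first parse that consumes the whole input, if any.
parseUrl :: String -> Maybe Url
parseUrl s = listToMaybe [u | (u, "") <- readP_to_S url s]

main :: IO ()
main = do
  print (parseUrl "https://alice:secret@example.com:8080/a")
  print (parseUrl "https://example.com/path")
```

The point is that the username and password end up as separate fields, so a rule could match on them directly instead of losing them the way the current regex does.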