Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to use the bnf lib in sql parse function #135

Open
zerix opened this issue Jun 26, 2023 · 7 comments
Open

How to use the bnf lib in sql parse function #135

zerix opened this issue Jun 26, 2023 · 7 comments

Comments

@zerix
Copy link

zerix commented Jun 26, 2023

I want to write a custom VQL parser use nom. And I found it have not contain grammer define function.

Can I use bnf lib to define VQL/SQL grammar like antlr4 g4 file?

I can't find some example for this, I think the process is:

  1. define VQL/SQL grammar file, like vql.bnf.
  2. use bnf lib parse vql.bnf to parse tree.
  3. traverse the parse tree use Grammar::traverse or earley::Traversal and output AST?
  4. parse the AST and get the VQL/SQL stmt define.
  5. stmt mod output VQL/SQL logical plan.

I didn't know the process is right or not? Anyone could help me the right way?

Thanks.

@CrockAgile
Copy link
Collaborator

That process sounds right to me! I am not familiar with VQL, but if we could get a vql.bnf, then we could add this as an example. Thank you for asking!

@shnewto
Copy link
Owner

shnewto commented Jun 29, 2023

I agree with @CrockAgile except to note that Grammar::traverse isn't what you're looking for, thats just a private function used to generate random sentences from a grammar 😄

With an ANTLR shaped use case, I'd vote the "Parse Sentence" example in the Readme would be a good place to start https://github.com/shnewto/bnf#parse-sentence-via-grammar (it only prints the trees you'd need to walk through)

@zerix
Copy link
Author

zerix commented Jun 30, 2023

Thank you very much for your reply. @shnewto I have read and test the "Parse Sentence" example, and i got a sql parse tree. But I found it difficult to translate it into Abstract Syntax Tree (AST) , and it may be a commonly used function. so I think if the bnf project should provide it ?

@CrockAgile I write a simple vql.bnf file, and I will try to write walk through code with the bnf file. If you have some time, very glad to read your example code. (I 'am a newbie in the rust and interpreter field, I 'am translating the example from python to rust, and i will begin writing bnf example code once it is done. https://ruslanspivak.com/lsbasi-part1/)

vql.bnf.txt

@kiwilab-cn
Copy link

@shnewto Did bnf lib support '[..]' or ' { }' grammar define?

eg:

<jdbc_uri> ::= <mysql> <colon> <double_slash> <host> <colon> <port> <colon> [<database_name>]

<jdbc_connection_info> ::= <jdbc_connection_required> { <jdbc_connection_options> }

@shnewto
Copy link
Owner

shnewto commented Aug 4, 2023

@kiwilab-cn no, we don't currently support [] or {} (I would like to support some variants like EBNF eventually though)

@CrockAgile
Copy link
Collaborator

I started experimenting with a SQL example on a branch here.

It's not really "full SQL" (that grammar would be large) but it parses the input "CREATE VIEW employed_names AS SELECT e.first_name FROM Employees AS e" to this parse tree:

flowchart TD
0["view_definition"] --> 1["CREATE VIEW"]
0["view_definition"] --> 2["ws"]
2["ws"] --> 3[" "]
0["view_definition"] --> 4["identifier"]
4["identifier"] --> 5["letter"]
5["letter"] --> 6["e"]
4["identifier"] --> 7["identifier_rest"]
7["identifier_rest"] --> 8["letter"]
8["letter"] --> 9["m"]
7["identifier_rest"] --> 10["identifier_rest"]
10["identifier_rest"] --> 11["letter"]
11["letter"] --> 12["p"]
10["identifier_rest"] --> 13["identifier_rest"]
13["identifier_rest"] --> 14["letter"]
14["letter"] --> 15["l"]
13["identifier_rest"] --> 16["identifier_rest"]
16["identifier_rest"] --> 17["letter"]
17["letter"] --> 18["o"]
16["identifier_rest"] --> 19["identifier_rest"]
19["identifier_rest"] --> 20["letter"]
20["letter"] --> 21["y"]
19["identifier_rest"] --> 22["identifier_rest"]
22["identifier_rest"] --> 23["letter"]
23["letter"] --> 24["e"]
22["identifier_rest"] --> 25["identifier_rest"]
25["identifier_rest"] --> 26["letter"]
26["letter"] --> 27["d"]
25["identifier_rest"] --> 28["identifier_rest"]
28["identifier_rest"] --> 29["_"]
28["identifier_rest"] --> 30["identifier_rest"]
30["identifier_rest"] --> 31["letter"]
31["letter"] --> 32["n"]
30["identifier_rest"] --> 33["identifier_rest"]
33["identifier_rest"] --> 34["letter"]
34["letter"] --> 35["a"]
33["identifier_rest"] --> 36["identifier_rest"]
36["identifier_rest"] --> 37["letter"]
37["letter"] --> 38["m"]
36["identifier_rest"] --> 39["identifier_rest"]
39["identifier_rest"] --> 40["letter"]
40["letter"] --> 41["e"]
39["identifier_rest"] --> 42["identifier_rest"]
42["identifier_rest"] --> 43["letter"]
43["letter"] --> 44["s"]
0["view_definition"] --> 45["ws"]
45["ws"] --> 46[" "]
0["view_definition"] --> 47["AS"]
0["view_definition"] --> 48["ws"]
48["ws"] --> 49[" "]
0["view_definition"] --> 50["select_statement"]
50["select_statement"] --> 51["SELECT"]
50["select_statement"] --> 52["ws"]
52["ws"] --> 53[" "]
50["select_statement"] --> 54["expr"]
54["expr"] --> 55["identifier"]
55["identifier"] --> 56["letter"]
56["letter"] --> 57["e"]
54["expr"] --> 58["."]
54["expr"] --> 59["identifier"]
59["identifier"] --> 60["letter"]
60["letter"] --> 61["f"]
59["identifier"] --> 62["identifier_rest"]
62["identifier_rest"] --> 63["letter"]
63["letter"] --> 64["i"]
62["identifier_rest"] --> 65["identifier_rest"]
65["identifier_rest"] --> 66["letter"]
66["letter"] --> 67["r"]
65["identifier_rest"] --> 68["identifier_rest"]
68["identifier_rest"] --> 69["letter"]
69["letter"] --> 70["s"]
68["identifier_rest"] --> 71["identifier_rest"]
71["identifier_rest"] --> 72["letter"]
72["letter"] --> 73["t"]
71["identifier_rest"] --> 74["identifier_rest"]
74["identifier_rest"] --> 75["_"]
74["identifier_rest"] --> 76["identifier_rest"]
76["identifier_rest"] --> 77["letter"]
77["letter"] --> 78["n"]
76["identifier_rest"] --> 79["identifier_rest"]
79["identifier_rest"] --> 80["letter"]
80["letter"] --> 81["a"]
79["identifier_rest"] --> 82["identifier_rest"]
82["identifier_rest"] --> 83["letter"]
83["letter"] --> 84["m"]
82["identifier_rest"] --> 85["identifier_rest"]
85["identifier_rest"] --> 86["letter"]
86["letter"] --> 87["e"]
50["select_statement"] --> 88["ws"]
88["ws"] --> 89[" "]
50["select_statement"] --> 90["FROM"]
50["select_statement"] --> 91["ws"]
91["ws"] --> 92[" "]
50["select_statement"] --> 93["expr"]
93["expr"] --> 94["identifier"]
94["identifier"] --> 95["letter"]
95["letter"] --> 96["E"]
94["identifier"] --> 97["identifier_rest"]
97["identifier_rest"] --> 98["letter"]
98["letter"] --> 99["m"]
97["identifier_rest"] --> 100["identifier_rest"]
100["identifier_rest"] --> 101["letter"]
101["letter"] --> 102["p"]
100["identifier_rest"] --> 103["identifier_rest"]
103["identifier_rest"] --> 104["letter"]
104["letter"] --> 105["l"]
103["identifier_rest"] --> 106["identifier_rest"]
106["identifier_rest"] --> 107["letter"]
107["letter"] --> 108["o"]
106["identifier_rest"] --> 109["identifier_rest"]
109["identifier_rest"] --> 110["letter"]
110["letter"] --> 111["y"]
109["identifier_rest"] --> 112["identifier_rest"]
112["identifier_rest"] --> 113["letter"]
113["letter"] --> 114["e"]
112["identifier_rest"] --> 115["identifier_rest"]
115["identifier_rest"] --> 116["letter"]
116["letter"] --> 117["e"]
115["identifier_rest"] --> 118["identifier_rest"]
118["identifier_rest"] --> 119["letter"]
119["letter"] --> 120["s"]
93["expr"] --> 121["ws"]
121["ws"] --> 122[" "]
93["expr"] --> 123["AS"]
93["expr"] --> 124["ws"]
124["ws"] --> 125[" "]
93["expr"] --> 126["identifier"]
126["identifier"] --> 127["letter"]
127["letter"] --> 128["e"]
Loading

@shnewto and anyone else, let me know what you think! What else should be included in this example? Should examples be linked to in the crate's main README ?

@shnewto
Copy link
Owner

shnewto commented Aug 25, 2023

adding this an example would be awesome, definitely thing a nod to the examples in the readme is a good call, and maybe an examples readme with some more specific details if there are some?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants