Replies: 11 comments
-
lexical precedence
-
Thank you for your answer! I still don't understand what I should do. Could you be more specific, please?
-
https://tree-sitter.github.io/tree-sitter/creating-parsers
-
Thank you so much! I had tried that after your first message but hadn't figured it out. Now I have!
-
This time another issue showed up. As I mentioned before, this is a parser for a note-taking plugin, so '<', the identifiers, and '>' should be declared as different rules. However, since I used token, this query:

    (bold_text
      "<" @comment
      ">" @comment) @kojl.boldtext

gave me an error.

So basically, this kind of query does not accept a token rule either, because when I wrote the rule without token, it did what it should. My question is: is there a way to highlight specific parts of a token rule? Or is there a way to use the rules in the
-
What do your bold_text and related rules look like now?
-
So, using token, the rule is:

    bold_text: $ => {
      const identifier = /[a-zA-Z0-9_]\w*/
      const space = ' '
      const bsymbol = choice(
        "#",
        "*",
        "-",
        "$",
        "~",
        ",",
        ".",
        "%",
        ';',
        '+',
        '`',
        '!',
        '@',
        '^',
        '&',
        '(',
        ')',
        '{',
        '}',
        '[',
        ']',
        '\\',
        '|',
        '/',
        '?',
        '\'',
        '=',
        '"',
      )
      return token(seq(
        "<",
        repeat(
          choice(
            identifier,
            space,
            bsymbol
          )
        ),
        ">",
        // optional(space)
      ))
    },

However, since I want to highlight "<" and ">" differently from the text inside, I need them as separate rules:

    bold_text: $ => prec.left(1, seq(
      $.bold_symbol_s,
      repeat(choice(
        $.bold_identifier
      )),
      $.bold_symbol_e
    )),

However, the moment I type '<', it is identified as bold_text, which I don't want. I want it to be identified as bold_text only if there is a '>' after all the identifiers. Since this will run in neovim, and neovim parses the buffer continuously, this is really important for me.
-
Okay, so your grammar is not LR(1). You need a substantial amount of lookahead before you can determine whether a < token is a plain < or a bold start token. What language is this? Did you develop it yourself? Usually in language design you have to keep an eye on ease of parsing and make choices accordingly. For example, what if a user writes a regular <, types some text, then uses a regular > later on? Any time you see a <, you will have to read to the end of the file to determine whether there is a corresponding >. I figure there are two ways to do it: either you declare the conflict and use dynamic precedence so that GLR parsing resolves it, or you use an external scanner.
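To make the lookahead problem concrete, here is a minimal sketch in plain JavaScript of the decision a scanner would have to make at each '<'. It is not Tree-sitter API: `findBoldEnd` and `classifyAngle` are hypothetical names, and the set of allowed body characters is an assumption copied from the bold_text rule posted above (identifier characters, space, and the listed symbols). A real external scanner would be written in C, but the logic would be roughly the same.

```javascript
// Hypothetical helpers for illustration only; not part of Tree-sitter.
// Characters assumed to be allowed inside a bold span, taken from the
// bold_text rule in this thread: [a-zA-Z0-9_], space, and the bsymbol list.
const BOLD_BODY = /[A-Za-z0-9_ #*\-$~,.%;+`!@^&(){}\[\]\\|\/?'="]/;

// Returns the index of the closing '>' if the '<' at `start` opens a
// well-formed bold span, or -1 if it should be treated as a plain symbol.
function findBoldEnd(text, start) {
  if (text[start] !== '<') return -1;
  for (let i = start + 1; i < text.length; i++) {
    const ch = text[i];
    if (ch === '>') return i;            // closing delimiter found
    if (!BOLD_BODY.test(ch)) return -1;  // newline, another '<', etc.
  }
  return -1; // reached the end of the input without seeing '>'
}

// Classifies the '<' at `start` using the unbounded lookahead above.
function classifyAngle(text, start) {
  return findBoldEnd(text, start) === -1 ? 'symbol' : 'bold_text';
}

console.log(classifyAngle('<bold text> rest', 0)); // bold_text
console.log(classifyAngle('a < b', 2));            // symbol
```

Note that the loop may run arbitrarily far before it can answer, which is exactly why a single token of lookahead is not enough here.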
-
Oh, I get it now. I'll take a look at what I can do about it, thanks! Yes, this is a language I am developing. My understanding was that when I run my parser through nvim-treesitter, it parses the document on every change. I think my problem is, as you mentioned, with dynamic precedence. Although I read the documentation on the website, I didn't pay much attention to that part. I will take a look at it again. Thank you so much, you helped a lot!
-
You can take inspiration from Markdown, which uses the * character to denote bold and italic text. If a user wants to write just
-
I had actually taken a look at it, but I'll check it again to find how it uses these precedence functions. Thank you again!
-
Hello everyone,
I am writing a parser for taking notes in neovim. You can check out my repository. Right now, I am working on creating different kinds of text. So far I have normal text, bold text, and underlined text. To create bold text, you put it between '<' and '>', so the format is: '<', the text, then '>'. Right now, it works. However, when you try to write '<' as a plain symbol, it is parsed as though it were bold text. My shortened code is:
Example file to parse:
The output of tree-sitter parse example.txt:
I tried everything to make sure that bold_text has the highest precedence. So if I am wrong about the prec([number], rule) function, please correct me.
Conclusion:
So basically, I want to use '<' and '>' as symbols as well as delimiters for bold_text. If there is anything you can suggest, I would like to know it.
Thank you!