No distinction between a function declaration and a variable declaration #124

Open
MaxVerevkin opened this issue Jul 24, 2021 · 5 comments

Comments

@MaxVerevkin

The tree-sitter parser makes no distinction between a function declaration and this constructor call.
It even thinks that x or n in my example are types. So this is probably a tree-sitter-cpp issue.

image

Originally posted by @theHamsta in nvim-treesitter/nvim-treesitter#1625 (comment)

@IndianBoy42

IndianBoy42 commented Jul 25, 2021

https://godbolt.org/z/PxfYhdEYn

Parsing variable and function declarations is highly context-dependent. An identifier could name a variable, a function, or a type, and which one it is changes how the surrounding code parses.

In your example (demonstrated in the godbolt link), if n were a type, it really would be a function declaration. To parse the code correctly you also have to keep track of every declared type and variable, including those that come from included headers, so there seems to be no way for tree-sitter to get this right.

The last example is known as the most vexing parse: even though n is a value in scope, the int(n) in the parameter list is parsed as a function parameter, so that line is again a function declaration. I believe the rule (of thumb?) is that if it could be a function declaration, then it is. So perhaps that's why tree-sitter-cpp parses your example as a function (although that rule doesn't actually apply here).
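To make the three cases concrete, here is a minimal sketch (not the code from the godbolt link; the namespace names are made up for illustration). The same tokens `std::vector<int> vec(n);` declare a variable or a function depending on what `n` names, and the `static_assert`s let the compiler confirm each reading:

```cpp
#include <type_traits>
#include <vector>

namespace n_is_value {  // hypothetical name, for illustration only
    int n = 3;
    // `n` names a variable, so this declares a vector with 3 elements.
    std::vector<int> vec(n);
    static_assert(std::is_same<decltype(vec), std::vector<int>>::value,
                  "vec is a variable");
}

namespace n_is_type {  // hypothetical name, for illustration only
    using n = int;
    // `n` names a type, so the same tokens declare a function that takes
    // an unnamed int and returns std::vector<int>.
    std::vector<int> vec(n);
    static_assert(std::is_function<decltype(vec)>::value,
                  "vec is a function");
}

namespace vexing {  // hypothetical name, for illustration only
    int n = 3;
    // Most vexing parse: `int(n)` is read as a parameter named n of type
    // int, so this is a function declaration even though a variable n is
    // in scope. Some compilers emit a warning here.
    std::vector<int> vec(int(n));
    static_assert(std::is_function<decltype(vec)>::value,
                  "vec is a function");
}
```

A parser that only sees the tokens, as tree-sitter does, has no way to tell the first two namespaces apart.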

So there doesn't seem to be a way to get it right 100% of the time, but perhaps the precedence could be changed so that cases like this parse as variable declarations. Local function declarations are exceedingly rare anyway.

@MaxVerevkin
Author

MaxVerevkin commented Jul 25, 2021

Okay, I see. So, I suppose there are two solutions?

  • vector<int> vec(n); is a variable declaration if n is a (locally) known identifier. I'm not familiar enough with tree-sitter to tell whether that's possible.
  • vector<int> vec(n); is a function declaration if it's at global scope and a variable declaration otherwise. Local variables are much more common than local function declarations.
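As a rough illustration of the second option, the sketch below shows where a scope-based guess would be right and the rare case where it would be wrong. All names (`count_t`, `make`, `local_variable`, `local_function`) are invented for this example, and the comments describe what such a heuristic would guess, not what tree-sitter-cpp currently does:

```cpp
#include <cstddef>
#include <vector>

using count_t = int;  // hypothetical type alias

// Global scope: a function declaration. The heuristic guesses "function",
// which is correct here.
std::vector<int> make(count_t);

std::size_t local_variable() {
    int n = 4;
    // Local scope, `n` is a value: a variable. The heuristic guesses
    // "variable", which is correct here.
    std::vector<int> vec(n);
    return vec.size();
}

std::size_t local_function() {
    // Local scope, `count_t` is a type: a block-scope function declaration.
    // This is legal but rare, and the heuristic would misclassify it as a
    // variable.
    std::vector<int> make(count_t);
    return make(2).size();
}

// Definition of the function declared above: a vector of n zeros.
std::vector<int> make(count_t n) {
    return std::vector<int>(static_cast<std::size_t>(n));
}
```

The trade-off is that the common case (local variables) is labelled correctly at the cost of the uncommon one (local function declarations).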

@theHamsta
Contributor

theHamsta commented Jul 25, 2021

@IndianBoy42 is absolutely right. We could try to solve this by taking the context into account in our queries (i.e. not guessing "function declaration" inside function bodies), or just leave it the way it is. I guess even with locals this will be almost impossible to distinguish.

@narpfel

narpfel commented Aug 3, 2021

This problem is actually undecidable, as described in this blog post. Not only would the parser have to keep track of existing (local and global) variables, it would also have to perform template instantiation (i.e. arbitrary computation).
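A small sketch of why template instantiation matters (this is not taken from the blog post; `Pick`, `fib`, and the function names are invented for illustration). Whether `Pick<...>::member` is a type or a value depends on which specialization is selected, and that in turn depends on the result of a compile-time computation:

```cpp
#include <type_traits>

// Primary template: `member` names a type.
template <int N>
struct Pick {
    using member = int;
};

// Specialization for 0: `member` names a value.
template <>
struct Pick<0> {
    static constexpr int member = 7;
};

// Any compile-time computation can select the specialization.
constexpr int fib(int k) { return k < 2 ? k : fib(k - 1) + fib(k - 2); }

static_assert(std::is_same<Pick<fib(5)>::member, int>::value,
              "fib(5) != 0, so member is a type");
static_assert(Pick<fib(0)>::member == 7,
              "fib(0) == 0, so member is a value");

int declares_pointer() {
    int target = 5;
    // `member` is a type here, so `member * q` declares a pointer q.
    Pick<fib(5)>::member * q = &target;
    return *q;
}

int multiplies() {
    int q = 6;
    // `member` is a value here, so `member * q` is a multiplication.
    return Pick<fib(0)>::member * q;
}
```

To classify `Pick<fib(k)>::member * q`, a parser would have to evaluate `fib(k)` and pick the matching specialization, which is the arbitrary computation mentioned above.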

@williamhCode

Any updates on this? I would much prefer that treesitter parse these as variables, because when I'm using treesitter textobjects, "go to next function" lands me on the variable declaration. And also, who would ever write a local function declaration 😭.
