Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python module resolution not working via tree-sitter-stack-graphs-python cli #430

Closed
nohehf opened this issue Apr 30, 2024 · 7 comments · Fixed by #438
Closed

Python module resolution not working via tree-sitter-stack-graphs-python cli #430

nohehf opened this issue Apr 30, 2024 · 7 comments · Fixed by #438

Comments

@nohehf
Copy link
Contributor

nohehf commented Apr 30, 2024

Hello again,
I've been testing tree-sitter-stack-graphs-python via the cli, and while I have great results on a single file, I was not able to obtain concluent output on multiple files with modules.

Here is a minimal reproduction of my issue:

Given the following directory structure (working dir being: /Users/nohehf/tmp/py):

.
├── main.py
└── module.py

With:

# main.py
import module

baz = module.foo
# module.py
foo = "bar"

If I run:

DIR="/Users/nohehf/tmp/py"

tree-sitter-stack-graphs-python clean --all
tree-sitter-stack-graphs-python index $DIR
tree-sitter-stack-graphs-python status $DIR
tree-sitter-stack-graphs-python visualize $DIR

I get:

/Users/nohehf/tmp/py/main.py: indexed
/Users/nohehf/tmp/py/module.py: indexed
Visualization at stack-graph.html

Now if I query main.py on the foo reference (main.py:3:14), it does not find the definition in module.py:

tree-sitter-stack-graphs-python query definition "main.py:3:14"

/Users/nohehf/tmp/py/main.py:3:14: found 0 definitions for 1 references
queried reference
/Users/nohehf/tmp/py/main.py:3:14:
3 | baz = module.foo
  |              ^^^

has no definitions

However, trying to reproduce the behaviour in a test (test.py) I get no issues:

#------ path: module.py ------#

foo = "bar"

#------ path: main.py ------#

import module

baz = module.foo
#            ^ defined: 3
tree-sitter-stack-graphs-python test test.py
test.py: success

Am I missing something with the cli ? Are tests really behaving like those are separate files ?

I'm sure this is supposed to work and I have trouble understanding what's going on in the graph visualization:
image

Thank you

@nohehf
Copy link
Contributor Author

nohehf commented Apr 30, 2024

I just tried in javascript with a similar project structure, and it works just fine:

.
├── index.js
└── module.js
// index.js
import { foo } from "./module"

const baz = foo
// module.js
export const foo = "bar"
REPO="/Users/nohehf/tmp/js"

tree-sitter-stack-graphs-javascript clean --all
tree-sitter-stack-graphs-javascript index $REPO
tree-sitter-stack-graphs-javascript status $REPO
/Users/nohehf/tmp/js/index.js: indexed
/Users/nohehf/tmp/js/module.js: indexed
tree-sitter-stack-graphs-javascript query definition "index.js:3:13"
/Users/nohehf/tmp/js/index.js:3:13: found 2 definitions for 1 references
queried reference
/Users/nohehf/tmp/js/index.js:3:13:
3 | const baz = foo
  |             ^^^

has 2 definitions
/Users/nohehf/tmp/js/index.js:1:10:
1 | import { foo } from "./module"
  |          ^^^
/Users/nohehf/tmp/js/module.js:1:14:
1 | export const foo = "bar"
  |              ^^^

@hendrikvanantwerpen
Copy link
Collaborator

From what I see, the import is modeled as a push (with a downward arrow) of module, but the module access requires pops (the upward arrows) of the full path. I don't quite remember if this is on purpose or an oversight in the rules.

I wonder what happens if you run the index command in the directory itself? I think there's some logic missing that takes the source root into account when computing the module names.

@nohehf
Copy link
Contributor Author

nohehf commented Apr 30, 2024

Oh I see what you mean. I think I'm starting to understand the visualization, you are referring about (1) and (2) right?
CleanShot 2024-04-30 at 14 28 40

So I tried running it with both the full path:

tree-sitter-stack-graphs-python clean --all
tree-sitter-stack-graphs-python index /Users/nohehf/tmp/py
tree-sitter-stack-graphs-python query definition "/Users/nohehf/tmp/py/main.py:3:14"
/Users/nohehf/tmp/py/main.py:3:14: found 0 definitions for 1 references
queried reference
/Users/nohehf/tmp/py/main.py:3:14:
3 | baz = module.foo
  |              ^^^

has no definitions

And just the current directory (.)

tree-sitter-stack-graphs-python clean --all
tree-sitter-stack-graphs-python index .
tree-sitter-stack-graphs-python query definition "main.py:3:14"
/Users/nohehf/tmp/py/main.py:3:14: found 0 definitions for 1 references
queried reference
/Users/nohehf/tmp/py/main.py:3:14:
3 | baz = module.foo
  |              ^^^

has no definitions

And visualisations are exactly the same.

@nohehf
Copy link
Contributor Author

nohehf commented May 4, 2024

Hello @hendrikvanantwerpen, do you have any idea on this one ?
I checked a bit into the code but I'm not sure where the issue comes from to be honest.
I 100% agree that there is an issue with absolute / relative paths here but I don't really know how to fix it / get around it.
Let me know if / how I can help.
Thanks

@hendrikvanantwerpen
Copy link
Collaborator

@nohehf I don't immediately know what might be going on here. One quick experiment you could do is to replicate your test case in TypeScript, and see how the graphs differ. This will give us a starting point for what to look at.

@nohehf
Copy link
Contributor Author

nohehf commented May 6, 2024

This is the JavaScript reproduction: #430 (comment)
Here is the visualization:
CleanShot 2024-05-06 at 17 43 53@2x

Here is an archive with the source code and the .html visualization.
js.zip

We can see that in the visualization we do have the right push & pop on the full path.

@nohehf
Copy link
Contributor Author

nohehf commented May 27, 2024

Getting back to this as I tried to reproduce the error in tests without success.
The following test passes:

# ------ path: module.py -----------#

foo = 42

# ------ path: main.py -------------#

from module import foo

print(foo)
#     ^ defined: 3, 7

yet reproducing the same hierarchy in a separate dir does not, when indexed and queried through the cli (the 3 ref is not found).

I added an output of the visualization from the test and got:
image

As opposed to the one from the "manual" cli testing:
CleanShot 2024-05-27 at 16 05 29@2x

We can see that the test one does not prefix with all the parts from the absolute path. Maybe the "base path" of the codebase should be stripped out from the cli graph construction ?

I cannot really figure out if the issue here is the tsg or the cli/indexer

EDIT: it actually does work via the cli if the full path is specified (but it's not valid in python)

from Users.nohehf.tmp.repro.module import foo

print(foo)
#     ^ defined: 3, 7

CleanShot 2024-05-27 at 18 59 46@2x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants