New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Annotated AST #166
Comments
There is, yes: the |
Can you help me understand what I'm doing wrong here? import ast
from pyanalyze import name_check_visitor
code = """
a = 1
"""
tree = ast.parse(code)
ncv = name_check_visitor.NameCheckVisitor("bla", code, tree, annotate=True)
ncv.visit(tree)
assign = tree.body[0]
assert assign.value.inferred_value.get_type() is int # OK
a = assign.targets[0].inferred_value
print(repr(a)) # UnresolvedValue() :( It would expect |
I guess we don't infer a value for names in write context because it doesn't matter for pyanalyze's own type inference. I submitted a fix in #167. (Also note that the version of pyanalyze on PyPI is pretty out of date, if you're using it. I'll hopefully make a new PyPI release soon.) |
Nice, it's working, thanks! I'll try it on more complicated stuff. |
Great to hear! Also for more complicated cases pyanalyze will likely want a real imported module. You may need a trick like in
I'm interested in making this a more directly usable API if it's useful for you, something like a documented |
Yes, that would be very useful for me. And I imagine that for others as well. |
Great! Let me know what your experience is with this functionality. |
Well, it's definitely a rocky road :) import ast
from pyanalyze import name_check_visitor
code = """
class A:
a = 1
"""
tree = ast.parse(code)
ncv = name_check_visitor.NameCheckVisitor("bla", code, tree, annotate=True)
ncv.visit(tree)
print( tree.body[0].body[0].targets[0].inferred_value ) # AttributeError :( |
OK, that's because class bodies are only visited during the second "checking" pass, but your code makes it only do the "collecting" pass. Calling |
Okay, it's working now. It crashes if any imports are wrong, which ideally it shouldn't, but for now it's not a problem. |
Thanks, could you share more details on that? I'd like to make it more robust to this sort of thing. |
Well, if I analyze a file within a package, it might have all sorts of relative imports that aren't available. Ideally they should be, of course, and that probably warrants a warning, but ideally I'd like it to just accept the import as unknown, like Anyway, new problem: import ast
from pyanalyze import name_check_visitor
from pyanalyze.test_name_check_visitor import _make_module
code = """
class A:
def __init__(self):
self.a = 1
"""
mod = _make_module(code)
tree = ast.parse(code)
ncv = name_check_visitor.NameCheckVisitor("bla", code, tree, annotate=True, module=mod)
ncv.visit(tree)
ncv.check()
a = tree.body[0].body[0].body[0].targets[0].inferred_value
print(repr(a)) # UnresolvedValue() |
For modules, this should be fixable by giving the right module name and filename to the constructor, so that pyanalyze knows how to do the relative imports right. It still shouldn't crash though, but just log an error and turn the modules into Any/UNRESOLVED_VALUE. I'l look into doing that. The attribute thing is the same issue as for Name, but needs to be fixed in the |
I wonder if perhaps this sort of API would require a lot of changes. |
I fixed the issue for attribute and subscript in #168. There aren't any more kinds of assignment targets, so hopefully at least this particular kind of issue won't require more changes. I think when I previously used this functionality I just looked at the type of expressions, not of variables. |
Nice. Well, that may be beyond the scope of what pyanalyze can do, but I would expect that to work: import ast
from pyanalyze import name_check_visitor
from pyanalyze.test_name_check_visitor import _make_module
code = """
class A:
def __init__(self):
self.a = 1
def bla(self):
return self.a
a = A()
b = a.bla()
"""
mod = _make_module(code)
tree = ast.parse(code)
ncv = name_check_visitor.NameCheckVisitor("bla", code, tree, annotate=True, module=mod)
ncv.visit(tree)
ncv.check()
print( tree.body[0].body[1].body[0].inferred_value) # UnresolvedValue(), expected int
print( tree.body[1].value.inferred_value ) # UnresolvedValue(), expected A
print( tree.body[2].value.inferred_value ) # UnresolvedValue(), expected int |
I'd expect that to work, it might require some different hack around the module object. It may work better if you're looking at values inside a function. At the module level pyanalyze tends to just look at module globals, so if those aren't available things might not work. I'll spend some time tonight trying to expose a usable API. Your feedback is very helpful! |
Happy to help! I'm surprised at how difficult it is to find such API, given how many different static analyzers exist for Python. But they each re-implement everything from scratch, and provide no access to it. Maybe this can become a popular building block. |
How does #169 work for you? |
The build failed. I work with py38, so I'll try when it's fixed. |
It's green now! Happily the tests uncovered a few more places where we weren't adding annotations to the AST. |
The example I gave here ( #166 (comment) ) is still giving me 3 P.S. I'm also getting the following error whenever I run it (on windows 8):
But it doesn't interrupt the run. |
Calling
The first Any is because you're printing the value of the Return node, but we only set the value on the expression, not on the Return node itself. I'm afraid I haven't tested on Windows before, I guess I need to be more careful what paths I use there. Maybe we should try using |
Doing
I don't know. But I would guess that you're not adding an annotation on return, or that it's not connected in the same way that an explicit annotation is. |
The UnresolvedValues you're seeing are likely related to the internal errors around Windows paths, at least some of them. Could you try locally changing the
It should work through the |
Just pushed a change to the PR that may fix Windows (it will skip looking for stubs for module names that aren't valid identifiers). |
Yeah, there's no error anymore. But the results are still unresolved. |
Not sure how to explain that. To be clear, you're running |
It says there is no I am literally running the exact code I pasted above. |
I see, I was referring to the new functions from the PR (in the |
Helps with #166 Plus some related changes: - Make the importer fall back to importing outside sys.path (which seems generally useful) - Make name_check_visitor.py work a bit better if self.module is None - Ensure all expressions have an annotation set
Hi, are you making any progress to fix this issue? I don't mean to pressure, but if not, it seems I'll have to find a different solution. |
From my perspective, it's basically fixed, since pyanalyze has public functions that do what you're asking for (released in 0.2.0 last week and as of yesterday documentd at https://pyanalyze.readthedocs.io/en/latest/reference/ast_annotator.html). But it sounded like it's not working for you due to some issue that I couldn't reproduce. And it might be due to Windows-specific issues, which I don't really have a great way to debug. Could you tell me exactly what code you're running now that doesn't do what you expect? I'm also open to doing a call or some sort of pair programming to help figure out the problems if you're interested. |
Here's an updated version of the code: from pyanalyze.ast_annotator import annotate_code
code = """
class A:
def __init__(self):
self.a = 1
def bla(self):
return self.a
a = A()
b = a.bla()
"""
tree = annotate_code(code)
print( repr(tree.body[0].body[1].body[0].value.inferred_value) ) # KnownValue(1) -- Okay
print( repr(tree.body[2].targets[0].inferred_value)) # UnresolvedValue() -- Not Okay! I get the same problem when I run it in both Windows and Ubuntu (WSL). Can you please run this and show me what you get as a result? Sure, we can do a voice call if you think it will help. I like discord, but I'm open to other apps. |
Thanks! I get the same as you, first KnownValue(1) and then UnresolvedValue(). Good to know that at least this isn't OS-dependent. It's a bit of an edge case, as it relies on some ad-hoc inference logic instead of type annotations, but I'd expect it to work, so I'll look into why it doesn't work with Interestingly, it does manage to infer the type as |
In general I wouldn't be surprised if you run into more places where pyanalyze doesn't infer things as well as it could. That's basically #64. I'll fix such issues as I encounter them, but well, there's a lot of Python :) |
From my perspective, this is a very very simple use-case. I don't mind the occasional bug or missing feature. But if this is considered an edge case, then maybe it's not ready for this kind of use yet. |
That's fair. I say it's an edge case because pyanalyze is now generally intended to be a PEP 484-compatible type checker, which means it largely works on the basis of annotations when working across functions. Mypy also thinks |
Fixed this case in #201. |
How can I access the call graph? For example, if I have a |
The inferred value of the |
Thanks, seems to be working! If you don't mind, I'd like to suggest adding an API for that, just to save the users the trouble of figuring it out each time for themselves. Something like class CollectFunctions(ast.NodeVisitor):
def __init__(self):
self.functions = {}
def visit_FunctionDef(self, func_def):
self.functions[func_def.inferred_value.val] = func_def
class Analyze:
def __init__(self, code):
self.code = code
self.tree = annotate_code(code)
c = CollectFunctions()
c.visit(self.tree)
self._functions = c.functions
def get_callee(self, call):
assert isinstance(call, ast.Call)
return self._functions[call.func.inferred_value.val] Or maybe instead, to write the references to |
There is something like that, the |
I think that would be a good idea. Especially since it turns out def get_callee(self, call):
assert isinstance(call, ast.Call), call
inferred_value = call.func.inferred_value
if isinstance(inferred_value, pyanalyze.value.MultiValuedValue):
for v in inferred_value.vals:
with suppress(KeyError):
return self._functions[v.val]
raise ValueError(f"Cannot find callee for call: {call}")
elif isinstance(inferred_value, pyanalyze.value.UnresolvedValue):
raise ValueError(f"Cannot find callee for call: {call}")
elif isinstance(inferred_value, pyanalyze.value.UnboundMethodValue):
f = getattr(inferred_value.typ.typ, inferred_value.attr_name)
return self._functions[f]
return self._functions[inferred_value.val] |
Makes sense. A few quick notes on your code:
|
Thanks, good to know.
Yes, you're right. For now I'm just trying to get it to work. Tbh I'm still in the dark regarding whether pyanalyze will fit my purposes. I'm trying to bring the interface up to speed, so I can explore it more easily on different files. |
Closing as the |
Is there a way to access the parsed AST, with the inferred types? I want to write a code translation tool, and knowing the types would be very useful.
I mean, having a regular AST (similar to
ast.AST
) but with added attributes, such as:declared_type
, andinferred_types
.I tried to look through the code, but couldn't figure it out.
The text was updated successfully, but these errors were encountered: