Provide a way to compare AST nodes for equality recursively #60191
Comments
As is, as far as I can tell, there's no way to easily compare two AST nodes to see if they have the same children and same fields (recursively). I'm writing some unit tests for a NodeTransformer, so I've settled for comparing I don't know the global implications of changing ast.AST.__eq__ well enough to know if that's feasible (hopefully someone will comment on that), but if it isn't, another provided way would be nice. |
This is a reasonable request. Should comparison include lineno/col_offset or not? If you implement __eq__, you should also implement __hash__. Maybe it would be best to start with a helper comparison function in the ast module. |
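To illustrate why __eq__ and __hash__ go together, here is a minimal sketch. `EqNode` is a hypothetical subclass invented for this example, not anything proposed in the issue; it only shows standard Python behavior for a class that defines __eq__ alone.

```python
import ast


class EqNode(ast.AST):
    """Hypothetical node type that defines __eq__ but no __hash__."""

    def __eq__(self, other):
        # Deliberately simplistic: equal when the types match.
        return type(self) is type(other)


# Python sets __hash__ to None on a class that defines __eq__ alone,
# so instances can no longer go into sets or be used as dict keys.
try:
    hash(EqNode())
    is_hashable = True
except TypeError:
    is_hashable = False
```

With a helper function instead of __eq__, the default identity-based hash of existing nodes is left untouched.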
I'd say yes (to both lineno/col_offset). And yeah that sounds like what I had in mind (a helper function). If I'm specific for a moment about implementation, perhaps something like Sound good to you? |
Yes, though some things like what to return if one has an entire subtree that the other doesn't have will have to be worked out. |
I have a use for this as well, but w/o the lineno/col_offset comparison and all I need is a True/False result. |
IOW I think what is needed is a function that uses ast.walk(), has a flag for specifying whether _attributes should also be checked, does a class check, and then uses _fields to do all other checking. |
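A rough sketch of that idea, under some assumptions: the name `nodes_equal` and its `include_attributes` flag are made up for illustration, and it recurses directly instead of using ast.walk() so that both trees can be descended in lockstep.

```python
import ast


def nodes_equal(a, b, include_attributes=False):
    """Hypothetical helper: True if two AST nodes match recursively."""
    # Class check first: different node (or value) types never match.
    if type(a) is not type(b):
        return False
    if isinstance(a, ast.AST):
        # _fields is always checked; _attributes only on request.
        names = a._fields + (a._attributes if include_attributes else ())
        return all(
            nodes_equal(getattr(a, n, None), getattr(b, n, None), include_attributes)
            for n in names
        )
    if isinstance(a, list):
        return len(a) == len(b) and all(
            nodes_equal(x, y, include_attributes) for x, y in zip(a, b)
        )
    # Plain values such as identifiers, numbers, strings, or None.
    return a == b
```

For example, `nodes_equal(ast.parse("x = 1"), ast.parse("x   =   1"))` is True, while passing `include_attributes=True` makes the same call False because the column offsets differ.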
If we only need a binary True/False result, could we just return dump(a) == dump(b)? The include_attributes flag could also be passed through when needed. |
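The dump-based approach can be sketched as follows. `same_dump` is a hypothetical wrapper name; `ast.dump()` and its `include_attributes` parameter are the existing standard library API.

```python
import ast


def same_dump(a, b, include_attributes=False):
    """Hypothetical helper: compare two nodes via their ast.dump() strings."""
    return ast.dump(a, include_attributes=include_attributes) == ast.dump(
        b, include_attributes=include_attributes
    )


tight = ast.parse("x = 1")
spaced = ast.parse("x   =   1")

ignoring_layout = same_dump(tight, spaced)
with_layout = same_dump(tight, spaced, include_attributes=True)
```

Here `ignoring_layout` is True and `with_layout` is False, since only the positions differ. The trade-off is that both trees are fully serialized to strings even when they differ at the root.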
This provides a recursive way to compare AST nodes; it compares fields but not attributes. The performance compared with the ast.dump method in unittest: # Recursive compare OK # ast.dump compare |
Updated to implement richcompare on the AST base type. The unittest finish time was better than with the Python version:
|
This discussion is inconclusive and targets an old version of CPython, can this issue be closed? |
Closing issue, PR branch has since been removed and targets Python 3.4 |
Please don't close an issue too fast. The PR 1368 is still open and contains valuable code. Moreover, I don't see anyone here saying that the feature is a bad idea. The feature has not been implemented, so the issue should remain open, even if PR 1368 is outdated.
Well, it's just that the issue was created a long time ago, but the feature request remains valuable for Python 3.8. |
Btw, I am +1 on this feature (preferably with an option to check line, column, end line, and end column). I always wanted this, but never had time to actually implement this. |
If consensus has been reached on this, I am willing to do the work. |
It looks like there is already an active PR #14970, with some comments from a core review that have not been addressed yet. |
I am not sure that implementing a rich comparison of AST nodes is the right way. There are too many options (compare attributes or not, compare recursively or not, compare types or not) and __eq__ does not support options. In addition, it requires implementing a compatible __hash__, but how do you implement a hash of a mutable object? In addition, recursive comparison can introduce a regression in existing code -- instead of quickly returning False, the comparison will spend a nontrivial amount of time. If a comparison of AST nodes is implemented, it should be a separate function which supports multiple options. It would also be nice to have examples where this feature can be used before implementing it. See also bpo-37792. |
The solution to hashing used in PR 14970 was just using the default hashing function which is kind of a workaround to the real problem. I now concur with your comments. Will try to draft out an |
So, before implementing this, I would like to create a discussion on discuss.python.org to gather user feedback on what this comparison should do. Is it a full comparison that includes comparing each field, or is it a structural comparison of nodes, where these nodes are treated as equal The second thing is about hashing: our AST nodes are mutable, so should we make them immutable and provide a |
* bpo-15987: Implement ast.compare Add a compare() function that compares two ASTs for structural equality. There are two sets of attributes on AST node objects, fields and attributes. The fields are always compared, since they represent the actual structure of the code. The attributes can optionally be included in the comparison. Attributes capture things like line numbers and column offsets, so comparing them involves testing whether the layout of the program text is the same. Since whitespace seems inessential for comparing ASTs, the default is to compare fields but not attributes. ASTs are just Python objects that can be modified in arbitrary ways. The API for ASTs is under-specified in the presence of user modifications to objects. The comparison respects modifications to fields and attributes, and to _fields and _attributes attributes. A user could create obviously malformed objects, and the code will probably fail with an AttributeError when that happens. (For example, adding "spam" to _fields but not adding a "spam" attribute to the object.) Co-authored-by: Jeremy Hylton <jeremy@alum.mit.edu>
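For readers unfamiliar with the fields/attributes split, this small sketch inspects a single node. It assumes Python 3.8 or later and uses only existing `ast` names; the variable names are illustrative.

```python
import ast

# The ast.Name node for "total" in a simple assignment.
name = ast.parse("total = 1").body[0].targets[0]

# _fields lists the structural children, which are always compared.
structure = {field: getattr(name, field) for field in name._fields}

# _attributes lists the source positions, compared only on request.
layout = {attr: getattr(name, attr) for attr in name._attributes}
```

After running this, `structure` holds the identifier and its context node, while `layout` holds line and column positions such as `lineno` and `col_offset`.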
@Eclips4 I missed this discussion when I was working on pr19211 at the sprints. Happy to continue discussion on the topic, as we have a little time before 3.14 is released. As you see in the PR, this is a separate compare() function that totally sidesteps things like __eq__ and __hash__. It also focuses on full-field-level equality. It seems hard to say that two ASTs should be equal if they are "1 + 2" and "2 + 3" since they are literally different (literally). A comparison function that says "some differences in the code are allowed" might be framed as more of a pattern-matching problem that could answer questions like "Are these both arithmetic expressions that have the same structure except for constants?" Or "Are these both functions of two positional arguments that don't call other functions?" |
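As one possible reading of the "same structure except for constants" question, here is a sketch that blanks out constant values before comparing dumps. `_BlankConstants` and `same_shape` are hypothetical names, this is not part of the PR, and it assumes Python 3.8+ where literals parse to `ast.Constant`.

```python
import ast


class _BlankConstants(ast.NodeTransformer):
    """Overwrite every constant's value so only the tree shape remains."""

    def visit_Constant(self, node):
        node.value = None
        return node


def same_shape(src_a, src_b):
    """Hypothetical helper: equal structure, ignoring constant values."""
    a = _BlankConstants().visit(ast.parse(src_a))
    b = _BlankConstants().visit(ast.parse(src_b))
    return ast.dump(a) == ast.dump(b)
```

With this, `same_shape("1 + 2", "2 + 3")` is True but `same_shape("1 + 2", "1 - 2")` is False. Note that this sketch treats all constants alike, so a number and a string in the same position also count as matching.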
I think the current implementation is good enough. Actually, if we have |