[python-frontend] Inheritance and overriding #1639
auto self_instance = instance_attr_map.find(self_id);
if (self_instance != instance_attr_map.end())
{
  std::vector<std::string> &attr_list = instance_attr_map[obj_symbol_id];
Is the order of elements in attr_list important? If not, you could consider using std::unordered_map and std::unordered_set for faster lookups.
}
struct_typet &class_type = static_cast<struct_typet &>(class_symbol->type);
for (auto component : class_type.components())
  clazz.components().push_back(component);
Shall we use emplace_back instead of push_back for better performance?
regression/python/classes/main.py
Outdated
assert(obj1.data == 10)
obj1.data = 20
assert(obj1.data == 20)
assert(obj1.x == 5)
I recommend keeping the original regression and creating a new one with these changes.
LGTM.
Does the delegation to the first __init__ in the list of base classes work recursively, e.g., will instantiating D invoke B's init method here?
class A: pass
class B:
    def __init__(self): pass
class C(B): pass
class D(A, C): pass
I am not a big fan of encoding the relationship between methods and classes in the symbol's identifier. Is the C++ frontend using the same mechanism, or could we possibly find a way to make these relationships accessible via some kind of API?
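As a reference point for the question about recursive delegation, the behaviour of CPython itself can be checked with the same four classes. This is only a sketch of what the interpreter does, not of what the frontend in this PR currently models; the `calls` list and `mro_names` variable are added here purely for illustration.

```python
# Record which __init__ actually runs when D is instantiated in CPython.
calls = []


class A:
    pass


class B:
    def __init__(self):
        calls.append("B.__init__")


class C(B):
    pass


class D(A, C):
    pass


d = D()

# The method resolution order decides where __init__ is looked up:
# A defines none, C defines none, so the search reaches B before object.
mro_names = [cls.__name__ for cls in D.__mro__]
print(mro_names)
print(calls)
```

So in plain Python, instantiating D does reach B's `__init__` through C, which is the behaviour a regression test for this case would presumably want to match.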
symbol_id.replace(
  pos,
  std::string("@C@" + current_class + "@F@" + current_func_name).length(),
  std::string("@C@" + base_class + "@F@" + method_name));
Here, the value given to parameter symbol_id is modified in each iteration of the loop. Likely, this should operate on a copy of that value.
Maybe I'm missing something, but with the current change a copy is being created outside of the loop, and the loop still repeatedly modifies this single copy. If the intention is to replace current_class with base_class and current_func_name with method_name, I think this will only work for the first iteration, as after that the replacement will have been stored in sym_id.
> I think this will only work for the first iteration.
I guess I got your point now. method_name doesn't need to be updated.
We are searching for the same method across different classes, so only the classname needs to vary.
For example if main.py contains:
class A:
    def foo(self):
        pass
class B(A):
    pass
obj = B()
obj.foo()
After parsing A and B, the context will contain the following symbol for the method foo: py:main.py@C@A@F@foo
Given that obj is of type B, we first look for "py:main.py@C@B@F@foo" (which doesn't exist), then look for "py:main.py@C@A@F@foo".
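The lookup described above can be sketched as a small model. This is not the frontend's code: `resolve_method`, the `symbols` set, and the `bases` mapping are hypothetical stand-ins for the symbol table and the class hierarchy, and the ID layout `py:<file>@C@<class>@F@<method>` is assumed from the diff excerpt in this PR. The search here is a simple depth-first, left-to-right walk, which is not the same as Python's C3 linearization.

```python
def resolve_method(symbols, bases, file_name, class_name, method):
    """Search class_name and then its base classes for a method symbol.

    Returns (symbol_id or None, list of IDs that were probed).
    All names here are hypothetical; only the ID layout follows the PR.
    """
    tried = []
    pending = [class_name]
    while pending:
        cls = pending.pop(0)
        sym_id = f"py:{file_name}@C@{cls}@F@{method}"
        tried.append(sym_id)
        if sym_id in symbols:
            return sym_id, tried
        # Depth-first, left to right: visit this class's bases next.
        pending = list(bases.get(cls, [])) + pending
    return None, tried


# main.py from the comment: A defines foo, B(A) inherits it.
symbols = {"py:main.py@C@A@F@foo"}
bases = {"B": ["A"]}

found, tried = resolve_method(symbols, bases, "main.py", "B", "foo")
missing, _ = resolve_method(symbols, bases, "main.py", "B", "bar")
print(found, tried, missing)
```

Only the class name varies between probes; the method name stays fixed, which matches the point made in the comment.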
That wasn't my point but seems to be valid, too. :)
Consider a class Z (one character) and one XY (2 characters) to be iterated over in the loop, in this order, and let the current class name be ABC (3 characters). What will happen is sym_id = "py:main.py@C@ABC@F@foo" is getting replaced by sym_id = "py:main.py@C@Z@F@foo". In the second iteration, the substring @C@ABC will not be found anymore in sym_id, because after the first iteration the current_class ABC has already been replaced with Z.
That's why I thought that this should work on a new copy of the original string in each iteration.
Edit: Never mind, you're updating the current_class. Didn't see that, sorry. From my point of view this method could use a comment and I think a proper way of identifying sub-/base-class relationships and overridden methods in the symbol table is needed.
> In the second iteration, the substring @C@ABC will not be found anymore in sym_id, because after the first iteration the current_class ABC has already been replaced with Z.
current_class is updated at the end of each iteration. So, in the second call to sym_id.rfind("@C@" + current_class);, current_class would hold "Z", and the substring would be found.
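The exchange above can be replayed with a small model of the loop, using the Z / XY / ABC example from the earlier comment. This is a sketch, not the PR's C++: `candidate_ids` and its `update_current` switch are hypothetical, Python's `str.rfind` and slicing stand in for `std::string::rfind` and `replace`, and a single method name is used on both sides of the replacement.

```python
def candidate_ids(sym_id, current_class, func_name, base_classes,
                  update_current=True):
    """Rewrite one symbol ID in place for each base class, in order.

    Returns the list of IDs produced. With update_current=False the
    loop stops as soon as the old class name is no longer present.
    """
    produced = []
    for base in base_classes:
        old = "@C@" + current_class + "@F@" + func_name
        pos = sym_id.rfind("@C@" + current_class)
        if pos == -1:
            break  # the old class name is gone from sym_id
        new = "@C@" + base + "@F@" + func_name
        # Equivalent of std::string::replace(pos, old.length(), new).
        sym_id = sym_id[:pos] + new + sym_id[pos + len(old):]
        produced.append(sym_id)
        if update_current:
            current_class = base  # the step the discussion hinges on
    return produced


start = "py:main.py@C@ABC@F@foo"
with_update = candidate_ids(start, "ABC", "foo", ["Z", "XY"])
without_update = candidate_ids(start, "ABC", "foo", ["Z", "XY"],
                               update_current=False)
print(with_update)
print(without_update)
```

With the update, both base classes are probed even though their names differ in length; without it, only the first replacement succeeds, which is the failure the reviewer initially suspected.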
Not yet, but I'll implement it. Thanks for pointing that out.
The relationship is encoded in the AST/JSON: https://github.com/brcfarias/esbmc/blob/035bcf33fbcbf8b0a74762fc47fec05854ebd71c/src/python-frontend/python_converter.cpp#L377-L379
👍
Right. I meant an API in ESBMC that allows obtaining symbols corresponding to base classes and/or overridden functions without having to fiddle with the identifiers in a non-documented way. Do you know whether or how the C++ frontend is doing this?
I see. In C++ this part is handled in esbmc/src/clang-c-frontend/clang_c_convert.cpp, lines 1692 to 1723 at 14e5afd.
If the method isn't virtual, we simply call "new_expr = member_exprt(base, comp.name(), comp.type());", as seen on line 1706 of that excerpt. If it is virtual, dynamic binding is managed in esbmc/src/clang-cpp-frontend/clang_cpp_convert_bind.cpp, lines 153 to 174 at 14e5afd.
I'm unsure if it's worth reusing this approach or perhaps extending struct_typet to include "base classes".
I also don't know, yet. To my mind, the best option would be to extend the
Thanks for submitting this PR, @brcfarias.
This PR adds support for inheritance in Python, addressing the following aspects:
The regression tests provide an overview of the features that are being handled.