C++ vtables - Part 2 - Multiple Inheritance
The world of single-parent inheritance hierarchies is simpler for the compiler. As we saw in Part 1, each child class extends its parent vtable by appending entries for each new virtual method.
In this post we will cover multiple inheritance, which complicates things even when inheriting only from pure interfaces.
Let’s look at the following piece of code:
class Mother {
public:
    virtual void MotherMethod() {}
    int mother_data;
};

class Father {
public:
    virtual void FatherMethod() {}
    int father_data;
};

class Child : public Mother, public Father {
public:
    virtual void ChildMethod() {}
    int child_data;
};
`Child`’s layout |
---|
_vptr$Mother |
mother_data (+ padding) |
_vptr$Father |
father_data |
child_data |
Note that there are 2 vtable pointers. Intuitively I’d expect either 1 or 3 pointers (`Mother`, `Father` and `Child`). In reality it’s impossible to have a single pointer (more on this soon), and the compiler is smart enough to combine `Child`’s vtable entries as a continuation of `Mother`’s vtable, thus saving 1 pointer.
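A minimal sketch of how one might observe this layout from code, assuming the three classes above. The `byte_offset` helper, the `LayoutReport` struct and the two functions are hypothetical names added for illustration. The concrete numbers depend on the ABI (the table above matches x86-64 Linux builds), so nothing is hard-coded here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

class Mother {
public:
    virtual void MotherMethod() {}
    int mother_data;
};

class Father {
public:
    virtual void FatherMethod() {}
    int father_data;
};

class Child : public Mother, public Father {
public:
    virtual void ChildMethod() {}
    int child_data;
};

// Hypothetical helper: distance in bytes from `base` to `p`.
inline std::ptrdiff_t byte_offset(const void* base, const void* p) {
    return static_cast<const char*>(p) - static_cast<const char*>(base);
}

// Hypothetical report type: offsets of each part, relative to the Child object.
struct LayoutReport {
    std::ptrdiff_t mother_subobject;
    std::ptrdiff_t mother_data;
    std::ptrdiff_t father_subobject;
    std::ptrdiff_t father_data;
    std::ptrdiff_t child_data;
};

inline LayoutReport describe_child_layout() {
    Child c{};
    const Mother* m = &c;  // implicit derived-to-base conversion
    const Father* f = &c;  // the compiler adjusts the address here if needed
    LayoutReport r{
        byte_offset(&c, m),
        byte_offset(&c, &c.mother_data),
        byte_offset(&c, f),
        byte_offset(&c, &c.father_data),
        byte_offset(&c, &c.child_data),
    };
    // Sanity check: every member lives inside the Child object.
    assert(r.child_data >= 0 &&
           r.child_data < static_cast<std::ptrdiff_t>(sizeof(Child)));
    return r;
}

inline void print_child_layout() {
    const LayoutReport r = describe_child_layout();
    std::printf("sizeof(Child)    = %zu\n", sizeof(Child));
    std::printf("Mother subobject @ %td\n", r.mother_subobject);
    std::printf("mother_data      @ %td\n", r.mother_data);
    std::printf("Father subobject @ %td\n", r.father_subobject);
    std::printf("father_data      @ %td\n", r.father_data);
    std::printf("child_data       @ %td\n", r.child_data);
}
```

Calling `print_child_layout()` from a `main` should show the `Father` subobject starting at a non-zero offset, which is where the second vtable pointer lives.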
Why can’t `Child` have one vtable pointer for all 3 types? Remember that a `Child` pointer can be passed to a function accepting a `Mother` pointer or a `Father` pointer, and both will expect the `this` pointer to hold the correct data at the correct offsets. These functions don’t necessarily know of `Child`, and definitely shouldn’t assume that a `Child` is really what’s underneath the `Mother`/`Father` pointer they have in their hands.
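To make this concrete, here is a small sketch. The functions `read_mother`, `read_father`, `make_child` and `check_parent_views` are hypothetical, added only for illustration. Each reader is written against a single base class and never mentions `Child`, yet both must work when handed a `Child`.

```cpp
#include <cassert>

class Mother {
public:
    virtual void MotherMethod() {}
    int mother_data;
};

class Father {
public:
    virtual void FatherMethod() {}
    int father_data;
};

class Child : public Mother, public Father {
public:
    virtual void ChildMethod() {}
    int child_data;
};

// Knows only Mother: reads mother_data at Mother's own fixed offset.
inline int read_mother(const Mother* m) { return m->mother_data; }

// Knows only Father: reads father_data at Father's own fixed offset.
inline int read_father(const Father* f) { return f->father_data; }

// Hypothetical convenience constructor for the example.
inline Child make_child(int mother, int father, int child) {
    Child c;
    c.mother_data = mother;
    c.father_data = father;
    c.child_data = child;
    return c;
}

// Passing &c converts Child* to each base pointer. For Father the compiler
// adjusts the address so that read_father sees a well-formed Father.
inline void check_parent_views(const Child& c) {
    assert(read_mother(&c) == c.mother_data);
    assert(read_father(&c) == c.father_data);
}
```

Because `read_father` could live in a library compiled long before `Child` existed, the part of `Child` it receives has to look exactly like a standalone `Father`, vtable pointer included.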
Unrelated to this topic, but interesting nonetheless, is that `child_data` is actually placed inside `Father`’s padding. This is called ‘tail padding’, and might be the topic of a future post.
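A rough way to check for tail-padding reuse, assuming the classes above. The `TailPaddingReport` struct and `inspect_tail_padding` function are hypothetical names for this sketch. With the ABI used by GCC and Clang I would expect `reused` to come out true, matching the layout table; other ABIs may lay the object out differently, so treat the result as an observation and not a guarantee.

```cpp
#include <cassert>
#include <cstddef>

class Mother {
public:
    virtual void MotherMethod() {}
    int mother_data;
};

class Father {
public:
    virtual void FatherMethod() {}
    int father_data;
};

class Child : public Mother, public Father {
public:
    virtual void ChildMethod() {}
    int child_data;
};

// Hypothetical report type for this illustration.
struct TailPaddingReport {
    std::ptrdiff_t father_begin;  // offset of the Father subobject in Child
    std::ptrdiff_t father_end;    // father_begin + sizeof(Father)
    std::ptrdiff_t child_data;    // offset of child_data in Child
    bool reused;                  // child_data lies inside Father's footprint
};

inline TailPaddingReport inspect_tail_padding() {
    Child c{};
    const char* base = reinterpret_cast<const char*>(&c);
    const Father* f = &c;

    TailPaddingReport r{};
    r.father_begin = reinterpret_cast<const char*>(f) - base;
    r.father_end = r.father_begin + static_cast<std::ptrdiff_t>(sizeof(Father));
    r.child_data = reinterpret_cast<const char*>(&c.child_data) - base;

    // Sanity check: child_data is somewhere inside the Child object.
    assert(r.child_data >= 0 &&
           r.child_data < static_cast<std::ptrdiff_t>(sizeof(Child)));

    // If child_data starts before a standalone Father would end,
    // it occupies bytes that would otherwise be Father's tail padding.
    r.reused = r.child_data < r.father_end;
    return r;
}
```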
Here’s the vtable layout:
Address | Value | Meaning |
---|---|---|
0x4008b8 | 0 | top_offset (more on this later) |
0x4008c0 | 0x400930 | pointer to typeinfo for Child |
0x4008c8 | 0x400800 | Mother::MotherMethod(). _vptr$Mother points here. |
0x4008d0 | 0x400810 | Child::ChildMethod() |
0x4008d8 | -16 | top_offset (more on this later) |
0x4008e0 | 0x400930 | pointer to typeinfo for Child |
0x4008e8 | 0x400820 | Father::FatherMethod(). _vptr$Father points here. |
In this example, an instance of `Child` will have the same pointer when cast to a `Mother` pointer. But when casting to a `Father` pointer, the compiler offsets the `this` pointer so that it points to the `_vptr$Father` part of `Child` (3rd field in `Child`’s layout, see table above).
In other words, for a given `Child c;`: `(void*)&c != (void*)static_cast<Father*>(&c)`. Some people don’t expect this, and maybe some day this information will save you some debugging time. I found it useful more than once.
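The following sketch spells out that comparison, assuming the classes above. `CastReport` and `inspect_casts` are hypothetical names. It also records two things the language does guarantee: comparing the typed pointers directly yields equality (the compiler converts `Child*` to `Father*` first), and `static_cast` back to `Child*` undoes the adjustment.

```cpp
#include <cassert>

class Mother {
public:
    virtual void MotherMethod() {}
    int mother_data;
};

class Father {
public:
    virtual void FatherMethod() {}
    int father_data;
};

class Child : public Mother, public Father {
public:
    virtual void ChildMethod() {}
    int child_data;
};

// Hypothetical report type for this illustration.
struct CastReport {
    bool mother_same_address;  // raw address of the Mother view equals &c
    bool father_same_address;  // raw address of the Father view equals &c
    bool typed_compare_equal;  // &c == f, compared as typed pointers
    bool round_trip_ok;        // static_cast<Child*>(f) == &c
};

inline CastReport inspect_casts() {
    Child c{};
    Mother* m = static_cast<Mother*>(&c);
    Father* f = static_cast<Father*>(&c);

    CastReport r{};
    // Comparing through void* looks at raw addresses only.
    r.mother_same_address = static_cast<void*>(&c) == static_cast<void*>(m);
    r.father_same_address = static_cast<void*>(&c) == static_cast<void*>(f);
    // Typed comparison: &c is converted to Father* before comparing.
    r.typed_compare_equal = (&c == f);
    // static_cast knows both types, so it reverses the offset.
    r.round_trip_ok = (static_cast<Child*>(f) == &c);

    assert(r.typed_compare_equal && r.round_trip_ok);
    return r;
}
```

A round trip through `void*` or `reinterpret_cast` skips the adjustment entirely, which is one way this mismatch tends to surface as a bug.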
But wait, there’s more.
What if `Child` decided to override one of `Father`’s methods? Consider this code:
class Mother {
public:
    virtual void MotherFoo() {}
};

class Father {
public:
    virtual void FatherFoo() {}
};

class Child : public Mother, public Father {
public:
    void FatherFoo() override {}
};
This gets tricky. A function may take a `Father*` argument and call `FatherFoo()` on it. But if you pass a `Child` instance, it is expected to invoke `Child`’s overridden method with the correct `this` pointer. However, the caller doesn’t know it’s really holding a `Child`. It has a pointer to the offset within `Child` where `Father`’s layout is. Someone needs to offset `this`, but how is it done? What magic does the compiler perform to get this to work?
[Before we answer that, note that overriding one of `Mother`’s methods is not really tricky, as the `this` pointer is the same. `Child` knows to read beyond the `Mother` vtable and expects the `Child` methods to be right after that.]
Here’s the solution: the compiler creates a ‘thunk’ method that corrects `this` and then calls the ‘real’ method. The address of the thunk method will sit under `Child`’s `Father` vtable, while the ‘real’ method will be under `Child`’s vtable.
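The observable effect of this correction can be checked without reading any assembly. In the sketch below, the `seen_this` member and the helper functions are hypothetical additions to the example classes. The override records the `this` it receives, and the call is made through a plain `Father*`. The check shows only that `this` arrives corrected, not how a particular compiler achieves it.

```cpp
#include <cassert>

class Mother {
public:
    virtual void MotherFoo() {}
};

class Father {
public:
    virtual void FatherFoo() {}
};

class Child : public Mother, public Father {
public:
    // Illustrative member: remembers the `this` seen by the override.
    const void* seen_this = nullptr;
    void FatherFoo() override { seen_this = this; }
};

// Knows only Father; makes a virtual call through Father's vtable slot.
inline void call_father_foo(Father* f) { f->FatherFoo(); }

inline bool override_sees_child_address() {
    Child c;
    Father* f = &c;      // points into the middle of c
    call_father_foo(f);  // dispatches to Child::FatherFoo
    assert(c.seen_this != nullptr);
    // Inside the override, `this` is the start of the Child object,
    // not the Father subobject the caller was holding.
    return c.seen_this == static_cast<const void*>(&c);
}
```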
Here’s `Child`’s vtable:
0x4008e8 <vtable for Child>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x4008f0 <vtable for Child+8>: 0x60 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4008f8 <vtable for Child+16>: 0x00 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400900 <vtable for Child+24>: 0x10 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400908 <vtable for Child+32>: 0xf8 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400910 <vtable for Child+40>: 0x60 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x400918 <vtable for Child+48>: 0x20 0x08 0x40 0x00 0x00 0x00 0x00 0x00
Which means:
Address | Value | Meaning |
---|---|---|
0x4008e8 | 0 | top_offset (soon!) |
0x4008f0 | 0x400960 | typeinfo for Child |
0x4008f8 | 0x400800 | Mother::MotherFoo() |
0x400900 | 0x400810 | Child::FatherFoo() |
0x400908 | -8 | top_offset |
0x400910 | 0x400960 | typeinfo for Child |
0x400918 | 0x400820 | non-virtual thunk to Child::FatherFoo() |
Explanation: as we saw earlier, `Child` has 2 vtables - one used for `Mother` and `Child`, and the other for `Father`. In `Father`’s vtable, `FatherFoo()` points to a thunk, while `Child`’s vtable points directly to `Child::FatherFoo()`.
And what’s in this thunk, you ask?
(gdb) disas /m 0x400820, 0x400850
Dump of assembler code from 0x400820 to 0x400850:
15 void FatherFoo() override {}
0x0000000000400820 <non-virtual thunk to Child::FatherFoo()+0>: push %rbp
0x0000000000400821 <non-virtual thunk to Child::FatherFoo()+1>: mov %rsp,%rbp
0x0000000000400824 <non-virtual thunk to Child::FatherFoo()+4>: sub $0x10,%rsp
0x0000000000400828 <non-virtual thunk to Child::FatherFoo()+8>: mov %rdi,-0x8(%rbp)
0x000000000040082c <non-virtual thunk to Child::FatherFoo()+12>: mov -0x8(%rbp),%rdi
0x0000000000400830 <non-virtual thunk to Child::FatherFoo()+16>: add $0xfffffffffffffff8,%rdi
0x0000000000400837 <non-virtual thunk to Child::FatherFoo()+23>: callq 0x400810 <Child::FatherFoo()>
0x000000000040083c <non-virtual thunk to Child::FatherFoo()+28>: add $0x10,%rsp
0x0000000000400840 <non-virtual thunk to Child::FatherFoo()+32>: pop %rbp
0x0000000000400841 <non-virtual thunk to Child::FatherFoo()+33>: retq
0x0000000000400842: nopw %cs:0x0(%rax,%rax,1)
0x000000000040084c: nopl 0x0(%rax)
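As we discussed, the thunk offsets `this` and then calls `Child::FatherFoo()`. The `add $0xfffffffffffffff8,%rdi` instruction adds -8 to `this`, which matches the `top_offset` of -8 in the table above. So by how much should we offset `this` to get to `Child`? By `top_offset`!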
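One portable way to see the same displacement from C++ is `dynamic_cast<void*>`, which the language defines as returning a pointer to the most-derived object. As far as I know, implementations that use this vtable layout read `top_offset` to answer it. The helper names below are hypothetical, and the exact value returned by `offset_from_top` depends on the ABI.

```cpp
#include <cassert>
#include <cstddef>

class Mother {
public:
    virtual void MotherFoo() {}
};

class Father {
public:
    virtual void FatherFoo() {}
};

class Child : public Mother, public Father {
public:
    void FatherFoo() override {}
};

// Address of the most-derived object that `f` is a part of.
inline const void* most_derived_address(const Father* f) {
    return dynamic_cast<const void*>(f);
}

// Signed displacement from the Father view to the top of the full object.
// For a Father inside a Child this should be negative; for a standalone
// Father it is zero.
inline std::ptrdiff_t offset_from_top(const Father* f) {
    const char* top = static_cast<const char*>(most_derived_address(f));
    return top - reinterpret_cast<const char*>(f);
}

inline bool recovers_child_address() {
    Child c;
    const Father* f = &c;
    const void* top = most_derived_address(f);
    assert(top != nullptr);
    return top == static_cast<const void*>(&c);
}
```

On a build matching the dump above, I would expect `offset_from_top` to return the same -8 that appears in the `Father` part of the vtable.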
[Please note that I personally think that the name `non-virtual thunk` is extremely confusing, as this is the entry in the virtual table to the virtual function. I’m not sure what’s not virtual about it, but that’s just my opinion.]
Stay tuned for Part 3 - Virtual inheritance - where things get even funkier.