cpp vtables Part2 Multiple Inheritance

jiaxw32 edited this page Nov 5, 2023 · 1 revision

C++ vtables - Part 2 - Multiple Inheritance

Link to the original article

The world of single-parent inheritance hierarchies is simpler for the compiler. As we saw in Part 1, each child class extends its parent’s vtable by appending an entry for each new virtual method.

In this post we will cover multiple inheritance, which complicates things even when inheriting only from pure interfaces.

Let’s look at the following piece of code:

class Mother {
 public:
  virtual void MotherMethod() {}
  int mother_data;
};

class Father {
 public:
  virtual void FatherMethod() {}
  int father_data;
};

class Child : public Mother, public Father {
 public:
  virtual void ChildMethod() {}
  int child_data;
};

Child’s layout:

_vptr$Mother
mother_data (+ padding)
_vptr$Father
father_data
child_data

Note that there are 2 vtable pointers. Intuitively I’d expect either 1 pointer or 3 (one each for Mother, Father and Child). In reality a single pointer is impossible (more on this soon), and the compiler is smart enough to lay out Child’s vtable entries as a continuation of Mother’s vtable, which saves 1 pointer.
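The layout can be checked with a small sketch. It assumes a 64-bit compiler that follows the Itanium C++ ABI (for example GCC or Clang on Linux x86-64), where it would typically report 16 bytes for Mother, a Father offset of 16 and 32 bytes for Child; other ABIs may give different numbers. The helpers `FatherOffsetInChild` and `PrintLayout` are hypothetical names, not part of the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

class Mother {
 public:
  virtual void MotherMethod() {}
  int mother_data;
};

class Father {
 public:
  virtual void FatherMethod() {}
  int father_data;
};

class Child : public Mother, public Father {
 public:
  virtual void ChildMethod() {}
  int child_data;
};

// Hypothetical helper: byte distance from the start of a Child object to its
// Father subobject, measured on a live instance.
std::ptrdiff_t FatherOffsetInChild() {
  Child c;
  const char* base = reinterpret_cast<const char*>(&c);
  const char* father = reinterpret_cast<const char*>(static_cast<Father*>(&c));
  assert(father >= base);  // the Father subobject lives inside the Child
  return father - base;
}

// Hypothetical helper: prints the sizes. The exact values depend on the ABI.
void PrintLayout() {
  std::printf("sizeof(Mother) = %zu\n", sizeof(Mother));
  std::printf("sizeof(Father) = %zu\n", sizeof(Father));
  std::printf("sizeof(Child)  = %zu\n", sizeof(Child));
  std::printf("Father offset  = %td\n", FatherOffsetInChild());
}
```

Calling `PrintLayout()` from `main` shows whether Child carries two vtable pointers: its size has room for two pointers and three ints, not three pointers.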

Why can’t Child have one vtable pointer for all 3 types? Remember that a Child pointer can be passed to a function accepting a Mother pointer or a Father pointer, and both will expect the this pointer to hold the correct data at the correct offsets. These functions don’t necessarily know about Child, and definitely shouldn’t assume that a Child is really what’s underneath the Mother/Father pointer they have in their hands.
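A minimal sketch of that situation, using the classes from the example above. `ReadMotherData`, `ReadFatherData` and `ParentsSeeTheirOwnFields` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

class Mother {
 public:
  virtual void MotherMethod() {}
  int mother_data;
};

class Father {
 public:
  virtual void FatherMethod() {}
  int father_data;
};

class Child : public Mother, public Father {
 public:
  virtual void ChildMethod() {}
  int child_data;
};

// These two functions could live in a translation unit that has never seen
// Child. Each one reads its field at the offset dictated by its own class.
int ReadMotherData(const Mother* m) { return m->mother_data; }
int ReadFatherData(const Father* f) { return f->father_data; }

// Hypothetical demo: both functions see the right field of the same Child.
bool ParentsSeeTheirOwnFields() {
  Child c;
  c.mother_data = 1;
  c.father_data = 2;
  c.child_data = 3;
  // The implicit Child* -> Father* conversion is where the caller adjusts
  // the pointer, so ReadFatherData receives a pointer that looks exactly
  // like a pointer to a standalone Father.
  return ReadMotherData(&c) == 1 && ReadFatherData(&c) == 2;
}
```

Because each callee only understands its own layout, the Father part of a Child has to look like a complete Father, including its own vtable pointer at the start.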

Unrelated to this topic, but interesting nonetheless, is that child_data is actually placed inside Father’s padding. This is called ‘tail padding’, and might be the topic of a future post.
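Whether a given compiler reuses the tail padding can be probed with a sketch like the following. It assumes nothing beyond the classes above; the helper names are hypothetical. On Itanium-ABI compilers I would expect `ChildDataReusesFatherTailPadding()` to return true, while on ABIs that do not reuse tail padding it returns false.

```cpp
#include <cassert>
#include <cstddef>

class Mother {
 public:
  virtual void MotherMethod() {}
  int mother_data;
};

class Father {
 public:
  virtual void FatherMethod() {}
  int father_data;
};

class Child : public Mother, public Father {
 public:
  virtual void ChildMethod() {}
  int child_data;
};

// Hypothetical helper: offset of child_data measured from the start of the
// Father subobject of the same Child instance.
std::ptrdiff_t ChildDataOffsetFromFather() {
  Child c;
  const char* father = reinterpret_cast<const char*>(static_cast<Father*>(&c));
  const char* data = reinterpret_cast<const char*>(&c.child_data);
  return data - father;
}

// True when child_data starts before sizeof(Father) bytes have passed, i.e.
// when it sits in the bytes a standalone Father would use as padding.
bool ChildDataReusesFatherTailPadding() {
  return ChildDataOffsetFromFather() <
         static_cast<std::ptrdiff_t>(sizeof(Father));
}
```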

Here’s the vtable layout:

| Address | Value | Meaning |
|---|---|---|
| 0x4008b8 | 0 | top_offset (more on this later) |
| 0x4008c0 | 0x400930 | pointer to typeinfo for Child |
| 0x4008c8 | 0x400800 | Mother::MotherMethod(). _vptr$Mother points here. |
| 0x4008d0 | 0x400810 | Child::ChildMethod() |
| 0x4008d8 | -16 | top_offset (more on this later) |
| 0x4008e0 | 0x400930 | pointer to typeinfo for Child |
| 0x4008e8 | 0x400820 | Father::FatherMethod(). _vptr$Father points here. |

In this example, an instance of Child keeps the same address when cast to a Mother pointer. But when it is cast to a Father pointer, the compiler offsets the pointer so that it refers to the _vptr$Father part of Child (the 3rd field in Child’s layout above).

In other words, for a given Child c;: (void*)&c != (void*)static_cast<Father*>(&c). Some people don’t expect this, and maybe some day this information will save you some debugging time. I have found it useful more than once.
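The inequality is easy to confirm with a short sketch. `FatherCastMovesPointer` is a hypothetical helper name; the classes are the ones from the first example.

```cpp
#include <cassert>

class Mother {
 public:
  virtual void MotherMethod() {}
  int mother_data;
};

class Father {
 public:
  virtual void FatherMethod() {}
  int father_data;
};

class Child : public Mother, public Father {
 public:
  virtual void ChildMethod() {}
  int child_data;
};

// Hypothetical demo: compares the raw addresses produced by each cast.
bool FatherCastMovesPointer() {
  Child c;
  void* as_child = &c;
  void* as_mother = static_cast<Mother*>(&c);  // same address as &c
  void* as_father = static_cast<Father*>(&c);  // shifted to the Father part

  // static_cast back to Child* undoes the shift.
  Child* back = static_cast<Child*>(static_cast<Father*>(&c));
  assert(back == &c);

  // reinterpret_cast (or a round trip through void*) performs no adjustment,
  // so this pointer does NOT point at the Father subobject. It is only
  // compared here, never dereferenced.
  Father* unadjusted = reinterpret_cast<Father*>(&c);
  assert(static_cast<void*>(unadjusted) == as_child);

  return as_mother == as_child && as_father != as_child;
}
```

The unadjusted pointer is the usual source of the debugging sessions mentioned above: storing a Child* in a void* and later reading it back as a Father* skips the offset.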

But wait, there’s more.

What if Child decided to override one of Father’s methods? Consider this code:

class Mother {
 public:
  virtual void MotherFoo() {}
};

class Father {
 public:
  virtual void FatherFoo() {}
};

class Child : public Mother, public Father {
 public:
  void FatherFoo() override {}
};

This gets tricky. A function may take a Father* argument and call FatherFoo() on it. If you pass a Child instance, the call is expected to invoke Child’s overriding method with the correct this pointer. However, the caller doesn’t know it’s really holding a Child. All it has is a pointer to the offset inside the Child where Father’s layout begins. Someone needs to offset this back, but how is it done? What magic does the compiler perform to get this to work?

[Before we answer that, note that overriding one of Mother’s methods is not really tricky, because the this pointer is the same. Child knows to read beyond the Mother vtable and expects the Child methods to be right after that.]

Here’s the solution: the compiler creates a ‘thunk’ method that corrects this and then calls the ‘real’ method. The address of the thunk sits in the Father part of Child’s vtable, while the address of the ‘real’ method sits in Child’s own part of the vtable (the one shared with Mother).
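The effect of the thunk can be observed from plain C++ without looking at any assembly. In this sketch the `last_this` member and the helper functions are hypothetical additions to the Child class from the example; they record which this pointer the override actually receives.

```cpp
#include <cassert>

class Mother {
 public:
  virtual void MotherFoo() {}
};

class Father {
 public:
  virtual void FatherFoo() {}
};

class Child : public Mother, public Father {
 public:
  // Records the this pointer the override was entered with.
  void FatherFoo() override { last_this = this; }
  const void* last_this = nullptr;  // hypothetical member, for the demo only
};

// Knows only about Father; dispatches through the Father vtable pointer.
void CallThroughFather(Father* f) { f->FatherFoo(); }

// Hypothetical demo: the callee sees the Child address even though the
// caller passed the (different) address of the Father subobject.
bool ThisIsAdjustedBackToChild() {
  Child c;
  Father* f = &c;
  CallThroughFather(f);
  const void* child_addr = &c;
  const void* father_addr = f;
  return c.last_this == child_addr && father_addr != child_addr;
}
```

Something between the call site and the body of `Child::FatherFoo()` moved the pointer from the Father subobject back to the start of the Child; per the post, that something is the thunk.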

Here’s Child’s vtable:

0x4008e8 <vtable for Child>:	    0x00	0x00	0x00	0x00	0x00	0x00	0x00	0x00
0x4008f0 <vtable for Child+8>:	    0x60	0x09	0x40	0x00	0x00	0x00	0x00	0x00
0x4008f8 <vtable for Child+16>:	    0x00	0x08	0x40	0x00	0x00	0x00	0x00	0x00
0x400900 <vtable for Child+24>:	    0x10	0x08	0x40	0x00	0x00	0x00	0x00	0x00
0x400908 <vtable for Child+32>:	    0xf8	0xff	0xff	0xff	0xff	0xff	0xff	0xff
0x400910 <vtable for Child+40>:	    0x60	0x09	0x40	0x00	0x00	0x00	0x00	0x00
0x400918 <vtable for Child+48>:	    0x20	0x08	0x40	0x00	0x00	0x00	0x00	0x00

Which means:

| Address | Value | Meaning |
|---|---|---|
| 0x4008e8 | 0 | top_offset (soon!) |
| 0x4008f0 | 0x400960 | typeinfo for Child |
| 0x4008f8 | 0x400800 | Mother::MotherFoo() |
| 0x400900 | 0x400810 | Child::FatherFoo() |
| 0x400908 | -8 | top_offset |
| 0x400910 | 0x400960 | typeinfo for Child |
| 0x400918 | 0x400820 | non-virtual thunk to Child::FatherFoo() |

Explanation: as we saw earlier, Child has 2 vtables - one used for Mother and Child, and the other for Father. In the Father vtable, the FatherFoo() slot points to a thunk, while the Mother/Child vtable points directly to Child::FatherFoo().
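To make the two entries concrete, here is a hand-rolled toy model of the Father part of that vtable. It is only an illustration under my own assumptions, not what the compiler emits: every type and function name (`FatherPart`, `ChildObj`, `ThunkFatherFoo` and so on) is hypothetical, and recovering the enclosing object from a member address is the usual "container of" trick rather than strictly portable C++.

```cpp
#include <cassert>
#include <cstddef>

struct FatherPart;
using FatherFooFn = void (*)(FatherPart*);

// Toy stand-in for the Father part of Child's vtable.
struct FatherVTable {
  std::ptrdiff_t top_offset;  // distance back to the start of the object
  FatherFooFn father_foo;     // slot for FatherFoo()
};

// Toy Father subobject: just a vtable pointer.
struct FatherPart {
  const FatherVTable* vptr;
};

// Toy Child object: a placeholder for the Mother/Child vtable pointer,
// followed by the Father subobject.
struct ChildObj {
  const void* mother_vptr;
  FatherPart father;
  int calls;
};

// The "real" method: expects a pointer to the whole ChildObj.
void ChildFatherFoo(ChildObj* self) { ++self->calls; }

// The thunk: receives a pointer to the Father part, moves it back to the
// start of the ChildObj, then calls the real method.
void ThunkFatherFoo(FatherPart* f) {
  char* raw = reinterpret_cast<char*>(f) - offsetof(ChildObj, father);
  ChildFatherFoo(reinterpret_cast<ChildObj*>(raw));
}

const FatherVTable kChildAsFatherVTable = {
    -static_cast<std::ptrdiff_t>(offsetof(ChildObj, father)),
    &ThunkFatherFoo,
};

// A caller that only knows about FatherPart.
void CallFatherFoo(FatherPart* f) { f->vptr->father_foo(f); }

// Hypothetical demo: returns how many times the real method ran.
int ToyDispatchCalls() {
  ChildObj c{nullptr, {&kChildAsFatherVTable}, 0};
  CallFatherFoo(&c.father);
  return c.calls;
}
```

In this model the amount subtracted by the thunk and the stored `top_offset` are the same number with the same sign convention as in the table: negative, pointing back toward the top of the object.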

And what’s in this thunk, you ask?

(gdb) disas /m 0x400820, 0x400850
Dump of assembler code from 0x400820 to 0x400850:
15	  void FatherFoo() override {}
   0x0000000000400820 <non-virtual thunk to Child::FatherFoo()+0>:	push   %rbp
   0x0000000000400821 <non-virtual thunk to Child::FatherFoo()+1>:	mov    %rsp,%rbp
   0x0000000000400824 <non-virtual thunk to Child::FatherFoo()+4>:	sub    $0x10,%rsp
   0x0000000000400828 <non-virtual thunk to Child::FatherFoo()+8>:	mov    %rdi,-0x8(%rbp)
   0x000000000040082c <non-virtual thunk to Child::FatherFoo()+12>:	mov    -0x8(%rbp),%rdi
   0x0000000000400830 <non-virtual thunk to Child::FatherFoo()+16>:	add    $0xfffffffffffffff8,%rdi
   0x0000000000400837 <non-virtual thunk to Child::FatherFoo()+23>:	callq  0x400810 <Child::FatherFoo()>
   0x000000000040083c <non-virtual thunk to Child::FatherFoo()+28>:	add    $0x10,%rsp
   0x0000000000400840 <non-virtual thunk to Child::FatherFoo()+32>:	pop    %rbp
   0x0000000000400841 <non-virtual thunk to Child::FatherFoo()+33>:	retq   
   0x0000000000400842:	nopw   %cs:0x0(%rax,%rax,1)
   0x000000000040084c:	nopl   0x0(%rax)

As we discussed, the thunk offsets this and then calls Child::FatherFoo(). And by how much should we offset this to get back to the Child? By top_offset! The add of 0xfffffffffffffff8 is an add of -8, the same value stored as top_offset in the Father part of the vtable.
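The same offset is also visible from standard C++ through `dynamic_cast<void*>`, which returns the address of the most derived object. My understanding of the Itanium C++ ABI is that this cast is typically implemented by reading top_offset from the vtable the pointer refers to; that implementation detail is an assumption here, and the helper names below are hypothetical.

```cpp
#include <cassert>

class Mother {
 public:
  virtual void MotherFoo() {}
};

class Father {
 public:
  virtual void FatherFoo() {}
};

class Child : public Mother, public Father {
 public:
  void FatherFoo() override {}
};

// Knows only about Father, yet can recover the start of the full object.
// Requires RTTI, which is enabled by default on mainstream compilers.
void* MostDerivedAddress(Father* f) { return dynamic_cast<void*>(f); }

// Hypothetical demo: starting from the Father subobject, the cast lands on
// the address of the enclosing Child.
bool TopOffsetRecoversChild() {
  Child c;
  Father* f = &c;
  void* child_addr = &c;
  void* father_addr = f;
  return MostDerivedAddress(f) == child_addr && father_addr != child_addr;
}
```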

[Please note that I personally think that the name non-virtual thunk is extremely confusing as this is the entry in the virtual table to the virtual function. I’m not sure what’s not virtual about it, but that’s just my opinion.]

Stay tuned for Part 3 - Virtual inheritance - where things get even funkier.
