V8 Source Code: Memory Management #13

tsy77 opened this issue Aug 18, 2018 · 0 comments

In this article we will look at V8's memory management from the source-code perspective, covering memory allocation and garbage collection.

To keep the focus on the underlying ideas, this article uses a very early version of V8, 0.1.5. Its implementation is simple, which makes the design easy to follow.

Memory Allocation

V8 divides its memory into several regions: NewSpace, OldSpace, LargeObjectSpace, MapSpace, and CodeSpace. The relationship between the spaces is shown in the figure below:

What each space is for:

LargeObjectSpace: to avoid copying large objects, this space is dedicated to objects too big for a normal page, including Code, Sequential String, and FixedArray;

MapSpace: stores objects' Map information, i.e. their hidden classes; capped at 8MB. Every Map object has a fixed size, and keeping them in a dedicated space allows fast lookup;

NewSpace: stores objects of many types; capped at 2MB;

CodeSpace: stores compiled code (?); capped at 512MB;

Old_Pointer_Space: stores pointer objects that survive GC; capped at 512MB;

Old_Data_Space: stores data objects that survive GC; capped at 512MB;

Initialization

First comes memory initialization, which runs after V8 has initialized its OS-level parameters. The entry point is in src/heap.cc:

bool Heap::Setup(bool create_heap_objects) {
  // Initialize heap spaces and initial maps and objects. Whenever something
  // goes wrong, just return false. The caller should check the results and
  // call Heap::TearDown() to release allocated memory.
  //
  // If the heap is not yet configured (eg, through the API), configure it.
  // Configuration is based on the flags new-space-size (really the semispace
  // size) and old-space-size if set or the initial values of semispace_size_
  // and old_generation_size_ otherwise.
  if (!heap_configured) {
    if (!ConfigureHeap(FLAG_new_space_size, FLAG_old_space_size)) return false;
  }

  // Setup memory allocator and allocate an initial chunk of memory.  The
  // initial chunk is double the size of the new space to ensure that we can
  // find a pair of semispaces that are contiguous and aligned to their size.
  // Allocate heap memory: new space + old generation.
  // Set up chunks.
  // MemoryAllocator is a singleton.
  if (!MemoryAllocator::Setup(MaxCapacity())) return false;
  // Reserve 2 * young_generation_size_ of virtual memory; kept in MemoryAllocator::initial_chunk_.
  void* chunk
      = MemoryAllocator::ReserveInitialChunk(2 * young_generation_size_);
  if (chunk == NULL) return false;

  // Put the initial chunk of the old space at the start of the initial
  // chunk, then the two new space semispaces, then the initial chunk of
  // code space.  Align the pair of semispaces to their size, which must be
  // a power of 2.
  ASSERT(IsPowerOf2(young_generation_size_));
  Address old_space_start = reinterpret_cast<Address>(chunk);
  // Laid out from the low end of the reserved region.
  Address new_space_start = RoundUp(old_space_start, young_generation_size_);
  Address code_space_start = new_space_start + young_generation_size_;
  int old_space_size = new_space_start - old_space_start;
  int code_space_size = young_generation_size_ - old_space_size;

  // Initialize new space.
  new_space_ = new NewSpace(initial_semispace_size_, semispace_size_);
  if (new_space_ == NULL) return false;
  // mmap the from_space and to_space semispaces.
  if (!new_space_->Setup(new_space_start, young_generation_size_)) return false;

  // Initialize old space, set the maximum capacity to the old generation
  // size.
  // PagedSpace::Setup
  old_space_ = new OldSpace(old_generation_size_, OLD_SPACE);
  if (old_space_ == NULL) return false;
  if (!old_space_->Setup(old_space_start, old_space_size)) return false;

  // Initialize the code space, set its maximum capacity to the old
  // generation size.
  code_space_ = new OldSpace(old_generation_size_, CODE_SPACE);
  if (code_space_ == NULL) return false;
  if (!code_space_->Setup(code_space_start, code_space_size)) return false;

  // Initialize map space.
  map_space_ = new MapSpace(kMaxMapSpaceSize);
  if (map_space_ == NULL) return false;
  // Setting up a paged space without giving it a virtual memory range big
  // enough to hold at least a page will cause it to allocate.
  if (!map_space_->Setup(NULL, 0)) return false;

  lo_space_ = new LargeObjectSpace();
  if (lo_space_ == NULL) return false;
  if (!lo_space_->Setup()) return false;

  if (create_heap_objects) {
    // Create initial maps.
    if (!CreateInitialMaps()) return false;
    if (!CreateApiObjects()) return false;

    // Create initial objects
    if (!CreateInitialObjects()) return false;
  }

  LOG(IntEvent("heap-capacity", Capacity()));
  LOG(IntEvent("heap-available", Available()));

  return true;
}

Heap::Setup does the following:

1. Configure heap parameters, including young_generation_size_ (2MB) and old_generation_size_ (512MB). The old generation is managed in pages of 8KB each;

2. MemoryAllocator::Setup initializes the chunk bookkeeping used to manage pages; one chunk holds up to 64 pages;

3. Reserve 2 * young_generation_size_ of virtual memory, saving the address in MemoryAllocator::initial_chunk_. Note that MemoryAllocator is a singleton;

4. Carve the individual memory regions out of the virtual memory reserved in the previous step.

The virtual memory is divided into new_space_, old_space_, code_space_, map_space_, and lo_space_, laid out as shown in the figure below:

Let's walk through how each region is initialized; the code is in src/spaces.cc.

NewSpace

bool NewSpace::Setup(Address start, int size) {
  ASSERT(size == 2 * maximum_capacity_);
  ASSERT(IsAddressAligned(start, size, 0));

  if (to_space_ == NULL
      || !to_space_->Setup(start, maximum_capacity_)) {
    return false;
  }
  if (from_space_ == NULL
      || !from_space_->Setup(start + maximum_capacity_, maximum_capacity_)) {
    return false;
  }

  start_ = start;
  address_mask_ = ~(size - 1);
  object_mask_ = address_mask_ | kHeapObjectTag;
  object_expected_ = reinterpret_cast<uint32_t>(start) | kHeapObjectTag;

  allocation_info_.top = to_space_->low();
  allocation_info_.limit = to_space_->high();
  mc_forwarding_info_.top = NULL;
  mc_forwarding_info_.limit = NULL;

  ASSERT_SEMISPACE_ALLOCATION_INFO(allocation_info_, to_space_);
  return true;
}

This mainly initializes to_space_ and from_space_, both of type SemiSpace. Their setup code:

bool SemiSpace::Setup(Address start, int size) {
  ASSERT(size == maximum_capacity_);
  if (!MemoryAllocator::CommitBlock(start, capacity_)) return false;

  start_ = start;
  address_mask_ = ~(size - 1);
  object_mask_ = address_mask_ | kHeapObjectTag;
  object_expected_ = reinterpret_cast<uint32_t>(start) | kHeapObjectTag;

  age_mark_ = start_;
  return true;
}

bool MemoryAllocator::CommitBlock(Address start, size_t size) {
  ASSERT(start != NULL);
  ASSERT(size > 0);
  ASSERT(initial_chunk_ != NULL);
  ASSERT(initial_chunk_->address() <= start);
  ASSERT(start + size <= reinterpret_cast<Address>(initial_chunk_->address())
                             + initial_chunk_->size());

  // mmap
  if (!initial_chunk_->Commit(start, size)) return false;
  Counters::memory_allocated.Increment(size);
  return true;
}

The key call is MemoryAllocator::CommitBlock, which commits a region of the reserved virtual memory. initial_chunk_->Commit is actually VirtualMemory::Commit:

bool VirtualMemory::Commit(void* address, size_t size) {
  if (MAP_FAILED == mmap(address, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED,
                         kMmapFd, kMmapFdOffset)) {
    return false;
  }

  UpdateAllocatedSpaceLimits(address, size);
  return true;
}

So in essence this just maps a block of virtual memory with mmap. For the relationship between mmap and malloc, see the article "Linux内存分配小结--malloc、brk、mmap".

OldSpace

OldSpace inherits from PagedSpace, so old_space_->Setup actually calls the base class method PagedSpace::Setup:

bool PagedSpace::Setup(Address start, size_t size) {
  if (HasBeenSetup()) return false;

  int num_pages = 0;
  // Try to use the virtual memory range passed to us.  If it is too small to
  // contain at least one page, ignore it and allocate instead.
  // If the range falls inside the reserved virtual memory
  if (PagesInChunk(start, size) > 0) {
    first_page_ = MemoryAllocator::CommitPages(start, size, this, &num_pages);
  } else {
    // How many pages to request
    int requested_pages = Min(MemoryAllocator::kPagesPerChunk,
                              max_capacity_ / Page::kObjectAreaSize);
    first_page_ =
        MemoryAllocator::AllocatePages(requested_pages, &num_pages, this);
    if (!first_page_->is_valid()) return false;
  }

  // We are sure that the first page is valid and that we have at least one
  // page.
  ASSERT(first_page_->is_valid());
  ASSERT(num_pages > 0);
  accounting_stats_.ExpandSpace(num_pages * Page::kObjectAreaSize);
  ASSERT(Capacity() <= max_capacity_);

  for (Page* p = first_page_; p->is_valid(); p = p->next_page()) {
    // Used by mark-compact garbage collection
    p->ClearRSet();
  }

  // Use first_page_ for allocation.
  SetAllocationInfo(&allocation_info_, first_page_);

  return true;
}

This does the following:

1. Check whether the reserved virtual memory can hold the space
	a. If it can, commit pages directly inside the reservation via MemoryAllocator::CommitPages
	b. If not, allocate fresh virtual memory via MemoryAllocator::AllocatePages
2. Walk all pages and clear their remembered sets, used later by garbage collection

Next, let's look at MemoryAllocator::CommitPages and MemoryAllocator::AllocatePages.

MemoryAllocator::CommitPages is called when the reserved virtual memory can hold OLD_SPACE:

Page* MemoryAllocator::CommitPages(Address start, size_t size,
                                   PagedSpace* owner, int* num_pages) {
  ASSERT(start != NULL);
  *num_pages = PagesInChunk(start, size);
  ASSERT(*num_pages > 0);
  ASSERT(initial_chunk_ != NULL);
  ASSERT(initial_chunk_->address() <= start);
  ASSERT(start + size <= reinterpret_cast<Address>(initial_chunk_->address())
                             + initial_chunk_->size());

  if (!initial_chunk_->Commit(start, size)) {
    return Page::FromAddress(NULL);
  }
  Counters::memory_allocated.Increment(size);

  // So long as we correctly overestimated the number of chunks we should not
  // run out of chunk ids.
  CHECK(!OutOfChunkIds());
  int chunk_id = Pop();
  chunks_[chunk_id].init(start, size, owner);
  return InitializePagesInChunk(chunk_id, *num_pages, owner);
}

This does two things:

1. Commit the reserved virtual memory via initial_chunk_->Commit, i.e. the VirtualMemory::Commit shown earlier
2. Initialize chunks_. Note that chunks_ is an array of ChunkInfo entries, one holding the bookkeeping for each chunk.

MemoryAllocator::AllocatePages is called when the reserved virtual memory cannot hold OLD_SPACE:

Page* MemoryAllocator::AllocatePages(int requested_pages, int* allocated_pages,
                                     PagedSpace* owner) {
  if (requested_pages <= 0) return Page::FromAddress(NULL);
  size_t chunk_size = requested_pages * Page::kPageSize;

  // There is not enough space to guarantee the desired number pages can be
  // allocated.
  // Not enough space for the requested pages; grab whatever is left
  if (size_ + static_cast<int>(chunk_size) > capacity_) {
    // Request as many pages as we can.
    chunk_size = capacity_ - size_;
    requested_pages = chunk_size >> Page::kPageSizeBits;

    if (requested_pages <= 0) return Page::FromAddress(NULL);
  }

  void* chunk = AllocateRawMemory(chunk_size, &chunk_size);
  if (chunk == NULL) return Page::FromAddress(NULL);
  LOG(NewEvent("PagedChunk", chunk, chunk_size));

  *allocated_pages = PagesInChunk(static_cast<Address>(chunk), chunk_size);
  // If we didn't get even one page, the allocation is useless; release it with munmap
  if (*allocated_pages == 0) {
    FreeRawMemory(chunk, chunk_size);
    LOG(DeleteEvent("PagedChunk", chunk));
    return Page::FromAddress(NULL);
  }

  // Initialize the new chunk
  int chunk_id = Pop();
  chunks_[chunk_id].init(static_cast<Address>(chunk), chunk_size, owner);

  return InitializePagesInChunk(chunk_id, *allocated_pages, owner);
}

This does the following:

1. If the request would exceed capacity_, compute how much can still be allocated and use that as the request size
2. Allocate memory via MemoryAllocator::AllocateRawMemory
3. Check whether the allocated memory holds at least one page (8KB); if not, the allocation is useless and is released with munmap
4. Initialize the new chunk, the same bookkeeping as in MemoryAllocator::CommitPages

Now for MemoryAllocator::AllocateRawMemory:

void* MemoryAllocator::AllocateRawMemory(const size_t requested,
                                         size_t* allocated) {
  if (size_ + static_cast<int>(requested) > capacity_) return NULL;

  // mmap & UpdateAllocatedSpaceLimits
  void* mem = OS::Allocate(requested, allocated);
  int alloced = *allocated;
  size_ += alloced;
  Counters::memory_allocated.Increment(alloced);
  return mem;
}

The OS::Allocate it calls:

void* OS::Allocate(const size_t requested, size_t* allocated) {
  const size_t msize = RoundUp(requested, getpagesize());
  void* mbase = mmap(NULL, msize, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (mbase == MAP_FAILED) {
    LOG(StringEvent("OS::Allocate", "mmap failed"));
    return NULL;
  }
  *allocated = msize;
  UpdateAllocatedSpaceLimits(mbase, msize);
  return mbase;
}

This also uses mmap, but the first argument, the desired start address, is NULL here. That is the difference from committing inside the reserved virtual memory, where a fixed address is passed.

One more note: using reserved virtual memory helps read/write efficiency, mainly because committing an already-reserved range avoids reworking the process's page tables, i.e. the mapping between logical and physical addresses.

Other spaces

The other spaces are similar to OldSpace; they are all subclasses of PagedSpace, so we won't repeat the details.

Allocation

Before getting into allocation proper, a few concepts:

1. Page: in a PagedSpace, memory is managed in pages, and an object may not span pages (the page size matches the hardware page size)
2. When the heap grows, memory is added a chunk at a time; a chunk holds at most 64 pages, which cuts down on mmap system calls and helps efficiency
3. A free list collects the internal fragmentation inside each page, much like an OS memory manager

The entry point for allocation is Heap::AllocateRaw in src/heap-inl.h:

// Entry point for memory allocation
Object* Heap::AllocateRaw(int size_in_bytes, AllocationSpace space) {
  ASSERT(allocation_allowed_ && gc_state_ == NOT_IN_GC);
#ifdef DEBUG
  if (FLAG_gc_interval >= 0 &&
      !disallow_allocation_failure_ &&
      Heap::allocation_timeout_-- <= 0) {
    return Failure::RetryAfterGC(size_in_bytes, space);
  }
  Counters::objs_since_last_full.Increment();
  Counters::objs_since_last_young.Increment();
#endif
  if (NEW_SPACE == space) {
    return new_space_->AllocateRaw(size_in_bytes);
  }

  Object* result;
  if (OLD_SPACE == space) {
    result = old_space_->AllocateRaw(size_in_bytes);
  } else if (CODE_SPACE == space) {
    result = code_space_->AllocateRaw(size_in_bytes);
  } else if (LO_SPACE == space) {
    result = lo_space_->AllocateRaw(size_in_bytes);
  } else {
    ASSERT(MAP_SPACE == space);
    result = map_space_->AllocateRaw(size_in_bytes);
  }
  if (result->IsFailure()) old_gen_exhausted_ = true;
  return result;
}

Based on the requested space type, this dispatches to that space's AllocateRaw method.

OldSpace

First, recall that OldSpace inherits from PagedSpace, so its memory management is page-based.

AllocateRaw:

// Allocates requested bytes. May return Failure if the space is full.
Object* AllocateRaw(int size_in_bytes) {
  ASSERT_OBJECT_SIZE(size_in_bytes);
  return AllocateRawInternal(size_in_bytes, &allocation_info_);
}

AllocateRawInternal:

Object* OldSpace::AllocateRawInternal(int size_in_bytes,
                                      AllocationInfo* alloc_info) {
  ASSERT(HasBeenSetup());

  if (allocation_mode_ == LINEAR_ONLY || allocation_mode_ == LINEAR) {
    // Try linear allocation in the current page.
    Address cur_top = alloc_info->top;
    Address new_top = cur_top + size_in_bytes;
    if (new_top <= alloc_info->limit) {
      Object* obj = HeapObject::FromAddress(cur_top);
      alloc_info->top = new_top;
      ASSERT_PAGED_ALLOCATION_INFO(*alloc_info);

      accounting_stats_.AllocateBytes(size_in_bytes);
      ASSERT(Size() <= Capacity());
      return obj;
    }
  } else {
    // For now we should not try free list allocation during m-c relocation.
    // Allocate from the free list
    ASSERT(alloc_info == &allocation_info_);
    int wasted_bytes;
    Object* object = free_list_.Allocate(size_in_bytes, &wasted_bytes);
    accounting_stats_.WasteBytes(wasted_bytes);
    if (!object->IsFailure()) {
      accounting_stats_.AllocateBytes(size_in_bytes);
      return object;
    }
  }
  // Fast allocation failed.
  return SlowAllocateRaw(size_in_bytes, alloc_info);
}

This does the following:

1. Check whether the allocation mode is linear
	a. If it is, check whether the current page has enough room left
		i. If so, carve out the region and move the page's top pointer forward
		ii. If not, fall through to step 2 (SlowAllocateRaw)
	b. If not, allocate from free_list_; if that fails, fall through to step 2
2. The fast path in step 1 failed: run SlowAllocateRaw

SlowAllocateRaw:

// Slow cases for AllocateRawInternal.  In linear allocation mode, try
// to allocate in the next page in the space.  If there are no more
// pages, switch to free-list allocation if permitted, otherwise try
// to grow the space.  In free-list allocation mode, try to grow the
// space and switch to linear allocation.
Object* OldSpace::SlowAllocateRaw(int size_in_bytes,
                                  AllocationInfo* alloc_info) {
  if (allocation_mode_ == LINEAR_ONLY || allocation_mode_ == LINEAR) {
    // The last page; memory is used from low addresses to high
    Page* top_page = TopPageOf(*alloc_info);
    // Until we implement free-list allocation during global gc, we have two
    // cases: one for normal allocation and one for m-c relocation allocation.
    // first_page
    if (alloc_info == &allocation_info_) {  // Normal allocation.
      // How much space is left on the last page
      int free_size = top_page->ObjectAreaEnd() - alloc_info->top;
      // Add the extra space at the top of this page to the free list.
      // Hand the leftover to the free list and move top to the page end
      if (free_size > 0) {
        int wasted_bytes = free_list_.Free(alloc_info->top, free_size);
        accounting_stats_.WasteBytes(wasted_bytes);
        alloc_info->top += free_size;
        ASSERT_PAGED_ALLOCATION_INFO(*alloc_info);
      }

      // Move to the next page in this space if there is one; switch
      // to free-list allocation, if we can; try to expand the space otherwise
      // Move on to the next page
      if (top_page->next_page()->is_valid()) {
        SetAllocationInfo(alloc_info, top_page->next_page());
      }
      // Switch allocation_mode_ to FREE_LIST so memory comes from the free list
      else if (allocation_mode_ == LINEAR) {
        allocation_mode_ = FREE_LIST;
      }
      // expand a chunk
      else if (Expand(top_page)) {
        ASSERT(top_page->next_page()->is_valid());
        SetAllocationInfo(alloc_info, top_page->next_page());
      }
      // Collect garbage and retry
      else {
        return Failure::RetryAfterGC(size_in_bytes, identity());
      }
    } else {  // Allocation during m-c relocation.
      // During m-c 'allocation' while computing forwarding addresses, we do
      // not yet add blocks to the free list because they still contain live
      // objects.  We also cache the m-c forwarding allocation pointer in the
      // current page.

      // If there are no more pages try to expand the space.  This can only
      // happen when promoting objects from the new space.
      if (!top_page->next_page()->is_valid()) {
        if (!Expand(top_page)) {
          return Failure::RetryAfterGC(size_in_bytes, identity());
        }
      }

      // Move to the next page.
      ASSERT(top_page->next_page()->is_valid());
      top_page->mc_relocation_top = alloc_info->top;
      SetAllocationInfo(alloc_info, top_page->next_page());
    }
  } else {  // Free-list allocation.
    // We failed to allocate from the free list; try to expand the space and
    // switch back to linear allocation.
    ASSERT(alloc_info == &allocation_info_);
    Page* top_page = TopPageOf(*alloc_info);
    if (!top_page->next_page()->is_valid()) {
      if (!Expand(top_page)) {
        return Failure::RetryAfterGC(size_in_bytes, identity());
      }
    }

    // We surely have more pages, move to the next page and switch to linear
    // allocation.
    ASSERT(top_page->next_page()->is_valid());
    SetAllocationInfo(alloc_info, top_page->next_page());
    ASSERT(allocation_mode_ == FREE_LIST);
    allocation_mode_ = LINEAR;
  }

  // Perform the allocation.
  return AllocateRawInternal(size_in_bytes, alloc_info);
}

This does the following:

1. Use TopPageOf to get top_page, the last page with allocated memory
2. Compute the space remaining on that last page
3. If the remaining size is greater than 0, hand it to free_list_ and move the page's top to the end of the page
4. Allocate, with four possibilities
	a. If top_page->next_page() is valid, i.e. the page after the current top_page exists, allocate from it directly: go to step 5
	b. If the allocation mode is linear, switch it to FREE_LIST and go to step 5, so subsequent allocation comes from free_list_
	c. Expand by a new chunk (subject to the space's maximum size; at most 64 pages per chunk), then go to step 5
	d. If none of the above succeeded, return RetryAfterGC: collect garbage and retry
5. Call AllocateRawInternal

Expand deserves a closer look; it actually calls the base class method PagedSpace::Expand:

bool PagedSpace::Expand(Page* last_page) {
  ASSERT(max_capacity_ % Page::kObjectAreaSize == 0);
  ASSERT(Capacity() % Page::kObjectAreaSize == 0);

  if (Capacity() == max_capacity_) return false;

  ASSERT(Capacity() < max_capacity_);
  // Last page must be valid and its next page is invalid.
  ASSERT(last_page->is_valid() && !last_page->next_page()->is_valid());

  // Pages still available
  int available_pages = (max_capacity_ - Capacity()) / Page::kObjectAreaSize;
  if (available_pages <= 0) return false;

  // At most one chunk's worth
  int desired_pages = Min(available_pages, MemoryAllocator::kPagesPerChunk);
  Page* p = MemoryAllocator::AllocatePages(desired_pages, &desired_pages, this);
  if (!p->is_valid()) return false;

  accounting_stats_.ExpandSpace(desired_pages * Page::kObjectAreaSize);
  ASSERT(Capacity() <= max_capacity_);

  MemoryAllocator::SetNextPage(last_page, p);

  // Clear remembered set of new pages.
  while (p->is_valid()) {
    p->ClearRSet();
    p = p->next_page();
  }

  return true;
}

This does the following:

1. Compute available_pages, the maximum number of pages that can still be allocated
2. Take the minimum of available_pages and kPagesPerChunk (64) and pass it to MemoryAllocator::AllocatePages, described earlier, which allocates a chunk and initializes its pages
3. Splice the new pages into the page list, i.e. link them in after top_page

To summarize, here is the OldSpace allocation flow:

NewSpace

NewSpace::AllocateRawInternal:

Object* NewSpace::AllocateRawInternal(int size_in_bytes,
                                      AllocationInfo* alloc_info) {
  Address new_top = alloc_info->top + size_in_bytes;
  if (new_top > alloc_info->limit) {
    return Failure::RetryAfterGC(size_in_bytes, NEW_SPACE);
  }

  Object* obj = HeapObject::FromAddress(alloc_info->top);
  alloc_info->top = new_top;
#ifdef DEBUG
  SemiSpace* space =
      (alloc_info == &allocation_info_) ? to_space_ : from_space_;
  ASSERT(space->low() <= alloc_info->top
         && alloc_info->top <= space->high()
         && alloc_info->limit == space->high());
#endif
  return obj;
}

There is no notion of pages here; allocation just bumps the top pointer, and if there isn't enough room it returns RetryAfterGC.

LargeObjectSpace

LargeObjectSpace::AllocateRawInternal:

Object* LargeObjectSpace::AllocateRawInternal(int requested_size,
                                              int object_size) {
  ASSERT(0 < object_size && object_size <= requested_size);
  size_t chunk_size;
  LargeObjectChunk* chunk =
      LargeObjectChunk::New(requested_size, &chunk_size);
  if (chunk == NULL) {
    return Failure::RetryAfterGC(requested_size, LO_SPACE);
  }

  size_ += chunk_size;
  page_count_++;
  chunk->set_next(first_chunk_);
  chunk->set_size(chunk_size);
  first_chunk_ = chunk;

  // Set the object address and size in the page header and clear its
  // remembered set.
  Page* page = Page::FromAddress(RoundUp(chunk->address(), Page::kPageSize));
  Address object_address = page->ObjectAreaStart();
  // Clear the low order bit of the second word in the page to flag it as a
  // large object page.  If the chunk_size happened to be written there, its
  // low order bit should already be clear.
  ASSERT((chunk_size & 0x1) == 0);
  page->is_normal_page &= ~0x1;
  page->ClearRSet();
  int extra_bytes = requested_size - object_size;
  if (extra_bytes > 0) {
    // The extra memory for the remembered set should be cleared.
    memset(object_address + object_size, 0, extra_bytes);
  }

  return HeapObject::FromAddress(object_address);
}

LargeObjectChunk* LargeObjectChunk::New(int size_in_bytes,
                                        size_t* chunk_size) {
  size_t requested = ChunkSizeFor(size_in_bytes);
  void* mem = MemoryAllocator::AllocateRawMemory(requested, chunk_size);
  if (mem == NULL) return NULL;
  LOG(NewEvent("LargeObjectChunk", mem, *chunk_size));
  if (*chunk_size < requested) {
    MemoryAllocator::FreeRawMemory(mem, *chunk_size);
    LOG(DeleteEvent("LargeObjectChunk", mem));
    return NULL;
  }
  return reinterpret_cast<LargeObjectChunk*>(mem);
}

Since every page in this space holds exactly one object, allocating is just a matter of grabbing an object-sized block via MemoryAllocator::AllocateRawMemory and linking it into the space's chunk list.

Teardown

When a V8 instance is destroyed it calls V8::TearDown, which in turn calls Heap::TearDown:

void Heap::TearDown() {
  GlobalHandles::TearDown();

  if (new_space_ != NULL) {
    new_space_->TearDown();
    delete new_space_;
    new_space_ = NULL;
  }

  if (old_space_ != NULL) {
    old_space_->TearDown();
    delete old_space_;
    old_space_ = NULL;
  }

  if (code_space_ != NULL) {
    code_space_->TearDown();
    delete code_space_;
    code_space_ = NULL;
  }

  if (map_space_ != NULL) {
    map_space_->TearDown();
    delete map_space_;
    map_space_ = NULL;
  }

  if (lo_space_ != NULL) {
    lo_space_->TearDown();
    delete lo_space_;
    lo_space_ = NULL;
  }

  MemoryAllocator::TearDown();
}

This simply frees each space in turn.

Garbage Collection

When an object fails to get memory, Failure::RetryAfterGC is returned and memory cleanup begins. The garbage collection entry point is in src/heap-inl.h:

// Do not use the identifier __object__ in a call to this macro.
//
// Call the function FUNCTION_CALL.  If it fails with a RetryAfterGC
// failure, call the garbage collector and retry the function.  If the
// garbage collector cannot reclaim the required space or the second
// call fails with a RetryAfterGC failure, fail with out of memory.
// If there is any other failure, return a null handle.  If either
// call succeeds, return a handle to the functions return value.
//
// Note that this macro always returns or raises a fatal error.
#define CALL_HEAP_FUNCTION(FUNCTION_CALL, TYPE)                              \
  do {                                                                       \
    GC_GREEDY_CHECK();                                                       \
    Object* __object__ = FUNCTION_CALL;                                      \
    if (__object__->IsFailure()) {                                           \
      if (__object__->IsRetryAfterGC()) {                                    \
        if (!Heap::CollectGarbage(                                           \
                Failure::cast(__object__)->requested(),                      \
                Failure::cast(__object__)->allocation_space())) {            \
          /* TODO(1181417): Fix this. */                                     \
          v8::internal::V8::FatalProcessOutOfMemory("CALL_HEAP_FUNCTION");   \
        }                                                                    \
        __object__ = FUNCTION_CALL;                                          \
        if (__object__->IsFailure()) {                                       \
          if (__object__->IsRetryAfterGC()) {                                \
            /* TODO(1181417): Fix this. */                                   \
            v8::internal::V8::FatalProcessOutOfMemory("CALL_HEAP_FUNCTION"); \
          }                                                                  \
          return Handle<TYPE>();                                             \
        }                                                                    \
      } else {                                                               \
        return Handle<TYPE>();                                               \
      }                                                                      \
    }                                                                        \
    return Handle<TYPE>(TYPE::cast(__object__));                             \
  } while (false)

This invokes Heap::CollectGarbage:

bool Heap::CollectGarbage(int requested_size, AllocationSpace space) {
  // The VM is in the GC state until exiting this function.
  VMState state(GC);

#ifdef DEBUG
  // Reset the allocation timeout to the GC interval, but make sure to
  // allow at least a few allocations after a collection. The reason
  // for this is that we have a lot of allocation sequences and we
  // assume that a garbage collection will allow the subsequent
  // allocation attempts to go through.
  allocation_timeout_ = Max(6, FLAG_gc_interval);
#endif

  { GCTracer tracer;
    GarbageCollectionPrologue();

    GarbageCollector collector = SelectGarbageCollector(space);
    tracer.set_collector(collector);

    StatsRate* rate = (collector == SCAVENGER)
        ? &Counters::gc_scavenger
        : &Counters::gc_compactor;
    rate->Start();
    PerformGarbageCollection(space, collector);
    rate->Stop();

    GarbageCollectionEpilogue();
  }


#ifdef ENABLE_LOGGING_AND_PROFILING
  if (FLAG_log_gc) HeapProfiler::WriteSample();
#endif

  switch (space) {
    case NEW_SPACE:
      return new_space_->Available() >= requested_size;
    case OLD_SPACE:
      return old_space_->Available() >= requested_size;
    case CODE_SPACE:
      return code_space_->Available() >= requested_size;
    case MAP_SPACE:
      return map_space_->Available() >= requested_size;
    case LO_SPACE:
      return lo_space_->Available() >= requested_size;
  }
  return false;
}

This does two things:

1. SelectGarbageCollector: choose a garbage collector
2. PerformGarbageCollection: run the collection

Heap::SelectGarbageCollector chooses the collector as follows:

GarbageCollector Heap::SelectGarbageCollector(AllocationSpace space) {
  // Is global GC requested?
  if (space != NEW_SPACE || FLAG_gc_global) {
    Counters::gc_compactor_caused_by_request.Increment();
    return MARK_COMPACTOR;
  }

  // Is enough data promoted to justify a global GC?
  if (PromotedSpaceSize() > promoted_space_limit_) {
    Counters::gc_compactor_caused_by_promoted_data.Increment();
    return MARK_COMPACTOR;
  }

  // Have allocation in OLD and LO failed?
  if (old_gen_exhausted_) {
    Counters::gc_compactor_caused_by_oldspace_exhaustion.Increment();
    return MARK_COMPACTOR;
  }

  // Is there enough space left in OLD to guarantee that a scavenge can
  // succeed?
  //
  // Note that old_space_->MaxAvailable() undercounts the memory available
  // for object promotion. It counts only the bytes that the memory
  // allocator has not yet allocated from the OS and assigned to any space,
  // and does not count available bytes already in the old space or code
  // space.  Undercounting is safe---we may get an unrequested full GC when
  // a scavenge would have succeeded.
  if (old_space_->MaxAvailable() <= new_space_->Size()) {
    Counters::gc_compactor_caused_by_oldspace_exhaustion.Increment();
    return MARK_COMPACTOR;
  }

  // Default
  return SCAVENGER;
}

There are two collection algorithms, mark-compact and scavenge. Four conditions select the MARK_COMPACTOR collector; in all other cases SCAVENGER is chosen. The four conditions:

1. The space is not NEW_SPACE, or a global GC was requested
2. The amount of promoted data (PromotedSpaceSize) exceeds the promotion limit promoted_space_limit_
3. A previous allocation in old_space_ or lo_space_ failed (old_gen_exhausted_)
4. old_space_'s remaining space is smaller than new_space_'s size

Heap::PerformGarbageCollection runs the collection:

void Heap::PerformGarbageCollection(AllocationSpace space,
                                    GarbageCollector collector) {
  if (collector == MARK_COMPACTOR && global_gc_prologue_callback_) {
    ASSERT(!allocation_allowed_);
    global_gc_prologue_callback_();
  }

  if (collector == MARK_COMPACTOR) {
    MarkCompact();

    int promoted_space_size = PromotedSpaceSize();
    promoted_space_limit_ =
        promoted_space_size + Max(2 * MB, (promoted_space_size/100) * 35);
    old_gen_exhausted_ = false;

    // If we have used the mark-compact collector to collect the new
    // space, and it has not compacted the new space, we force a
    // separate scavenge collection.  THIS IS A HACK.  It covers the
    // case where (1) a new space collection was requested, (2) the
    // collector selection policy selected the mark-compact collector,
    // and (3) the mark-compact collector policy selected not to
    // compact the new space.  In that case, there is no more (usable)
    // free space in the new space after the collection compared to
    // before.
    if (space == NEW_SPACE && !MarkCompactCollector::HasCompacted()) {
      Scavenge();
    }
  } else {
    Scavenge();
  }
  Counters::objs_since_last_young.Set(0);

  // Process weak handles post gc.
  GlobalHandles::PostGarbageCollectionProcessing();

  if (collector == MARK_COMPACTOR && global_gc_epilogue_callback_) {
    ASSERT(!allocation_allowed_);
    global_gc_epilogue_callback_();
  }
}

Based on the chosen collector, this runs the appropriate algorithm (mark-compact or scavenge) over the relevant spaces.

Scavenge

Let's look at the two collection algorithms in turn.

The new generation is collected with the Scavenge algorithm, whose implementation is essentially Cheney's algorithm.

Cheney's algorithm is a copying collector. It splits memory in two; each half is called a semispace. Of the two semispaces, one is in use and one is idle. The in-use semispace is called the From space and the idle one the To space, and objects are allocated in the From space. When collection starts, the live objects in the From space are found and copied into the To space (which compacts them as a side effect), while the space occupied by dead objects is simply released. Once copying finishes, the From and To spaces swap roles. In other words, garbage collection amounts to shuttling live objects between the two semispaces.

Clearly, under Cheney's algorithm half the memory is always empty, but because the new generation is small, the waste is modest. And since the vast majority of new-generation objects are dead, the fraction of live objects that must be copied is tiny, so the time cost is very attractive. The copying itself is a BFS (breadth-first traversal): starting from the root objects, it visits every reachable object.

Note that V8's from_space_ and to_space_ are named the opposite way from the description above: objects are allocated in to_space_, and from_space_ serves as the copy target.

Here is Heap::Scavenge:

void Heap::Scavenge() {
#ifdef DEBUG
  if (FLAG_enable_slow_asserts) {
    VerifyCodeSpacePointersVisitor v;
    HeapObjectIterator it(code_space_);
    while (it.has_next()) {
      HeapObject* object = it.next();
      if (object->IsCode()) {
        Code::cast(object)->ConvertICTargetsFromAddressToObject();
      }
      object->Iterate(&v);
      if (object->IsCode()) {
        Code::cast(object)->ConvertICTargetsFromObjectToAddress();
      }
    }
  }
#endif

  gc_state_ = SCAVENGE;

  // Implements Cheney's copying algorithm
  LOG(ResourceEvent("scavenge", "begin"));


  // To avoid frequent scavenges caused by a too-small new space, check the
  // scavenge count before each collection: if it exceeds the limit (initially
  // 8) and the new space can still double (initially 256KB, up to 2MB),
  // double both the space and the limit. This policy can be tuned in practice.
  scavenge_count_++;
  if (new_space_->Capacity() < new_space_->MaximumCapacity() &&
      scavenge_count_ > new_space_growth_limit_) {
    // Double the size of the new space, and double the limit.  The next
    // doubling attempt will occur after the current new_space_growth_limit_
    // more collections.
    // TODO(1240712): NewSpace::Double has a return value which is
    // ignored here.
    new_space_->Double();
    new_space_growth_limit_ *= 2;
  }

  // Flip the semispaces.  After flipping, to space is empty, from space has
  // live objects.
  // Swap from_space and to_space:
  // the two semispaces exchange their bookkeeping
  new_space_->Flip();
  // Reset allocation_info_
  new_space_->ResetAllocationInfo();

  // We need to sweep newly copied objects which can be in either the to space
  // or the old space.  For to space objects, we use a mark.  Newly copied
  // objects lie between the mark and the allocation top.  For objects
  // promoted to old space, we write their addresses downward from the top of
  // the new space.  Sweeping newly promoted objects requires an allocation
  // pointer and a mark.  Note that the allocation pointer 'top' actually
  // moves downward from the high address in the to space.
  //
  // There is guaranteed to be enough room at the top of the to space for the
  // addresses of promoted objects: every object promoted frees up its size in
  // bytes from the top of the new space, and objects are at least one pointer
  // in size.  Using the new space to record promoted addresses makes the
  // scavenge collector agnostic to the allocation strategy (eg, linear or
  // free-list) used in old space.
  // Pointers to promoted objects are recorded in the new space from the top downward
  // to_space_->low()
  Address new_mark = new_space_->ToSpaceLow();
  Address promoted_mark = new_space_->ToSpaceHigh();
  promoted_top = new_space_->ToSpaceHigh();

  CopyVisitor copy_visitor;
  // Copy roots.
  IterateRoots(&copy_visitor);

  // Copy objects reachable from the old generation.  By definition, there
  // are no intergenerational pointers in code space.
  IterateRSet(old_space_, &CopyObject);
  IterateRSet(map_space_, &CopyObject);
  lo_space_->IterateRSet(&CopyObject);

  bool has_processed_weak_pointers = false;

  while (true) {
    ASSERT(new_mark <= new_space_->top());
    ASSERT(promoted_mark >= promoted_top);

    // Copy objects reachable from newly copied objects.
    // Breadth-first traversal
    // stops when the marks catch up with the tops
    // new_space_->top() is allocation_info_.top
    while (new_mark < new_space_->top() || promoted_mark > promoted_top) {
      // Sweep newly copied objects in the to space.  The allocation pointer
      // can change during sweeping.
      Address previous_top = new_space_->top();
      SemiSpaceIterator new_it(new_space_, new_mark);
      while (new_it.has_next()) {
        new_it.next()->Iterate(&copy_visitor);
      }
      new_mark = previous_top;

      // Sweep newly copied objects in the old space.  The promotion 'top'
      // pointer could change during sweeping.
      previous_top = promoted_top;
      for (Address current = promoted_mark - kPointerSize;
           current >= previous_top;
           current -= kPointerSize) {
        HeapObject* object = HeapObject::cast(Memory::Object_at(current));
        object->Iterate(&copy_visitor);
        UpdateRSet(object);
      }
      promoted_mark = previous_top;
    }

    if (has_processed_weak_pointers) break;  // We are done.
    // Copy objects reachable from weak pointers.
    GlobalHandles::IterateWeakRoots(&copy_visitor);
    has_processed_weak_pointers = true;
  }

  // Set age mark.
  new_space_->set_age_mark(new_mark);

  LOG(ResourceEvent("scavenge", "end"));

  gc_state_ = NOT_IN_GC;
}

This simply implements the algorithm described above. The main steps are:

1.To avoid frequent scavenges caused by a too-small new space, the scavenge count is checked before each collection; if it exceeds the limit (initially 8) and the new space can still double (initially 256KB, up to 2MB), both the space and the limit are doubled.
2.Swap from_space and to_space; the two semispaces exchange their bookkeeping.
3.Reset allocation_info_.
4.IterateRoots: copy the root objects in from_space (the to_space before the flip) into to_space, including the strong_root_list, struct maps, symbols, bootstrapper, top, debug, compilation cache, handle scopes, builtins, global handles, thread manager, and so on.
5.Breadth-first traverse to_space, copying every object it references.
6.Finally, copy the objects in the global handle list that are in the weak or pending state.

Let's look at a few of the important points in detail.

IterateRoots

IterateRoots(&copy_visitor) copies the root objects into to_space_. The code is:

void Heap::IterateRoots(ObjectVisitor* v) {
  IterateStrongRoots(v);
  // copy object
  v->VisitPointer(reinterpret_cast<Object**>(&symbol_table_));
  SYNCHRONIZE_TAG("symbol_table");
}

void Heap::IterateStrongRoots(ObjectVisitor* v) {
#define ROOT_ITERATE(type, name) \
  v->VisitPointer(reinterpret_cast<Object**>(&name##_));
  STRONG_ROOT_LIST(ROOT_ITERATE);
#undef ROOT_ITERATE
  SYNCHRONIZE_TAG("strong_root_list");

#define STRUCT_MAP_ITERATE(NAME, Name, name) \
  v->VisitPointer(reinterpret_cast<Object**>(&name##_map_));
  STRUCT_LIST(STRUCT_MAP_ITERATE);
#undef STRUCT_MAP_ITERATE
  SYNCHRONIZE_TAG("struct_map");

#define SYMBOL_ITERATE(name, string) \
  v->VisitPointer(reinterpret_cast<Object**>(&name##_));
  SYMBOL_LIST(SYMBOL_ITERATE)
#undef SYMBOL_ITERATE
  SYNCHRONIZE_TAG("symbol");

  Bootstrapper::Iterate(v);
  SYNCHRONIZE_TAG("bootstrapper");
  Top::Iterate(v);
  SYNCHRONIZE_TAG("top");
  Debug::Iterate(v);
  SYNCHRONIZE_TAG("debug");

  // Iterate over local handles in handle scopes.
  HandleScopeImplementer::Iterate(v);
  SYNCHRONIZE_TAG("handlescope");

  // Iterate over the builtin code objects and code stubs in the heap. Note
  // that it is not strictly necessary to iterate over code objects on
  // scavenge collections.  We still do it here because this same function
  // is used by the mark-sweep collector and the deserializer.
  Builtins::IterateBuiltins(v);
  SYNCHRONIZE_TAG("builtins");

  // Iterate over global handles.
  GlobalHandles::IterateRoots(v);
  SYNCHRONIZE_TAG("globalhandles");

  // Iterate over pointers being held by inactive threads.
  ThreadManager::Iterate(v);
  SYNCHRONIZE_TAG("threadmanager");
}

This visits the strong_root_list and the other roots. The key is v->VisitPointer(reinterpret_cast<Object**>(&name##_));, which actually invokes CopyVisitor::VisitPointer:

// Helper class for copying HeapObjects
class CopyVisitor: public ObjectVisitor {
 public:

  void VisitPointer(Object** p) {
    CopyObject(p);
  }

  void VisitPointers(Object** start, Object** end) {
    // Copy all HeapObject pointers in [start, end)
    for (Object** p = start; p < end; p++) CopyObject(p);
  }

 private:
  void CopyObject(Object** p) {
    if (!Heap::InFromSpace(*p)) return;
    Heap::CopyObject(reinterpret_cast<HeapObject**>(p));
  }
};

As we can see, the real work is done by Heap::CopyObject:

void Heap::CopyObject(HeapObject** p) {
  ASSERT(InFromSpace(*p));

  HeapObject* object = *p;

  // We use the first word (where the map pointer usually is) of a
  // HeapObject to record the forwarding pointer.  A forwarding pointer can
  // point to the old space, the code space, or the to space of the new
  // generation.
  // Fetch the type
  // (reinterpret_cast<byte*>(p) + offset - kHeapObjectTag)
  HeapObject* first_word = object->map();

  // If the first word (where the map pointer is) is not a map pointer, the
  // object has already been copied.  We do not use first_word->IsMap()
  // because we know that first_word always has the heap object tag.
  // If the referenced (live) object has already been copied into to_space_, just update the reference: the forwarding pointer points at the new object in to_space_
  if (first_word->map()->instance_type() != MAP_TYPE) {
    *p = first_word;
    return;
  }

  // Optimization: Bypass ConsString objects where the right-hand side is
  // Heap::empty_string().  We do not use object->IsConsString because we
  // already know that object has the heap object tag.
  InstanceType type = Map::cast(first_word)->instance_type();
  if (type < FIRST_NONSTRING_TYPE &&
      String::cast(object)->representation_tag() == kConsStringTag &&
      ConsString::cast(object)->second() == Heap::empty_string()) {
    object = HeapObject::cast(ConsString::cast(object)->first());
    *p = object;
    // After patching *p we have to repeat the checks that object is in the
    // active semispace of the young generation and not already copied.
    if (!InFromSpace(object)) return;
    first_word = object->map();
    if (first_word->map()->instance_type() != MAP_TYPE) {
      *p = first_word;
      return;
    }
    type = Map::cast(first_word)->instance_type();
  }

  int object_size = object->SizeFromMap(Map::cast(first_word));
  Object* result;
  // If the object should be promoted, we try to copy it to old space.
  // The object survived the previous scavenge (i.e. has survived two scavenges), or to_space is already at least 25% full
  if (ShouldBePromoted(object->address(), object_size)) {
    // Heap numbers and sequential strings are promoted to code space, all
    // other object types are promoted to old space.  We do not use
    // object->IsHeapNumber() and object->IsSeqString() because we already
    // know that object has the heap object tag.
    bool has_pointers =
        type != HEAP_NUMBER_TYPE &&
        (type >= FIRST_NONSTRING_TYPE ||
         String::cast(object)->representation_tag() != kSeqStringTag);
    if (has_pointers) {
      result = old_space_->AllocateRaw(object_size);
    } else {
      result = code_space_->AllocateRaw(object_size);
    }

    if (!result->IsFailure()) {
      // object->set_map()
      // set forwarding pointer
      *p = MigrateObject(p, HeapObject::cast(result), object_size);
      if (has_pointers) {
        // Record the object's address at the top of the to space, to allow
        // it to be swept by the scavenger.
        promoted_top -= kPointerSize;
        Memory::Object_at(promoted_top) = *p;
      } else {
#ifdef DEBUG
        // Objects promoted to the code space should not have pointers to
        // new space.
        VerifyCodeSpacePointersVisitor v;
        (*p)->Iterate(&v);
#endif
      }
      return;
    }
  }

  // The object should remain in new space or the old space allocation failed.
  result = new_space_->AllocateRaw(object_size);
  // Failed allocation at this point is utterly unexpected.
  ASSERT(!result->IsFailure());
  *p = MigrateObject(p, HeapObject::cast(result), object_size);
}

The main steps here are:

1.If the referenced (live) object has already been copied into to_space_, simply update the reference via the forwarding pointer to the new object in to_space_. Note that the map pointer of an object in from_space points at its copy in to_space_ or old_space; this map pointer is also how we tell whether the object has already been copied.
2.Decide whether the object should be promoted into old_space_ or code_space_; if so, allocate in the corresponding space and update promoted_top. Note that, for the later breadth-first traversal, the addresses of promoted objects have to be recorded: they are written into to_space starting at the end of the space and growing downward, and promoted_top marks the (downward-growing) top of that list.
3.Finally, if the object should stay in new space, or the old-space allocation failed, call new_space_->AllocateRaw and set the forwarding pointer of the original from_space object to point at the new object in to_space_.

Note the ShouldBePromoted method, i.e. the promotion criteria:

bool Heap::ShouldBePromoted(Address old_address, int object_size) {
  // An object should be promoted if:
  // - the object has survived a scavenge operation or
  // - to space is already 25% full.
  return old_address < new_space_->age_mark()
      || (new_space_->Size() + object_size) >= (new_space_->Capacity() >> 2);
}

As the code shows, either of two conditions triggers promotion:

1.The To space is already more than 25% full.
2.The object survived the previous scavenge (i.e. it has now survived two scavenges).

Here new_space_->age_mark() actually holds new_mark, the top of the allocated region in to_space, i.e. the upper bound of the objects copied during the previous scavenge; you will see this below when we discuss the breadth-first traversal of to_space_.
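As a quick sanity check of the two triggers, here is a minimal standalone version of the predicate, with addresses modeled as plain ints and all names invented for illustration:

```cpp
#include <cassert>

// Sketch of Heap::ShouldBePromoted above. 'age_mark' is the top of the
// copied region after the previous scavenge: any address below it belongs
// to an object that has already survived one collection.
bool should_be_promoted(int old_address, int object_size,
                        int age_mark, int new_space_size, int capacity) {
  return old_address < age_mark                              // survived once
      || (new_space_size + object_size) >= (capacity >> 2);  // 25% full
}
```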

Breadth-first traversal of to_space_

The traversal walks the object pointers already present in to_space, which cover both the objects newly copied into to_space_ and the objects newly promoted into old_space_. Each of the two groups is tracked by a pair of pointers (a mark and a top) marking the traversal position; when the two pairs of pointers meet, the traversal ends.

The code is:

// Copy objects reachable from newly copied objects.
    // Breadth-first traversal
    // stops when the marks catch up with the tops
    // new_space_->top() is allocation_info_.top
    while (new_mark < new_space_->top() || promoted_mark > promoted_top) {
      // Sweep newly copied objects in the to space.  The allocation pointer
      // can change during sweeping.
      Address previous_top = new_space_->top();
      SemiSpaceIterator new_it(new_space_, new_mark);
      while (new_it.has_next()) {
        new_it.next()->Iterate(&copy_visitor);
      }
      new_mark = previous_top;

      // Sweep newly copied objects in the old space.  The promotion 'top'
      // pointer could change during sweeping.
      previous_top = promoted_top;
      for (Address current = promoted_mark - kPointerSize;
           current >= previous_top;
           current -= kPointerSize) {
        HeapObject* object = HeapObject::cast(Memory::Object_at(current));
        object->Iterate(&copy_visitor);
        UpdateRSet(object);
      }
      promoted_mark = previous_top;
    }

If the interplay of these pointers still feels opaque, the worked example in 浅谈V8引擎中的垃圾回收机制 walks through the scenario quite clearly.

Mark-Compact

The Mark-Compact algorithm has two main phases:

1.Marking: find and mark all live objects.
2.Compacting/sweeping: compact the heap by copying the live objects into a new contiguous region.

The Mark-Compact code is:

void Heap::MarkCompact() {
  gc_state_ = MARK_COMPACT;
#ifdef DEBUG
  mc_count_++;
#endif
  LOG(ResourceEvent("markcompact", "begin"));

  MarkCompactPrologue();

  MarkCompactCollector::CollectGarbage();

  MarkCompactEpilogue();

  LOG(ResourceEvent("markcompact", "end"));

  gc_state_ = NOT_IN_GC;

  Shrink();

  Counters::objs_since_last_full.Set(0);
}

This performs the collection prologue (setting up state), the collection itself, the matching epilogue, and finally a heap shrink; each is described in detail below.

MarkCompactPrologue

The code of MarkCompactPrologue:

void Heap::MarkCompactPrologue() {
  // Drop cached compiled code
  RegExpImpl::OldSpaceCollectionPrologue();
  // For every stack handler in every stack frame of the current thread, rewrite its program counter (PC) as an offset instead of an absolute address
  Top::MarkCompactPrologue();
  // Do the same for all threads
  ThreadManager::MarkCompactPrologue();
}

This does two things:

1.Drop the cached compiled code.
2.Rewrite the pc_address of every stack handler in every thread's stack frames (absolute address becomes an offset).

Top::MarkCompactPrologue() ends up in the following code:

void Top::MarkCompactPrologue(ThreadLocalTop* thread) {
  StackFrame::CookFramesForThread(thread);
}

void StackFrame::CookFramesForThread(ThreadLocalTop* thread) {
  ASSERT(!thread->stack_is_cooked());
  for (StackFrameIterator it(thread); !it.done(); it.Advance()) {
    it.frame()->Cook();
  }
  thread->set_stack_is_cooked(true);
}

void StackFrame::Cook() {
  Code* code = FindCode();
  for (StackHandlerIterator it(this, top_handler()); !it.done(); it.Advance()) {
    it.handler()->Cook(code);
  }
  ASSERT(code->contains(pc()));
  set_pc(AddressFrom<Address>(pc() - code->instruction_start()));
}

void StackHandler::Cook(Code* code) {
  ASSERT(code->contains(pc()));
  set_pc(AddressFrom<Address>(pc() - code->instruction_start()));
}

As the code above shows, the chain ultimately calls set_pc (in StackFrame::Cook and StackHandler::Cook) to rewrite each pc_address.

MarkCompactCollector::CollectGarbage

This is the actual collection phase of Mark-Compact. The code is:

void MarkCompactCollector::CollectGarbage() {
  // Preparation work
  Prepare();

  // Depth-first traversal from the root objects, marking live objects
  MarkLiveObjects();

  // Sweep the LargeObjectSpace, freeing unmarked large objects
  SweepLargeObjectSpace();

  if (compacting_collection_) {
    // Allocate space, and record each new object's address in the old object's map pointer or at the mirrored position in from_space
    EncodeForwardingAddresses();

    // Update every pointer to a live object so it points at the object's new address
    UpdatePointers();

    // Copy each original object's memory to its new location
    // (implemented with memmove)
    RelocateObjects();

    // Rebuild the remembered set
    // remember_set are sparse, faster (eg, binary) search for set bits
    RebuildRSets();

  } else {
    // Reclaim the unmarked memory
    SweepSpaces();
  }

  Finish();
}

The main steps:

1.Compute compacting_collection_, which decides whether the later phases compact memory.
2.Depth-first traverse from the root objects, marking live objects.
3.Sweep the LargeObjectSpace, freeing unmarked large objects.
4.Branch on whether compaction is needed, i.e. whether `compacting_collection_` is `true`:
	a.Compaction needed:
		i.Allocate space, and record each new object's address in the old object's map pointer or at the mirrored position in from_space.
		ii.Update every pointer to a live object so it points at the new object's address.
		iii.Copy each original object to its new location with memmove.
		iv.Rebuild the remembered set (the remembered set is sparse, allowing fast, e.g. binary, search for set bits).
	b.No compaction, sweep only:
		i.Reclaim the unmarked memory.
5.Finish: clear the StubCache.

Let's walk through each phase in detail:

Prepare

Prepare decides whether the current state calls for compaction, or whether a plain sweep suffices. The code:

void MarkCompactCollector::Prepare() {
  static const int kFragmentationLimit = 50;  // Percent.
#ifdef DEBUG
  ASSERT(state_ == IDLE);
  state_ = PREPARE_GC;
#endif
  ASSERT(!FLAG_always_compact || !FLAG_never_compact);

  compacting_collection_ = FLAG_always_compact;

  // We compact the old generation if it gets too fragmented (ie, we could
  // recover an expected amount of space by reclaiming the waste and free
  // list blocks).  We always compact when the flag --gc-global is true
  // because objects do not get promoted out of new space on non-compacting
  // GCs.
  // Compact when fragmentation is severe
  // Always compact when --gc-global is true
  if (!compacting_collection_) {
    // Recoverable space
    // The whole space consists of Size (used) + Waste (wasted) + AvailableFree (free)
    int old_gen_recoverable = Heap::old_space()->Waste()
                            + Heap::old_space()->AvailableFree()
                            + Heap::code_space()->Waste()
                            + Heap::code_space()->AvailableFree();
    int old_gen_used = old_gen_recoverable
                     + Heap::old_space()->Size()
                     + Heap::code_space()->Size();
    int old_gen_fragmentation = (old_gen_recoverable * 100) / old_gen_used;
    // old_gen_fragmentation > 50
    if (old_gen_fragmentation > kFragmentationLimit) {
      compacting_collection_ = true;
    }
  }

  if (FLAG_never_compact) compacting_collection_ = false;

#ifdef DEBUG
  if (compacting_collection_) {
    // We will write bookkeeping information to the remembered set area
    // starting now.
    // mark the pages' rset state as NOT_IN_USE
    Page::set_rset_state(Page::NOT_IN_USE);
  }
#endif

  Heap::map_space()->PrepareForMarkCompact(compacting_collection_);
  Heap::old_space()->PrepareForMarkCompact(compacting_collection_);
  Heap::code_space()->PrepareForMarkCompact(compacting_collection_);

  Counters::global_objects.Set(0);

#ifdef DEBUG
  live_bytes_ = 0;
  live_young_objects_ = 0;
  live_old_objects_ = 0;
  live_immutable_objects_ = 0;
  live_map_objects_ = 0;
  live_lo_objects_ = 0;
#endif
}

Compaction is triggered when fragmentation is severe; concretely, when recoverable space divided by total space exceeds 50%, compaction is used.
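The fragmentation check is simple integer arithmetic; a hypothetical standalone version of the test in Prepare() might look like:

```cpp
#include <cassert>

// Sketch of the compaction trigger above: compact when the recoverable
// fraction (waste + free-list blocks) of the old generation exceeds 50%.
bool should_compact(int waste, int available_free, int size) {
  int recoverable = waste + available_free;
  int used = recoverable + size;           // Size + Waste + AvailableFree
  return (recoverable * 100) / used > 50;  // kFragmentationLimit
}
```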

MarkLiveObjects

MarkLiveObjects marks all the live objects via a depth-first traversal starting from the root objects. The code:

// Traverse the heap, marking live objects
void MarkCompactCollector::MarkLiveObjects() {
#ifdef DEBUG
  ASSERT(state_ == PREPARE_GC);
  state_ = MARK_LIVE_OBJECTS;
#endif
  // The to space contains live objects, the from space is used as a marking
  // stack.
  marking_stack.Initialize(Heap::new_space()->FromSpaceLow(),
                           Heap::new_space()->FromSpaceHigh());

  // overflowed() returns marking_stack.is_overflow
  ASSERT(!marking_stack.overflowed());

  // Mark the heap roots, including global variables, stack variables, etc.
  // Traverse the root objects: set_mark & push onto the stack
  MarkingVisitor marking_visitor;
  Heap::IterateStrongRoots(&marking_visitor);

  // Take care of the symbol table specially.
  SymbolTable* symbol_table = SymbolTable::cast(Heap::symbol_table());
#ifdef DEBUG
  UpdateLiveObjectCount(symbol_table);
#endif

  // 1. mark the prefix of the symbol table and push the objects on
  // the stack.
  symbol_table->IteratePrefix(&marking_visitor);
  // 2. mark the symbol table without pushing it on the stack.
  set_mark(symbol_table);  // map word is changed.

  bool has_processed_weak_pointers = false;

  // Mark objects reachable from the roots.
  while (true) {
    // Depth-first traversal: mark and push
    MarkObjectsReachableFromTopFrame();

    if (!marking_stack.overflowed()) {
      if (has_processed_weak_pointers) break;
      // First we mark weak pointers not yet reachable.
      GlobalHandles::MarkWeakRoots(&MustBeMarked);
      // Then we process weak pointers and process the transitive closure.
      GlobalHandles::IterateWeakRoots(&marking_visitor);
      has_processed_weak_pointers = true;
      continue;
    }

    // The marking stack overflowed, we need to rebuild it by scanning the
    // whole heap.
    marking_stack.clear_overflowed();

    // We have early stops if the stack overflowed again while scanning
    // overflowed objects in a space.
    SemiSpaceIterator new_it(Heap::new_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&new_it);
    if (marking_stack.overflowed()) continue;

    HeapObjectIterator old_it(Heap::old_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&old_it);
    if (marking_stack.overflowed()) continue;

    HeapObjectIterator code_it(Heap::code_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&code_it);
    if (marking_stack.overflowed()) continue;

    HeapObjectIterator map_it(Heap::map_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&map_it);
    if (marking_stack.overflowed()) continue;

    LargeObjectIterator lo_it(Heap::lo_space(), &OverflowObjectSize);
    ScanOverflowedObjects(&lo_it);
  }

  // Prune the symbol table removing all symbols only pointed to by
  // the symbol table.
  SymbolTableCleaner v;
  symbol_table->IterateElements(&v);
  symbol_table->ElementsRemoved(v.PointersRemoved());

#ifdef DEBUG
  if (FLAG_verify_global_gc) VerifyHeapAfterMarkingPhase();
#endif

  // Remove object groups after marking phase.
  GlobalHandles::RemoveObjectGroups();

  // Objects in the active semispace of the young generation will be relocated
  // to the inactive semispace.  Set the relocation info to the beginning of
  // the inactive semispace.
  Heap::new_space()->MCResetRelocationInfo();
}

void MarkCompactCollector::MarkObjectsReachableFromTopFrame() {
  MarkingVisitor marking_visitor;
  do {
    while (!marking_stack.is_empty()) {
      // Pop from marking_stack
      HeapObject* obj = marking_stack.Pop();
      ASSERT(Heap::Contains(obj));
      ASSERT(is_marked(obj) && !is_overflowed(obj));

      // Because the object is marked, the map pointer is not tagged as a
      // normal HeapObject pointer, we need to recover the map pointer,
      // then use the map pointer to mark the object body.
      intptr_t map_word = reinterpret_cast<intptr_t>(obj->map());
      Map* map = reinterpret_cast<Map*>(clear_mark_bit(map_word));
      MarkObject(map);
      // Traverse its children
      obj->IterateBody(map->instance_type(), obj->SizeFromMap(map),
                       &marking_visitor);
    };
    // Check objects in object groups.
    MarkObjectGroups(&marking_visitor);
  } while (!marking_stack.is_empty());
}

This does three things:

1.Initialize the stack used for the depth-first traversal; from_space of the new space is reused directly as the marking stack.
2.Traverse the objects reachable from the roots, marking them and pushing them onto the stack.
3.Pop the objects off the stack one by one, continuing the depth-first marking.
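The mark-and-push loop above can be modeled with an explicit stack. The following sketch uses invented Obj and mark_live names and omits V8's overflow handling (rescanning the heap when the marking stack overflows):

```cpp
#include <cassert>
#include <vector>

// Toy heap object: a mark bit plus references to other objects.
struct Obj {
  bool marked = false;
  std::vector<Obj*> refs;
};

// Marking an object pushes it onto the explicit stack exactly once.
void mark_object(Obj* o, std::vector<Obj*>& stack) {
  if (o == nullptr || o->marked) return;
  o->marked = true;
  stack.push_back(o);
}

// Depth-first: mark the roots, then repeatedly pop an object and mark
// its unmarked children. Returns the number of live objects found.
int mark_live(std::vector<Obj*>& roots) {
  std::vector<Obj*> stack;
  for (Obj* r : roots) mark_object(r, stack);
  int live = 0;
  while (!stack.empty()) {
    Obj* o = stack.back();
    stack.pop_back();
    ++live;
    for (Obj* c : o->refs) mark_object(c, stack);
  }
  return live;
}
```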

The marking itself looks like this:

// Mark object pointed to by p.
  void MarkObjectByPointer(Object** p) {
    Object* obj = *p;
    if (!obj->IsHeapObject()) return;

    // Optimization: Bypass ConsString object where right size is
    // Heap::empty_string().
    // Please note this checks performed equals:
    //   object->IsConsString() &&
    //   (ConsString::cast(object)->second() == Heap::empty_string())
    // except the map for the object might be marked.
    intptr_t map_word =
        reinterpret_cast<intptr_t>(HeapObject::cast(obj)->map());
    uint32_t tag =
        (reinterpret_cast<Map*>(clear_mark_bit(map_word)))->instance_type();
    if ((tag < FIRST_NONSTRING_TYPE) &&
        (kConsStringTag ==
         static_cast<StringRepresentationTag>(tag &
                                              kStringRepresentationMask)) &&
        (Heap::empty_string() ==
         reinterpret_cast<String*>(
             reinterpret_cast<ConsString*>(obj)->second()))) {
      // Since we don't have the object start it is impossible to update the
      // remeber set quickly.  Therefore this optimization only is taking
      // place when we can avoid changing.
      Object* first = reinterpret_cast<ConsString*>(obj)->first();
      if (Heap::InNewSpace(obj) || !Heap::InNewSpace(first)) {
        obj = first;
        *p = obj;
      }
    }
    MarkCompactCollector::MarkObject(HeapObject::cast(obj));
  }

The ConsString check here is just an optimization that bypasses cons strings whose right-hand side is Heap::empty_string(). The actual marking records the mark in the object's map pointer and then pushes the object onto the stack.
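Recording the mark in the map word can be sketched as plain bit twiddling. The bit position and tagging scheme below are simplified assumptions for illustration only; the point is that once the bit is flipped, the word no longer reads as a valid map pointer, which is exactly how the marking code recognizes already-marked objects:

```cpp
#include <cassert>
#include <cstdint>

// Assumed mark bit in the low end of the map word (not V8's actual layout).
const intptr_t kMarkBit = 1;

intptr_t set_mark_bit(intptr_t map_word)   { return map_word | kMarkBit; }
intptr_t clear_mark_bit(intptr_t map_word) { return map_word & ~kMarkBit; }
bool is_marked(intptr_t map_word)          { return (map_word & kMarkBit) != 0; }
```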

compact

When compaction is needed, the following steps run:

EncodeForwardingAddresses

EncodeForwardingAddresses allocates the new space and records each new object's address in the old object's map pointer, or at the mirrored position in from_space. The code:

void MarkCompactCollector::EncodeForwardingAddresses() {
  ASSERT(state_ == ENCODE_FORWARDING_ADDRESSES);
  // Compute the forwarding pointers in each space.
  // Allocate new space and record the new object's address in the old object's map pointer
  // &mc_allocation_info tracks the allocation state (top, etc.)
  EncodeForwardingAddressesInPagedSpace<MCAllocateFromOldSpace,
                                        IgnoreNonLiveObject>(
      Heap::old_space());

  EncodeForwardingAddressesInPagedSpace<MCAllocateFromCodeSpace,
                                        LogNonLiveCodeObject>(
      Heap::code_space());

  // Compute new space next to last after the old and code spaces have been
  // compacted.  Objects in new space can be promoted to old or code space.
  // New-generation objects may be promoted to the old generation here
  // Note: the new space uses from_space to record the new object addresses; since to_space and from_space are the same size, the slot at the same offset in from_space records the corresponding new object's address
  EncodeForwardingAddressesInNewSpace();

  // Compute map space last because computing forwarding addresses
  // overwrites non-live objects.  Objects in the other spaces rely on
  // non-live map pointers to get the sizes of non-live objects.
  EncodeForwardingAddressesInPagedSpace<MCAllocateFromMapSpace,
                                        IgnoreNonLiveObject>(
      Heap::map_space());

  // Write relocation info to the top page, so we can use it later.  This is
  // done after promoting objects from the new space so we get the correct
  // allocation top.
  Heap::old_space()->MCWriteRelocationInfoToPage();
  Heap::code_space()->MCWriteRelocationInfoToPage();
  Heap::map_space()->MCWriteRelocationInfoToPage();
}


void MarkCompactCollector::SweepSpaces() {
  ASSERT(state_ == SWEEP_SPACES);
  ASSERT(!IsCompacting());
  // Noncompacting collections simply sweep the spaces to clear the mark
  // bits and free the nonlive blocks (for old and map spaces).  We sweep
  // the map space last because freeing non-live maps overwrites them and
  // the other spaces rely on possibly non-live maps to get the sizes for
  // non-live objects.
  SweepSpace(Heap::old_space(), &DeallocateOldBlock);
  SweepSpace(Heap::code_space(), &DeallocateCodeBlock);
  SweepSpace(Heap::new_space());
  SweepSpace(Heap::map_space(), &DeallocateMapBlock);
}

This handles each memory space in turn; below we focus on the handling of old_space and new_space.

The allocation-and-encoding code for old_space:

template<MarkCompactCollector::AllocationFunction Alloc,
         MarkCompactCollector::ProcessNonLiveFunction ProcessNonLive>
void MarkCompactCollector::EncodeForwardingAddressesInPagedSpace(
    PagedSpace* space) {
  PageIterator it(space, PageIterator::PAGES_IN_USE);
  while (it.has_next()) {
    Page* p = it.next();
    // The offset of each live object in the page from the first live object
    // in the page.
    int offset = 0;
    // Allocate new memory for each marked object and store the new object's offset in the old object
    // Processing is done per Page (memory page)
    EncodeForwardingAddressesInRange<Alloc,
                                     EncodeForwardingAddressInPagedSpace,
                                     ProcessNonLive>(
        p->ObjectAreaStart(),
        p->AllocationTop(),
        &offset);
  }
}

This iterates over every in-use page and runs EncodeForwardingAddressesInRange on it. The code:

// Function template that, given a range of addresses (eg, a semispace or a
// paged space page), iterates through the objects in the range to clear
// mark bits and compute and encode forwarding addresses.  As a side effect,
// maximal free chunks are marked so that they can be skipped on subsequent
// sweeps.
//
// The template parameters are an allocation function, a forwarding address
// encoding function, and a function to process non-live objects.
template<MarkCompactCollector::AllocationFunction Alloc,
         MarkCompactCollector::EncodingFunction Encode,
         MarkCompactCollector::ProcessNonLiveFunction ProcessNonLive>
inline void EncodeForwardingAddressesInRange(Address start,
                                             Address end,
                                             int* offset) {
  // The start address of the current free region while sweeping the space.
  // This address is set when a transition from live to non-live objects is
  // encountered.  A value (an encoding of the 'next free region' pointer)
  // is written to memory at this address when a transition from non-live to
  // live objects is encountered.
  Address free_start = NULL;

  // A flag giving the state of the previously swept object.  Initially true
  // to ensure that free_start is initialized to a proper address before
  // trying to write to it.
  bool is_prev_alive = true;

  int object_size;  // Will be set on each iteration of the loop.
  for (Address current = start; current < end; current += object_size) {
    HeapObject* object = HeapObject::FromAddress(current);
    if (is_marked(object)) {
      clear_mark(object);
      object_size = object->Size();

      Object* forwarded = Alloc(object, object_size);
      // Allocation cannot fail, because we are compacting the space.
      ASSERT(!forwarded->IsFailure());
      Encode(object, object_size, forwarded, offset);

#ifdef DEBUG
      if (FLAG_gc_verbose) {
        PrintF("forward %p -> %p.\n", object->address(),
               HeapObject::cast(forwarded)->address());
      }
#endif
      if (!is_prev_alive) {  // Transition from non-live to live.
        EncodeFreeRegion(free_start, current - free_start);
        is_prev_alive = true;
      }
    } else {  // Non-live object.
      object_size = object->Size();
      ProcessNonLive(object);
      if (is_prev_alive) {  // Transition from live to non-live.
        free_start = current;
        is_prev_alive = false;
      }
    }
  }

  // If we ended on a free region, mark it.
  if (!is_prev_alive) EncodeFreeRegion(free_start, end - free_start);
}


EncodeForwardingAddressesInRange does two things:

1.Allocate new memory: Alloc.
2.Record the new object's address: Encode.

Encode is the EncodeForwardingAddressInPagedSpace function passed in as a template parameter, which records the new object's address in the old object:

// The forwarding address is encoded in the map pointer of the object as an
// offset (in terms of live bytes) from the address of the first live object
// in the page.
// The forwarding address is expressed as the distance from the first live object in the page
// and is stored in the object's map pointer
inline void EncodeForwardingAddressInPagedSpace(HeapObject* old_object,
                                                int object_size,
                                                Object* new_object,
                                                int* offset) {
  // Record the forwarding address of the first live object if necessary.
  if (*offset == 0) {
    Page::FromAddress(old_object->address())->mc_first_forwarded =
        HeapObject::cast(new_object)->address();
  }

  uint32_t encoded = EncodePointers(old_object->map()->address(), *offset);
  old_object->set_map(reinterpret_cast<Map*>(encoded));
  *offset += object_size;
  ASSERT(*offset <= Page::kObjectAreaSize);
}

Note that the page containing old_object records, in its mc_first_forwarded field, the new address allocated for the page's first live object; the map pointer of each old_object then only stores the distance (the forwarding address) between its own new object and that first live object's new object. The forwarding address also encodes some page information (such as the page index).
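Decoding such a forwarding address is then just base-plus-offset arithmetic. A simplified sketch, with an invented PageInfo stand-in for V8's Page and the page-index bits ignored:

```cpp
#include <cassert>

// Simplified stand-in for V8's Page: records where the page's first live
// object was moved to.
struct PageInfo {
  int mc_first_forwarded;  // new address of the page's first live object
};

// The old object's map word stores its live-byte offset from the first
// live object; the new address is the base plus that offset.
int forwarded_address(const PageInfo& page, int encoded_offset) {
  return page.mc_first_forwarded + encoded_offset;
}
```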

The allocation-and-encoding code for the new generation:

// Functions to encode the forwarding pointers in each compactable space.
void MarkCompactCollector::EncodeForwardingAddressesInNewSpace() {
  int ignored;
  EncodeForwardingAddressesInRange<MCAllocateFromNewSpace,
                                   EncodeForwardingAddressInNewSpace,
                                   IgnoreNonLiveObject>(
      Heap::new_space()->bottom(),
      Heap::new_space()->top(),
      &ignored);
}

EncodeForwardingAddressesInRange performs the same two jobs, allocate and record, as in old_space, just with different functions plugged in.

New-space allocation was covered earlier; the one difference from the old generation is that allocating the new memory tries promotion first:

// Try to promote all objects in new space.  Heap numbers and sequential
// strings are promoted to the code space, all others to the old space.
inline Object* MCAllocateFromNewSpace(HeapObject* object, int object_size) {
  bool has_pointers = !object->IsHeapNumber() && !object->IsSeqString();
  Object* forwarded = has_pointers ?
      Heap::old_space()->MCAllocateRaw(object_size) :
      Heap::code_space()->MCAllocateRaw(object_size);

  if (forwarded->IsFailure()) {
    forwarded = Heap::new_space()->MCAllocateRaw(object_size);
  }
  return forwarded;
}

As you can see, allocation is first attempted in old_space (or code_space), and only falls back to new_space when that fails.

The way a new-space old_object records its new_object's address also differs from old_space; see the following code:

// The forwarding address is encoded at the same offset as the current
// to-space object, but in from space.
// The new_object's address is recorded in from_space at the same offset.
// The new_object may have been promoted to the old generation, or may
// still be in the new generation.
inline void EncodeForwardingAddressInNewSpace(HeapObject* old_object,
                                              int object_size,
                                              Object* new_object,
                                              int* ignored) {
  int offset =
      Heap::new_space()->ToSpaceOffsetForAddress(old_object->address());
  Memory::Address_at(Heap::new_space()->FromSpaceLow() + offset) =
      HeapObject::cast(new_object)->address();
}

As shown here, the new_object's address is recorded in from_space at the same offset as the old object has in to_space. Because to_space and from_space are the same size, from_space can serve as a scratch forwarding table: the slot at a given offset in from_space holds the new address of the object at that offset in to_space.

UpdatePointers

The code of UpdatePointers is as follows:

void MarkCompactCollector::UpdatePointers() {
#ifdef DEBUG
  ASSERT(state_ == ENCODE_FORWARDING_ADDRESSES);
  state_ = UPDATE_POINTERS;
#endif
  UpdatingVisitor updating_visitor;
  Heap::IterateRoots(&updating_visitor);
  GlobalHandles::IterateWeakRoots(&updating_visitor);

  int live_maps = IterateLiveObjects(Heap::map_space(),
                                     &UpdatePointersInOldObject);
  int live_olds = IterateLiveObjects(Heap::old_space(),
                                     &UpdatePointersInOldObject);
  int live_immutables = IterateLiveObjects(Heap::code_space(),
                                           &UpdatePointersInOldObject);
  int live_news = IterateLiveObjects(Heap::new_space(),
                                     &UpdatePointersInNewObject);

  // Large objects do not move, the map word can be updated directly.
  LargeObjectIterator it(Heap::lo_space());
  while (it.has_next()) UpdatePointersInNewObject(it.next());

  USE(live_maps);
  USE(live_olds);
  USE(live_immutables);
  USE(live_news);

#ifdef DEBUG
  ASSERT(live_maps == live_map_objects_);
  ASSERT(live_olds == live_old_objects_);
  ASSERT(live_immutables == live_immutable_objects_);
  ASSERT(live_news == live_young_objects_);

  if (FLAG_verify_global_gc) VerifyHeapAfterUpdatingPointers();
#endif
}

This phase updates every pointer that refers to a live object so that it points at the object's new address:

1. Iterate over the root objects and update their pointers to the newly allocated objects.
2. Iterate over all spaces and update the pointers inside live objects.

A few important points are worth walking through:

The lookup of the new address and the actual update happen in MarkCompactCollector::UpdatePointer:

// Look up the new address and update the pointer.
void MarkCompactCollector::UpdatePointer(Object** p) {
  // We need to check if p is in to_space.
  if (!(*p)->IsHeapObject()) return;

  HeapObject* obj = HeapObject::cast(*p);
  Address old_addr = obj->address();
  Address new_addr;

  ASSERT(!Heap::InFromSpace(obj));

  if (Heap::new_space()->Contains(obj)) {
    Address f_addr = Heap::new_space()->FromSpaceLow() +
                     Heap::new_space()->ToSpaceOffsetForAddress(old_addr);
    new_addr = Memory::Address_at(f_addr);

#ifdef DEBUG
    ASSERT(Heap::old_space()->Contains(new_addr) ||
           Heap::code_space()->Contains(new_addr) ||
           Heap::new_space()->FromSpaceContains(new_addr));

    if (Heap::new_space()->FromSpaceContains(new_addr)) {
      ASSERT(Heap::new_space()->FromSpaceOffsetForAddress(new_addr) <=
             Heap::new_space()->ToSpaceOffsetForAddress(old_addr));
    }
#endif

  } else if (Heap::lo_space()->Contains(obj)) {
    // Don't move objects in the large object space.
    new_addr = obj->address();

  } else {
    ASSERT(Heap::old_space()->Contains(obj) ||
           Heap::code_space()->Contains(obj) ||
           Heap::map_space()->Contains(obj));

    new_addr = GetForwardingAddressInOldSpace(obj);
    ASSERT(Heap::old_space()->Contains(new_addr) ||
           Heap::code_space()->Contains(new_addr) ||
           Heap::map_space()->Contains(new_addr));

#ifdef DEBUG
    if (Heap::old_space()->Contains(obj)) {
      ASSERT(Heap::old_space()->MCSpaceOffsetForAddress(new_addr) <=
             Heap::old_space()->MCSpaceOffsetForAddress(old_addr));
    } else if (Heap::code_space()->Contains(obj)) {
      ASSERT(Heap::code_space()->MCSpaceOffsetForAddress(new_addr) <=
             Heap::code_space()->MCSpaceOffsetForAddress(old_addr));
    } else {
      ASSERT(Heap::map_space()->MCSpaceOffsetForAddress(new_addr) <=
             Heap::map_space()->MCSpaceOffsetForAddress(old_addr));
    }
#endif
  }

  *p = HeapObject::FromAddress(new_addr);

#ifdef DEBUG
  if (FLAG_gc_verbose) {
    PrintF("update %p : %p -> %p\n",
           reinterpret_cast<Address>(p), old_addr, new_addr);
  }
#endif
}

Updating the pointer itself is a plain assignment; the interesting part is obtaining the new address:

1. For a new-space object, read it directly from the same offset in from_space.
2. For old-generation objects, use GetForwardingAddressInOldSpace.

GetForwardingAddressInOldSpace looks like this:

Address MarkCompactCollector::GetForwardingAddressInOldSpace(HeapObject* obj) {
  // Object should either in old or map space.
  uint32_t encoded = reinterpret_cast<uint32_t>(obj->map());

  // Offset to the first live object's forwarding address.
  int offset = DecodeOffset(encoded);
  Address obj_addr = obj->address();

  // Find the first live object's forwarding address.
  Page* p = Page::FromAddress(obj_addr);
  Address first_forwarded = p->mc_first_forwarded;

  // Page start address of forwarded address.
  Page* forwarded_page = Page::FromAddress(first_forwarded);
  int forwarded_offset = forwarded_page->Offset(first_forwarded);

  // Find end of allocation of in the page of first_forwarded.
  Address mc_top = forwarded_page->mc_relocation_top;
  int mc_top_offset = forwarded_page->Offset(mc_top);

  // Check if current object's forward pointer is in the same page
  // as the first live object's forwarding pointer
  // The forwarding address falls within the current page.
  if (forwarded_offset + offset < mc_top_offset) {
    // In the same page.
    return first_forwarded + offset;
  }

  // Not in the current page: carry over into the next page.
  // Must be in the next page, NOTE: this may cross chunks.
  Page* next_page = forwarded_page->next_page();
  ASSERT(next_page->is_valid());

  offset -= (mc_top_offset - forwarded_offset);
  offset += Page::kObjectStartOffset;

  ASSERT_PAGE_OFFSET(offset);
  ASSERT(next_page->OffsetToAddress(offset) < next_page->mc_relocation_top);

  return next_page->OffsetToAddress(offset);
}

This does the following:

1. Read the current page's mc_first_forwarded, i.e. the new address allocated to the first live object in the page.
2. Decode the offset from the object's map word.
3. Check whether the forwarded address lands in the same page:
	a. If it does, simply return first_forwarded + offset.
	b. If it does not (forwarded_offset + offset reaches past the page's allocation top), the address is in the next page; re-derive the offset and return next_page->OffsetToAddress(offset).
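The same-page/next-page arithmetic above can be sketched with plain integer offsets; kObjectStartOffset here is an illustrative page-header size, not V8's actual constant:

```cpp
#include <cassert>

constexpr int kObjectStartOffset = 32;  // page header size (illustrative)

// forwarded_offset: page offset of the first live object's new address
// mc_top_offset:    page offset of the allocation top in that page
// offset:           decoded distance of our object's new address from
//                   first_forwarded
// Returns the page offset of the object's new address, and sets
// *next_page when the address spills into the following page.
int ForwardedPageOffset(int forwarded_offset, int mc_top_offset, int offset,
                        bool* next_page) {
  if (forwarded_offset + offset < mc_top_offset) {
    *next_page = false;                 // same page: simple addition
    return forwarded_offset + offset;
  }
  *next_page = true;                    // spills into the next page:
  // subtract what fit in the first page, then skip the next page's header
  return offset - (mc_top_offset - forwarded_offset) + kObjectStartOffset;
}
```

The carry-over branch mirrors the `offset -= (mc_top_offset - forwarded_offset); offset += Page::kObjectStartOffset;` lines in the real code.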
RelocateObjects

RelocateObjects copies each original object into its newly allocated memory; the code is:

void MarkCompactCollector::RelocateObjects() {
#ifdef DEBUG
  ASSERT(state_ == UPDATE_POINTERS);
  state_ = RELOCATE_OBJECTS;
#endif
  // Relocates objects, always relocate map objects first. Relocating
  // objects in other space relies on map objects to get object size.
  int live_maps = IterateLiveObjects(Heap::map_space(), &RelocateMapObject);
  int live_olds = IterateLiveObjects(Heap::old_space(), &RelocateOldObject);
  int live_immutables =
      IterateLiveObjects(Heap::code_space(), &RelocateCodeObject);
  int live_news = IterateLiveObjects(Heap::new_space(), &RelocateNewObject);

  USE(live_maps);
  USE(live_olds);
  USE(live_immutables);
  USE(live_news);
#ifdef DEBUG
  ASSERT(live_maps == live_map_objects_);
  ASSERT(live_olds == live_old_objects_);
  ASSERT(live_immutables == live_immutable_objects_);
  ASSERT(live_news == live_young_objects_);
#endif

  // Notify code object in LO to convert IC target to address
  // This must happen after lo_space_->Compact
  LargeObjectIterator it(Heap::lo_space());
  while (it.has_next()) { ConvertCodeICTargetToAddress(it.next()); }

  // Flips from and to spaces
  Heap::new_space()->Flip();

  // Sets age_mark to bottom in to space
  Address mark = Heap::new_space()->bottom();
  Heap::new_space()->set_age_mark(mark);

  Heap::new_space()->MCCommitRelocationInfo();
#ifdef DEBUG
  // It is safe to write to the remembered sets as remembered sets on a
  // page-by-page basis after committing the m-c forwarding pointer.
  Page::set_rset_state(Page::IN_USE);
#endif
  Heap::map_space()->MCCommitRelocationInfo();
  Heap::old_space()->MCCommitRelocationInfo();
  Heap::code_space()->MCCommitRelocationInfo();

#ifdef DEBUG
  if (FLAG_verify_global_gc) VerifyHeapAfterRelocatingObjects();
#endif
}

This iterates over the live objects of every space and copies them; the copying code (shown here for map objects) looks like this:

int MarkCompactCollector::RelocateMapObject(HeapObject* obj) {
  // decode map pointer (forwarded address)
  uint32_t encoded = reinterpret_cast<uint32_t>(obj->map());
  Address map_addr = DecodeMapPointer(encoded, Heap::map_space());
  ASSERT(Heap::map_space()->Contains(HeapObject::FromAddress(map_addr)));

  // Get forwarding address before resetting map pointer
  Address new_addr = GetForwardingAddressInOldSpace(obj);

  // recover map pointer
  obj->set_map(reinterpret_cast<Map*>(HeapObject::FromAddress(map_addr)));

  // The meta map object may not be copied yet.
  Address old_addr = obj->address();

  if (new_addr != old_addr) {
    memmove(new_addr, old_addr, Map::kSize);  // copy contents
  }

#ifdef DEBUG
  if (FLAG_gc_verbose) {
    PrintF("relocate %p -> %p\n", old_addr, new_addr);
  }
#endif

  return Map::kSize;
}

This does two things:

1. Obtain the new address (with the same lookup logic explained above).
2. Copy the object's memory with memmove.
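It is memmove rather than memcpy because compaction slides objects toward the start of the same space, so the source and destination byte ranges can overlap. A minimal demonstration of that design choice:

```cpp
#include <cstring>
#include <string>

// Slides `len` bytes from offset `from` down to offset `to` (to <= from)
// within the same buffer and returns the result for inspection.  The
// ranges may overlap, which memcpy does not permit but memmove does.
std::string SlideDown(std::string area, size_t len, size_t from, size_t to) {
  std::memmove(&area[to], &area[from], len);  // overlap-safe copy
  return area;
}
```

With memcpy the overlapping case would be undefined behavior; memmove guarantees the bytes are copied as if through an intermediate buffer.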
RebuildRSets
void MarkCompactCollector::RebuildRSets() {
#ifdef DEBUG
  ASSERT(state_ == RELOCATE_OBJECTS);
  state_ = REBUILD_RSETS;
#endif
  Heap::RebuildRSets();
}

This step rebuilds (resets) the remembered sets. A remembered set records, per page, the locations of words that hold pointers into new space, so that later collections can find cross-generation pointers quickly instead of scanning entire pages.

SweepSpaces

SweepSpaces reclaims memory without rearranging it the way compaction does; it still operates on top of the marking results. Its entry point is:

void MarkCompactCollector::SweepSpaces() {
  ASSERT(state_ == SWEEP_SPACES);
  ASSERT(!IsCompacting());
  // Noncompacting collections simply sweep the spaces to clear the mark
  // bits and free the nonlive blocks (for old and map spaces).  We sweep
  // the map space last because freeing non-live maps overwrites them and
  // the other spaces rely on possibly non-live maps to get the sizes for
  // non-live objects.
  SweepSpace(Heap::old_space(), &DeallocateOldBlock);
  SweepSpace(Heap::code_space(), &DeallocateCodeBlock);
  SweepSpace(Heap::new_space());
  SweepSpace(Heap::map_space(), &DeallocateMapBlock);
}

Here each space is swept with SweepSpace; as before, pagedSpace and newSpace are handled by different overloads of the function.

For a pagedSpace, SweepSpace looks like this:

static void SweepSpace(PagedSpace* space, DeallocateFunction dealloc) {
  PageIterator it(space, PageIterator::PAGES_IN_USE);
  // Iterate over every page.
  while (it.has_next()) {
    Page* p = it.next();

    bool is_previous_alive = true;
    Address free_start = NULL;
    HeapObject* object;

    for (Address current = p->ObjectAreaStart();
         current < p->AllocationTop();
         current += object->Size()) {
      object = HeapObject::FromAddress(current);
      if (is_marked(object)) {
        clear_mark(object);
        if (MarkCompactCollector::IsCompacting() && object->IsCode()) {
          // If this is compacting collection marked code objects have had
          // their IC targets converted to objects.
          // They need to be converted back to addresses.
          Code::cast(object)->ConvertICTargetsFromObjectToAddress();
        }
        if (!is_previous_alive) {  // Transition from free to live.
          dealloc(free_start, current - free_start);
          is_previous_alive = true;
        }
      } else {
        if (object->IsCode()) {
          LOG(CodeDeleteEvent(Code::cast(object)->address()));
        }
        if (is_previous_alive) {  // Transition from live to free.
          free_start = current;
          is_previous_alive = false;
        }
      }
      // The object is now unmarked for the call to Size() at the top of the
      // loop.
    }

    // If the last region was not live we need to from free_start to the
    // allocation top in the page.
    if (!is_previous_alive) {
      int free_size = p->AllocationTop() - free_start;
      if (free_size > 0) {
        dealloc(free_start, free_size);
      }
    }
  }
}

This walks every object in every page; an unmarked object is garbage and its range is handed to the supplied DeallocateFunction. For old_space that function is DeallocateOldBlock:

void MarkCompactCollector::DeallocateOldBlock(Address start,
                                              int size_in_bytes) {
  Heap::ClearRSetRange(start, size_in_bytes);
  Heap::old_space()->Free(start, size_in_bytes);
}

That is, the range's remembered-set bits are cleared and the space is added to the free_list.
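The sweep loop above is essentially a two-state scan: it walks objects in address order and emits one dealloc() call per maximal run of dead objects. A toy version over equally sized slots (marks[i] == true means live) makes the transitions explicit:

```cpp
#include <utility>
#include <vector>

// Returns the maximal runs of dead slots as (start index, length) pairs,
// mirroring the live->free / free->live transitions in SweepSpace.
std::vector<std::pair<int, int>> FreeRuns(const std::vector<bool>& marks) {
  std::vector<std::pair<int, int>> runs;
  bool prev_alive = true;
  int free_start = 0;
  for (int i = 0; i < static_cast<int>(marks.size()); i++) {
    if (marks[i]) {
      if (!prev_alive) {                       // transition: free -> live
        runs.push_back({free_start, i - free_start});
        prev_alive = true;
      }
    } else if (prev_alive) {                   // transition: live -> free
      free_start = i;
      prev_alive = false;
    }
  }
  if (!prev_alive)                             // trailing free run
    runs.push_back({free_start, static_cast<int>(marks.size()) - free_start});
  return runs;
}
```

Coalescing adjacent dead objects into one range means one free-list entry per hole rather than one per object.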

The new-space SweepSpace is:

static void SweepSpace(NewSpace* space) {
  HeapObject* object;
  for (Address current = space->bottom();
       current < space->top();
       current += object->Size()) {
    object = HeapObject::FromAddress(current);
    if (is_marked(object)) {
      clear_mark(object);
    } else {
      // We give non-live objects a map that will correctly give their size,
      // since their existing map might not be live after the collection.
      // Replace the object's map, because the map it currently points to
      // may be freed when the map space is swept below.
      int size = object->Size();
      if (size >= Array::kHeaderSize) {
        object->set_map(Heap::byte_array_map());
        ByteArray::cast(object)->set_length(ByteArray::LengthFor(size));
      } else {
        ASSERT(size == kPointerSize);
        object->set_map(Heap::one_word_filler_map());
      }
      ASSERT(object->Size() == size);
    }
    // The object is now unmarked for the call to Size() at the top of the
    // loop.
  }
}

Here a dead object's map pointer is simply replaced with a filler/byte-array map that still reports the correct size, because the object's real map may be freed when the map space is swept afterwards.

Finish

Finish clears the StubCache. The code is:

void MarkCompactCollector::Finish() {
#ifdef DEBUG
  ASSERT(state_ == SWEEP_SPACES || state_ == REBUILD_RSETS);
  state_ = IDLE;
#endif
  // The stub cache is not traversed during GC; clear the cache to
  // force lazy re-initialization of it. This must be done after the
  // GC, because it relies on the new address of certain old space
  // objects (empty string, illegal builtin).
  StubCache::Clear();
}

A stub typically contains optimized code for handling the specific kinds of operations that an IC (inline cache) has encountered before. When a stub meets an operation its optimized code cannot handle, it calls into the C++ runtime. After the runtime has handled the operation, it generates a new stub that covers this case (along with the previous ones).

Shrink

Shrink contracts the spaces; it is applied to map_space_, old_space_, and code_space_:

void Heap::Shrink() {
  // Try to shrink map, old, and code spaces.
  map_space_->Shrink();
  old_space_->Shrink();
  code_space_->Shrink();
}

Each of these ends up in PagedSpace::Shrink:

void PagedSpace::Shrink() {
  // Release half of free pages.
  // i.e. keep every other free page and release the rest.
  Page* top_page = AllocationTopPage();
  ASSERT(top_page->is_valid());

  // Loop over the pages from the top page to the end of the space to count
  // the number of pages to keep and find the last page to keep.
  int free_pages = 0;
  int pages_to_keep = 0;  // Of the free pages.
  Page* last_page_to_keep = top_page;
  Page* current_page = top_page->next_page();
  // Loop over the pages to the end of the space.
  while (current_page->is_valid()) {
    // Keep every odd-numbered page, one page for every two in the space.
    if ((free_pages & 0x1) == 1) {
      pages_to_keep++;
      last_page_to_keep = last_page_to_keep->next_page();
    }
    free_pages++;
    current_page = current_page->next_page();
  }

  // Free pages after last_page_to_keep, and adjust the next_page link.
  Page* p = MemoryAllocator::FreePages(last_page_to_keep->next_page());
  MemoryAllocator::SetNextPage(last_page_to_keep, p);

  // Since pages are only freed in whole chunks, we may have kept more than
  // pages_to_keep.
  while (p->is_valid()) {
    pages_to_keep++;
    p = p->next_page();
  }

  // The difference between free_pages and pages_to_keep is the number of
  // pages actually freed.
  ASSERT(pages_to_keep <= free_pages);
  int bytes_freed = (free_pages - pages_to_keep) * Page::kObjectAreaSize;
  accounting_stats_.ShrinkSpace(bytes_freed);

  ASSERT(Capacity() == CountTotalPages() * Page::kObjectAreaSize);
}

In effect this releases roughly half of the free pages at the tail of the paged space.
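The counting loop's "keep every odd-numbered free page" rule amounts to keeping floor(n/2) of n free pages, which can be checked directly:

```cpp
// Counts how many of `free_pages` pages the Shrink loop above would keep:
// every odd-numbered free page (index 1, 3, 5, ...) survives, i.e.
// floor(free_pages / 2) pages.
int PagesKept(int free_pages) {
  int kept = 0;
  for (int i = 0; i < free_pages; i++) {
    if ((i & 0x1) == 1) kept++;  // mirrors `if ((free_pages & 0x1) == 1)`
  }
  return kept;
}
```

(As the real code notes, whole chunks are freed at once, so the actual number kept can end up slightly higher than this count.)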

Summary

This article walked through V8's memory management from a source-code perspective. It may have little direct bearing on day-to-day work, but it is an interesting read that broadens one's knowledge considerably.

