bmalloc: Don't use a whole page for metadata
https://bugs.webkit.org/show_bug.cgi?id=154510

Reviewed by Andreas Kling.

(1) Don't round up metadata to a page boundary. This saves 1.5% dirty
memory on iOS and 0.2% on Mac. It also enables a future patch to allocate
smaller chunks without wasting memory.

(2) Initialize metadata lazily. This saves dirty memory when the program
allocates primarily small or large objects (but not both), leaving some
metadata uninitialized.
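
As a standalone illustration of the lazy scheme (hypothetical types and
names, not bmalloc's actual code): reserve storage for both kinds of
metadata up front, but run each constructor only on first use, so a program
that never touches one kind never dirties that kind's metadata.

    #include <new>

    struct SmallMeta { unsigned lineRefCounts[64] = { }; };
    struct LargeMeta { unsigned boundaryTags[64] = { }; };

    class ChunkMetadata {
    public:
        SmallMeta* smallMeta()
        {
            if (!m_small)
                m_small = new (m_smallStorage) SmallMeta; // constructed lazily, on first use
            return m_small;
        }

        LargeMeta* largeMeta()
        {
            if (!m_large)
                m_large = new (m_largeStorage) LargeMeta; // the other kind stays untouched
            return m_large;
        }

    private:
        alignas(SmallMeta) char m_smallStorage[sizeof(SmallMeta)];
        alignas(LargeMeta) char m_largeStorage[sizeof(LargeMeta)];
        SmallMeta* m_small { nullptr };
        LargeMeta* m_large { nullptr };
    };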

* bmalloc.xcodeproj/project.pbxproj: Medium objects are gone now.

* bmalloc/BumpAllocator.h:
(bmalloc::BumpAllocator::refill): Added an ASSERT to help debug a bug
I caused while working on this patch.

* bmalloc/Heap.cpp:
(bmalloc::Heap::allocateSmallBumpRanges): Ditto.

(bmalloc::Heap::splitAndAllocate):
(bmalloc::Heap::allocateLarge): Updated for interface change.

* bmalloc/LargeChunk.h: Changed the boundaryTagCount calculation to
a static_assert.

Don't round up to page boundary. (See above.)

(bmalloc::LargeChunk::LargeChunk): Moved code here from LargeChunk::init.
A constructor is a more natural / automatic way to do this initialization.

* bmalloc/LargeObject.h:
(bmalloc::LargeObject::init): Deleted. Moved to LargeChunk.

* bmalloc/Sizes.h: Changed largeChunkMetadataSize to a simpler constant
because metadata size no longer varies by page size.
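
For concreteness, a quick back-of-the-envelope check using the constants in
the new Sizes.h (assuming the flexible m_memory member and alignment padding
contribute nothing to sizeof(LargeChunk)):

    superChunkSize         = 2 MB
    largeChunkSize         = superChunkSize / 2        = 1 MB
    largeMin               = smallMax                  = 1 kB
    boundaryTagCount       = largeChunkSize / largeMin = 1024
    largeChunkMetadataSize = 4 kB (static_asserted to equal sizeof(LargeChunk))

That works out to roughly 4 bytes per BoundaryTag, and it means metadata now
costs 4 kB per large chunk on every platform instead of being rounded up to
a whole 16 kB page on iOS.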

* bmalloc/SmallChunk.h:
(bmalloc::SmallChunk::begin):
(bmalloc::SmallChunk::end):
(bmalloc::SmallChunk::lines):
(bmalloc::SmallChunk::pages): Use std::array to make begin/end
calculations easier.

(bmalloc::SmallChunk::SmallChunk): Treat our metadata like a series
of allocated objects. We used to avoid trampling our metadata by
starting object memory at the next page. Now we share the first page
between metadata and objects, and we account for metadata explicitly.

* bmalloc/SuperChunk.h:
(bmalloc::SuperChunk::SuperChunk):
(bmalloc::SuperChunk::smallChunk):
(bmalloc::SuperChunk::largeChunk):
(bmalloc::SuperChunk::create): Deleted. Don't eagerly run the SmallChunk
and LargeChunk constructors. We'll run them lazily as needed.
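
A minimal sketch of the shape this implies, with the offset constants taken
from the new Sizes.h; the accessor bodies are an assumption for illustration,
not a verbatim copy of SuperChunk.h. The point is that the accessors only
compute addresses inside the reserved region, and neither chunk constructor
runs until VMHeap constructs that chunk with placement new.

    #include <cstddef>

    class SmallChunk;
    class LargeChunk;

    static const size_t superChunkSize = 2 * 1024 * 1024;
    static const size_t smallChunkOffset = superChunkSize / 2; // per the new Sizes.h
    static const size_t largeChunkOffset = 0;                  // per the new Sizes.h

    class SuperChunk {
    public:
        SmallChunk* smallChunk()
        {
            return reinterpret_cast<SmallChunk*>(
                reinterpret_cast<char*>(this) + smallChunkOffset);
        }

        LargeChunk* largeChunk()
        {
            return reinterpret_cast<LargeChunk*>(
                reinterpret_cast<char*>(this) + largeChunkOffset);
        }
    };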

* bmalloc/VMHeap.cpp:
(bmalloc::VMHeap::VMHeap):
(bmalloc::VMHeap::allocateSmallChunk):
(bmalloc::VMHeap::allocateLargeChunk):
(bmalloc::VMHeap::allocateSuperChunk):
(bmalloc::VMHeap::grow): Deleted. Track small and large chunks explicitly
so we can initialize them lazily.

* bmalloc/VMHeap.h:
(bmalloc::VMHeap::allocateSmallPage):
(bmalloc::VMHeap::allocateLargeObject): Specify whether we're allocating
a small or large chunk since we don't allocate both at once anymore.


Canonical link: https://commits.webkit.org/172615@main
git-svn-id: https://svn.webkit.org/repository/webkit/trunk@196873 268f45cc-cd09-0410-ab3c-d52691b4dbfc
geoffreygaren committed Feb 21, 2016
1 parent 610d0b0 commit 9bc62cb
Showing 11 changed files with 207 additions and 131 deletions.
73 changes: 73 additions & 0 deletions Source/bmalloc/ChangeLog
@@ -1,3 +1,76 @@
2016-02-21 Geoffrey Garen <ggaren@apple.com>

bmalloc: Don't use a whole page for metadata
https://bugs.webkit.org/show_bug.cgi?id=154510

Reviewed by Andreas Kling.

(1) Don't round up metadata to a page boundary. This saves 1.5% dirty
memory on iOS and 0.2% on Mac. It also enables a future patch to allocate
smaller chunks without wasting memory.

(2) Initialize metadata lazily. This saves dirty memory when the program
allocates primarily small or large objects (but not both), leaving some
metadata uninitialized.

* bmalloc.xcodeproj/project.pbxproj: Medium objects are gone now.

* bmalloc/BumpAllocator.h:
(bmalloc::BumpAllocator::refill): Added an ASSERT to help debug a bug
I caused while working on this patch.

* bmalloc/Heap.cpp:
(bmalloc::Heap::allocateSmallBumpRanges): Ditto.

(bmalloc::Heap::splitAndAllocate):
(bmalloc::Heap::allocateLarge): Updated for interface change.

* bmalloc/LargeChunk.h: Changed the boundaryTagCount calculation to
a static_assert.

Don't round up to page boundary. (See above.)

(bmalloc::LargeChunk::LargeChunk): Moved code here from LargeChunk::init.
A constructor is a more natural / automatic way to do this initialization.

* bmalloc/LargeObject.h:
(bmalloc::LargeObject::init): Deleted. Moved to LargeChunk.

* bmalloc/Sizes.h: Changed largeChunkMetadataSize to a simpler constant
because metadata size no longer varies by page size.

* bmalloc/SmallChunk.h:
(bmalloc::SmallChunk::begin):
(bmalloc::SmallChunk::end):
(bmalloc::SmallChunk::lines):
(bmalloc::SmallChunk::pages): Use std::array to make begin/end
calculations easier.

(bmalloc::SmallChunk::SmallChunk): Treat our metadata like a series
of allocated objects. We used to avoid trampling our metadata by
starting object memory at the next page. Now we share the first page
between metadata and objects, and we account for metadata explicitly.

* bmalloc/SuperChunk.h:
(bmalloc::SuperChunk::SuperChunk):
(bmalloc::SuperChunk::smallChunk):
(bmalloc::SuperChunk::largeChunk):
(bmalloc::SuperChunk::create): Deleted. Don't eagerly run the SmallChunk
and LargeChunk constructors. We'll run them lazily as needed.

* bmalloc/VMHeap.cpp:
(bmalloc::VMHeap::VMHeap):
(bmalloc::VMHeap::allocateSmallChunk):
(bmalloc::VMHeap::allocateLargeChunk):
(bmalloc::VMHeap::allocateSuperChunk):
(bmalloc::VMHeap::grow): Deleted. Track small and large chunks explicitly
so we can initialize them lazily.

* bmalloc/VMHeap.h:
(bmalloc::VMHeap::allocateSmallPage):
(bmalloc::VMHeap::allocateLargeObject): Specify whether we're allocating
a small or large chunk since we don't allocate both at once anymore.

2016-02-20 Mark Lam <mark.lam@apple.com>

Use of inlined asm statements causes problems for -std=c99 builds.
6 changes: 3 additions & 3 deletions Source/bmalloc/bmalloc.xcodeproj/project.pbxproj
@@ -165,7 +165,7 @@
1448C2FD18F3752B00502839 /* api */,
14D9DB4D17F2865C00EAAB79 /* cache */,
147AAA9C18CE6010002201E4 /* heap: large */,
147AAA9A18CE5FD3002201E4 /* heap: small | medium */,
147AAA9A18CE5FD3002201E4 /* heap: small */,
14D9DB4E17F2866E00EAAB79 /* heap */,
14D9DB4F17F2868900EAAB79 /* stdlib */,
14B650C418F39F4800751968 /* Configurations */,
@@ -182,14 +182,14 @@
name = Products;
sourceTree = "<group>";
};
147AAA9A18CE5FD3002201E4 /* heap: small | medium */ = {
147AAA9A18CE5FD3002201E4 /* heap: small */ = {
isa = PBXGroup;
children = (
147AAA8C18CD36A7002201E4 /* SmallChunk.h */,
1452478618BC757C00F80098 /* SmallLine.h */,
143E29ED18CAE90500FE8A0F /* SmallPage.h */,
);
name = "heap: small | medium";
name = "heap: small";
sourceTree = "<group>";
};
147AAA9C18CE6010002201E4 /* heap: large */ = {
1 change: 1 addition & 0 deletions Source/bmalloc/bmalloc/BumpAllocator.h
@@ -99,6 +99,7 @@ inline void BumpAllocator::refill(const BumpRange& bumpRange)
BASSERT(!canAllocate());
m_ptr = bumpRange.begin;
m_remaining = bumpRange.objectCount;
BASSERT(canAllocate());
}

inline void BumpAllocator::clear()
10 changes: 6 additions & 4 deletions Source/bmalloc/bmalloc/Heap.cpp
@@ -125,6 +125,7 @@ void Heap::allocateSmallBumpRanges(std::lock_guard<StaticMutex>& lock, size_t si
// In a fragmented page, some free ranges might not fit in the cache.
if (rangeCache.size() == rangeCache.capacity()) {
m_smallPagesWithFreeLines[sizeClass].push(page);
BASSERT(allocator.canAllocate());
return;
}

@@ -153,6 +154,7 @@ void Heap::allocateSmallBumpRanges(std::lock_guard<StaticMutex>& lock, size_t si
rangeCache.push({ begin, objectCount });
}

BASSERT(allocator.canAllocate());
page->setHasFreeLines(lock, false);
}

@@ -304,15 +306,15 @@ inline LargeObject& Heap::splitAndAllocate(LargeObject& largeObject, size_t alig
return largeObject;
}

void* Heap::allocateLarge(std::lock_guard<StaticMutex>&, size_t size)
void* Heap::allocateLarge(std::lock_guard<StaticMutex>& lock, size_t size)
{
BASSERT(size <= largeMax);
BASSERT(size >= largeMin);
BASSERT(size == roundUpToMultipleOf<largeAlignment>(size));

LargeObject largeObject = m_largeObjects.take(size);
if (!largeObject)
largeObject = m_vmHeap.allocateLargeObject(size);
largeObject = m_vmHeap.allocateLargeObject(lock, size);

if (largeObject.vmState().hasVirtual()) {
m_isAllocatingPages = true;
@@ -326,7 +328,7 @@ void* Heap::allocateLarge(std::lock_guard<StaticMutex>&, size_t size)
return largeObject.begin();
}

void* Heap::allocateLarge(std::lock_guard<StaticMutex>&, size_t alignment, size_t size, size_t unalignedSize)
void* Heap::allocateLarge(std::lock_guard<StaticMutex>& lock, size_t alignment, size_t size, size_t unalignedSize)
{
BASSERT(size <= largeMax);
BASSERT(size >= largeMin);
@@ -340,7 +342,7 @@ void* Heap::allocateLarge(std::lock_guard<StaticMutex>&, size_t alignment, size_

LargeObject largeObject = m_largeObjects.take(alignment, size, unalignedSize);
if (!largeObject)
largeObject = m_vmHeap.allocateLargeObject(alignment, size, unalignedSize);
largeObject = m_vmHeap.allocateLargeObject(lock, alignment, size, unalignedSize);

if (largeObject.vmState().hasVirtual()) {
m_isAllocatingPages = true;
54 changes: 36 additions & 18 deletions Source/bmalloc/bmalloc/LargeChunk.h
@@ -31,12 +31,13 @@
#include "ObjectType.h"
#include "Sizes.h"
#include "VMAllocate.h"
#include <array>

namespace bmalloc {

class LargeChunk {
public:
static LargeChunk* create();
LargeChunk();
static LargeChunk* get(void*);

static BeginTag* beginTag(void*);
@@ -46,8 +47,8 @@ class LargeChunk {
char* end() { return reinterpret_cast<char*>(this) + largeChunkSize; }

private:
// Round up to ensure 2 dummy boundary tags -- for the left and right sentinels.
static const size_t boundaryTagCount = max(2 * largeMin / sizeof(BoundaryTag), largeChunkSize / largeMin);
static const size_t boundaryTagCount = largeChunkSize / largeMin;
static_assert(boundaryTagCount > 2, "LargeChunk must have space for two sentinel boundary tags");

// Our metadata layout includes a left and right edge sentinel.
// Metadata takes up enough space to leave at least the first two
@@ -63,23 +64,40 @@ class LargeChunk {
//
// We use the X's for boundary tags and the O's for edge sentinels.

BoundaryTag m_boundaryTags[boundaryTagCount];

// Align to vmPageSize to avoid sharing physical pages with metadata.
// Otherwise, we'll confuse the scavenger into trying to scavenge metadata.
// FIXME: Below #ifdef workaround fix should be removed after all linux based ports bump
// own gcc version. See https://bugs.webkit.org/show_bug.cgi?id=140162#c87
#if BPLATFORM(IOS)
char m_memory[] __attribute__((aligned(16384)));
static_assert(vmPageSize == 16384, "vmPageSize and alignment must be same");
#else
char m_memory[] __attribute__((aligned(4096)));
static_assert(vmPageSize == 4096, "vmPageSize and alignment must be same");
#endif
std::array<BoundaryTag, boundaryTagCount> m_boundaryTags;
char m_memory[] __attribute__((aligned(largeAlignment)));
};

static_assert(largeChunkMetadataSize == sizeof(LargeChunk), "'largeChunkMetadataSize' should be the same number as sizeof(LargeChunk) or our computation in Sizes.h for 'largeMax' is wrong");
static_assert(largeChunkMetadataSize + largeMax <= largeChunkSize, "We will think we can accommodate larger objects than we can in reality");
static_assert(largeChunkMetadataSize == sizeof(LargeChunk), "Our largeChunkMetadataSize math in Sizes.h is wrong");
static_assert(largeChunkMetadataSize + largeMax == largeChunkSize, "largeMax is too small or too big");

inline LargeChunk::LargeChunk()
{
Range range(begin(), end() - begin());
BASSERT(range.size() == largeMax);

BeginTag* beginTag = LargeChunk::beginTag(range.begin());
beginTag->setRange(range);
beginTag->setFree(true);
beginTag->setVMState(VMState::Virtual);

EndTag* endTag = LargeChunk::endTag(range.begin(), range.size());
endTag->init(beginTag);

// Mark the left and right edges of our range as allocated. This naturally
// prevents merging logic from overflowing left (into metadata) or right
// (beyond our chunk), without requiring special-case checks.

EndTag* leftSentinel = beginTag->prev();
BASSERT(leftSentinel >= m_boundaryTags.begin());
BASSERT(leftSentinel < m_boundaryTags.end());
leftSentinel->initSentinel();

BeginTag* rightSentinel = endTag->next();
BASSERT(rightSentinel >= m_boundaryTags.begin());
BASSERT(rightSentinel < m_boundaryTags.end());
rightSentinel->initSentinel();
}

inline LargeChunk* LargeChunk::get(void* object)
{
29 changes: 0 additions & 29 deletions Source/bmalloc/bmalloc/LargeObject.h
@@ -35,8 +35,6 @@ namespace bmalloc {

class LargeObject {
public:
static Range init(LargeChunk*);

LargeObject();
LargeObject(void*);

@@ -271,33 +269,6 @@ inline void LargeObject::validate() const
}
}

inline Range LargeObject::init(LargeChunk* chunk)
{
Range range(chunk->begin(), chunk->end() - chunk->begin());

BeginTag* beginTag = LargeChunk::beginTag(range.begin());
beginTag->setRange(range);
beginTag->setFree(true);
beginTag->setVMState(VMState::Virtual);

EndTag* endTag = LargeChunk::endTag(range.begin(), range.size());
endTag->init(beginTag);

// Mark the left and right edges of our chunk as allocated. This naturally
// prevents merging logic from overflowing beyond our chunk, without requiring
// special-case checks.

EndTag* leftSentinel = beginTag->prev();
BASSERT(leftSentinel >= static_cast<void*>(chunk));
leftSentinel->initSentinel();

BeginTag* rightSentinel = endTag->next();
BASSERT(rightSentinel < static_cast<void*>(range.begin()));
rightSentinel->initSentinel();

return range;
}

} // namespace bmalloc

#endif // LargeObject_h
19 changes: 7 additions & 12 deletions Source/bmalloc/bmalloc/Sizes.h
@@ -56,28 +56,23 @@ namespace Sizes {
static const size_t superChunkSize = 2 * MB;
static const size_t superChunkMask = ~(superChunkSize - 1);

static const size_t smallMax = 1024;
static const size_t smallLineSize = 256;
static const size_t smallLineCount = vmPageSize / smallLineSize;
static const size_t smallLineMask = ~(smallLineSize - 1ul);

static const size_t smallChunkSize = superChunkSize / 2;
static const size_t smallChunkOffset = superChunkSize / 2;
static const size_t smallChunkMask = ~(smallChunkSize - 1ul);

static const size_t smallMax = 1024;
static const size_t smallLineSize = 256;
static const size_t smallLineCount = vmPageSize / smallLineSize;

static const size_t largeChunkSize = superChunkSize / 2;
#if BPLATFORM(IOS)
static const size_t largeChunkMetadataSize = 16 * kB;
#else
static const size_t largeChunkMetadataSize = 4 * kB;
#endif
static const size_t largeChunkOffset = 0;
static const size_t largeChunkMask = ~(largeChunkSize - 1ul);

static const size_t largeAlignment = 64;
static const size_t largeMax = largeChunkSize - largeChunkMetadataSize;
static const size_t largeMin = smallMax;

static const size_t largeChunkMetadataSize = 4 * kB; // sizeof(LargeChunk)
static const size_t largeMax = largeChunkSize - largeChunkMetadataSize;

static const size_t xLargeAlignment = vmPageSize;
static const size_t xLargeMax = std::numeric_limits<size_t>::max() - xLargeAlignment; // Make sure that rounding up to xLargeAlignment does not overflow.

53 changes: 29 additions & 24 deletions Source/bmalloc/bmalloc/SmallChunk.h
@@ -35,37 +35,42 @@ namespace bmalloc {

class SmallChunk {
public:
SmallChunk(std::lock_guard<StaticMutex>&);

static SmallChunk* get(void*);

SmallPage* begin() { return SmallPage::get(SmallLine::get(m_memory)); }
SmallPage* end() { return &m_pages[pageCount]; }
SmallPage* end() { return m_pages.end(); }

SmallLine* lines() { return m_lines.begin(); }
SmallPage* pages() { return m_pages.begin(); }

SmallLine* lines() { return m_lines; }
SmallPage* pages() { return m_pages; }

private:
static_assert(!(vmPageSize % smallLineSize), "vmPageSize must be an even multiple of line size");
static_assert(!(smallChunkSize % smallLineSize), "chunk size must be an even multiple of line size");

static const size_t lineCount = smallChunkSize / smallLineSize;
static const size_t pageCount = smallChunkSize / vmPageSize;

SmallLine m_lines[lineCount];
SmallPage m_pages[pageCount];

// Align to vmPageSize to avoid sharing physical pages with metadata.
// Otherwise, we'll confuse the scavenger into trying to scavenge metadata.
// FIXME: Below #ifdef workaround fix should be removed after all linux based ports bump
// own gcc version. See https://bugs.webkit.org/show_bug.cgi?id=140162#c87
#if BPLATFORM(IOS)
char m_memory[] __attribute__((aligned(16384)));
static_assert(vmPageSize == 16384, "vmPageSize and alignment must be same");
#else
char m_memory[] __attribute__((aligned(4096)));
static_assert(vmPageSize == 4096, "vmPageSize and alignment must be same");
#endif
std::array<SmallLine, smallChunkSize / smallLineSize> m_lines;
std::array<SmallPage, smallChunkSize / vmPageSize> m_pages;
char m_memory[] __attribute__((aligned(smallLineSize)));
};

static_assert(!(vmPageSize % smallLineSize), "vmPageSize must be an even multiple of line size");
static_assert(!(smallChunkSize % smallLineSize), "chunk size must be an even multiple of line size");
static_assert(
sizeof(SmallChunk) - vmPageSize % sizeof(SmallChunk) < vmPageSize - 2 * smallMax,
"the first page of object memory in a small chunk can't allocate smallMax");

inline SmallChunk::SmallChunk(std::lock_guard<StaticMutex>& lock)
{
// Track the memory used for metadata by allocating imaginary objects.
for (SmallLine* line = m_lines.begin(); line < SmallLine::get(m_memory); ++line) {
line->ref(lock, 1);

SmallPage* page = SmallPage::get(line);
page->ref(lock);
}

for (SmallPage* page = begin(); page != end(); ++page)
page->setHasFreeLines(lock, true);
}

inline SmallChunk* SmallChunk::get(void* object)
{
BASSERT(isSmall(object));
