This repository has been archived by the owner. It is now read-only.

external memory allocation management #4964

Merged
merged 9 commits into from Jun 18, 2013

@trevnorris
commented Mar 8, 2013

Figured I'd get this out there early. There are a massive number of TODOs, but I wanted feedback as I moved along.

The general goals of these changes are:

  • Create single location for all external memory allocation
  • Manage single location for improved performance and memory management
  • Remove Fast/Slow Buffer logic
  • Hopefully remove need for SlabAllocator (streams), SlabBuffer (tls) and allocNewPool (fs)

Everything here is up for debate. These changes will need to be ridiculously well tested and thought through.

EDIT: Anyone looking for the removal of the SlabAllocator should check trevnorris/node/no-slaballocator. That is built on top of this PR, but doesn't belong here. As soon as this is accepted we'll begin discussing the other.

@isaacs

commented Mar 8, 2013

Just in case it is unclear to anyone: Under no circumstances should any of this go into 0.10.

Exciting stuff, though :)

@trevnorris

Author

commented Mar 12, 2013

To help demonstrate one of the reasons for this change, take the following script:

var a = [];
for (var i = 0; i < 6e5; i++) {
  a.push(new Buffer(1));
  new Buffer(Buffer.poolSize - 2);
}
var rss = process.memoryUsage().rss;
console.log(((rss / 1024 / 1024)|0) + ' MB');
// output: 2639 MB

Whereas the following similar script has very different results:

var a = [];
for (var i = 0; i < 6e5; i++) {
  a.push(new Buffer(1));
}
var rss = process.memoryUsage().rss;
console.log(((rss / 1024 / 1024)|0) + ' MB');
// output: 72 MB

What's happening here is that a single-byte Buffer is being stored in an Array, which by itself doesn't cost much. But each of those one-byte Buffers is a slice of a buffer pool. When you then allocate enough of the remainder of the pool to force a new pool to be allocated, the old pool stays alive for as long as its one-byte slice does, so a lot of unused memory is left hanging out in the open.

Unfortunately v8 doesn't currently have a performant enough way to track when a js Object has been gc'd (right now, using Persistent objects and MakeWeak callbacks is the only way). So as a trade-off node makes the assumption that most small buffers will be short lived.
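
The retention pattern described above can be modelled outside of node. The following C++ sketch is not node's implementation: `Slice`, `PoolAllocator` and `RetainedBytes` are hypothetical names, the 8 KB constant is assumed to mirror the default `Buffer.poolSize`, and the model ignores any alignment the real pool may apply. It only illustrates how a one-byte slice keeps its whole pool reachable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical model of a pooled allocator; not node's actual code.
const std::size_t kPoolSize = 8 * 1024;  // assumed to mirror Buffer.poolSize

struct Slice {
  std::shared_ptr<std::vector<char> > pool;  // keeps the whole pool alive
  std::size_t offset;
  std::size_t length;
};

struct PoolAllocator {
  std::shared_ptr<std::vector<char> > current;
  std::size_t used;

  PoolAllocator() : used(0) {}

  // Hand out a slice of the current pool, starting a new pool when the
  // request does not fit in what is left.
  Slice Allocate(std::size_t length) {
    if (!current || used + length > kPoolSize) {
      current = std::make_shared<std::vector<char> >(kPoolSize);
      used = 0;
    }
    Slice s = { current, used, length };
    used += length;
    return s;
  }
};

// Bytes kept reachable by a slice, regardless of how few it actually uses.
inline std::size_t RetainedBytes(const Slice& s) { return s.pool->size(); }
```

In this model, holding on to a slice of length 1 while the rest of its pool is consumed and discarded still pins all `kPoolSize` bytes, which is the effect the first script above amplifies 6e5 times.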

@trevnorris

Author

commented Apr 19, 2013

@isaacs @bnoordhuis going to say this is ready for initial review. Still waiting for a decision on #5323 to determine how to handle test-buffer.js. Other than that everything passes.

@rvagg

Member

commented May 16, 2013

don't forget to increment NODE_MODULE_VERSION when this lands please.

@trevnorris

Author

commented May 20, 2013

Rebased off latest crypto changes. All tests pass, and working fine.

review please

/cc @bnoordhuis @isaacs (and anyone else who cares)


No pooling is performed for these allocations. So there's no form of memory
leak.

@isaacs

isaacs May 21, 2013

You mean "no form of memory management"? This area needs some giant warnings indicating that these alloc'ed objects need to be manually tracked and disposed of.

@isaacs

isaacs May 21, 2013

Better yet, just don't document it at all, and put a _ in front of the functions. Then put the scary warnings in the source code comments :)

@trevnorris

trevnorris May 21, 2013

Author

Sorry. Didn't make it clear. These are completely managed by gc, with the
option to dispose at will. My perf tests show that if the user is good about
disposing even Persisted objects, performance is much better. Seems the
biggest hit is from gc needing to traverse and look for them.

@isaacs

isaacs May 21, 2013

Ohhhh... ok. So these aren't thinbuffers, they're just "stick some allocated stuff onto an object"?

So, just to be clear, what happens if you Buffer.alloc(1024, {})? Does the memory get free'd when the object is GC'ed?

@trevnorris

trevnorris May 21, 2013

Author

Yes. With the perk of being able to manually delete the external memory and
transform it to a zero length buffer

@trevnorris

trevnorris May 21, 2013

Author

Well. I should say zero length allocated object. This is mainly an api for
devs like @TooTallNate who are likely to find it useful to attach external
memory to any given object, and as a launch pad for my next patch.

There I added a dispose method to Buffers as well (which I'll show has some
awesome uses), but couldn't be done without removing the SlabAllocator.
Though all that definitely doesn't belong in this PR.

@trevnorris

doc/api/buffer.markdown Outdated
@@ -277,6 +314,15 @@ byte from the original Buffer.
// abc
// !bc

### buf.yank()

@trevnorris

trevnorris May 22, 2013

Author

This method is subjective, but thought it'd be useful to more easily allow developers to micro-manage their memory. I'll just drop the last commit if everyone else isn't in favor.

@isaacs

src/node_internals.h Outdated
@@ -137,6 +137,19 @@ inline bool IsBigEndian() {
return GetEndianness() == kBigEndian;
}

// parse index for external array data
inline static size_t ParseArrayIndex(v8::Handle<v8::Value> arg, size_t def) {

@isaacs

isaacs May 30, 2013

I think the v8:: is unnecessary here?

@trevnorris

trevnorris May 30, 2013

Author

not in node_internals.h. no using directive in there.

@bnoordhuis

bnoordhuis Jun 15, 2013

Member

Doesn't have to be static. Inline implies static (in C++ at least.)

@bnoordhuis

src/smalloc.cc Outdated
assert(source_start + length <= dest_length - dest_start);
assert(source_start >= 0);
assert(dest_start >= 0);
assert(length >= 0);

@bnoordhuis

bnoordhuis Jun 4, 2013

Member

Always true.

@bnoordhuis

src/smalloc.cc Outdated
assert(source_start <= source_length);
assert(source_start + length <= dest_length - dest_start);
assert(source_start >= 0);
assert(dest_start >= 0);

@bnoordhuis

bnoordhuis Jun 12, 2013

Member

These two are always true.

@bnoordhuis

bnoordhuis Jun 12, 2013

Member

There are a couple of issues with these checks:

  • It looks like they should check that source_start + length < source_length
  • Ditto for dest_start + length < dest_length
  • Include overflow checks
  • dest_length - dest_start can underflow, i.e. wrap around to something big.
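
The points in the list above can be folded into a single range check. This is a minimal sketch, not the code that landed in smalloc: `RangeFits` and `CopyFits` are hypothetical helpers, and the sketch uses `<=` rather than `<` on the assumption that a range may end exactly at the end of its buffer (as the later revision of these asserts in this thread does).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of smalloc: returns true when the range
// [start, start + length) lies inside a buffer of `total` bytes.
// Comparing against `total - start` avoids computing `start + length`,
// which could wrap around; `start <= total` is tested first so the
// subtraction itself cannot underflow.
inline bool RangeFits(std::size_t start, std::size_t length,
                      std::size_t total) {
  return start <= total && length <= total - start;
}

// Both ranges of a copy must fit in their respective buffers.
inline bool CopyFits(std::size_t source_start, std::size_t source_length,
                     std::size_t dest_start, std::size_t dest_length,
                     std::size_t length) {
  return RangeFits(source_start, length, source_length) &&
         RangeFits(dest_start, length, dest_length);
}
```

Written this way, no intermediate value can exceed `total`, so the check does not depend on the width of `size_t`.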
However, this adds an additional loop to the function, so it is faster
to provide the length explicitly.

### Class Method: Buffer.alloc(length, [receiver])

@bnoordhuis

bnoordhuis Jun 12, 2013

Member

I would mark Buffer.alloc() and Buffer.dispose() very clearly (probably IN ALL CAPS) as experimental.

@bnoordhuis

doc/api/buffer.markdown Outdated
then Node will allocate a SlowBuffer slab for it directly.
A `SlowBuffer` is simply a non-pooled Buffer instance. In specific cases where a
developer knows a small chunk of data needs to exist for an indefinite time,
copying that data into a `SlowBuffer` can help prevent memory leaks.

@bnoordhuis

bnoordhuis Jun 12, 2013

Member

Needs more detail. Worded like this, people will cargo-cult it.

@bnoordhuis

src/smalloc.cc Outdated
using v8::kExternalUnsignedByteArray;


struct callback_info {

@bnoordhuis

bnoordhuis Jun 12, 2013

Member

Prefer CamelCased type names.

@bnoordhuis

src/smalloc.cc Outdated
FreeCallback target_free_cb;

void TargetCallback(Isolate* env, Persistent<Object>* target, char* arg);
void TargetFreeCallback(Isolate* env, Persistent<Object>* target, void* arg);

@bnoordhuis

bnoordhuis Jun 12, 2013

Member

s/env/isolate/. I suspect you picked that up from V8's cctest?

@bnoordhuis

src/smalloc.cc Outdated
cb_info->hint = hint;
Persistent<Object> p_obj(node_isolate, obj);

node_isolate->AdjustAmountOfExternalAllocatedMemory(length + sizeof(cb_info));

@bnoordhuis

bnoordhuis Jun 12, 2013

Member

sizeof(*cb_info)
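
To spell out the difference: `sizeof(cb_info)` measures the pointer, while `sizeof(*cb_info)` measures the struct it points to. A small sketch, assuming a stand-in struct (`CallbackInfo` and its fields are hypothetical, not the layout used in smalloc.cc):

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the struct under review; the field layout is an assumption.
struct CallbackInfo {
  void (*cb)(char* data, void* hint);
  void* hint;
  char padding[32];  // only here to make the struct visibly larger
};

// Size of the pointer itself (typically 4 or 8 bytes).
inline std::size_t PointerSize(CallbackInfo* cb_info) {
  return sizeof(cb_info);
}

// Size of the object the pointer refers to.
inline std::size_t StructSize(CallbackInfo* cb_info) {
  return sizeof(*cb_info);
}
```

Passing the pointer size to `AdjustAmountOfExternalAllocatedMemory` would under-report the allocation by the difference between the two.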

@bnoordhuis

src/smalloc.cc Outdated
int len = obj->GetIndexedPropertiesExternalArrayDataLength();
char* data = static_cast<char*>(obj->GetIndexedPropertiesExternalArrayData());
callback_info* cb_info = static_cast<callback_info*>(arg);
env->AdjustAmountOfExternalAllocatedMemory(-(len + sizeof(cb_info)));

@bnoordhuis

bnoordhuis Jun 12, 2013

Member

sizeof(*cb_info)

@bnoordhuis

src/smalloc.h Outdated
// mirrors deps/v8/src/objects.h
NODE_EXTERN static const unsigned int kMaxLength = 0x3fffffff;

NODE_EXTERN typedef void (*free_callback)(char* data, void* hint);

@bnoordhuis

bnoordhuis Jun 12, 2013

Member

s/free_callback/FreeCallback/

EDIT: I don't care strongly though. If it's a lot of work to fix up, let it be.

@trevnorris

trevnorris Jun 12, 2013

Author

that name's for backwards compatibility with Buffer. i'm more than happy to change that if you say it's ok.

@trevnorris

trevnorris Jun 13, 2013

Author

@bnoordhuis it's not difficult to change at all in core. in fact it's not even used in core anymore. the reason for backwards compatibility was for module developers, but since some of the backwards compatibility will be breaking anyways this seems like a trivial change. I'll go ahead and do it.

@trevnorris

trevnorris Jun 17, 2013

Author

eh screw it. they'll have to append smalloc:: to it anyways. i'll change it. :)

var len = this.length;
start = clamp(start, len, 0);
end = clamp(end, len, len);
return new Buffer(this, end - start, start);

@bnoordhuis

bnoordhuis Jun 14, 2013

Member

Why length == null rather than typeof length === 'undefined'?

@trevnorris

trevnorris Jun 15, 2013

Author

because in 3.16 they uber optimized for == null checks. but the savings are in the nanoseconds. i'll change it.

@bnoordhuis

bnoordhuis Jun 15, 2013

Member

Interesting. You can leave it like that if you want. Do you know what V8 commits are the relevant ones?

@trevnorris

trevnorris Jun 17, 2013

Author

going to change it. core should be more explicit about each type it checks.

@bnoordhuis

lib/buffer.js Outdated

if (length < 0) length = 0;

if (list.length === 0)

@bnoordhuis

bnoordhuis Jun 14, 2013

Member

Always false.

@trevnorris

trevnorris Jun 17, 2013

Author

eh? Buffer.concat([])

@bnoordhuis

src/node_buffer.cc Outdated
#define MIN(a, b) ((a) < (b) ? (a) : (b))

#define CHECK_OOB(r) \
if (r) return ThrowRangeError("out of range index");

@bnoordhuis

bnoordhuis Jun 17, 2013

Member

Wrap in a do { ... } while (0) block without a trailing semi-colon.

@bnoordhuis

src/node_buffer.cc Outdated
}
new Buffer(args.This(), length);
Handle<Value> argv[2];
argv[0] = Undefined(node_isolate);

@bnoordhuis

bnoordhuis Jun 17, 2013

Member

Maybe repeat that comment here from a few lines up. Ditto for the similar code below.

@bnoordhuis

src/smalloc.h Outdated
// mirrors deps/v8/src/objects.h
NODE_EXTERN static const unsigned int kMaxLength = 0x3fffffff;

NODE_EXTERN typedef void (*freeCallback)(char* data, void* hint);

@bnoordhuis

bnoordhuis Jun 17, 2013

Member

s/freeCallback/FreeCallback/

@bnoordhuis

src/node_buffer.cc Outdated
Local<Object> obj = constructor_template->GetFunction()->NewInstance(1, &arg);
if (length > kMaxLength)
return Local<Object>::New(node_isolate,
ThrowRangeError("length > kMaxLength").As<Object>());

@bnoordhuis

bnoordhuis Jun 17, 2013

Member

This does not work like you think it does. ThrowRangeError() only schedules the exception, it doesn't return it. Its actual return value is Undefined which, when cast to Object, will produce undefined (pun intended) behavior.

@trevnorris

trevnorris Jun 17, 2013

Author

@bnoordhuis ah, well... um. ok, i'm at a loss as to what should be done then.

@trevnorris

trevnorris Jun 17, 2013

Author

changed to assert()

@bnoordhuis

src/node_internals.h Outdated
bool ParseArrayIndex(v8::Handle<v8::Value> arg, size_t def, size_t* ret) {
if (arg->IsUndefined()) {
*ret = def;
return false;

@bnoordhuis

bnoordhuis Jun 17, 2013

Member

Returning false on success and true on error is kind of counter-intuitive.
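
For illustration, here is the same shape with the conventional return value (true on success). This is a simplified stand-in that does not use V8 types: a null pointer models an `undefined` argument, a plain integer models the coerced value, and rejecting negative values is an assumption about the intended behavior.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for ParseArrayIndex; not the node_internals.h code.
// Returns true on success and false on error.
inline bool ParseArrayIndex(const long long* arg, std::size_t def,
                            std::size_t* ret) {
  if (arg == NULL) {  // argument omitted: fall back to the default
    *ret = def;
    return true;
  }
  if (*arg < 0)       // negative indexes are rejected (assumed behavior)
    return false;
  *ret = static_cast<std::size_t>(*arg);
  return true;
}
```

Callers can then write `if (!ParseArrayIndex(...)) return error;`, which reads the way most C++ code expects.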

@trevnorris

trevnorris Jun 17, 2013

Author

fixed. thanks.

@bnoordhuis

src/node_buffer.cc Outdated
bool HasInstance(Handle<Value> val) {
if (!val->IsObject())
return false;
return HasInstance(val.As<Object>());

@bnoordhuis

bnoordhuis Jun 17, 2013

Member

Could be simplified to return val->IsObject() && HasInstance(val.As<Object>());

@bnoordhuis

src/node_buffer.cc Outdated
v8::ExternalArrayType type = obj->GetIndexedPropertiesExternalArrayDataType();
if (type != v8::kExternalUnsignedByteArray)
return false;
return true;

@bnoordhuis

bnoordhuis Jun 17, 2013

Member

Could be simplified to return type == v8::kExternalUnsignedByteArray;

@bnoordhuis

src/node_buffer.cc Outdated
#define MIN(a, b) ((a) < (b) ? (a) : (b))

#define CHECK_NOT_OOB(r) \
do { if (!r) return ThrowRangeError("out of range index"); } while (0)

@bnoordhuis

bnoordhuis Jun 18, 2013

Member

if (!(r)). Right now, CHECK_NOT_OOB(end > end_max && end > 0) gets evaluated as if (!end > end_max && ...) which presumably is not what you want.

Also, the line before it is too long (81 characters.) The backslash should preferably go on the 79th column.
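
The precedence problem is easy to reproduce in isolation. The macros below are illustrative only: they return `false` instead of throwing a RangeError, and the `_UNSAFE` name and both `Validate` functions are hypothetical.

```cpp
#include <cassert>

// Unparenthesized: `!` binds only to the first operand of the argument.
#define CHECK_NOT_OOB_UNSAFE(r) \
  do { if (!r) return false; } while (0)

// Parenthesized: the whole argument is negated.
#define CHECK_NOT_OOB(r) \
  do { if (!(r)) return false; } while (0)

// Expands to `if (!start_ok && end_ok) return false;`.
inline bool ValidateUnsafe(bool start_ok, bool end_ok) {
  CHECK_NOT_OOB_UNSAFE(start_ok && end_ok);
  return true;
}

// Expands to `if (!(start_ok && end_ok)) return false;`.
inline bool Validate(bool start_ok, bool end_ok) {
  CHECK_NOT_OOB(start_ok && end_ok);
  return true;
}
```

With `start_ok == true` and `end_ok == false`, `ValidateUnsafe` lets the call through while `Validate` rejects it. The `do { ... } while (0)` wrapper is what lets each macro be used as a single statement followed by a semicolon.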

@trevnorris

trevnorris Jun 18, 2013

Author

oy. two n00b mistakes. thanks.

@bnoordhuis

src/node_buffer.cc Outdated

p_buffer_fn = Persistent<Function>::New(node_isolate, bv);

Local<Object> proto = bv->Get(String::New("prototype")).As<Object>();

@bnoordhuis

bnoordhuis Jun 18, 2013

Member

I would add bv->IsFunction() and proto->IsObject() sanity checks here.

@trevnorris

trevnorris Jun 18, 2013

Author

bv is checked above as args[0]->IsFunction(). i'll add the assert on proto, thanks for pointing that out.

@bnoordhuis

src/node_buffer.h Outdated
} // namespace node buffer
namespace Buffer {

NODE_EXTERN static const unsigned int kMaxLength = smalloc::kMaxLength;

@bnoordhuis

bnoordhuis Jun 18, 2013

Member

I don't think this needs to be declared NODE_EXTERN. A static const variable has no linkage as long as you don't take its address.

/cc @piscisaureus - Is that true on Redmond OS as well?

@piscisaureus

piscisaureus Jun 18, 2013

Member

It works on windows I think. dllimport/export on static variables acts as a shorthand to automatically do the indirection. e.g.

__declspec(dllexport) int bar = 5;

is roughly equivalent to

int bar_value = 5;
__declspec(dllexport) int* bar = &bar_value;

and

__declspec(dllimport) int bar;
bar++;

becomes

__declspec(dllimport) int* bar;
(*bar)++;

@bnoordhuis

Member

commented Jun 18, 2013

Nearly there, Trevor. :-)

@trevnorris

Author

commented Jun 18, 2013

@bnoordhuis thanks dude. those 3 things are fixed. :)

@bnoordhuis

src/smalloc.cc Outdated
assert(length <= dest_length);
// now we can guarantee these will catch oob access and *_start overflow
assert(source_start + length <= source_length);
assert(dest_start + length <= dest_length);

@bnoordhuis

bnoordhuis Jun 18, 2013

Member

I think there's a logic bug here. Assuming size_t has 32 bits, then if dest_start=0xfffffffe, length=10 and dest_length=20 then (length < dest_length) and (dest_start + length <= dest_length) both hold but dest_data + dest_start will point outside the buffer.
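
The arithmetic in that example can be checked directly with fixed-width integers. In this sketch `NaiveCheck` models the two asserts from the snippet, and `GuardedCheck` adds one possible extra guard; the guard is an assumption for illustration, not necessarily the fix that landed.

```cpp
#include <cassert>
#include <stdint.h>

// Sum as computed in 32-bit unsigned arithmetic (wraps modulo 2^32).
inline uint32_t WrappedSum(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(a + b);
}

// The two asserts from the snippet, modelled with 32-bit unsigned values.
inline bool NaiveCheck(uint32_t dest_start, uint32_t length,
                       uint32_t dest_length) {
  return length <= dest_length &&
         WrappedSum(dest_start, length) <= dest_length;
}

// Also require the start offset to lie inside the buffer. This is only
// sufficient while lengths stay far below 2^31, as they do when capped
// at kMaxLength (0x3fffffff).
inline bool GuardedCheck(uint32_t dest_start, uint32_t length,
                         uint32_t dest_length) {
  return dest_start <= dest_length &&
         NaiveCheck(dest_start, length, dest_length);
}
```

With `dest_start = 0xfffffffe` and `length = 10` the wrapped sum is 8, so the naive check accepts the call even though the start offset is far outside a 20-byte buffer.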

@trevnorris

trevnorris Jun 18, 2013

Author

added the two necessary checks. thanks for pointing this out.

@bnoordhuis

src/node_buffer.cc Outdated
// optimize single ascii character case
if (at_length == 1) {
int value = static_cast<int>((*at)[0]);
if (value <= 127) {

@bnoordhuis

bnoordhuis Jun 18, 2013

Member

This is always true when char is signed: (int) (signed char) 128 == -128
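
A short sketch of why the comparison never fails and of one way to avoid it. Both function names are hypothetical, and converting through `unsigned char` is a suggested alternative rather than the change made in this PR.

```cpp
#include <cassert>

// Mirrors the reviewed test: the byte is widened from a signed char, so
// the resulting int is always in [-128, 127].
inline bool IsAsciiViaSignedChar(signed char c) {
  int value = static_cast<int>(c);
  return value <= 127;  // holds for every possible input
}

// Converting through unsigned char first yields a value in [0, 255],
// so bytes with the high bit set are correctly rejected.
inline bool IsAscii(char c) {
  return static_cast<unsigned char>(c) <= 127;
}
```

For a byte such as 0xE9, the first function reports ASCII and the second does not.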

@bnoordhuis

src/node_buffer.cc Outdated
#define MIN(a, b) ((a) < (b) ? (a) : (b))

#define CHECK_NOT_OOB(r) \
do { if (!r) return ThrowRangeError("out of range index"); } while (0)

@bnoordhuis

bnoordhuis Jun 18, 2013

Member

Please put the parentheses around the expression here (i.e. if (!(r))) rather than adding parens on a case-by-case basis.

@trevnorris

trevnorris Jun 18, 2013

Author

thanks. done.

trevnorris added some commits Apr 17, 2013

smalloc: initial implementation
smalloc is a simple utility for quickly allocating external memory onto
js objects. This will be used to centralize how memory is managed in
node, and will become the backer for Buffers. So in the future crypto's
SlabBuffer, stream's SlabAllocator will be removed.

Note on the js API: because no arguments are optional the order of
arguments have been placed to match their cc counterparts as closely as
possible.

smalloc: add api to manually dispose Persistent
If the user knows the allocation is no longer needed then the memory can
be manually released.

Currently this will not ClearWeak the Persistent, so the callback will
still run.

If the user passed a ClearWeak callback, and then disposed the object,
the buffer callback argument will == NULL.

buffer: use smalloc as backing data store
Memory allocations are now done through smalloc. The Buffer cc class has
been removed completely, but for backwards compatibility have left the
namespace as Buffer.

The .parent attribute is only set if the Buffer is a slice of an
allocation. Which is then set to the alloc object (not a Buffer).

The .offset attribute is now a ReadOnly set to 0, for backwards
compatibility. I'd like to remove it in the future (pre v1.0).

A few alterations have been made to how arguments are either coerced or
thrown. All primitives will now be coerced to their respective values,
and (most) all out of range index requests will throw.

The indexes that are coerced were left for backwards compatibility. For
example: Buffer slice operates more like Array slice, and coerces
instead of throwing out of range indexes. This may change in the future.

The reason for wanting to throw for out of range indexes is because
giving js access to raw memory has high potential risk. To mitigate that
it's easier to make sure the developer is always quickly alerted to the
fact that their code is attempting to access beyond memory bounds.

Because SlowBuffer will be deprecated, and simply returns a new Buffer
instance, all tests on SlowBuffer have been removed.

Heapdumps will now show usage under "smalloc" instead of "Buffer".

ParseArrayIndex was added to node_internals to support proper uint
argument checking/coercion for external array data indexes.

SlabAllocator had to be updated since handle_ no longer exists.

buffer: reimplement Buffer pools
While the new Buffer implementation is much faster we still have the
necessity of using Buffer pools. This is undesirable because it may
still lead to unwanted memory retention, but for the time being this is
the best solution.

Because of this re-introduction, and since there is no more SlowBuffer
type, the SlowBuffer method has been re-purposed to return a non-pooled
Buffer instance. This will be helpful for developers to store data for
indeterminate lengths of time without introducing a memory leak.

Another change to Buffer pools was that they are only allocated if the
requested chunk is < poolSize / 2. This was done because allocations are
much quicker now, and it's a better use of the pool.

buffer: deprecate legacy code
Several things are now no longer necessary. These have been deprecated,
and will be removed in v0.13.

buffer: expose class methods alloc and dispose
Expose the ability for users to allocate and manually dispose data on
any object. These are user-safe versions of internal smalloc functions.

buffer: implement new fill behavior
Old fill would take the char code of the first character and wrap around
the int to fit in the 127 range. Now fill will duplicate whatever string
is given through the entirety of the buffer.

Note: There is one bug around ending on a partial fill of any character
outside the ASCII range.

buffer: proper API export for Windows
So that Windows users can properly include smalloc and node_buffer,
NODE_EXTERN was added to the headers that export this functionality.

@bnoordhuis

Member

commented Jun 18, 2013

LGTM

@trevnorris trevnorris merged commit 7373c4d into nodejs:master Jun 18, 2013

@edhemphill

commented Dec 20, 2014

I'm pretty late to the party and was hoping someone could catch me up... What's the correct way to wrap existing memory with a Buffer? I can't seem to get the free_callback to call with Buffer. I know there was some chatter on IRC quite a while back on getting rid of this. We're on the 0.10.x series but can move if necessary...

void free_test_cb(char *m,void *hint) {
    DBG_OUT("FREEING MEMORY.");
    free(m);
}

Handle<Value> WrapMemBufferTest(const Arguments& args) {
    HandleScope scope;
    char *mem = (char *) ::malloc(100);
    memset(mem,'A',100);
    node::Buffer *buf = node::Buffer::New(mem,100,free_test_cb,0);
    return scope.Close(buf->handle_);
}

But the free_test_cb() is just not getting called in a simple test program.
...and then I even tried throwing these in there:

void weak_cb(Persistent<Value> object, void* parameter) {
    object.Dispose();
}

Handle<Value> WrapMemBufferTest(const Arguments& args) {
    HandleScope scope;
    char *mem = (char *) ::malloc(100);
    memset(mem,'A',100);
    node::Buffer *buf = node::Buffer::New(mem,100,free_test_cb,0);
    buf->handle_.MakeWeak(NULL, weak_cb);  // new
    return scope.Close(buf->handle_);
}

Any advice appreciated.

@bnoordhuis

Member

commented Dec 20, 2014

@edhemphill Finalizers are deferred, they don't run immediately when the last reference to the object goes away. If your program is short-lived, they may not run. Theoretically, even in longer-lived programs, they may not run ever. If your resource needs a deterministic life cycle, you have to manage it yourself.
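
As a rough illustration of managing the life cycle yourself, the sketch below owns a malloc'ed block and releases it at a point the caller chooses. `OwnedMemory` is a hypothetical class, not a node or V8 API, and how such an owner would be tied to a Buffer is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical owner for malloc'ed memory with an explicit release point.
class OwnedMemory {
 public:
  explicit OwnedMemory(std::size_t length)
      : data_(static_cast<char*>(std::malloc(length))),
        length_(data_ != NULL ? length : 0) {}

  // The destructor is only a safety net; callers that need a precise
  // release point should call Dispose() themselves.
  ~OwnedMemory() { Dispose(); }

  // Release the memory now. Safe to call more than once.
  void Dispose() {
    std::free(data_);
    data_ = NULL;
    length_ = 0;
  }

  char* data() const { return data_; }
  std::size_t length() const { return length_; }

 private:
  OwnedMemory(const OwnedMemory&);             // non-copyable
  OwnedMemory& operator=(const OwnedMemory&);  // non-assignable

  char* data_;
  std::size_t length_;
};
```

The release happens when `Dispose()` is called, independent of when (or whether) the garbage collector gets around to running a free callback.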

@edhemphill

commented Dec 20, 2014

@bnoordhuis Interesting... thanks.
