external memory allocation management #4964

Merged
merged 9 commits into from Jun 18, 2013

Projects

None yet

7 participants

@trevnorris
Collaborator

Figured I'd get this out there early. There are a massive number of todo's, but wanted feedback as I moved along.

The general point of these changes are:

  • Create single location for all external memory allocation
  • Manage single location for improved performance and memory management
  • Remove Fast/Slow Buffer logic
  • Hopefully remove need for SlabAllocator (streams), SlabBuffer (tls) and allocNewPool (fs)

Everything here is up for debate. These changes will need to be ridiculously well tested and thought though.

EDIT: Anyone looking for the removal of the SlabAllocator should check trevnorris/node/no-slaballocator. That is built on top of this PR, but doesn't belong here. As soon as this is accepted we'll begin discussing the other.

@isaacs
Collaborator
isaacs commented Mar 8, 2013

Just in case it is unclear to anyone: Under no circumstances should any of this go into 0.10.

Exciting stuff, though :)

@trevnorris
Collaborator

To help demonstrate one of the reasons for this change, take the following script:

var a = [];
for (var i = 0; i < 6e5; i++) {
  a.push(new Buffer(1));
  new Buffer(Buffer.poolSize - 2);
}
var rss = process.memoryUsage().rss;
console.log(((rss / 1024 / 1024)|0) + ' MB');
// output: 2639 MB

Where as the following similar script has very different results:

var a = [];
for (var i = 0; i < 6e5; i++) {
  a.push(new Buffer(1));
}
var rss = process.memoryUsage().rss;
console.log(((rss / 1024 / 1024)|0) + ' MB');
// output: 72 MB

What's happening here is a single byte Buffer is being stored in an Array. Which by itself doesn't cost much. Though when you allocate enough of the remainder of the buffer pool, telling it to allocate a new one, a lot of unused memory is left hanging out in the open.

Unfortunately v8 doesn't currently have a performant enough way to track when a js Object has been gc'd (right using Persistent objects and MakeWeak callbacks are the only way). So as a trade off node makes the assumption that most small buffers will be short lived.

@trevnorris
Collaborator

@isaacs @bnoordhuis going to say this is ready for initial review. Still waiting for a decision on #5323 to determine how to handle test-buffer.js. Other than that everything passes.

@rvagg
Member
rvagg commented May 16, 2013

don't forget to increment NODE_MODULE_VERSION when this lands please.

@trevnorris
Collaborator

Rebased off latest crypto changes. All tests pass, and working fine.

review please

/cc @bnoordhuis @isaacs (and anyone else who cares)

@isaacs isaacs commented on the diff May 21, 2013
doc/api/buffer.markdown
+However, this adds an additional loop to the function, so it is faster
+to provide the length explicitly.
+
+### Class Method: Buffer.alloc(length, [receiver])
+
+* `length` Number
+* `receiver` Object, Optional, Default: `new Object`
+
+Returns object with allocated external array data.
+
+Buffers are backed by a simple allocator that only handles the assignation of
+external raw memory. This exposes that functionality.
+
+No pooling is performed for these allocations. So there's no form of memory
+leak.
+
@isaacs
isaacs May 21, 2013 collaborator

You mean "no form of memory management"? This area needs some giant warnings indicating that these alloc'ed objects need to be manually tracked and disposed of.

@isaacs
isaacs May 21, 2013 collaborator

Better yet, just don't document it at all, and put a _ in front of the functions. Then put the scary warnings in the source code comments :)

@trevnorris
trevnorris May 21, 2013 collaborator

Sorry. Didn't make it clear. These are completely managed by gc, with the
option to dispose at will. My perf tests show if the user is good about
disposing even Persisted objects performance is much better. Seems the
biggest hit is from gc needing to transverse and look for them.

@isaacs
isaacs May 21, 2013 collaborator

Ohhhh... ok. So these aren't thinbuffers, they're just "stick some allocated stuff onto an object"?

So, just to be clear, what happens if you Buffer.alloc(1024, {})? Does the memory get free'd when the object is GC'ed?

@trevnorris
trevnorris May 21, 2013 collaborator

Yes. With the perk of being able to manually delete the external memory and
transform it to a zero length buffer

@trevnorris
trevnorris May 21, 2013 collaborator

Well. I should say zero length allocated object. This is mainly an api for
devs like @TooTallNate who are likely to find it useful to attach external
memory to any given object, and as a launch pad for my next patch.

There I added a dispose method to Buffers as well (which I'll show has some
awesome uses), but couldn't be done without removing the SlabAllocator.
Though all that definitely doesn't belong in this PR.

@trevnorris trevnorris commented on an outdated diff May 22, 2013
doc/api/buffer.markdown
@@ -277,6 +314,15 @@ byte from the original Buffer.
// abc
// !bc
+### buf.yank()
@trevnorris
trevnorris May 22, 2013 collaborator

This method is subjective, but thought it'd be useful to more easily allow developers to micro-manage their memory. I'll just drop the last commit if everyone else isn't in favor.

@isaacs isaacs and 2 others commented on an outdated diff May 30, 2013
src/node_internals.h
@@ -137,6 +137,19 @@ inline bool IsBigEndian() {
return GetEndianness() == kBigEndian;
}
+// parse index for external array data
+inline static size_t ParseArrayIndex(v8::Handle<v8::Value> arg, size_t def) {
@isaacs
isaacs May 30, 2013 collaborator

I think the v8:: is unnecessary here?

@trevnorris
trevnorris May 30, 2013 collaborator

not in node_internals.h. no using directive in there.

@bnoordhuis
bnoordhuis Jun 15, 2013 Node.js Foundation member

Doesn't have to be static. Inline implies static (in C++ at least.)

@bnoordhuis bnoordhuis commented on an outdated diff Jun 4, 2013
src/smalloc.cc
+ size_t dest_start = args[3]->Uint32Value();
+ size_t length = args[4]->Uint32Value();
+ size_t source_length = source->GetIndexedPropertiesExternalArrayDataLength();
+ size_t dest_length = dest->GetIndexedPropertiesExternalArrayDataLength();
+ char* source_data = static_cast<char*>(
+ source->GetIndexedPropertiesExternalArrayData());
+ char* dest_data = static_cast<char*>(
+ dest->GetIndexedPropertiesExternalArrayData());
+
+ assert(source_data != NULL);
+ assert(dest_data != NULL);
+ assert(source_start <= source_length);
+ assert(source_start + length <= dest_length - dest_start);
+ assert(source_start >= 0);
+ assert(dest_start >= 0);
+ assert(length >= 0);
@bnoordhuis
bnoordhuis Jun 4, 2013 Node.js Foundation member

Always true.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 12, 2013
src/smalloc.cc
+ size_t source_start = args[1]->Uint32Value();
+ size_t dest_start = args[3]->Uint32Value();
+ size_t length = args[4]->Uint32Value();
+ size_t source_length = source->GetIndexedPropertiesExternalArrayDataLength();
+ size_t dest_length = dest->GetIndexedPropertiesExternalArrayDataLength();
+ char* source_data = static_cast<char*>(
+ source->GetIndexedPropertiesExternalArrayData());
+ char* dest_data = static_cast<char*>(
+ dest->GetIndexedPropertiesExternalArrayData());
+
+ assert(source_data != NULL);
+ assert(dest_data != NULL);
+ assert(source_start <= source_length);
+ assert(source_start + length <= dest_length - dest_start);
+ assert(source_start >= 0);
+ assert(dest_start >= 0);
@bnoordhuis
bnoordhuis Jun 12, 2013 Node.js Foundation member

These two are always true.

@bnoordhuis
bnoordhuis Jun 12, 2013 Node.js Foundation member

There's a couple of issues with these checks:

  • It looks like they should check that source_start + length < source_length
  • Ditto for dest_start + length < dest_length
  • Include overflow checks
  • dest_length - dest_start can underflow, i.e. wrap around to something big.
@bnoordhuis bnoordhuis commented on the diff Jun 12, 2013
doc/api/buffer.markdown
+Returns a buffer which is the result of concatenating all the buffers in
+the list together.
+
+If the list has no items, or if the totalLength is 0, then it returns a
+zero-length buffer.
+
+If the list has exactly one item, then the first item of the list is
+returned.
+
+If the list has more than one item, then a new Buffer is created.
+
+If totalLength is not provided, it is read from the buffers in the list.
+However, this adds an additional loop to the function, so it is faster
+to provide the length explicitly.
+
+### Class Method: Buffer.alloc(length, [receiver])
@bnoordhuis
bnoordhuis Jun 12, 2013 Node.js Foundation member

I would mark Buffer.alloc() and Buffer.dispose() very clearly (probably IN ALL CAPS) as experimental.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 12, 2013
doc/api/buffer.markdown
-In order to avoid the overhead of allocating many C++ Buffer objects for
-small blocks of memory in the lifetime of a server, Node allocates memory
-in 8Kb (8192 byte) chunks. If a buffer is smaller than this size, then it
-will be backed by a parent SlowBuffer object. If it is larger than this,
-then Node will allocate a SlowBuffer slab for it directly.
+A `SlowBuffer` is simply a non-pooled Buffer instance. In specific cases where a
+developer knows a small chunk of data needs to exist for an indefinite time,
+copying that data into a `SlowBuffer` can help prevent memory leaks.
@bnoordhuis
bnoordhuis Jun 12, 2013 Node.js Foundation member

Needs more detail. Worded like this, people will cargo-cult it.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 12, 2013
src/smalloc.cc
+using v8::FunctionTemplate;
+using v8::Handle;
+using v8::HandleScope;
+using v8::HeapProfiler;
+using v8::Isolate;
+using v8::Local;
+using v8::Object;
+using v8::Persistent;
+using v8::RetainedObjectInfo;
+using v8::String;
+using v8::Uint32;
+using v8::Value;
+using v8::kExternalUnsignedByteArray;
+
+
+struct callback_info {
@bnoordhuis
bnoordhuis Jun 12, 2013 Node.js Foundation member

Prefer CamelCased type names.

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 12, 2013
src/smalloc.cc
+using v8::kExternalUnsignedByteArray;
+
+
+struct callback_info {
+ void* hint;
+ free_callback cb;
+};
+
+typedef v8::WeakReferenceCallbacks<Object, char>::Revivable Callback;
+typedef v8::WeakReferenceCallbacks<Object, void>::Revivable FreeCallback;
+
+Callback target_cb;
+FreeCallback target_free_cb;
+
+void TargetCallback(Isolate* env, Persistent<Object>* target, char* arg);
+void TargetFreeCallback(Isolate* env, Persistent<Object>* target, void* arg);
@bnoordhuis
bnoordhuis Jun 12, 2013 Node.js Foundation member

s/env/isolate/. I suspect you picked that up from V8's cctest?

@trevnorris
trevnorris Jun 12, 2013 collaborator

yeah.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 12, 2013
src/smalloc.cc
+}
+
+
+void Alloc(Handle<Object> obj,
+ char* data,
+ size_t length,
+ free_callback fn,
+ void* hint) {
+ assert(data != NULL);
+
+ callback_info* cb_info = new callback_info;
+ cb_info->cb = fn;
+ cb_info->hint = hint;
+ Persistent<Object> p_obj(node_isolate, obj);
+
+ node_isolate->AdjustAmountOfExternalAllocatedMemory(length + sizeof(cb_info));
@bnoordhuis
bnoordhuis Jun 12, 2013 Node.js Foundation member

sizeof(*cb_info)

@bnoordhuis bnoordhuis commented on an outdated diff Jun 12, 2013
src/smalloc.cc
+ p_obj.MarkIndependent(node_isolate);
+ p_obj.SetWrapperClassId(node_isolate, ALLOC_ID);
+ p_obj->SetIndexedPropertiesToExternalArrayData(data,
+ kExternalUnsignedByteArray,
+ length);
+}
+
+
+// TODO(trevnorris): running AllocDispose will cause data == NULL, which is
+// then passed to cb_info->cb, which the user will need to check for.
+void TargetFreeCallback(Isolate* env, Persistent<Object>* target, void* arg) {
+ Local<Object> obj = **target;
+ int len = obj->GetIndexedPropertiesExternalArrayDataLength();
+ char* data = static_cast<char*>(obj->GetIndexedPropertiesExternalArrayData());
+ callback_info* cb_info = static_cast<callback_info*>(arg);
+ env->AdjustAmountOfExternalAllocatedMemory(-(len + sizeof(cb_info)));
@bnoordhuis
bnoordhuis Jun 12, 2013 Node.js Foundation member

sizeof(*cb_info)

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 12, 2013
src/smalloc.h
+
+namespace node {
+
+/**
+ * Simple memory allocator.
+ *
+ * Utilities for external memory allocation management. Is an abstraction for
+ * v8's external array data handling to simplify and centralize how this is
+ * managed.
+ */
+namespace smalloc {
+
+// mirrors deps/v8/src/objects.h
+NODE_EXTERN static const unsigned int kMaxLength = 0x3fffffff;
+
+NODE_EXTERN typedef void (*free_callback)(char* data, void* hint);
@bnoordhuis
bnoordhuis Jun 12, 2013 Node.js Foundation member

s/free_callback/FreeCallback/

EDIT: I don't care strongly though. If it's a lot of work to fix up, let it be.

@trevnorris
trevnorris Jun 12, 2013 collaborator

that name's for backwards compatibility with Buffer. i'm more than happy to change that if you say it's ok.

@trevnorris
trevnorris Jun 13, 2013 collaborator

@bnoordhuis it's not difficult to change at all in core. in fact it's not even used in core anymore. the reason for backwards compatibility was for module developers, but since some of the backwards compatibility will be breaking anyways this seems like a trivial change. I'll go ahead and do it.

@trevnorris
trevnorris Jun 17, 2013 collaborator

eh screw it. they'll have to append smalloc:: to it anyways. i'll change it. :)

@bnoordhuis bnoordhuis commented on the diff Jun 14, 2013
lib/buffer.js
}
};
-// slice(start, end)
-SlowBuffer.prototype.slice = function(start, end) {
- var len = this.length;
- start = clamp(start, len, 0);
- end = clamp(end, len, len);
- return new Buffer(this, end - start, start);
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Why length == null rather than typeof length === 'undefined'?

@trevnorris
trevnorris Jun 15, 2013 collaborator

because in 3.16 they uber optimized for == null checks. but savings is in the nano seconds. i'll change it.

@bnoordhuis
bnoordhuis Jun 15, 2013 Node.js Foundation member

Interesting. You can leave it like that if you want. Do you know what V8 commits are the relevant ones?

@trevnorris
trevnorris Jun 17, 2013 collaborator

going to change it. core should be more explicit about each type it checks.

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 14, 2013
lib/buffer.js
+ if (length == null) {
+ length = 0;
+ for (var i = 0; i < list.length; i++)
+ length += list[i].length;
+ } else {
+ length = ~~length;
+ }
+
+ if (length < 0) length = 0;
+
+ if (list.length === 0)
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Always false.

@trevnorris
trevnorris Jun 17, 2013 collaborator

eh? Buffer.concat([])

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 14, 2013
lib/buffer.js
-// Buffer
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

I usually cache the value of .length in a local variable. In theory, V8 knows it's loop invariant and optimizes away the lookup. In practice, it doesn't (always?) seem to do it.

@trevnorris
trevnorris Jun 17, 2013 collaborator

i'll run some tests, but the loop size here is going to be small in practice not going to bother.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
lib/buffer.js
length = remaining;
- } else {
- length = +length;
- if (length > remaining) {
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

!!encoding ? (encoding + '').toLowerCase() : 'utf8'? That's what you use elsewhere.

@bnoordhuis bnoordhuis commented on the diff Jun 14, 2013
lib/buffer.js
-Buffer.concat = function(list, length) {
- if (!Array.isArray(list)) {
- throw new TypeError('Usage: Buffer.concat(list, [length])');
- }
-
- if (list.length === 0) {
- return new Buffer(0);
- } else if (list.length === 1) {
- return list[0];
- }
-
- if (typeof length !== 'number') {
- length = 0;
- for (var i = 0; i < list.length; i++) {
- var buf = list[i];
- length += buf.length;
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

start ? ~~start : 0 is kind of superfluous, right?

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
-#define BUFFER_CLASS_ID (0xBABE)
+#define ARGS_THIS(argT) \
+ Local<Object> obj = argT; \
+ size_t obj_length = obj->GetIndexedPropertiesExternalArrayDataLength(); \
+ char* obj_data = static_cast<char*>( \
+ obj->GetIndexedPropertiesExternalArrayData()); \
+
+#define SLICE_START_END(start_arg, end_arg, end_max) \
+ size_t start = ParseArrayIndex(start_arg, 0); \
+ size_t end = ParseArrayIndex(end_arg, end_max); \
+ if (end < start) end = start; \
+ if (end > end_max) ThrowRangeError("out of range index"); \
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Missing return?

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
-#define BUFFER_CLASS_ID (0xBABE)
+#define ARGS_THIS(argT) \
+ Local<Object> obj = argT; \
+ size_t obj_length = obj->GetIndexedPropertiesExternalArrayDataLength(); \
+ char* obj_data = static_cast<char*>( \
+ obj->GetIndexedPropertiesExternalArrayData()); \
+
+#define SLICE_START_END(start_arg, end_arg, end_max) \
+ size_t start = ParseArrayIndex(start_arg, 0); \
+ size_t end = ParseArrayIndex(end_arg, end_max); \
+ if (end < start) end = start; \
+ if (end > end_max) ThrowRangeError("out of range index"); \
+ size_t length = end - start;
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Either you're missing a trailing backslash here or ARGS_THIS has one too many. :-)

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
- // get Buffer from global scope.
- Local<Object> global = v8::Context::GetCurrent()->Global();
- Local<Value> bv = global->Get(String::NewSymbol("Buffer"));
- assert(bv->IsFunction());
- Local<Function> b = Local<Function>::Cast(bv);
+bool HasInstance(Handle<Value> val) {
+ if (!val->IsObject())
+ return false;
+ return HasInstance(val->ToObject());
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

val.As<Object>()?

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
}
-Buffer* Buffer::New(size_t length) {
- HandleScope scope(node_isolate);
+char* Data(Handle<Value> val) {
+ return Data(val->ToObject());
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

assert(val->IsObject())?

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
}
-Buffer* Buffer::New(const char* data, size_t length) {
- HandleScope scope(node_isolate);
+size_t Length(Handle<Value> val) {
+ return Length(val->ToObject());
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Ditto.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
HandleScope scope(node_isolate);
- Local<Value> arg = Integer::NewFromUnsigned(0, node_isolate);
- Local<Object> obj = constructor_template->GetFunction()->NewInstance(1, &arg);
+ if (length > kMaxLength)
+ ThrowRangeError("length > kMaxLength");
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Missing return?

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
- Buffer *buffer = ObjectWrap::Unwrap<Buffer>(obj);
- buffer->Replace(data, length, callback, hint);
+ Handle<Value> argv[2];
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Maybe add a comment why it's save to use Handle rather than Local here (Undefined() is a singleton and length fits in an SMI, hence no risk of the GC reclaiming the values prematurely.)

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
HandleScope scope(node_isolate);
- if (!args[0]->IsUint32()) return ThrowTypeError("Bad argument");
+ if (length > kMaxLength)
+ ThrowRangeError("length > kMaxLength");
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Missing return? Note that ThrowRangeError() returns a Handle<Value>.

EDIT: Happens in more places.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
HandleScope scope(node_isolate);
- if (!args[0]->IsInt32()) {
- return ThrowException(Exception::Error(String::New(
- "value is not a number")));
- }
- int value = (char)args[0]->Int32Value();
-
- Buffer *parent = ObjectWrap::Unwrap<Buffer>(args.This());
- SLICE_ARGS(args[1], args[2])
+ Handle<Object> target = args[0]->ToObject();
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Should be Local. It's not officially deprecated yet.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 14, 2013
src/node_buffer.cc
- return Undefined(node_isolate);
-}
+ ARGS_THIS(args.This())
+ size_t target_length = target->GetIndexedPropertiesExternalArrayDataLength();
+ char* target_data = static_cast<char*>(target->
@bnoordhuis
bnoordhuis Jun 14, 2013 Node.js Foundation member

Prefer to write that like this:

char* target_data = static_cast<char*>(
    target->GetIndexedPropertiesExternalArrayData());
@bnoordhuis bnoordhuis commented on an outdated diff Jun 15, 2013
src/node_internals.h
@@ -137,6 +137,19 @@ inline bool IsBigEndian() {
return GetEndianness() == kBigEndian;
}
+// parse index for external array data
+inline static size_t ParseArrayIndex(v8::Handle<v8::Value> arg, size_t def) {
+ if (arg->IsUndefined())
+ return def;
+
+ int32_t tmp_i = arg->Int32Value();
+
+ if (tmp_i < 0)
+ ThrowRangeError("out of range index");
@bnoordhuis
bnoordhuis Jun 15, 2013 Node.js Foundation member

I'm not a fan of how this schedules an exception, it's a non-obvious side effect. Maybe return static_cast<size_t>(-1) and have callers check the return value.

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 15, 2013
src/node_buffer.cc
- if (source_end > source->length_) {
- return ThrowRangeError("sourceEnd out of bounds");
+ if (args[0]->IsNumber()) {
+ int value = static_cast<char>(args[0]->Int32Value());
@bnoordhuis
bnoordhuis Jun 15, 2013 Node.js Foundation member

args[0]->Int32Value() & 255?

@bnoordhuis
bnoordhuis Jun 15, 2013 Node.js Foundation member

Make that Uint32Value(). Binary operations on signed integers are implementation-defined (though in this case it'd work out okay.)

@trevnorris
trevnorris Jun 17, 2013 collaborator

cool, will do. got that technique from the previous code. :)

@bnoordhuis
Member

Some comments but mostly LGTM.

@trevnorris
Collaborator

@bnoordhuis all fixed. also changed how ParseArrayIndex works.

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 17, 2013
src/node_buffer.cc
#include <limits.h>
-#define MIN(a,b) ((a) < (b) ? (a) : (b))
+#define MIN(a, b) ((a) < (b) ? (a) : (b))
+
+#define CHECK_OOB(r) \
+ if (r) return ThrowRangeError("out of range index");
@bnoordhuis
bnoordhuis Jun 17, 2013 Node.js Foundation member

Wrap in a do { ... } while (0) block without a trailing semi-colon.

@trevnorris
trevnorris Jun 17, 2013 collaborator

done

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 17, 2013
src/node_buffer.cc
- size_t length = args[0]->Uint32Value();
- if (length > Buffer::kMaxLength) {
- return ThrowRangeError("length > kMaxLength");
- }
- new Buffer(args.This(), length);
+ Handle<Value> argv[2];
+ argv[0] = Undefined(node_isolate);
@bnoordhuis
bnoordhuis Jun 17, 2013 Node.js Foundation member

Maybe repeat that comment here from a few lines up. Ditto for the similar code below.

@trevnorris
trevnorris Jun 17, 2013 collaborator

done

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 17, 2013
src/smalloc.h
+
+namespace node {
+
+/**
+ * Simple memory allocator.
+ *
+ * Utilities for external memory allocation management. Is an abstraction for
+ * v8's external array data handling to simplify and centralize how this is
+ * managed.
+ */
+namespace smalloc {
+
+// mirrors deps/v8/src/objects.h
+NODE_EXTERN static const unsigned int kMaxLength = 0x3fffffff;
+
+NODE_EXTERN typedef void (*freeCallback)(char* data, void* hint);
@bnoordhuis
bnoordhuis Jun 17, 2013 Node.js Foundation member

s/freeCallback/FreeCallback/

@trevnorris
trevnorris Jun 17, 2013 collaborator

heh :P

@trevnorris
trevnorris Jun 17, 2013 collaborator

done

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 17, 2013
src/node_buffer.cc
HandleScope scope(node_isolate);
- Local<Value> arg = Integer::NewFromUnsigned(0, node_isolate);
- Local<Object> obj = constructor_template->GetFunction()->NewInstance(1, &arg);
+ if (length > kMaxLength)
+ return Local<Object>::New(node_isolate,
+ ThrowRangeError("length > kMaxLength").As<Object>());
@bnoordhuis
bnoordhuis Jun 17, 2013 Node.js Foundation member

This does not work like you think it does. ThrowRangeError() only schedules the exception, it doesn't return it. It's actual return value is Undefined which, when cast to Object, will produce undefined (pun intended) behavior.

@trevnorris
trevnorris Jun 17, 2013 collaborator

@bnoordhuis ah, well... um. ok, i'm at a lose what should be done then.

@trevnorris
trevnorris Jun 17, 2013 collaborator

changed to assert()

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 17, 2013
src/node_internals.h
@@ -137,6 +143,26 @@ inline bool IsBigEndian() {
return GetEndianness() == kBigEndian;
}
+// parse index for external array data
+inline bool ParseArrayIndex(v8::Handle<v8::Value> arg,
+ size_t def,
+ size_t* ret) MUST_USE_RESULT;
+
+bool ParseArrayIndex(v8::Handle<v8::Value> arg, size_t def, size_t* ret) {
+ if (arg->IsUndefined()) {
+ *ret = def;
+ return false;
@bnoordhuis
bnoordhuis Jun 17, 2013 Node.js Foundation member

Returning false on success and true on error is kind of counter-intuitive.

@trevnorris
trevnorris Jun 17, 2013 collaborator

fixed. thanks.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 17, 2013
src/node_buffer.cc
- Local<Value> argv[1] = { Local<Value>::New(node_isolate, string) };
- Local<Object> instance = b->NewInstance(1, argv);
+bool HasInstance(Handle<Value> val) {
+ if (!val->IsObject())
+ return false;
+ return HasInstance(val.As<Object>());
@bnoordhuis
bnoordhuis Jun 17, 2013 Node.js Foundation member

Could be simplified to return val->IsObject() && HasInstance(val.As<Object>());

@bnoordhuis bnoordhuis commented on an outdated diff Jun 17, 2013
src/node_buffer.cc
- return scope.Close(instance);
+bool HasInstance(Handle<Object> obj) {
+ if (!obj->HasIndexedPropertiesInExternalArrayData())
+ return false;
+ v8::ExternalArrayType type = obj->GetIndexedPropertiesExternalArrayDataType();
+ if (type != v8::kExternalUnsignedByteArray)
+ return false;
+ return true;
@bnoordhuis
bnoordhuis Jun 17, 2013 Node.js Foundation member

Could be simplified to return type == v8::kExternalUnsignedByteArray;

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 18, 2013
src/node_buffer.cc
#include <limits.h>
-#define MIN(a,b) ((a) < (b) ? (a) : (b))
+#define MIN(a, b) ((a) < (b) ? (a) : (b))
+
+#define CHECK_NOT_OOB(r) \
+ do { if (!r) return ThrowRangeError("out of range index"); } while (0)
@bnoordhuis
bnoordhuis Jun 18, 2013 Node.js Foundation member

if (!(r)). Right now, CHECK_NOT_OOB(end > end_max && end > 0) gets evaluated as if (!end > end_max && ...) which presumably is not what you want.

Also, the line before it is too long (81 characters.) The backslash should preferably go on the 79th column.

@trevnorris
trevnorris Jun 18, 2013 collaborator

oy. two n00b mistakes. thanks.

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 18, 2013
src/node_buffer.cc
+ Local<Function> bv = args[0].As<Function>();
+
+ p_buffer_fn = Persistent<Function>::New(node_isolate, bv);
+
+ Local<Object> proto = bv->Get(String::New("prototype")).As<Object>();
@bnoordhuis
bnoordhuis Jun 18, 2013 Node.js Foundation member

I would add bv->IsFunction() and proto->IsObject() sanity checks here.

@trevnorris
trevnorris Jun 18, 2013 collaborator

bv is checked above as args[0]->IsFunction(). i'll add the assert on proto, thanks for pointing that out.

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 18, 2013
src/node_buffer.h
-} // namespace node buffer
+namespace Buffer {
+
+NODE_EXTERN static const unsigned int kMaxLength = smalloc::kMaxLength;
@bnoordhuis
bnoordhuis Jun 18, 2013 Node.js Foundation member

I don't think this needs to be declared NODE_EXTERN. A static const variable has no linkage as long as you don't take its address.

/cc @piscisaureus - Is that true on Redmond OS as well?

@piscisaureus
piscisaureus Jun 18, 2013 Node.js Foundation member

It works on windows I think. dllimport/export on static variables acts as a shorthand to automatically do the indirection. e.g.

__declspec(dllexport) int bar = 5;

is roughtly equivalent to

int bar_value = 5;
__declspec(dllexport) int* bar = &bar_value;

and

__declspec(dllimport) int bar;
bar++

becomes

__declspec(dllimport) int* bar;
*bar++
@bnoordhuis
Member

Nearly there, Trevor. :-)

@trevnorris
Collaborator

@bnoordhuis thanks dude. those 3 things are fixed. :)

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 18, 2013
src/smalloc.cc
+ size_t length = args[4]->Uint32Value();
+ size_t source_length = source->GetIndexedPropertiesExternalArrayDataLength();
+ size_t dest_length = dest->GetIndexedPropertiesExternalArrayDataLength();
+ char* source_data = static_cast<char*>(
+ source->GetIndexedPropertiesExternalArrayData());
+ char* dest_data = static_cast<char*>(
+ dest->GetIndexedPropertiesExternalArrayData());
+
+ assert(source_data != NULL);
+ assert(dest_data != NULL);
+ // necessary to check in case (source|dest)_start _and_ length overflow
+ assert(length <= source_length);
+ assert(length <= dest_length);
+ // now we can guarantee these will catch oob access and *_start overflow
+ assert(source_start + length <= source_length);
+ assert(dest_start + length <= dest_length);
@bnoordhuis
bnoordhuis Jun 18, 2013 Node.js Foundation member

I think there's a logic bug here. Assuming size_t has 32 bits, then if dest_start=0xfffffffe, length=10 and dest_length=20 then (length < dest_length) and (dest_start + length <= dest_length) both hold but dest_data + dest_start will point outside the buffer.

@trevnorris
trevnorris Jun 18, 2013 collaborator

added the two necessary checks. thanks for pointing this out.

@bnoordhuis bnoordhuis commented on an outdated diff Jun 18, 2013
src/node_buffer.cc
- return scope.Close(Integer::New(to_copy, node_isolate));
-}
+ // optimize single ascii character case
+ if (at_length == 1) {
+ int value = static_cast<int>((*at)[0]);
+ if (value <= 127) {
@bnoordhuis
bnoordhuis Jun 18, 2013 Node.js Foundation member

This is always true when char is signed: (int) (signed char) 128 == -128

@bnoordhuis bnoordhuis and 1 other commented on an outdated diff Jun 18, 2013
src/node_buffer.cc
#include <limits.h>
-#define MIN(a,b) ((a) < (b) ? (a) : (b))
+#define MIN(a, b) ((a) < (b) ? (a) : (b))
+
+#define CHECK_NOT_OOB(r) \
+ do { if (!r) return ThrowRangeError("out of range index"); } while (0)
@bnoordhuis
bnoordhuis Jun 18, 2013 Node.js Foundation member

Please put the parentheses around the expression here (i.e. if (!(r))) rather than adding parens on a case-by-case basis.

@trevnorris
trevnorris Jun 18, 2013 collaborator

thanks. done.

trevnorris added some commits Apr 17, 2013
@trevnorris trevnorris smalloc: initial implementation
smalloc is a simple utility for quickly allocating external memory onto
js objects. This will be used to centralize how memory is managed in
node, and will become the backer for Buffers. So in the future crypto's
SlabBuffer, stream's SlabAllocator will be removed.

Note on the js API: because no arguments are optional the order of
arguments have been placed to match their cc counterparts as closely as
possible.
8f3f9f7
@trevnorris trevnorris smalloc: add api to manually dispose Persistent
If the user knows the allocation is no longer needed then the memory can
be manually released.

Currently this will not ClearWeak the Persistent, so the callback will
still run.

If the user passed a ClearWeak callback, and then disposed the object,
the buffer callback argument will == NULL.
252cdfa
@trevnorris trevnorris buffer: use smalloc as backing data store
Memory allocations are now done through smalloc. The Buffer cc class has
been removed completely, but for backwards compatibility have left the
namespace as Buffer.

The .parent attribute is only set if the Buffer is a slice of an
allocation. Which is then set to the alloc object (not a Buffer).

The .offset attribute is now a ReadOnly set to 0, for backwards
compatibility. I'd like to remove it in the future (pre v1.0).

A few alterations have been made to how arguments are either coerced or
thrown. All primitives will now be coerced to their respective values,
and (most) all out of range index requests will throw.

The indexes that are coerced were left for backwards compatibility. For
example: Buffer slice operates more like Array slice, and coerces
instead of throwing out of range indexes. This may change in the future.

The reason for wanting to throw for out of range indexes is because
giving js access to raw memory has high potential risk. To mitigate that
it's easier to make sure the developer is always quickly alerted to the
fact that their code is attempting to access beyond memory bounds.

Because SlowBuffer will be deprecated, and simply returns a new Buffer
instance, all tests on SlowBuffer have been removed.

Heapdumps will now show usage under "smalloc" instead of "Buffer".

ParseArrayIndex was added to node_internals to support proper uint
argument checking/coercion for external array data indexes.

SlabAllocator had to be updated since handle_ no longer exists.
3a2f273
@trevnorris trevnorris buffer: reimplement Buffer pools
While the new Buffer implementation is much faster we still have the
necessity of using Buffer pools. This is undesirable because it may
still lead to unwanted memory retention, but for the time being this is
the best solution.

Because of this re-introduction, and since there is no more SlowBuffer
type, the SlowBuffer method has been re-purposed to return a non-pooled
Buffer instance. This will be helpful for developers to store data for
indeterminate lengths of time without introducing a memory leak.

Another change to Buffer pools was that they are only allocated if the
requested chunk is < poolSize / 2. This was done because allocations are
much quicker now, and it's a better use of the pool.
456942a
@trevnorris trevnorris buffer: deprecate legacy code
Several things are now no longer necessary. These have been deprecated,
and will be removed in v0.13.
f489649
@trevnorris trevnorris buffer: remove c-style casts 56869d9
@trevnorris trevnorris buffer: expose class methods alloc and dispose
Expose the ability for users to allocate and manually dispose data on
any object. These are user-safe versions of internal smalloc functions.
fb40da8
@trevnorris trevnorris buffer: implement new fill behavior
Old fill would take the char code of the first character and wrap around
the int to fit in the 127 range. Now fill will duplicate whatever string
is given through the entirety of the buffer.

Note: There is one bug around ending on a partial fill of any character
outside the ASCII range.
4b40358
@trevnorris trevnorris buffer: proper API export for Windows
So that Windows users can properly include smalloc and node_buffer,
NODE_EXTERN was added to the headers that export this functionality.
7373c4d
@bnoordhuis
Member

LGTM

@trevnorris trevnorris merged commit 7373c4d into nodejs:master Jun 18, 2013

1 check was pending

default
@edhemphill

I'm pretty late to the party and was hoping someone could catch me up... What's the correct way to wrap existing memory with a Buffer? I can't seem to get the free_callback to call with Buffer. I know there was some chatter on IRC quite a while back on getting rid of this. We're on the 0.10.x series but can move if necessary...

void free_test_cb(char *m,void *hint) {
    DBG_OUT("FREEING MEMORY.");
    free(m);
}

Handle<Value> WrapMemBufferTest(const Arguments& args) {
    HandleScope scope;
    char *mem = (char *) ::malloc(100);
    memset(mem,'A',100);
    node::Buffer *buf = node::Buffer::New(mem,100,free_test_cb,0);
    return scope.Close(buf->handle_);
}

But the free_test_cb() is just not getting called in a simple test program.
...and then I even tried throwing these in there:

void weak_cb(Persistent<Value> object, void* parameter) {
    object.Dispose();
}

Handle<Value> WrapMemBufferTest(const Arguments& args) {
    HandleScope scope;
    char *mem = (char *) ::malloc(100);
    memset(mem,'A',100);
    node::Buffer *buf = node::Buffer::New(mem,100,free_test_cb,0);
    buf->handle_.MakeWeak(NULL, weak_cb);  // new
    return scope.Close(buf->handle_);
}

Any advice appreciated.

@bnoordhuis
Member

@edhemphill Finalizers are deferred, they don't run inmmediately when the last reference to the object goes away. If your program is short-lived, they may not run. Theoretically, even in longer-lived programs, they may not run ever. If your resource needs a deterministic life cycle, you have to manage it yourself.

@edhemphill

@bnoordhuis Interesting... thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment