feat(js): Extract *info functions into individual functions #4751

Open
wants to merge 11 commits into
base: main

phated
Contributor

@phated phated commented Jun 24, 2022

This is a draft that builds upon #4750

Binaryen.ml exposes a separate function for each of these utilities, but implementing them on top of the getMemorySegmentInfoByIndex helper does a lot of extra work on each call. By providing the individual functions under a namespace, they can be called directly and also reused inside the "info" helper.

I'd like to do this to the other "info" helpers in the JS wrapper, but wanted to gather feedback first.
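
A minimal sketch of the cost difference described above. The `fakeNative` object and its call counter are hypothetical stand-ins for the native bindings, not Binaryen's API. The grab-bag helper reads every field on each call, while the individual functions read only what is asked for and can still be composed into the helper.

```javascript
// Hypothetical stand-in for the native bindings; counts how many reads happen.
const fakeNative = {
  calls: 0,
  segments: [{ offset: 16, data: new Uint8Array([1, 2, 3]), passive: false }],
  getOffset(i) { this.calls++; return this.segments[i].offset; },
  getData(i) { this.calls++; return this.segments[i].data.slice(); },
  isPassive(i) { this.calls++; return this.segments[i].passive; },
};

// Individual functions grouped under a namespace: each one does a single read.
const memorySegment = {
  offset: (i) => fakeNative.getOffset(i),
  data: (i) => fakeNative.getData(i),
  passive: (i) => fakeNative.isPassive(i),
};

// The "info" helper becomes a thin composition of the individual functions.
function getMemorySegmentInfoByIndex(i) {
  return {
    offset: memorySegment.offset(i),
    data: memorySegment.data(i),
    passive: memorySegment.passive(i),
  };
}

// Asking for one field costs one native read...
const offsetOnly = memorySegment.offset(0);
const callsForOffsetOnly = fakeNative.calls;

// ...while the info helper always pays for all of them.
fakeNative.calls = 0;
const info = getMemorySegmentInfoByIndex(0);
const callsForInfo = fakeNative.calls;
```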

@dcodeIO
Contributor

dcodeIO commented Jun 24, 2022

Reminds me of the existing classes for Expression, Block, Function etc., which provide both namespaced static functions like Function.getName and convenient wrapper classes with getters for Function#name etc. through deriveWrapperInstanceMembers. Not all of the elements have a class yet, but perhaps here there could be a MemorySegment class (perhaps on top of a MemorySegmentRef) utilizing the existing concept?

MemorySegment
 .getOffset
 .setOffset
 .getData
 .setData
 .isPassive
 .setPassive
 #offset [get/set]
 #data [get/set]
 #passive [get/set]

Here's what's done for functions, for example:

// Function wrapper
Module['Function'] = (() => {
  // Closure compiler doesn't allow multiple `Function`s at top-level, so:
  function Function(func) {
    if (!(this instanceof Function)) {
      if (!func) return null;
      return new Function(func);
    }
    if (!func) throw Error("function reference must not be null");
    this[thisPtr] = func;
  }
  Function['getName'] = function(func) {
    return UTF8ToString(Module['_BinaryenFunctionGetName'](func));
  };
  Function['getParams'] = function(func) {
    return Module['_BinaryenFunctionGetParams'](func);
  };
  Function['getResults'] = function(func) {
    return Module['_BinaryenFunctionGetResults'](func);
  };
  Function['getNumVars'] = function(func) {
    return Module['_BinaryenFunctionGetNumVars'](func);
  };
  Function['getVar'] = function(func, index) {
    return Module['_BinaryenFunctionGetVar'](func, index);
  };
  Function['getNumLocals'] = function(func) {
    return Module['_BinaryenFunctionGetNumLocals'](func);
  };
  Function['hasLocalName'] = function(func, index) {
    return Boolean(Module['_BinaryenFunctionHasLocalName'](func, index));
  };
  Function['getLocalName'] = function(func, index) {
    return UTF8ToString(Module['_BinaryenFunctionGetLocalName'](func, index));
  };
  Function['setLocalName'] = function(func, index, name) {
    preserveStack(() => {
      Module['_BinaryenFunctionSetLocalName'](func, index, strToStack(name));
    });
  };
  Function['getBody'] = function(func) {
    return Module['_BinaryenFunctionGetBody'](func);
  };
  Function['setBody'] = function(func, bodyExpr) {
    Module['_BinaryenFunctionSetBody'](func, bodyExpr);
  };
  deriveWrapperInstanceMembers(Function.prototype, Function);
  Function.prototype['valueOf'] = function() {
    return this[thisPtr];
  };
  return Function;
})();
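
To illustrate how the same pattern could carry over to a MemorySegment class, here is a simplified, self-contained sketch. `deriveInstanceAccessors` is a hypothetical, reduced stand-in for deriveWrapperInstanceMembers (the real helper may behave differently), and the `segments` map is an assumed in-memory substitute for the wasm module.

```javascript
// Hypothetical key used to stash the underlying reference on wrapper instances.
const thisPtr = Symbol("thisPtr");

// Simplified assumption of the derivation step: for every static `getX` / `isX`
// function, define an instance property `x` whose getter forwards to the static
// function, and pair it with `setX` when such a static function exists.
function deriveInstanceAccessors(prototype, staticMembers) {
  for (const key of Object.keys(staticMembers)) {
    const match = /^(get|is)([A-Z].*)$/.exec(key);
    if (!match) continue;
    const suffix = match[2];
    const propName = suffix[0].toLowerCase() + suffix.slice(1);
    const getter = staticMembers[key];
    const setter = staticMembers["set" + suffix];
    Object.defineProperty(prototype, propName, {
      get() { return getter(this[thisPtr]); },
      set: setter ? function (value) { setter(this[thisPtr], value); } : undefined,
    });
  }
}

// Assumed in-memory segment store standing in for the wasm module.
const segments = new Map([[0, { offset: 8, passive: false }]]);

function MemorySegment(ref) {
  this[thisPtr] = ref;
}
// Namespaced static functions, usable without creating a wrapper instance.
MemorySegment.getOffset = (ref) => segments.get(ref).offset;
MemorySegment.setOffset = (ref, value) => { segments.get(ref).offset = value; };
MemorySegment.isPassive = (ref) => segments.get(ref).passive;
deriveInstanceAccessors(MemorySegment.prototype, MemorySegment);

const segment = new MemorySegment(0);
segment.offset = 32; // forwards to MemorySegment.setOffset(0, 32)
```

Callers that only need one value can use the static functions directly, while the wrapper instance offers the getter/setter convenience on top of the same functions.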

@phated
Contributor Author

phated commented Jun 24, 2022

Yeah, I was wondering if doing something similar to Expressions was desirable. I'm fine implementing it like that, but it'll add more work, which I'll have to find some time to do.

Is there a way to move forward with the straightforward "instance logic" that I did or would the PR only be accepted if the static/class implementation were completed?

@tlively
Member

tlively commented Jun 28, 2022

cc @ashleynh, who has worked on the internal representation of data segments in support of multiple memory. @dcodeIO's idea to reuse that class utility sounds like a good goal to me, but intermediate progress might be useful to land as well.

Comment on lines +713 to +718
'module'() {
  return UTF8ToString(Module['_BinaryenMemoryImportGetModule'](module));
},
'base'() {
  return UTF8ToString(Module['_BinaryenMemoryImportGetBase'](module));
},
Contributor Author

Should these be on something like self['memoryImport']?

Member

I think it's ok as it is.

Comment on lines 725 to 728
'max'() {
  if (Module['_BinaryenMemoryHasMax'](module)) {
    return Module['_BinaryenMemoryGetMax'](module);
  }
Contributor Author

Wasn't sure if this should have the if condition inside the function.

Member

I think if there is no maximum then we should probably either error or return Infinity. Doing it here in JS makes the most sense to me (we couldn't return Infinity from C).
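
A minimal sketch of the "return Infinity" option, assuming a hypothetical `makeFakeMemory` stand-in for the native HasMax/GetMax calls. It shows the accessor always returning a number, with the missing-maximum case mapped on the JS side.

```javascript
// Hypothetical stand-in for the native bindings; `max === null` models "no maximum".
function makeFakeMemory(max) {
  return {
    hasMax: () => max !== null,
    getMax: () => max,
  };
}

// Always returns a number: the declared maximum, or Infinity when there is none.
// The mapping lives in JS because the native getter only returns an integer.
function memoryMax(memory) {
  return memory.hasMax() ? memory.getMax() : Infinity;
}

const bounded = memoryMax(makeFakeMemory(256));
const unbounded = memoryMax(makeFakeMemory(null));
```

One consequence of this choice is that callers comparing a requested size against the maximum (for example `pages <= max`) need no special case for memories without a limit.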

@phated phated marked this pull request as ready for review July 8, 2022 21:14
@phated
Contributor Author

phated commented Jul 8, 2022

Thanks for the guidance folks! I tried to replicate patterns I had seen elsewhere in binaryen.js-post.js. Please take a look and let me know if I should be doing anything differently.

I've updated the getSomethingInfo functions that we want to remove in Grain but there are still more. Ideally, this PR can guide others in updating the rest (and of course I'll be updating things as we use them in Grain).

@phated phated changed the title feat: Extract memorySegment functions into individual functions feat: Extract *info functions into individual functions Jul 8, 2022
@phated phated changed the title feat: Extract *info functions into individual functions feat(js): Extract *info functions into individual functions Jul 8, 2022

// ElementSegment wrapper
Module['ElementSegment'] = (() => {
// Closure compiler doesn't allow multiple `Function`s at top-level, so:
function Function(func) {
Member

Should all these Function be ElementSegment?

Contributor Author

Probably. I wasn't sure if these constructors should match the key or if Function was meant to be generic (due to the comment). I'll change them.

@phated
Contributor Author

phated commented Jul 12, 2022

@kripken thanks for the feedback! I've made the updates you suggested

- if (Module['_BinaryenMemoryHasMax'](module)) {
-   memoryInfo['max'] = Module['_BinaryenMemoryGetMax'](module);
+ var max = self['memory']['max']();
+ if (max) {
Member

What is this checking? If the max is 0?

I'd expect this to check if the max is Infinity, but maybe I'm missing something...
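
To make the question concrete: in JavaScript both a real maximum and `Infinity` are truthy, so `if (max)` only filters out `0` (or `undefined`, if the accessor returned nothing). The sketch below uses a hypothetical `buildInfo` helper to compare the truthiness test with an explicit `Infinity` test.

```javascript
// Hypothetical helper mirroring the shape of the info object under review.
// `check` decides whether `max` is copied into the result.
function buildInfo(max, check) {
  const info = { initial: 1 };
  if (check(max)) info.max = max;
  return info;
}

const truthy = (max) => Boolean(max);     // what `if (max)` tests
const finite = (max) => max !== Infinity; // explicit "has a maximum" test

const results = {
  // Infinity is truthy, so the truthiness test copies it into the info object.
  truthyWithInfinity: "max" in buildInfo(Infinity, truthy),
  // 0 is falsy, so a (degenerate) maximum of 0 would be dropped.
  truthyWithZero: "max" in buildInfo(0, truthy),
  // The explicit test omits `max` only when there is no maximum.
  finiteWithInfinity: "max" in buildInfo(Infinity, finite),
  finiteWithZero: "max" in buildInfo(0, finite),
};
```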

Contributor Author

@kripken I'm not sure, I copied this from the existing code. Who wrote that?

Contributor Author

@kripken how about this?

Suggested change
- if (max) {
+ if (max !== Infinity) {

Contributor

FWIW, I guess the NFC (no functional change) value here would be undefined. Apart from this, if my other suggestions are adopted, there will be a HasMax function around that can be used instead, closely matching the previous code.

Comment on lines +713 to +730
'module'() {
  return UTF8ToString(Module['_BinaryenMemoryImportGetModule'](module));
},
'base'() {
  return UTF8ToString(Module['_BinaryenMemoryImportGetBase'](module));
},
'initial'() {
  return Module['_BinaryenMemoryGetInitial'](module);
},
'shared'() {
  return Boolean(Module['_BinaryenMemoryIsShared'](module));
},
'max'() {
  if (Module['_BinaryenMemoryHasMax'](module)) {
    return Module['_BinaryenMemoryGetMax'](module);
  } else {
    return Infinity;
  }
Contributor

Is it intended that these are mixed with the instructions like memory.fill? Seems that it's quite easy to mistake them for instructions like memory.shared. Also, in light of multi memory, there'll likely need to be a memory wrapper anyhow to deal with any memory, so I wouldn't expect these to last for very long.

Contributor Author

Yes, see where Alon said "I think it's ok as it is." above.

I'm just making these easier to use. If this hasn't been updated for multiple memory, then someone else will need to help make those changes.

Contributor

@dcodeIO dcodeIO Jul 19, 2022

Hmm, has the following variant been considered?

Module#getMemoryModule
Module#getMemoryBase
Module#getMemoryInitial
Module#isMemoryShared
Module#hasMemoryMax
Module#getMemoryMax

Those would about match Module#setMemory for now, which is similarly subject to change, with all of them not conflicting with instructions.
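
A rough sketch of how this variant might look from the caller's side. `FakeModule` is a hypothetical class and its `setMemory` signature is simplified for illustration; only the method names come from the list above.

```javascript
// Hypothetical minimal module class; the real wrapper forwards to the C API.
class FakeModule {
  constructor() {
    this._memory = null;
  }
  // Stand-in for the existing setter (signature simplified for this sketch).
  setMemory(initial, maximum, shared = false) {
    this._memory = { initial, maximum, shared };
  }
  // Getters live next to setMemory on the module instance, so they cannot be
  // mistaken for instruction builders such as memory.fill.
  getMemoryInitial() { return this._memory.initial; }
  isMemoryShared() { return this._memory.shared; }
  hasMemoryMax() { return this._memory.maximum !== undefined; }
  getMemoryMax() { return this._memory.maximum; }
}

const mod = new FakeModule();
mod.setMemory(1, 16);
const summary = {
  initial: mod.getMemoryInitial(),
  shared: mod.isMemoryShared(),
  max: mod.hasMemoryMax() ? mod.getMemoryMax() : undefined,
};

// A memory declared without a maximum reports that through hasMemoryMax.
const unboundedMod = new FakeModule();
unboundedMod.setMemory(1);
const unboundedHasMax = unboundedMod.hasMemoryMax();
```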

Contributor Author

I've not considered anything (maybe others have) because I'm just converting what is here from "expensive grab-bag" to "individual functions". Further improvements can be done later, as @tlively mentioned in the primary thread.

Contributor

Perhaps this is a misunderstanding. I am not suggesting any significant changes in the above variant, just to move / rename these from memory (which so far covers exclusively instructions) to the Module instance (where methods of this kind without wrappers would typically go). Basically next to setMemory.

Contributor Author

I'm going to leave that up to @kripken since he said it was fine to have these here when I asked.

Member

Hmm, good point @dcodeIO, you're right that this would mix instructions like memory.fill() with helpers. We've put the helpers on the relevant JS objects using makeExpressionWrapper. So I agree that adding new getX/setX functions alongside those is more consistent. That seems best.

@phated Sorry for not reading this more in detail the first time! (I am not as up to date on the JS API as I used to be, since I've been busy with other things...)

Contributor Author

Can someone else pick up these changes? We really need these changes in Grain and I don't have time to continue to make these large changes, as I'm actively needing to interview for jobs.

Comment on lines +4862 to +4863
Module['GlobalImport'] = (() => {
// Closure compiler doesn't allow multiple `GlobalImport`s at top-level, so:
Contributor

The IIFE here is not necessary. The Closure compiler restriction only affects names of JS built-ins like Function.
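
A small sketch of the distinction, using a hypothetical `api` object in place of Module. The IIFE keeps a local declaration named `Function` from colliding with the global built-in, while a wrapper with a unique name can simply be declared directly.

```javascript
// Hypothetical export object standing in for `Module`.
const api = {};

// The IIFE confines the declaration of a wrapper named `Function` to a local
// scope, so the global built-in of the same name is left untouched.
api.Function = (() => {
  function Function(ref) {
    this.ref = ref;
  }
  return Function;
})();

// No IIFE needed: `ElementSegment` does not collide with any built-in name.
function ElementSegment(ref) {
  this.ref = ref;
}
api.ElementSegment = ElementSegment;

// Outside the IIFE, `Function` still refers to the built-in constructor.
const builtinIntact = typeof Function === "function" && Function !== api.Function;
const wrapped = new api.ElementSegment(7);
```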

Comment on lines +4886 to +4887
Module['Global'] = (() => {
// Closure compiler doesn't allow multiple `Global`s at top-level, so:
Contributor

Similarly, the IIFE here is not necessary. Only affects Function.

Comment on lines +4916 to +4917
Module['FunctionImport'] = (() => {
// Closure compiler doesn't allow multiple `FunctionImport`s at top-level, so:
Contributor

As well: The IIFE here is not necessary. Only affects JS built-ins like Function.

Comment on lines +4835 to +4836
Module['Export'] = (() => {
// Closure compiler doesn't allow multiple `Export`s at top-level, so:
Contributor

Same

Comment on lines +4802 to +4803
Module['ElementSegment'] = (() => {
// Closure compiler doesn't allow multiple `ElementSegment`s at top-level, so:
Contributor

Same :)

Comment on lines +2542 to +2558
self['memorySegment'] = {
  'offset'(id) {
    return Module['_BinaryenGetMemorySegmentByteOffset'](module, id);
  },
  'data'(id) {
    const size = Module['_BinaryenGetMemorySegmentByteLength'](module, id);
    const ptr = _malloc(size);
    Module['_BinaryenCopyMemorySegmentData'](module, id, ptr);
    const res = new Uint8Array(size);
    res.set(new Uint8Array(buffer, ptr, size));
    _free(ptr);
    return res.buffer;
  },
  'passive'(id) {
    return Boolean(Module['_BinaryenGetMemorySegmentPassive'](module, id));
  }
}
Contributor

Another suggestion to make this more uniform with other code. Move / rename these to

Module#getMemorySegmentByteOffset
Module#getMemorySegmentData
Module#isMemorySegmentPassive

Then becomes

myModule.getMemorySegmentByteOffset(segmentIndex)

instead of

myModule.memorySegment.offset(id)

Apart from adhering to surrounding code style, this also has the advantage that these won't conflict with eventual setters.
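
A hedged sketch of the setter point, with a hypothetical `SegmentModule` class and an assumed future `setMemorySegmentByteOffset` method (neither exists in the PR). With `get`/`is` prefixes in the name, a matching `set` method can be added later without reusing or overloading the getter's name.

```javascript
// Hypothetical module wrapper; the segment array stands in for the wasm module.
class SegmentModule {
  constructor(segments) {
    this._segments = segments;
  }
  getMemorySegmentByteOffset(index) {
    return this._segments[index].offset;
  }
  isMemorySegmentPassive(index) {
    return this._segments[index].passive;
  }
  getMemorySegmentData(index) {
    // Return a copy so callers cannot mutate the module's bytes.
    return this._segments[index].data.slice().buffer;
  }
  // Assumed future setter: it fits the same naming scheme and does not clash
  // with the getter, unlike a second meaning for `memorySegment.offset(id)`.
  setMemorySegmentByteOffset(index, offset) {
    this._segments[index].offset = offset;
  }
}

const myModule = new SegmentModule([
  { offset: 0, passive: false, data: new Uint8Array([1, 2, 3]) },
]);
const segmentIndex = 0;
myModule.setMemorySegmentByteOffset(segmentIndex, 1024);
const newOffset = myModule.getMemorySegmentByteOffset(segmentIndex);
const dataLength = myModule.getMemorySegmentData(segmentIndex).byteLength;
```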


4 participants