Skip to content

Additional C-API utilities#1261

Closed
dcodeIO wants to merge 16 commits intoWebAssembly:masterfrom
dcodeIO:js-check-expression-types
Closed

Additional C-API utilities#1261
dcodeIO wants to merge 16 commits intoWebAssembly:masterfrom
dcodeIO:js-check-expression-types

Conversation

@dcodeIO
Copy link
Copy Markdown
Contributor

@dcodeIO dcodeIO commented Nov 2, 2017

This adds some utility to the C-API to check ids (kinds) and types of expressions, here especially globals.

My use case is that I'd like to reuse the "precompute" pass so that I don't have to reimplement constant evaluation in the compiler. Hence, I add globals with complex initializers (i.e. binary expressions that are not supported by the MVP, for example a global / enum value 1 << 2) and rely on Binaryen to make these constants through precompute. But this might fail, so I need a way to determine if it worked or not (BinaryenGlobalGetInit -> BinaryenExpressionGetId) to generate proper error messages.

In general, there are a few places where getters and setters like these might make sense (one more example is to rename function types, i.e. if reading a binary that does not have names, here: BinaryenGetFunctionTypeBySignature -> BinaryenSetFunctionTypeName or something like that). I am open to adding more.

I didn't put too much effort into this yet and would like to have your opinion on how to name or structure these first. For example, at some places functions are named BinaryenSubjectOperation (i.e. BinaryenExpressionPrint) while at others it's BinaryenOperationSubject (i.e. BinaryenAddFunction). Figured that these are most likely named after the first argument. If it is an expression, it is BinaryenExpressionXY etc., correct?

@dcodeIO
Copy link
Copy Markdown
Contributor Author

dcodeIO commented Nov 2, 2017

The latest commit will most likely break tests because of the addition of a tracing globals map similar to those used for expressions and functions. I also figured that some form of BinaryenExpressionPrecompute would be ideal for my use case (though I haven't yet understood how Flow works) and in turn would make some (but not all) of the additions nice-to-have but not immediately useful for me anymore.

Copy link
Copy Markdown
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review carefully yet, but yes, this looks like overall like a good direction.

Comment thread src/binaryen-c.h Outdated

// Print an expression to stdout. Useful for debugging.
void BinaryenExpressionPrint(BinaryenExpressionRef expr);
BinaryenIndex BinaryenExpressionGetId(BinaryenExpressionRef expr);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should return BinaryenExpressionId?

Comment thread src/binaryen-c.h Outdated
@@ -356,6 +390,12 @@ void BinaryenRemoveExport(BinaryenModuleRef module, const char* externalName);
// Globals

BinaryenImportRef BinaryenAddGlobal(BinaryenModuleRef module, const char* name, BinaryenType type, int8_t mutable_, BinaryenExpressionRef init);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, this was wrong before, AddGlobal should return a global, not an import. Let's fix that here and in the new places it is used in this PR.

@dcodeIO
Copy link
Copy Markdown
Contributor Author

dcodeIO commented Nov 3, 2017

Not sure about setting names, though. Is it necessary to update the respective lookup maps (i.e. Module#functionsMap) on name changes as well? These are indexed by Name, and the updates perform a Name#set.

@dcodeIO
Copy link
Copy Markdown
Contributor Author

dcodeIO commented Nov 3, 2017

The latest commit also changes BinaryenRemoveImport to BinaryenRemoveImportByName and BinaryenRemoveExport to BinaryenRemoveExportByName for clarity (there might be multiple ways to reference an export, not just by name but also by reference, in the future). While this probably won't be much of an issue (it's relatively new and I'm probably the only one who was ever using it), I also feel that changing this now is better than doing it later. Once updated, this can also become transparent to binaryen.js users.

In general, a clean solution would also require renaming all the methods that take a Module as their first argument to BinaryenModuleXY, with BinaryenRemoveImport becoming BinaryenModuleRemoveImportByName. Currently, both notations are used - sometimes with, sometimes without Module. Not immediately necessary, but might become an issue in the future, so mentioning.

@dcodeIO dcodeIO changed the title WIP: Added expression utility to C-API Additional C-API utilities Nov 3, 2017
@dcodeIO dcodeIO force-pushed the js-check-expression-types branch from cd5c6c5 to ed1e52e Compare November 3, 2017 23:56
Comment thread src/binaryen-c.h Outdated
BinaryenFunctionTypeRef BinaryenAddFunctionType(BinaryenModuleRef module, const char* name, BinaryenType result, BinaryenType* paramTypes, BinaryenIndex numParams);
// Gets the function type at the specified index. Returns `NULL` when there are no more function types.
BinaryenFunctionTypeRef BinaryenGetFunctionTypeAt(BinaryenModuleRef module, BinaryenIndex index);
// Gets the name of the specified function.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment should say function type

Comment thread src/binaryen-c.h Outdated
BinaryenFunctionTypeRef BinaryenGetFunctionTypeAt(BinaryenModuleRef module, BinaryenIndex index);
// Gets the name of the specified function.
const char* BinaryenFunctionTypeGetName(BinaryenFunctionTypeRef ftype);
// Sets the name of the specified function.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Comment thread src/binaryen-c.h
// Add a new function type. This is thread-safe.
// Note: name can be NULL, in which case we auto-generate a name
BinaryenFunctionTypeRef BinaryenAddFunctionType(BinaryenModuleRef module, const char* name, BinaryenType result, BinaryenType* paramTypes, BinaryenIndex numParams);
// Gets the function type at the specified index. Returns `NULL` when there are no more function types.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you want to get function types by their index? can't you use their names?

Copy link
Copy Markdown
Contributor Author

@dcodeIO dcodeIO Nov 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea here is to have a simple way to iterate over all function types (by calling this function with 0, 1, etc. until it returns NULL) and then being able to use the getters and setters below to evaluate and maybe modify them. Could be used, for example, when loading an existing module with the intention to unify function type names to combinations of "i", "I", "f", "F" and "v" in their debug info.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. For iteration, perhaps another option is something like a JS API that returns an array of the names, and a C API underneath that does something similar (mallocs a list that must later be freed, and returns it)? Just a suggestion, I'm not sure what is best, but overall I'd like to avoid using indexes (as we have names intentionally, and also, the indexes are internal and do not necessarily match the wasm binary indexes, which could be confusing).

If you want, you can split this into smaller PRs, as we can land the obvious stuff already, but the areas where the API design isn't yet clear may take longer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, going to split it up then - and taking the array idea into account.

Comment thread src/binaryen-c.h Outdated
// Gets the import at the specified index from the specified module. Returns `NULL` when there are no more imports.
BinaryenImportRef BinaryenGetImportAt(BinaryenModuleRef module, BinaryenIndex index);
// Gets the internal name of the specified import. This is the name referenced by other internal elements.
const char* BinaryenImportGetInternalName(BinaryenImportRef import);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can use just Name, as that's the field name in the code. But the comment might mention it is the internal name.

Comment thread src/binaryen-c.h Outdated
// Sets the internal name of the specified import. This is the name referenced by other internal elements.
void BinaryenImportSetInternalName(BinaryenImportRef import, const char* newName);
// Gets the external module name of the specified import. This is the name of the module being imported from.
const char* BinaryenImportGetExternalModuleName(BinaryenImportRef import);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can also remove the External from these, I think.

@dcodeIO dcodeIO force-pushed the js-check-expression-types branch from fddc0df to 0b39ea7 Compare November 7, 2017 00:25
@dcodeIO
Copy link
Copy Markdown
Contributor Author

dcodeIO commented Nov 7, 2017

Closing in favor of splitting this up into multiple PRs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants