Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for func_id rollover #94

Merged
merged 11 commits into from Jan 3, 2023
Merged

Conversation

swimmadude66
Copy link
Contributor

Since func_id is read from magic as a 16-bit signed integer, the callFunction method was receiving it in two's complement. This meant that any func id over 32,767 wrapped to a negative number, and thus could not be found in the context's fnMap.

This PR handles this by re-typing magic to a uint32_t, allowing over 4B func ids to be generated before overflow. There is a TS-imposed limit before this, though, as the fnMap itself only supports 2^24 keys. I've added a test that all values up to that number work, and to expect a rangeError when the map is overfull. (This test takes a LONG time, so feel free to remove. just wanted a test case that failed with the current version, and passed with the uint32_t change)

This change addresses our issues, but in general, I'd like to request some way to clean up the fnMap. Even once a function is dispose or set to undefined, the fnMap retains it, and the fnIndex increases forever. I'm not sure of the best fix for this, but it means eventually even this fix will cause us to get invalid function calls.

@swimmadude66
Copy link
Contributor Author

@justjake please let me know if there's any more info you need around this, or if there's any other work needed. I know we have an odd usecase that can even register > 32K functions on a context, but its a real blocker for us.

Copy link
Owner

@justjake justjake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need 2^24 functions registered at once, or can you make do with 2^16 functions if we can re-use function IDs after they’re dereferenced?

i would like to avoid modifying the QuickJS sources as much as possible. If the real fix here is re-using function IDs, I would prefer to do that than merge the current approach.

As is, it seems like a lot of the issue could be solved by starting the function ID at the minimum int16_t, or by casting from signed to unsigned in our C interface code without modifying the interpreter.

ts/context.ts Outdated Show resolved Hide resolved
ts/context.ts Outdated Show resolved Hide resolved
ts/generated/ffi.WASM_DEBUG_ASYNCIFY.ts Outdated Show resolved Hide resolved
ts/generated/ffi.WASM_RELEASE_ASYNCIFY.ts Outdated Show resolved Hide resolved
@swimmadude66
Copy link
Contributor Author

swimmadude66 commented Dec 24, 2022

Do you need 2^24 functions registered at once, or can you make do with 2^16 functions if we can re-use function IDs after they’re dereferenced?

i would like to avoid modifying the QuickJS sources as much as possible. If the real fix here is re-using function IDs, I would prefer to do that than merge the current approach.

As is, it seems like a lot of the issue could be solved by starting the function ID at the minimum int16_t, or by casting from signed to unsigned in our C interface code without modifying the interpreter.

If function Ids could be re-used, I think that addresses most of our needs. 2^16 would be an improvement over the current situation, but more is definitely better. I agree that the minimum solution is treating magic as an unsigned int (16, 32, w/e) because the current issue is that newFunction registers as unsigned, but callFunction calls in as a signed int, so the ids do not match

I've removed the modifications to quickJS, and instead only alter the type in c/interface.c. This still seems to work, and has a much smaller footprint

@justjake
Copy link
Owner

It’s possible to define a custom class from C code, and define a “finalizer” that’s called when an instance of the class is freed by the GC. I suggest investigating if you can define a custom class that behaves identically to a Function. Then in the finalizar, you can do some callback to “free” the function ID.

@swimmadude66
Copy link
Contributor Author

It’s possible to define a custom class from C code, and define a “finalizer” that’s called when an instance of the class is freed by the GC. I suggest investigating if you can define a custom class that behaves identically to a Function. Then in the finalizar, you can do some callback to “free” the function ID.

We actually are using FinalizationRegistry to do something similar, but it seems as though GC is not reliably firing. Even still, it appears functionIDs are purely incremented, so we'd still need some way to assign the next unused function ID. I think increasing the bit space for functions from approx 2^15 to 2^32 combined with some optimizations on our side will have us sorted for quite a bit

@swimmadude66
Copy link
Contributor Author

Commenting to add that I was apparently incorrect before. Upon removing the changes from quickJS, we are still getting the conversion to negative numbers. I'm not sure exactly where that casting needs to be done, but its apparently missing here

@justjake
Copy link
Owner

It may be most simple to start with an initial ID of the minimum int16. Then no casting necessary?

@swimmadude66
Copy link
Contributor Author

It may be most simple to start with an initial ID of the minimum int16. Then no casting necessary?

That feels somewhat cloogier than simply using unsigned integers, no?

@justjake
Copy link
Owner

justjake commented Dec 24, 2022

With regard to re-allocating function IDs, how do you know when a function is no longer needed by the guest? I was suggesting using finalizers inside the guest. Are you using FinalizationRegistry on the host, or inside the guest? In the case of the guest, there may be an API to explicitly run a GC cycle. In any case, you could push the no-longer-in-use function ID onto a “free-list”, and have the “get new function ID” operation on the host prefer to pop from the free-list instead of bumping the counter.

@swimmadude66
Copy link
Contributor Author

With regard to re-allocating function IDs, how do you know when a function is no longer needed by the guest? I was suggesting using finalizers inside the guest. Are you using FinalizationRegistry on the host, or inside the guest? In the case of the guest, there may be an API to explicitly run a GC cycle. In any case, you could push the no-longer-in-use function ID onto a “free-list”, and have the “get new function ID” operation prefer to pop from the free-list instead of bumping the counter.

we are using FinalizationRegistry in the JS-land, and registering proxies to quickjs handles, using the same logic as the Scope.

I will look into the reused function ids as a fix for the future, but I think that is substantially more complex (determining when is a function ready to be freed) than handling this negative rollover at the moment. I'm looking now to compare the effort of casting vs starting functionID at minimum signed int16

@justjake
Copy link
Owner

It may be most simple to start with an initial ID of the minimum int16. Then no casting necessary?

That feels somewhat cloogier than simply using unsigned integers, no?

Diverging from upstream is a bigger cludge i think.

@swimmadude66
Copy link
Contributor Author

It may be most simple to start with an initial ID of the minimum int16. Then no casting necessary?

That feels somewhat cloogier than simply using unsigned integers, no?

Diverging from upstream is a bigger cludge i think.

Fair enough. I'll make the min starting point change here, and look into a PR upstream to re-type magic to an unsigned int

@swimmadude66
Copy link
Contributor Author

I've simplified the changeset and opened an issue with quickjs. Please let me know if you'd like me to make any other changes before this can land. Definitely looking forward to using the full address space for functions

Copy link
Owner

@justjake justjake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@justjake justjake enabled auto-merge (squash) January 3, 2023 04:53
@justjake
Copy link
Owner

justjake commented Jan 3, 2023

By the way, to work around the issue entirely, you could use a mapping table in host user space backed by a UUID. Instead of binding 2^N functions via newFunction, just make one called dispatch, and then call dispatch.bind(uuid) inside the VM. On the outside of the VM, put your host function handler into the Map<UUID, Function>.

@justjake
Copy link
Owner

justjake commented Jan 3, 2023

@swimmadude66 it looks like the tests failed. Can you get them to pass?

@justjake justjake disabled auto-merge January 3, 2023 05:13
@swimmadude66
Copy link
Contributor Author

@swimmadude66 it looks like the tests failed. Can you get them to pass?

I've pushed a "fix" which skips the massive max funcID test in debug mode, as it was timing out (and takes ~5min locally). I can re-add it with an absurd timeout if you'd like, but I think its probably ok to skip this one in debug

@justjake
Copy link
Owner

justjake commented Jan 3, 2023

Sounds good

@justjake justjake enabled auto-merge (squash) January 3, 2023 17:33
auto-merge was automatically disabled January 3, 2023 18:09

Head branch was pushed to by a user without write access

@swimmadude66
Copy link
Contributor Author

@justjake sorry, I missed a place for my fix. re-pushed, but the tests don't seem to re-run automatically

@justjake justjake enabled auto-merge (squash) January 3, 2023 19:18
auto-merge was automatically disabled January 3, 2023 20:50

Head branch was pushed to by a user without write access

@justjake justjake enabled auto-merge (squash) January 3, 2023 21:06
@justjake justjake merged commit f9785f3 into justjake:main Jan 3, 2023
menduz pushed a commit to decentraland/quickjs-emscripten that referenced this pull request May 3, 2023
* 32767 functions good, 32768 functions BAD

* change `magic` to uin16_t (avoids signed intereger overflow)

* type magic as uint32_t, add simple test

* re-enable all tests

* remove missed test code

* address PR issues

* switch to a map of maps for fnMap

* update fnId to start at min value

* skip max funcID tests for debug mode

* missed a flag

* run prettier
menduz added a commit to decentraland/quickjs-emscripten that referenced this pull request May 4, 2023
* 0.23.0

* update changelog for 0.21.1

* Bump qs from 6.5.2 to 6.5.3 (justjake#89)

Bumps [qs](https://github.com/ljharb/qs) from 6.5.2 to 6.5.3.
- [Release notes](https://github.com/ljharb/qs/releases)
- [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md)
- [Commits](ljharb/qs@v6.5.2...v6.5.3)

---
updated-dependencies:
- dependency-name: qs
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump decode-uri-component from 0.2.0 to 0.2.2 (justjake#83)

Bumps [decode-uri-component](https://github.com/SamVerschueren/decode-uri-component) from 0.2.0 to 0.2.2.
- [Release notes](https://github.com/SamVerschueren/decode-uri-component/releases)
- [Commits](SamVerschueren/decode-uri-component@v0.2.0...v0.2.2)

---
updated-dependencies:
- dependency-name: decode-uri-component
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix for func_id rollover (justjake#94)

* 32767 functions good, 32768 functions BAD

* change `magic` to uin16_t (avoids signed intereger overflow)

* type magic as uint32_t, add simple test

* re-enable all tests

* remove missed test code

* address PR issues

* switch to a map of maps for fnMap

* update fnId to start at min value

* skip max funcID tests for debug mode

* missed a flag

* run prettier

* Bump http-cache-semantics from 4.1.0 to 4.1.1 (justjake#97)

Bumps [http-cache-semantics](https://github.com/kornelski/http-cache-semantics) from 4.1.0 to 4.1.1.
- [Release notes](https://github.com/kornelski/http-cache-semantics/releases)
- [Commits](kornelski/http-cache-semantics@v4.1.0...v4.1.1)

---
updated-dependencies:
- dependency-name: http-cache-semantics
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* 0.21.2

* scripts/emcc.sh: cache lto .a files in build/emsdk-cache

* unknown change to .map file

* Emit async imports for variants (justjake#100)

* upgrade typescript

* attempt to use nodenext tsc

* revert enabling noUnused{Locals,Parameters}

* fix imports to use real extensions

* update mocha for ESM

* awkward async import issue

* npm run build

* examples/website: update for newer NPM, typescript

* fix behavior for webpack build

* bump emsdk to 3.1.31

* fix function signature

* cache another item

* use EMSDK 3.1.32 native, 3.1.31 in Docker

* add some extra default timeout

* Fix breakage caused by upgrade

* rebuild

* More precise/strict compilation (drop support for node<16)

* rebuild

* add note about MINIMAL_RUNTIME

* tested website

* ?

* pretty

* BigInt (-DCONFIG_BIGNUM) support (justjake#104)

* enable CONFIG_BIGNUM

* recompile for bignum

* bigint basics

* vm.dump for bigint

* fix bigint call

* update changelog

* dump

* Extended Symbol support (justjake#105)

* Symbol utilities

* rebuild

* update CHANGLOG.md

* rebuild docs

* update changelog

* 0.22.0

* Increase ASYNCIFY_STACK_SIZE (justjake#114)

* Increase ASYNCIFY_STACK_SIZE

* Add dynamic asyncify stack size

* update docs & changelog

* rebuild docs

* 0.23.0

* Makefile: use emscripten/emsdk:3.1.35 from docker

* rebuild

* update smoketest

* feat: BigNum (#3)

* Add vim swapfiles to gitignore

* Use local emcc binary

* Build for emscripten web target

* Enable QuickJS bignum extensions

* Update generated files

* Update README.md

* test

* build

* build

* new package name

Co-authored-by: Ben Sidhom <bsidhom@gmail.com>
Co-authored-by: menduz <github@menduz.com>

* feat: add opcode instructions counter (#1)

* add opcode counter

* remove static

* update generated

* fix

* fix prettier

* remove prepack

* feat: move opcode counters to uint64 (#4)

* feat: move the counter to 64bit

* fix test

* increase mocha timeout

* fix prettier

* rebuild

* fix lock

* fix package.json

* prettier

* ignore emsdk-cache in prettier

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Jake Teton-Landis <just.1.jake@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Adam Yost <swimmadude66@users.noreply.github.com>
Co-authored-by: yar2001 <yar2001@163.com>
Co-authored-by: Lean Mendoza <leandro@decentraland.org>
Co-authored-by: Ben Sidhom <bsidhom@gmail.com>
menduz pushed a commit to decentraland/quickjs-emscripten that referenced this pull request May 4, 2023
* 32767 functions good, 32768 functions BAD

* change `magic` to uin16_t (avoids signed intereger overflow)

* type magic as uint32_t, add simple test

* re-enable all tests

* remove missed test code

* address PR issues

* switch to a map of maps for fnMap

* update fnId to start at min value

* skip max funcID tests for debug mode

* missed a flag

* run prettier
menduz pushed a commit to decentraland/quickjs-emscripten that referenced this pull request May 4, 2023
* 32767 functions good, 32768 functions BAD

* change `magic` to uin16_t (avoids signed intereger overflow)

* type magic as uint32_t, add simple test

* re-enable all tests

* remove missed test code

* address PR issues

* switch to a map of maps for fnMap

* update fnId to start at min value

* skip max funcID tests for debug mode

* missed a flag

* run prettier
@swimmadude66 swimmadude66 deleted the max-func-id-rollover branch March 29, 2024 01:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants