Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebIDL-wrapped C++ objects are never garbage collected #19438

Open
skelly-energid opened this issue May 24, 2023 · 4 comments
Open

WebIDL-wrapped C++ objects are never garbage collected #19438

skelly-energid opened this issue May 24, 2023 · 4 comments

Comments

@skelly-energid
Copy link

skelly-energid commented May 24, 2023

Please include the following in your bug report:

Version of emscripten/emsdk:
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.39 (36f8718)
clang version 17.0.0 (https://github.com/llvm/llvm-project c672c3fe05adbb590abc99da39143b55ad510538)
Target: wasm32-unknown-emscripten
Thread model: posix

Repro

This shell script generates bindings using WebIDL, starts a simple http server and creates a webpage to open:

cat > index.html <<EOF
<!DOCTYPE html>
<html lang="en">

<head>
    <title>Testcase</title>
    <meta charset="utf-8">
</head>

<body>

    <input type="radio" id="doDestroy" name="destruction">
    <label for="doDestroy">Destroy</label><br>
    <input type="radio" id="doNotDestroy" name="destruction" checked="checked">
    <label for="doNotDestroy">Do not Destroy</label><br>

    <input type="radio" id="doRun" name="state" checked="checked">
    <label for="doRun">Run</label><br>
    <input type="radio" id="doStop" name="state">
    <label for="doStop">Stop</label><br>

    <p id="reclaimbutton">Reclaim</p>
    <script type="text/javascript" src="./myBindings.js"></script>
    <script>
        let MyModule;
        let allocateTimerId;

        let interval = 1000;

        function createVector() {
            return new MyModule.MyVector(1, 1, 1);
        }

        function createVectors() {
            for (let ii = 0; ii < 10; ++ii)
            {
                let mat = createVector();
                if (document.getElementById('doDestroy').checked)
                    MyModule.destroy(mat);
            }
        }

        function initModule(isStopped)
        {
            if (allocateTimerId !== undefined) {
                clearInterval(allocateTimerId);
                allocateTimerId = undefined;
            }
            delete MyModule;
            MyModule = null;

            Module().then(instance => {
                MyModule = instance;

                if (!isStopped)
                {
                    allocateTimerId = setInterval(() => {
                        createVectors();
                    }, interval);
                }
            });
        }
        initModule(false);

        let handleStateToggle = event => {
            if (document.getElementById('doRun').checked)
            {
                allocateTimerId = setInterval(() => {
                    createVectors();
                }, interval);
            } else {
                clearInterval(allocateTimerId);
                allocateTimerId = undefined;
            }
        };

        document.getElementById('doRun').addEventListener('change', handleStateToggle);
        document.getElementById('doStop').addEventListener('change', handleStateToggle);

        document.getElementById('reclaimbutton').addEventListener('click', event => {
            initModule(allocateTimerId === undefined);
        });

    </script>
</body>

</html>
EOF


cat > myBindings.idl <<EOF
interface MyVector {
   void MyVector();
   void MyVector(double x, double y, double z);
   [Const] double y();
   [Const] double z();
   [Const] double x();
};
EOF


cat > myBindings.cpp <<EOF
class MyVector
{
public:
   MyVector() {}
   MyVector(double x, double y, double z) {}
   double x() {
    return 4.2;
   }
   double y() {
    return 4.2;
   }
   double z() {
    return 4.2;
   }

private:
    char someBulk[1024 * 32];
};

#include "mybindings.cpp"

int main(int, char**)
{
    return 0;
}
EOF

GENERATOR=${EMSDK}/upstream/emscripten/tools/webidl_binder.py
# GENERATOR=${PWD}/tools/webidl_binder.py

python3 $GENERATOR myBindings.idl mybindings

em++ myBindings.cpp \
 --memoryprofiler \
 -sINITIAL_MEMORY=4MB \
 -sMODULARIZE=1 \
  -s EXPORTED_FUNCTIONS="['_malloc', '_free']" \
 --post-js $PWD/mybindings.js \
 -o myBindings.js

python3 -m http.server

Open http://localhost:8000/?trackbytes=1000&trackcount=100 in the browser.

After a few seconds of operation, the tab crashes in Chrome and FF due to OOM.

From this documentation:

https://emscripten.org/docs/porting/connecting_cpp_and_javascript/WebIDL-Binder.html#using-c-classes-in-javascript

JavaScript will automatically garbage collect any of the wrapped C++ objects when there are no more references. If the C++ object doesn’t require specific clean up (i.e. it doesn’t have a destructor) then no other action needs to be taken.

This test case shows that that is not the case. The wrapped objects are never garbage collected. The wrapped type does not have a destructor which needs to be invoked.

The documentation also notes:

You will usually need to destroy the objects which you create, but this depends on the library being ported.

This is confusing and contradictory.

If the radio button is selected to destroy objects after creating them, then destroy is called and memory stops growing. This is not a reasonable workaround as the code becomes unmaintainable very quickly.

If this is not fixed, at least the documentation should be updated to indicate that all wrapped objects must be explicitly destroyed.

@skelly-energid
Copy link
Author

skelly-energid commented May 24, 2023

The "Reclaim" text when clicked should delete the Module and re-create it. This messes up the memoryprofiler which then flickers between the old modules, which I'm guessing it is keeping alive.

I also tried this reclaim without the memoryprofiler option in use. Using the heap snapshot feature of the memory tab of chrome devtools, it is clear that the MyVector instances remain, even after deleting the Module. This is easier to see if the size of MyVector is reduced and more of them are created each time createVectors is invoked.

I notice that the generated mybindings.js file has a getCache thing. I tried removing that and still got the same behavior when using the memoryprofiler. When not using the memoryprofiler, the chrome devtools seems to show that the MyVector objects are not surviving. However, after running the test case for a while, it does still crash with OOM.

@skelly-energid
Copy link
Author

skelly-energid commented May 24, 2023

Some links:

Possibly EmBind has better handling of this than WebIDL?

Also - https://discourse.threejs.org/t/ammo-js-with-three-js/12530/82

It is looking like this is a known issue and the documentation for WebIDL is just very wrong.

It is also very inconvenient to have to explicitly destroy objects. This is not maintainable as the function inevitably grown new conditions and exit conditions:

function (something)
{
   let thing = new CppStuff.Thing();
   // ...
   if (cond1) {
      // ...
      CppStuff.destroy(thing);
      return;
   }
   if (cond2 || cond3) {
      // ... 
      // don't return today. Maybe someone adds a return tomorrow.
      // hope they remember to add the destroy!
   }
   if (cond2 && cond3) {
      // ...
      CppStuff.destroy(thing);
      return;
   }
  CppStuff.destroy(thing);
}

@skelly-energid
Copy link
Author

skelly-energid commented May 24, 2023

Given how long this has been an issue for, I'm guessing it is not trivial to resolve.

I'm experimenting with ways to work around this. Most of my types that are created frequently are value types, so something like

class ValueSingletons
{
public:
    ValueSingletons() {}

    MyVector updateStaticVectorSingleton(double x, double y, double z)
    {
        return MyVector(x, y, z);
    }
    char data = 0;
};

can be used like

function updateStaticVectorSingleton(ii) {
    return theValueSingletons.updateStaticVectorSingleton(ii, 1, 1); 
}
// ... 
let mat = updateStaticVectorSingleton(ii);
console.log(mat.x());

but no matter how I name things, someone will do this and not understand why it fails:

let mat1 = updateStaticVectorSingleton(ii);
let mat2 = updateStaticVectorSingleton(ii + 1);

console.log(mat1.x(), mat2.x());

Also, It is normal to need to have two MyVectors alive at once, so the singleton is a non-runner.

So, this is also not a good solution.

Another idea is to create a utility to scope the use of an instance, which at least means we can use two instances at once:


        function destroyAfterUse(instanceData, callback)
        {
            let Module = instanceData.shift();
            let Type = instanceData.shift();
            let instance = new Module[Type](...instanceData);
            callback(instance);
            Module.destroy(instance);
        }

        function createVectors() {
            for (let ii = 0; ii < 1000; ++ii)
            {
                destroyAfterUse([MyModule, "MyVector", ii, 1, 0], (cppVec)=> {
                    console.log(cppVec.x());
                });
                destroyAfterUse([MyModule, "MyVector", ii, 1, 0], (cppVec1)=> {
                    destroyAfterUse([MyModule, "MyVector", ii + 1, 1, 0], (cppVec2)=> {
                        console.log(cppVec1.x(), cppVec2.x());
                    });
                });
            }
        }

It could also potentially be improved by implementing it with a freelist etc. The big downsides are

  • not being able to have any APIs which return cpp-wrapped types
  • the constructors are necessarily invokable so users could just bypass the destroyAfterUse.

This all seems to be a case of "Choose your bad". What do others do about this? @juj @kripken this is stuff you hit in ammo/unity surely?

@kripken
Copy link
Member

kripken commented May 24, 2023

I think those docs are confusing, yeah. A PR to improve them would be welcome. What I think is the truth is that the JS wrapper objects are automatically GC'd by the JS VM, like any JS object. But the allocations in the C++ heap must be manually freed.

It is possible to improve this using JS finalizers (which would destroy the C++ object automatically), but I'm not sure how much work that would be. Embind has some support for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants