emscripten_dlopen fails on Safari #21571

Open
Honya2000 opened this issue Mar 20, 2024 · 12 comments

Comments

@Honya2000

Honya2000 commented Mar 20, 2024

Hello,

I recently implemented asynchronous dynamic module support for my WASM project.
Loading is quite simple:

  1. I download the module using the fetch API.
  2. I store the downloaded binary in a file on the virtual Emscripten FS.
  3. I asynchronously load and instantiate the wasm .so file using the emscripten_dlopen function.

Because loading is asynchronous, there are usually several modules in flight at once, depending on the assets.
The system works fine in Chrome, but doesn't work in Safari (both macOS and iOS).
Safari reports unresolved exported symbols in side modules. The problem is that the symbols aren't related to those modules at all; they are located in other side modules.

For example, it throws this error for the DdsImporter.so module:
bad export type for '_ZTVN4Tmrw4Data16GltfImporterE': undefined.
This symbol is just the virtual function table of the GltfImporter class, which is not part of DdsImporter at all, and DdsImporter does not use it.
The GltfImporter class is instantiated in another side module.

Safari mixes up the modules randomly while resolving symbols, throwing different resolution errors for different modules each time. There is no determinism.

Here is an example URL to check the issue:
http://honya.myftp.org:88/wrooms/index.html?native_debug=true

@sbc100
Collaborator

sbc100 commented Mar 20, 2024

Can you try loading the modules just one at a time to see if that makes the problem go away? i.e. can you wait for the result of one emscripten_dlopen before trying another?

Also, can you share that full set of link flags that you are using?
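
As a debugging aid, the "one at a time" experiment could be sketched roughly as below. This is only an illustration, not code from the project: `SerialLoader`, `AsyncOpen` and `Done` are hypothetical names, and `AsyncOpen` stands in for whatever starts the real load (for example a wrapper around emscripten_dlopen that reports success or failure through `done`).

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper: runs at most one asynchronous load at a time.
// `AsyncOpen` is an assumption; it starts a load and must later call
// `done(ok)` exactly once (e.g. from the dlopen success/error callbacks).
class SerialLoader {
public:
    using Done = std::function<void(bool ok)>;
    using AsyncOpen = std::function<void(const std::string& path, Done done)>;

    explicit SerialLoader(AsyncOpen open) : open_(std::move(open)) {}

    // Queue a module; it starts immediately only if nothing is in flight.
    void load(std::string path, Done onResult) {
        pending_.push_back({std::move(path), std::move(onResult)});
        startNext();
    }

    bool busy() const { return busy_; }

private:
    struct Request {
        std::string path;
        Done onResult;
    };

    void startNext() {
        if (busy_ || pending_.empty()) return;
        busy_ = true;
        Request req = std::move(pending_.front());
        pending_.pop_front();
        // The next request is started only from the completion callback,
        // so no loop or sleep is needed to wait for the previous one.
        open_(req.path, [this, cb = std::move(req.onResult)](bool ok) {
            busy_ = false;
            if (cb) cb(ok);
            startNext();
        });
    }

    AsyncOpen open_;
    std::deque<Request> pending_;
    bool busy_ = false;
};
```

Chaining the next load from the completion callback avoids any blocking wait inside a callback; whether serializing the loads changes the behaviour in Safari is exactly what the experiment is meant to find out.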

@Honya2000
Author

I'm currently using the RTLD_NOW flag only. But I think I tried all the possible combinations of flags.

Tomorrow I will try to load the modules one by one. But even if this fixes the Safari issue, I doubt I will accept it as a workaround...

Btw, it seems like the same issue happens in Firefox on Windows as well.

@sbc100
Collaborator

sbc100 commented Mar 20, 2024

> I'm currently using the RTLD_NOW flag only. But I think I tried all the possible combinations of flags.
>
> Tomorrow I will try to load the modules one by one. But even if this fixes the Safari issue, I doubt I will accept it as a workaround...

Sorry, I didn't mean to suggest that as the final solution, just as an aid to debugging the issue.

i.e. is the root cause of the issue the async nature of the code loading and the interleaving of the async work?

> Btw, it seems like the same issue happens in Firefox on Windows as well.

@Honya2000
Author

How do I properly wait in the fetch API onsuccess callback?

I tried a blocking wait approach:

    while (g_nModulesInFly > 0)
    {
        emscripten_sleep(1);
    }
    g_nModulesInFly++;

but it looks like in this case it never exits the loop.

dlopen code looks like this:

    while (g_nModulesInFly > 0)
    {
        emscripten_sleep(1);
    }
    g_nModulesInFly++;

    emscripten_dlopen(context->filePath.c_str(), RTLD_NOW, context,
        [](void* ctx, void* handle)
        {
            DLContext* context = static_cast<DLContext*>(ctx);
            //const char* err = dlerror();
            context->callback(handle, PreparePluginResult::Ok);
            delete context;

            g_nModulesInFly--;
        },
        [](void* ctx)
        {
            DLContext* context = static_cast<DLContext*>(ctx);
            const char* err = dlerror();
            if (err)
            {
                Error{} << "dlopen error:" << err;
            }
            context->callback(nullptr, PreparePluginResult::DlopenError);
            delete context;

            g_nModulesInFly--;
        }
      );

@Honya2000
Author

Well, I commented out emscripten_sleep in the loop, and now it works...

But unfortunately this didn't fix the issue on Safari and Firefox.
I get the same random undefined symbols while loading the .so modules.

@Honya2000
Author

Honya2000 commented Mar 21, 2024

Probably I have to execute dlopen from the main thread?
Currently it is executed from the fetch callback, which is not the main thread?

@Honya2000
Author

Ok, I moved emscripten_dlopen to the main thread and now the issue is solved in Safari (both desktop and mobile) and Firefox!
So dlopen cannot be executed from the fetch callback.
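
How the call was actually moved is not shown here, so the following is only a minimal sketch of one way to defer work out of a callback, assuming a single-threaded build with a per-frame main loop. `MainLoopQueue`, `post` and `drain` are hypothetical names, not Emscripten APIs.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: callbacks only record work; the main loop runs it.
// Assumes a single-threaded build, so no locking is done here.
class MainLoopQueue {
public:
    using Task = std::function<void()>;

    // Called from a callback (e.g. fetch onsuccess): just store the task.
    void post(Task task) { tasks_.push_back(std::move(task)); }

    // Called once per frame from the main loop. Returns how many tasks ran.
    // Tasks posted while draining are kept for the next call.
    std::size_t drain() {
        std::vector<Task> batch;
        batch.swap(tasks_);
        for (Task& task : batch) task();
        return batch.size();
    }

private:
    std::vector<Task> tasks_;
};
```

With something like this, the fetch onsuccess callback would write the file and then `post` a task that calls emscripten_dlopen, while the application's main loop function calls `drain()` each frame.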

@sbc100
Collaborator

sbc100 commented Mar 21, 2024

Ah, I should have asked initially if you were using threads.

Can you share the full set of emcc link flags you are using?

Can you describe in more detail the sequence you follow in the broken case? e.g. how are you using the fetch API, the thread API and the dlopen API to trigger the issue?

@Honya2000
Author

Honya2000 commented Mar 21, 2024

No, threads are disabled in this specific emscripten build.
Threads are used by the browser when it calls promise callbacks.

@sbc100
Collaborator

sbc100 commented Mar 21, 2024

> Threads are used by the browser when it calls promise callbacks.

Can you explain what you mean by this? My understanding is that browsers are single-threaded and that all callbacks happen on the same thread.

@Honya2000
Author

Honya2000 commented Mar 21, 2024

Above I mentioned the loop, which I executed in the fetch onsuccess callback:

        emscripten_fetch_attr_t attr;
        emscripten_fetch_attr_init(&attr);
        strcpy(attr.requestMethod, "GET");
        attr.userData = context;
        attr.attributes = EMSCRIPTEN_FETCH_LOAD_TO_MEMORY;
        attr.onsuccess = [](emscripten_fetch_t* fetch)
        {
            CompDLContext* context = static_cast<CompDLContext*>(fetch->userData);

            //Debug{} << "Downloaded plugin:" << context->pluginPath.c_str() << ", size:" << fetch->numBytes;
            //Debug{} << "Saving plugin to virtual FS:" << context->filePath.c_str();
            FILE* f = fopen(context->filePath.c_str(), "wb");
            if (!f)
            {
                Error{} << "Failed to store plugin file to virtual file-system:" << context->filePath.c_str();

                context->callback(nullptr, CompPreparePluginResult::SaveError, fetch->numBytes, nullptr);
                delete context;
                return;
            }
            fwrite(fetch->data, fetch->numBytes, 1, f);
            fclose(f);

            emscripten_fetch_close(fetch);

            while (g_nModulesInFly > 0)
            {
                emscripten_sleep(1);
            }
            g_nModulesInFly++;

            emscripten_dlopen(context->filePath.c_str(), RTLD_NOW, context,
                [](void* ctx, void* handle)
                {
                    CompDLContext* context = static_cast<CompDLContext*>(ctx);
                    //const char* err = dlerror();
                    context->callback(handle, CompPreparePluginResult::Ok, context->size, nullptr);
                    delete context;

                    g_nModulesInFly--;
                },
                [](void* ctx)
                {
                    CompDLContext* context = static_cast<CompDLContext*>(ctx);
                    const char* err = dlerror();
                    context->callback(nullptr, CompPreparePluginResult::DlopenError, context->size, err);
                    delete context;

                    g_nModulesInFly--;
                });
        };

I mentioned that the loop never exits.
But the point is that it didn't block the app at all. The app loaded and kept running fine. That suggests the main thread wasn't blocked, and that the callback was executed in a separate thread.

@Honya2000
Author

Anyway, the issue was fully solved as soon as I moved emscripten_dlopen out of the callback and ran it in the main app thread.
