Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory sharing #2415

Merged
merged 4 commits into from
Aug 4, 2023
Merged

Fix memory sharing #2415

merged 4 commits into from
Aug 4, 2023

Conversation

yamt
Copy link
Collaborator

@yamt yamt commented Aug 1, 2023

  • inherit shared memory from the parent instance, instead of
    trying to look it up by the underlying module. the old method
    works correctly only when every clusters use different modules.

  • reference count WASMMemoryInstance/AOTMemoryInstance directly

  • retire WASMSharedMemNode

  • for atomic instructions in interpreters, use a global lock for now.

  • update users (wasi-threads, lib-pthread, wasm_runtime_spawn_thread)

fixes #1962

todo:

  • aot
  • classic interpreter
  • lib-pthread
  • test wasm_runtime_spawn_thread
  • test lib-pthread

- inherit shared memory from the parent instance, instead of
  trying to look it up by the underlying module. the old method
  works correctly only when every clusters uses different modules.

- reference count WASMMemoryInstance/AOTMemoryInstance directly

- retire WASMSharedMemNode

- for atomic opcode implementations in the interpreters, use
  a global lock for now.

- update the internal API users.
  (wasi-threads, lib-pthread, wasm_runtime_spawn_thread)

Fixes bytecodealliance#1962
@yamt yamt marked this pull request as ready for review August 2, 2023 04:27
Comment on lines +2314 to +2315
if (module_inst->memory_count > 0)
shared_memory_lock(module_inst->memories[0]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wenyongh Do you remember why we need to lock the shared memory when setting exception?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see #2407

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I remember that one instance's exception may be accessed by other threads when spreading non "wasi proc exit" exception. It was introduced in this PR #1995

In thread_manager.c, function set_exception_visitor:

        /* Only spread non "wasi proc exit" exception */
#if WASM_ENABLE_SHARED_MEMORY != 0
        WASMSharedMemNode *shared_mem_node = wasm_module_get_shared_memory(
            (WASMModuleCommon *)curr_wasm_inst->module);
        if (shared_mem_node)
            os_mutex_lock(&shared_mem_node->shared_mem_lock);
#endif
        if (!strstr(wasm_inst->cur_exception, "wasi proc exit")) {
            bh_memcpy_s(curr_wasm_inst->cur_exception,
                        sizeof(curr_wasm_inst->cur_exception),
                        wasm_inst->cur_exception,
                        sizeof(wasm_inst->cur_exception));
        }
#if WASM_ENABLE_SHARED_MEMORY != 0
        if (shared_mem_node)
            os_mutex_unlock(&shared_mem_node->shared_mem_lock);
#endif

Copy link
Contributor

@eloparco eloparco Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I remember that one instance's exception may be accessed by other threads when spreading non "wasi proc exit" exception. It was introduced in this PR #1995

Yes, that's the reason. When a thread runs into an exception it spreads it to other threads by using set_exception_visitor and that other threads could try to read it at the same time, causing concurrency issues.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot!

Comment on lines +2314 to +2315
if (module_inst->memory_count > 0)
shared_memory_lock(module_inst->memories[0]);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wenyongh Do you remember why we need to lock the shared memory when setting exception?

core/iwasm/aot/aot_runtime.c Show resolved Hide resolved
core/iwasm/aot/aot_runtime.c Outdated Show resolved Hide resolved
@@ -710,28 +710,28 @@ trunc_f64_to_int(WASMModuleInstance *module, uint32 *frame_sp, float64 src_min,
CHECK_BULK_MEMORY_OVERFLOW(addr + offset, 1, maddr); \
CHECK_ATOMIC_MEMORY_ACCESS(); \
\
os_mutex_lock(&node->shared_mem_lock); \
shared_memory_lock(memory); \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An concern is about the performance, calling shared_memory_lock/unlock adds two function calls, it may impact the performance. How about getting the lock pointer, and here os_mutex_{lock|unlock}(shared_mem_lock)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it matter? if we really care the performance here, i guess we should use atomic ops as we do for aot.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not very sure, the performance may be not so important here, I just see an opportunity to refine the code😊. Using atomic operations may be also a choice in the future.
@loganek, any idea?

core/iwasm/interpreter/wasm_runtime.c Show resolved Hide resolved
core/iwasm/interpreter/wasm_runtime.h Outdated Show resolved Hide resolved
Copy link
Collaborator

@xujuntwt95329 xujuntwt95329 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@wenyongh wenyongh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wenyongh wenyongh merged commit 9159242 into bytecodealliance:main Aug 4, 2023
368 checks passed
victoryang00 pushed a commit to victoryang00/wamr-aot-gc-checkpoint-restore that referenced this pull request May 27, 2024
- Inherit shared memory from the parent instance, instead of
  trying to look it up by the underlying module. The old method
  works correctly only when every cluster uses different module.
- Use reference count in WASMMemoryInstance/AOTMemoryInstance
  to mark whether the memory is shared or not
- Retire WASMSharedMemNode
- For atomic opcode implementations in the interpreters, use
  a global lock for now
- Update the internal API users
  (wasi-threads, lib-pthread, wasm_runtime_spawn_thread)

Fixes bytecodealliance#1962
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Shared memory should only be shared inside cluster
4 participants