Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition during module compilation engine #2190

Closed
martinlakov opened this issue Apr 21, 2024 · 7 comments · Fixed by #2194
Closed

Race condition during module compilation engine #2190

martinlakov opened this issue Apr 21, 2024 · 7 comments · Fixed by #2194
Labels
bug Something isn't working

Comments

@martinlakov
Copy link

martinlakov commented Apr 21, 2024

Describe the bug

In the wazevo engine implementation there're multiple places where Go maps are used (not sync.Map). When trying to run modules in parallel (different runtime instances, shared cache) - race conditions occur and execution is aborted with fatal error: concurrent map read and map write

To Reproduce

Try and run the same command module in a for loop, creating a new runtime for each iteration. Delegate load & execution to a new Go routine for each execution attempt.

Expected behavior

No race condition / panics shall occur when running modules in parallel with a common/shared cache.

Screenshots

N/A

Environment (please complete the relevant information):

  • Go version: 1.22.1
  • wazero Version: v1.7.1
  • Host architecture: amd64
  • Runtime mode: compiler

Additional context

N/A

@martinlakov martinlakov added the bug Something isn't working label Apr 21, 2024
@mathetake
Copy link
Member

please provide the reproducer, otherwise we can't verify

@mathetake mathetake changed the title Race condition during module compilation with wazevo engine Race condition during module compilation engine Apr 21, 2024
@mathetake
Copy link
Member

closing until you give us the minimal repro. Note that executing a module instance concurrently is inherently unsafe and not possible unless you use the threading proposal enabled + your binary is compiled in that way.

@martinlakov
Copy link
Author

As described in this comment - a compilation engine is getting re-used/shared between runtimes.

The wazevo engine comes with an internal map named compiledModules. When creating runtimes & compiling modules happens concurrently - a race condition occurs with accessing the said map. This happens in here

@martinlakov
Copy link
Author

I'd be glad to provide example(s), propose a fix etc. Just want to see this one getting resolved 😊

@martinlakov
Copy link
Author

A bit of background to the story - issue has been observed while working on a high-level library that steps on top of wazero and takes care to hide away complexity associated with working with the linear memory etc. This one - we plan to open-source as soon we have the APIs somewhat final.

@martinlakov
Copy link
Author

martinlakov commented Apr 22, 2024

@mathetake - update of my previous comment (was on my phone, on my way to work)

As described in this comment - a compilation engine is getting re-used/shared between runtimes.

The wazevo engine comes with an internal map named compiledModules. When creating runtimes & compiling modules happens concurrently - a race condition occurs with accessing the said map. This happens like so:

As the "callstack" suggests - there's no logic to govern access to the said map and the said map is not thread-safe. As a result - a race condition occurs as soon as you try and use a single cache instance among multiple runtime instances and you try to load/execute modules concurrently.

It is evident that we're talking about a bug in here. If you take a look at the implementation of CompiledModuleCount for wazevo (see: here) - read-only access to the said map is governed appropriately.

@martinlakov
Copy link
Author

A potential fix for the problem is to create a "private" function to check if a module has been compiled already. Something like:

func (e *engine) isCompiled(m *wasm.Module) bool {
	e.mux.RLock()
	defer e.mux.RUnlock()

	_, ok := e.compiledModules[m.ID]
	return ok
}

and replace the direct access to the map in here with calling the said function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants