regression in thread-safety with data-race on mi->inInference field #53433
Labels: compiler:codegen (Generation of LLVM IR and native code), multithreading (Base.Threads and related functionality), regression (Regression in behavior compared to a previous version)
Comments
vtjnash changed the title from "This appears to have introduced a datarace on the mi->inInference field, which seems likely to lead to incorrect compilation and bad performance of threaded code. It was previously protected by the codegen lock." to "regression in thread-safety with data-race on mi->inInference field" on Feb 22, 2024
vtjnash added the regression, compiler:codegen, and multithreading labels on Feb 22, 2024
vtjnash added a commit that referenced this issue on Jun 15, 2024:
Continuing from previous PRs to make CodeInstance the primary means of tracking compilation, this introduces an "engine" which keeps track externally of whether a particular inference result is in progress and where. At present, this handles unexpected cycles by permitting both threads to work on it. This is likely to be optimal most of the time for now, until we have the ability to do work-stealing of the results.

Includes a fix for #53434, by ensuring SOURCE_MODE_ABI results in the item going into the global cache. Fixes #53433, as inInference is computed by the engine and protected by a lock, which also fixes #53680.
vtjnash added a commit that referenced this issue on Jun 15, 2024:
Continuing from previous PRs to make CodeInstance the primary means of tracking compilation, this introduces an "engine" which keeps track externally of whether a particular inference result is in progress and where. At present, this handles unexpected cycles by permitting both threads to work on it. This is likely to be optimal most of the time for now, until we have the ability to do work-stealing of the results.

To assist with that, CodeInstance is now primarily allocated by `jl_engine_reserve`, which also tracks that it is currently being inferred. This creates a sort of per-(MI, owner)-tuple lock mechanism, which can be used with the double-check pattern to see whether inference was completed while waiting on it. The `world` value is not included since that is inferred later, so there is a possibility that a thread waits only to discover that the result was already invalid before it could use it (though this should be unlikely). The process can then notify when it has finished and wants to release the reservation lock on that identity pair. When doing so, it may also provide source code, allowing a threadpool to begin compiling that result while the main thread continues with the job of inference.

Includes a fix for #53434, by ensuring SOURCE_MODE_ABI results in the item going into the global cache. Fixes #53433, as inInference is computed by the engine and protected by a lock, which also fixes #53680.
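As a rough illustration of the per-(MI, owner) reservation and double-check pattern described above, here is a minimal sketch in C. The names (`engine_reserve`, `engine_fulfill`, `engine_in_inference`), the linked-list reservation table, and the opaque-pointer keys are assumptions made for illustration; the actual engine also allocates and returns the CodeInstance and handles inference cycles, both of which this sketch omits.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reservation table keyed on (MI, owner); opaque pointers
 * stand in for the MethodInstance and cache-owner values. */
typedef struct reservation {
    void *mi;
    void *owner;
    struct reservation *next;
} reservation_t;

static pthread_mutex_t engine_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  engine_wait = PTHREAD_COND_INITIALIZER;
static reservation_t  *in_progress = NULL;

/* Must be called with engine_lock held. */
static bool reserved(void *mi, void *owner)
{
    for (reservation_t *r = in_progress; r; r = r->next)
        if (r->mi == mi && r->owner == owner)
            return true;
    return false;
}

/* Block until (mi, owner) is free, then claim it. After this returns,
 * the caller should re-check the global cache (the double-check step):
 * another thread may have finished inference while we waited. */
void engine_reserve(void *mi, void *owner)
{
    pthread_mutex_lock(&engine_lock);
    while (reserved(mi, owner))
        pthread_cond_wait(&engine_wait, &engine_lock);
    reservation_t *r = malloc(sizeof(reservation_t));
    r->mi = mi;
    r->owner = owner;
    r->next = in_progress;
    in_progress = r;
    pthread_mutex_unlock(&engine_lock);
}

/* Release the reservation on (mi, owner) and wake any waiters. */
void engine_fulfill(void *mi, void *owner)
{
    pthread_mutex_lock(&engine_lock);
    for (reservation_t **p = &in_progress; *p; p = &(*p)->next) {
        if ((*p)->mi == mi && (*p)->owner == owner) {
            reservation_t *r = *p;
            *p = r->next;
            free(r);
            break;
        }
    }
    pthread_cond_broadcast(&engine_wait);
    pthread_mutex_unlock(&engine_lock);
}

/* The inInference state is now computed from the engine under its lock,
 * rather than read from an unsynchronized field on the MethodInstance. */
bool engine_in_inference(void *mi, void *owner)
{
    pthread_mutex_lock(&engine_lock);
    bool r = reserved(mi, owner);
    pthread_mutex_unlock(&engine_lock);
    return r;
}
```

The intended usage follows the double-check pattern: check the cache, call the reserve function on a miss, re-check the cache once the reservation is granted, and only then run inference before fulfilling the reservation.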
The changes in #53219 appear to have introduced a data race on the mi->inInference field, which seems likely to lead to incorrect compilation and bad performance of threaded code. It was previously protected by the codegen lock.

Originally posted by @vtjnash in #53219 (comment)

Posting as an issue since the PR is already merged, and this is a regression in thread-safety.
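For context, here is a minimal sketch in C of the check-then-set pattern that becomes a data race once the codegen lock no longer serializes it. The struct and function names are hypothetical stand-ins for illustration, not Julia's actual sources:

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of a MethodInstance;
 * only the field at issue is shown. */
typedef struct {
    uint8_t inInference; /* plain (non-atomic) field */
} method_instance_t;

/* Racy pattern: two threads may both observe inInference == 0 and both
 * start inference, and the plain reads/writes may be freely reordered.
 * Before #53219, this check-then-set was serialized by the codegen lock. */
void maybe_infer_racy(method_instance_t *mi)
{
    if (!mi->inInference) {   /* unsynchronized read  */
        mi->inInference = 1;  /* unsynchronized write */
        /* ... run type inference ... */
        mi->inInference = 0;
    }
}

/* Lock-protected variant, analogous to what the codegen lock provided:
 * the check and the set are performed atomically together. */
static pthread_mutex_t inference_lock = PTHREAD_MUTEX_INITIALIZER;

void maybe_infer_locked(method_instance_t *mi)
{
    pthread_mutex_lock(&inference_lock);
    int claimed = !mi->inInference;
    if (claimed)
        mi->inInference = 1;
    pthread_mutex_unlock(&inference_lock);
    if (claimed) {
        /* ... run type inference ... */
        pthread_mutex_lock(&inference_lock);
        mi->inInference = 0;
        pthread_mutex_unlock(&inference_lock);
    }
}
```

Under the C memory model, concurrent unsynchronized access as in the racy variant is undefined behavior: both threads may claim the flag and duplicate inference work, or one thread's update may simply be lost.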