regression in thread-safety with data-race on mi->inInference field #53433
Labels: compiler:codegen (Generation of LLVM IR and native code), multithreading (Base.Threads and related functionality), regression (Regression in behavior compared to a previous version)
Comments
vtjnash changed the title from "This appears to have introduced a datarace on the mi->inInference field, which seems likely to lead to incorrect compilation and bad performance of threaded code. It was previously protected by the codegen lock." to "regression in thread-safety with data-race on mi->inInference field" on Feb 22, 2024
vtjnash added the regression, compiler:codegen, and multithreading labels on Feb 22, 2024
vtjnash added a commit that referenced this issue on Jun 15, 2024:
Continuing from previous PRs to make CodeInstance the primary means of tracking compilation, this introduces an "engine" which keeps track externally of whether a particular inference result is in progress and where. At present, this handles unexpected cycles by permitting both threads to work on it. This is likely to be optimal most of the time for now, until we have the ability to do work-stealing of the results.

Includes a fix for #53434, by ensuring SOURCE_MODE_ABI results in the item going into the global cache. Fixes #53433, as inInference is computed by the engine and protected by a lock, which also fixes #53680.
vtjnash added a commit that referenced this issue on Jun 15, 2024:
Continuing from previous PRs to make CodeInstance the primary means of tracking compilation, this introduces an "engine" which keeps track externally of whether a particular inference result is in progress and where. At present, this handles unexpected cycles by permitting both threads to work on it. This is likely to be optimal most of the time for now, until we have the ability to do work-stealing of the results.

To assist with that, CodeInstance is now primarily allocated by `jl_engine_reserve`, which also tracks that it is currently being inferred. This creates a sort of per-(MI, owner)-tuple lock mechanism, which can be used with the double-check pattern to see whether inference was completed while waiting on it. The `world` value is not included since that is inferred later, so there is a possibility that a thread waits only to discover that the result was already invalid before it could use it (though this should be unlikely). The process can then notify when it has finished and wants to release the reservation lock on that identity pair. When doing so, it may also provide source code, allowing a threadpool to begin compiling that result while the main thread continues with the job of inference.

Includes a fix for #53434, by ensuring SOURCE_MODE_ABI results in the item going into the global cache. Fixes #53433, as inInference is computed by the engine and protected by a lock, which also fixes #53680.
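As a rough illustration of the per-(MI, owner) reservation and double-check pattern described above, here is a minimal sketch in C. The names (`engine_reserve`, `engine_fulfill`, `engine_in_inference`), the linked-list reservation table, and the opaque-pointer keys are assumptions made for illustration; the actual engine also allocates and returns the CodeInstance and handles inference cycles, both of which this sketch omits.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reservation table keyed on (MI, owner); opaque pointers
 * stand in for the MethodInstance and cache-owner values. */
typedef struct reservation {
    void *mi;
    void *owner;
    struct reservation *next;
} reservation_t;

static pthread_mutex_t engine_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  engine_wait = PTHREAD_COND_INITIALIZER;
static reservation_t  *in_progress = NULL;

/* Must be called with engine_lock held. */
static bool reserved(void *mi, void *owner)
{
    for (reservation_t *r = in_progress; r; r = r->next)
        if (r->mi == mi && r->owner == owner)
            return true;
    return false;
}

/* Block until (mi, owner) is free, then claim it. After this returns,
 * the caller should re-check the global cache (the double-check step):
 * another thread may have finished inference while we waited. */
void engine_reserve(void *mi, void *owner)
{
    pthread_mutex_lock(&engine_lock);
    while (reserved(mi, owner))
        pthread_cond_wait(&engine_wait, &engine_lock);
    reservation_t *r = malloc(sizeof(reservation_t));
    r->mi = mi;
    r->owner = owner;
    r->next = in_progress;
    in_progress = r;
    pthread_mutex_unlock(&engine_lock);
}

/* Release the reservation on (mi, owner) and wake any waiters. */
void engine_fulfill(void *mi, void *owner)
{
    pthread_mutex_lock(&engine_lock);
    for (reservation_t **p = &in_progress; *p; p = &(*p)->next) {
        if ((*p)->mi == mi && (*p)->owner == owner) {
            reservation_t *r = *p;
            *p = r->next;
            free(r);
            break;
        }
    }
    pthread_cond_broadcast(&engine_wait);
    pthread_mutex_unlock(&engine_lock);
}

/* The inInference state is now computed from the engine under its lock,
 * rather than read from an unsynchronized field on the MethodInstance. */
bool engine_in_inference(void *mi, void *owner)
{
    pthread_mutex_lock(&engine_lock);
    bool r = reserved(mi, owner);
    pthread_mutex_unlock(&engine_lock);
    return r;
}
```

The intended usage follows the double-check pattern: check the cache, call the reserve function on a miss, re-check the cache once the reservation is granted, and only then run inference before fulfilling the reservation.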
The changes in #53219 appear to have introduced a data race on the mi->inInference field, which seems likely to lead to incorrect compilation and bad performance of threaded code. It was previously protected by the codegen lock.

Originally posted by @vtjnash in #53219 (comment)

Posting as an issue since the PR is already merged, and this is a regression in thread-safety.
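For context, here is a minimal sketch in C of the check-then-set pattern that becomes a data race once the codegen lock no longer serializes it. The struct and function names are hypothetical stand-ins for illustration, not Julia's actual sources:

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of a MethodInstance;
 * only the field at issue is shown. */
typedef struct {
    uint8_t inInference; /* plain (non-atomic) field */
} method_instance_t;

/* Racy pattern: two threads may both observe inInference == 0 and both
 * start inference, and the plain reads/writes may be freely reordered.
 * Before #53219, this check-then-set was serialized by the codegen lock. */
void maybe_infer_racy(method_instance_t *mi)
{
    if (!mi->inInference) {   /* unsynchronized read  */
        mi->inInference = 1;  /* unsynchronized write */
        /* ... run type inference ... */
        mi->inInference = 0;
    }
}

/* Lock-protected variant, analogous to what the codegen lock provided:
 * the check and the set are performed atomically together. */
static pthread_mutex_t inference_lock = PTHREAD_MUTEX_INITIALIZER;

void maybe_infer_locked(method_instance_t *mi)
{
    pthread_mutex_lock(&inference_lock);
    int claimed = !mi->inInference;
    if (claimed)
        mi->inInference = 1;
    pthread_mutex_unlock(&inference_lock);
    if (claimed) {
        /* ... run type inference ... */
        pthread_mutex_lock(&inference_lock);
        mi->inInference = 0;
        pthread_mutex_unlock(&inference_lock);
    }
}
```

Under the C memory model, concurrent unsynchronized access as in the racy variant is undefined behavior: both threads may claim the flag and duplicate inference work, or one thread's update may simply be lost.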