regression in thread-safety with data-race on mi->inInference field #53433

Closed
vtjnash opened this issue Feb 22, 2024 · 0 comments · Fixed by #54816
Labels
- compiler:codegen (Generation of LLVM IR and native code)
- multithreading (Base.Threads and related functionality)
- regression (Regression in behavior compared to a previous version)
Comments

vtjnash (Member) commented Feb 22, 2024

The changes in #53219 appear to have introduced a data race on the `mi->inInference` field, which seems likely to lead to incorrect compilation and poor performance of threaded code. That field was previously protected by the codegen lock.

Originally posted by @vtjnash in #53219 (comment)

Posting this as an issue since the PR is already merged and this is a regression in thread safety.
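
For illustration, here is a minimal sketch of the racy pattern being described: two threads doing an unsynchronized test-and-set on a plain flag that the codegen lock used to serialize. The `method_instance_t` type and `worker` function are hypothetical simplifications, not the actual runtime source.

```c
#include <pthread.h>

/* Hypothetical, simplified stand-in for jl_method_instance_t. */
typedef struct {
    int inInference; /* plain field: unsynchronized access is a data race */
} method_instance_t;

static method_instance_t mi = {0};

/* Racy pattern: test-and-set on mi.inInference without a lock.
 * Both threads can observe 0 and both proceed to "infer", or the
 * compiler/CPU may reorder the plain accesses entirely. */
static void *worker(void *arg)
{
    if (!mi.inInference) {   /* unsynchronized read  */
        mi.inInference = 1;  /* unsynchronized write */
        /* ... run type inference for mi ... */
        mi.inInference = 0;
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* Running this under -fsanitize=thread should report the race. */
    return 0;
}
```

Because neither access is atomic or lock-protected, both threads can observe `inInference == 0` and both enter inference, duplicating work or corrupting shared state.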

vtjnash changed the title from "This appears to have introduced a datarace on the mi->inInference field, which seems likely to lead to incorrect compilation and bad performance of threaded code. It was previously protected by the codegen lock." to "regression in thread-safety with data-race on mi->inInference field" on Feb 22, 2024
vtjnash added the regression, compiler:codegen, and multithreading labels on Feb 22, 2024
vtjnash added a commit that referenced this issue Jun 15, 2024
Continuing from previous PRs making CodeInstance the primary means of
tracking compilation, this introduces an "engine" which keeps track
externally of whether a particular inference result is in progress,
and where. At present, it handles unexpected cycles by permitting both
threads to work on the result. This is likely to be optimal most of
the time for now, until we have the ability to do work-stealing of
the results.

To assist with that, CodeInstance is now primarily allocated by
`jl_engine_reserve`, which also records that the result is currently
being inferred. This creates a sort of per-(MI, owner) tuple lock
mechanism, which can be used with the double-check pattern to see
whether inference was completed while waiting on that lock. The
`world` value is not included, since that is inferred later, so there
is a possibility that a thread waits only to discover that the result
was already invalid before it could use it (though this should be
unlikely).

The process can then notify when it has finished and wants to release
the reservation lock on that identity pair. When doing so, it may also
provide source code, allowing the runtime to potentially start
compiling that result on a threadpool while the main thread continues
with the job of inference.

Includes a fix for #53434, by ensuring that SOURCE_MODE_ABI results in
the item going into the global cache.

Fixes #53433, as inInference is now computed by the engine and
protected by a lock, which also fixes #53680.
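
To make the mechanism concrete, here is a minimal self-contained sketch of the reservation scheme described above, using one global mutex, a condition variable, and linked lists. Every name in it (`engine_reserve`, `engine_fulfill`, `reservation_t`, `lookup_completed`, the struct layouts) is a hypothetical simplification rather than the actual runtime API; the real entry point mentioned above is `jl_engine_reserve`, and the cycle handling that lets both threads proceed is omitted for brevity.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical simplified types standing in for the runtime's
 * jl_method_instance_t and jl_code_instance_t. */
typedef struct { int id; } method_instance_t;
typedef struct code_instance {
    method_instance_t *mi;
    void *owner;
    void *inferred;               /* non-NULL once published */
    struct code_instance *next;
} code_instance_t;

static pthread_mutex_t engine_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  engine_wakeup = PTHREAD_COND_INITIALIZER;

/* Reservations currently held, keyed by the (mi, owner) pair. */
typedef struct reservation {
    method_instance_t *mi;
    void *owner;
    struct reservation *next;
} reservation_t;
static reservation_t *in_progress = NULL;

/* Toy stand-in for the global cache of finished results. */
static code_instance_t *completed = NULL;

static reservation_t *find_reservation(method_instance_t *mi, void *owner)
{
    for (reservation_t *r = in_progress; r; r = r->next)
        if (r->mi == mi && r->owner == owner)
            return r;
    return NULL;
}

static code_instance_t *lookup_completed(method_instance_t *mi, void *owner)
{
    for (code_instance_t *c = completed; c; c = c->next)
        if (c->mi == mi && c->owner == owner)
            return c;
    return NULL;
}

/* Reserve the right to infer (mi, owner). If another thread holds
 * the reservation, block; on each wake-up, double-check whether a
 * finished result was published in the meantime and reuse it. */
code_instance_t *engine_reserve(method_instance_t *mi, void *owner)
{
    pthread_mutex_lock(&engine_lock);
    while (find_reservation(mi, owner)) {
        pthread_cond_wait(&engine_wakeup, &engine_lock);
        code_instance_t *done = lookup_completed(mi, owner);
        if (done) {               /* the double-check */
            pthread_mutex_unlock(&engine_lock);
            return done;
        }
    }
    reservation_t *r = malloc(sizeof(*r));
    r->mi = mi;
    r->owner = owner;
    r->next = in_progress;
    in_progress = r;
    pthread_mutex_unlock(&engine_lock);

    /* Caller now owns the reservation and runs inference. */
    code_instance_t *ci = calloc(1, sizeof(*ci));
    ci->mi = mi;
    ci->owner = owner;
    return ci;
}

/* Drop the reservation on (mi, owner) once inference finished (or
 * was abandoned). A non-NULL `src` publishes the result; in the
 * real design it could also be queued for threadpool compilation. */
void engine_fulfill(code_instance_t *ci, void *src)
{
    pthread_mutex_lock(&engine_lock);
    reservation_t **p = &in_progress;
    while (*p && !((*p)->mi == ci->mi && (*p)->owner == ci->owner))
        p = &(*p)->next;
    if (*p) {
        reservation_t *r = *p;
        *p = r->next;
        free(r);
    }
    if (src) {
        ci->inferred = src;
        ci->next = completed;
        completed = ci;
    }
    pthread_cond_broadcast(&engine_wakeup);  /* wake any waiters */
    pthread_mutex_unlock(&engine_lock);
}
```

The key design point is that the reservation is keyed by the (MI, owner) identity pair instead of one global codegen lock, so unrelated inference jobs no longer serialize against each other, while the double-check on wake-up keeps waiters from redoing work that was just published.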