Fix coverity issues. Do not re-throw worker thread error in the destructor. #3886

Merged: 6 commits, May 11, 2022

Member

@stiepan stiepan commented May 10, 2022

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>

Category:

Description:

Adds an extra parameter to the WaitForWork method to prevent it from re-throwing worker thread errors when it is called from the destructor.

Additional information:

Affected modules and functionalities:

Fixes a possible deadlock on Shutdown if the worker thread quits with an exception

Key points relevant for the review:

Checklist

Tests

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-2736

@JanuszL JanuszL self-assigned this May 10, 2022
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [4800120]: BUILD STARTED

  std::unique_lock<std::mutex> lock(mutex_);
  while (!work_complete_) {
    completed_.wait(lock);
  }

  // Check for errors
- if (!errors_.empty()) {
+ if (rethrow_worker_errors && !errors_.empty()) {
Contributor

Are we okay with not breaking the execution if WaitForWork(false) is called and there is an error?

Contributor

I think that was the idea, wasn't it? Ideally, we would break if there was something other than runtime_error.

Member Author

If you are referring to running_ = false and cv_.notify, I don't think it does anything here: only the worker thread sets the errors, and when it does, it stops execution anyway. As for not throwing, I understood that to be the idea, as @mzient says.
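The behaviour discussed in this thread can be illustrated with a minimal sketch. It is not the DALI WorkerThread: the class CompletionState and the helpers MarkComplete and WaitThrows are hypothetical names introduced only for this illustration, and errors are stored as plain strings as an assumption. It only shows that a wait with rethrow_worker_errors set to false returns normally even when an error was recorded.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the completion and error state of a worker thread.
class CompletionState {
 public:
  // Would be called by the worker when it finishes, with or without an error.
  void MarkComplete(const std::string &error = "") {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      if (!error.empty()) errors_.push(error);
      work_complete_ = true;
    }
    completed_.notify_all();
  }

  // Blocks until the work is complete; rethrows a recorded error only on request.
  void WaitForWork(bool rethrow_worker_errors = true) {
    std::unique_lock<std::mutex> lock(mutex_);
    while (!work_complete_) {
      completed_.wait(lock);
    }
    if (rethrow_worker_errors && !errors_.empty()) {
      std::string error = errors_.front();
      errors_.pop();
      throw std::runtime_error(error);
    }
  }

 private:
  std::mutex mutex_;
  std::condition_variable completed_;
  bool work_complete_ = false;
  std::queue<std::string> errors_;
};

// Reports whether waiting on a failed "worker" throws for the given flag value.
bool WaitThrows(bool rethrow_worker_errors) {
  CompletionState state;
  state.MarkComplete("worker failed");
  try {
    state.WaitForWork(rethrow_worker_errors);
  } catch (const std::runtime_error &) {
    return true;
  }
  return false;
}
```

In this sketch the error stays queued when the flag is false, so a later call with the default argument could still surface it.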

@dali-automaton
Collaborator

CI MESSAGE: [4800120]: BUILD PASSED

@mzient mzient self-assigned this May 11, 2022
@@ -89,10 +89,10 @@ class WorkerThread {
* When the destructor is called other things that work() is using may have been gone long
* before causing a hang. Now when Shutdown is called we are sure that all things around still exist.
*/
- inline void Shutdown(void) {
+ inline void Shutdown(bool rethrow_worker_errors = true) {
Contributor

Do we have a legitimate case for rethrowing in shutdown? Isn't it ultimately always called from some kind of destructor?

Member Author

Yeah, one way or another it is always called from destructors.

Member Author

done
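The reason rethrowing matters here is that C++ destructors are implicitly noexcept, so an exception escaping one calls std::terminate. The following is a rough sketch of a shutdown path that is safe to call from a destructor. ScopedWorker, Failed and FailedWorkerShutsDownCleanly are hypothetical names, and the design (recording a failure flag rather than the exception) is an assumption made for brevity, not what the PR does.

```cpp
#include <atomic>
#include <functional>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical worker that runs one task and never throws while shutting down.
class ScopedWorker {
 public:
  explicit ScopedWorker(std::function<void()> work)
      : thread_([this, work = std::move(work)] {
          try {
            work();
          } catch (...) {
            failed_ = true;  // record the failure instead of letting it escape the thread
          }
        }) {}

  ScopedWorker(const ScopedWorker &) = delete;
  ScopedWorker &operator=(const ScopedWorker &) = delete;

  // The destructor only joins; it has nothing to rethrow.
  ~ScopedWorker() { Shutdown(); }

  // Safe to call more than once: the thread is joined only while joinable.
  void Shutdown() noexcept {
    if (thread_.joinable()) thread_.join();
  }

  bool Failed() const { return failed_.load(); }

 private:
  std::atomic<bool> failed_{false};  // declared before thread_ so it is initialized first
  std::thread thread_;
};

// Shuts down a worker whose task threw and reports whether the failure was recorded.
bool FailedWorkerShutsDownCleanly() {
  ScopedWorker worker([] { throw std::runtime_error("worker failed"); });
  worker.Shutdown();
  return worker.Failed();
}
```

A caller that cares about the error can query it explicitly before destruction, while the destructor stays exception-free.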

@@ -119,14 +119,14 @@ class WorkerThread {
cv_.notify_one();
}

- inline void WaitForWork() {
+ inline void WaitForWork(bool rethrow_worker_errors = true) {
std::unique_lock<std::mutex> lock(mutex_);
while (!work_complete_) {
Member Author

By the way, it seems that calling WaitForWork while the worker is actually doing something that then raises an error may deadlock, because the worker thread does not set the completion flag and notify completed_ in that case.
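One way to avoid the hang described above is to make the worker signal completion on the failure path as well as the success path. The sketch below assumes a single-task worker; OneShotWorker, Run and WaitReturnsAfterWorkerError are hypothetical names, and storing the error as a std::exception_ptr is an assumption of this example rather than a description of the actual fix.

```cpp
#include <condition_variable>
#include <exception>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical worker that always marks its work complete, even if the task throws.
class OneShotWorker {
 public:
  explicit OneShotWorker(std::function<void()> work)
      : thread_([this, work = std::move(work)] { Run(work); }) {}

  ~OneShotWorker() { thread_.join(); }

  void WaitForWork(bool rethrow_worker_errors = true) {
    std::unique_lock<std::mutex> lock(mutex_);
    completed_.wait(lock, [this] { return work_complete_; });
    if (rethrow_worker_errors && error_) {
      std::exception_ptr error = error_;
      error_ = nullptr;
      std::rethrow_exception(error);
    }
  }

 private:
  void Run(const std::function<void()> &work) {
    std::exception_ptr error;
    try {
      work();
    } catch (...) {
      error = std::current_exception();
    }
    {
      std::lock_guard<std::mutex> lock(mutex_);
      error_ = error;
      work_complete_ = true;  // set on both paths so waiters cannot block forever
    }
    completed_.notify_all();
  }

  // State is declared before thread_ so it is initialized before the thread starts.
  std::mutex mutex_;
  std::condition_variable completed_;
  bool work_complete_ = false;
  std::exception_ptr error_;
  std::thread thread_;
};

// Waits on a worker whose task throws; returns true if the wait ended with that error.
bool WaitReturnsAfterWorkerError() {
  OneShotWorker worker([] { throw std::runtime_error("worker failed"); });
  try {
    worker.WaitForWork();
  } catch (const std::runtime_error &) {
    return true;
  }
  return false;
}
```

The key point is that the flag is updated under the same mutex the waiter holds, so the wake-up cannot be missed regardless of how the task ended.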

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [4807109]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [4807109]: BUILD PASSED

@stiepan stiepan merged commit 1c377d6 into NVIDIA:main May 11, 2022
cyyever pushed a commit to cyyever/DALI that referenced this pull request May 13, 2022
…uctor. (NVIDIA#3886)

* Prevents throwing exception from the worker thread in the destructor
* Fixes possible deadlock on Shutdown/WaitForWork

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
cyyever pushed a commit to cyyever/DALI that referenced this pull request Jun 7, 2022
…uctor. (NVIDIA#3886)

* Prevents throwing exception from the worker thread in the destructor
* Fixes possible deadlock on Shutdown/WaitForWork

Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
4 participants