Low priority: Theoretical deadlock when dumping exit info #8498

Open
kren1 opened this issue May 21, 2024 · 3 comments · Fixed by #8521
Labels
bug Issue is reported as a bug team:VM Assigned to OTP team VM

Comments

kren1 commented May 21, 2024

Describe the bug
erts_exit can acquire locks in a different order than other code paths, which creates a potential for deadlock.

To Reproduce
I haven't observed this dynamically

Here's a theoretical example I see in the code:

Thread 1: lock(&smq_mtx) -> lock(&export_staging_lock)

This goes via the error condition here, where the dumping of information eventually calls export_table_sz().

Thread 2: lock(&export_staging_lock) -> lock(&smq_mtx)

This goes via index_put_entry, which can fail and end up acquiring smq_mtx (shown in a screenshot in the original issue).

Expected behavior

Locks are acquired in the same order
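
To make the expected behavior concrete, here is a minimal sketch of two threads that always take a pair of locks in one fixed order, so neither can end up holding one lock while waiting for the other. The mutexes are hypothetical stand-ins named after the two locks in the report; this is not ERTS code, and `locked_increment` and `run_ordered_demo` are names invented for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two locks named in the report. */
static pthread_mutex_t smq_mtx_demo = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t export_staging_lock_demo = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter = 0;

/* Every caller takes the locks in the same fixed order:
 * export_staging_lock_demo first, then smq_mtx_demo. */
static void locked_increment(void)
{
    int rc;
    rc = pthread_mutex_lock(&export_staging_lock_demo);
    assert(rc == 0);
    rc = pthread_mutex_lock(&smq_mtx_demo);
    assert(rc == 0);
    (void)rc;
    shared_counter++;
    /* Release in reverse order of acquisition. */
    pthread_mutex_unlock(&smq_mtx_demo);
    pthread_mutex_unlock(&export_staging_lock_demo);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++)
        locked_increment();
    return NULL;
}

/* Runs two threads through the same lock order and returns the final count. */
int run_ordered_demo(void)
{
    pthread_t t1, t2;
    int rc;
    shared_counter = 0;
    rc = pthread_create(&t1, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

If one of the two threads instead took `smq_mtx_demo` first, each thread could hold one lock while waiting for the other, which is the inversion described in the report.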

Affected versions
Ran on OTP 26, 12509834ed

Additional context

This was found by Infer, and I'm not familiar enough with the code to tell whether some implicit invariants prevent this sort of deadlock. There are ~242 reports of this kind, all related to this dumping of info on exit, so I may not have described the best one above. This seems quite unlikely to be observed dynamically, but I thought you might be interested. Let me know if this sort of issue is too niche or small to be interesting.

report.json.zip is the full report of all the issues (you can download Infer to explore the issues more conveniently). The issue I described is at index 4.

@kren1 kren1 added the bug Issue is reported as a bug label May 21, 2024
@IngelaAndin IngelaAndin added the team:VM Assigned to OTP team VM label May 21, 2024

sverker commented May 28, 2024

This scenario cannot happen, as it requires two threads doing crash dumping at the same time, which there is protection against.
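
As a rough illustration of what such protection can look like in general, the sketch below lets only the first thread that reaches the dump path proceed. This is an assumption about one common technique, not a description of how ERTS actually implements it; `dump_started_demo` and `try_become_dumper` are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical flag: set once the first thread enters the dump path. */
static atomic_flag dump_started_demo = ATOMIC_FLAG_INIT;

/* Returns 1 for the first caller and 0 for every later caller.
 * atomic_flag_test_and_set returns the previous value of the flag,
 * so only one thread ever observes "not yet set". */
int try_become_dumper(void)
{
    return !atomic_flag_test_and_set(&dump_started_demo);
}
```

A thread that gets 0 back would have to stay out of the dump code, for example by blocking until the process exits.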

However, export_table_sz() should avoid locking during dump like export_info() does. Made a fix for that in #8521.
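
The idea of skipping the lock while dumping can be sketched roughly as follows. All names here (`crash_dumping_demo`, `staging_lock_demo`, `table_bytes_demo`, `table_sz_sketch`) are hypothetical and only illustrate the pattern; see #8521 for the actual change. That reading without the lock is safe during a dump is taken as given here, not shown.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the real ERTS symbols. */
static pthread_mutex_t staging_lock_demo = PTHREAD_MUTEX_INITIALIZER;
static bool crash_dumping_demo = false;  /* set once crash dumping starts */
static size_t table_bytes_demo = 4096;   /* pretend table size in bytes */

/* Reports the table size, taking the lock only when not crash dumping. */
size_t table_sz_sketch(void)
{
    bool lock = !crash_dumping_demo;
    size_t sz;

    if (lock)
        pthread_mutex_lock(&staging_lock_demo);
    sz = table_bytes_demo;
    if (lock)
        pthread_mutex_unlock(&staging_lock_demo);
    return sz;
}
```

With the flag set, the function returns even if the lock is already held, so the dump path can no longer take part in a lock-order inversion on this lock.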


sverker commented Jun 17, 2024

Searched for export_table_sz and "DEADLOCK" in report.json:

> grep export_table_sz report.json | wc
    235     940   13160
> grep \"DEADLOCK\" report.json | wc
    242     484    6776

Seems like 235 out of the 242 reported deadlocks had to do with export_table_sz(), which was fixed by #8521.


sverker commented Jun 18, 2024

The remaining 7 were caused by erts_schedulers_state(), which had the same problem. Fixed by #8591.
