sp_BlitzLock and parallel deadlocks #2636

Closed
erikdarlingdata opened this issue Oct 18, 2020 · 2 comments · Fixed by #2637

@erikdarlingdata
Version of the script
SELECT @Version = '2.999', @VersionDate = '20201011';

What is the current behavior?
Complicated!

In newer versions of SQL Server, parallel deadlocks don't produce a deadlock graph unless they error out with this message:

Msg 1205, Level 13, State 18, Line 3
Transaction (Process ID 52) was deadlocked on lock | communication buffer resources with another process and has been chosen as the deadlock victim. Rerun the transaction.

This change started with SQL Server 2017 CU10 and, to my knowledge, also applies to SQL Server 2016 SP2 and SQL Server 2019:

FIX: Many xml_deadlock_report events are reported for one single intra-query deadlock occurrence in SQL Server 2016 and 2017

The code written in #1499 handled normal parallel exchange spills (deadlocks), but not the ones that error out. In fact, a lot of the code that added that detection was defensive because, as the KB article says, those spills would generate a lot of XML.

The deadlocks that produce errors have slightly different characteristics that we don't pick up on.
image

This messes up both the way that BlitzLock figures out if deadlocks are parallel, and the way that it grabs information about them. It also came to my attention while reviewing this that we don't have a roll-up warning about parallel deadlocks.

If the current behavior is a bug, please provide the steps to reproduce.
There's a repro for a normal exchange spill in #1499, the type that no longer produces a deadlock graph. A query that does produce one looks like this:

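-- Start clean so SELECT ... INTO can create the temp table.
DROP TABLE IF EXISTS #ohno;

-- Copy dbo.Comments into a temp table while holding an exclusive table lock.
SELECT c.Id, c.UserId, c.CreationDate, c.PostId, c.Score
INTO #ohno
FROM dbo.Comments AS c WITH (TABLOCKX);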

What is the expected behavior?
I think the best fix is to add a check to the initial XML grab that detects whether XML fragments specific to parallel deadlocks exist, and to filter on that. Otherwise, the query that separates regular from parallel deadlocks becomes quite convoluted. We should also add a count of parallel deadlocks to the roll-up. I know, I know, those are two different things, but they're closely related.
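
A minimal sketch of that kind of check, assuming the parallel-specific fragment is an exchangeEvent node under resource-list (that is my assumption; the fragment isn't named here). The table variable, its column names, and the trimmed sample graphs are hypothetical and are not sp_BlitzLock's actual code:

```sql
-- Hypothetical stand-in for the initial XML grab: two heavily trimmed deadlock graphs.
DECLARE @deadlocks table
(
    id int IDENTITY PRIMARY KEY,
    deadlock_xml xml NOT NULL
);

INSERT @deadlocks (deadlock_xml)
VALUES
    -- A regular lock deadlock (attributes trimmed, values made up).
    (N'<deadlock><victim-list /><process-list /><resource-list><keylock objectname="dbo.Comments" /></resource-list></deadlock>'),
    -- A parallel deadlock: ASSUMED to be identifiable by an exchangeEvent resource.
    (N'<deadlock><victim-list /><process-list /><resource-list><exchangeEvent WaitType="e_waitPipeNewRow" /></resource-list></deadlock>');

-- Flag each graph once, up front, so later queries can filter on a simple bit.
SELECT
    d.id,
    is_parallel = d.deadlock_xml.exist('/deadlock/resource-list/exchangeEvent')
FROM @deadlocks AS d;

-- Roll-up: count of parallel deadlocks, using the same check.
SELECT
    parallel_deadlocks = COUNT_BIG(*)
FROM @deadlocks AS d
WHERE d.deadlock_xml.exist('/deadlock/resource-list/exchangeEvent') = 1;
```

Computing the flag once during the XML grab keeps the later regular vs parallel filters down to a comparison on is_parallel instead of repeating XML predicates in every query.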

Code is just about ready to go.

Which versions of SQL Server and which OS are affected by this issue? Did this work in previous versions of our procedures?
Very all so many

erikdarlingdata changed the title from "sp_BlitzLock, and parallel deadlocks" to "sp_BlitzLock and parallel deadlocks" on Oct 18, 2020
BrentOzar added this to the 2020-11 Release milestone on Oct 21, 2020
@BrentOzar

Thanks sir! Merged into the dev branch, will be in the November release with credit to you in the release notes. I didn't even attempt to test this one, hahaha.

@erikdarlingdata

image

erikdarlingdata pushed a commit to erikdarlingdata/SQL-Server-First-Responder-Kit that referenced this issue Mar 18, 2021
This update attempts to fix a few issues, which I know is frowned upon, but one of them is minor and the other ones I discovered while testing changes.

Closes BrentOzarULTD#2800 - filters out the Initial Boot Probe SQL Agent lines

Closes BrentOzarULTD#2824 - fixes the way we output query text when outputting to Excel

Bugs I found along the way:
 * Filtering out parallel deadlocks got weird after I made fixes in BrentOzarULTD#2686 and BrentOzarULTD#2636
 * After I fixed BrentOzarULTD#2674 and dates were corrected, gathering object names broke
 * Fixes were applied across the regular output and the cross-server output