# 1.5 The D.E.A.T.H. Method: Tuning Indexes for Specific Queries

We covered the D.E. parts of the D.E.A.T.H. Method, and if we were going in order, we’d tackle the A part next: using Clippy’s index recommendations from the missing index DMVs. However, Clippy can be a little misleading, so just for the purpose of training, we’re going to tackle the T first: tuning indexes for these specific queries.


## Reminder - D.E.A.T.H. Method
<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.1.png" width=700></img>

**Dedupe & Eliminate** - Matter of hours of focused work

**Adding indexes** - Weekly, requires more thought
- Requires close examination of existing indexes
- Thinking about key order, selectivity
- Interpreting the ideas from SQL Server's recommendations (don't take a recommendation as gospel, but interpret the clues)


**Tuning indexes for specific queries** - Even more involved effort, typically 1-4 hours **per query**
- Finding the right queries to tune
- Ongoing monitoring (make sure it gets used)
- A/B testing for effectiveness
- Tuning the query itself
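
For the ongoing monitoring step, one way to check whether an index is actually being used is the index usage DMV. This is a minimal sketch, assuming it is run in the database that holds dbo.Users (the table used in the examples in this module). The counters are cleared when SQL Server restarts, so treat them as a rough signal rather than a full history.

```sql
/* Usage counters for every index on dbo.Users.
   NULL counters mean the index has no recorded usage since the counters were last cleared. */
SELECT i.name AS index_name,
       s.user_seeks,
       s.user_scans,
       s.user_lookups,
       s.user_updates,
       s.last_user_seek,
       s.last_user_scan
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
    ON s.object_id = i.object_id
    AND s.index_id = i.index_id
    AND s.database_id = DB_ID()
WHERE i.object_id = OBJECT_ID(N'dbo.Users');
GO
```

An index with high user_updates but no seeks or scans is being maintained without helping any reads.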


The following is an example using a query with only equality operators:

In [None]:
/* Both are equality searches, so key order doesn't matter */
SELECT Id
  FROM dbo.Users
  WHERE DisplayName = 'Brent Ozar'
  AND WebsiteUrl = 'https://www.brentozar.com';
GO

Note that because both filters are equality searches, the order doesn't matter for this query. 
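
One way to check this claim is to build both key orders and force each one with an index hint. The sketch below assumes the index names shown (they are hypothetical and exist only for this comparison) and that the indexes are dropped afterwards.

```sql
/* Hypothetical index names, created only for this comparison */
CREATE INDEX IX_DisplayName_WebsiteUrl ON dbo.Users(DisplayName, WebsiteUrl);
CREATE INDEX IX_WebsiteUrl_DisplayName ON dbo.Users(WebsiteUrl, DisplayName);
GO

SET STATISTICS IO ON;
GO

/* Force each index in turn and compare the logical reads in the Messages tab */
SELECT Id
  FROM dbo.Users WITH (INDEX = IX_DisplayName_WebsiteUrl)
  WHERE DisplayName = 'Brent Ozar'
  AND WebsiteUrl = 'https://www.brentozar.com';

SELECT Id
  FROM dbo.Users WITH (INDEX = IX_WebsiteUrl_DisplayName)
  WHERE DisplayName = 'Brent Ozar'
  AND WebsiteUrl = 'https://www.brentozar.com';
GO

/* Clean up the demo indexes */
DROP INDEX IX_DisplayName_WebsiteUrl ON dbo.Users;
DROP INDEX IX_WebsiteUrl_DisplayName ON dbo.Users;
GO
```

With two equality predicates, either index should be able to seek straight to the matching rows, so the logical reads should come out roughly the same.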

However, let's run an example using inequality operators...

In [None]:
/* Turn on actual plans (control-M) and: */
SET STATISTICS IO, TIME ON;
GO

CREATE OR ALTER PROC [dbo].[usp_Q6925] @UserId INT AS
BEGIN
/* Source: http://data.stackexchange.com/stackoverflow/query/6925/newer-users-with-more-reputation-than-me */
 
    SELECT u.Id as [User Link], u.Reputation, u.Reputation - me.Reputation as Difference
    FROM dbo.Users me 
    INNER JOIN dbo.Users u 
        ON u.CreationDate > me.CreationDate
        AND u.Reputation > me.Reputation
    WHERE me.Id = @UserId
 
END
GO

EXEC usp_Q6925 @UserId = 26837
GO

<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.3.png" width=900></img>

SQL Server starts with a Clustered Index Seek for the 'me' (PK_Users_Id) part of the join, directly finding the ID of the row that was specified.

<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.4.png" width=900></img>

Now it scans the Users table, looking for everyone who has a later CreationDate and a higher Reputation than the specified user.

The recommendation suggests that we add an index on CreationDate and Reputation to the Users table, but why is that? If we right-click the recommendation and scan the XML, we'll see the following:

<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.5.png" width = 500></img>

The recommendation simply lists the key columns in the order they appear in the table, which may or may not be right. For an equality search, this doesn't matter so much, but for an inequality search it matters a lot.
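
The same suggestion can also be read from the missing index DMVs rather than the plan XML. This is a rough sketch, assuming it is run in the database that holds dbo.Users and that the request is still in the DMV (the contents are cleared on restart).

```sql
/* Raw missing-index requests recorded for dbo.Users.
   Equality and inequality columns are reported separately;
   the DMV does not rank the columns within each list by selectivity. */
SELECT d.statement,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns
FROM sys.dm_db_missing_index_details AS d
WHERE d.database_id = DB_ID()
  AND d.object_id = OBJECT_ID(N'dbo.Users');
GO
```

Both CreationDate and Reputation should show up under inequality_columns for this query, which is a hint that the key order deserves a closer look before creating anything.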

Now if we go with Clippy's recommendation:

In [None]:
-- Clippy's recommendation
CREATE INDEX IX_CreationDate_Reputation ON dbo.Users(CreationDate, Reputation);


We have the following outcomes:

1. Logical Reads with no indexes:<br>
<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.6.png" width = 500></img>
2. Query plan with IX_CreationDate_Reputation<br>
<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.7.png" width = 700></img>
 <---- It's doing a scan on our index, which is great, but check out that chunky arrow! 
3. Number of rows read against our index<br>
<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.8.png" width = 300></img>
4. Yikes, 8.9 million rows read from a 9-million-row table! That's quite a lot... No surprise the logical reads look like this:<br>
<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.9.png" width = 500></img>
5. That's still a lot of logical reads, though fewer than before. We can do better: let's flip the key order!

In [None]:
/* Joan Jett don't give a damn about her Reputation... but we do ;) */
CREATE INDEX IX_Reputation_CreationDate ON dbo.Users(Reputation, CreationDate);

1. Let's run the query again and inspect the query plan:<br>
<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.10.png" width = 800></img>
2. It decided to use our index (note: sometimes it doesn't...). Well, check out the logical reads on this bad boy! <br>
<img src="C:\Users\hartleyg\Desktop\Training\SQL Server\Brent Ozar\Mastering Index Tuning\1.5 DEATH - Tuning to Specific Queries\1.11.png" width = 500></img>
3. So why did it choose us over Clippy? Selectivity. In this instance, Reputation is the more selective of the two fields:

In [None]:
/* Look up the user, then count how many rows each inequality filter matches on its own */
SELECT * FROM dbo.Users WHERE Id = 26837;
SELECT COUNT(*) FROM dbo.Users WHERE CreationDate > '2008-10-10';   -- 8903829
SELECT COUNT(*) FROM dbo.Users WHERE Reputation > 11825;            -- 11213
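
For a side-by-side A/B test, the query from the stored procedure can be run ad hoc with each index forced by a hint. This is a sketch, assuming both IX_CreationDate_Reputation and IX_Reputation_CreationDate exist. It uses a local variable instead of the procedure parameter, so the row estimates may differ from what the procedure gets.

```sql
SET STATISTICS IO ON;
GO

DECLARE @UserId INT = 26837;

/* A: force Clippy's key order (CreationDate, Reputation) */
SELECT u.Id, u.Reputation, u.Reputation - me.Reputation AS Difference
FROM dbo.Users me
INNER JOIN dbo.Users u WITH (INDEX = IX_CreationDate_Reputation)
    ON u.CreationDate > me.CreationDate
    AND u.Reputation > me.Reputation
WHERE me.Id = @UserId;

/* B: force the flipped key order (Reputation, CreationDate) */
SELECT u.Id, u.Reputation, u.Reputation - me.Reputation AS Difference
FROM dbo.Users me
INNER JOIN dbo.Users u WITH (INDEX = IX_Reputation_CreationDate)
    ON u.CreationDate > me.CreationDate
    AND u.Reputation > me.Reputation
WHERE me.Id = @UserId;
GO
```

Comparing the logical reads reported for each statement shows how much the leading key column matters for this particular user.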

## Which should go first? 

In this instance, narrowing down the search space using Reputation first is more effective. Keep in mind that different parameter values, and different query shapes (e.g., ORDER BY or TOP), can make a given index more or less effective.

Remember that only one query plan is generated for the stored procedure and then reused, so it is also important to know which parameter values are actually being used and which of them need to be tuned.
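
To see how a different parameter value would be handled without disturbing the cached plan, the procedure can be executed with a one-off recompile. This is a minimal sketch; the second @UserId value is a placeholder, so substitute any Id that exists in your copy of the database.

```sql
SET STATISTICS IO ON;
GO

/* Uses (or creates) the cached plan for the procedure */
EXEC dbo.usp_Q6925 @UserId = 26837;
GO

/* Compiles a fresh plan for this call only and does not cache it.
   The value 1 is a placeholder user Id. */
EXEC dbo.usp_Q6925 @UserId = 1 WITH RECOMPILE;
GO
```

If the two calls produce different plan shapes or very different logical reads, the choice of index may depend on which parameter values matter most.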

## Exercise
Find ONE index that can best accommodate BOTH stored procedures: 

In [None]:


CREATE OR ALTER   PROC [dbo].[usp_PostsByCommentCount] @PostTypeId INT
AS
SELECT TOP 10 CommentCount, Score, ViewCount
FROM dbo.Posts
WHERE PostTypeId = @PostTypeId
ORDER BY CommentCount DESC;
GO

CREATE OR ALTER   PROC [dbo].[usp_PostsByScore] @PostTypeId INT, @CommentCountMinimum INT
AS
SELECT TOP 10 Id, CommentCount, Score
FROM dbo.Posts
WHERE CommentCount >= @CommentCountMinimum
AND PostTypeId = @PostTypeId
ORDER BY Score DESC;
GO

/* Create one index to improve both of these: */
EXEC usp_PostsByCommentCount @PostTypeId = 2;
GO
EXEC usp_PostsByScore @PostTypeId = 2, @CommentCountMinimum = 2;
GO

## Considerations
- The index we come up with may not be the best index for each individual query. Our goal is to find one index that improves the performance of both as best it can.
- Selectivity matters

## Strategy
1. Select the simplest stored proc of the two - in this instance, usp_PostsByCommentCount
2. Generate a script of the ideal index for this query
3. Review the next stored proc, and make adjustments to suit
4. Be careful if you need to change the key order. Placing a more selective field at the front of the index may have a downstream effect on the previous stored proc
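
As a rough illustration of step 2, here is one candidate index for usp_PostsByCommentCount considered entirely on its own. The index name is hypothetical, and this is only a starting point for the exercise, not the index it ends up with.

```sql
/* Hypothetical name. Candidate index for usp_PostsByCommentCount alone:
   the equality column (PostTypeId) first, then the ORDER BY column (CommentCount),
   with the remaining SELECT-list columns included so the index covers the query. */
CREATE INDEX IX_PostTypeId_CommentCount_Includes
ON dbo.Posts(PostTypeId, CommentCount) INCLUDE (Score, ViewCount);
GO
```

From here, steps 3 and 4 ask whether the second procedure's CommentCount inequality and ORDER BY Score call for a different key order.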

In [None]:
/*  Possible Keys: CommentCount or PostTypeId
    Possible includes: Score, ViewCount
*/
CREATE OR ALTER   PROC [dbo].[usp_PostsByCommentCount] @PostTypeId INT
AS
SELECT TOP 10 CommentCount, Score, ViewCount
FROM dbo.Posts
WHERE PostTypeId = @PostTypeId
ORDER BY CommentCount DESC;
GO

/*  Possible Keys: CommentCount or PostTypeId or Score
    Possible includes: none
*/
CREATE OR ALTER   PROC [dbo].[usp_PostsByScore] @PostTypeId INT, @CommentCountMinimum INT
AS
SELECT TOP 10 Id, CommentCount, Score
FROM dbo.Posts
WHERE CommentCount >= @CommentCountMinimum
AND PostTypeId = @PostTypeId
ORDER BY Score DESC;
GO

/* Create one index to improve both of these: */
EXEC usp_PostsByCommentCount @PostTypeId = 2;
GO
EXEC usp_PostsByScore @PostTypeId = 2, @CommentCountMinimum = 2;
GO

## Results

Our final index leads with CommentCount, then PostTypeId and Score. ViewCount is an INCLUDE in the index:

In [None]:
-- Final Index
CREATE INDEX CommentCount_PostTypeId_Score_Includes
ON dbo.Posts(CommentCount, PostTypeId, Score) INCLUDE (ViewCount)



### [dbo].[usp_PostsByCommentCount]

For our first stored proc, it will:
1. Scan the index with a reverse order on the CommentCount field.
2. Keep scanning until it finds the 10 records that match the @PostTypeId specified in the WHERE clause.

So how did it perform?

<img src="C:\Users\hartleyg\Desktop\Training\sqltraining\Brent Ozar\Mastering Index Tuning\1.5 DEATH Method - Tuning Indexes to Specific Queries\1.12.png" width = 500></img>

<img src="C:\Users\hartleyg\Desktop\Training\sqltraining\Brent Ozar\Mastering Index Tuning\1.5 DEATH Method - Tuning Indexes to Specific Queries\1.13.png" width = 800></img>

17 rows read to return the 10 we asked for, and 4 logical page reads: this isn't too bad! Given the parameters we provided, it didn't have to scan far to find the results we wanted, despite the fact that it ran a so-called 'evil' index scan...

However, if we changed the parameters we used (e.g., choosing a rarely used PostTypeId), the order of the result set, or the number of results we wanted, we would start to reduce the effectiveness of our index.
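
One way to test that sensitivity is to look at how the PostTypeId values are distributed and then rerun the procedure with a rare one. This is a sketch; the PostTypeId passed to the procedure is a placeholder, so pick a rare value from the first result set in your own copy of the data.

```sql
/* How common is each PostTypeId? A rare value means a longer scan
   down the CommentCount order before 10 matching rows are found. */
SELECT PostTypeId, COUNT(*) AS PostCount
FROM dbo.Posts
GROUP BY PostTypeId
ORDER BY COUNT(*) DESC;
GO

SET STATISTICS IO ON;
GO

/* Placeholder value: substitute a rare PostTypeId from the result above */
EXEC dbo.usp_PostsByCommentCount @PostTypeId = 8 WITH RECOMPILE;
GO
```

Comparing the logical reads against the @PostTypeId = 2 run gives a feel for how far the scan has to travel when matches are scarce.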


### [dbo].[usp_PostsByScore]

