Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sp_BlitzCache sometimes fails when there are readable secondaries on the server #2027

Closed
sivey42 opened this issue Apr 18, 2019 · 16 comments

Comments

Projects
None yet
3 participants
@sivey42
Copy link
Contributor

commented Apr 18, 2019

Version of the script
7.4, 20190320

What is the current behavior?
when I run the script as a scheduled job on a server that's an AG secondary, and has readable replica databases on it, I'll occasionally (~25% of job runs in my case) get a failure with the following message...

Msg 978, Level 14, State 1, Line 3
The target database ('MyDatabase') is in an availability group and is currently accessible for connections when the application intent is set to read only. For more information about application intent, see SQL Server Books Online.

If the current behavior is a bug, please provide the steps to reproduce.
run sp_BlitzFirst as you would from a scheduled job on a server with readable secondaries.

What is the expected behavior?
No errors.

Which versions of SQL Server and which OS are affected by this issue? Did this work in previous versions of our procedures?
13.0.5264.1

@BrentOzar

This comment has been minimized.

Copy link
Member

commented Apr 19, 2019

OK, can you try turning off the plan cache collection? Remove the @OutputTableNameBlitzCache parameter and let's see if that does it. We've had reports of sp_BlitzCache not working when some of the databases are inaccessible.

@BrentOzar BrentOzar added the bug label Apr 19, 2019

@sivey42

This comment has been minimized.

Copy link
Contributor Author

commented Apr 24, 2019

still trying to figure out a way to repro it, since it's pretty hit & miss. What you are suggesting does sound like the issue. In the job log, the print output immediately before the failure says it's getting plan information.

Collecting execution plan information. [SQLSTATE 01000] (Message 50000) The target database ('MyDatabase') is in an availability group...

Disabling the output for cache isn't a good long term solution for this, since we have both non-AG databases and AG readable secondaries on the same instance.

@BrentOzar

This comment has been minimized.

Copy link
Member

commented Apr 24, 2019

OK, perfect. In that case, give the dev branch of sp_BlitzFirst and sp_BlitzCache a shot too, just to see - I'd hate to have you troubleshoot something only to find out it's been fixed in the recent changes.

@BrentOzar BrentOzar changed the title sp_BlitzFirst fails when there are readable secondaries on the server sp_BlitzCache sometimes fails when there are readable secondaries on the server Apr 24, 2019

@sivey42

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2019

Update on this - it's been failing every time since I updated to the April release. Removing the @OutputTableNameBlitzCache does appear to solve the issue. I do have non-AG databases on the server too, so I'm throwing out collecting info on those databases too.

@BrentOzar

This comment has been minimized.

Copy link
Member

commented May 10, 2019

OK, can you give the dev branch of sp_BlitzFirst and sp_BlitzCache a shot too please? Thanks.

@sivey42

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2019

OK, can you give the dev branch of sp_BlitzFirst and sp_BlitzCache a shot too please? Thanks.

Tried dev branch, same issue...

Msg 978, Level 14, State 1, Line 120
The target database ('XXX') is in an availability group and is currently accessible for connections when the application intent is set to read only. For more information about application intent, see SQL Server Books Online.

@BrentOzar

This comment has been minimized.

Copy link
Member

commented May 10, 2019

OK, great, thanks!

@sivey42

This comment has been minimized.

Copy link
Contributor Author

commented May 10, 2019

I think have a fix for this. Will create pull request, and tag this. Should I use dev or some other branch?

sivey42 added a commit to sivey42/SQL-Server-First-Responder-Kit that referenced this issue May 10, 2019

@sivey42

This comment has been minimized.

Copy link
Contributor Author

commented May 23, 2019

This looks stalled. @BrentOzar - anything you need me to do to get it moving forward? I have a pull request submitted, but not sure if there is anything else I need to do to get that committed.

@BrentOzar

This comment has been minimized.

Copy link
Member

commented May 23, 2019

@sivey42 thanks for the contribution! I'm a little bit busy right now, but I'll work on it when I get time. If you'd like to chat with folks in the Slack channel (it's in the readme.md and the contributing guide), other folks may be able to help you test it sooner.

@sivey42

This comment has been minimized.

Copy link
Contributor Author

commented May 23, 2019

@sivey42 thanks for the contribution! I'm a little bit busy right now, but I'll work on it when I get time. If you'd like to chat with folks in the Slack channel (it's in the readme.md and the contributing guide), other folks may be able to help you test it sooner.

OK, no worries. Wasn't sure if I tied this back to the pull request correctly, so just wanted to make sure it wasn't getting lost in the shuffle.

@Adedba

This comment has been minimized.

Copy link
Contributor

commented Jun 4, 2019

Just given it a test as we are experiencing the same issue at the moment, Looks like is_primary_replica is not available as a column in sys.dm_hadr_database_replica_states for SQL 2012. To test further I replaced the new code with SELECT DBs.database_id FROM sys.databases DBs INNER JOIN sys.availability_replicas Replicas ON DBs.replica_id = Replicas.replica_id WHERE replica_server_name NOT IN (SELECT DISTINCT primary_replica FROM sys.dm_hadr_availability_group_states States) AND Replicas.secondary_role_allow_connections_desc = 'READ_ONLY' AND replica_server_name = @@SERVERNAME

This seemed to work fine on my SQL2012 instance which has the issue , not suggesting you need to use this code , just thought I would add it as it may help to re write for compatibility :)

@sivey42

This comment has been minimized.

Copy link
Contributor Author

commented Jun 7, 2019

Just given it a test as we are experiencing the same issue at the moment, Looks like is_primary_replica is not available as a column in sys.dm_hadr_database_replica_states for SQL 2012. To test further I replaced the new code with SELECT DBs.database_id FROM sys.databases DBs INNER JOIN sys.availability_replicas Replicas ON DBs.replica_id = Replicas.replica_id WHERE replica_server_name NOT IN (SELECT DISTINCT primary_replica FROM sys.dm_hadr_availability_group_states States) AND Replicas.secondary_role_allow_connections_desc = 'READ_ONLY' AND replica_server_name = @@SERVERNAME

This seemed to work fine on my SQL2012 instance which has the issue , not suggesting you need to use this code , just thought I would add it as it may help to re write for compatibility :)

that makes sense - I haven't been able to test with anything but 2016 AG.

BrentOzar added a commit that referenced this issue Jul 2, 2019

ignore readable secondaries in sp_blitzcache (#2049)
* ignore readable secondaries in sp_blitzcache

this is a fix for issue #2027

* Checking for is_primary_replica column

In sys.dm_hadr_database_replica_states.
@BrentOzar

This comment has been minimized.

Copy link
Member

commented Jul 2, 2019

Great work! I fixed it up to make it work with 2008-2012. Thanks, will get this into the July release and credit you.

@BrentOzar BrentOzar closed this Jul 2, 2019

BrentOzar added a commit that referenced this issue Jul 2, 2019

2019_07_release (#2039)
* #2005 sp_Blitz version numbers (#2006)

Working on #2005.

* #2005 easier merges for versions

Putting Version, VersionDate variables on a single line. Closes #2005.

* #2010 sp_Blitz paused online index operations (#2011)

Add warning for rows in sys.index_resumable_operations. Closes #2010.

* Fix collation error. (#2014)

* #2015 sp_BlitzCache sorting in Azure SQL DB (#2016)

Instead of checking version numbers, look at sys.all_columns for the grant and spill columns. Closes #2015.

* #2024 sp_BlitzQueryStore s.is_cursor error (#2025)

Changed alias on a temp table to match other nearby aliases. Closes #2024.

* First responder consistency check (#1997)

* sp_Blitz update compile and runs on MSSQL 2014

* Update sp_Blitz.sql

Added @debug handling - bug found for missing mandatory component

* Update sp_Blitz.sql

* Update sp_Blitz.sql


CheckId familly:

226 - FRK consistency

Messages:
2260 - First Responder kit consistency check has been unexpectedly modified (check names)
2261 - First Responder kit consistency check has been unexpectedly modified (dynamic query failure)
2262 - First Responder kit mandatory component called %s is missing
2263 - First Responder kit consistency check has been unexpectedly modified (checks ordering)
2264 - Component %s is not at the minimum version required to run this procedure
2265 - First Responder kit consistency check (Failed dynamic SP call to %s)
2266 - First Responder kit consistency: outdated component (%s)

removed "everything ok message"

* Resolves almost all Brent's comments except CHAR(92) change

removed 2264 - Component %s is not at the minimum version required to run this procedure
changed detailled finding message for previous 2264 message

Checked compilation on SQL Server 2014

* #1994 sp_Blitz version checking (#2028)

Changed wording from "consistency check" to "version check," updated check IDs, removed mandatory scripts. Closes #1994.

* #1944 sp_BlitzIndex ignore databases (#2029)

Adds new IgnoreDatabases parameter, and skips databases with >100 partitions. Closes #1944.

* #2009 sp_BlitzFirst call BC differently (#2030)

Use SkipAnalysis = 1, SortOrder = all when calling sp_BlitzCache. Closes #2009.

* #2026 sp_BlitzCache show multiple plan count (#2031)

Closes #2026.

* #2017 sp_Blitz Evaluation Edition expiration check (#2032)

Closes #2017.

* #2018 sp_BlitzCache arith overflow (#2033)

Changing MONEY on totals to BIGINT. Closes #2018.

* #2019 sp_BlitzCache air_quote_actual plans (#2034)

Closes #2019.

* Add END in the right spot for IF @Help = 1 (#2035)

The END for IF @Help = 1 was in the wrong place, making the procedure a no-op *unless* you also specify @Help = 1.

* #2037 sp_BlitzLock support for AWS RDS (#2038)

Skips update stats with options that RDS doesn't support. Closes #2037.

* 2019_04_release_prep

Bumping version numbers, building installation scripts.

* Moved SQL header build outside of If block (#2040)

* #2042 implicit transaction troubleshooting (#2043)

Clarify warnings in sp_Blitz, sp_BlitzFirst. Closes #2042.

* edit hyperlink (#2056)

* add @OutputType = 'XML' (#2048)

* Update sp_blitzcache to avoid arithmetic overflow errors (#2046)

Updated sp_blitzcache to avoid arithmetic overflow errors - tabel definitions had changed to BIGINT but the actual logic in this section still casted values as money which was breaking the ceiling for this on our setup.  I've pretty much blanket chantged everything except averages to be bigint and have been running for a few days with no failrues since so looks ok and worth someone who knows what they're doing casting an eye over it!

* Update sp_ineachdb.sql (#2061)

Added 2 Variables 
@SQLVersion: This was implemented so that large organizations with SQL Server 2008 and R2 can use this sproc.  Currently without this change SQL Server versions 2008 and 2008R2 will error with the following.  Msg 208, Level 16, State 1, Procedure sp_ineachdb, Line 176
Invalid object name 'sys.dm_hadr_database_replica_states'.

@ServerName:  This was added to account for VMware SnapShots at times @@ServerName will either come up NULL or with old server name.  CONVERT(sysname, SERVERPROPERTY('ServerName')) is a better option and reduces minor nuisances.  

Feel free to blend in the code if you feel this brings value to the project.

* ignore readable secondaries in sp_blitzcache (#2049)

* ignore readable secondaries in sp_blitzcache

this is a fix for issue #2027

* Checking for is_primary_replica column

In sys.dm_hadr_database_replica_states.

* 2019_07 Release

Bumping version numbers and updating install scripts.
@Adedba

This comment has been minimized.

Copy link
Contributor

commented Jul 3, 2019

Unfortunately looks like the new version is still hitting some issues - I am looking into it to see if I can get to the root of the problem - I will update as soon as I know more.

@BrentOzar

This comment has been minimized.

Copy link
Member

commented Jul 3, 2019

Yeah, not surprised. If you find something with the new code in the July release, can you start a new issue? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.