Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLServer - Fixes sqlserver_process_cpu calculation #8549

Merged
merged 2 commits into from
Mar 23, 2021

Conversation

ClaudioESSilva
Copy link
Contributor

fixes #8548

Required for all PRs:

  • Signed CLA.
  • Associated README.md updated.
  • Has appropriate unit tests.

cc / @denzilribeiro @Trovalo

@Trovalo
Copy link
Collaborator

Trovalo commented Dec 14, 2020

I had a look at it but tbh I've never had problems with the values returned by the query, the sum has always been 100. (looks like all the servers I see have only one physical CPU tho, so I can't replicate this yet).
I'll look into it soon by finding a server with more than one Physical CPU.

as a note, cpu_count / hyperthread_ratio are both integers and the decimal part is going to be truncated, is hyperthread_ratio always a submultiple of cpu_count or are we risking to lose something?

@ClaudioESSilva
Copy link
Contributor Author

Thanks for the feedback @Trovalo.

I haven't seen it before either. Until I have added this bigger server to the monitoring.

Regarding the calculation, I have removed the decimal part so we keep the same behaviour as before. We are also just getting integers.

I think we are not risking to lose something. Per MS documentation:
cpu_count - specifies the number of logical CPUs in the system
hyoerthread_ratio - specifies the ratio of the number of logical or physical cores that are exposed by one physical processor package

So, it seems to be always a multiple.

@sjwang90 sjwang90 added area/sqlserver fix pr to fix corresponding bug labels Dec 15, 2020
@Trovalo
Copy link
Collaborator

Trovalo commented Dec 15, 2020

@ClaudioESSilva I've run the query on some instances with more than one physical CPUs without issues.
here are the results, in order:

  1. current query
  2. Physical CPU Count
  3. Proposed query (which doesn't make any difference since there is no anomaly)

(I've removed the TOP 1 to fetch older rows)
image

I've tried it on SQL 2005, 2008R2, 2012, 2016, 2017 without issues.
Which edition are you running against?

@ClaudioESSilva
Copy link
Contributor Author

It is a 2016 enterprise.
I have also run against a couple of 2019 with no problems.

@Trovalo
Copy link
Collaborator

Trovalo commented Dec 19, 2020

Maybe something's wrong with that instance then.
I had some anomalies too in the past on a pair of instances, which when under heavy load messed up some data (percentages over 100% and similar stuff).
Always talking about broken data, in the past I had some instances with broken performance counters (totally missing or with a fixed 0 value), so sometimes there is just something wrong on SQL Server or in the OS.

@ClaudioESSilva
Copy link
Contributor Author

I don't think so. But, I need to get (ou ask someone) another server with similar config.

I found that Solarwinds also discovered this
https://support.solarwinds.com/SuccessCenter/s/article/CPU-utilization-is-not-displayed-properly-in-DPA?language=en_US

@Trovalo
Copy link
Collaborator

Trovalo commented Dec 19, 2020

In any case, this is good to go and will fix this (hopefully) rare issue.
If you find out something more let me know, I'm curious about it.

@helenosheaa
Copy link
Member

@ClaudioESSilva What versions have you tested this on currently?

@ClaudioESSilva
Copy link
Contributor Author

Hi @helenosheaa
I have run the fix to test on SQL Server 2012, 2016 and 2019.
@Trovalo on 2005, 2008R2, 2012 and 2017

Copy link
Member

@helenosheaa helenosheaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks @ClaudioESSilva looks good.

@helenosheaa helenosheaa merged commit 9aaaf72 into influxdata:master Mar 23, 2021
reimda pushed a commit that referenced this pull request Apr 7, 2021
jblesener pushed a commit to jblesener/telegraf that referenced this pull request Apr 18, 2021
@fbannister
Copy link

This fails on SQL 2008 due to formatting in the https://github.com/influxdata/telegraf/blob/master/plugins/inputs/sqlserver/sqlqueriesV2.go file in the sqlServerCPUV2 string section. I suspect an inadvertent paste. The semi-colon need to move to after the last END - Refer to below as this works on both:
const sqlServerCPUV2 string = /*The ring buffer has a new value every minute*/ IF SERVERPROPERTY('EngineEdition') IN (2,3,4) /*Standard,Enterpris,Express*/ BEGIN WITH utilization_cte AS ( SELECT [SQLProcessUtilization] AS [sqlserver_process_cpu] ,[SystemIdle] AS [system_idle_cpu] ,100 - [SystemIdle] - [SQLProcessUtilization] AS [other_process_cpu] FROM ( SELECT TOP 1 [record_id] ,[SQLProcessUtilization] ,[SystemIdle] FROM ( SELECT record.value('(./Record/@id)[1]', 'int') AS [record_id] ,record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS [SystemIdle] ,record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS [SQLProcessUtilization] ,[TIMESTAMP] FROM ( SELECT [TIMESTAMP] ,convert(XML, [record]) AS [record] FROM sys.dm_os_ring_buffers WHERE [ring_buffer_type] = N'RING_BUFFER_SCHEDULER_MONITOR' AND [record] LIKE '%<SystemHealth>%' ) AS x ) AS y ORDER BY [record_id] DESC ) AS z ), processor_Info_cte AS ( SELECT (cpu_count / hyperthread_ratio) as number_of_physical_cpus FROM sys.dm_os_sys_info ) SELECT 'sqlserver_cpu' AS [measurement] ,REPLACE(@@SERVERNAME,'\',':') AS [sql_instance] ,[sqlserver_process_cpu] ,[system_idle_cpu] ,100 - [system_idle_cpu] - [sqlserver_process_cpu] AS [other_process_cpu] FROM ( SELECT (case when [other_process_cpu] < 0 then [sqlserver_process_cpu] / a.number_of_physical_cpus else [sqlserver_process_cpu] end) as [sqlserver_process_cpu] ,[system_idle_cpu] FROM utilization_cte CROSS APPLY processor_Info_cte a ) AS b END;

@Trovalo
Copy link
Collaborator

Trovalo commented May 21, 2021

It was fixed for the new queries in #9130
the problem was caused by some whitespace in the query, which is not a valid char for SQL 2008.

With new queries I mean database_type = "SQLServer" which uses the queries in plugins/inputs/sqlserver/sqlserverqueries.go.
Consider queriesV2 out of support, as those are no longer maintained or developed but feel free to open a PR to fix this.

arstercz pushed a commit to arstercz/telegraf that referenced this pull request Mar 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/sqlserver fix pr to fix corresponding bug
Projects
None yet
5 participants