[Microsoft SQL Server] metrics for Always On Availability Groups #16759

Merged
jakubgalecki0 merged 16 commits into elastic:main from jakubgalecki0:mssql-always-on
Feb 6, 2026

@jakubgalecki0 (Contributor) commented Jan 5, 2026

Proposed commit message

Add new data_stream for health metrics on Always On Availability Groups.

TSDB test
You're testing with version 8.19.0.

Testing data stream metrics-microsoft_sqlserver.availability_groups-default.

Index being used for the documents is .ds-metrics-microsoft_sqlserver.availability_groups-default-2026.01.26-000001.
Index being used for the settings and mappings is .ds-metrics-microsoft_sqlserver.availability_groups-default-2026.01.26-000002.

The time series fields for the TSDB index are:
	- dimension (13 fields):
		- agent.id
		- cloud.account.id
		- cloud.availability_zone
		- cloud.instance.id
		- cloud.provider
		- cloud.region
		- container.id
		- host.name
		- mssql.metrics.group_id
		- mssql.metrics.name
		- mssql.metrics.primary_replica
		- mssql.metrics.server_name
		- service.address
	- gauge (3 fields):
		- mssql.metrics.primary_recovery_health
		- mssql.metrics.secondary_recovery_health
		- mssql.metrics.synchronization_health
	- routing_path (13 fields):
		- agent.id
		- cloud.account.id
		- cloud.availability_zone
		- cloud.instance.id
		- cloud.provider
		- cloud.region
		- container.id
		- host.name
		- mssql.metrics.group_id
		- mssql.metrics.name
		- mssql.metrics.primary_replica
		- mssql.metrics.server_name
		- service.address

Index tsdb-index-enabled successfully created.

Copying documents from .ds-metrics-microsoft_sqlserver.availability_groups-default-2026.01.26-000001 to tsdb-index-enabled...
All 29 documents taken from index .ds-metrics-microsoft_sqlserver.availability_groups-default-2026.01.26-000001 were successfully placed to index tsdb-index-enabled.
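
In the output above, the routing_path list repeats the dimension list field for field. A minimal Python sketch of that comparison follows, assuming the two lists are copied by hand from the test output; `routing_path_mismatch` is an illustrative helper and not part of the TSDB test tooling.

```python
# Field names copied by hand from the TSDB test output above.
DIMENSIONS = [
    "agent.id",
    "cloud.account.id",
    "cloud.availability_zone",
    "cloud.instance.id",
    "cloud.provider",
    "cloud.region",
    "container.id",
    "host.name",
    "mssql.metrics.group_id",
    "mssql.metrics.name",
    "mssql.metrics.primary_replica",
    "mssql.metrics.server_name",
    "service.address",
]

# In this test run the routing_path list is the same as the dimension list.
ROUTING_PATH = [
    "agent.id",
    "cloud.account.id",
    "cloud.availability_zone",
    "cloud.instance.id",
    "cloud.provider",
    "cloud.region",
    "container.id",
    "host.name",
    "mssql.metrics.group_id",
    "mssql.metrics.name",
    "mssql.metrics.primary_replica",
    "mssql.metrics.server_name",
    "service.address",
]


def routing_path_mismatch(dimensions, routing_path):
    """Return the sorted field names that appear in only one of the two lists."""
    return sorted(set(dimensions) ^ set(routing_path))


if __name__ == "__main__":
    # An empty list means both lists name the same fields.
    print(routing_path_mismatch(DIMENSIONS, ROUTING_PATH))
```

An empty result means the two lists agree; any name in the result is a field that is declared in one list but missing from the other.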

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.
  • I have verified that any added dashboard complies with Kibana's Dashboard good practices

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Screenshots

elastic-vault-github-plugin-prod bot commented Jan 5, 2026

🚀 Benchmarks report

Package microsoft_sqlserver 👍(1) 💚(1) 💔(1)

| Data stream | Previous EPS | New EPS | Diff (%) | Result |
| --- | --- | --- | --- | --- |
| performance | 4975.12 | 3846.15 | -1128.97 (-22.69%) | 💔 |

To see the full report, comment with `/test benchmark fullreport`.
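
The Diff (%) figure in the benchmark report can be reproduced from the two EPS values. A minimal Python sketch, assuming the percentage is taken relative to the previous EPS; `eps_diff` is an illustrative name and not part of the benchmark tooling.

```python
def eps_diff(previous: float, new: float) -> tuple[float, float]:
    """Return (absolute change, percentage change relative to previous), rounded to 2 places."""
    absolute = new - previous
    percent = absolute / previous * 100
    return round(absolute, 2), round(percent, 2)


if __name__ == "__main__":
    # Values taken from the "performance" row of the report.
    print(eps_diff(4975.12, 3846.15))
```

With the values from the performance row, this yields an absolute change of -1128.97 and a relative change of -22.69%, matching the report.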

@andrewkroh added the Integration:microsoft_sqlserver label Jan 8, 2026
@andrewkroh added the documentation label Jan 9, 2026
- name: primary_recovery_health
  type: long
  metric_type: gauge
  description: Primary replica recovery health (0 = ONLINE_IN_PROGRESS, 1 = ONLINE. NULL if not primary).
Contributor

  • Is this field going to get a NULL value? Should the field data type be changed to accept NULL values?
  • The same applies to the secondary_recovery_health field.

Contributor Author

Yes, values can be NULL depending on which replica we are connected to (primary or secondary). Adjusted the data type to keyword.

Contributor

According to the Microsoft documentation, these fields are of type integer. I don't think we would get NULL values for these fields. Can you double-check?

Contributor Author

As I understand the documentation, primary_recovery_health will be NULL if the host is a secondary replica, and likewise, secondary_recovery_health will be NULL if the host is the primary replica.
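
The NULL behaviour described in this thread can be sketched in Python as follows. The 0 and 1 labels come from the field description under review; applying the same labels to secondary_recovery_health, the `NOT_APPLICABLE` label, and the helper name `describe_recovery_health` are assumptions made for illustration only.

```python
from typing import Optional

# Labels taken from the primary_recovery_health field description in this PR.
# Assumption: secondary_recovery_health uses the same encoding.
RECOVERY_HEALTH = {0: "ONLINE_IN_PROGRESS", 1: "ONLINE"}


def describe_recovery_health(value: Optional[int]) -> str:
    """Map a recovery health value to a label, treating NULL (None) as not applicable."""
    if value is None:
        # Per this thread: NULL when the connected replica does not hold that role.
        return "NOT_APPLICABLE"
    return RECOVERY_HEALTH.get(value, f"UNKNOWN({value})")


if __name__ == "__main__":
    # A row as it might look when connected to the primary replica.
    row = {"primary_recovery_health": 1, "secondary_recovery_health": None}
    for field, value in row.items():
        print(field, describe_recovery_health(value))
```

The point of the sketch is that a numeric field can still carry the 0 and 1 states while NULL is handled as a separate "no value" case, rather than being folded into the numeric range.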

@jakubgalecki0 marked this pull request as ready for review January 14, 2026 11:02
@jakubgalecki0 requested a review from a team as a code owner January 14, 2026 11:02
@andrewkroh added the Team:Obs-InfraObs label Jan 14, 2026
@muthu-mps requested a review from alaudazzi January 16, 2026 10:55
@@ -1,4 +1,9 @@
# newer versions go on top
- version: "2.15.1"
Contributor

Suggested change
-- version: "2.15.1"
+- version: "2.16.0"

Shall we increase the minor version instead of the patch version, since this is an enhancement?

Contributor Author

updated version
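
For reference, the difference between the two bumps discussed in this thread, as a minimal Python sketch. It assumes plain MAJOR.MINOR.PATCH version strings with no pre-release suffix; `bump` is a hypothetical helper and not part of any Elastic tooling.

```python
def bump(version: str, part: str) -> str:
    """Return the version with the given part incremented and lower parts reset to 0."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


if __name__ == "__main__":
    print(bump("2.15.0", "patch"))  # the version originally proposed
    print(bump("2.15.0", "minor"))  # the version requested in review
```

Starting from 2.15.0, a patch bump gives 2.15.1 and a minor bump gives 2.16.0, which is the version the package was released with.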

name: microsoft_sqlserver
title: "Microsoft SQL Server"
-version: "2.15.0"
+version: "2.15.1"
Contributor

Suggested change
-version: "2.15.1"
+version: "2.16.0"

Contributor Author

updated

example: "stretch"
description: >
  OS codename, if any.

Member

None of the fields here are dimension fields, so we need not define them here.

title: Microsoft SQL Server Always On Availability Groups metrics
description: Collect Microsoft SQL Server Always On Availability Groups metrics. Monitors overall AG health, synchronization status, and primary/secondary recovery state.
elasticsearch:
  index_mode: "time_series"
Member

Do we need to do TSDB testing for this?

Contributor Author

Result of TSDB testing: the output is identical to the TSDB test section in the PR description above (version 8.19.0, 13 dimension fields, 3 gauge fields, 13 routing_path fields, and all 29 documents copied to tsdb-index-enabled).

  type: keyword
  description: Server name of the current primary replica.
- name: synchronization_health
  type: keyword
Contributor

Update the field type to an integer type. Reference: link

@@ -0,0 +1,31 @@
metricsets: ["query"]
# Specify hosts in the below format. TODO: hosts need to be updated to support multiple entries.
@gpop63 (Contributor), Jan 21, 2026

This comment can be removed; it's probably from Beats.

@muthu-mps requested a review from Copilot February 6, 2026 09:49

Copilot AI left a comment

Pull request overview

This PR adds a new data stream for monitoring Always On Availability Groups in Microsoft SQL Server, providing health and synchronization metrics for high availability configurations.

Changes:

  • Added availability_groups data stream with metrics for AG health, synchronization status, and replica recovery states
  • Updated package version from 2.15.0 to 2.16.0
  • Added comprehensive documentation for the new data stream including prerequisites and field descriptions

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| packages/microsoft_sqlserver/manifest.yml | Bumped package version to 2.16.0 |
| packages/microsoft_sqlserver/docs/README.md | Added documentation for availability_groups data stream with prerequisites and field reference |
| packages/microsoft_sqlserver/data_stream/availability_groups/sample_event.json | Provided sample JSON event showing AG metrics structure |
| packages/microsoft_sqlserver/data_stream/availability_groups/manifest.yml | Defined data stream configuration with TSDB index mode |
| packages/microsoft_sqlserver/data_stream/availability_groups/fields/fields.yml | Defined metric fields including dimensions and gauges for AG monitoring |
| packages/microsoft_sqlserver/data_stream/availability_groups/fields/ecs.yml | Configured ECS dimension fields for TSDB compatibility |
| packages/microsoft_sqlserver/data_stream/availability_groups/fields/base-fields.yml | Defined base data stream fields |
| packages/microsoft_sqlserver/data_stream/availability_groups/elasticsearch/ingest_pipeline/default.yml | Created ingest pipeline for processing AG metrics |
| packages/microsoft_sqlserver/data_stream/availability_groups/agent/stream/stream.yml.hbs | Defined SQL query template for collecting AG metrics |
| packages/microsoft_sqlserver/changelog.yml | Added changelog entry for version 2.16.0 |
| packages/microsoft_sqlserver/_dev/build/docs/README.md | Updated generated documentation with AG data stream information |

**Prerequisites**: To collect Availability Groups metrics, ensure the following:

1. Always On Availability Groups feature is enabled on the SQL Server instance.
2. The user account configured for the integration has `VIEW SERVER STATE` and `VIEW ANY DEFINITION` permissions. *Additionaly look at secion [Microsoft SQL Server permissions](#microsoft-sql-server-permissions)*.
Copilot AI, Feb 6, 2026

Corrected spelling of 'Additionaly' to 'Additionally' and 'secion' to 'section'.

Suggested change
2. The user account configured for the integration has `VIEW SERVER STATE` and `VIEW ANY DEFINITION` permissions. *Additionaly look at secion [Microsoft SQL Server permissions](#microsoft-sql-server-permissions)*.
2. The user account configured for the integration has `VIEW SERVER STATE` and `VIEW ANY DEFINITION` permissions. *Additionally look at section [Microsoft SQL Server permissions](#microsoft-sql-server-permissions)*.

Copilot uses AI. Check for mistakes.
**Prerequisites**: To collect Availability Groups metrics, ensure the following:

1. Always On Availability Groups feature is enabled on the SQL Server instance.
2. The user account configured for the integration has `VIEW SERVER STATE` and `VIEW ANY DEFINITION` permissions. *Additionaly look at secion [Microsoft SQL Server permissions](#microsoft-sql-server-permissions)*.
Copilot AI, Feb 6, 2026

Corrected spelling of 'Additionaly' to 'Additionally' and 'secion' to 'section'.

Suggested change
2. The user account configured for the integration has `VIEW SERVER STATE` and `VIEW ANY DEFINITION` permissions. *Additionaly look at secion [Microsoft SQL Server permissions](#microsoft-sql-server-permissions)*.
2. The user account configured for the integration has `VIEW SERVER STATE` and `VIEW ANY DEFINITION` permissions. *Additionally look at section [Microsoft SQL Server permissions](#microsoft-sql-server-permissions)*.

# newer versions go on top
- version: "2.16.0"
  changes:
    - description: Add health metrics for Always On Availability Groups.
Copilot AI, Feb 6, 2026

Remove trailing whitespace at the end of the description line.

Suggested change
- description: Add health metrics for Always On Availability Groups.
- description: Add health metrics for Always On Availability Groups.

title: "Microsoft SQL Server Always On Availability Groups metrics"
type: metrics
streams:
  - input: sql/metrics
Contributor

Can you add the `enabled: false` flag so that this data stream is not enabled by default for data collection?

Contributor Author

Added. Availability Groups metrics are now disabled by default.

@muthu-mps (Contributor) left a comment

LGTM!

@elasticmachine

💚 Build Succeeded

@jakubgalecki0 merged commit 641ca79 into elastic:main Feb 6, 2026
11 checks passed
@elastic-vault-github-plugin-prod

Package microsoft_sqlserver - 2.16.0 containing this change is available at https://epr.elastic.co/package/microsoft_sqlserver/2.16.0/

Labels

  • documentation: Improvements or additions to documentation. Applied to PRs that modify *.md files.
  • Integration:microsoft_sqlserver: Microsoft SQL Server
  • Team:Obs-InfraObs: Observability Infrastructure Monitoring team [elastic/obs-infraobs-integrations]

Development

Successfully merging this pull request may close these issues.

[Microsoft SQL Server] Add AlwaysOnAvailability Metricset

7 participants