x-pack/metricbeat/module/sql: Add option to execute SQL queries for all databases #35688

Merged
merged 4 commits into from
Aug 10, 2023

shmsr
Member

@shmsr shmsr commented Jun 5, 2023

What does this PR do?

Add support for executing a given set of queries against all databases present on a server. Currently, this feature is supported only for the mssql driver.

To enable the feature, set the fetch_from_all_databases field to true in the configuration. The feature runs database-agnostic queries (i.e., queries that do not already contain a database name) against every database on the server. On every call to Fetch, the list of database names is refreshed. Then, for each database name, a USE statement for that database (for example, USE [master];) is prefixed to each configured query before it is executed.
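The prefixing step described above can be sketched as follows. This is only an illustration, not the code from this PR: the helper names quoteIdentifier and expandQueries are hypothetical, and the bracket escaping (doubling a closing bracket) is an assumption based on how T-SQL delimited identifiers work rather than something stated in the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdentifier (hypothetical helper) wraps a database name in square
// brackets and doubles any closing bracket, as T-SQL delimited identifiers
// require.
func quoteIdentifier(name string) string {
	return "[" + strings.ReplaceAll(name, "]", "]]") + "]"
}

// expandQueries (hypothetical helper) returns one query per
// (database, query) pair, each prefixed with a USE statement for that
// database.
func expandQueries(dbNames, queries []string) []string {
	out := make([]string, 0, len(dbNames)*len(queries))
	for _, db := range dbNames {
		for _, q := range queries {
			out = append(out, fmt.Sprintf("USE %s; %s", quoteIdentifier(db), q))
		}
	}
	return out
}

func main() {
	dbs := []string{"master", "tempdb"}
	qs := []string{"SELECT DB_NAME() AS 'database_name', DB_ID() AS database_id;"}
	for _, q := range expandQueries(dbs, qs) {
		fmt.Println(q)
	}
}
```

In the sample documents further down, the sql.query field has this same shape, for example a query that starts with USE [master];.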

The merge_results feature merges results on a per-database basis; it does not merge results across all databases.
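To make the per-database scoping concrete, here is a minimal sketch. It assumes each query yields a single row of metrics and that a later key overwrites an earlier one on collision. The queryResult type and mergePerDatabase function are hypothetical and are not taken from the module, whose actual merge rules may differ.

```go
package main

import "fmt"

// queryResult is a hypothetical container: the database a result came from
// and the single row of metrics the query returned.
type queryResult struct {
	Database string
	Metrics  map[string]interface{}
}

// mergePerDatabase combines results that share a database into one map and
// keeps results from different databases separate. On a key collision the
// later result wins (an assumption made for this sketch).
func mergePerDatabase(results []queryResult) map[string]map[string]interface{} {
	merged := make(map[string]map[string]interface{})
	for _, r := range results {
		if merged[r.Database] == nil {
			merged[r.Database] = make(map[string]interface{})
		}
		for k, v := range r.Metrics {
			merged[r.Database][k] = v
		}
	}
	return merged
}

func main() {
	results := []queryResult{
		{Database: "master", Metrics: map[string]interface{}{"database_id": 1}},
		{Database: "master", Metrics: map[string]interface{}{"instance_name": "MSSQLSERVER"}},
		{Database: "tempdb", Metrics: map[string]interface{}{"database_id": 2}},
	}
	for db, metrics := range mergePerDatabase(results) {
		fmt.Println(db, metrics)
	}
}
```

With these inputs there is one merged set for master and a separate one for tempdb; nothing is combined across the two.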

Please read the NOTE comments in the code changes to learn why only mssql is supported, among other details.

This PR also includes some documentation updates. In addition, it handles the exit of metricbeat gracefully by checking the return value of reporter.Event.
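The idea of checking the return value of reporter.Event can be sketched like this. The event, reporter, and limitedReporter types below are simplified local stand-ins written for this example, not the real Metricbeat types. The sketch assumes that a false return from Event means no further events should be sent.

```go
package main

import "fmt"

// event and reporter are simplified stand-ins for illustration only.
type event map[string]interface{}

type reporter interface {
	// Event returns false when no further events should be sent.
	Event(e event) bool
}

// limitedReporter (hypothetical) accepts a fixed number of events and then
// starts returning false, imitating a reporter that is shutting down.
type limitedReporter struct {
	remaining int
	accepted  []event
}

func (r *limitedReporter) Event(e event) bool {
	if r.remaining == 0 {
		return false
	}
	r.remaining--
	r.accepted = append(r.accepted, e)
	return true
}

// reportAll sends events in order and stops at the first false return
// instead of continuing to push events. It returns how many were accepted.
func reportAll(r reporter, events []event) int {
	sent := 0
	for _, e := range events {
		if !r.Event(e) {
			break
		}
		sent++
	}
	return sent
}

func main() {
	r := &limitedReporter{remaining: 2}
	events := []event{{"db": "master"}, {"db": "tempdb"}, {"db": "model"}}
	fmt.Println(reportAll(r, events)) // prints 2
}
```

Stopping at the first false return matters more once a single Fetch can produce one event per database, since the loop would otherwise keep running queries after shutdown has begun.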

Why is it important?

Currently, users enter the databases manually when they set up the integration, which lets them execute the SQL queries required to fetch the desired metrics. However, a user could have hundreds of databases on their server, and adding them all manually becomes cumbersome. There should be an easy way to fetch metrics from all databases on the server.

To summarize, we already have a solution in place that lets users manually list their user databases in addition to the system databases that are set by default. To close this issue, however, we'd ideally want a solution that picks up all accessible databases on a server.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • Test with MSSQL server.
  • Test feature with fetch_from_all_databases set to false and true.
  • Merging of query results works on a per-database basis.

How to test this PR locally

  • Have a running instance of Microsoft SQL Server.
  • Configure metricbeat to connect to the SQL Server instance with the right credentials: user, password, and host.
  • Update the configuration of the SQL module to set fetch_from_all_databases to true, and adjust the SQL query if necessary.

Related issues

Logs

For an mssql instance with only 4 databases (master, model, msdb, tempdb), here are some sample documents:

  • When fetch_from_all_databases is false:

Configuration file:

- module: sql
  metricsets:
    - query
  period: 50s
  hosts: ["sqlserver://<user>:<password>@localhost"]
  raw_data.enabled: true
  driver: "mssql"
  sql_queries:
    - query: "SELECT @@servername AS server_name, @@servicename AS instance_name, name As 'database_name', database_id FROM sys.databases WHERE name='master';"
      response_format: table

Response:

{
    "@timestamp": "2023-07-16T19:32:30.906Z",
    "@metadata": {
        "beat": "metricbeat",
        "type": "_doc",
        "version": "8.10.0"
    },
    "metricset": {
        "name": "query",
        "period": 50000
    },
    "service": {
        "address": "localhost",
        "type": "sql"
    },
    "sql": {
        "driver": "mssql",
        "query": "SELECT @@servername AS server_name, @@servicename AS instance_name, name As 'database_name', database_id FROM sys.databases WHERE name='master';",
        "metrics": {
            "database_id": 1,
            "server_name": "857e71351450",
            "instance_name": "MSSQLSERVER",
            "database_name": "master"
        }
    },
    "agent": {
        "name": "host-machine",
        "type": "metricbeat",
        "version": "8.10.0",
        "ephemeral_id": "<redacted>",
        "id": "<redacted>"
    },
    "ecs": {
        "version": "8.0.0"
    },
    "host": {
        "hostname": "host-machine",
        "name": "host-machine",
        "architecture": "arm64",
        "os": {
            "name": "macOS",
            "kernel": "22.4.0",
            "build": "<redacted>",
            "type": "macos",
            "platform": "darwin",
            "version": "13.3.1",
            "family": "darwin"
        },
        "id": "<redacted>",
        "ip": [
            "<redacted>"
        ],
        "mac": [
            "<redacted>"
        ]
    },
    "event": {
        "dataset": "sql.query",
        "module": "sql",
        "duration": 27494708
    }
}
  • When fetch_from_all_databases is true:

Configuration file:

- module: sql
  metricsets:
    - query
  period: 50s
  hosts: ["sqlserver://<user>:<password>@localhost"]
  raw_data.enabled: true
  # NOTE: fetch_from_all_databases is set to true
  fetch_from_all_databases: true
  driver: "mssql"
  sql_queries:
    # NOTE: The query is tweaked slightly compared to the earlier config file; here DB_NAME() and DB_ID() pick up the current database automatically.
    - query: SELECT @@servername AS server_name, @@servicename AS instance_name, DB_NAME() AS 'database_name', DB_ID() AS database_id;
      response_format: table

Response:

{
    "@timestamp": "2023-07-16T19:34:15.179Z",
    "@metadata": {
        "beat": "metricbeat",
        "type": "_doc",
        "version": "8.10.0"
    },
    "agent": {
        "name": "host-machine",
        "type": "metricbeat",
        "version": "8.10.0",
        "ephemeral_id": "<redacted>",
        "id": "<redacted>"
    },
    "service": {
        "address": "localhost",
        "type": "sql"
    },
    "event": {
        "duration": 65829541,
        "dataset": "sql.query",
        "module": "sql"
    },
    "metricset": {
        "name": "query",
        "period": 50000
    },
    "sql": {
        "metrics": {
            "database_id": 1,
            "server_name": "857e71351450",
            "instance_name": "MSSQLSERVER",
            "database_name": "master"
        },
        "driver": "mssql",
        "query": "USE [master]; SELECT @@servername AS server_name, @@servicename AS instance_name, DB_NAME() AS 'database_name', DB_ID() AS database_id;"
    },
    "ecs": {
        "version": "8.0.0"
    },
    "host": {
        "ip": [
            "<redacted>"
        ],
        "mac": [
            "<redacted>"
        ],
        "name": "host-machine",
        "hostname": "host-machine",
        "architecture": "arm64",
        "os": {
            "name": "macOS",
            "kernel": "22.4.0",
            "build": "<redacted>",
            "type": "macos",
            "platform": "darwin",
            "version": "13.3.1",
            "family": "darwin"
        },
        "id": "<redacted>"
    }
}
{
    "@timestamp": "2023-07-16T19:34:15.179Z",
    "@metadata": {
        "beat": "metricbeat",
        "type": "_doc",
        "version": "8.10.0"
    },
    "agent": {
        "version": "8.10.0",
        "ephemeral_id": "<redacted>",
        "id": "<redacted>",
        "name": "host-machine",
        "type": "metricbeat"
    },
    "sql": {
        "query": "USE [tempdb]; SELECT @@servername AS server_name, @@servicename AS instance_name, DB_NAME() AS 'database_name', DB_ID() AS database_id;",
        "metrics": {
            "database_name": "tempdb",
            "database_id": 2,
            "server_name": "857e71351450",
            "instance_name": "MSSQLSERVER"
        },
        "driver": "mssql"
    },
    "event": {
        "dataset": "sql.query",
        "module": "sql",
        "duration": 67029500
    },
    "metricset": {
        "period": 50000,
        "name": "query"
    },
    "service": {
        "type": "sql",
        "address": "localhost"
    },
    "ecs": {
        "version": "8.0.0"
    },
    "host": {
        "name": "host-machine",
        "id": "<redacted>",
        "ip": [
            "<redacted>"
        ],
        "mac": [
            "<redacted>"
        ],
        "hostname": "host-machine",
        "architecture": "arm64",
        "os": {
            "version": "13.3.1",
            "family": "darwin",
            "name": "macOS",
            "kernel": "22.4.0",
            "build": "<redacted>",
            "type": "macos",
            "platform": "darwin"
        }
    }
}
{
    "@timestamp": "2023-07-16T19:34:15.179Z",
    "@metadata": {
        "beat": "metricbeat",
        "type": "_doc",
        "version": "8.10.0"
    },
    "service": {
        "address": "localhost",
        "type": "sql"
    },
    "sql": {
        "driver": "mssql",
        "query": "USE [model]; SELECT @@servername AS server_name, @@servicename AS instance_name, DB_NAME() AS 'database_name', DB_ID() AS database_id;",
        "metrics": {
            "instance_name": "MSSQLSERVER",
            "database_name": "model",
            "database_id": 3,
            "server_name": "857e71351450"
        }
    },
    "event": {
        "dataset": "sql.query",
        "module": "sql",
        "duration": 68272250
    },
    "metricset": {
        "name": "query",
        "period": 50000
    },
    "agent": {
        "type": "metricbeat",
        "version": "8.10.0",
        "ephemeral_id": "<redacted>",
        "id": "<redacted>",
        "name": "host-machine"
    },
    "ecs": {
        "version": "8.0.0"
    },
    "host": {
        "architecture": "arm64",
        "os": {
            "kernel": "22.4.0",
            "build": "<redacted>",
            "type": "macos",
            "platform": "darwin",
            "version": "13.3.1",
            "family": "darwin",
            "name": "macOS"
        },
        "id": "<redacted>",
        "ip": [
            "<redacted>"
        ],
        "mac": [
            "<redacted>"
        ],
        "name": "host-machine",
        "hostname": "host-machine"
    }
}
{
    "@timestamp": "2023-07-16T19:34:15.179Z",
    "@metadata": {
        "beat": "metricbeat",
        "type": "_doc",
        "version": "8.10.0"
    },
    "metricset": {
        "name": "query",
        "period": 50000
    },
    "service": {
        "address": "localhost",
        "type": "sql"
    },
    "sql": {
        "metrics": {
            "database_name": "msdb",
            "database_id": 4,
            "server_name": "857e71351450",
            "instance_name": "MSSQLSERVER"
        },
        "driver": "mssql",
        "query": "USE [msdb]; SELECT @@servername AS server_name, @@servicename AS instance_name, DB_NAME() AS 'database_name', DB_ID() AS database_id;"
    },
    "event": {
        "dataset": "sql.query",
        "module": "sql",
        "duration": 69424666
    },
    "agent": {
        "type": "metricbeat",
        "version": "8.10.0",
        "ephemeral_id": "<redacted>",
        "id": "<redacted>",
        "name": "host-machine"
    },
    "ecs": {
        "version": "8.0.0"
    },
    "host": {
        "os": {
            "family": "darwin",
            "name": "macOS",
            "kernel": "22.4.0",
            "build": "<redacted>",
            "type": "macos",
            "platform": "darwin",
            "version": "13.3.1"
        },
        "name": "host-machine",
        "id": "<redacted>",
        "ip": [
            "<redacted>"
        ],
        "mac": [
            "<redacted>"
        ],
        "hostname": "host-machine",
        "architecture": "arm64"
    }
}

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jun 5, 2023
@mergify mergify bot assigned shmsr Jun 5, 2023
@mergify
Contributor

mergify bot commented Jun 5, 2023

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @shmsr? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v8./d.0 is the label to automatically backport to the 8./d branch. /d is the digit

@elasticmachine
Collaborator

elasticmachine commented Jun 5, 2023

💚 Build Succeeded

Build stats

  • Start Time: 2023-08-10T18:40:59.580+0000

  • Duration: 72 min 24 sec

Test stats 🧪

Test Results
Failed 0
Passed 4648
Skipped 1020
Total 5668

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@shmsr shmsr force-pushed the draft/sql/fetch_from_all_dbs branch from 2593f3b to 3275971 Compare July 10, 2023 19:28
@shmsr shmsr marked this pull request as ready for review July 10, 2023 19:28
@shmsr shmsr requested review from a team as code owners July 10, 2023 19:28
@shmsr shmsr marked this pull request as draft July 10, 2023 19:51
Contributor

@lalit-satapathy lalit-satapathy left a comment

  • Need a summary description of fetch_from_all_db.
  • If "fetch from all databases feature is not supported for driver" applies, it has to be added to the Beats user documentation.
  • Need to add a Beats test case for fetch_from_all_db.
  • Need to see a simple example at the Beats level of what happens when fetch_from_all_db is enabled.
  • Interested to see the diff between the documents created for each DB when fetch_from_all_db is enabled.

@shmsr shmsr changed the title [SQL] Add option to query same command(s) for all dbs x-pack/metricbeat/module/sql: Add option to query same command(s) for all dbs Jul 13, 2023
@shmsr shmsr changed the title x-pack/metricbeat/module/sql: Add option to query same command(s) for all dbs x-pack/metricbeat/module/sql: Add option to execute queries for all dbs Jul 13, 2023
@shmsr shmsr force-pushed the draft/sql/fetch_from_all_dbs branch from bac4c6a to e59d7d0 Compare July 16, 2023 22:24
@shmsr shmsr marked this pull request as ready for review July 16, 2023 22:24
@shmsr shmsr changed the title x-pack/metricbeat/module/sql: Add option to execute queries for all dbs x-pack/metricbeat/module/sql: Add option to execute SQL queries for all dbs Jul 17, 2023
@shmsr shmsr force-pushed the draft/sql/fetch_from_all_dbs branch 2 times, most recently from 74ee9db to 5c1a214 Compare July 17, 2023 20:37
@shmsr shmsr requested a review from a team as a code owner July 17, 2023 20:37
@shmsr shmsr requested review from gizas and constanca-m July 17, 2023 20:37
@shmsr
Member Author

shmsr commented Jul 17, 2023

Cherry-picked the commit to fix CI errors temporarily. See #36091 for more details.

@shmsr shmsr force-pushed the draft/sql/fetch_from_all_dbs branch from df03fab to c3f4148 Compare July 18, 2023 04:49
@shmsr shmsr removed the needs_team Indicates that the issue/PR needs a Team:* label label Aug 2, 2023
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Aug 2, 2023
@shmsr shmsr added Team:Service-Integrations Label for the Service Integrations team Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team labels Aug 2, 2023
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Aug 2, 2023
@shmsr shmsr force-pushed the draft/sql/fetch_from_all_dbs branch from 60df722 to 7ae6948 Compare August 2, 2023 11:39
Contributor

@lucian-ioan lucian-ioan left a comment

LGTM!

@shmsr shmsr added the Team:Elastic-Agent Label for the Agent team label Aug 7, 2023
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@shmsr
Member Author

shmsr commented Aug 7, 2023

I need someone from the elastic-agent-data-plane team to review this PR. Thanks!

@shmsr shmsr removed request for belimawr, fearful-symmetry and a team August 7, 2023 17:54
Contributor

@lalit-satapathy lalit-satapathy left a comment

Approving as the comments here are resolved.

@shmsr shmsr force-pushed the draft/sql/fetch_from_all_dbs branch from ccaf445 to 7366a5f Compare August 9, 2023 18:39
@shmsr shmsr enabled auto-merge (squash) August 10, 2023 18:39
@shmsr shmsr merged commit 684d6cf into elastic:main Aug 10, 2023
29 checks passed
@shmsr shmsr deleted the draft/sql/fetch_from_all_dbs branch August 11, 2023 05:17
Scholar-Li pushed a commit to Scholar-Li/beats that referenced this pull request Feb 5, 2024
Labels
enhancement Team:Elastic-Agent Label for the Agent team Team:Obs-InfraObs Label for the Observability Infrastructure Monitoring team Team:Service-Integrations Label for the Service Integrations team
7 participants