Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Metricbeat] [WIP] Add odbc module (#13106) #13257

Open
wants to merge 1 commit into
base: master
from

Conversation

@amandahla
Copy link
Contributor

commented Aug 15, 2019

This is a draft for a odbc module that you can set datasource and a query to generate metrics.

This way, you can get metrics from DB2 (#13106) and others databases.

PR is in progress and subject to changes.

@amandahla amandahla requested a review from elastic/integrations as a code owner Aug 15, 2019
@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented Aug 15, 2019

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

1 similar comment
@elasticmachine

This comment has been minimized.

Copy link
Collaborator

commented Aug 15, 2019

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@elasticcla

This comment has been minimized.

Copy link

commented Aug 15, 2019

Hi @amandahla, we have found your signature in our records, but it seems like you have signed with a different e-mail than the one used in your Git commit. Can you please add both of these e-mails into your Github profile (they can be hidden), so we can match your e-mails to your Github profile?

@jsoriano

This comment has been minimized.

Copy link
Member

commented Aug 15, 2019

Thanks @amandahla for starting this!

We have long wanted to have something like a generic SQL module, and now with light modules it can be even more useful.

I see this is in progress, but my main concern with current approach is the use of ODBC. According to the package README it needs CGO, what can be an impediment. I guess you have chosen it because there is no go-native way to connect with a DB2 database?

Also, I think that we could make this one a more generic module if we also allow to configure the driver.

metricbeat.modules:
- module: sql
  metricsets: ["query"]
  hosts: ["localhost"]
  driver: "sqldriver"
  datasource: "sqldsn"
  query: "select metric from mymetrics"

And then we could open the connection with:

 	db, err := sqlx.Open(m.Driver, m.Datasource)

The benefit of this is that we could have an even more generic sql module that could use the drivers we already have (mysql, postgresql...), and we could also postpone the decission of including the cgo ODBC driver. I would also rename the module from odbc to sql.

We will also have to discuss how to store the collected metrics. In generic modules we are tending towards storing all the metrics under the same object, for example all Prometheus metrics are stored under prometheus.metrics.*, we could think in something like this for this SQL metrics. The main difference here is that metrics are typed, so maybe we should store them under sql.metrics.<type>.*.

One last thing, we are adding most of new modules under the x-pack directory, would you mind to move the module to x-pack/metricbeat/module?

@amandahla

This comment has been minimized.

Copy link
Contributor Author

commented Aug 16, 2019

@jsoriano

  1. I guess you have chosen it because there is no go-native way to connect with a DB2 database?
    Unfortunately, yes. I have used this list as reference and didn't find any other driver, for ODBC or just for DB2, without CGO. :-(

  2. Include driver in configuration and rename to sql.
    Sure, good idea!

  3. "We will also have to discuss how to store the collected metrics. In generic modules we are tending towards storing all the metrics under the same object, for example all Prometheus metrics are stored under prometheus.metrics., we could think in something like this for this SQL metrics. The main difference here is that metrics are typed, so maybe we should store them under sql.metrics..."
    I'll have to study that but agree.

  4. One last thing, we are adding most of new modules under the x-pack directory, would you mind to move the module to x-pack/metricbeat/module?
    Oops...I didn't notice that, I'll change this.

@amandahla

This comment has been minimized.

Copy link
Contributor Author

commented Aug 16, 2019

  1. Include driver in configuration and rename to sql
    Thinking better about it...Since no database drivers are included in the Go standard library, I guess I would have to import a lot of drivers here to allow this configuration to be anything as odbc, postgresql, mysql...
@jsoriano

This comment has been minimized.

Copy link
Member

commented Aug 16, 2019

Thinking better about it...Since no database drivers are included in the Go standard library, I guess I would have to import a lot of drivers here to allow this configuration to be anything as odbc, postgresql, mysql...

We already import several drivers in Metricbeat for other modules, I think it should be possible to use them here. And later we can decide the best way of including more drivers.

@amandahla

This comment has been minimized.

Copy link
Contributor Author

commented Aug 27, 2019

@jsoriano should be something like this?

event: {
  "@timestamp": "2019-08-27T15:08:59.696Z",
  "@metadata": {
    "beat": "metricbeat",
    "type": "_doc",
    "version": "8.0.0"
  },
  "host": {
    "name": "metricbeat"
  },
  "agent": {
    "ephemeral_id": "31aae8f5-499f-42ef-9dd2-4373832ec81f",
    "hostname": "1565454",
    "id": "05155846-1ed3-4c71-be89-31491e81a15b",
    "version": "8.0.0",
    "name": "metricbeat",
    "type": "metricbeat"
  },
  "sql": {
    "query": {
      "metrics": [
        {
          "TBSPACE": "SYSTOOLSPACE"
        },
        {
          "PAGESIZE": "4096"
        },
        {
          "EXTENTSIZE": "4"
        }
      ]
    }
  },
  "event": {
    "duration": 1637527231,
    "dataset": "sql.query",
    "module": "sql"
  },
  "metricset": {
    "name": "query",
    "period": 10000
  },
  "service": {
    "address": "http://127.0.0.1",
    "type": "sql"
  },
  "ecs": {
    "version": "1.1.0"
  }
}
@jsoriano

This comment has been minimized.

Copy link
Member

commented Aug 27, 2019

@amandahla yes, something like this, some extra suggestions:

  • We should avoid using arrays of objects as it is difficult to use them well in Elasticsearch, and we don't use them in Beats. See the note about this in the Array datatype docs. In this case I think it should be easy to convert them to plain fields as they are objects with an only element.
  • Not very important, but we could lowercase the field names, maybe with an option enabled by default.
  • We could consider adding some extra fields as metadata, for example we could add the statement executed, and the driver used.
  • If we can distinguish the type of the fields in the response we could use different paths per type category, so we can provide default mappings. On an initial version we could distinguish only numeric fields that can be stored as double, and stored the fields of the rest of types as keywords.

Summarizing, something like this:

  ...
  "sql": {
    "driver": "odbc",
    "query": "select ..."
    "metrics": {
      "numeric": {
        "pagesize": 4096,
        "extentsize": 4
      },
      "string": {
        "tbspace": "SYSTOOLSPACE"
      }
    }
  },
  ...
@amandahla amandahla force-pushed the estaleiro:iss13106 branch from 6eb9039 to 4ddc4c0 Aug 29, 2019
@amandahla

This comment has been minimized.

Copy link
Contributor Author

commented Aug 29, 2019

Almost there :-)

"sql": {
    "query": {
      "metrics": {
        "numeric": {
          "extentsize": 4,
          "pagesize": 4096
        },
        "string": {
          "tbspace": "SYSTOOLSPACE"
        }
      },
      "driver": "odbc",
      "query": "select TBSPACE,PAGESIZE,EXTENTSIZE from syscat.tablespaces"
    }
  },
@jsoriano

This comment has been minimized.

Copy link
Member

commented Aug 29, 2019

Great! Could you remove the query level?

    "sql": {
      "metrics": {
        "numeric": {
          "extentsize": 4,
          "pagesize": 4096
        },
        "string": {
          "tbspace": "SYSTOOLSPACE"
        }
      },
      "driver": "odbc",
      "query": "select TBSPACE,PAGESIZE,EXTENTSIZE from syscat.tablespaces"
    },
dataMetrics.Put("string", stringValue)

report.Event(mb.Event{
MetricSetFields: data,

This comment has been minimized.

Copy link
@jsoriano

jsoriano Sep 2, 2019

Member

To remove the query part from the fields path you can do here something like this:

Suggested change
MetricSetFields: data,
RootFields: common.MapStr{"sql": data},

It would be also possible to do it like this:

Suggested change
MetricSetFields: data,
ModuleFields: data,

But I'd prefer the first option because this way we would always have sql-related fields under the same sql namespace, even if the module is later used as light module. We did the same with the prometheus collector metricset in #12400

This comment has been minimized.

Copy link
@amandahla

amandahla Sep 2, 2019

Author Contributor

Thanks!

@dedemorton

This comment has been minimized.

Copy link
Contributor

commented Sep 13, 2019

Can you ping me when the documentation for this module is ready for review? Right now, the content is boiler plate text. We need meaningful descriptions. Thanks!

@amandahla amandahla force-pushed the estaleiro:iss13106 branch from 4984b90 to caca459 Sep 20, 2019
@amandahla

This comment has been minimized.

Copy link
Contributor Author

commented Sep 20, 2019

@dedemorton I'm not sure if I did it right, I just wrote the files:

x-pack/metricbeat/module/sql/_meta/docs.asciidoc
x-pack/metricbeat/module/sql/query/_meta/docs.asciidoc

Is that ok?

@amandahla

This comment has been minimized.

Copy link
Contributor Author

commented Oct 9, 2019

@dedemorton @jsoriano is there anything that I could improve/change here?

@jsoriano

This comment has been minimized.

Copy link
Member

commented Oct 9, 2019

Hey @amandahla, sorry for not getting back to this earlier, there seems to be some errors in CI related to the new added doc files, you will need to run mage fmt update and commit the changes, you may need to run it from x-pack/metricbeat and from metricbeat directories.

@amandahla amandahla force-pushed the estaleiro:iss13106 branch from caca459 to 360080e Oct 9, 2019
@jsoriano

This comment has been minimized.

Copy link
Member

commented Oct 10, 2019

@amandahla now CI is complaining about NOTICE.txt being outdated, this uses to happen when new dependencies are vendored, you will need to run make notice and commit the changes.

@amandahla amandahla force-pushed the estaleiro:iss13106 branch from 360080e to 77d95fe Oct 22, 2019
@amandahla

This comment has been minimized.

Copy link
Contributor Author

commented Oct 22, 2019

@jsoriano Now CI build has failed with "../x-pack/metricbeat/module/sql/vendor/github.com/alexbrainman/odbc/api/api_unix.go:14:11: fatal error: 'sql.h' file not found"

I believe that a solution would be to add unixodbc and unixodbc-dev packages in .travis.yml. Is that ok?

@jsoriano

This comment has been minimized.

Copy link
Member

commented Oct 22, 2019

Hey @amandahla, I think that adding the odbc library is going to need further discussions, what do you think about doing it in a separate review, and work by now in adding the generic sql module?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.