Mongo dbstats #3228

Merged
merged 38 commits into from Jan 11, 2017
Changes from 3 commits
Commits
9274966
initial dbstats implementation and integration tests
Dec 19, 2016
f28f347
implemented fields.yml for dbstats metricset
Dec 19, 2016
c811c3e
generated documentation and templates via make update
Dec 20, 2016
590b9e2
added dbstats documentation
Dec 20, 2016
b6f8da3
new implementation that establishes direct connetions to mongo hosts
Dec 22, 2016
74f9ee0
successful reporting using direct node connections
Dec 28, 2016
dd895b5
debugged field name. integration tests pass
Dec 29, 2016
2c92245
debugged avg_obj_size field for proper mapping
Dec 29, 2016
5a9ae7f
set FailFast = True for direct connections to prevent a nonresponsive…
Dec 29, 2016
8e1934c
provide mongodb host parser to AddMetricSet()
Dec 29, 2016
1f420d7
implementing multi-node direct reporting for mongodb.serverStatus
Dec 29, 2016
65d8d89
changed database response back to map[string]interface{}{}. otherwis…
Dec 29, 2016
a4a6e75
debugging test assertion values so integration test suite passes
Dec 29, 2016
3c6d97c
debugged integration tests to work with metricbeats testsuite. added …
Dec 29, 2016
aafaafe
copied original metricbeat.yml. Accidentally over-wrote it
Dec 29, 2016
663b4da
Merge pull request #1 from scottcrespo/metricbeat-mongodb-persistent-…
Dec 29, 2016
99bc01f
added documentation and additional log entry in mongodb module
Dec 30, 2016
82be711
more info logs to debug travis ci failures
Dec 30, 2016
98b86a0
additional logging to debug CI. removed host parameter to new metricset
Dec 30, 2016
f207fac
updated docstring. add mongo url parser back to new metric set call
Dec 30, 2016
2b3f230
added build integration comment to integration test modules
Dec 30, 2016
abf8973
eliminated multi-node concurrency patterns because it is handled by t…
Dec 30, 2016
5bde2ee
implementing eventfetcher, not eventsfetcher
Dec 30, 2016
976294a
updated module config to include dbstats metricset
Dec 31, 2016
7b0a38f
ran make update
Dec 31, 2016
827348c
updated metricbeat mongo modules template configuration to list metri…
Jan 8, 2017
2de87a8
udated mongobeat dbstats fields to use byte format where appropriate
Jan 8, 2017
54009f4
handle mb field
Jan 8, 2017
594085b
update import list order
Jan 8, 2017
48f9a53
updating mongodb metricset field format
Jan 8, 2017
757dbd0
added experimental flag to mongodb dbstats metricset
Jan 8, 2017
06b7220
removed redundant Printf statement
Jan 8, 2017
e4b627a
added debugf variable back into metricset modules for easy debug prin…
Jan 8, 2017
28e51fd
ran make generate-json for dbstats metricset
Jan 8, 2017
1c80f5e
removed dialinfo from mongodb metricsets because we have the mongoses…
Jan 8, 2017
a8450fb
updated data mapping to make mmapv1-specific fields optional
Jan 8, 2017
35ac943
updated changelog to include latest changes to Metricbeat's MongoDB m…
Jan 10, 2017
a9dd25f
Merge branch 'master' of github.com:elastic/beats into mongo-dbstats
Jan 10, 2017
57 changes: 57 additions & 0 deletions metricbeat/docs/fields.asciidoc
@@ -2059,6 +2059,63 @@ MongoDB metrics.



[float]
== dbstats Fields

dbstats provides an overview of a particular mongo database. This document is most concerned with data volumes of a database.



[float]
=== mongodb.dbstats.avg_object_size

type: long

[float]
=== mongodb.dbstats.collections

type: integer

[float]
=== mongodb.dbstats.data_size

type: long

[float]
=== mongodb.dbstats.db

type: keyword

[float]
=== mongodb.dbstats.file_size

type: long

[float]
=== mongodb.dbstats.index_size

type: long

[float]
=== mongodb.dbstats.indexes

type: long

[float]
=== mongodb.dbstats.num_extents

type: long

[float]
=== mongodb.dbstats.objects

type: long

[float]
=== mongodb.dbstats.storage_size

type: long

[float]
== status Fields

4 changes: 4 additions & 0 deletions metricbeat/docs/modules/mongodb.asciidoc
@@ -88,7 +88,11 @@ metricbeat.modules:

The following metricsets are available:

* <<metricbeat-metricset-mongodb-dbstats,dbstats>>

* <<metricbeat-metricset-mongodb-status,status>>

include::mongodb/dbstats.asciidoc[]

include::mongodb/status.asciidoc[]

1 change: 1 addition & 0 deletions metricbeat/include/list.go
@@ -23,6 +23,7 @@ import (
_ "github.com/elastic/beats/metricbeat/module/kafka"
_ "github.com/elastic/beats/metricbeat/module/kafka/partition"
_ "github.com/elastic/beats/metricbeat/module/mongodb"
_ "github.com/elastic/beats/metricbeat/module/mongodb/dbstats"
_ "github.com/elastic/beats/metricbeat/module/mongodb/status"
_ "github.com/elastic/beats/metricbeat/module/mysql"
_ "github.com/elastic/beats/metricbeat/module/mysql/status"
36 changes: 36 additions & 0 deletions metricbeat/metricbeat.template-es2x.json
@@ -1054,6 +1054,42 @@
},
"mongodb": {
"properties": {
"dbstats": {
"properties": {
"avg_object_size": {
"type": "long"
},
"collections": {
"type": "long"
},
"data_size": {
"type": "long"
},
"db": {
"ignore_above": 1024,
"index": "not_analyzed",
"type": "string"
},
"file_size": {
"type": "long"
},
"index_size": {
"type": "long"
},
"indexes": {
"type": "long"
},
"num_extents": {
"type": "long"
},
"objects": {
"type": "long"
},
"storage_size": {
"type": "long"
}
}
},
"status": {
"properties": {
"asserts": {
35 changes: 35 additions & 0 deletions metricbeat/metricbeat.template.json
@@ -1051,6 +1051,41 @@
},
"mongodb": {
"properties": {
"dbstats": {
"properties": {
"avg_object_size": {
"type": "long"
},
"collections": {
"type": "long"
},
"data_size": {
"type": "long"
},
"db": {
"ignore_above": 1024,
"type": "keyword"
},
"file_size": {
"type": "long"
},
"index_size": {
"type": "long"
},
"indexes": {
"type": "long"
},
"num_extents": {
"type": "long"
},
"objects": {
"type": "long"
},
"storage_size": {
"type": "long"
}
}
},
"status": {
"properties": {
"asserts": {
19 changes: 19 additions & 0 deletions metricbeat/module/mongodb/dbstats/_meta/data.json
@@ -0,0 +1,19 @@
{
"@timestamp":"2016-05-23T08:05:34.853Z",
"beat":{
"hostname":"beathost",
"name":"beathost"
},
"metricset":{
"host":"localhost",
"module":"mysql",
"name":"status",
"rtt":44269
},
"mongodb":{
"dbstats":{
"example": "dbstats"
Member:

An example doc should be generated here with make generate-json. But that can still be done after merging the PR.

}
},
"type":"metricsets"
}
3 changes: 3 additions & 0 deletions metricbeat/module/mongodb/dbstats/_meta/docs.asciidoc
@@ -0,0 +1,3 @@
=== mongodb dbstats MetricSet

This is the dbstats metricset of the module mongodb.
35 changes: 35 additions & 0 deletions metricbeat/module/mongodb/dbstats/_meta/fields.yml
@@ -0,0 +1,35 @@
- name: dbstats
type: group
description: >
dbstats provides an overview of a particular mongo database. This document
is most concerned with data volumes of a database.
fields:
- name: avg_object_size
type: long

- name: collections
type: integer

- name: data_size
type: long

- name: db
type: keyword

- name: file_size
type: long

- name: index_size
Member:

this is probably index_size.bytes, and then you can add format: bytes to each block, so Kibana will automatically know that it is in bytes and can do some conversion.

type: long

- name: indexes
type: long

- name: num_extents
type: long

- name: objects
type: long

- name: storage_size
Member:

what is the unit for all the _size fields?

type: long
21 changes: 21 additions & 0 deletions metricbeat/module/mongodb/dbstats/data.go
@@ -0,0 +1,21 @@
package dbstats

import (
s "github.com/elastic/beats/metricbeat/schema"
c "github.com/elastic/beats/metricbeat/schema/mapstriface"
)

var schema = s.Schema{
"db": c.Str("db"),
"collections": c.Int("collections"),
"objects": c.Int("objects"),
"avg_object_size": c.Int("avgObjectSize"),
"data_size": c.Int("dataSize"),
"storage_size": c.Int("storageSize"),
"num_extents": c.Int("numExtents"),
"indexes": c.Int("indexes"),
"index_size": c.Int("indexSize"),
"file_size": c.Int("fileSize"),
}

var eventMapping = schema.Apply
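The schema above renames Mongo's camelCase dbStats keys to the snake_case event fields. A minimal stdlib-only sketch of that renaming (the `applySchema` helper below is hypothetical and much simpler than the real `metricbeat/schema` package, which also performs type conversion):

```go
package main

import "fmt"

// applySchema copies values out of a raw dbStats response under new,
// snake_case keys. fieldMap pairs each output key with the raw key it is
// read from; raw keys that are absent are simply skipped.
func applySchema(raw map[string]interface{}, fieldMap map[string]string) map[string]interface{} {
	event := map[string]interface{}{}
	for outKey, rawKey := range fieldMap {
		if v, ok := raw[rawKey]; ok {
			event[outKey] = v
		}
	}
	return event
}

func main() {
	raw := map[string]interface{}{
		"db": "test", "avgObjectSize": int64(57), "dataSize": int64(1024),
	}
	fieldMap := map[string]string{
		"db":              "db",
		"avg_object_size": "avgObjectSize",
		"data_size":       "dataSize",
	}
	fmt.Println(applySchema(raw, fieldMap))
}
```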
90 changes: 90 additions & 0 deletions metricbeat/module/mongodb/dbstats/dbstats.go
@@ -0,0 +1,90 @@
package dbstats

import (
"errors"

"github.com/elastic/beats/libbeat/logp"
"github.com/elastic/beats/libbeat/common"
"github.com/elastic/beats/metricbeat/mb"
"gopkg.in/mgo.v2"
Member:

Minor detail: We normally put this in between the general and the beats imports, so it is:

	"errors"

	"gopkg.in/mgo.v2"
	
	"github.com/elastic/beats/libbeat/common"
	"github.com/elastic/beats/libbeat/logp"
	"github.com/elastic/beats/metricbeat/mb"
	"github.com/elastic/beats/metricbeat/module/mongodb"

)

// init registers the MetricSet with the central registry.
// The New method will be called after the setup of the module and before starting to fetch data
func init() {
if err := mb.Registry.AddMetricSet("mongodb", "dbstats", New); err != nil {
panic(err)
}
}

// MetricSet type defines all fields of the MetricSet
// As a minimum it must inherit the mb.BaseMetricSet fields, but can be extended with
// additional entries. These variables can be used to persist data or configuration between
// multiple fetch calls.
type MetricSet struct {
mb.BaseMetricSet
dialInfo *mgo.DialInfo
}

// New creates a new instance of the MetricSet.
// Part of new is also setting up the configuration by processing additional
// configuration entries if needed.
func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
dialInfo, err := mgo.ParseURL(base.HostData().URI)
Member:

Could you add an experimental flag here? We normally introduce new metricsets as experimental. This allows us to first get some real-world feedback for it and still change the data structure if needed without having to wait for a major release. Have a look here: https://github.com/elastic/beats/blob/master/metricbeat/module/haproxy/stat/stat.go#L34
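The pattern the reviewer refers to amounts to emitting a warning when the metricset is created. A hedged stdlib sketch (the real beats code simply calls its logger's Warn inside New(); the once-guard here is an illustrative extra, not part of the actual helper):

```go
package main

import (
	"fmt"
	"sync"
)

var experimentalOnce sync.Once

// warnExperimental prints the EXPERIMENTAL notice the first time it is
// called and reports whether this particular call emitted the warning.
func warnExperimental(name string) bool {
	warned := false
	experimentalOnce.Do(func() {
		fmt.Printf("EXPERIMENTAL: The %s metricset is experimental\n", name)
		warned = true
	})
	return warned
}

func main() {
	warnExperimental("mongodb dbstats")
	warnExperimental("mongodb dbstats") // repeat calls are a no-op
}
```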

if err != nil {
return nil, err
}
dialInfo.Timeout = base.Module().Config().Timeout

return &MetricSet{
BaseMetricSet: base,
dialInfo: dialInfo,
}, nil
}

// Fetch implements the data gathering and data conversion to the right format.
// It returns the events, which are then forwarded to the output. In case of an error, a
// descriptive error must be returned.
func (m *MetricSet) Fetch() ([]common.MapStr, error) {

// establish connection to mongo
session, err := mgo.DialWithInfo(m.dialInfo)
Author:

@ruflin

This follows the pattern used by the status metricset here, but I'm not sure it makes sense to open and close the mongo connection each time fetch() is called.

According to the [docs](https://www.elastic.co/guide/en/beats/metricbeat/current/metricset-details.html#_timeout_connections_to_services), connections should be established in the New() method and persist between fetches.

I'm going to begin implementing a pattern where the mongo session is persisted in each MetricSet instance to avoid constantly opening and closing connections. I would add that work to this existing PR. Or, would you prefer it's handled under a separate submission?

Lastly, based on this implementation I'm more confident that the mongodb module is not connecting directly to the individual instances, but would instead connect to the cluster. This means that any cluster member may answer the query and result in data inconsistencies. This issue is a bit more involved, and I don't know enough about the platform to be sure, so I may need to jump on a Zoom call or something to talk further.

Member:

+1 on persisting the connection. It is probably best to extract the connection setup logic to the module level and just use it as a function.

Connecting to single instances versus the cluster can become tricky. We had a similar issue with Kafka. Our recommendation is to install metricbeat on each edge node (some exceptions exist here), as it also gives you additional metrics about the system etc. I don't know the details of mongodb clustering, but for Kafka we solved it the following way (generic description):

  • Individual node stats are reported by each mongo instance
  • Global stats about cluster / indices are only fetched from master

This requires that a metricbeat instance can detect whether it is a master or not. I assume this must also be dynamic, as the master can change over time.

I assume the dbStats response is the same for each node, so only the data from the master should be fetched.

Happy to jump on a zoom call to go into more details.
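The extraction the reviewer suggests can be sketched as a module-level cache that dials each host once and hands the same session back to every metricset. A stdlib-only sketch with the dial function injected so it runs without a mongod (`conn` is a stand-in for `*mgo.Session`; none of these names are from the actual beats code):

```go
package main

import (
	"fmt"
	"sync"
)

// conn stands in for an *mgo.Session (the real type lives in gopkg.in/mgo.v2).
type conn struct{ uri string }

// sessionCache hands out one shared connection per host URI, so repeated
// Fetch() calls reuse it instead of dialing every time.
type sessionCache struct {
	mu    sync.Mutex
	dial  func(uri string) (*conn, error)
	conns map[string]*conn
	dials int // counts actual dials, for demonstration
}

func (c *sessionCache) get(uri string) (*conn, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.conns[uri]; ok {
		return s, nil // already connected: reuse
	}
	s, err := c.dial(uri)
	if err != nil {
		return nil, err
	}
	c.dials++
	c.conns[uri] = s
	return s, nil
}

func main() {
	cache := &sessionCache{
		dial:  func(uri string) (*conn, error) { return &conn{uri: uri}, nil },
		conns: map[string]*conn{},
	}
	cache.get("mongodb://localhost:27017")
	cache.get("mongodb://localhost:27017") // reused, no second dial
	fmt.Println("dials:", cache.dials)
}
```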

Author:

@ruflin

Connecting to single instances

Tricky indeed. Fortunately, mgo (the mongo client) has a dialInfo.Direct parameter. In mongobeat, I initiate a master connection to the cluster, and then create individual node connections to each node. You can see it here

I create the individual connections in the "controller" here

I've begun implementing something similar for the mongodb module, minus node discovery. This is implemented at the module level to share logic and prevent redundant connections.

Detecting master/primary

Easily implemented with mgo. You can specify read preference to always read from primary. If the primary changes, your reads will automatically be directed to the new primary.
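Under those assumptions, building one direct, per-host configuration from the module's hosts list might look like this sketch (`dialConfig` only mirrors the `Addrs` and `Direct` fields of `mgo.DialInfo`; it is not the real type):

```go
package main

import "fmt"

// dialConfig mirrors the mgo.DialInfo fields discussed above.
type dialConfig struct {
	Addrs  []string
	Direct bool
}

// directConfigs builds one Direct dial config per configured host, so each
// fetch talks only to that node instead of the discovered cluster.
func directConfigs(hosts []string) []dialConfig {
	cfgs := make([]dialConfig, 0, len(hosts))
	for _, h := range hosts {
		cfgs = append(cfgs, dialConfig{Addrs: []string{h}, Direct: true})
	}
	return cfgs
}

func main() {
	for _, c := range directConfigs([]string{"mongo1:27017", "mongo2:27017"}) {
		fmt.Printf("%+v\n", c)
	}
}
```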

dbStats response the same

Technically, the dbStats (and serverStatus) responses are not the same for each node. Sometimes the differences are trivial; other times they're not. I'll cover the cases where they're not the same:

replica sets

In the case of a replica set, the state of the secondaries lag behind the primary. Thus, seeing the stats for each node helps monitor replication lag and success/failure. This is especially helpful in the case of multi-datacenter replication, where larger discrepancies are usually present.

sharded clusters

In the case of a sharded cluster, the complexity is compounded because nodes contain different databases and collections, and a shard can also be a replica set! =o

Conclusion

Given these considerations, I'm still having a hard time understanding what the best implementation for metricbeat would be. With the current implementation, the node that is read from is implicit and can change from period to period. For example, you might connect to mongo on localhost and expect that you're reading from that node. But under the hood, mgo is actually discovering the other nodes in the cluster and directing reads to any one of the member nodes. Thus, I think it's possible to receive unexpected and inconsistent results.

If we enforce that metricbeat only establishes a direct connection to a single mongo instance on localhost, then all of this complexity goes away and output is very predictable. Yay! But, it limits the flexibility of the system.

If we allow the agent to connect to mongo as a cluster, and permit configuration of multiple hosts, then I think there's a lot of different directions we could take.

Definitely a Zoom call will be helpful. I'm sure you can teach me a lot more about how metricbeat works under the hood, so I can understand the best direction.

I don't have your email, but can you contact me at sccrespo@gmail.com and we'll set up a time? Perhaps we could talk sometime Thursday afternoon (your time).

Member:

@scottcrespo Sorry, I was mostly offline the last two days, so only saw the update now.

I always try to go with the simplest solution first, which I agree is the one only connecting to localhost. Because of docker I would probably rephrase this to: the mongodb module gets only the stats from the node it connects to, as it does not necessarily always have to be localhost on docker.

Could you update the PR so that it always only connects to the defined host (and not the cluster)?

About the zoom call: let's see how far we can take it here in the PR, as I'm rarely online in the next few days because of holidays. But I'm happy to answer any questions about metricbeat in more detail here if needed.

Author:

@ruflin

No problem! I've been on vacation as well, but I have a few days to work on beats before taking some additional time off.

I'm in the process of updating the mongodb module to establish direct connections to the hosts listed in the config, and to persist those connections between calls. I should be able to update this PR within a few days!

if err != nil {
return nil, err
}
defer session.Close()

session.SetMode(mgo.Monotonic, true)

// Get the list of database names, which we'll use to call db.stats() on each
dbNames, err := session.DatabaseNames()
if err != nil {
logp.Err("Error retrieving database names from Mongo instance")
return []common.MapStr{}, err
}

// events is the list of events collected from each of the databases.
events := []common.MapStr{}

// for each database, call db.stats() and append to events
for _, dbName := range dbNames {
db := session.DB(dbName)

result := map[string]interface{}{}

err := db.Run("dbStats", &result)
if err != nil {
logp.Err("Failed to retrieve stats for db %s", dbName)
continue
}
events = append(events, eventMapping(result))
}

// if we failed to collect on any databases, return an error
if len(events) == 0 {
err = errors.New("Failed to fetch stats for all databases in mongo instance")
return []common.MapStr{}, err
}

return events, nil
}
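Fetch's error handling above (skip a failing database, but error out only when every database fails) can be exercised standalone with the stats call injected; this is a sketch, not the metricset code, and the stats callback stands in for db.Run("dbStats", ...):

```go
package main

import (
	"errors"
	"fmt"
)

var errBoom = errors.New("boom") // demo error for a failing database

// fetchAll mirrors the control flow of Fetch: each database that fails is
// skipped, and an error is returned only if no database could be read.
func fetchAll(dbNames []string, stats func(name string) (map[string]interface{}, error)) ([]map[string]interface{}, error) {
	events := []map[string]interface{}{}
	for _, name := range dbNames {
		result, err := stats(name)
		if err != nil {
			continue // one bad database does not abort the fetch
		}
		events = append(events, result)
	}
	if len(events) == 0 {
		return nil, errors.New("failed to fetch stats for all databases")
	}
	return events, nil
}

func main() {
	stats := func(name string) (map[string]interface{}, error) {
		if name == "broken" {
			return nil, errBoom
		}
		return map[string]interface{}{"db": name}, nil
	}
	events, err := fetchAll([]string{"admin", "broken", "test"}, stats)
	fmt.Println(len(events), err)
}
```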
47 changes: 47 additions & 0 deletions metricbeat/module/mongodb/dbstats/dbstats_integration_test.go
@@ -0,0 +1,47 @@
package dbstats

import (
"testing"

"github.com/elastic/beats/metricbeat/module/mongodb"
mbtest "github.com/elastic/beats/metricbeat/mb/testing"
"github.com/stretchr/testify/assert"
)

func TestFetch(t *testing.T) {
f := mbtest.NewEventsFetcher(t, getConfig())
events, err := f.Fetch()
if !assert.NoError(t, err) {
t.FailNow()
}

for _, event := range events {
t.Logf("%s/%s event: %+v", f.Module().Name(), f.Name(), event)

// Check a few event Fields
db := event["db"].(string)
assert.NotEqual(t, db, "")

dataSize := event["data_size"].(int64)
assert.True(t, dataSize > 0)

collections := event["collections"].(int64)
assert.True(t, collections > 0)
}
}

func TestData(t *testing.T) {
f := mbtest.NewEventsFetcher(t, getConfig())
err := mbtest.WriteEvents(f, t)
if err != nil {
t.Fatal("write", err)
}
}

func getConfig() map[string]interface{} {
return map[string]interface{}{
"module": "mongodb",
"metricsets": []string{"dbstats"},
"hosts": []string{mongodb.GetEnvHost() + ":" + mongodb.GetEnvPort()},
}
}