Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HealthCheck information for Metricbeat docker module #3357

Merged
merged 28 commits into from
Jan 25, 2017

Conversation

sebastienmusso
Copy link
Contributor

Simple health check information on container document.

docker.container.health.event_end_date ## last healthcheck execution end
docker.container.health.event_exit_code ## last healthcheck exit code
docker.container.health.event_output ## last healthcheck string output
docker.container.health.event_start_date ## last healthcheck execution start
docker.container.health.failingstreak ## concurent failed healthcheck execution
docker.container.health.status ## healthcheck status

I had Struct mannualy into container.go of fsouza/go-dockerclient.

This could be interresting for simple application monitoring and autoscalling

@elasticmachine
Copy link
Collaborator

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run.

1 similar comment
@elasticmachine
Copy link
Collaborator

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run.

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to think where this fits best. As far as I understand it is available when defined in the docker container or on start of the container. It is kind part of the container so it fits here. So far container provides some information about the container which is fairly constant. state or health on the other side changes over time. There could be an argument to put it into a separate metricset because of that. I'm mentioning state as there are potential more infos we could have under a state metricset, not only health? What else is available?

I would really appreciate your help / input here on what you would recommend.

@@ -99,6 +99,21 @@ func (p Port) Proto() string {
return parts[1]
}

// HealthCheck represents one check of health.
type HealthCheck struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the docker library is updated, also the commit id in the glide.yml file must be updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HealthCheck structure in fsouza/go-dockerclient library have been updated in october :
https://github.com/fsouza/go-dockerclient/commit/45c1a814bcd9b656bdb9813a7c21d0236fc301db

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit id must be updated here: https://github.com/elastic/beats/blob/master/metricbeat/module/docker/glide.yaml Please make sure to copy all changes from this commit to the vendor directory.

@@ -31,7 +32,26 @@ func eventMapping(cont *dc.APIContainers) common.MapStr {
"status": cont.Status,
}

// Check id container have health metrics configured
if strings.Contains(cont.Status, "(") && strings.Contains(cont.Status, ")") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which docker versions support this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Healthcheck is available since Docker 1.12, the only way to get bracket in container status is to configure healthcheck eaven in previous versions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cont.Status is available since a while now maybe docker 0.6 or even before so this field will always be available

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that means the data fetching will also work with older version as the state is optional also for >= 1.12.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly

@@ -40,3 +40,24 @@
type: long
description: >
Size of the files that have been created or changed since creation.
- name: health
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the field naming, please check our naming conventions: https://www.elastic.co/guide/en/beats/libbeat/5.1/event-conventions.html

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left some more comments. Please make sure to also run make fmt before pushing to follow the coding style.

@@ -31,7 +32,26 @@ func eventMapping(cont *dc.APIContainers) common.MapStr {
"status": cont.Status,
}

// Check id container have health metrics configured
if strings.Contains(cont.Status, "(") && strings.Contains(cont.Status, ")") {
container, _ := m.dockerClient.InspectContainer(cont.ID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue I have with this is that for each container this does an additional http request. So if we have 100 containers, this can take quite some time. Is there a possibility to get the data already from the existing data we have? Otherwise we should parallel these requests and make container.health a config option so it can be enabled/disabled based on the needs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed if we have 100 containers with healthcheck enabled it could take some more time.
In order to get these Container Health information precisely, we have to inspect the container, it couldn't be fetch easily so the InspectContainer function. How clould we make container.health option ? as metricbeat docker module ? (like container, disk, cpu ...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just introduce a configuration option as mentioned above with container.health: false by default. Alternatively we can make it an additional metricset. I like the idea of a separate metricset and we can still merge it into container later if needed. 👍

@ruflin
Copy link
Member

ruflin commented Jan 13, 2017

Can you also add a line to the CHANGELOG?

@ruflin
Copy link
Member

ruflin commented Jan 16, 2017

@sebastienmusso Ping me when you think it is ready for a next review.

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 16, 2017 via email

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments. Please make sure to run make update and make fmt before pushing the changes, as otherwise CI will fail.

@@ -2,3 +2,6 @@ package: github.com/elastic/beats/metricbeat/module/docker
import:
- package: github.com/fsouza/go-dockerclient
version: 23c589186c2a92a8742da55f19aec4997dae5cbb
import:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docker client is already mentioned above. You must update the version and overwrite all files from the client with the new commit id to make sure the full client is updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing these two commits, there are many more changes: fsouza/go-dockerclient@23c5891...45c1a81 I will update the library in a separate commit so you don't have to update it yourself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm working here (#3398) on updating the client. It is trickier then I thought. But as soon as this is merged you can rebase on top and you should not have to think about the vendor update part.

@@ -31,7 +32,26 @@ func eventMapping(cont *dc.APIContainers) common.MapStr {
"status": cont.Status,
}

// Check id container have health metrics configured
if strings.Contains(cont.Status, "(") && strings.Contains(cont.Status, ")") {
container, _ := m.dockerClient.InspectContainer(cont.ID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could just introduce a configuration option as mentioned above with container.health: false by default. Alternatively we can make it an additional metricset. I like the idea of a separate metricset and we can still merge it into container later if needed. 👍

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 17, 2017 via email

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 17, 2017 via email

@ruflin
Copy link
Member

ruflin commented Jan 18, 2017

If there is no health metrics for a container, no event should be sent. So a metricset should work quite well. Also it will be very similar to what we have now. Most docker metricsets add the container info to their data structure as you can see here: https://github.com/elastic/beats/blob/master/metricbeat/module/docker/cpu/data.go#L19 So health will also contain the container info too.

Let me know if I can help further with this PR.

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 18, 2017 via email

@ruflin
Copy link
Member

ruflin commented Jan 18, 2017

@sebastienmusso I would really appreciate if we could make it a metricset instead of part of the container metricset if that is ok for you.

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 18, 2017 via email

@ruflin
Copy link
Member

ruflin commented Jan 19, 2017

@sebastienmusso #3398 was merged. You should be able to rebase on top of master and not require any change to the dependencies (fingers crossed).

@ruflin
Copy link
Member

ruflin commented Jan 20, 2017

@sebastienmusso Did you rebase / merge master? Because your dependencies should now not be needed anymore.

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making it a metricset. I left some minor feedback.

@@ -1,4 +1,12 @@
package: github.com/elastic/beats/metricbeat/module/docker
import:
- package: github.com/fsouza/go-dockerclient
version: 23c589186c2a92a8742da55f19aec4997dae5cbb
version: e085edda407c05214cc6e71e4881de47667e77ec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No changes should be needed here anymore.

package healthcheck

import (
//"time"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line can be removed.

for _, container := range containersList {
returnevent := eventMapping(&container, m)
// Compare event to empty event
if !reflect.DeepEqual(emptyEvent, returnevent) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eventMapping should just return nil if there is no event.

},
"status": container.State.Health.Status,
"failingstreak": container.State.Health.FailingStreak,
"event_start_date": common.Time(container.State.Health.Log[last_event].Start),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should map all event data under event.start_date, event.exit_code etc.

@@ -1,5 +1,5 @@
#- module: docker
#metricsets: ["cpu", "info", "memory", "network", "diskio", "container"]
#metricsets: ["cpu", "info", "memory", "network", "diskio", "container", "healthcheck"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should sort this alphabetically (not introduce by you).

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 20, 2017 via email

@ruflin
Copy link
Member

ruflin commented Jan 20, 2017

You still need to check for nil, but compare it to nil instead of an empty common.MapStr in your code.

Strangely it shows up in the diffs. Did you merge in master or did you rebase? Or just copy over?

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 20, 2017 via email

@ruflin
Copy link
Member

ruflin commented Jan 20, 2017

Can you please rebase on top of master and remove all the changes to glide.yaml and the addition of the dependencies? As these should not change anymore in this PR. Merging in master should also work if you prefer merging. But for the files which should not be there, use git checkout master .../file.name to reset it to master.

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 20, 2017 via email

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some minor comments. All looks good. I'm ok merging it as soon as the two vendor files are readded and doing afterwards some cleanup PR's.

"healthcheck": {
"failingstreak": 0,
"status": "healthy",
"healthcheck": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have an additional layer here?

@@ -1,128 +0,0 @@
// Copyright 2016 go-dockerclient authors. All rights reserved.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should not be removed.

@@ -1,157 +0,0 @@
// Copyright 2016 go-dockerclient authors. All rights reserved.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should not be removed.

event = common.MapStr{
mb.ModuleData: common.MapStr{
"container": common.MapStr{
"name": docker.ExtractContainerName(cont.Names),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a possibility that we use the same semantics as we have in other docker metricsets to add the full container info?

for _, container := range containersList {
returnevent := eventMapping(&container, m)
// Compare event to empty event
if !reflect.ValueOf(returnevent).IsNil() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to just do returnevent != nil here instead of using reflect package.

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 23, 2017 via email

@sebastienmusso
Copy link
Contributor Author

sebastienmusso commented Jan 23, 2017 via email

@ruflin ruflin merged commit b8a49d1 into elastic:master Jan 25, 2017
@ruflin
Copy link
Member

ruflin commented Jan 25, 2017

@sebastienmusso Thanks a lot for your contribution. I merged it as it is. We can think in the future about how we add all container meta data to the event when we replace the fsouza library.

Let me know if I can help you somehow with the swarm module. I would recommend to build the swarm module if possible directly based on the official docker client.

@ruflin
Copy link
Member

ruflin commented Jan 25, 2017

Here is the follow up PR: #3463

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants