Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add monitoring information about functions to Functionbeat #14876

Merged
merged 7 commits into from
Jan 10, 2020

Conversation

kvch
Copy link
Contributor

@kvch kvch commented Dec 2, 2019

This PR adds a new counter which returns the number of triggered functions of Functionbeat.

Example snippet of the telemetry message containing info about function:

{
  "state": {
    "functionbeat": {
        "functions": {
             "count": 1,
        }
    }
}

Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.

@kvch kvch added needs_backport PR is waiting to be backported to other branches. Functionbeat [zube]: In Review Team:Beats labels Dec 2, 2019
@kvch kvch changed the title Add telemetry to Functionbeat Add monitoring information about functions to Functionbeat Dec 2, 2019
@kvch kvch force-pushed the feature-functionbeat-add-monitorable-info branch from aaced7f to 1f87e84 Compare January 6, 2020 16:19
}

// AddTriggeredFunction adds a triggered function data to the registry.
func (t *telemetry) AddTriggeredFunction(typeName string, triggerCount, eventCount int64) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of triggerCount or eventCount? Telemetry is less about monitoring, we just want to know which features are used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was interested to see how many triggers users configure. I can remove both counters if we don't want those to be reported at all.

@@ -31,6 +33,7 @@ var (
"logstash",
"console", // for local debugging
}
metrics = monitoring.Default.NewRegistry("functionbeat")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libbeat declares that 'stat' registry for telemtry data. If it is used we will have the monitoring backend report metrics and telemetry information every now and so often.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only found a registry named "state". Is that what you mean?
https://github.com/elastic/beats/blob/master/libbeat/cmd/instance/beat.go#L338

Copy link
Contributor

@ycombinator ycombinator Jan 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @kvch is already using the stats registry here, by virtue of using monitoring.Default. See

// Default is the global default metrics registry provided by the monitoring package.
var Default = NewRegistry()
func init() {
GetNamespace("stats").SetRegistry(Default)
}
.

Copy link

@urso urso Jan 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean "state". The 'stats' registry is for metrics, the 'state' registry is for telemetry. monitoring.Default should not be used here.

@urso
Copy link

urso commented Jan 6, 2020

One tricky problem might be, that we have no control when telemetry data are reported, or how often. If we remove the trigger counts we still can collect information like: functionbeat being used, provider in use, trigger types configured, and configured output type. Once a function is installed with a provider, it is some rather static information.

Instead of relying on some monitoring background task that might be reporting information in time or not (the beat is put to halt once all events have been ACKed), we might want to implement slightly different strategy. e.g. On startup there is a license check. The license is cached once queried. The same time we query the license we could also attempt to send the telemetry info. Functionbeat would have to wait for events being published + telemetry being send. If there was no telemetry update for N minutes we could send the same info again.

@ph @ycombinator WDYT? I also wonder about feasbility (would we have to implement our own reporting or can we reuse code from the reporter as is?).

@urso urso self-assigned this Jan 6, 2020
@ycombinator
Copy link
Contributor

@urso I like your proposal about sending the monitoring data as soon as the license is queried. We can do that once and then continue with the periodic sending thereafter. Obviously this will require some refactoring in the libbeat monitoring code first. Specifically, it looks like currently the reporter's constructor is responsible for spawning a goroutine to periodically send monitoring data. Instead we might want to expose a Start method on the Reporter interface and require the reporter's client to call it after constructing the reporter. Then we can instantiate the reporter where it is being done today but call the Start method from the license check.

One clarifying question: I assume you'd want to make this change (start monitoring reporter right after license check) for all beats, not just Functionbeat?

@urso
Copy link

urso commented Jan 7, 2020

This sounds like a rather big change we should do in another PR.

Telemetry should not be reported by an independent process. If all events are ACKed before Telemetry did finish, then the information is lost.

One clarifying question: I assume you'd want to make this change (start monitoring reporter right after license check) for all beats, not just Functionbeat?

It could be for all Beats. But because Functionbeat is so different from the other Beats it is a requirement for Functionbeat. Reporting must not necesarrily be tied to the License check, but should happen if the function starts up first. Maybe again if X the function is invoked with a new Batch of events and the last report is X minutes in the past. => Although reporting is done in the background, the if and when to trigger it should be under control of Functionbeat, not libbeat.

// AddTriggeredFunction does nothing.
func (m *mock) AddTriggeredFunction(_ string, _ int64, _ int64) {
return
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want the mock to store invocations, so we can validate them in tests?

@ycombinator
Copy link
Contributor

@kvch I have a draft PR up for the Kibana-side changes here: elastic/kibana#54267. Once you've updated the monitoring snippet for Functionbeat telemetry with the final one in this PR's description above, I will proceed with making the corresponding changes in the Kibana PR.

@kvch kvch requested a review from urso January 9, 2020 13:00
@urso
Copy link

urso commented Jan 9, 2020

Are we missing a file? make check is failing with symbol errors.

@kvch
Copy link
Contributor Author

kvch commented Jan 9, 2020

@urso No, I just forgot to add : before =... :D

// AddTriggeredFunction adds a triggered function data to the registry.
func (t *telemetry) AddTriggeredFunction() {
t.Lock()
defer t.Unlock()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to a lock here. (*monitoring.Int).Inc() is atomic.

Copy link

@urso urso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all, LGTM. Please create a follow up issue allowing functionbeat to control when the telemetry reporter should send info.

@kvch kvch merged commit 9ce1c26 into elastic:master Jan 10, 2020
kvch added a commit to kvch/beats that referenced this pull request Jan 10, 2020
…4876)

This PR adds a new counter which returns the number of triggered functions of Functionbeat.

Example snippet of the telemetry message containing info about function:

```json
{
  "state": {
    "functionbeat": {
        "functions": {
             "count": 1,
        }
    }
}
```

Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.
(cherry picked from commit 9ce1c26)
@kvch kvch added v7.6.0 and removed needs_backport PR is waiting to be backported to other branches. labels Jan 10, 2020
kvch added a commit to kvch/beats that referenced this pull request Jan 10, 2020
…4876)

This PR adds a new counter which returns the number of triggered functions of Functionbeat.

Example snippet of the telemetry message containing info about function:

```json
{
  "state": {
    "functionbeat": {
        "functions": {
             "count": 1,
        }
    }
}
```

Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.
(cherry picked from commit 9ce1c26)
kvch added a commit to kvch/beats that referenced this pull request Jan 10, 2020
…4876)

This PR adds a new counter which returns the number of triggered functions of Functionbeat.

Example snippet of the telemetry message containing info about function:

```json
{
  "state": {
    "functionbeat": {
        "functions": {
             "count": 1,
        }
    }
}
```

Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.
(cherry picked from commit 9ce1c26)
kvch added a commit to kvch/beats that referenced this pull request Jan 10, 2020
…4876)

This PR adds a new counter which returns the number of triggered functions of Functionbeat.

Example snippet of the telemetry message containing info about function:

```json
{
  "state": {
    "functionbeat": {
        "functions": {
             "count": 1,
        }
    }
}
```

Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.
(cherry picked from commit 9ce1c26)
kvch added a commit to kvch/beats that referenced this pull request Jan 10, 2020
…4876)

This PR adds a new counter which returns the number of triggered functions of Functionbeat.

Example snippet of the telemetry message containing info about function:

```json
{
  "state": {
    "functionbeat": {
        "functions": {
             "count": 1,
        }
    }
}
```

Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.
(cherry picked from commit 9ce1c26)
kvch added a commit to kvch/beats that referenced this pull request Jan 10, 2020
…4876)

This PR adds a new counter which returns the number of triggered functions of Functionbeat.

Example snippet of the telemetry message containing info about function:

```json
{
  "state": {
    "functionbeat": {
        "functions": {
             "count": 1,
        }
    }
}
```

Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.
(cherry picked from commit 9ce1c26)
kvch added a commit to kvch/beats that referenced this pull request Jan 10, 2020
…4876)

This PR adds a new counter which returns the number of triggered functions of Functionbeat.

Example snippet of the telemetry message containing info about function:

```json
{
  "state": {
    "functionbeat": {
        "functions": {
             "count": 1,
        }
    }
}
```

Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.
(cherry picked from commit 9ce1c26)
kvch added a commit that referenced this pull request Jan 10, 2020
… to Functionbeat (#15461)

* Add monitoring information about functions to Functionbeat (#14876)

This PR adds a new counter which returns the number of triggered functions of Functionbeat.

Example snippet of the telemetry message containing info about function:

```json
{
  "state": {
    "functionbeat": {
        "functions": {
             "count": 1,
        }
    }
}
```

Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.
(cherry picked from commit 9ce1c26)
@kvch kvch unassigned urso Jan 27, 2020
@kvch kvch added the test-plan Add this PR to be manual test plan label Jan 27, 2020
@andresrc andresrc requested review from blakerouse and removed request for blakerouse January 27, 2020 14:37
@blakerouse blakerouse added the test-plan-ok This PR passed manual testing label Jan 27, 2020
@andresrc andresrc added the Team:Integrations Label for the Integrations team label Mar 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Functionbeat needs testing notes Team:Integrations Label for the Integrations team test-plan Add this PR to be manual test plan test-plan-ok This PR passed manual testing v7.6.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants