-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add monitoring information about functions to Functionbeat #14876
Add monitoring information about functions to Functionbeat #14876
Conversation
aaced7f
to
1f87e84
Compare
} | ||
|
||
// AddTriggeredFunction adds a triggered function data to the registry. | ||
func (t *telemetry) AddTriggeredFunction(typeName string, triggerCount, eventCount int64) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of triggerCount or eventCount? Telemetry is less about monitoring, we just want to know which features are used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was interested to see how many triggers users configure. I can remove both counters if we don't want those to be reported at all.
@@ -31,6 +33,7 @@ var ( | |||
"logstash", | |||
"console", // for local debugging | |||
} | |||
metrics = monitoring.Default.NewRegistry("functionbeat") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
libbeat declares that 'stat' registry for telemtry data. If it is used we will have the monitoring backend report metrics and telemetry information every now and so often.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only found a registry named "state". Is that what you mean?
https://github.com/elastic/beats/blob/master/libbeat/cmd/instance/beat.go#L338
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @kvch is already using the stats
registry here, by virtue of using monitoring.Default
. See
beats/libbeat/monitoring/monitoring.go
Lines 50 to 55 in f35ffdc
// Default is the global default metrics registry provided by the monitoring package. | |
var Default = NewRegistry() | |
func init() { | |
GetNamespace("stats").SetRegistry(Default) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean "state". The 'stats' registry is for metrics, the 'state' registry is for telemetry. monitoring.Default
should not be used here.
One tricky problem might be, that we have no control when telemetry data are reported, or how often. If we remove the trigger counts we still can collect information like: functionbeat being used, provider in use, trigger types configured, and configured output type. Once a function is installed with a provider, it is some rather static information. Instead of relying on some monitoring background task that might be reporting information in time or not (the beat is put to halt once all events have been ACKed), we might want to implement slightly different strategy. e.g. On startup there is a license check. The license is cached once queried. The same time we query the license we could also attempt to send the telemetry info. Functionbeat would have to wait for events being published + telemetry being send. If there was no telemetry update for N minutes we could send the same info again. @ph @ycombinator WDYT? I also wonder about feasbility (would we have to implement our own reporting or can we reuse code from the reporter as is?). |
@urso I like your proposal about sending the monitoring data as soon as the license is queried. We can do that once and then continue with the periodic sending thereafter. Obviously this will require some refactoring in the libbeat monitoring code first. Specifically, it looks like currently the reporter's constructor is responsible for spawning a goroutine to periodically send monitoring data. Instead we might want to expose a One clarifying question: I assume you'd want to make this change (start monitoring reporter right after license check) for all beats, not just Functionbeat? |
This sounds like a rather big change we should do in another PR. Telemetry should not be reported by an independent process. If all events are ACKed before Telemetry did finish, then the information is lost.
It could be for all Beats. But because Functionbeat is so different from the other Beats it is a requirement for Functionbeat. Reporting must not necesarrily be tied to the License check, but should happen if the function starts up first. Maybe again if X the function is invoked with a new Batch of events and the last report is X minutes in the past. => Although reporting is done in the background, the if and when to trigger it should be under control of Functionbeat, not libbeat. |
// AddTriggeredFunction does nothing. | ||
func (m *mock) AddTriggeredFunction(_ string, _ int64, _ int64) { | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want the mock to store invocations, so we can validate them in tests?
@kvch I have a draft PR up for the Kibana-side changes here: elastic/kibana#54267. Once you've updated the monitoring snippet for Functionbeat telemetry with the final one in this PR's description above, I will proceed with making the corresponding changes in the Kibana PR. |
Are we missing a file? |
@urso No, I just forgot to add |
// AddTriggeredFunction adds a triggered function data to the registry. | ||
func (t *telemetry) AddTriggeredFunction() { | ||
t.Lock() | ||
defer t.Unlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to a lock here. (*monitoring.Int).Inc()
is atomic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All in all, LGTM. Please create a follow up issue allowing functionbeat to control when the telemetry reporter should send info.
…4876) This PR adds a new counter which returns the number of triggered functions of Functionbeat. Example snippet of the telemetry message containing info about function: ```json { "state": { "functionbeat": { "functions": { "count": 1, } } } ``` Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data. (cherry picked from commit 9ce1c26)
…4876) This PR adds a new counter which returns the number of triggered functions of Functionbeat. Example snippet of the telemetry message containing info about function: ```json { "state": { "functionbeat": { "functions": { "count": 1, } } } ``` Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data. (cherry picked from commit 9ce1c26)
…4876) This PR adds a new counter which returns the number of triggered functions of Functionbeat. Example snippet of the telemetry message containing info about function: ```json { "state": { "functionbeat": { "functions": { "count": 1, } } } ``` Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data. (cherry picked from commit 9ce1c26)
…4876) This PR adds a new counter which returns the number of triggered functions of Functionbeat. Example snippet of the telemetry message containing info about function: ```json { "state": { "functionbeat": { "functions": { "count": 1, } } } ``` Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data. (cherry picked from commit 9ce1c26)
…4876) This PR adds a new counter which returns the number of triggered functions of Functionbeat. Example snippet of the telemetry message containing info about function: ```json { "state": { "functionbeat": { "functions": { "count": 1, } } } ``` Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data. (cherry picked from commit 9ce1c26)
…4876) This PR adds a new counter which returns the number of triggered functions of Functionbeat. Example snippet of the telemetry message containing info about function: ```json { "state": { "functionbeat": { "functions": { "count": 1, } } } ``` Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data. (cherry picked from commit 9ce1c26)
…4876) This PR adds a new counter which returns the number of triggered functions of Functionbeat. Example snippet of the telemetry message containing info about function: ```json { "state": { "functionbeat": { "functions": { "count": 1, } } } ``` Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data. (cherry picked from commit 9ce1c26)
… to Functionbeat (#15461) * Add monitoring information about functions to Functionbeat (#14876) This PR adds a new counter which returns the number of triggered functions of Functionbeat. Example snippet of the telemetry message containing info about function: ```json { "state": { "functionbeat": { "functions": { "count": 1, } } } ``` Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data. (cherry picked from commit 9ce1c26)
This PR adds a new counter which returns the number of triggered functions of Functionbeat.
Example snippet of the telemetry message containing info about function:
Follow-up work is required on Kibana side to process the reported information correctly to get telemetry data.