Do not merge: SLO solution (Cloudwatch embedded metric log format) proof of concept #696
What does this change?
This is a proof-of-concept PR to demonstrate the validity of using the AWS embedded metric format to capture metrics directly from log processing, without incurring the cost or potential blocking of a cloudwatch.putMetric() command.
This is a solution to the problem of tracking historical SLIs for the mobile team. The root of the problem is that the metrics are pulled from the logs by a Kibana or Grafana dashboard, so there is currently no way to persist the values beyond the lifetime of those log files.
The solution we propose is to log these values as CloudWatch metrics. These persist more or less permanently and can then be accessed by dashboarding software (e.g. Grafana) and displayed at will.
Normally, metrics are sent to CloudWatch using a putMetric() request. However, this is a blocking call, which adds risk and slows things down, and it has a specific cost of US$0.01 per 1,000 requests.
The AWS embedded metric format avoids both of these. AWS extracts the metrics as part of its own log processing, so the processing cost falls on AWS, and because we are not making a putMetric() request, we avoid the per-request cost mentioned above.
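To put the per-request cost in context, here is a rough sketch of the arithmetic. The price is the figure quoted in this description; the monthly request volume is a made-up number for illustration, not a measurement from our services.

```python
# Per-request price quoted in this PR description (US$0.01 per 1,000 requests).
PRICE_PER_1000_REQUESTS_USD = 0.01


def put_metric_request_cost(requests: int) -> float:
    """Estimated request cost in USD for a given number of putMetric() calls."""
    return requests / 1000 * PRICE_PER_1000_REQUESTS_USD


if __name__ == "__main__":
    # Hypothetical volume: one metric call per request at 50 million requests/month.
    monthly_requests = 50_000_000
    cost = put_metric_request_cost(monthly_requests)
    print(f"Estimated monthly request cost: US${cost:.2f}")
```

This covers only the per-request charge discussed above; it says nothing about the latency or failure risk of the blocking call.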
This code contains a working example of a metric being sent as part of a log message. We have tested this in code and confirmed that the metric is indeed created.
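For readers unfamiliar with the format, the following is a minimal sketch of what an embedded-metric log line looks like. It is not the code in this PR: the function name `emf_log_line`, the namespace, the metric name and the dimension values are all hypothetical, chosen only to illustrate the `_aws` metadata block that CloudWatch looks for.

```python
import json
import time


def emf_log_line(namespace, metric_name, value, unit="Count", dimensions=None):
    """Build one JSON log line in the CloudWatch embedded metric format.

    All argument values are supplied by the caller; nothing here talks to AWS.
    """
    dimensions = dimensions or {}
    record = {
        "_aws": {
            # Epoch time in milliseconds.
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set, naming the dimension keys found below.
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # The metric value lives at the top level, keyed by the metric name.
        metric_name: value,
    }
    # Dimension values are also top-level members of the log event.
    record.update(dimensions)
    return json.dumps(record)


if __name__ == "__main__":
    # Hypothetical SLI: request latency for an imaginary "mobile-api" service.
    print(emf_log_line(
        namespace="mobile-sli-poc",
        metric_name="requestLatency",
        value=123,
        unit="Milliseconds",
        dimensions={"service": "mobile-api"},
    ))
```

The point of the design is that the metric is emitted with an ordinary log write rather than a separate API call. Whether a plain write to stdout is enough depends on how the logs reach CloudWatch in a given environment, so treat the `print` as an assumption.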
Full details on the format needed to take advantage of this are here
Detailed documentation on how to use embedded metrics, and their advantages and disadvantages, can be found here
In addition, an excellent study of using these metrics is in the first 10 minutes of this video