AWS Lambda function to parse S3 server log files and export metrics to AWS CloudWatch.
This AWS Lambda function will analyze and aggregate your S3 Server Access log files and graph extra metrics in AWS CloudWatch.
NOTE: AWS introduced S3 request metrics at re:Invent 2016. This makes parts of this README outdated as of December 2016, and the approach described here might no longer be the best way to achieve your goals. Before reading further, please check the AWS documentation: S3 Request Metrics. s3logs-cloudwatch still provides extra, more specific metrics that the "official way" does not.
AWS S3 is a managed storage service. The only metrics available
in AWS CloudWatch for S3 are
NumberOfObjects and BucketSizeBytes. In order
to understand your S3 usage better you need to do some extra work. There can
be many reasons: understanding your AWS bill, getting a better idea of how your
deployed applications use S3, access auditing, security, performance
considerations, or proactive monitoring because you love metrics.
AWS offers server access logging to track requests for access to your bucket. This feature is disabled by default, but you can enable it free of charge (you are still charged for the storage used by the delivered log files and for data transfer when you access them).
To find out more about the Server Access Logging feature of S3, head over to the Server Access Logging section in the Amazon Simple Storage Service Developer Guide.
Extra S3 Metrics available
Custom metrics are sent to AWS CloudWatch under the namespace
You can choose a different namespace name in
configuration.ini file. Each
metric has a dimension
BucketName in case you would like to enable graphing
requests for multiple buckets.
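As a rough illustration of how a data point carrying the BucketName dimension could be shaped for CloudWatch, here is a minimal Python sketch. The metric name `ExampleRequestCount`, the namespace `ExampleNamespace` and the helper `build_metric_datum` are hypothetical and are not taken from lambda_function.py; only the PutMetricData field names are CloudWatch's own.

```python
from datetime import datetime, timezone


def build_metric_datum(metric_name, value, unit, bucket_name, timestamp):
    """Build one entry for the MetricData list of a PutMetricData call."""
    return {
        "MetricName": metric_name,
        # One dimension per data point, so each bucket gets its own graph.
        "Dimensions": [{"Name": "BucketName", "Value": bucket_name}],
        "Timestamp": timestamp,
        "Value": value,
        "Unit": unit,
    }


datum = build_metric_datum(
    "ExampleRequestCount",  # hypothetical metric name
    42,
    "Count",
    "com-companyname-mybucket",
    datetime(2016, 12, 1, 12, 0, tzinfo=timezone.utc),
)

# With boto3 installed and credentials configured, the datum could be sent as:
#   import boto3
#   boto3.client("cloudwatch").put_metric_data(
#       Namespace="ExampleNamespace",  # hypothetical namespace
#       MetricData=[datum],
#   )
```

Because the bucket name travels as a dimension, the same metric name can be reused for every monitored bucket.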
Metrics enabled by default are listed below.
To enable or disable a specific metric, edit
|Metric name||Enabled (default)|
Hopefully self-explanatory: the number of requests.
The number of milliseconds the request was in flight from the server's perspective. This value is measured from the time your request is received to the time that the last byte of the response is sent. Measurements made from the client's perspective might be longer due to network latency.
The number of milliseconds that Amazon S3 spent processing your request. This value is measured from the time the last byte of your request was received until the time the first byte of the response was sent.
The total size of the object in question.
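The timing and size values described above come from fields of each S3 server access log record. Below is a minimal sketch of pulling them out of one line, assuming the documented space-delimited field order and a hypothetical sample record. The regular expression and the helper names are illustrative only and are not the parser used by s3logs-cloudwatch.

```python
import re

# Leading fields of an S3 server access log record, in documented order.
LOG_LINE = re.compile(
    r'(?P<owner>\S+) (?P<bucket>\S+) \[(?P<time>[^\]]+)\] (?P<remote_ip>\S+) '
    r'(?P<requester>\S+) (?P<request_id>\S+) (?P<operation>\S+) (?P<key>\S+) '
    r'(?:"(?P<request_uri>[^"]*)"|-) (?P<status>\S+) (?P<error_code>\S+) '
    r'(?P<bytes_sent>\S+) (?P<object_size>\S+) (?P<total_time>\S+) '
    r'(?P<turnaround_time>\S+)'
)


def to_int(field):
    """S3 writes '-' when a field has no value for the request."""
    return None if field == "-" else int(field)


def parse_timing_fields(line):
    """Return the size and timing fields of one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    return {
        "bucket": match.group("bucket"),
        "operation": match.group("operation"),
        "object_size": to_int(match.group("object_size")),
        "total_time_ms": to_int(match.group("total_time")),
        "turnaround_time_ms": to_int(match.group("turnaround_time")),
    }


# Hypothetical record, shortened identifiers.
sample = (
    'ownerid com-companyname-mybucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 '
    'requesterid 3E57427F3EXAMPLE REST.GET.OBJECT photos/cat.jpg '
    '"GET /photos/cat.jpg HTTP/1.1" 200 - 2048 2048 35 12 "-" "curl/7.0" -'
)
parsed = parse_timing_fields(sample)
```

Fields that S3 logs as `-` come back as `None`, so they can be skipped rather than counted as zero when aggregating.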
Follow the steps below to enable CloudWatch metrics for bucket
Create a new S3 bucket for storing your S3 logs, for example
Enable logging on
com-companyname-s3logs as Target Bucket for log objects. Specify
com-companyname-mybucket/ as Target Prefix. Target Prefix value must end with a
/ for the s3logs-cloudwatch Lambda function to properly set the BucketName dimension in exported CloudWatch metrics.
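To see why the trailing slash matters, consider how a bucket name could be recovered from the key of a delivered log object. S3 names log objects as the Target Prefix followed by a timestamp and a unique string, so without the slash the bucket name and the timestamp run together. The sketch below is an assumption about how the prefix could be used, not a copy of lambda_function.py, and `bucket_name_from_log_key` is a hypothetical helper.

```python
def bucket_name_from_log_key(log_key):
    """Return the part of a log object key before the first '/'."""
    prefix, separator, _rest = log_key.partition("/")
    if not separator or not prefix:
        # No slash in the key: the Target Prefix did not end with '/'.
        raise ValueError("cannot derive bucket name from key: %r" % log_key)
    return prefix


# Hypothetical log object keys, with and without the trailing slash.
with_slash = "com-companyname-mybucket/2016-12-01-12-15-30-0123456789ABCDEF"
without_slash = "com-companyname-mybucket2016-12-01-12-15-30-0123456789ABCDEF"

bucket_name = bucket_name_from_log_key(with_slash)
```

Calling the helper on `without_slash` raises `ValueError`, because nothing marks where the bucket name ends.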
Clone this GitHub repository so you have it locally, prepare the deployment package, and upload it to S3.
$ git clone git@github.com:maginetv/s3logs-cloudwatch.git
$ cd s3logs-cloudwatch/
Edit the configuration.ini file if you would like to adjust the settings or change which metrics will be exported to CloudWatch. Then zip the files and put the resulting zip somewhere on S3.
$ zip s3logs-cloudwatch.zip lambda_function.py configuration.ini
$ s3cmd put s3logs-cloudwatch.zip s3://com-companyname-packages/lambda/s3logs-cloudwatch.zip
Use the provided CloudFormation template to deploy the s3logs-cloudwatch AWS Lambda function and the IAM role that is required to run it.
Go to CloudFormation in the AWS Console and create the stack using the provided CloudFormation template
The last step is creating and enabling triggers (event sources) for the deployed Lambda function. Head over to AWS Lambda in the AWS console and find the
s3logs-cloudwatch function in the functions list. The deployed function is supposed to run every time a new S3 log file is delivered to the
com-companyname-s3logs bucket. Picture below shows the configuration that has to be in place for the trigger to work:
If you need s3logs-cloudwatch to graph S3 metrics from multiple S3 buckets, just add another trigger (with a different prefix/suffix setting).
That's it. Your new CloudWatch metrics will start appearing soon.
Q: I don't see metrics in CloudWatch from the last hour, why?
A: AWS delivers log files to your logs bucket periodically. It is the log file delivery that triggers processing and exporting of metrics to CloudWatch. Log files are usually delivered about once an hour, and that is when the CloudWatch graphs for s3logs-cloudwatch metrics are updated.
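For reference, an S3 trigger invokes the Lambda function with an event that lists the newly created objects. The sketch below shows one way to read the logs bucket and the log object key from such an event. The helper `log_objects_from_event` and the sample event are hypothetical; the `Records` / `s3` / `bucket` / `object` layout is the standard S3 event notification structure.

```python
from urllib.parse import unquote_plus


def log_objects_from_event(event):
    """Return (bucket, key) pairs for every S3 record in a Lambda event."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3")
        if s3_info is None:
            continue  # not an S3 notification record
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(s3_info["object"]["key"])
        objects.append((bucket, key))
    return objects


# Hypothetical event, trimmed to the fields used above.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "com-companyname-s3logs"},
                "object": {
                    "key": "com-companyname-mybucket/2016-12-01-12-15-30-0123456789ABCDEF"
                },
            }
        }
    ]
}
delivered = log_objects_from_event(sample_event)
```

Each returned pair identifies one delivered log file that a handler would then download and parse.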
Q: How much does it cost to run this function?
A: As with everything on AWS, you pay for what you use. The monthly price of
deploying this function on your AWS account will depend on the volume of log
files that need to be processed, the amount of resources that you assign to the
s3logs-cloudwatch Lambda function, and the granularity of data aggregation
(the round_timestamp_to_min setting).
s3logs-cloudwatch aims to do everything in the most cost-effective way possible.
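To illustrate how aggregation granularity affects the number of data points sent to CloudWatch, here is a minimal sketch. It assumes that round_timestamp_to_min is a number of minutes and that request timestamps are rounded down to a multiple of it; both are assumptions based on the setting's name, and the helper names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def round_down_to_minutes(timestamp, minutes):
    """Round a timestamp down to a multiple of `minutes` within its hour."""
    discard = timedelta(
        minutes=timestamp.minute % minutes,
        seconds=timestamp.second,
        microseconds=timestamp.microsecond,
    )
    return timestamp - discard


def count_requests_per_interval(timestamps, minutes):
    """Count requests per rounded timestamp; each key becomes one data point."""
    return Counter(round_down_to_minutes(ts, minutes) for ts in timestamps)


# Hypothetical request times taken from one log file.
request_times = [
    datetime(2016, 12, 1, 12, 1, 10),
    datetime(2016, 12, 1, 12, 4, 59),
    datetime(2016, 12, 1, 12, 7, 0),
]
per_five_minutes = count_requests_per_interval(request_times, 5)
per_minute = count_requests_per_interval(request_times, 1)
```

With 5-minute rounding the three requests collapse into two data points, while 1-minute rounding keeps three. Coarser rounding therefore means fewer data points to send, at the price of less detailed graphs.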
Q: My logs bucket (com-companyname-s3logs) is growing really big. I do not
need my S3 logs to be stored for such a long time because it's expensive.
A: Enable S3 lifecycle rules for the com-companyname-s3logs bucket. You can
configure the bucket to delete log files permanently after a
certain period (for example 1 or 7 days). See Object Lifecycle Management in the
Amazon S3 Developer Guide.
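A lifecycle rule like this can also be set programmatically. The sketch below builds a rule that expires every object in the bucket after a given number of days. The rule ID and the helper `expire_logs_rule` are hypothetical, and the boto3 call is shown only as a comment because it needs boto3 and AWS credentials.

```python
def expire_logs_rule(days):
    """Build a lifecycle configuration that deletes all objects after `days` days."""
    return {
        "Rules": [
            {
                "ID": "expire-s3-access-logs",  # hypothetical rule name
                "Filter": {"Prefix": ""},  # empty prefix: applies to every object
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


lifecycle = expire_logs_rule(7)

# With boto3 installed and credentials configured, the rule could be applied as:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="com-companyname-s3logs",
#       LifecycleConfiguration=lifecycle,
#   )
```

Since s3logs-cloudwatch processes each log file when it is delivered, a short expiration period should not affect metrics that have already been exported.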
Pull requests are very welcome if you feel like you would like to improve or add any functionality. In order to contribute:
- Fork this repository on GitHub
- Create your own topic branch
- Once finished, submit a pull request
Copyright 2016 Magine AB

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Michal Gasek (michal.gasek at magine/com)