⚡ Logs archiving from CloudWatch to S3 through a Lambda function.

ajardin/lambda-logs-archiving

Logs archiving with a Lambda function

License: MIT

Overview

This project archives logs from CloudWatch into an S3 bucket using a Lambda function. It is designed to run as a daily scheduled task that retrieves the previous day's logs.

Behavior

  1. Retrieve flag values from either the command line or environment variables.
  2. Identify which CloudWatch log streams must be downloaded.
  3. Download all logs concurrently using multiple goroutines.
  4. Create a ZIP archive containing all these logs.
  5. Upload the archive to the S3 bucket.

... That's all!

Usage

To use Go with a Lambda function, we need a Linux binary that we will compress into a ZIP archive.

# Build a binary that will run on Linux
GOOS=linux go build -o logs-archiving logs-archiving.go

# Put the binary into a ZIP archive 
zip logs-archiving.zip logs-archiving

Once the archive has been generated, you have to upload it to AWS.

Configuration

AWS credentials are automatically retrieved from the execution context. There is no additional configuration required.

Two environment variables must be configured on the Lambda function, and a third one is optional:

  • BUCKET_NAME, the name of the S3 bucket where logs will be archived.
  • ENVIRONMENT_NAME, the name of the environment that generated the logs.
  • TARGET_DATE (optional), the day whose logs must be archived.

These values can also be passed manually outside AWS by using:

go run logs-archiving.go -bucket XXXXX -environment XXXXX (-target XXXXX)
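One way to support both sources is to use the environment variables as defaults for the command-line flags. The sketch below assumes that a flag given on the command line takes precedence over the environment; the config type and parseConfig function are hypothetical names, and only the flag and variable names come from this README.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
)

// config holds the three settings described above (hypothetical type).
type config struct {
	Bucket      string
	Environment string
	Target      string // optional
}

// parseConfig reads flags from args, falling back to environment variables.
// getenv is injected so the lookup can be replaced in tests.
func parseConfig(args []string, getenv func(string) string) (config, error) {
	var c config
	fs := flag.NewFlagSet("logs-archiving", flag.ContinueOnError)
	fs.StringVar(&c.Bucket, "bucket", getenv("BUCKET_NAME"), "S3 bucket name")
	fs.StringVar(&c.Environment, "environment", getenv("ENVIRONMENT_NAME"), "environment name")
	fs.StringVar(&c.Target, "target", getenv("TARGET_DATE"), "optional target date")
	if err := fs.Parse(args); err != nil {
		return c, err
	}
	if c.Bucket == "" || c.Environment == "" {
		return c, errors.New("bucket and environment are required")
	}
	return c, nil
}

func main() {
	// Sample arguments; a real program would pass os.Args[1:].
	c, err := parseConfig([]string{"-bucket", "my-bucket", "-environment", "staging"}, os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```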

However, if you want to run the program locally, you have to replace lambda.Start(LambdaHandler) with LambdaHandler(). The lambda.Start call is required by AWS, but it blocks indefinitely when the program runs outside a Lambda context.
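As an alternative to editing the source by hand, the program could decide at startup which path to take. The sketch below is not what the project does: it relies on the AWS_LAMBDA_FUNCTION_NAME environment variable, which the Lambda runtime sets, and the runningInLambda and dispatch helpers are hypothetical. The two callbacks stand in for lambda.Start(LambdaHandler) and LambdaHandler() so the example runs without the AWS SDK.

```go
package main

import (
	"fmt"
	"os"
)

// runningInLambda reports whether the process appears to run inside AWS
// Lambda, based on an environment variable set by the Lambda runtime.
func runningInLambda(getenv func(string) string) bool {
	return getenv("AWS_LAMBDA_FUNCTION_NAME") != ""
}

// dispatch calls startLambda inside Lambda and runLocal everywhere else.
func dispatch(getenv func(string) string, startLambda, runLocal func()) {
	if runningInLambda(getenv) {
		startLambda()
		return
	}
	runLocal()
}

func main() {
	dispatch(os.Getenv,
		func() { fmt.Println("would call lambda.Start(LambdaHandler)") },
		func() { fmt.Println("would call LambdaHandler() directly") },
	)
}
```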

Limitations

Because Lambda functions have limited execution time and resources, archiving access logs generated by a production infrastructure can be problematic. To avoid timeouts caused by those files, the process skips any log stream whose name contains the string access.

That's not the nicest solution, but it covers most of our use cases (Apache and Nginx).
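The filter described above amounts to a substring check on each stream name. A minimal sketch, assuming a hypothetical helper named skipAccessStreams and made-up stream names:

```go
package main

import (
	"fmt"
	"strings"
)

// skipAccessStreams returns the stream names that do not contain "access".
func skipAccessStreams(names []string) []string {
	kept := make([]string, 0, len(names))
	for _, n := range names {
		if strings.Contains(n, "access") {
			continue // access logs are too large to archive within Lambda limits
		}
		kept = append(kept, n)
	}
	return kept
}

func main() {
	streams := []string{"apache/error.log", "apache/access.log", "php/app.log"}
	fmt.Println(skipAccessStreams(streams)) // prints [apache/error.log php/app.log]
}
```

Note that a plain substring match also skips unrelated streams whose names happen to contain access.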
