The idea behind this project is to easily archive logs from CloudWatch into an S3 bucket using AWS features. It's designed to run as a scheduled task every day in order to retrieve the previous day's logs.
- Retrieve flag values from either the command line or environment variables.
- Identify which CloudWatch log streams must be downloaded.
- Download all logs concurrently with multiple goroutines.
- Create a ZIP archive with all these logs.
- Upload the archive to an S3 bucket.
... That's all!
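The concurrent download step can be sketched with a goroutine per stream and a `sync.WaitGroup` (a minimal illustration; the function names are hypothetical and the download function is injected to keep the sketch independent of the CloudWatch API):

```go
package main

import (
	"fmt"
	"sync"
)

// downloadAll fetches every log stream concurrently, one goroutine per
// stream, and returns the contents keyed by stream name.
// (hypothetical sketch, not the project's actual implementation)
func downloadAll(streams []string, download func(string) string) map[string]string {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]string, len(streams))
	)
	for _, name := range streams {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			data := download(name)
			mu.Lock()
			results[name] = data
			mu.Unlock()
		}(name)
	}
	wg.Wait()
	return results
}

func main() {
	fake := func(name string) string { return "logs for " + name }
	out := downloadAll([]string{"app/a", "app/b"}, fake)
	fmt.Println(len(out)) // → 2
}
```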
To use Go with a Lambda function, we need a Linux binary that we will compress into a ZIP archive.
```sh
# Build a binary that will run on Linux
GOOS=linux go build -o logs-archiving logs-archiving.go

# Put the binary into a ZIP archive
zip logs-archiving.zip logs-archiving
```
Once the archive has been generated, you have to upload it to AWS.
AWS credentials are automatically retrieved from the execution context. There is no additional configuration required.
Two environment variables must be configured on the Lambda function, plus an optional third:

- `BUCKET_NAME`, the S3 bucket name where logs will be archived.
- `ENVIRONMENT_NAME`, the name of the environment the logs come from.
- `TARGET_DATE` (optional), the day for which logs must be archived.
These values can also be passed manually outside AWS by using:
```sh
go run logs-archiving.go -bucket XXXXX -environment XXXXX (-target XXXXX)
```
Therefore, if you want to run the script locally, you have to replace `lambda.Start(LambdaHandler)` with `LambdaHandler()`. The `lambda.Start` call is required by AWS, but it causes an infinite wait when the program is run outside a Lambda context.
Because of the nature of Lambda (limited execution time and limited resources), archiving access logs generated by a production infrastructure can be problematic. A condition has been implemented in the process to avoid timeouts caused by those files: it simply skips log stream names which contain the string `access`.
That's not the nicest solution, but it covers most of our use cases (Apache and Nginx).
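The filter boils down to a substring check on the stream name, roughly like this (the `shouldArchive` name is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// shouldArchive reports whether a log stream should be downloaded.
// Streams whose names contain "access" are skipped to avoid Lambda
// timeouts on large access logs (e.g. Apache and Nginx).
// (hypothetical helper, not necessarily the project's actual code)
func shouldArchive(streamName string) bool {
	return !strings.Contains(streamName, "access")
}

func main() {
	for _, name := range []string{"app/error.log", "nginx/access.log"} {
		fmt.Println(name, shouldArchive(name))
	}
}
```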