From 7e78e70adaec13dda39b7f66840416ec675cdc44 Mon Sep 17 00:00:00 2001
From: Tian Chu
Date: Tue, 31 Dec 2019 11:46:04 -0500
Subject: [PATCH] v3.0.0

---
 .github/workflows/lambdachecks.yml        |   2 +-
 aws/logs_monitoring/.gitignore            |   1 +
 aws/logs_monitoring/README.md             | 203 ++-------
 aws/logs_monitoring/lambda_function.py    | 106 +++--
 aws/logs_monitoring/log-sam-template.yaml |  17 -
 aws/logs_monitoring/release.sh            |  82 ++++
 aws/logs_monitoring/template.yaml         | 528 ++++++++++++++++++++++
 7 files changed, 708 insertions(+), 231 deletions(-)
 create mode 100644 aws/logs_monitoring/.gitignore
 delete mode 100644 aws/logs_monitoring/log-sam-template.yaml
 create mode 100755 aws/logs_monitoring/release.sh
 create mode 100644 aws/logs_monitoring/template.yaml

diff --git a/.github/workflows/lambdachecks.yml b/.github/workflows/lambdachecks.yml
index 59dbf9817..24814f52d 100644
--- a/.github/workflows/lambdachecks.yml
+++ b/.github/workflows/lambdachecks.yml
@@ -9,7 +9,7 @@ jobs:
     strategy:
       max-parallel: 4
       matrix:
-        python-version: [2.7, 3.6, 3.7]
+        python-version: [3.7]
     steps:
     - uses: actions/checkout@v1
     - name: Set up Python ${{ matrix.python-version }}
diff --git a/aws/logs_monitoring/.gitignore b/aws/logs_monitoring/.gitignore
new file mode 100644
index 000000000..c4c4ffc6a
--- /dev/null
+++ b/aws/logs_monitoring/.gitignore
@@ -0,0 +1 @@
+*.zip
diff --git a/aws/logs_monitoring/README.md b/aws/logs_monitoring/README.md
index 975db6972..4f7f034b6 100644
--- a/aws/logs_monitoring/README.md
+++ b/aws/logs_monitoring/README.md
@@ -1,188 +1,49 @@
-**IMPORTANT NOTE: When upgrading, please ensure your forwarder Lambda function has [the latest Datadog Lambda Layer installed](https://github.com/DataDog/datadog-serverless-functions/tree/master/aws/logs_monitoring#3-add-the-datadog-lambda-layer).**
-
 # Datadog Forwarder
 
-AWS Lambda function to ship logs and metrics from ELB, S3, CloudTrail, VPC, CloudFront, and CloudWatch logs to Datadog
+AWS Lambda function to ship logs from S3 and CloudWatch, as well as custom metrics and traces from Lambda functions, to Datadog.
 
 ## Features
 
-- Forward logs through HTTPS (defaulted to port 443)
-- Use AWS Lambda to re-route triggered S3 events to Datadog
-- Use AWS Lambda to re-route triggered Kinesis data stream events to Datadog, only the Cloudwatch logs are supported
-- Cloudwatch, ELB, S3, CloudTrail, VPC and CloudFront logs can be forwarded
-- SSL Security
-- JSON events providing details about S3 documents forwarded
-- Structured meta-information can be attached to the events
-- Scrubbing / Redaction rules
-- Filtering rules (`INCLUDE_AT_MATCH` and `EXCLUDE_AT_MATCH`)
-- Multiline Log Support (S3 Only)
-- Forward custom metrics from logs
-- Submit `aws.lambda.enhanced.*` Lambda metrics parsed from the AWS REPORT log: duration, billed_duration, max_memory_used, estimated_cost
-
-## Quick Start
-
-The provided Python script must be deployed into your AWS Lambda service to collect your logs and send them to Datadog.
-
-### 1. Create a new Lambda function
-
-1. [Navigate to the Lambda console](https://console.aws.amazon.com/lambda/home) and create a new function.
-2. Select `Author from scratch` and give the function a unique name: `datadog-log-monitoring-function`
-3. For `Role`, select `Create new role from template(s)` and give the role a unique name: `datadog-log-monitoring-function-role`
-4. Under Policy templates, select `s3 object read-only permissions`.
-
-### 2. Provide the code
-
-1. Copy paste the code of the Lambda function from the `lambda_function.py` file.
-2. 
Set the runtime to `Python 2.7`, `Python 3.6`, or `Python 3.7` -3. Set the handler to `lambda_function.lambda_handler` - -### 3. Add the Datadog Lambda Layer -The [Datadog Lambda Layer]((https://github.com/DataDog/datadog-lambda-layer-python)) **MUST** be added to the log forwarder Lambda function. Use the Lambda layer ARN below, and replace `` with the actual region (e.g., `us-east-1`), `` with the runtime of your forwarder (e.g., `Python27`), and `` with the latest version from the [CHANGELOG](https://github.com/DataDog/datadog-lambda-layer-python/blob/master/CHANGELOG.md). - -``` -arn:aws:lambda::464622532012:layer:Datadog-: -``` - -For example: - -``` -arn:aws:lambda:us-east-1:464622532012:layer:Datadog-Python37:8 -``` - - -### 4. Set your Parameters - -At the top of the script you'll find a section called `PARAMETERS`, that's where you want to edit your code, available paramters are: - -#### DD_API_KEY - -Set the Datadog API key for your Datadog platform, it can be found here: - -* Datadog US Site: https://app.datadoghq.com/account/settings#api -* Datadog EU Site: https://app.datadoghq.eu/account/settings#api - -There are 3 possibilities to set your Datadog API key: - -1. **KMS Encrypted key (recommended)**: Use the `DD_KMS_API_KEY` environment variable to use a KMS encrypted key. Make sure that the Lambda execution role is listed in the KMS Key user in https://console.aws.amazon.com/iam/home#encryptionKeys. -2. **Environment Variable**: Use the `DD_API_KEY` environment variable for the Lambda function. -3. **Manual**: Replace `` in the code: - - ```python - ## @param DD_API_KEY - String - required - default: none - ## The Datadog API key associated with your Datadog Account - ## It can be found here: - ## - ## * Datadog US Site: https://app.datadoghq.com/account/settings#api - ## * Datadog EU Site: https://app.datadoghq.eu/account/settings#api - # - DD_API_KEY = "" - ``` - -#### Custom Tags - -Add custom tags to all data forwarded by your function, either: - -* Use the `DD_TAGS` environment variable. Your tags must be a comma-separated list of strings with no trailing comma. -* Edit the lambda code directly: - - ```python - ## @param DD_TAGS - list of comma separated strings - optional -default: none - ## Pass custom tags as environment variable or through this variable. - ## Ensure your tags are a comma separated list of strings with no trailing comma in the envvar! - # - DD_TAGS = os.environ.get("DD_TAGS", "") - ``` - -#### Datadog Site - -Define your Datadog Site to send data to, `datadoghq.com` for Datadog US site or `datadoghq.eu` for Datadog EU site, either: - -* Use the `DD_SITE` environment variable. -* Edit the lambda code directly: - - ```python - ## @param DD_SITE - String - optional -default: datadoghq.com - ## Define the Datadog Site to send your logs and metrics to. - ## Set it to `datadoghq.eu` to send your logs and metrics to Datadog EU site. - # - DD_SITE = os.getenv("DD_SITE", default="datadoghq.com") - ``` - -#### Send logs through TCP or HTTP. - -By default, the forwarder sends logs using HTTPS through the port `443`. To send logs over a SSL encrypted TCP connection either: - -* Set the environment variable `DD_USE_TCP` to `true`. -* Edit the lambda code directly: - - ```python - ## @param DD_USE_TCP - boolean - optional -default: false - ## Change this value to `true` to send your logs and metrics using the HTTP network client - ## By default, it use the TCP client. 
-    #
-    DD_USE_TCP = os.getenv("DD_USE_TCP", default="false").lower() == "true"
-    ```
-
-#### Proxy
-
-Ensure that you disable SSL between the lambda and your proxy by setting `DD_NO_SSL` to `true`
-
-Two environment variables can be used to forward logs through a proxy:
-
-* `DD_URL`: Define the proxy endpoint to forward the logs to.
-* `DD_PORT`: Define the proxy port to forward the logs to.
-
-#### DD_FETCH_LAMBDA_TAGS
-
-If the `DD_FETCH_LAMBDA_TAGS` env variable is set to `true` then the log forwarder will fetch Lambda tags using [GetResources](https://docs.aws.amazon.com/resourcegroupstagging/latest/APIReference/API_GetResources.html) API calls and apply them to the `aws.lambda.enhanced.*` metrics parsed from the REPORT log. For this to work the log forwarder function needs to be given the `tag:GetResources` permission. The tags are cached in memory so that they'll only be fetched when the function cold starts or when the TTL (1 hour) expires. The log forwarder increments the `aws.lambda.enhanced.get_resources_api_calls` metric for each API call made.
-
-### 5. Configure your function
-
-To configure your function:
-
-1. Set the memory to 1024 MB.
-2. Also set the timeout limit. 120 seconds is recommended to deal with big files.
-3. Hit the `Save` button.
-
-### 6. Test it
-
-Hit the `Test` button, and select `CloudWatch Logs` as the sample event. If the test "succeeded", you are all set! The test log doesn't show up in the platform.
-
-**Note**: For S3 logs, there may be some latency between the time a first S3 log file is posted and the Lambda function wakes up.
+- Forward CloudWatch, ELB, S3, CloudTrail, VPC and CloudFront logs to Datadog
+- Forward S3 events to Datadog
+- Forward Kinesis data stream events to Datadog (only CloudWatch logs are supported)
+- Forward custom metrics from AWS Lambda functions via CloudWatch logs
+- Forward traces from AWS Lambda functions via CloudWatch logs
+- Generate and submit enhanced Lambda metrics (`aws.lambda.enhanced.*`) parsed from the AWS REPORT log: duration, billed_duration, max_memory_used, and estimated_cost
 
-### 7. (optional) Scrubbing / Redaction rules
+## Install
 
-Multiple scrubbing options are available. `REDACT_IP` and `REDACT_EMAIL` match against hard-coded patterns, while `DD_SCRUBBING_RULE` allows users to supply a regular expression.
-- To use `REDACT_IP`, add it as an environment variable and set the value to `true`.
-  - Text matching `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}` is replaced with `xxx.xxx.xxx.xxx`.
-- To use `REDACT_EMAIL`, add it as an environment variable and set the value to `true`.
-  - Text matching `[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+` is replaced with `xxxxx@xxxxx.com`.
-- To use `DD_SCRUBBING_RULE`, add it as a environment variable, and supply a regular expression as the value.
-  - Text matching the user-supplied regular expression is replaced with `xxxxx`, by default.
-  - Use the `DD_SCRUBBING_RULE_REPLACEMENT` environment variable to supply a replacement value instead of `xxxxx`.
-- Scrubbing rules are applied to the full JSON-formatted log, including any metadata that is automatically added by the Lambda function.
-- Each instance of a pattern match is replaced until no more matches are found in each log.
+1. Log in to AWS using a user/role with admin permissions.
+1. Deploy the [datadog-serverless](https://console.aws.amazon.com/cloudformation/home#/stacks/new?stackName=datadog-serverless&templateURL=https://dd-log-sam.s3.amazonaws.com/templates/3.0.0.yaml) CloudFormation stack.
+1. Fill in `DdApiKey` and select the appropriate `DdSite`.
+1. All other parameters are optional; leave them at their defaults.
+1. You can find the installed Forwarder under the stack's "Resources" tab.
+1. Set up triggers on the installed Forwarder either [manually](https://docs.datadoghq.com/integrations/amazon_web_services/?tab=allpermissions#manually-setup-triggers) or [automatically](https://docs.datadoghq.com/integrations/amazon_web_services/?tab=allpermissions#automatically-setup-triggers).
+1. Repeat the above steps in another region if you operate in multiple AWS regions.
 
-### 8. (optional) Filtering rules
+## Update
 
-Use the `EXCLUDE_AT_MATCH` OR `INCLUDE_AT_MATCH` environment variables to filter logs based on a regular expression match:
+### Upgrade to a different version
 
-- To use `EXCLUDE_AT_MATCH` add it as an environment variable and set its value to a regular expression. Logs matching the regular expression are excluded.
-- To use `INCLUDE_AT_MATCH` add it as an environment variable and set its value to a regular expression. If not excluded by `EXCLUDE_AT_MATCH`, logs matching the regular expression are included.
-- If a log matches both the inclusion and exclusion criteria, it is excluded.
-- Filtering rules are applied to the full JSON-formatted log, including any metadata that is automatically added by the function.
+1. Find the [datadog-serverless (if you didn't rename it)](https://console.aws.amazon.com/cloudformation/home#/stacks?filteringText=datadog) CloudFormation stack.
+1. Update the stack using template `https://dd-log-sam.s3.amazonaws.com/templates/<VERSION>.yaml`. The latest version can be found in the [template.yaml](template.yaml).
 
-### 9. (optional) Multiline Log support for s3
+### Adjust forwarder settings
 
-If there are multiline logs in s3, set `DD_MULTILINE_LOG_REGEX_PATTERN` environment variable to the specified regex pattern to detect for a new log line.
+1. Find the [datadog-serverless (if you didn't rename it)](https://console.aws.amazon.com/cloudformation/home#/stacks?filteringText=datadog) CloudFormation stack.
+1. Update the stack using the current template.
+1. Adjust parameter values.
 
-- Example: for multiline logs beginning with pattern `11/10/2014`: `DD_MULTILINE_LOG_REGEX_PATTERN="\d{2}\/\d{2}\/\d{4}"`
+Note: It's recommended to adjust forwarder settings through CloudFormation rather than directly editing the Lambda function. Descriptions of the settings can be found in the [template.yaml](template.yaml) and in the CloudFormation stack creation user interface when you launch the stack. Feel free to submit a pull request to make additional settings adjustable through the template.
 
-### 10. (optional) Disable log forwarding
+## Troubleshoot
 
-The datadog forwarder **ALWAYS** forwards logs by default. If you do NOT use the Datadog log management product, you **MUST** set environment variable `DD_FORWARD_LOG` to `False`, to avoid sending logs to Datadog. The forwarder will then only forward other observability data, such as metrics.
+Set the environment variable `DD_LOG_LEVEL` to `debug` on the Forwarder Lambda function to enable detailed logging temporarily (don't forget to remove it). If the debug logs don't help, please contact [Datadog support](https://www.datadoghq.com/support/).
 
-### 11. (optional) Disable SSL validation
+## Notes
 
-If you need to ignore SSL certificate validation when forwarding logs using HTTPS, you can set the environment variable `DD_SKIP_SSL_VALIDATION` to `True`.
-This will still encrypt the traffic between the forwarder and the endpoint provided with `DD_URL` but will not check if the destination SSL certificate is valid.
+* For S3 logs, there may be some latency between when a first S3 log file is posted and when the Lambda function wakes up.
+* Currently, the forwarder has to be deployed manually in GovCloud and China, and supports only log forwarding there.
+  1. Create a Lambda function using `aws-dd-forwarder-<VERSION>.zip` from the latest [releases](https://github.com/DataDog/datadog-serverless-functions/releases).
+  1. Save your Datadog API key in AWS Secrets Manager, and set environment variable `DD_API_KEY_SECRET_ARN` with the secret ARN on the Lambda function.
+  1. Configure [triggers](https://docs.datadoghq.com/integrations/amazon_web_services/?tab=allpermissions#send-aws-service-logs-to-datadog).
\ No newline at end of file
diff --git a/aws/logs_monitoring/lambda_function.py b/aws/logs_monitoring/lambda_function.py
index 94520cbbb..bc6d6d0e1 100644
--- a/aws/logs_monitoring/lambda_function.py
+++ b/aws/logs_monitoring/lambda_function.py
@@ -1,8 +1,3 @@
-# IMPORTANT NOTE: When upgrading, please ensure your forwarder Lambda function
-# has the latest Datadog Lambda Layer installed.
-# https://github.com/DataDog/datadog-serverless-functions/tree/master/aws/logs_monitoring#3-add-the-datadog-lambda-layer
-
-
 # Unless explicitly stated otherwise all files in this repository are licensed
 # under the Apache License Version 2.0.
 # This product includes software developed at Datadog (https://www.datadoghq.com/).
@@ -42,7 +37,6 @@
 try:
     from enhanced_lambda_metrics import parse_and_submit_enhanced_metrics
-
     DD_ENHANCED_LAMBDA_METRICS = True
 except ImportError:
     DD_ENHANCED_LAMBDA_METRICS = False
@@ -51,13 +45,14 @@
     "will not be submitted. Ensure you've included the enhanced_lambda_metrics "
     "file in your Lambda project."
 )
+finally:
+    log.debug(f"DD_ENHANCED_LAMBDA_METRICS: {DD_ENHANCED_LAMBDA_METRICS}")
 
 try:
     # Datadog Lambda layer is required to forward metrics
     from datadog_lambda.wrapper import datadog_lambda_wrapper
     from datadog_lambda.metric import lambda_stats
-
     DD_FORWARD_METRIC = True
 except ImportError:
     log.debug(
@@ -65,20 +60,30 @@
     )
     # For backward-compatibility
     DD_FORWARD_METRIC = False
+finally:
+    log.debug(f"DD_FORWARD_METRIC: {DD_FORWARD_METRIC}")
 
 try:
     # Datadog Trace Layer is required to forward traces
     from trace_forwarder.connection import TraceConnection
-
     DD_FORWARD_TRACES = True
 except ImportError:
     # For backward-compatibility
     DD_FORWARD_TRACES = False
+finally:
+    log.debug(f"DD_FORWARD_TRACES: {DD_FORWARD_TRACES}")
 
-# Return the boolean environment variable corresponding to envvar
-def get_bool_env_var(envvar, default):
-    return os.getenv(envvar, default=default).lower() == "true"
+def get_env_var(envvar, default, boolean=False):
+    """
+    Return the value of the given environment variable with debug logging.
+    When boolean=True, parse the value as a boolean case-insensitively.
+    """
+    value = os.getenv(envvar, default=default)
+    if boolean:
+        value = value.lower() == "true"
+    log.debug(f"{envvar}: {value}")
+    return value
 
 #####################################
 ############# PARAMETERS ############
 #####################################
@@ -97,20 +102,20 @@ def get_bool_env_var(envvar, default):
 ## Set this variable to `False` to disable log forwarding.
 ## E.g., when you only want to forward metrics from logs.
# -DD_FORWARD_LOG = get_bool_env_var("DD_FORWARD_LOG", "true") +DD_FORWARD_LOG = get_env_var("DD_FORWARD_LOG", "true", boolean=True) ## @param DD_USE_TCP - boolean - optional -default: false ## Change this value to `true` to send your logs and metrics using the TCP network client ## By default, it uses the HTTP client. # -DD_USE_TCP = get_bool_env_var("DD_USE_TCP", "false") +DD_USE_TCP = get_env_var("DD_USE_TCP", "false", boolean=True) ## @param DD_USE_COMPRESSION - boolean - optional -default: true ## Only valid when sending logs over HTTP ## Change this value to `false` to send your logs without any compression applied ## By default, compression is enabled. # -DD_USE_COMPRESSION = get_bool_env_var("DD_USE_COMPRESSION", "true") +DD_USE_COMPRESSION = get_env_var("DD_USE_COMPRESSION", "true", boolean=True) ## @param DD_USE_COMPRESSION - integer - optional -default: 6 ## Change this value to set the compression level. @@ -123,37 +128,39 @@ def get_bool_env_var(envvar, default): ## Change this value to `true` to disable SSL ## Useful when you are forwarding your logs to a proxy. # -DD_NO_SSL = get_bool_env_var("DD_NO_SSL", "false") +DD_NO_SSL = get_env_var("DD_NO_SSL", "false", boolean=True) ## @param DD_SKIP_SSL_VALIDATION - boolean - optional -default: false ## Disable SSL certificate validation when forwarding logs via HTTP. # -DD_SKIP_SSL_VALIDATION = get_bool_env_var("DD_SKIP_SSL_VALIDATION", "false") +DD_SKIP_SSL_VALIDATION = get_env_var( + "DD_SKIP_SSL_VALIDATION", "false", boolean=True +) ## @param DD_SITE - String - optional -default: datadoghq.com ## Define the Datadog Site to send your logs and metrics to. ## Set it to `datadoghq.eu` to send your logs and metrics to Datadog EU site. # -DD_SITE = os.getenv("DD_SITE", default="datadoghq.com") +DD_SITE = get_env_var("DD_SITE", default="datadoghq.com") ## @param DD_TAGS - list of comma separated strings - optional -default: none ## Pass custom tags as environment variable or through this variable. ## Ensure your tags are a comma separated list of strings with no trailing comma in the envvar! # -DD_TAGS = os.environ.get("DD_TAGS", "") +DD_TAGS = get_env_var("DD_TAGS", "") if DD_USE_TCP: - DD_URL = os.getenv("DD_URL", default="lambda-intake.logs." + DD_SITE) + DD_URL = get_env_var("DD_URL", default="lambda-intake.logs." + DD_SITE) try: if "DD_SITE" in os.environ and DD_SITE == "datadoghq.eu": - DD_PORT = int(os.getenv("DD_PORT", default="443")) + DD_PORT = int(get_env_var("DD_PORT", default="443")) else: - DD_PORT = int(os.getenv("DD_PORT", default="10516")) + DD_PORT = int(get_env_var("DD_PORT", default="10516")) except Exception: DD_PORT = 10516 else: - DD_URL = os.getenv("DD_URL", default="lambda-http-intake.logs." + DD_SITE) - DD_PORT = int(os.getenv("DD_PORT", default="443")) + DD_URL = get_env_var("DD_URL", default="lambda-http-intake.logs." 
+ DD_SITE) + DD_PORT = int(get_env_var("DD_PORT", default="443")) class ScrubbingRuleConfig(object): @@ -176,8 +183,8 @@ def __init__(self, name, pattern, placeholder): ), ScrubbingRuleConfig( "DD_SCRUBBING_RULE", - os.getenv("DD_SCRUBBING_RULE", default=None), - os.getenv("DD_SCRUBBING_RULE_REPLACEMENT", default="xxxxx"), + get_env_var("DD_SCRUBBING_RULE", default=None), + get_env_var("DD_SCRUBBING_RULE_REPLACEMENT", default="xxxxx"), ), ] @@ -202,13 +209,18 @@ def compileRegex(rule, pattern): # Filtering logs # Option to include or exclude logs based on a pattern match -INCLUDE_AT_MATCH = os.getenv("INCLUDE_AT_MATCH", default=None) +INCLUDE_AT_MATCH = get_env_var("INCLUDE_AT_MATCH", default=None) include_regex = compileRegex("INCLUDE_AT_MATCH", INCLUDE_AT_MATCH) -EXCLUDE_AT_MATCH = os.getenv("EXCLUDE_AT_MATCH", default=None) +EXCLUDE_AT_MATCH = get_env_var("EXCLUDE_AT_MATCH", default=None) exclude_regex = compileRegex("EXCLUDE_AT_MATCH", EXCLUDE_AT_MATCH) -if "DD_KMS_API_KEY" in os.environ: +if "DD_API_KEY_SECRET_ARN" in os.environ: + SECRET_ARN = os.environ["DD_API_KEY_SECRET_ARN"] + DD_API_KEY = boto3.client("secretsmanager").get_secret_value( + SecretId=SECRET_ARN + )["SecretString"] +elif "DD_KMS_API_KEY" in os.environ: ENCRYPTED = os.environ["DD_KMS_API_KEY"] DD_API_KEY = boto3.client("kms").decrypt( CiphertextBlob=base64.b64decode(ENCRYPTED) @@ -224,7 +236,7 @@ def compileRegex(rule, pattern): # DD_API_KEY must be set if DD_API_KEY == "" or DD_API_KEY == "": raise Exception( - "You must configure your Datadog API key using " "DD_KMS_API_KEY or DD_API_KEY" + "Missing Datadog API key" ) # Check if the API key is the correct number of characters if len(DD_API_KEY) != 32: @@ -245,10 +257,11 @@ def compileRegex(rule, pattern): "https://trace.agent.{}".format(DD_SITE), DD_API_KEY ) -# DD_MULTILINE_LOG_REGEX_PATTERN: Datadog Multiline Log Regular Expression Pattern -DD_MULTILINE_LOG_REGEX_PATTERN = None -if "DD_MULTILINE_LOG_REGEX_PATTERN" in os.environ: - DD_MULTILINE_LOG_REGEX_PATTERN = os.environ["DD_MULTILINE_LOG_REGEX_PATTERN"] +# DD_MULTILINE_LOG_REGEX_PATTERN: Multiline Log Regular Expression Pattern +DD_MULTILINE_LOG_REGEX_PATTERN = get_env_var( + "DD_MULTILINE_LOG_REGEX_PATTERN", default=None +) +if DD_MULTILINE_LOG_REGEX_PATTERN: try: multiline_regex = re.compile( "[\n\r\f]+(?={})".format(DD_MULTILINE_LOG_REGEX_PATTERN) @@ -269,7 +282,7 @@ def compileRegex(rule, pattern): DD_CUSTOM_TAGS = "ddtags" DD_SERVICE = "service" DD_HOST = "host" -DD_FORWARDER_VERSION = "2.4.0" +DD_FORWARDER_VERSION = "3.0.0" class RetriableException(Exception): pass @@ -533,17 +546,22 @@ def forward_logs(logs): scrubber = DatadogScrubber(SCRUBBING_RULE_CONFIGS) if DD_USE_TCP: batcher = DatadogBatcher(256 * 1000, 256 * 1000, 1) - cli = DatadogTCPClient(DD_URL, DD_PORT, DD_NO_SSL, DD_API_KEY, scrubber) + cli = DatadogTCPClient( + DD_URL, DD_PORT, DD_NO_SSL, DD_API_KEY, scrubber) else: batcher = DatadogBatcher(256 * 1000, 2 * 1000 * 1000, 200) - cli = DatadogHTTPClient(DD_URL, DD_PORT, DD_NO_SSL, DD_SKIP_SSL_VALIDATION, DD_API_KEY, scrubber) + cli = DatadogHTTPClient( + DD_URL, DD_PORT, DD_NO_SSL, + DD_SKIP_SSL_VALIDATION, DD_API_KEY, scrubber) with DatadogClient(cli) as client: for batch in batcher.batch(logs): try: client.send(batch) except Exception: - log.exception("Exception while forwarding logs in batch %s", batch) + log.exception(f"Exception while forwarding log batch {batch}") + else: + log.debug(f"Forwarded {len(batch)} logs") def parse(event, context): @@ -717,15 +735,19 @@ def 
forward_metrics(metrics):
             metric["m"], metric["v"], timestamp=metric["e"], tags=metric["t"]
         )
     except Exception:
-        log.exception("Exception while forwarding metric %s", metric)
+        log.exception(f"Exception while forwarding metric {metric}")
+    else:
+        log.debug(f"Forwarded metric: {metric}")
 
 
 def forward_traces(traces):
-    try:
-        for trace in traces:
+    for trace in traces:
+        try:
             trace_connection.send_trace(trace)
-    except Exception as e:
-        print(e)
+        except Exception:
+            log.exception(f"Exception while forwarding trace {trace}")
+        else:
+            log.debug(f"Forwarded trace: {trace}")
 
 
 # Utility functions
diff --git a/aws/logs_monitoring/log-sam-template.yaml b/aws/logs_monitoring/log-sam-template.yaml
deleted file mode 100644
index 820b825ed..000000000
--- a/aws/logs_monitoring/log-sam-template.yaml
+++ /dev/null
@@ -1,17 +0,0 @@
-AWSTemplateFormatVersion: '2010-09-09'
-Transform: AWS::Serverless-2016-10-31
-Description: Pushes logs and metrics from AWS to Datadog.
-Resources:
-  loglambdaddfunction:
-    Type: 'AWS::Serverless::Function'
-    Properties:
-      Description: Pushes logs and metrics from AWS to Datadog.
-      Handler: lambda_function.lambda_handler
-      MemorySize: 1024
-      Runtime: python2.7
-      Timeout: 120
-      Layers:
-        - !Sub 'arn:aws:lambda:${AWS::Region}:464622532012:layer:Datadog-Python27:3'
-        - !Sub 'arn:aws:lambda:${AWS::Region}:464622532012:layer:Datadog-Trace-Forwarder-Python27:1'
-
-  Type: AWS::Serverless::Function
diff --git a/aws/logs_monitoring/release.sh b/aws/logs_monitoring/release.sh
new file mode 100755
index 000000000..682c43c62
--- /dev/null
+++ b/aws/logs_monitoring/release.sh
@@ -0,0 +1,82 @@
+#!/bin/bash
+
+# Usage: ./release.sh <bucket> <version> [--private]
+
+set -e
+
+# Read the S3 bucket
+if [ -z "$1" ]; then
+    echo "Must specify an S3 bucket to publish the template"
+    exit 1
+else
+    BUCKET=$1
+fi
+
+# Get the latest code
+git checkout master
+git pull origin master
+
+# Read the current version
+CURRENT_VERSION=$(grep -o 'Version: \d\+\.\d\+\.\d\+' template.yaml | cut -d' ' -f2)
+
+# Read the desired version
+if [ -z "$2" ]; then
+    echo "Must specify a desired version number"
+    exit 1
+elif [[ ! $2 =~ [0-9]+\.[0-9]+\.[0-9]+ ]]; then
+    echo "Must use a semantic version, e.g., 3.1.4"
+    exit 1
+elif [[ ! "$2" > "$CURRENT_VERSION" ]]; then
+    echo "Must use a version greater than the current ($CURRENT_VERSION)"
+    exit 1
+else
+    VERSION=$2
+fi
+
+# Make the template private (default is public) - useful for developers
+if [[ $# -eq 3 ]] && [[ $3 = "--private" ]]; then
+    PRIVATE_TEMPLATE=true
+else
+    PRIVATE_TEMPLATE=false
+fi
+
+# Validate the template
+echo "Validating template.yaml"
+aws cloudformation validate-template --template-body file://template.yaml
+
+# Confirm to proceed
+read -p "About to bump the version from ${CURRENT_VERSION} to ${VERSION}, create a release aws-dd-forwarder-${VERSION} on GitHub and upload the template.yaml to s3://${BUCKET}/templates/${VERSION}.yaml. Continue (y/n)?" CONT
+if [ "$CONT" != "y" ]; then
+    echo "Exiting"
+    exit 1
+fi
+
+# Bump version number
+echo "Bumping the current version number to the desired version"
+perl -pi -e "s/DD_FORWARDER_VERSION = \"${CURRENT_VERSION}/DD_FORWARDER_VERSION = \"${VERSION}/g" lambda_function.py
+perl -pi -e "s/Version: ${CURRENT_VERSION}/Version: ${VERSION}/g" template.yaml
+perl -pi -e "s/templates\/${CURRENT_VERSION}/templates\/${VERSION}/g" README.md
+
+# Commit version number changes to git
+git add lambda_function.py template.yaml README.md
+git commit -m "Bump version from ${CURRENT_VERSION} to ${VERSION}"
+git push origin master
+
+# Create a GitHub release
+echo "Release aws-dd-forwarder-${VERSION} to GitHub"
+go get github.com/github/hub
+rm -f aws-dd-forwarder-*.zip
+zip -r aws-dd-forwarder-${VERSION}.zip .
+hub release create -a aws-dd-forwarder-${VERSION}.zip -m "aws-dd-forwarder-${VERSION}" aws-dd-forwarder-${VERSION}
+
+# Upload the template to the S3 bucket
+echo "Uploading template.yaml to s3://${BUCKET}/templates/${VERSION}.yaml"
+if [ "$PRIVATE_TEMPLATE" = true ] ; then
+    aws s3 cp template.yaml s3://${BUCKET}/templates/${VERSION}.yaml
+else
+    aws s3 cp template.yaml s3://${BUCKET}/templates/${VERSION}.yaml --grants read=uri=http://acs.amazonaws.com/groups/global/AllUsers
+fi
+echo "Done uploading the template, and here is the CloudFormation quick launch URL"
+echo "https://console.aws.amazon.com/cloudformation/home#/stacks/new?stackName=datadog-serverless&templateURL=https://${BUCKET}.s3.amazonaws.com/templates/${VERSION}.yaml"
+
+echo "Done!"
diff --git a/aws/logs_monitoring/template.yaml b/aws/logs_monitoring/template.yaml
new file mode 100644
index 000000000..68e138903
--- /dev/null
+++ b/aws/logs_monitoring/template.yaml
@@ -0,0 +1,528 @@
+AWSTemplateFormatVersion: '2010-09-09'
+Transform: AWS::Serverless-2016-10-31
+Description: Pushes logs, metrics and traces from AWS to Datadog.
+Mappings:
+  Constants:
+    DdForwarder:
+      Version: 3.0.0
+Parameters:
+  DdApiKey:
+    Type: String
+    NoEcho: true
+    Default: ''
+    Description: The Datadog API key, which can be found on the APIs page (/account/settings#api). It will be stored securely in AWS Secrets Manager.
+  DdSite:
+    Type: String
+    Default: datadoghq.com
+    AllowedValues:
+      - datadoghq.com
+      - datadoghq.eu
+    Description: Define your Datadog Site to send data to -- select datadoghq.com for the Datadog US site or datadoghq.eu for the Datadog EU site.
+  FunctionName:
+    Type: String
+    Default: DatadogForwarder
+    Description: The Datadog Forwarder Lambda function name. DO NOT change when updating an existing CloudFormation stack, otherwise the current forwarder function will be replaced and all the triggers will be lost.
+  MemorySize:
+    Type: Number
+    Default: 1024
+    Description: Memory size for the Datadog Forwarder Lambda function
+  Timeout:
+    Type: Number
+    Default: 120
+    Description: Timeout for the Datadog Forwarder Lambda function
+  ReservedConcurrency:
+    Type: Number
+    Default: 100
+    Description: Reserved concurrency for the Datadog Forwarder Lambda function
+  LogRetentionInDays:
+    Type: Number
+    Default: 90
+    Description: CloudWatch log retention for logs generated by the Datadog Forwarder Lambda function
+  SourceZipUrl:
+    Type: String
+    Default: ''
+    Description: DO NOT CHANGE unless you know what you are doing. Override the default location of the function source code.
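+  # Illustrative note, not required by the template (hypothetical values): the
+  # parameters in this section can also be supplied from the CLI instead of
+  # the console, e.g.,
+  #
+  #   aws cloudformation deploy \
+  #     --template-file template.yaml \
+  #     --stack-name datadog-serverless \
+  #     --capabilities CAPABILITY_IAM CAPABILITY_AUTO_EXPAND \
+  #     --parameter-overrides DdApiKey=<DD_API_KEY> DdSite=datadoghq.com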
+  DdTags:
+    Type: String
+    Default: ''
+    Description: Add custom tags to forwarded logs, comma-delimited string, no trailing comma, e.g., env:prod,stack:classic
+  DdFetchLambdaTags:
+    Type: String
+    Default: false
+    AllowedValues:
+      - true
+      - false
+    Description: Let the forwarder fetch Lambda tags using GetResources API calls and apply them to the aws.lambda.enhanced.* metrics parsed from the REPORT log. If set to true, permission tag:GetResources will be automatically added to the Lambda execution IAM role. The tags are cached in memory so that they'll only be fetched when the function cold starts or when the TTL (1 hour) expires. The forwarder increments the aws.lambda.enhanced.get_resources_api_calls metric for each API call made.
+  DdUseTcp:
+    Type: String
+    Default: false
+    AllowedValues:
+      - true
+      - false
+    Description: By default, the forwarder sends logs using HTTPS through port 443. To send logs over an SSL-encrypted TCP connection instead, set this parameter to true.
+  DdNoSsl:
+    Type: String
+    Default: false
+    AllowedValues:
+      - true
+      - false
+    Description: Disable SSL when forwarding logs; set to true when forwarding logs through a proxy.
+  DdUrl:
+    Type: String
+    Default: ''
+    Description: The endpoint URL to forward the logs to, useful for forwarding logs through a proxy
+  DdPort:
+    Type: String
+    Default: ''
+    Description: The endpoint port to forward the logs to, useful for forwarding logs through a proxy
+  DdSkipSslValidation:
+    Type: String
+    Default: false
+    AllowedValues:
+      - true
+      - false
+    Description: Send logs over HTTPS, while NOT validating the certificate provided by the endpoint. This will still encrypt the traffic between the forwarder and the log intake endpoint, but will not verify if the destination SSL certificate is valid.
+  RedactIp:
+    Type: String
+    Default: false
+    AllowedValues:
+      - true
+      - false
+    Description: Replace text matching \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} with xxx.xxx.xxx.xxx
+  RedactEmail:
+    Type: String
+    Default: false
+    AllowedValues:
+      - true
+      - false
+    Description: Replace text matching [a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+ with xxxxx@xxxxx.com
+  DdScrubbingRule:
+    Type: String
+    Default: ''
+    Description: Replace text matching the supplied regular expression with xxxxx (default) or DdScrubbingRuleReplacement (if supplied). The scrubbing rule is applied to the full JSON-formatted log, including any metadata that is automatically added by the Lambda function. Each instance of a pattern match is replaced until no more matches are found in each log.
+  DdScrubbingRuleReplacement:
+    Type: String
+    Default: ''
+    Description: Replace text matching DdScrubbingRule with the supplied text
+  ExcludeAtMatch:
+    Type: String
+    Default: ''
+    Description: DO NOT send logs matching the supplied regular expression. If a log matches both ExcludeAtMatch and IncludeAtMatch, it is excluded. Filtering rules are applied to the full JSON-formatted log, including any metadata that is automatically added by the function.
+  IncludeAtMatch:
+    Type: String
+    Default: ''
+    Description: Only send logs matching the supplied regular expression and not excluded by ExcludeAtMatch.
+  DdMultilineLogRegexPattern:
+    Type: String
+    Default: ''
+    Description: Use the supplied regular expression to detect a new log line for multiline logs from S3, e.g., use expression "\d{2}\/\d{2}\/\d{4}" for multiline logs beginning with pattern "11/10/2014".
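+  # How DdMultilineLogRegexPattern is applied, per the regex compiled in
+  # lambda_function.py in this patch: the pattern is wrapped as
+  # "[\n\r\f]+(?=<pattern>)", so a new log event is expected to start at each
+  # line matching the pattern. For example, with "\d{2}\/\d{2}\/\d{4}", an S3
+  # payload such as
+  #   11/10/2014 first event
+  #       continuation line
+  #   11/10/2014 second event
+  # is forwarded as two events, each beginning with its date stamp.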
+  DdForwardLog:
+    Type: String
+    Default: true
+    AllowedValues:
+      - true
+      - false
+    Description: Set to false to disable log forwarding, while letting the forwarder continue to send other observability data, such as metrics and traces from Lambda functions.
+  DdUseCompression:
+    Type: String
+    Default: true
+    AllowedValues:
+      - true
+      - false
+    Description: Set to false to disable log compression. Only valid when sending logs over HTTP.
+  DdCompressionLevel:
+    Type: Number
+    Default: 6
+    AllowedValues: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
+    Description: Set the compression level from 0 (no compression) to 9 (best compression).
+Conditions:
+  SetFunctionName:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: FunctionName
+          - 'DatadogForwarder'
+  SetSourceZipUrl:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: SourceZipUrl
+          - ''
+  SetDdTags:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: DdTags
+          - ''
+  SetDdUseTcp:
+    Fn::Equals:
+      - Ref: DdUseTcp
+      - true
+  SetDdNoSsl:
+    Fn::Equals:
+      - Ref: DdNoSsl
+      - true
+  SetDdUrl:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: DdUrl
+          - ''
+  SetDdPort:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: DdPort
+          - ''
+  SetRedactIp:
+    Fn::Equals:
+      - Ref: RedactIp
+      - true
+  SetRedactEmail:
+    Fn::Equals:
+      - Ref: RedactEmail
+      - true
+  SetDdScrubbingRule:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: DdScrubbingRule
+          - ''
+  SetDdScrubbingRuleReplacement:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: DdScrubbingRuleReplacement
+          - ''
+  SetExcludeAtMatch:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: ExcludeAtMatch
+          - ''
+  SetIncludeAtMatch:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: IncludeAtMatch
+          - ''
+  SetDdMultilineLogRegexPattern:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: DdMultilineLogRegexPattern
+          - ''
+  SetDdSkipSslValidation:
+    Fn::Equals:
+      - Ref: DdSkipSslValidation
+      - true
+  SetDdFetchLambdaTags:
+    Fn::Equals:
+      - Ref: DdFetchLambdaTags
+      - true
+  SetDdForwardLog:
+    Fn::Equals:
+      - Ref: DdForwardLog
+      - false
+  SetDdUseCompression:
+    Fn::Equals:
+      - Ref: DdUseCompression
+      - false
+  SetDdCompressionLevel:
+    Fn::Not:
+      - Fn::Equals:
+          - Ref: DdCompressionLevel
+          - 6
+Resources:
+  Forwarder:
+    Type: AWS::Serverless::Function
+    DependsOn: ForwarderZip
+    Properties:
+      FunctionName:
+        Fn::If:
+          - SetFunctionName
+          - Ref: FunctionName
+          - Ref: AWS::NoValue
+      Description: Pushes logs, metrics and traces from AWS to Datadog.
+ Handler: lambda_function.lambda_handler + MemorySize: + Ref: MemorySize + Runtime: python3.7 + Timeout: + Ref: Timeout + CodeUri: + Bucket: !Ref ForwarderZipsBucket + Key: + Fn::Sub: + - 'aws-dd-forwarder-${DdForwarderVersion}.zip' + - { DdForwarderVersion: !FindInMap [Constants, DdForwarder, Version] } + Layers: + - Fn::Sub: arn:aws:lambda:${AWS::Region}:464622532012:layer:Datadog-Python37:11 + - Fn::Sub: arn:aws:lambda:${AWS::Region}:464622532012:layer:Datadog-Trace-Forwarder-Python37:3 + Tags: + dd_forwarder_version: !FindInMap [Constants, DdForwarder, Version] + Environment: + Variables: + DD_API_KEY_SECRET_ARN: + Ref: DdApiKeySecret + DD_SITE: + Ref: DdSite + DD_TAGS: + Fn::If: + - SetDdTags + - Ref: DdTags + - Ref: AWS::NoValue + DD_FETCH_LAMBDA_TAGS: + Fn::If: + - SetDdFetchLambdaTags + - Ref: DdFetchLambdaTags + - Ref: AWS::NoValue + DD_USE_TCP: + Fn::If: + - SetDdUseTcp + - Ref: DdUseTcp + - Ref: AWS::NoValue + DD_NO_SSL: + Fn::If: + - SetDdNoSsl + - Ref: DdNoSsl + - Ref: AWS::NoValue + DD_URL: + Fn::If: + - SetDdUrl + - Ref: DdUrl + - Ref: AWS::NoValue + DD_PORT: + Fn::If: + - SetDdPort + - Ref: DdPort + - Ref: AWS::NoValue + REDACT_IP: + Fn::If: + - SetRedactIp + - Ref: RedactIp + - Ref: AWS::NoValue + REDACT_EMAIL: + Fn::If: + - SetRedactEmail + - Ref: RedactEmail + - Ref: AWS::NoValue + DD_SCRUBBING_RULE: + Fn::If: + - SetDdScrubbingRule + - Ref: DdScrubbingRule + - Ref: AWS::NoValue + DD_SCRUBBING_RULE_REPLACEMENT: + Fn::If: + - SetDdScrubbingRuleReplacement + - Ref: DdScrubbingRuleReplacement + - Ref: AWS::NoValue + EXCLUDE_AT_MATCH: + Fn::If: + - SetExcludeAtMatch + - Ref: ExcludeAtMatch + - Ref: AWS::NoValue + INCLUDE_AT_MATCH: + Fn::If: + - SetIncludeAtMatch + - Ref: IncludeAtMatch + - Ref: AWS::NoValue + DD_MULTILINE_LOG_REGEX_PATTERN: + Fn::If: + - SetDdMultilineLogRegexPattern + - Ref: DdMultilineLogRegexPattern + - Ref: AWS::NoValue + DD_SKIP_SSL_VALIDATION: + Fn::If: + - SetDdSkipSslValidation + - Ref: DdSkipSslValidation + - Ref: AWS::NoValue + DD_FORWARD_LOG: + Fn::If: + - SetDdForwardLog + - Ref: DdForwardLog + - Ref: AWS::NoValue + DD_USE_COMPRESSION: + Fn::If: + - SetDdUseCompression + - Ref: DdUseCompression + - Ref: AWS::NoValue + DD_COMPRESSION_LEVEL: + Fn::If: + - SetDdCompressionLevel + - Ref: DdCompressionLevel + - Ref: AWS::NoValue + ReservedConcurrentExecutions: + Ref: ReservedConcurrency + Policies: + - Version: '2012-10-17' + Statement: + - Effect: Allow + Action: + - s3:GetObject + Resource: 'arn:aws:s3:::*' + - Effect: Allow + Action: + - secretsmanager:GetSecretValue + Resource: + Ref: DdApiKeySecret + - !If + - SetDdFetchLambdaTags + - + Effect: Allow + Action: + - tag:GetResources + Resource: '*' + - Ref: AWS::NoValue + LogGroup: + Type: AWS::Logs::LogGroup + Properties: + LogGroupName: + Fn::Sub: /aws/lambda/${Forwarder} + RetentionInDays: + Ref: LogRetentionInDays + DdApiKeySecret: + Type: AWS::SecretsManager::Secret + Properties: + Description: Datadog API Key + SecretString: + Ref: DdApiKey + ForwarderZipsBucket: + Type: AWS::S3::Bucket + ForwarderZip: + Type: Custom::ForwarderZip + Properties: + ServiceToken: !GetAtt 'ForwarderZipCopier.Arn' + DestZipsBucket: !Ref 'ForwarderZipsBucket' + SourceZipUrl: + Fn::If: + - SetSourceZipUrl + - !Ref SourceZipUrl + - Fn::Sub: + - 'https://github.com/DataDog/datadog-serverless-functions/releases/download/aws-dd-forwarder-${DdForwarderVersion}/aws-dd-forwarder-${DdForwarderVersion}.zip' + - { DdForwarderVersion: !FindInMap [Constants, DdForwarder, Version] } + ForwarderZipCopier: + Type: 
AWS::Serverless::Function + Properties: + Description: Copies Datadog Forwarder zip to the destination S3 bucket + Handler: index.handler + Runtime: python3.7 + Timeout: 300 + InlineCode: | + import json + import logging + import threading + import boto3 + import urllib.request + def send_cfn_resp(event, context, response_status): + resp_body = json.dumps({ + 'Status': response_status, + 'Reason': f'See reasons in CloudWatch Logs - group: {context.log_group_name}, stream:{context.log_stream_name}', + 'PhysicalResourceId': context.log_stream_name, + 'StackId': event['StackId'], + 'RequestId': event['RequestId'], + 'LogicalResourceId': event['LogicalResourceId'], + 'Data': {} + }).encode('utf-8') + req = urllib.request.Request(url=event['ResponseURL'], data=resp_body, method='PUT') + with urllib.request.urlopen(req) as f: + logging.info(f'Sent response to CloudFormation: {f.status}, {f.reason}') + def delete_zips(bucket): + s3 = boto3.resource('s3') + bucket = s3.Bucket(bucket) + bucket.objects.all().delete() + def copy_zip(source_zip_url, dest_zips_bucket): + s3 = boto3.client('s3') + filename = source_zip_url.split('/')[-1] + with urllib.request.urlopen(source_zip_url) as data: + s3.upload_fileobj(data, dest_zips_bucket, filename) + def timeout(event, context): + logging.error('Execution is about to time out, sending failure response to CloudFormation') + send_cfn_resp(event, context, 'FAILED') + def handler(event, context): + # make sure we send a failure to CloudFormation if the function + # is going to timeout + timer = threading.Timer((context.get_remaining_time_in_millis() + / 1000.00) - 0.5, timeout, args=[event, context]) + timer.start() + logging.info(f'Received event: {json.dumps(event)}') + try: + source_zip_url = event['ResourceProperties']['SourceZipUrl'] + dest_zips_bucket = event['ResourceProperties']['DestZipsBucket'] + if event['RequestType'] == 'Delete': + delete_zips(dest_zips_bucket) + else: + copy_zip(source_zip_url, dest_zips_bucket) + except Exception as e: + logging.exception(f'Exception when copying zip from {source_zip_url} to {dest_zips_bucket}') + send_cfn_resp(event, context, 'FAILED') + else: + send_cfn_resp(event, context, 'SUCCESS') + finally: + timer.cancel() + Policies: + - Version: '2012-10-17' + Statement: + - Effect: Allow + Action: + - s3:PutObject + - s3:DeleteObject + Resource: + - Fn::Join: + - '/' + - - Fn::GetAtt: 'ForwarderZipsBucket.Arn' + - '*' + - Effect: Allow + Action: + - s3:ListBucket + Resource: + - Fn::GetAtt: 'ForwarderZipsBucket.Arn' +Outputs: + DatadogForwarderArn: + Description: Datadog Forwarder Lambda Function ARN + Value: + Fn::GetAtt: + - Forwarder + - Arn + Export: + Name: + Fn::Sub: ${AWS::StackName}-ForwarderArn +Metadata: + AWS::CloudFormation::Interface: + ParameterGroups: + - Label: + default: Required + Parameters: + - DdApiKey + - DdSite + - Label: + default: Lambda Function (Optional) + Parameters: + - FunctionName + - MemorySize + - Timeout + - ReservedConcurrency + - LogRetentionInDays + - Label: + default: Log Forwarding (Optional) + Parameters: + - DdTags + - DdMultilineLogRegexPattern + - DdUseTcp + - DdNoSsl + - DdUrl + - DdPort + - DdSkipSslValidation + - DdUseCompression + - DdCompressionLevel + - DdForwardLog + - Label: + default: Log Scrubbing (Optional) + Parameters: + - RedactIp + - RedactEmail + - DdScrubbingRule + - DdScrubbingRuleReplacement + - Label: + default: Log Filtering (Optional) + Parameters: + - ExcludeAtMatch + - IncludeAtMatch + - Label: + default: Experimental (Optional) + Parameters: + - 
DdFetchLambdaTags + - Label: + default: Advanced (Optional) + Parameters: + - SourceZipUrl \ No newline at end of file
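For reference, the API key lookup order this patch introduces in `lambda_function.py` (Secrets Manager first, then KMS, then a plaintext environment variable) can be summarized with the following minimal sketch. This is an illustration, not the forwarder's exact code; `resolve_api_key` is a hypothetical helper name, and `boto3` is assumed to be available in the Lambda runtime:

```python
import base64
import os

import boto3


def resolve_api_key():
    """Hypothetical helper mirroring the forwarder's lookup order."""
    if "DD_API_KEY_SECRET_ARN" in os.environ:
        # New in this patch: the CloudFormation template stores DdApiKey in
        # AWS Secrets Manager and passes the secret's ARN to the function.
        return boto3.client("secretsmanager").get_secret_value(
            SecretId=os.environ["DD_API_KEY_SECRET_ARN"]
        )["SecretString"]
    if "DD_KMS_API_KEY" in os.environ:
        # Pre-existing option: a KMS-encrypted, base64-encoded key.
        return boto3.client("kms").decrypt(
            CiphertextBlob=base64.b64decode(os.environ["DD_KMS_API_KEY"])
        )["Plaintext"]
    # Fallback: a plaintext key set directly on the function.
    return os.environ.get("DD_API_KEY", "")
```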
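The scrubbing and filtering parameters (`RedactIp`, `RedactEmail`, `DdScrubbingRule`, `ExcludeAtMatch`, `IncludeAtMatch`) behave as described in the template's parameter descriptions: matched text is replaced until no match remains, and exclusion wins over inclusion. A simplified sketch of those semantics (the real forwarder applies the rules to the full JSON-formatted log, not a bare string):

```python
import re

# Patterns quoted from the RedactIp / RedactEmail parameter descriptions.
IP_RULE = (r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}", "xxx.xxx.xxx.xxx")
EMAIL_RULE = (r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+", "xxxxx@xxxxx.com")


def scrub(log_line, rules):
    # Replace every occurrence of each pattern with its placeholder.
    for pattern, placeholder in rules:
        log_line = re.sub(pattern, placeholder, log_line)
    return log_line


def should_forward(log_line, include=None, exclude=None):
    # A log matching both criteria is excluded.
    if exclude and re.search(exclude, log_line):
        return False
    if include:
        return re.search(include, log_line) is not None
    return True


print(scrub("client 10.0.0.1 user bob@example.com", [IP_RULE, EMAIL_RULE]))
# client xxx.xxx.xxx.xxx user xxxxx@xxxxx.com
print(should_forward("GET /health 200", exclude=r"/health"))
# False
```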
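The enhanced Lambda metrics mentioned in the README are parsed from the AWS REPORT log line; the actual parser lives in `enhanced_lambda_metrics.py`, which is not part of this diff. A hypothetical sketch of that kind of parsing, with a made-up request ID:

```python
import re

# Shape of the REPORT line that AWS Lambda emits for every invocation.
REPORT = (
    "REPORT RequestId: 8f5ab1b7-0c6d-4c8e-9e4c-3f1e7e2f9b7a\t"
    "Duration: 1711.87 ms\tBilled Duration: 1800 ms\t"
    "Memory Size: 1024 MB\tMax Memory Used: 153 MB"
)

PATTERN = re.compile(
    r"Duration: (?P<duration>[\d.]+) ms.*"
    r"Billed Duration: (?P<billed_duration>[\d.]+) ms.*"
    r"Max Memory Used: (?P<max_memory_used>[\d.]+) MB"
)

match = PATTERN.search(REPORT)
if match:
    # {'duration': 1711.87, 'billed_duration': 1800.0, 'max_memory_used': 153.0}
    print({name: float(value) for name, value in match.groupdict().items()})
```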