This repository has been archived by the owner on May 13, 2021. It is now read-only.

initial hackery
added info

move api key to defaults

Add real username

Change order

install latest datadog-agent release

These changes are required to install the latest 5.0 release of the agent that is not affected by POODLE.
I had to switch to `stable`. I'm not sure why the latest release is not on `unstable`.

Initial Redhat Support

fixed apt usage in redhat

add release-appropriate psutil

tested on centos 5.x

psutil is included with the 'omnibus' install of the agent

Add tags to the datadog config file

updated to include process checks

updated process checks to align correctly; updated readme with new process check feature

Parameterize the use_mount setting in datadog.conf

make datadog config fully parameterized

quote variables

Add template & rendering for datadog /etc/conf.d/*.yaml files

Add info about "datadog_checks" and "datadog_use_mount" to README.md

Add info about "datadog_config" to README.md

README fixes; Include null default for datadog_checks

default value for datadog_checks needs to be an empty dict

Only create /etc/dd-agent/conf.d/process.yaml when datadog_process_checks is defined
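A guard like this can be sketched with a `when` condition (the task name, template path, and handler are illustrative, not necessarily the role's exact task):

```yaml
- name: Create process check configuration
  template:
    src: process.yaml.j2
    dest: /etc/dd-agent/conf.d/process.yaml
  # Only render the file when the variable has actually been defined
  when: datadog_process_checks is defined
  notify: restart datadog-agent
```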

Adding enabled flag for Packer builds, preprod envs, etc

Deprecate process check separate handling

The common `datadog_checks` interface should be used instead

Add more supported OS versions to metadata

Factorize datadog.conf creation

[olivier.vielpeau@datadoghq.com] Also factorized the service task

Remove need to redefine default vars in playbook

`datadog_config` does not require `api_key`, `dd_url` and `use_mount` anymore,
the values of respectively `datadog_api_key`, `datadog_url` and `datadog_use_mount`
are used if defined.
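One way to sketch this fallback is to merge the standalone variables into `datadog_config` before templating, so that explicit `datadog_config` entries win (illustrative only: `combine` requires Ansible 2.x, and the role's actual template may instead apply Jinja2 `default` filters inline):

```yaml
# Hypothetical task: build the final config dict with fallbacks
- set_fact:
    _datadog_config: "{{ {'api_key': datadog_api_key,
                          'dd_url': datadog_url,
                          'use_mount': datadog_use_mount}
                         | combine(datadog_config) }}"
```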

Template YUM repo to support different archs

[olivier.vielpeau@datadoghq.com] Use `template` task instead of `copy`

Delete unused vars/

Prepare ansible galaxy release

Update README and metadata

Fix markdown formatting of author info in README

[readme] Mention that the role needs sudo rights

Related to #3

[apt] Add apt-transport-https to the debian install

In preparation for HTTPS repo.

Changed keyserver to use port 80

Changed the keyserver to use the supported port 80 instead of the
default 11371, for use on networks that block or firewall non-standard
ports. When specifying a port number, you need to prefix the URL with
`hkp://`.
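Such a task might look like the following sketch (the key id and keyserver hostname are placeholders, not the actual Datadog values):

```yaml
- name: Add Datadog apt key via a keyserver on port 80
  apt_key:
    id: "<DATADOG_APT_KEY_ID>"  # placeholder; use the real signing key id
    keyserver: "hkp://keyserver.ubuntu.com:80"  # hkp:// prefix required with a port
```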

[yum] Check signature of the agent's RPM package

For Datadog Agent 5.5.0, the RPM package is signed with a GPG key. Let's
check the validity of this key when installing the RPM package with
Ansible.

It also enables the use of HTTPS on our yum repository.
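The role itself templates a `.repo` file, but an equivalent definition using the `yum_repository` module (Ansible ≥ 2.1) might look like this sketch; treat the URLs as illustrative:

```yaml
- name: Configure the Datadog yum repository with GPG checking
  yum_repository:
    name: datadog
    description: Datadog, Inc.
    baseurl: "https://yum.datadoghq.com/rpm/{{ ansible_architecture }}/"
    gpgcheck: yes  # verify the RPM signature at install time
    gpgkey: "https://yum.datadoghq.com/DATADOG_RPM_KEY.public"
```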

Adds explicit file permissions

This came about due to issues in the environment in which I work.
We are bound by strict CIS rules, which caused the files dropped by
Ansible to have a mode of 0600 and be owned by user root, preventing the
`dd-agent` user from reading its own config files.

This PR simply makes that user own the files so that it can read them,
allowing the daemon to start.

Note that this PR assumes the user is dd-agent. While this is true for
EL, I'm unaware of what the user is for Debian packages.
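With the `datadog_user`/`datadog_group` defaults this role defines, the fix can be sketched as explicit ownership on the template task (the mode and paths are illustrative):

```yaml
- name: Install datadog.conf with explicit ownership
  template:
    src: datadog.conf.j2
    dest: /etc/dd-agent/datadog.conf
    owner: "{{ datadog_user }}"   # defaults to dd-agent
    group: "{{ datadog_group }}"  # left as root for Debian compatibility
    mode: 0644
```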

Sets default user to root

* Uses an override for the Red Hat package

Sets default user to dd-agent

* Per recommendation, set the default user to dd-agent
* I left the group as root, as changing it would fail on the Debian package
* Removes the override, allowing end users to customize as needed. I simply overthought the solution anyway.

Add a link to the role's Ansible Galaxy page in the README

Add Galaxy install to README

Closes #17

Avoids bare variables, which are deprecated

[readme] Use `become: yes` instead of `sudo: yes`

`become: yes` is preferred since Ansible 1.9

[meta] Set min Ansible version to 1.6

Since we're using `keyserver` on `apt_key`

[meta] Add Xenial to supported platforms

Add Changelog, and prepare initial release

Allow apt repo settings to be set by playbook user

To enable deploys using a local apt mirror.

New variables:

 * datadog_apt_repo
 * datadog_apt_key_url
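In a playbook these might be used like the following sketch (the mirror hostname and key filename are made up):

```yaml
vars:
  # Point the role at a local apt mirror instead of apt.datadoghq.com
  datadog_apt_repo: "deb http://mirror.example.com/datadog/ stable main"
  datadog_apt_key_url: "http://mirror.example.com/DATADOG_APT_KEY.public"
```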

[changelog] Prepare 1.1.0 release

Adding datadog tracing to ansible-datadog

Updating config files

Adding ability to add check agents to main.yml

Adding datadog config checks to datadog role

Adding hadoop and mysql steps to datadog

Fixing snake-bit issue

updating repo to default to disabled

Enable repo for datadog trace agent

Adding monitoring acceptance tests

Adding mysql

Removing datadog tracing since it's bundled with latest dd-agent

Adding NTP as a default datadog check

Revert "Adding NTP as a default datadog check"

Make sure dd-trace-agent is uninstalled before installing datadog agent

Fixing - making yum instead of service

adding ssl cert expiry check files

bakins authored and Tracey McEvoy committed Mar 29, 2017
0 parents commit da1274b
Showing 18 changed files with 901 additions and 0 deletions.
14 changes: 14 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,14 @@
CHANGELOG
=========

# 1.1.0 / 2016-06-27

* [FEATURE] Allow APT repo settings to be user-defined. See [#20][] (thanks [@geoffwright][])

# 1.0.0 / 2016-06-08

Initial release, compatible with Ansible v1 & v2

[#20]: https://github.com/DataDog/ansible-datadog/issues/20

[@geoffwright]: https://github.com/geoffwright
90 changes: 90 additions & 0 deletions README.md
@@ -0,0 +1,90 @@
Ansible Datadog Role
========
[![Ansible Galaxy](http://img.shields.io/badge/galaxy-Datadog.datadog-660198.svg)](https://galaxy.ansible.com/Datadog/datadog/)

Install and configure Datadog base agent & checks.

Installation
------------

```
ansible-galaxy install Datadog.datadog
```

Role Variables
--------------

- `datadog_api_key` - Your Datadog API key.
- `datadog_checks` - YAML configuration for agent checks to drop into `/etc/dd-agent/conf.d`.
- `datadog_config` - Settings to place in `/etc/dd-agent/datadog.conf`.
- `datadog_process_checks` - Array of process checks and options (DEPRECATED: use `process` under
`datadog_checks` instead)
- `datadog_apt_repo` - Override default Datadog `apt` repository
- `datadog_apt_key_url` - Override default url to Datadog `apt` key

Dependencies
------------
None

Example Playbooks
-------------------------
```
- hosts: servers
  roles:
    - { role: Datadog.datadog, become: yes }  # On Ansible < 1.9, use `sudo: yes` instead of `become: yes`
  vars:
    datadog_api_key: "123456"
    datadog_config:
      tags: "mytag0, mytag1"
      log_level: INFO
    datadog_checks:
      process:
        init_config:
        instances:
          - name: ssh
            search_string: ['ssh', 'sshd']
          - name: syslog
            search_string: ['rsyslog']
            cpu_check_interval: 0.2
            exact_match: true
            ignore_denied_access: true
      ssh_check:
        init_config:
        instances:
          - host: localhost
            port: 22
            username: root
            password: changeme
            sftp_check: True
            private_key_file:
            add_missing_keys: True
      nginx:
        init_config:
        instances:
          - nginx_status_url: http://example.com/nginx_status/
            tags:
              - instance:foo
          - nginx_status_url: http://example2.com:1234/nginx_status/
            tags:
              - instance:bar
```

```
- hosts: servers
  roles:
    - { role: Datadog.datadog, become: yes, datadog_api_key: "mykey" }  # On Ansible < 1.9, use `sudo: yes` instead of `become: yes`
```

License
-------

Apache2

Author Information
------------------

brian@akins.org

dustinjamesbrown@gmail.com --Forked from brian@akins.org

Datadog <info@datadoghq.com> --Forked from dustinjamesbrown@gmail.com
33 changes: 33 additions & 0 deletions defaults/main.yml
@@ -0,0 +1,33 @@
---
datadog_enabled: yes
datadog_api_key: "youshouldsetthis"

# Comma separated list of tags
datadog_tags: ""

datadog_url: "https://app.datadoghq.com"
datadog_use_mount: "no"

# default datadog.conf options
datadog_config: {}

# default checks enabled
datadog_checks: {}

# default check agents enabled
datadog_check_agents: {}

# default user/group
datadog_user: dd-agent
datadog_group: root

# default apt repo
datadog_apt_repo: "deb http://apt.datadoghq.com/ stable main"

datadog_mysql_host: localhost
datadog_mysql_user: datadog
datadog_mysql_password: ThisNeedsToBeChangedViaVault
datadog_mysql_replication_enabled: True
datadog_mysql_extra_status_enabled: True
datadog_mysql_extra_innodb_enabled: True
datadog_mysql_extra_performance_enabled: True
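The MySQL variables above are presumably consumed by a check template; a hypothetical `conf.d/mysql.yaml` rendered from them might look like the following (the option names follow the Datadog MySQL check, but treat the whole mapping as an assumption, not the role's actual template):

```yaml
init_config:

instances:
  - server: "{{ datadog_mysql_host }}"
    user: "{{ datadog_mysql_user }}"
    pass: "{{ datadog_mysql_password }}"
    options:
      replication: "{{ datadog_mysql_replication_enabled }}"
      extra_status_metrics: "{{ datadog_mysql_extra_status_enabled }}"
      extra_innodb_metrics: "{{ datadog_mysql_extra_innodb_enabled }}"
      extra_performance_metrics: "{{ datadog_mysql_extra_performance_enabled }}"
```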
217 changes: 217 additions & 0 deletions files/nutcracker.py
@@ -0,0 +1,217 @@
"""
To test this, run 'sudo -u dd-agent dd-agent check nutcracker'
When ready:
- place this file in /etc/dd-agent/checks.d/nutcracker.py
- put the config file in /etc/dd-agent/conf.d/nutcracker.yaml
- service datadog-agent restart
"""

import hashlib
import json
import md5
import memcache
import os
import socket
import sys
import time
import uuid

from checks import AgentCheck

class NutcrackerCheck(AgentCheck):
SOURCE_TYPE_NAME = 'nutcracker'
SERVICE_CHECK = 'nutcracker.can_connect'

DEFAULT_HOST = '127.0.0.1'
DEFAULT_PORT = 11211
DEFAULT_STATS_PORT = 22222

# Pool stats. These descriptions are from 'nutcracker --describe-stats'
POOL_STATS = [
['curr_connections', 'gauge', None], # Number of current connections
['total_connections', 'rate', None], # Running total connections made
['server_ejects', 'rate', None], # times a backend server was ejected
['client_err', 'rate', None], # errors on client connections
]

# Server stats. These descriptions are from 'nutcracker --describe-stats'
SERVER_STATS = [
['server_eof', 'rate', None], # eof on server connections
['server_err', 'rate', None], # errors on server connections
['server_timedout', 'rate', 'timedout'], # timeouts on server connections
['server_connections', 'gauge', 'connections'], # active server connections
['requests', 'rate', None], # requests
['request_bytes', 'rate', None], # total request bytes
['responses', 'rate', None], # responses
['response_bytes', 'rate', None], # total response bytes
['in_queue', 'gauge', None], # requests in incoming queue
['in_queue_bytes', 'gauge', None], # current request bytes in incoming queue
['out_queue', 'gauge', None], # requests in outgoing queue
['out_queue_bytes', 'gauge', None], # current request bytes in outgoing queue
]

def _get_raw_stats(self, host, stats_port):
# Connect
self.log.debug("Connecting to %s:%s", host, stats_port)
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((host, stats_port))

# Read
file = s.makefile('r')
data = file.readline();
s.close()

# Load
return json.loads(data);

def _send_datadog_stat(self, item, data, tag_map, prefix):
# Break out the info
stat_key, stat_type, override_name = item

# Make sure we have a name
if not override_name:
override_name = stat_key

# Add the prefix if appropriate.
if prefix:
override_name = prefix + "_" + override_name

try:
# Get the data, make sure it's there.
stat_data = float(data.get(stat_key))
except:
# Hrm, not there. Let it be zero.
stat_data = 0

# Make the datadog metric.
metric = self.normalize(override_name.lower(), self.SOURCE_TYPE_NAME)

tags = [k+":"+v for k,v in tag_map.iteritems()]

if stat_type == 'gauge':
self.gauge(metric, stat_data, tags=tags)
return

if stat_type == 'rate':
metric += "_rate"
self.rate(metric, stat_data, tags=tags)
return

if stat_type == 'bool':
self.gauge(metric, (1 if stat_data else 0), tags=tags)
return

raise Exception("Unknown datadog stat type '%s' for key '%s'" % (stat_type, stat_key))

def _get_metrics(self, host, port, stats_port, tags, aggregation_key):
try:
raw_stats = self._get_raw_stats(host, stats_port)
except Exception as e:
self.service_check(self.SERVICE_CHECK, AgentCheck.CRITICAL)
self.event({
'timestamp': int(time.time()),
'event_type': 'get_stats',
'msg_title': 'Cannot get stats',
'msg_text': str(e),
'aggregation_key': aggregation_key
})

raise


# pprint.pprint(raw_stats)

# Get all the pool stats
for pool_key, pool_data in raw_stats.iteritems():
try:
# Pools are not separated from the other keys, blarg.
# Just check if it's a dict with one of the pool keys, if not then skip it.
pool_data['client_connections']
except:
# Not there, it's not a pool.
self.log.debug(pool_key + ": NOT A POOL");
continue

# Start the stat tags.
tags['nutcracker_pool'] = pool_key

# It's a pool. Process all the non-server stats
for item in self.POOL_STATS:
self._send_datadog_stat(item, pool_data, tags, "pool")

# Find all the servers.
for server_key, server_data in pool_data.iteritems():
try:
# Servers are not separated from the other keys, blarg.
# Just check if it's a dict with one of the server keys, if not then skip it.
server_data['in_queue_bytes']
except:
# Not there, it's not a server.
self.log.debug(server_key + ": NOT A SERVER");
continue

# Set the server in the tags.
tags['nutcracker_pool_server'] = server_key

# It's a server. Send stats.
for item in self.SERVER_STATS:
self._send_datadog_stat(item, server_data, tags, "server")

# The key for our roundtrip tests.
key = uuid.uuid4().hex

try:
# Make the connection and do a round trip.
mc = memcache.Client([host+':'+str(port)], debug=0)

mc.set(key, key)
data = mc.get(key)
mc.delete(key)
empty_data = mc.get(key)

# Did the get work?
if data != key:
raise Exception("Cannot set and get")

# Did the delete work?
if empty_data:
raise Exception("Cannot delete")

except Exception as e:
# Something failed.
metric = self.normalize("test_connect_fail", self.SOURCE_TYPE_NAME)
self.gauge(metric, 1, tags=tags)

self.service_check(self.SERVICE_CHECK, AgentCheck.CRITICAL)
self.event({
'timestamp': int(time.time()),
'event_type': 'test_data',
'msg_title': 'Cannot get/set/delete',
'msg_text': str(e),
'aggregation_key': aggregation_key
})

raise

# Connection is ok.
self.service_check(self.SERVICE_CHECK, AgentCheck.OK)

# Called by datadog as the starting point for this check.
def check(self, instance):
host = instance.get('host', self.DEFAULT_HOST)
port = int(instance.get('port', self.DEFAULT_PORT))
stats_port = int(instance.get('stats_port', self.DEFAULT_STATS_PORT))

tags = {}
for item in instance.get('tags', []):
k, v = item.split(":", 1)
tags[k] = v

tags["host"] = host + ":" + str(port)

aggregation_key = hashlib.md5(host+":"+str(port)).hexdigest()

self._get_metrics(host, port, stats_port, tags, aggregation_key)
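The docstring above mentions a config file in `/etc/dd-agent/conf.d/`; a minimal `nutcracker.yaml` matching the keys this check reads (`host`, `port`, `stats_port`, `tags`) might look like the following sketch, with hypothetical values:

```yaml
init_config:

instances:
  - host: 127.0.0.1
    port: 11211
    stats_port: 22222
    tags:
      - env:dev  # illustrative tag, in the key:value form the check splits on
```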
