Merge branch 'master' into quentin/windows-omnibus-5.0
* master: (119 commits)
  [core] remove noisy logs (#2715)
  run multiple instances of pylint (#2716)
  [packaging] Release 5.8.5 (#2712)
  [changelog][5.8.5] Add notes on packaging changes (#2710)
  [changelog] Update 5.8.5 with python upgrade
  be more specific when logging ssh errors (#2708)
  [marathon] allow base_url with path (#2620)
  [changelog] Update 5.8.5
  add issue and pr templates
  [sdk] minor tweaks - sdk env detection, check location (#2694)
  [http_check] log exceptions 🔊
  use 0.0.0.0 as server address when non_local_traffic is passed (#2691)
  [elastic] tag stats metric with the node name 🏷 (#2696)
  [openstack] moving proxy logic to AgentCheck, for maintainability. Fixing typos.
  [changelog] 5.8.5 draft
  [tests] lower flakiness of test_no_parallelism (#2690)
  [core] don't use docker hostname if it's a EC2 one (#2661)
  [haproxy] Fix `KeyError` when an unknown status is found (#2681)
  [changelog] Fix md links
  [jenkins] Deprecate check (#2688)
  ...
degemer committed Jul 28, 2016
2 parents f003b19 + 489902e commit bde7c15
Showing 181 changed files with 4,321 additions and 2,785 deletions.
8 changes: 8 additions & 0 deletions .github/ISSUE_TEMPLATE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
This issue queue is primarily intended for tracking features, bugs and work items associated with
the dd-agent open-source project.

Prior to submitting an issue please review the following:

- [ ] [Troubleshooting](https://datadog.zendesk.com/hc/en-us/sections/200766955-Troubleshooting) section of our [Knowledge base](https://datadog.zendesk.com/hc/en-us).
- [ ] Contact our [support](http://docs.datadoghq.com/help/) and [send them your logs](https://github.com/DataDog/dd-agent/wiki/Send-logs-to-support).
- [ ] Finally, you can open a GitHub issue following this [convention](https://github.com/DataDog/dd-agent/blob/master/CONTRIBUTING.md#commits-titles) (it helps us triage).
20 changes: 20 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
*Note: Please remember to review the Datadog [Contribution Guidelines](https://github.com/DataDog/dd-agent/blob/master/CONTRIBUTING.md)
if you have not yet done so.*


### What does this PR do?

A brief description of the change being made with this pull request.

### Motivation

What inspired you to submit this pull request?

### Testing Guidelines

An overview on [testing](https://github.com/DataDog/dd-agent/blob/master/tests/README.md)
is available in our contribution guidelines.

### Additional Notes

Anything else we should know when reviewing?
1 change: 1 addition & 0 deletions .travis.yml
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ cache:
directories:
- $HOME/virtualenv/python$TRAVIS_PYTHON_VERSION.9
- vendor/cache
- $HOME/embedded

matrix:
fast_finish: true
Expand Down
117 changes: 116 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,90 @@
Changes
=======

# 5.8.5 / 07-27-2016
**Windows, Linux and Source Install**

### Details
https://github.com/DataDog/dd-agent/compare/5.8.4...5.8.5

### Future rotation of the APT signing key

In preparation for a future rotation of our package signing keys, the `5.8.5` DEB package will, on install, import a new
trusted APT key:

```
pub 4096R/382E94DE 2016-06-29 [expires: 2022-06-28]
uid Datadog, Inc <package@datadoghq.com>
```

During the package install, the DEB package will output the following:

```
Prepare Datadog Agent keys rotation
Add the new 'Datadog, Inc <package@datadoghq.com>' key to the list of APT trusted keys. ... OK
```

The signing key of the Agent hasn't changed yet but will be switched to this new key in a future release.

See [dd-agent-omnibus-81](https://github.com/DataDog/dd-agent-omnibus/pull/81)
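As a quick sanity check after upgrading, you can verify that the new key listed above shows up among the APT trusted keys. The sketch below is a hypothetical helper (not part of dd-agent) that parses `apt-key list`-style output for a short key ID; the sample output mirrors the key block quoted above.

```python
# Hypothetical helper, not shipped with dd-agent: scan `apt-key list`-style
# output for a short key ID such as "382E94DE".
def has_trusted_key(apt_key_output, key_id):
    """Return True if any `pub` line mentions the short key ID."""
    for line in apt_key_output.splitlines():
        if line.strip().startswith("pub") and key_id in line:
            return True
    return False

# Sample output matching the key block from the changelog above.
SAMPLE_OUTPUT = """\
pub   4096R/382E94DE 2016-06-29 [expires: 2022-06-28]
uid                  Datadog, Inc <package@datadoghq.com>
"""

print(has_trusted_key(SAMPLE_OUTPUT, "382E94DE"))  # True
```

In practice you would feed this the real output of `apt-key list` (e.g. via `subprocess.check_output`), which is assumed here rather than invoked.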

### Changes
* [IMPROVEMENT] Core: Upgrade embedded python to `2.7.12`. See [omnibus-software-63](https://github.com/DataDog/omnibus-software/pull/63)
* [IMPROVEMENT] Elasticsearch: Take into account node `name` for cluster stats. See [#2696][]
* [IMPROVEMENT] RPM package: Add runtime dependency on the `initscripts` package. See [dd-agent-omnibus-82](https://github.com/DataDog/dd-agent-omnibus/pull/82)
* [IMPROVEMENT] RPM package: Don't delete `dd-agent` user and group on uninstall. See [dd-agent-omnibus-84](https://github.com/DataDog/dd-agent-omnibus/pull/84)

* [BUGFIX] Core: Use flag to signal config reload to avoid race conditions. See [#2670][]
* [BUGFIX] Core: Don't use Docker hostname if it's an EC2 one. See [#2661][]
* [BUGFIX] Docker: Fix disk metrics rounding issue. See [#2626][]
* [BUGFIX] Haproxy: Fix `KeyError` when an unknown status is found. See [#2681][]
* [BUGFIX] IIS: Remove warnings on 'Name' property. See [#2633][]
* [BUGFIX] MongoDB: Fix case where optimeDate is not available. See [#2625][]
* [BUGFIX] Service Discovery: Introduce `_get_image_ident` and fix a bug that required it. See [#2684][]
* [BUGFIX] Windows Event log: Support unicode messages. See [#2660][]


# 5.8.4 / 07-08-2016
**Windows, Linux and Source Install**

### Details
https://github.com/DataDog/dd-agent/compare/5.8.3...5.8.4

### Changes
* [IMPROVEMENT] Core: Upgrades supervisord to 3.3.0. See [#2667][], [#2669][]

* [BUGFIX] MySQL: Fixes MySQL slave detection. See [#2610][]
* [BUGFIX] MySQL: Fixes MySQL replication service check. See [#2603][]
* [BUGFIX] Core: Fixes a bug that caused the thread pool to leak threads. See [#2666][]


# 5.8.3 / 07-05-2016
**Windows, Linux and Source Install**

### Details
https://github.com/DataDog/dd-agent/compare/5.8.2...5.8.3

### Changes
* [FEATURE] Flare: Adds configcheck output to flare command. See [#2588][]

* [IMPROVEMENT] Docker: Bump docker-py to 1.8.1 for network features support. See [#2556][]

* [BUGFIX] HAProxy: Add `collate_status_tags_per_host` flag. See [#2590][]
* [BUGFIX] Core: Fix a bug that prevented IPv6 from being used in some circumstances; now defaults to IPv6 and falls back to IPv4. See [#2592][]
* [BUGFIX] Docker: Handles buggy responses from the Docker API more gracefully. See [#2608][]
* [BUGFIX] MongoDB: Only collect ReplicationInfo when it's possible to do so, rather than erroring out. See [#2559][]
* [BUGFIX] Postgres: Adds a hard limit on the `postgres.table.count` metric, as collecting it can be very expensive. See [#2575][]
* [BUGFIX] PowerDNS Recursor: The configuration file needed to be renamed from `powerdns.conf` to `powerdns_recursor.conf`. See [#2538][]
* [BUGFIX] Service Discovery: Improvements for testing, logging and service variable interpolation. See [#2573][]
* [BUGFIX] Service Discovery: Use docker hostname rather than the default route to query cAdvisor and kubelet. See [#2609][]
* [BUGFIX] Service Discovery: Use get_identifier instead of buggy image name extraction. See [#2593][]
* [BUGFIX] SQLServer: Send service checks after every run, rather than only at the beginning. See [#2515][]
* [BUGFIX] vSphere: Enhances topology support, skip unknown metrics. See [#2560][]
* [BUGFIX] vSphere: Don't fail the whole check just because it failed on a single instance. See [#2548][]
* [BUGFIX] Win32: Recover gracefully when memory check collection times out, instead of erroring in the collector. See [#2553][]
* [BUGFIX] WMI: Allows user to set a provider in request data. See [#2565][], [#2369][]


# 5.8.2 / 05-24-2016
**Windows only**

Expand Down Expand Up @@ -3035,6 +3119,7 @@ https://github.com/DataDog/dd-agent/compare/2.2.9...2.2.10
[#2363]: https://github.com/DataDog/dd-agent/issues/2363
[#2366]: https://github.com/DataDog/dd-agent/issues/2366
[#2368]: https://github.com/DataDog/dd-agent/issues/2368
[#2369]: https://github.com/DataDog/dd-agent/issues/2369
[#2371]: https://github.com/DataDog/dd-agent/issues/2371
[#2372]: https://github.com/DataDog/dd-agent/issues/2372
[#2373]: https://github.com/DataDog/dd-agent/issues/2373
Expand Down Expand Up @@ -3108,9 +3193,39 @@ https://github.com/DataDog/dd-agent/compare/2.2.9...2.2.10
[#2510]: https://github.com/DataDog/dd-agent/issues/2510
[#2512]: https://github.com/DataDog/dd-agent/issues/2512
[#2514]: https://github.com/DataDog/dd-agent/issues/2514
[#2515]: https://github.com/DataDog/dd-agent/issues/2515
[#2516]: https://github.com/DataDog/dd-agent/issues/2516
[#2528]: https://github.com/DataDog/dd-agent/issues/2528
[#2535]: https://github.com/DataDog/dd-agent/issues/2535
[#2538]: https://github.com/DataDog/dd-agent/issues/2538
[#2548]: https://github.com/DataDog/dd-agent/issues/2548
[#2553]: https://github.com/DataDog/dd-agent/issues/2553
[#2556]: https://github.com/DataDog/dd-agent/issues/2556
[#2559]: https://github.com/DataDog/dd-agent/issues/2559
[#2560]: https://github.com/DataDog/dd-agent/issues/2560
[#2565]: https://github.com/DataDog/dd-agent/issues/2565
[#2573]: https://github.com/DataDog/dd-agent/issues/2573
[#2575]: https://github.com/DataDog/dd-agent/issues/2575
[#2588]: https://github.com/DataDog/dd-agent/issues/2588
[#2590]: https://github.com/DataDog/dd-agent/issues/2590
[#2592]: https://github.com/DataDog/dd-agent/issues/2592
[#2593]: https://github.com/DataDog/dd-agent/issues/2593
[#2603]: https://github.com/DataDog/dd-agent/issues/2603
[#2608]: https://github.com/DataDog/dd-agent/issues/2608
[#2609]: https://github.com/DataDog/dd-agent/issues/2609
[#2610]: https://github.com/DataDog/dd-agent/issues/2610
[#2625]: https://github.com/DataDog/dd-agent/issues/2625
[#2626]: https://github.com/DataDog/dd-agent/issues/2626
[#2633]: https://github.com/DataDog/dd-agent/issues/2633
[#2660]: https://github.com/DataDog/dd-agent/issues/2660
[#2661]: https://github.com/DataDog/dd-agent/issues/2661
[#2666]: https://github.com/DataDog/dd-agent/issues/2666
[#2667]: https://github.com/DataDog/dd-agent/issues/2667
[#2669]: https://github.com/DataDog/dd-agent/issues/2669
[#2670]: https://github.com/DataDog/dd-agent/issues/2670
[#2681]: https://github.com/DataDog/dd-agent/issues/2681
[#2684]: https://github.com/DataDog/dd-agent/issues/2684
[#2696]: https://github.com/DataDog/dd-agent/issues/2696
[#3399]: https://github.com/DataDog/dd-agent/issues/3399
[@AirbornePorcine]: https://github.com/AirbornePorcine
[@AntoCard]: https://github.com/AntoCard
Expand Down Expand Up @@ -3244,4 +3359,4 @@ https://github.com/DataDog/dd-agent/compare/2.2.9...2.2.10
[@xkrt]: https://github.com/xkrt
[@yenif]: https://github.com/yenif
[@yyamano]: https://github.com/yyamano
[@zdannar]: https://github.com/zdannar
[@zdannar]: https://github.com/zdannar
2 changes: 1 addition & 1 deletion Rakefile
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,7 @@ desc 'Setup a development environment for the Agent'
task 'setup_env' do
`mkdir -p venv`
`wget -O venv/virtualenv.py https://raw.github.com/pypa/virtualenv/1.11.6/virtualenv.py`
`python venv/virtualenv.py --no-site-packages --no-pip --no-setuptools venv/`
`python venv/virtualenv.py -p python2 --no-site-packages --no-pip --no-setuptools venv/`
`wget -O venv/ez_setup.py https://bootstrap.pypa.io/ez_setup.py`
`venv/bin/python venv/ez_setup.py --version="20.9.0"`
`wget -O venv/get-pip.py https://bootstrap.pypa.io/get-pip.py`
Expand Down
55 changes: 21 additions & 34 deletions agent.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,15 +38,15 @@
EC2,
get_hostname,
)
from utils.flare import configcheck, Flare
from utils.flare import Flare
from utils.configcheck import configcheck, sd_configcheck
from utils.jmx import jmx_command
from utils.pidfile import PidFile
from utils.platform import Platform
from utils.profile import AgentProfiler
from utils.watchdog import new_watchdog
from utils.service_discovery.configcheck import sd_configcheck
from utils.service_discovery.config_stores import get_config_store, TRACE_CONFIG
from utils.service_discovery.config_stores import get_config_store
from utils.service_discovery.sd_backend import get_sd_backend
from utils.watchdog import new_watchdog

# Constants
PID_NAME = "dd-agent"
Expand Down Expand Up @@ -77,7 +77,7 @@ def __init__(self, pidfile, autorestart, start_event=True, in_developer_mode=Fal
self._checksd = []
self.collector_profile_interval = DEFAULT_COLLECTOR_PROFILE_INTERVAL
self.check_frequency = None
self.configs_reloaded = False
self.reload_configs_flag = False
self.sd_backend = None

def _handle_sigterm(self, signum, frame):
Expand All @@ -96,14 +96,17 @@ def _handle_sigusr1(self, signum, frame):

def _handle_sighup(self, signum, frame):
"""Handles SIGHUP, which signals a configuration reload."""
log.info("SIGHUP caught!")
self.reload_configs()
self.configs_reloaded = True
log.info("SIGHUP caught! Scheduling configuration reload before next collection run.")
self.reload_configs_flag = True

def reload_configs(self):
"""Reloads the agent configuration and checksd configurations."""
log.info("Attempting a configuration reload...")

# Stop checks
for check in self._checksd.get('initialized_checks', []):
check.stop()

# Reload checksd configs
hostname = get_hostname(self._agentConfig)
self._checksd = load_check_directory(self._agentConfig, hostname)
Expand Down Expand Up @@ -178,8 +181,6 @@ def run(self, config=None):

# Run the main loop.
while self.run_forever:
log.debug("Found {num_checks} checks".format(num_checks=len(self._checksd['initialized_checks'])))

# Setup profiling if necessary
if self.in_developer_mode and not profiled:
try:
Expand All @@ -189,16 +190,16 @@ def run(self, config=None):
except Exception as e:
log.warn("Cannot enable profiler: %s" % str(e))

# Do the work.
if self.reload_configs_flag:
self.reload_configs()

# Do the work. Pass `configs_reloaded` to let the collector know if it needs to
# look for the AgentMetrics check and pop it out.
self.collector.run(checksd=self._checksd,
start_event=self.start_event,
configs_reloaded=self.configs_reloaded)
configs_reloaded=self.reload_configs_flag)

# This flag is used to know if the check configs have been reloaded at the current
# run of the agent yet or not. It's used by the collector to know if it needs to
# look for the AgentMetrics check and pop it out.
# See: https://github.com/DataDog/dd-agent/blob/5.6.x/checks/collector.py#L265-L272
self.configs_reloaded = False
self.reload_configs_flag = False

# Look for change in the config template store.
# The self.sd_backend.reload_check_configs flag is set
Expand All @@ -216,8 +217,7 @@ def run(self, config=None):
# using ConfigStore.crawl_config_template
if self._agentConfig.get('service_discovery') and self.sd_backend and \
self.sd_backend.reload_check_configs:
self.reload_configs()
self.configs_reloaded = True
self.reload_configs_flag = True
self.sd_backend.reload_check_configs = False

if profiled:
Expand Down Expand Up @@ -352,7 +352,6 @@ def main():
return Agent.info(verbose=options.verbose)

elif 'foreground' == command:
logging.info('Running in foreground')
if autorestart:
# Set-up the supervisor callbacks and fork it.
logging.info('Running Agent with auto-restart ON')
Expand Down Expand Up @@ -403,19 +402,7 @@ def parent_func():

elif 'configcheck' == command or 'configtest' == command:
configcheck()

if agentConfig.get('service_discovery', False):
# set the TRACE_CONFIG flag to True to make load_check_directory return
# the source of config objects.
# Then call load_check_directory here and pass the result to sd_configcheck
# to avoid circular imports
agentConfig[TRACE_CONFIG] = True
configs = {
# check_name: (config_source, config)
}
print("\nLoading check configurations...\n\n")
configs = load_check_directory(agentConfig, hostname)
sd_configcheck(agentConfig, configs)
sd_configcheck(agentConfig)

elif 'jmx' == command:
jmx_command(args[1:], agentConfig)
Expand All @@ -427,7 +414,7 @@ def parent_func():
f.collect()
try:
f.upload()
except Exception, e:
except Exception as e:
print 'The upload failed:\n{0}'.format(str(e))

return 0
Expand Down
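The agent.py diff above replaces an immediate `reload_configs()` call inside the SIGHUP handler with a `reload_configs_flag` that the main loop checks before each collection run, so a reload can no longer race with a run that is still using the old check objects. A minimal sketch of that pattern, with illustrative names rather than dd-agent's actual classes:

```python
import signal

# Minimal sketch of the flag-based reload pattern from the diff above.
# MiniAgent and its methods are illustrative, not dd-agent's actual API.
class MiniAgent(object):
    def __init__(self):
        self.reload_configs_flag = False
        self.reloads = 0

    def _handle_sighup(self, signum, frame):
        # Only set a flag here: reloading inside a signal handler could
        # race with a collection run using the old check objects.
        self.reload_configs_flag = True

    def reload_configs(self):
        self.reloads += 1

    def run_once(self):
        # Reload, if requested, before starting the next collection run.
        if self.reload_configs_flag:
            self.reload_configs()
        configs_reloaded = self.reload_configs_flag
        self.reload_configs_flag = False
        return configs_reloaded

agent = MiniAgent()
agent._handle_sighup(signal.SIGHUP, None)
print(agent.run_once())  # True: configs reloaded for this run
print(agent.run_once())  # False: flag was cleared
```

The returned `configs_reloaded` value mirrors what the real loop passes to `self.collector.run(...)` so the collector knows whether to pop the AgentMetrics check.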
8 changes: 5 additions & 3 deletions appveyor.yml
Original file line number Diff line number Diff line change
Expand Up @@ -5,9 +5,9 @@ environment:
INTEGRATIONS_DIR: c:\projects\dd-agent\embedded
PIP_CACHE: c:\projects\dd-agent\.cache\pip
VOLATILE_DIR: c:\projects
NOSE_FILTER: windows
FLAVORS: windows
NOSE_FILTER: not unix
PYWIN_PATH: C:\projects\dd-agent\.cache\pywin32-py2.7.exe
SKIP_LINT: true
matrix:
- PYTHON: C:\\Python27
PYTHON_VERSION: 2.7.9
Expand All @@ -31,4 +31,6 @@ install:
build: off
test_script:
- set PATH=%PYTHON%;%PYTHON%\Scripts;%PATH%
- bundle exec rake ci:run[%FLAVORS%]
- bundle exec rake ci:run[default]
- bundle exec rake ci:run[core_integration]
- bundle exec rake ci:run[windows]
2 changes: 1 addition & 1 deletion checks.d/agent_metrics.py
Original file line number Diff line number Diff line change
Expand Up @@ -151,6 +151,6 @@ def check(self, instance):
self.gauge('datadog.agent.collector.cpu.used', cpu_used_pct)
self.log.info("CPU consumed (%%) is high: %.1f, metrics count: %d, events count: %d",
cpu_used_pct, len(payload['metrics']), len(payload['events']))
except Exception, e:
except Exception as e:
self.log.debug("Couldn't compute cpu used by collector with values %s %s %s",
cpu_time, collection_time, str(e))
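Several diffs in this commit (agent.py, agent_metrics.py, ceph.py, directory.py, disk.py) change `except Exception, e:` to `except Exception as e:`. The comma form is Python 2-only; the `as` form is accepted by Python 2.6+ and is the only valid syntax on Python 3. A small illustration (the `safe_div` helper is made up for the example):

```python
# The `except ExcType as name` form used throughout this commit binds the
# exception instance just like the old comma syntax, but is also valid
# on Python 3, where `except ExcType, name` is a SyntaxError.
def safe_div(a, b):
    try:
        return a / b
    except ZeroDivisionError as e:
        # `e` is the exception instance, exactly as with the comma form.
        return "error: %s" % e

print(safe_div(10, 2))
print(safe_div(1, 0))
```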
4 changes: 4 additions & 0 deletions checks.d/apache.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,10 @@ class Apache(AgentCheck):
'Uptime': 'apache.performance.uptime',
'Total kBytes': 'apache.net.bytes',
'Total Accesses': 'apache.net.hits',
'ConnsTotal': 'apache.conns_total',
'ConnsAsyncWriting': 'apache.conns_async_writing',
'ConnsAsyncKeepAlive': 'apache.conns_async_keep_alive',
'ConnsAsyncClosing' : 'apache.conns_async_closing'
}

RATES = {
Expand Down
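The apache.py diff above adds four connection counters from Apache mod_status to the check's `GAUGES` mapping. The sketch below shows how such a mapping could translate a `mod_status ?auto`-style payload into metric names; the `parse_status` helper is a stand-in for illustration, not the real AgentCheck plumbing:

```python
# The four gauges added by this commit, mapping mod_status fields to
# Datadog metric names.
GAUGES = {
    'ConnsTotal': 'apache.conns_total',
    'ConnsAsyncWriting': 'apache.conns_async_writing',
    'ConnsAsyncKeepAlive': 'apache.conns_async_keep_alive',
    'ConnsAsyncClosing': 'apache.conns_async_closing',
}

def parse_status(text):
    """Parse 'Key: value' lines as emitted by mod_status ?auto,
    keeping only the fields covered by GAUGES (illustrative helper)."""
    metrics = {}
    for line in text.splitlines():
        key, _, value = line.partition(': ')
        if key in GAUGES and value:
            metrics[GAUGES[key]] = float(value)
    return metrics

sample = "ConnsTotal: 12\nConnsAsyncWriting: 1\nUptime: 500"
print(parse_status(sample))
```

Fields not in the mapping (like `Uptime` here, which the real check handles separately) are simply skipped.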
4 changes: 2 additions & 2 deletions checks.d/ceph.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ def _collect_raw(self, ceph_cmd, instance):
args = ceph_args + ['version']
try:
output,_,_ = get_subprocess_output(args, self.log)
except Exception, e:
except Exception as e:
raise Exception('Unable to run cmd=%s: %s' % (' '.join(args), str(e)))

raw = {}
Expand All @@ -46,7 +46,7 @@ def _collect_raw(self, ceph_cmd, instance):
args = ceph_args + cmd.split() + ['-fjson']
output,_,_ = get_subprocess_output(args, self.log)
res = json.loads(output)
except Exception, e:
except Exception as e:
self.log.warning('Unable to parse data from cmd=%s: %s' % (cmd, str(e)))
continue

Expand Down
2 changes: 1 addition & 1 deletion checks.d/directory.py
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ def _get_stats(self, directory, name, dirtagname, filetagname, filegauges, patte
try:
file_stat = stat(filename)

except OSError, ose:
except OSError as ose:
self.warning("DirectoryCheck: could not stat file %s - %s" % (filename, ose))
else:
# file specific metrics
Expand Down
2 changes: 1 addition & 1 deletion checks.d/disk.py
Original file line number Diff line number Diff line change
Expand Up @@ -94,7 +94,7 @@ def collect_metrics_psutil(self):
# Get disk metrics here to be able to exclude on total usage
try:
disk_usage = psutil.disk_usage(part.mountpoint)
except Exception, e:
except Exception as e:
self.log.debug("Unable to get disk metrics for %s: %s",
part.mountpoint, e)
continue
Expand Down
