netdatabot released this Jul 8, 2019 · 82 commits to master since this release

Release v1.16.0 contains 40 bug fixes, 31 improvements and 20 documentation updates

At a glance

Binary distributions. To improve the security, speed and reliability of new netdata installations, we are delivering our own industry-standard installation method, based on binary package distributions. The RPM binaries for the most common OSs are already available on packagecloud and we’ll have the DEB ones available very soon. All distributions are considered to be in Beta and, as always, we depend on our amazing community for feedback on improvements.

Netdata now supports SSL encryption! You can secure the communication to the web server, the streaming connections from slaves to the master and the connection to an openTSDB backend.

This version also brings two long-awaited features to netdata’s health monitoring:

  • The health management API introduced in v1.12 allowed you to easily disable alarms and/or notifications while netdata was running. However, those changes were not persisted across netdata restarts. Since part of routine maintenance activities may involve completely restarting a monitoring node, netdata now saves these configurations to disk, every time you issue a command to change the silencer settings. The new LIST command of the API allows you to view at any time which alarms are currently disabled or silenced.
  • A way for netdata to repeatedly send alarm notifications for some, or all, active alarms, at a frequency of your choosing. As a result, you will no longer have to worry about missing a notification or forgetting about a raised alarm. The default is still to send only a single notification, so that existing users are not surprised by a change in behavior.
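
As a rough illustration of the LIST command mentioned above, the Python sketch below builds the request a client might send to a running agent. The endpoint path /api/v1/manage/health, the cmd query parameter and the X-Auth-Token header are taken from the health management API documentation. The host, port and token value are placeholders, and health_command_request is a hypothetical helper, not part of netdata. Treat it as a sketch and check it against the documentation for your version.

```python
from urllib.parse import quote


def health_command_request(base_url, cmd, token):
    """Return (url, headers) for a health management API command.

    Hypothetical helper for illustration only. It builds the request
    but does not send it, so it can be inspected without a running agent.
    """
    # Percent-encode the command so that spaces in a command are safe in a URL.
    url = f"{base_url.rstrip('/')}/api/v1/manage/health?cmd={quote(cmd)}"
    # The API expects the authorization token in this header.
    headers = {"X-Auth-Token": token}
    return url, headers


# Placeholder host and token (assumptions, replace with your own values).
url, headers = health_command_request("http://localhost:19999", "LIST", "my-token")
print(url)
print(headers)
```

The returned URL and headers can be passed to any HTTP client to see which alarms are currently disabled or silenced.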

As always, we’ve introduced new collectors, 5 of them this time.

  • Of special interest to people with Windows servers in their infrastructure is the WMI collector, though we are fully aware that we need to continue our efforts to do a proper port to Windows.
  • The new perf plugin collects system-wide CPU performance statistics from Performance Monitoring Units (PMU) using the perf_event_open() system call. You can read a wonderful article on why this is useful here.
  • The other three are collectors to monitor Dnsmasq DHCP leases, Riak KV servers and Pihole instances.

Finally, the DB Engine introduced in v1.15.0 now uses much less memory and is more robust than before.

Acknowledgements

As you’ll see in the detailed list below, once again we’ve had great help from our contributors.

  • Steve8291 was helping everywhere
  • apardyl added useful new alarms and helped with documentation
  • jchristgit wrote the Riak KV collector
  • Saruspete made improvements to the freeipmi plugin
  • kam1kaze has added new charts to the python mysql collector
  • akwan and mbarper improved the application monitoring, with new process groupings
  • nodiscc helped with bug and documentation fixes
  • dankohn helped with the documentation
  • andvgal added an amazing configuration to help us run proper lint checks on our markdown files
  • octomike, Danamir, mbarper, Wing924, n0coast and toofar delivered bug fixes
  • josecv helped improve the Kubernetes helm chart.

We can't stress enough the immense help we get just from users creating an issue in GitHub, helping us identify the root cause and validate the change in their infrastructure. Unfortunately, we are not able to list all of them here, but their contribution is invaluable.

Improvements

Binary packages

Health

Security

New collectors

  • Go.d collector modules for WMI, Dnsmasq DHCP leases (https://github.com/netdata/go.d.plugin/tree/master/modules/dnsmasq_dhcp) and Pihole (ilyam8)
  • Riak KV instances collector #6286 (jchristgit)
  • CPU performance statistics using Performance Monitoring Units (PMU) via the perf_event_open() system call. (perf plugin) #6225 (vlvkobal)

Collector improvements

  • Handle different sensor IDs for the same element in the freeipmi plugin #6296 (Saruspete)
  • Increase the cpu_limit chart precision in cgroup plugin #6172 (vlvkobal)
  • Added userstats and deadlocks charts to the python mysql collector #6118 #6115 (kam1kaze)
  • Add perforce server process monitoring to the apps plugin #6064 (akwan)

Backends

DB engine improvements

  • Reduced memory requirements by 40-50% #6134 (mfundul)
  • Reduced the number of pages needed to be stored and indexed when using memory mode = dbengine, by adding empty page detection #6173 (mfundul)
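
The idea behind empty page detection can be sketched in a few lines of Python. This is not the DB engine's actual code (which is written in C). The EMPTY sentinel, the list-based page layout and the function names are all assumptions made for illustration. The sketch only shows the principle: a page whose slots are all empty carries no information, so it does not need to be stored or indexed.

```python
EMPTY = None  # assumed sentinel for a slot with no collected value


def is_empty_page(page):
    """Return True when every slot in the page is the EMPTY sentinel."""
    return all(value is EMPTY for value in page)


def pages_to_store(pages):
    """Keep only the pages that contain at least one collected value."""
    return [page for page in pages if not is_empty_page(page)]


pages = [
    [1.0, 2.5, EMPTY],      # has data, kept
    [EMPTY, EMPTY, EMPTY],  # entirely empty, dropped
    [EMPTY, 0.0, EMPTY],    # 0.0 is a real value, kept
]
kept = pages_to_store(pages)
print(len(kept))  # prints 2
```

Note that a value of 0.0 is a collected value and is not treated as empty. Only the sentinel marks a missing sample.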

Rebranding

Documentation

  • Improve documentation about file descriptors and systemd configuration. #6372 (mfundul)
  • Update the documentation on charts with zero metrics #6314 (vlvkobal)
  • Document that in versions before 1.16, the plugins.d directory may be installed in a different location in certain OSs #6301 (cakrit)
  • Remove single and multi-threaded web server configuration instructions #6291 (nodiscc)
  • Add more info on the stream.conf option health enabled by default = auto #6281 (cakrit)
  • Add comments about AWS SDK for C++ installation #6277 (vlvkobal)
  • Fixed the list of supported systems in the installation readme (first came RedHat, then the others) #6271 (paulkatsoulakis)
  • Update the new dbengine documentation #6264 (mfundul)
  • Remove CNCF logo and TOC presentation reference #6234 (dankohn)
  • Added code style guidance to CONTRIBUTING #6212 (cakrit)
  • Visibility fix for anonymous statistics #6208 (cakrit)
  • smartd documentation improvements #6207 (cakrit), #6203 (Steve8291)
  • Made the custom notification instructions clearer #6181 (cakrit)
  • Fix typo in the web server README #6146 (cakrit)
  • Registry documentation fixes #6144 (cakrit)
  • Changed 'netdata' to 'Netdata' in /docs/ and /README.md #6137 (apardyl)
  • Update installer readme with OpenSUSE dependencies #6111 (mfundul)
  • Fixed minor typos in the daemon configuration documentation #6090 (Steve8291)
  • Mention anonymous statistics in additional places in the docs #6084 (cakrit)
  • Local remark-lint checks and autofix support #5898 (andvgal)

Other

  • Pass the cloud base url parameter to the notifications mechanism, so that modifications to the configuration are respected when creating the link to the alarm #6383 (ladakis)
  • Added a .gitattributes file to improve git diff for C files #6381 (ac000)
  • Improved logging, to be able to trace the CRITICAL: main[main] SIGPIPE received. error #6373 (vlvkobal)
  • Modify the limits of the stale bot, to close stale questions/discussions in GitHub faster #6297 (ilyam8)
  • Internal CI/CD improvements #6282 #6268 (paulkatsoulakis)
  • netdata/packaging: Add more distribution validations #6235 (paulkatsoulakis)
  • Move call to send_statistics later, to get more telemetry events from docker containers #6113 (vlvkobal), #6096 (cakrit)
  • Use github templating mechanisms to classify issues when they are created #5776 (paulfantom)

Bug fixes

  • Fixed ram_available alarm #6261 (octomike)
  • Stop monitoring /dev and /run in the disk space and inode usage charts #6399 (vlvkobal)
  • Fixed the monitoring of the “time” group of processes #6397 (mbarper)
  • Fixed compilation error 'PERF_COUNT_HW_REF_CPU_CYCLES' undeclared here in old Linux kernels (perf plugin) #6382 (vlvkobal)
  • Fixed autodetection for openldap on Debian (apps.plugin) #6364 (nodiscc)
  • Fixed compilation error on CentOS 6 (nfacct plugin) #6351 (vlvkobal)
  • Fixed invalid XML page error (tomcat plugin) #6345 (Danamir)
  • Remove obsolete monit metrics #6340 (ilyam8)
  • Fixed Failed to parse error in adaptec_raid #6338 (ilyam8)
  • Fixed cluster_health_nodes and cluster_stats_nodes charts in the elasticsearch collector #6311 (Wing924)
  • A modified slave chart's "name" was not properly transferred to the master (streaming) #6304 (vlvkobal)
  • Netdata could run out of file descriptors when using the new DB engine #6303 (mfundul)
  • Fixed UI behavior when pressing the End key #6294 (thiagoftsm)
  • Fixed UI link to check the configuration file, to open in a new tab #6294 (thiagoftsm)
  • Fixed files not found during installation, due to different than expected location of the libexecdir directory #6272 (paulkatsoulakis)
  • Prevented Error: 'module' object has no attribute 'Retry' messages from python collectors, by enforcing minimum version check for the UrlService library #6263 (ilyam8)
  • Fixed typo that causes nfacct.plugin log messages to incorrectly show freeipmi #6260 (vlvkobal)
  • Fixed netdata/netdata docker image failure, when users pass a PGID that already exists on the system #6259 (paulkatsoulakis)
  • The daemon could get stuck during collection or during shutdown, when using the new dbengine. Reduced new dbengine IO utilization by forcing page alignment per dimension of chart. #6240 (mfundul)
  • Properly handle timeouts/no response in dns_query_time python collector #6237 (n0coast)
  • When a collector restarted after having stopped for a long time, the new dbengine would consume a lot of CPU resources. #6216 (mfundul)
  • Fixed error Assertion 'old_state & PG_CACHE_DESCR_ALLOCATED' failed in the new dbengine. Eliminated a page cache descriptor race condition #6202 (mfundul)
  • tv.html failed to load the three left charts when accessed via https. Turn tv.html links to https #6198 (cakrit)
  • Change print level from error to info for messages about clearing old files from the database #6195 (mfundul)
  • Fixed warning regarding the x509check_last_collected_secs alarms. Changed the template update frequency to 60s, to match the chart’s update frequency #6194 (ilyam8)
  • Email notification header lines were not terminated with \r\n as per the RFC #6187 (toofar)
  • Some log entries would not be caught by the python web_log plugin. Fixed the regular expressions #6138 #6180 (ilyam8)
  • Corrected the date used in pushbullet notifications #6179 (cakrit)
  • Fixed FATAL error when using the new dbengine with no direct I/O support, by falling back to buffered I/O #6174 (mfundul)
  • Fixed compatibility issues with varnish v4 (varnish collector) #6168 (ilyam8)
  • The total number of disks in mdstat.XX_disks chart was displayed incorrectly. Fixed the "inuse" and "down" disks stacking. #6164 (vlvkobal)
  • The config option --disable-telemetry was being checked after restarting netdata, which means that we would still send anonymous statistics the first time netdata was started. #6127 (cakrit)
  • Fixed apcupsd collector errors, by passing correct info to the run function. #6126 (Steve8291)
  • apcupsd and libreswan were not enabled by default #6120 (Steve8291)
  • Fixed incorrect module name: energi to energid #6112 (Steve8291)
  • The nodes view did not work properly when a reverse proxy was configured to access netdata via paths containing subpaths (e.g. myserver/netdata) #6093 (gmosx)
  • Fix error message PLUGINSD : cannot open plugins directory #6080 #6089 (Steve8291)
  • Corrected invalid links to web_log.conf that appear on the agent UI #6087 (cakrit)
  • Fixed ScaleIO collector endpoint paths go.d PR 226 (ilyam8)
  • Fixed web client timeout handling in the go.d plugin httpcheck collector go.d PR 225 (ilyam8)
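
To illustrate the email notification fix listed above (header lines terminated with \r\n), here is a minimal Python sketch of how a message can be assembled so that every header line ends in CRLF and a blank CRLF line separates the headers from the body. Netdata's notification script is not written in Python, so this is only a picture of the rule. The function name and the sample header values are hypothetical.

```python
def build_message(headers, body):
    """Assemble an email message with CRLF-terminated header lines.

    Hypothetical helper: headers is a list of (name, value) pairs and
    body is passed through unchanged.
    """
    # Each header line is terminated with CRLF, not a bare LF.
    header_block = "".join(f"{name}: {value}\r\n" for name, value in headers)
    # An empty CRLF-terminated line marks the end of the headers.
    return header_block + "\r\n" + body


message = build_message(
    [("To", "root"), ("Subject", "test alarm")],  # sample values (assumptions)
    "CPU usage is high",
)
print(repr(message))
```

Some mail servers reject or mangle messages whose header lines end with a bare line feed, which is why the terminator matters.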