
@firehol-automation released this Mar 27, 2018 · 1472 commits to master since this release

New to netdata? Check its demo: https://my-netdata.io


Posted on twitter, facebook, reddit r/linux.


Hi all,

Another great netdata release: netdata v1.10.0 !

This is a birthday release: netdata is now 2 years old !

Many thanks to all the contributors who help build, enhance and improve a project that is useful to thousands of admins, devops and developers around the world! You rock!

- @ktsaou

At a glance

netdata now has a new web server (called static) with a fixed number of threads, providing much better performance and finer control of the resources allocated to it.

All dashboard elements (javascript) have been updated to their latest versions - this allows a smoother experience when embedding netdata charts on third party web sites and apps.


IMPORTANT: all users using older netdata are advised to update to this version. This version offers improved stability, security and a huge number of bug fixes, compared to any prior version of netdata.


new plugins

  • BTRFS - monitor the allocations of BTRFS filesystems (yes, netdata can now properly detect when btrfs is running out of space)
  • BCACHE - monitor the caching block layer that allows building hybrid disks using normal HDDs and SSDs
  • Ceph - monitor ceph distributed storage
  • nginx plus - monitor the nginx+ web servers
  • libreswan - monitor IPSEC tunnels
  • Traefik - monitor traefik reverse proxies
  • icecast - monitor icecast streaming servers
  • ntpd - monitor NTP servers
  • httpcheck - monitor any remote web server
  • portcheck - monitor any remote TCP port
  • spring-boot - monitor java spring boot applications
  • dnsdist - monitor dnsdist name servers
  • hugepages - monitor the allocation of Linux hugepages

enhanced / improved plugins

  • statsd
  • web_log
  • containers monitoring
  • system memory
  • diskspace
  • network interfaces
  • postgres
  • rabbitmq
  • apps.plugin
  • haproxy
  • uptime
  • ksm
  • mdstat
  • elasticsearch
  • apcupsd
  • isc-dhcpd
  • fronius
  • stiebeleltron

new alarm notifications methods

  • alerta
  • IRC

And as always, hundreds more enhancements, improvements and bugfixes.


BTRFS monitoring

BTRFS space usage monitoring and related alarms.

netdata is able to detect if any of the space-related components (physical disk allocation, data, metadata and system) of BTRFS is about to become exhausted!

#3150 - thanks to @Ferroin for explaining everything about btrfs...

bcache monitoring

netdata now monitors bcache metrics - they are automatically added to any disk that is found to be a bcache disk.

ceph monitoring

New plugin to monitor ceph, the unified, distributed storage system designed for excellent performance, reliability and scalability (#3166 @lets00).

containers and VMs monitoring

  • netdata now monitors systemd-nspawn containers.
  • netdata now renames charts of kubernetes containers.
  • virsh is now called with -r to avoid prompting for password #3144
  • cgroup-network is now a lot more strict, preventing unauthorized privilege escalation #3269
  • cgroup-network now searches for container processes in sub-cgroups too - this improves the mapping of network interfaces to containers
  • cgroup-network now works even when there are no veth interfaces in the system

monitor ntpd

netdata can now monitor isc-ntpd. @rda0 did a marvelous job decoding the NTP Control Message Protocol, collecting ntpd metrics in the most efficient way #3421, #3454


btw, netdata also monitors chrony but the chrony module of netdata is disabled by default, because certain CentOS versions ship a version of chrony that consumes 100% cpu when queried for statistics.

nginx plus web servers monitoring

Added python plugin to monitor the operation of nginx plus servers. The plugin monitors everything about nginx+, except streaming #3312 @l2isbad

libreswan IPSEC tunnels monitoring

netdata now monitors libreswan tunnels - #3204

remote HTTP/HTTPS server monitoring

netdata now has an httpcheck plugin (module of python.d.plugin) that can query remote http/https servers, track the response timings and check that the response body contains certain text #3448 @ccremer.


remote TCP port monitoring

netdata now has a portcheck plugin (module of python.d.plugin) that can check whether any remote TCP port is open #3447 @ccremer
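
The idea behind a TCP port check can be sketched in a few lines of Python. This is not the code of the portcheck module: the function name check_port, its return format and the 1 second timeout are all assumptions made for illustration.

```python
import socket
import time


def check_port(host, port, timeout=1.0):
    """Try to open a TCP connection; report success and the time it took.

    Hypothetical helper, not part of netdata.
    """
    started = time.monotonic()
    try:
        # create_connection() resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            is_open = True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        is_open = False
    return {"open": is_open, "seconds": time.monotonic() - started}


if __name__ == "__main__":
    # 19999 is the default port of the netdata web server.
    print(check_port("127.0.0.1", 19999))
```

A real collector would run such a check on every data collection iteration and chart both the state and the connection time.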


icecast streaming server monitoring

netdata now monitors icecast servers #3511 @l2isbad.

traefik reverse proxy monitoring

netdata now monitors traefik reverse proxies - #3557.

spring-boot monitoring

netdata can now monitor java spring-boot applications @Wing924

dnsdist

netdata now monitors dnsdist name servers - @nobody-nobody #3009

statsd

  • statsd dimensions now support the options that external plugin dimensions support (currently the only usable option is hidden, which adds the dimension but hides it on the dashboard - a hidden dimension can still participate in various calculations, including alarms).
  • statsd now reports the CPU usage of its threads at the netdata section.
  • statsd metrics are logged to access.log the first time they are encountered.
  • statsd metrics now accept the special value zinit to allow them to be initialized without altering their values (this is useful if you have rare metrics that you need to initialize when netdata starts).
  • statsd over TCP is now a lot faster - netdata can process up to 3.5mil statsd metrics / second using just one core. Added options to control the timeouts of TCP statsd connections.
  • fixed the title and context of statsd private charts
  • statsd private charts can now be hidden from the dashboard #3467
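
As a rough illustration of the zinit value mentioned above, the sketch below formats statsd lines and sends them over UDP. It assumes that zinit is sent in place of the metric value, and that netdata listens for statsd on 127.0.0.1 port 8125; the metric name myapp.rare_errors and both helper functions are hypothetical.

```python
import socket


def statsd_line(name, value, metric_type):
    """Format one statsd line as name:value|type."""
    return f"{name}:{value}|{metric_type}"


def send_statsd(lines, host="127.0.0.1", port=8125):
    """Send one or more statsd lines in a single UDP datagram."""
    payload = "\n".join(lines).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    send_statsd([
        # Assumption: zinit replaces the value, so the counter is created
        # without being incremented.
        statsd_line("myapp.rare_errors", "zinit", "c"),
        # A normal increment of the same (hypothetical) counter.
        statsd_line("myapp.rare_errors", 1, "c"),
    ])
```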

postgres

Several new charts have been added to monitor (#3400 by @anayrat):

  1. checkpointer charts
  2. bgwriter charts
  3. autovacuum charts
  4. replication delta charts
  5. WAL archive charts
  6. WAL charts
  7. temporary files charts

The postgres plugin now also works when postgres is in recovery mode.

rabbitmq

  • added Erlang run queue chart. This is useful in conjunction with the existing Erlang processes chart to get a better overall idea of what's going on in the Erlang VM. @arch273
  • added rabbitmq information on the dashboard to complement the charts.

apps.plugin

netdata prior to this version was detecting the user and group of processes by examining the ownership of /proc/PID/stat. Unfortunately, it seems that the ownership of files in /proc does not change when the process switches user. So, netdata could not detect the user and group of processes that started as root and then switched to another user.

Now netdata reads /proc/PID/status:

  • process ownership information is now accurate
  • eliminated the need to read /proc/PID/statm (all the information of /proc/PID/statm is available in /proc/PID/status)
  • allowed netdata to read VmSwap, so a new chart has been added to monitor the swap memory usage per process, user and group.
  • fixed issue with unreasonable spikes on processes cpu on FreeBSD (there was a typo) #3245
  • fixed issue with errors reported on FreeBSD about pid 0 #3099

The new plugin is 20% more expensive in terms of CPU. We tried hard to optimize it, but this is as good as it can get. Read about it at #3434 and #3436
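
To show what /proc/PID/status offers, here is a minimal Python sketch that extracts the user, group and swap fields from its text. apps.plugin itself is written in C and this is not its code; the function name and the returned keys are hypothetical. On Linux the Uid and Gid lines carry four ids (real, effective, saved, filesystem) and VmSwap is reported in kB.

```python
def parse_proc_status(text):
    """Extract ownership and swap usage from the text of /proc/PID/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()

    uids = fields.get("Uid", "").split()
    gids = fields.get("Gid", "").split()
    swap = fields.get("VmSwap", "").split()
    return {
        # Order on the Uid/Gid lines: real, effective, saved, filesystem.
        "real_uid": int(uids[0]) if uids else None,
        "effective_uid": int(uids[1]) if len(uids) > 1 else None,
        "real_gid": int(gids[0]) if gids else None,
        "effective_gid": int(gids[1]) if len(gids) > 1 else None,
        # Kernel threads have no VmSwap line, so this may be None.
        "vmswap_kb": int(swap[0]) if swap else None,
    }


if __name__ == "__main__":
    with open("/proc/self/status") as status_file:
        print(parse_proc_status(status_file.read()))
```

A process that started as root and then dropped privileges shows its new ids on these lines, which is what makes the ownership information accurate.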

haproxy

Added charts:

  • hrsp_1xx, hrsp_2xx, hrsp_3xx, hrsp_4xx, hrsp_5xx, hrsp_other, hrsp_total for backends and frontends
  • qtime, ctime, rtime, ttime metrics for backend servers
  • backend servers in UP state

@ktarasz

uptime

netdata now uses /proc/uptime when CLOCK_BOOTTIME does not report the same uptime. In containers CLOCK_BOOTTIME reports the uptime of the host, while /proc/uptime reports the uptime of the container, so now netdata correctly reports the uptime of the container.
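
The selection described above can be sketched in Python as follows. This only illustrates the logic and is not netdata's C implementation; the function names and the 1 second tolerance are assumptions. The script part runs on Linux only.

```python
import time


def pick_uptime(boottime_seconds, proc_uptime_seconds, tolerance=1.0):
    """Prefer CLOCK_BOOTTIME, but fall back to /proc/uptime when they disagree.

    Inside a container CLOCK_BOOTTIME reports the uptime of the host, so a
    difference larger than the (assumed) tolerance means /proc/uptime is the
    value that describes the container.
    """
    if abs(boottime_seconds - proc_uptime_seconds) > tolerance:
        return proc_uptime_seconds
    return boottime_seconds


def read_proc_uptime(path="/proc/uptime"):
    """Return the first field of /proc/uptime: the uptime in seconds."""
    with open(path) as uptime_file:
        return float(uptime_file.read().split()[0])


if __name__ == "__main__":
    boottime = time.clock_gettime(time.CLOCK_BOOTTIME)
    print(pick_uptime(boottime, read_proc_uptime()))
```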

mdstat

various fixes to better monitor rebuild time and rate @l2isbad

KSM

  • removed to_scan dimension
  • the savings % reported by netdata was less than the actual - fixed it.

elasticsearch

Added several charts for translog / indices segments statistics and JVM buffer pool utilization, which are often helpful when evaluating the health of an elasticsearch node #3544 @NeonSludge

memory monitoring

  • treat slab memory as cached #3288 @amichelic
  • added a new chart for monitoring the memory available for use, before hitting swap
  • netdata now monitors Linux hugepages and transparent hugepages
  • added hugepages monitoring #3462

diskspace monitoring

  • support huge amounts of mountpoints #3258 - netdata was crashing with stack overflow due to recursion - now it is a loop, so any number of mount points is supported

network monitoring

  • moved tcp passive and active opens to a separate chart, to allow the TCP issues dimensions to scale better by default #3238
  • updated the information presented on TCP charts to match the latest v4.15 kernel source #3239

APC UPS

netdata now supports monitoring multiple APC UPSes.

ISC DHCPd

netdata now also supports monitoring IPv6 leases - @l2isbad

fronius

stiebeleltron

web_log

Added web server response timings histogram #3558 @Wing924.

python.d.plugin

  • python.d.plugin can now start even if /etc/netdata/python.d.conf is missing @l2isbad
  • python.d.plugin now has an internal run counter @l2isbad
  • the unicode decoding of the plugin has been fixed (#3406) @l2isbad
  • the plugin now does not validate self-signed certificates @l2isbad
  • the plugin can now revive obsolete charts @l2isbad

charts.d.plugin

charts.d.plugin BASH modules can now have custom number of retries in case of data collection failures #3524.

web server

  • netdata now has a new internal web server that supports a fixed number of threads - we call it static web server. This web server allows netdata to work around memory fragmentation (since the threads are fixed, the underlying memory allocators reuse the same memory arenas) and to control cpu utilization (we can control the number of threads that will be used by netdata). This is the default now. #3248
  • now the static threads web server reports the CPU usage of each of its threads.
  • the HTTP response headers now include the netdata version

dashboard

  • the print button now respects the URL path netdata is hosted under.

  • dygraphs updated to the latest version - this fixes an issue that prevented netdata charts from being interactive under certain conditions

  • added dygraph theme logscale #3283

  • fontawesome updated to version 5

  • d3 updated to the latest version (this broke c3 charts that require an older version)

  • added d3pie charts

  • custom dashboards can now have alarms for specific roles (all, none, one or more).

  • allow stacked charts to zoom vertically when dimensions are selected

  • netdata now has a global XSS protection #3363

  • netdata now uses intersectionObserver when available #3280 - this improves the scrolling performance of the dashboard.

  • prevent date, time and units from wrapping at the charts legends #3286

  • various units scaling improvements #3285

  • added data-common-colors="NAME" chart option for custom dashboards #3282.

  • added wiki page for creating custom dashboards on Atlassian's Confluence.

  • prevented a double click on the charts' toolbox to select the text of the buttons.

  • fixed the alignment of dashboard icons #3224 @xPaw

  • added a simple js, called refresh-badges.js, to update badges on a custom web page

badges

netdata badges can now be scaled #3474

API

  • added gtime parameter, for group time. This is used to ask netdata to return values at a different rate (e.g. gtime=60 on an X/sec dimension will return X/min).
  • fixed a rounding bug in JSON generation #3309
  • the dimensions= parameter now supports simple patterns #3170 and added option values match-ids and match-names to control which matches are executed for dimensions.
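
A hedged sketch of how these parameters might be combined in a request to /api/v1/data. The chart system.net, the pattern rece* and the host and port are illustrative assumptions, and so is passing match-names through the options parameter. per_gtime only demonstrates the unit arithmetic; it is not netdata code.

```python
from urllib.parse import urlencode


def data_url(base, chart, **params):
    """Build a /api/v1/data URL for a chart plus optional query parameters."""
    query = {"chart": chart}
    query.update(params)
    return f"{base}/api/v1/data?{urlencode(query)}"


def per_gtime(rate_per_second, gtime):
    """Illustrate the gtime arithmetic: X/sec with gtime=60 becomes X/min."""
    return rate_per_second * gtime


if __name__ == "__main__":
    print(data_url(
        "http://localhost:19999",
        "system.net",
        gtime=60,                # return per-minute instead of per-second values
        dimensions="rece*",      # simple pattern selecting dimensions
        options="match-names",   # assumption: match the pattern against names
    ))
    print(per_gtime(5.0, 60))    # 5 events/sec expressed per minute
```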

alarms

  • system.swap alarms now send notifications with a 30 seconds delay, to work-around a kernel bug that incorrectly reports all swap as instantly used under containers #3380.

  • added alarm to predict the time a mount point will run out of inodes #3566.

  • all system alarms are now ported to FreeBSD too #3337 @arch273

  • added alerta.io notifications @kattunga

  • added available memory alarm

  • removed unsupported html tags from hipchat notifications.

  • pagerduty notifications have been modified to avoid incident duplication #3549.

  • alarm definitions can now use both chart IDs and chart names (prior to this version only chart IDs were allowed).

  • curl options (eg for disabling SSL certificates verification) for alarm-notify.sh can now be defined in health_alarm_notify.conf.

  • netdata can now send notifications to IRC channels #3458 @manosf


backends

  • on netdata masters, allow filtering the hosts that will be sent to backends with send hosts matching = * pattern.
  • improved connection error handling and added retries to allow netdata to connect to certain backends that failed with EALREADY or EINPROGRESS.
  • json backends now receive host tags (the tags have to be formatted in a json friendly way) #3556.
  • re-worked the alarm that triggers when backend data are lost, to avoid flip-flops.

prometheus backends

  • added URL option timestamps=yes|no to /api/v1/allmetrics to support prometheus Pushgateway #3533
  • added netdata_info variable with the version of netdata
  • renamed netdata_host_tags to netdata_host_tags_info (the old exists but is deprecated and will be removed eventually)
  • when prometheus uses average metrics, netdata remembers the last time prometheus collected metrics, on a per host basis.

metrics streaming between netdata

  • netdata masters and proxies now expose the version of the netdata collecting the metrics, not their own. So, now a netdata master shows on the dashboard and sends to backends the version of the netdata collecting the metrics #3538.
  • added stream.conf option multiple connections = accept | deny to allow or deny multiple connections for the same netdata host. The default remains accept, but it is likely to be changed to deny in future versions.

packaging

  • added docker hub builds for aarch64/arm64 @justin8
  • updated debian containers to use stretch @justin8
  • added FreeBSD init file
  • various installers fixes and improvements (make sure netdata is started, do not give information about features not supported on each operating system, allow non-root installations without errors, etc.)
  • various installer fixes for FreeBSD and MacOS
  • netdata-updater was growing the PATH variable on each of its runs - fixed it.
  • added --accept and --dont-start-it command line options to kickstart-static64.sh
  • netdata can be compiled without long double support (useful in embedded devices that don't support long double numbers) #3354
  • fixed netdata.spec to allow building netdata on older and newer rpm based distros. Also added a script to build a netdata rpm
  • static netdata installer now tries to find the location of the SSL ca-certificates on a system and properly configures the static curl provided with this path.
  • the netdata updater starts netdata only if it was running
  • added alpine dockerfile

other

  • added global option gap when lost iterations to control the number of iterations that should be lost to show a gap on the charts.
  • various fixes/improvements related to netdata logs - the main change is that now netdata logs the thread name that logged the message, providing helpful insights about the thread that complained.
  • re-worked the exit procedure of netdata to allow it to clean up properly - sometimes netdata was deadlocked during exit, waiting forever - now netdata always exits promptly #3184
  • fixed compilation on ancient gcc versions
  • netdata was always setting itself to the idle process scheduling priority, even when it was configured to do otherwise. Fixed it #3523