firehol-automation
released this
New to netdata? Check its demo: https://my-netdata.io
Posted on twitter, facebook, reddit r/linux.
Hi all,
Another great netdata release: netdata v1.10.0 !
This is a birthday release: netdata is now 2 years old !
Many thanks to all the contributors who help build, enhance and improve a project that is useful and helpful for thousands of admins, devops and developers around the world! You rock!
- @ktsaou
At a glance
netdata now has a new web server (called static) with a fixed number of threads, providing a lot better performance and finer control of the resources allocated to it.
All dashboard elements (javascript) have been updated to their latest versions - this allows a smoother experience when embedding netdata charts on third party web sites and apps.
IMPORTANT: all users using older netdata are advised to update to this version. This version offers improved stability, security and a huge number of bug fixes, compared to any prior version of netdata.
new plugins
- BTRFS - monitor the allocations of BTRFS filesystems (yes, netdata can now properly detect when btrfs is going out of space)
- BCACHE - monitor the caching block layer that allows building hybrid disks using normal HDDs and SSDs
- Ceph - monitor ceph distributed storage
- nginx plus - monitor the nginx+ web servers
- libreswan - monitor IPSEC tunnels
- Traefik - monitor traefik reverse proxies
- icecast - monitor icecast streaming servers
- ntpd - monitor NTP servers
- httpcheck - monitor any remote web server
- portcheck - monitor any remote TCP port
- spring-boot - monitor java spring boot applications
- dnsdist - monitor dnsdist name servers
- hugepages - monitor the allocation of Linux hugepages
enhanced / improved plugins
- statsd
- web_log
- containers monitoring
- system memory
- diskspace
- network interfaces
- postgres
- rabbitmq
- apps.plugin
- haproxy
- uptime
- ksm
- mdstat
- elasticsearch
- apcupsd
- isc-dhcpd
- fronius
- stiebeleltron
new alarm notifications methods
- alerta
- IRC
And as always, hundreds more enhancements, improvements and bugfixes.
BTRFS monitoring
BTRFS space usage monitoring and related alarms.
netdata is able to detect if any of the space-related components (physical disk allocation, data, metadata and system) of BTRFS is about to become exhausted!
#3150 - thanks to @Ferroin for explaining everything about btrfs...
bcache monitoring
netdata now monitors bcache metrics - they are automatically added to any disk that is found to be a bcache disk.
ceph monitoring
New plugin to monitor ceph, the unified, distributed storage system designed for excellent performance, reliability and scalability (#3166 @lets00).
containers and VMs monitoring
- netdata now monitors systemd-nspawn containers.
- netdata now renames charts of kubernetes containers.
- virsh is now called with -r to avoid prompting for password #3144
- cgroup-network is now a lot more strict, preventing unauthorized privilege escalation #3269
- cgroup-network now searches for container processes in sub-cgroups too - this improves the mapping of network interfaces to containers
- cgroup-network now works even when there are no veth interfaces in the system
monitor ntpd
netdata can now monitor isc-ntpd. @rda0 did a marvelous job decoding NTP Control Message Protocol, collecting ntpd metrics in the most efficient way #3421, #3454 @rda0
btw, netdata also monitors chrony, but the chrony module of netdata is disabled by default, because certain CentOS versions ship a version of chrony that consumes 100% cpu when queried for statistics.
nginx plus web servers monitoring
Added python plugin to monitor the operation of nginx plus servers. The plugin monitors everything about nginx+, except streaming #3312 @l2isbad
libreswan IPSEC tunnels monitoring
netdata now monitors libreswan tunnels - #3204

remote HTTP/HTTPS server monitoring
netdata now has an httpcheck plugin (module of python.d.plugin), that can query remote http/https servers, track the response timings and check that the response body contains certain text #3448 @ccremer .
remote TCP port monitoring
netdata now has a portcheck plugin (module of python.d.plugin), that can check whether any remote TCP port is open #3447 @ccremer
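The idea behind a TCP port check can be sketched in a few lines of Python. This is only an illustration of the technique, not the code of the portcheck module: the function name `tcp_port_is_open` and the default timeout are assumptions made for this example.

```python
import socket


def tcp_port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds.

    Hypothetical helper for illustration; not part of netdata.
    """
    try:
        # create_connection() resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

For example, `tcp_port_is_open("localhost", 22)` would report whether something accepts connections on the local SSH port.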
icecast streaming server monitoring
netdata now monitors icecast servers #3511 @l2isbad.
traefik reverse proxy monitoring
netdata now monitors traefik reverse proxies - #3557.
spring-boot monitoring
netdata can now monitor java spring-boot applications @Wing924


dnsdist
netdata now monitors dnsdist name servers - @Nobody-Nobody #3009
statsd
- statsd dimensions now support the options the external plugin dimensions support (currently the only usable option is hidden, to add the dimension but make it hidden on the dashboard - a hidden dimension can participate in various calculations, including alarms).
- statsd now reports the CPU usage of its threads at the netdata section.
- statsd metrics are logged to access.log the first time they are encountered.
- statsd metrics now accept the special value zinit to allow them to get initialized without altering their values (this is useful if you have rare metrics that you need to initialize when netdata starts).
- statsd over TCP is now a lot faster - netdata can process up to 3.5mil statsd metrics / second using just one core. Added options to control the timeouts of TCP statsd connections.
- fixed the title and context of statsd private charts
- statsd private charts can now be hidden from the dashboard #3467
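As a rough illustration of how an application could emit statsd metrics, including a zinit initialization, here is a minimal Python sketch. It assumes the common statsd line format `name:value|type` and that netdata listens for statsd on port 8125 of localhost; the helper names and the metric name `myapp.rare_errors` are made up for this example.

```python
import socket


def format_statsd_metric(name, value, metric_type):
    """Build one statsd line in the common `name:value|type` form."""
    return f"{name}:{value}|{metric_type}"


def send_statsd_udp(lines, host="localhost", port=8125):
    """Send one or more statsd lines in a single UDP datagram.

    host/port are assumptions; adjust them to your netdata statsd settings.
    """
    payload = "\n".join(lines).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


# A rare counter can be announced at startup with the special value zinit,
# so that it exists without its value being altered.
init_line = format_statsd_metric("myapp.rare_errors", "zinit", "c")
```

Calling `send_statsd_udp([init_line])` when the application starts would then make the metric known before the first real event arrives.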
postgres
Several new charts have been added to monitor (#3400 by @anayrat):
- checkpointer charts
- bgwriter charts
- autovacuum charts
- replication delta charts
- WAL archive charts
- WAL charts
- temporary files charts
The postgres plugin now also works when postgres is in recovery mode.
rabbitmq
- added Erlang run queue chart. This is useful in conjunction with the existing Erlang processes chart to get a better overall idea of what's going on in the Erlang VM. @arch273
- added rabbitmq information on the dashboard to complement the charts.
apps.plugin
Prior to this version, netdata was detecting the user and group of processes by examining the ownership of /proc/PID/stat. Unfortunately, it seems that the ownership of files in /proc does not change when the process switches user. So, netdata could not detect the user and group of processes that started as root and then switched to another user.
Now netdata reads /proc/PID/status:
- process ownership information is now accurate
- eliminated the need to read /proc/PID/statm (all the information of /proc/PID/statm is available in /proc/PID/status)
- allowed netdata to read VmSwap, so a new chart has been added to monitor the swap memory usage per process, user and group.
- fixed issue with unreasonable spikes on processes cpu on FreeBSD (there was a typo) #3245
- fixed issue with errors reported on FreeBSD about pid 0 #3099
The new plugin is 20% more expensive in terms of CPU. We tried hard to optimize it, but this is as good as it can get. Read about it at #3434 and #3436
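To give an idea of what /proc/PID/status offers, here is a small Python sketch that extracts the owner and the swap usage of a process from its contents. It is not the apps.plugin code (which is written in C). The function names are hypothetical, and picking the effective uid/gid (second column of the Uid and Gid lines) is an assumption made for this example.

```python
def parse_proc_status(text):
    """Parse the `Key:<tab>value` lines of /proc/PID/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def process_owner_and_swap(text):
    """Return effective uid, effective gid and VmSwap (in kB) of a process."""
    fields = parse_proc_status(text)
    # Uid and Gid list four ids: real, effective, saved set, filesystem.
    uid = int(fields["Uid"].split()[1])
    gid = int(fields["Gid"].split()[1])
    # Kernel threads have no VmSwap line, so default to zero.
    vmswap_kb = int(fields.get("VmSwap", "0 kB").split()[0])
    return {"uid": uid, "gid": gid, "vmswap_kb": vmswap_kb}
```

On Linux, `process_owner_and_swap(open("/proc/self/status").read())` would return the values for the running Python process.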
haproxy
Added charts:
- hrsp_1xx, hrsp_2xx, hrsp_3xx, hrsp_4xx, hrsp_5xx, hrsp_other, hrsp_total for backends and frontends
- qtime, ctime, rtime, ttime metrics for backend servers
- backend servers in UP state
uptime
netdata now uses /proc/uptime when CLOCK_BOOTTIME does not report the same uptime. In containers CLOCK_BOOTTIME reports the uptime of the host, while /proc/uptime reports the uptime of the container, so now netdata correctly reports the uptime of the container.
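The selection logic described above can be sketched in Python as follows. This is an illustration under assumptions, not netdata's implementation: the function names and the 1 second tolerance used to decide that the two sources "report the same uptime" are invented for this example.

```python
import time


def parse_proc_uptime(text):
    """Return the uptime in seconds from the contents of /proc/uptime."""
    # The file holds two numbers: uptime and cumulative idle time.
    return float(text.split()[0])


def pick_uptime(boottime_seconds, proc_uptime_seconds, tolerance=1.0):
    """Prefer CLOCK_BOOTTIME, unless /proc/uptime disagrees with it.

    A disagreement hints that we run in a container, where /proc/uptime
    reports the uptime of the container. `tolerance` is an assumption.
    """
    if abs(boottime_seconds - proc_uptime_seconds) <= tolerance:
        return boottime_seconds
    return proc_uptime_seconds


def current_uptime():
    """Read both sources on a Linux system and pick one (Python 3.7+)."""
    with open("/proc/uptime") as f:
        proc_uptime = parse_proc_uptime(f.read())
    boottime = time.clock_gettime(time.CLOCK_BOOTTIME)
    return pick_uptime(boottime, proc_uptime)
```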
mdstat
various fixes to better monitor rebuild time and rate @l2isbad
KSM
- removed to_scan dimension
- the savings % reported by netdata was less than the actual - fixed it.
elasticsearch
Added several charts for translog / indices segments statistics and JVM buffer pool utilization, which are often helpful when evaluating an elasticsearch node health #3544 @NeonSludge
memory monitoring
- treat slab memory as cached #3288 @amichelic
- added a new chart for monitoring the memory available for use, before hitting swap

- netdata now monitors Linux hugepages and transparent hugepages

- added hugepages monitoring #3462

diskspace monitoring
- support huge amounts of mount points #3258 - netdata was crashing with a stack overflow due to recursion - now it is a loop, so any number of mount points is supported
network monitoring
- moved tcp passive and active opens to a separate chart, to allow the TCP issues dimensions scale better by default #3238
- updated the information presented on TCP charts to match the latest v4.15 kernel source #3239
APC UPS
netdata now supports monitoring multiple APC UPSes.
ISC DHCPd
netdata now also supports monitoring IPv6 leases - @l2isbad
fronius
stiebeleltron
- added alarms @ccremer
web_log
Added web server response timings histogram #3558 @Wing924 .

python.d.plugin
- python.d.plugin can now start even if /etc/netdata/python.d.conf is missing @l2isbad
- python.d.plugin now has an internal run counter @l2isbad
- the unicode decoding of the plugin has been fixed (#3406) @l2isbad
- the plugin now does not validate self-signed certificates @l2isbad
- the plugin can now revive obsolete charts @l2isbad
charts.d.plugin
charts.d.plugin BASH modules can now have custom number of retries in case of data collection failures #3524.
web server
- netdata now has a new internal web server that supports a fixed number of threads - we call it static web server. This web server allows netdata to work around memory fragmentation (since the threads are fixed, the underlying memory allocators reuse the same memory arenas) and to control cpu utilization (we can control the number of threads that will be used by netdata). This is the default now. #3248
- now the static threads web server reports the CPU usage of each of its threads.
- the HTTP response headers now include the netdata version
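The general idea of serving requests with a fixed number of threads can be illustrated with a small Python sketch. It only demonstrates the fixed-pool concept. netdata's static web server is part of the C daemon, so the memory arena behaviour mentioned above is not something this example reproduces. The function name and the default of 4 threads are assumptions.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def serve_with_fixed_threads(requests, handler, num_threads=4):
    """Process all requests using at most `num_threads` worker threads.

    Threads are created once and reused for every request, instead of
    spawning a new thread per request. Results keep the request order.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(handler, requests))


def which_thread(_request):
    """Toy handler: report the name of the worker thread that ran it."""
    return threading.current_thread().name
```

Running `serve_with_fixed_threads(range(20), which_thread, num_threads=2)` returns 20 results, produced by no more than 2 distinct threads.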
dashboard
- the print button now respects the URL path netdata is hosted.
- dygraphs updated to the latest version - this fixes an issue that prevented netdata charts from being interactive under certain conditions
- added dygraph theme logscale #3283
- fontawesome updated to version 5
- d3 updated to the latest version (this broke c3 charts that require an older version)
- custom dashboards can now have alarms for specific roles (all, none, one or more).
- allow stacked charts to zoom vertically when dimensions are selected
- netdata now has a global XSS protection #3363
- netdata now uses intersectionObserver when available #3280 - this improves the scrolling performance of the dashboard.
- prevent date, time and units from wrapping at the charts legends #3286
- various units scaling improvements #3285
- added data-common-colors="NAME" chart option for custom dashboards #3282.
- added wiki page for creating custom dashboards on Atlassian's Confluence.
- prevented a double click on the charts' toolbox to select the text of the buttons.
- added a simple js, called refresh-badges.js, to update badges on a custom web page
badges
netdata badges can now be scaled #3474
API
- added gtime parameter, for group time. This is used to request from netdata to return values in a different rate (i.e. gtime=60 on a X/sec dimension will return X/min).
- fixed a rounding bug in JSON generation #3309
- the dimensions= parameter now supports simple patterns #3170 and added option values match-ids and match-names to control which matches are executed for dimensions.
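A query using these parameters could be put together like this. The sketch assumes a netdata listening on http://localhost:19999, the /api/v1/data endpoint and the system.net chart; the helper name `data_url` is hypothetical, and match-ids / match-names are passed through the options parameter.

```python
from urllib.parse import urlencode


def data_url(base_url, chart, gtime=None, dimensions=None, options=None):
    """Build a data query URL for a chart (illustrative helper, not part of netdata)."""
    params = {"chart": chart}
    if gtime is not None:
        # e.g. gtime=60 asks for an X/sec dimension to be returned as X/min
        params["gtime"] = gtime
    if dimensions:
        params["dimensions"] = dimensions
    if options:
        # Option values are joined with commas; urlencode() escapes them.
        params["options"] = ",".join(options)
    return f"{base_url}/api/v1/data?{urlencode(params)}"


url = data_url("http://localhost:19999", "system.net", gtime=60)
```

Characters such as `*` in a simple pattern are percent-encoded by `urlencode`, so a pattern can be passed in `dimensions` as-is.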
alarms
- system.swap alarms now send notifications with a 30 seconds delay, to work-around a kernel bug that incorrectly reports all swap as instantly used under containers #3380.
- added alarm to predict the time a mount point will run out of inodes #3566.
- all system alarms are now ported to FreeBSD too #3337 @arch273
- removed unsupported html tags from hipchat notifications.
- pagerduty notifications have been modified to avoid incident duplication #3549.
- alarm definitions can now use both chart IDs and chart names (prior to this version only chart IDs were allowed).
- curl options (eg for disabling SSL certificates verification) for alarm-notify.sh can now be defined in health_alarm_notify.conf.
- netdata can now send notifications to IRC channels #3458 @manosf
backends
- on netdata masters, allow filtering the hosts that will be sent to backends with send hosts matching = *pattern.
- improved connection error handling and added retries to allow netdata to connect to certain backends that failed with EALREADY or EINPROGRESS.
- json backends now receive host tags (the tags have to be formatted in a json friendly way) #3556.
- re-worked the alarm that triggers when backend data are lost, to avoid flip-flops.
prometheus backends
- added URL option timestamps=yes|no to /api/v1/allmetrics to support prometheus Pushgateway #3533
- added netdata_info variable with the version of netdata
- renamed netdata_host_tags to netdata_host_tags_info (the old exists but is deprecated and will be removed eventually)
- when prometheus uses average metrics, netdata remembers the last access time the prometheus collected metrics, on a per host basis.
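As an example of the new URL option, the following Python sketch builds an allmetrics URL with timestamps turned off, which is the form one would fetch before pushing to a Pushgateway. The base URL http://localhost:19999 and the helper name `allmetrics_url` are assumptions for illustration.

```python
from urllib.parse import urlencode


def allmetrics_url(base_url, timestamps=True):
    """Build an /api/v1/allmetrics URL in prometheus format.

    timestamps=False requests samples without timestamps.
    """
    params = {
        "format": "prometheus",
        "timestamps": "yes" if timestamps else "no",
    }
    return f"{base_url}/api/v1/allmetrics?{urlencode(params)}"


pushgateway_source = allmetrics_url("http://localhost:19999", timestamps=False)
```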
metrics streaming between netdata
- netdata masters and proxies now expose the version of the netdata collecting the metrics, not their own. So, now a netdata master shows on the dashboard and sends to backends the version of the netdata collecting the metrics #3538.
- added stream.conf option multiple connections = accept | deny to allow or deny multiple connections for the same netdata host. The default remains accept, but it is likely to be changed to no on future versions.
packaging
- added docker hub builds for aarch64/arm64 @justin8
- updated debian containers to use stretch @justin8
- added FreeBSD init file
- various installers fixes and improvements (make sure netdata is started, do not give information about features not supported on each operating system, allow non-root installations without errors, etc.)
- various installer fixes for FreeBSD and MacOS
- netdata-updater was growing the PATH variable on each of its runs - fixed it.
- added --accept and --dont-start-it command line options to kickstart-static64.sh
- netdata can be compiled without long double support (useful in embedded devices that don't support long double numbers) #3354
- fixed netdata.spec to allow building netdata on older and newer rpm based distros. Also added a script to build a netdata rpm
- static netdata installer now tries to find the location of the SSL ca-certificates on a system and properly configures the static curl provided with this path.
- the netdata updater starts netdata only if it was running
- added alpine dockerfile
other
- added global option gap when lost iterations to control the number of iterations that should be lost to show a gap on the charts.
- various fixes/improvements related to netdata logs - the main change is that now netdata logs the thread name that logged the message, providing helpful insights about the thread that complained.
- re-worked the exit procedure of netdata to allow it cleanup properly - sometimes netdata was deadlocked during exit, waiting forever - now netdata always exits promptly #3184
- fixed compilation on ancient gcc versions
- netdata was always setting itself to the idle process scheduling priority, even when it was configured to do otherwise. Fixed it #3523











