@netdatabot netdatabot released this Jul 16, 2020 · 150 commits to master since this release

Netdata v1.23.2

Release v1.23.2 of the Netdata Agent is a patch for one significant issue.

PR #9491 fixed a buffer overrun vulnerability in Netdata's JSON parsing code. This vulnerability could be used to crash Agents remotely, and in some circumstances, could be used in an arbitrary code execution (ACE) exploit.

We strongly encourage all Netdata users to update their nodes to v1.23.2 as soon as possible.

This release also contains additional bug fixes and improvements.

Acknowledgements

  • @Saruspete for adding Infiniband monitoring to Netdata!
  • @meesaltena for fixing a typo in netdata-installer.sh.
  • @anirudhdggl for tweaking the PyMySQL library to respect the my.cnf parameter when monitoring MySQL.
  • @candrews for cleaning up the exporting engine by wrapping header definitions in compilation conditions.
  • @RubenKelevra for deploying an update to the IPFS collector that makes it compatible with IPFS v0.5.0+.
  • @vsc55 for adding support for returning headers using python.d's UrlService.

Improvements

  • Add support for multiple ACLK query processing threads (#9355, @underhood)
  • Add Infiniband monitoring to collector proc.plugin (#9091, @Saruspete)
  • Change the HTTP method to make the IPFS collector compatible with 0.5.0+ (#9248, @RubenKelevra)
  • Add support for returning headers using python.d's UrlService (#9236, @vsc55)

Documentation

  • Fix broken link in Kavenegar notification doc (#9492, @joelhans)
  • Add documentation for installing Netdata on k8s clusters (#9364, @joelhans)
  • Add notices to packaging docs for access errors and Cloud dependencies (#9422, @joelhans)
  • Fix broken link to Polyverse in Docker documentation (#9426, @joelhans)
  • Add notice to eBPF documentation about incompatibility with static builds (#9418, @joelhans)

Packaging / installation

CI/CD

Bug fixes

  • Fix vulnerability in JSON parsing (#9491, @underhood)
  • Fix stored number accuracy (#9540, @stelfrag)
  • Fix transition from archived to active charts not generating alarms (#9536, @mfundul)
  • Fix PyMySQL library to respect my.cnf parameter (#9526, @anirudhdggl)
  • Remove health from archived metrics (#9520, @mfundul)
  • Update exporting engine to read the prefix option from instance config sections (#9463, @vlvkobal)
  • Fix display error in Swagger API documentation (#9417, @underhood)
  • Wrap exporting engine header definitions in compilation conditions (#9458, @candrews)
  • Improve cgroups collector to autodetect unified cgroups (#9249, @underhood)
  • Fix CMake build failing if ACLK is disabled (#9537, @underhood)
  • Fix now_ms in charts.d collector to prevent tc-qos-helper crashes (#9510, @ilyam8)
  • Fix python.d crashes by adding a lock to stdout write function (#9508, @ilyam8)
  • Fix an issue with random crashes when updating a chart's metadata on the fly (#9509, @stelfrag)
  • Fix ACLK protocol version always parsed as 0 (#9502, @underhood)
  • Fix the check condition for chart name change (#9503, @stelfrag)
  • Fix the exporting engine unit tests (#9460, @vlvkobal)
  • Fix a Coverity defect for resource leaks (#9462, @vlvkobal)
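
The python.d fix listed above (#9508) adds a lock around the stdout write function. The following is a minimal sketch of that general technique, not the actual python.d code: safe_write and run_writers are hypothetical names, and an in-memory buffer stands in for stdout. Every thread takes the same lock before writing, so each line reaches the stream whole.

```python
import io
import threading

# Hypothetical helper for illustration; not the actual python.d implementation.
_write_lock = threading.Lock()


def safe_write(stream, text):
    """Write text to stream and flush it while holding a shared lock."""
    with _write_lock:
        stream.write(text)
        stream.flush()


def run_writers(stream, writers=4, lines_per_writer=50):
    """Start several threads that each emit complete lines through safe_write."""
    def worker(worker_id):
        for n in range(lines_per_writer):
            safe_write(stream, f"SET worker{worker_id} = {n}\n")

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# An in-memory buffer stands in for sys.stdout so the result can be inspected.
buffer = io.StringIO()
run_writers(buffer)
lines = buffer.getvalue().splitlines()
```

Holding the lock across both the write and the flush keeps one thread's output from being interleaved with another's.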

@netdatabot netdatabot released this Jul 1, 2020 · 256 commits to master since this release

Netdata v1.23.1

Release v1.23.1 of the Netdata Agent is a patch for two significant issues.

PR #9436 fixed an issue where dimensions were marked obsolete and archived simultaneously, which caused segmentation faults. We're grateful to marioem, who first reported the issue, and other members of the Netdata community who contributed their insights and valuable log information, which we used to diagnose and fix the bug.

PR #9428 fixed a significant issue with duplicate alarm IDs, which caused issues in how alarms were sent and displayed in Netdata Cloud.

This release also contains a few additional bug fixes that were not fully reviewed before the release of v1.23.0.

Bug fixes

  • Disallow dimensions and charts from being obsolete and archived simultaneously. (#9436, @mfundul)
  • Fix duplicate alarm ids in health-log.db (#9428, @stelfrag)
  • Show cgroups/containers run by Kubelet without access to Kubernetes cluster information (#9321, @cakrit)
  • Fix children version on stream (#9438, @thiagoftsm)
  • Fix internal registry (#9434, @thiagoftsm)
  • Correct virtualization detection in system-info.sh (#9425, @Ferroin)
  • Fix the unittest execution (#9445, @thiagoftsm)
  • Update description in registry with minor copy edits (#9441, @amoss)
  • Stop reading from /proc/sys/kernel/osrelease at trailing newline (#9374, @sjuxax)

@netdatabot netdatabot released this Jun 25, 2020 · 286 commits to master since this release

Release v1.23.0

The v1.23.0 release of the Netdata Agent is all about unlocking new depths of visibility for your applications, services, and systems. We have Kubernetes service discovery, new eBPF metrics like virtual filesystem switch and bandwidth per process out of the Linux kernel at event frequency, more interoperability with your monitoring stack thanks to a new exporting engine, and much more.

This release contains 2 new collectors, 1 new exporting connector, 1 new alarm notification method, 55 improvements, 45 documentation updates, and 40 bug fixes.

At a glance

Our service discovery collector detects Kubernetes (k8s) pods and immediately collects metrics from 22 different services as the associated pods are created, destroyed, and scaled. Service discovery is installed when you use our Helm chart, which means you can now collect and visualize service-, pod-, Kubelet-, kube-proxy-, and node-level k8s metrics with one helm install command and zero configuration. All our Kubernetes monitoring components are open source and free for clusters of any size.

Our low-level Linux kernel monitoring via eBPF is now supercharged. Thanks to an integration with apps.plugin, you can now monitor how a specific application interacts with the Linux kernel. This update also includes new metrics, such as virtual filesystem switch, bandwidth per process, and much more. Netdata collects these metrics at an event frequency, even better than our famous 1s granularity, so that you can debug applications or anomalies with pinpoint accuracy. The eBPF collector is also now installed and enabled by default except on static builds.

Read our guide on troubleshooting apps with eBPF metrics for more details.

Netdata is now more interoperable with your existing monitoring stack thanks to the exporting engine, which replaces the backends system. You can now export to multiple external databases through Graphite, Google Cloud Pub/Sub, Prometheus remote write, MongoDB, and JSON connectors, plus others. Send metrics as soon as they're collected to enrich single pane of glass views or analyze Netdata's metrics with machine learning.

Read our guide on exporting metrics to Graphite for specifics on just one of many pipelines you can set up to archive your Netdata metrics.
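
For readers unfamiliar with what a Graphite-style export looks like on the wire, here is a minimal sketch of Graphite's plaintext protocol, in which each sample is one line of the form "path value timestamp". The graphite_line helper and the prefix.hostname.chart.dimension path layout are assumptions made for illustration, not code taken from the exporting engine.

```python
import time


def graphite_line(prefix, hostname, chart, dimension, value, timestamp=None):
    """Format one sample in Graphite's plaintext protocol: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    # Assumed path layout for illustration: prefix.hostname.chart.dimension
    path = ".".join([prefix, hostname, chart, dimension])
    return f"{path} {value} {int(timestamp)}\n"


# Hypothetical host, chart, and value used only to show the resulting line.
line = graphite_line("netdata", "node1", "system.cpu", "user", 1.5, 1593043200)
```

A receiver that speaks the plaintext protocol accepts lines like this over a TCP connection, one sample per line.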

We're also releasing an improvement for the availability of your monitoring and metrics: persistent metadata. The Agent now writes metadata to disk alongside metrics to allow access to non-active charts from Netdata Cloud and enable future features.

We added some enhancements to our documentation site, including a new guides section. We'll continue to populate it with more use case- and scenario-based content to help you monitor, troubleshoot, visualize, and export your Netdata metrics.

Acknowledgments

Improvements

  • Added libuv thread names support to FATAL log level. (#9382) by mfundul
  • Updated the React dashboard to v1.0.14_2. (#9350) by jacekkolasa
  • Improved PR guidelines for developers and contributors. (#8809) by prologic
  • Removed master-slave verbiage and replaced it with parent-child. (#9323) by amoss, (#9312) by joelhans
  • Added support for persistent metadata. (#9324) by stelfrag
  • Added verbose prints for when the spawn server fails to spawn. (#9305) by mfundul
  • Updated the streaming protocol to calculate clock slew and gap size when child nodes reconnect to a parent. (#9214) by amoss
  • Implemented a new incremental parser for internal plugins and child nodes. (#9074) by stelfrag
  • Improved database engine by reducing its minimum size to 64 MiB. (#9094) by mfundul
  • Added alphabetical sort and automatic scroll to dash.html. (#8762) by tnyeanderson
  • Added a spawn server to improve Agent scalability by reducing the impact of alarm execution and notification to critical sections in the main health thread. (#8407) by mfundul

Netdata Cloud

  • Added metrics for ACLK performance and status to the Netdata Monitoring section of the dashboard. (#9269) by underhood
  • Improved the node re-claiming process by regenerating the topic base. (#9044) by amoss

Collectors

  • Updated the Go orchestrator to v0.19.2. (#9340) by ilyam8
  • Added the agent-service-discovery collector plugin to apps_groups.conf. (#9315) by ilyam8
  • Improved consistency of Kubernetes cgroup names. (#9303) by cakrit
  • Updated the Go orchestrator to v0.19.1. (#9309) by ilyam8
  • Added imunify and lsphp to apps_groups.conf. (#9284) by thiagoftsm
  • Updated the Go orchestrator to v0.19.0. (#9294) by ilyam8
  • Added support for the eBPF collector in static installations (kickstart-static64.sh). (#8879) by prologic
  • Updated the eBPF kernel-collector to v0.4.0. See the changelog for details. (#9212) by Ferroin
  • Added integration between ebpf.plugin and apps.plugin. (#9178) by thiagoftsm
  • Converted the eBPF collector into a modular design to allow multiple eBPF programs to run in parallel. (#9148) by thiagoftsm
  • Added an OSD size collection chart to the Ceph collector. (#8649) by elelayan
  • Updated the eBPF kernel-collector to v0.2.0. See the changelog for details. (#9118) by prologic
  • Improved system-info.sh to better handle certain cases when gathering info on the system's disk capacity. (#7902) by Ferroin
  • Changed the eBPF collector to install and enable it by default. (#8665) by Ferroin
  • Enhanced the Samba collector to only use sudo when not running as the root user. (#9038) by Duffyx
  • Renamed the eBPF collector from ebpf_process.plugin to ebpf.plugin. (#8822) by thiagoftsm
  • Added more command line options to the eBPF collector to support upcoming features. (#8879) by thiagoftsm
  • Added compatibility for Varnish Cache Plus in the varnish collector. (#8940) by pgjavier

Packaging/installation

  • Added new streaming files into CMake build. (#9316) by underhood
  • Added support for macOS/Homebrew in install-required-packages.sh. (#8286) by Ferroin
  • Improved reliability of checksums for kickstart.sh/kickstart-static64.sh installation scripts. (#9165) by prologic
  • Added required bundle for libuuid on ClearLinux. (#9060) by Ferroin
  • Removed conflicting EPEL packages. (#9108) by Saruspete

Exporting

  • Moved nc backend to exporting. (#9030) by thiagoftsm
  • Added missing checks to exporting engine. (#9034) by thiagoftsm
  • Added new alarms for exporting engine resource usage and deprecation of backends. (#9075) by thiagoftsm
  • Added an error report to the AWS Kinesis connector. (#9048) by thiagoftsm
  • Added memory cleanup to remaining exporting connectors. (#9098) by thiagoftsm
  • Added a warning if the exporting engine's update interval is not a multiple of the database's update interval. (#9131) by vlvkobal
  • Added anonymous statistics to exporting engine to collect usage data. (#9125) by vlvkobal
  • Improved dynamic memory cleanup for Pub/Sub exporting connector. (#9112) by vlvkobal
  • Improved dynamic memory cleanup for the MongoDB exporting connector. (#9103) by vlvkobal
  • Finalized the main cleanup function for the exporting engine. (#9099) by vlvkobal
  • Added a function to help clean up memory on exit. (#9081) by vlvkobal
  • Added a Google Cloud Pub/Sub connector to the exporting engine. (#8855) by vlvkobal
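
One item above (#9131) adds a warning when the exporting engine's update interval is not a multiple of the database's update interval. The check itself is a simple modulo test. The sketch below illustrates it with a hypothetical interval_warning function and message text; it is not the engine's actual code or wording.

```python
def interval_warning(exporting_every, database_every):
    """Return a warning string if exporting_every is not a multiple of
    database_every (both in seconds); return None otherwise."""
    if exporting_every <= 0 or database_every <= 0:
        raise ValueError("intervals must be positive")
    if exporting_every % database_every != 0:
        # Hypothetical message text, for illustration only.
        return (
            f"exporting interval {exporting_every}s is not a multiple of "
            f"the database interval {database_every}s"
        )
    return None


aligned = interval_warning(10, 1)      # 10 is a multiple of 1
misaligned = interval_warning(15, 10)  # 15 is not a multiple of 10
```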

Notifications

  • Added support for Matrix notifications. (#9196) by okias

CI/CD

  • Removed Gentoo from CI checks. (#9327) by prologic
  • Added a random offset to the update script when running non-interactively. (#9245) by Ferroin
  • Added a CI check for building against LibreSSL. (#9216) by prologic
  • Added a health check functionality to Docker images. (#9172) by Ferroin
  • Added CI for static builds of the Netdata Agent (used by kickstart-static64.sh). (#9130) by prologic
  • Removed deprecated documentation Dockerfile and associated Docker Hub image. (#9126) by prologic
  • Removed deprecated documentation tooling. (#8783) by prologic
  • Added a CI job to check Markdown links during PRs. (#9003) by joelhans
  • Removed Polyverse Polymorphic Linux from Docker builds to reduce the image size. (#8802) by Ferroin
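
The random offset added to the update script (#9245) applies only to non-interactive runs. A plausible reason, though the notes do not state one, is to keep many nodes from requesting updates at the same moment. The sketch below shows the general idea in Python rather than the shell the updater is written in. The update_delay function and the one-hour ceiling are assumptions, not values from the script.

```python
import random
import sys


def update_delay(max_offset_seconds, interactive, rng=random):
    """Return how many seconds to wait before updating.

    Interactive runs start immediately; non-interactive runs wait a random
    number of seconds between 0 and max_offset_seconds inclusive.
    """
    if interactive:
        return 0
    return rng.randint(0, max_offset_seconds)


# Treat a terminal on stdin as an interactive run (an assumption for this sketch).
is_interactive = sys.stdin is not None and sys.stdin.isatty()
delay = update_delay(3600, is_interactive)
```

A caller would sleep for the returned number of seconds before starting the update.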

Documentation

  • Fixed a typo in the Synology installation documentation. (#9400) by pkrasam
  • Added a guide for troubleshooting with eBPF metrics. (#9352) by joelhans
  • Improved the FreeBSD installation documentation. (#9116) by thoggs
  • Added a missing slash to the claiming documentation. (#9257) by oneoneonepig
  • Changed the recommended repository for CentOS 8 users. (#9308) by Ferroin
  • Added a guide for exporting metrics to Graphite. (#9285) by joelhans
  • Added a link in the eBPF documentation to the kernel documentation for ftrace. (#9211) by Steve8291
  • Fixed curly to straight apostrophe. (#8723) by zack-shoylev
  • Added documentation and dashboard information for new eBPF-apps.plugin integration. (#9199) by thiagoftsm
  • Moved and refactored docs to accommodate new Guides section on Learn. (#9266) by joelhans
  • Removed outdated information/links from main README and registry doc. (#9265) by joelhans
  • Added notes/known issues section to installation page. (#9053) by joelhans
  • Fixed ambiguity in health reference for of and foreach options in lookup line. (#9255) by underhood
  • Added a new "home base" document for the exporting engine. (#9246) by joelhans
  • Improved database engine documentation for streaming setups. (#9177) by joelhans
  • Fixed typo in eBPF collector README.md. (#9205) by Steve8291
  • Fixed typo in README.md. (#9151) by stephenrauch
  • Removed the "experimental" label from the exporting engine documentation. (#9171) by vlvkobal
  • Fixed typo in step 3 of step-by-step guide. (#9150) by waybeforenow
  • Added a Certbot troubleshooting section to step 10 of the step-by-step guide. (#9000) by Jelmerrevers
  • Updated eBPF documentation to reflect default enabled status. (#9105) by joelhans
  • Added ACLK connection details. (#9047) by zack-shoylev
  • Added CMake to the list of packages to install on FreeBSD installations. (#9031) by zvarnes
  • Improved Synology installation document with better formatting and instructions. (#8658) by thenktor
  • Updated pfSense installation document with new packages and processes. (#8544) by electropup42
  • Updated documentation contributing guidelines and Netdata style guide. (#8781) by joelhans
  • Added links to promote database engine calculator. (#9067) by joelhans
  • Updated exporting engine documentation to prepare for enabling it by default. (#9066) by vlvkobal
  • Added requirements to the ProxySQL collector documentation. (#9071) by ilyam8
  • Added proc.plugin configuration example for high-processor systems. (#9062) by joelhans
  • Added frontmatter for exporting connectors. (#9052) by joelhans
  • Fixed grammar error in HAProxy documentation. (#8703) by cherouvim
  • Updated FreeBSD package installation documentation. (#8643) by thenktor
  • Fixed docker run instruction in claiming document. (#9058) by ilyam8
  • Added a note about restarting a node during reclaiming. (#9049) by zack-shoylev
  • Removed mentions of old Cloud and replaced them with new Cloud/dashboard. (#8874) by joelhans
  • Fixed broken link in web server log guide on GitHub. (#9033) by joelhans
  • Removed emoji from step-by-step guide. (#8872) by MeganBishopMoore
  • Added text to claiming documentation about reclaiming. (#9027) by joelhans
  • Updated daemon output with new URLs and dates. (#8965) by joelhans
  • Added netdatalib and netdatacache volumes to the Docker-with-Caddy documentation. (#8999) by webash
  • Fixed an incorrect file name in the Go-based web log collector. (#8964) by gruentee
  • Removed incorrect UNUSED from flood protection configuration options documentation. (#8964) by mfundul
  • Fixed internal links and removed obsolete admonitions. (#8946) by joelhans
  • Updated docs with go-live claiming and ACLK information. (#8960) by joelhans

Bug fixes

  • Fixed a Coverity defect. (#9402) by amoss
  • Fixed a bug in the simple exporting connector that caused crashes when both opentsdb:https and another connector were enabled together. (#9389) by vlvkobal
  • Fixed missing host variables on stream. (#9396) by thiagoftsm
  • Fixed race-hazard in streaming during the shutdown sequence. (#9370) by amoss
  • Fixed error handling and recovery during compaction and metadata log replay. (#9354) by stelfrag
  • Fixed ACLK shutdown sequence. (#9367) by underhood
  • Fixed logging by replacing assert() calls with new fatal_assert(). (#9349) by mfundul
  • Fixed issues with CentOS 6 installations by getting Netdata execution path early to avoid user permission issues. (#9339) by mfundul
  • Fixed issues with ebpf.plugin and apps.plugin integration. (#9333) by thiagoftsm
  • Fixed Coverity warnings in database. (#9338) by mfundul
  • Fixed compiler warnings from the database when the Agent is compiled with the --disable-cloud flag. (#9337) by stelfrag
  • Fixed invalid memory access in databases to avoid Coverity errors. (#9326) by stelfrag
  • Fixed broken updates due to enabling the eBPF collector by default with a dummy --enable-ebpf flag. (#9310) by Ferroin
  • Fixed exporting to Cortex by adding an additional HTTP header to the Prometheus remote write connector. (#9302) by vlvkobal
  • Fixed a race hazard causing crashes in streaming configurations. (#9297) by amoss
  • Fixed handling of OpenSSL on CentOS/RHEL by bundling a static copy and selecting a configuration directory at install time. (#9263) by Ferroin
  • Fixed static installation overwriting netdata.conf. (#9174) by Ferroin
  • Fixed compilation on older systems (Ubuntu 14.04 LTS, Debian 8, CentOS 6). (#9198) by ktsaou
  • Fixed broken unit tests for the exporting engine. (#9183) by vlvkobal
  • Fixed an issue with the exporting engine not cleaning a string on exit. (#9188) by vlvkobal
  • Fixed issue with incremental parser breaking CMake builds. (#9186) by stelfrag
  • Fixed the eBPF collector failing to install on certain systems. (#9182) by prologic
  • Fixed Coverity warning. (#9180) by thiagoftsm
  • Fixed required packages for Gentoo builds. (#9141) by vsc55
  • Fixed Coverity warning. (#9157) by stelfrag
  • Fixed broken collector plugins due to bug in parser. (#9158) by stelfrag
  • Fixed the Xenstat collector to correctly track the last number of vCPUs. (#8720) by rushikeshjadhav
  • Fixed incorrect link in install-required-packages.sh to help users submit a GitHub issue. (#8911) by prologic
  • Fixed enable/start of netdata service in Debian package. (#9005) by MrFreezeex
  • Fixed buffer splitting in the Kinesis exporting connector. (#9122) by vlvkobal
  • Fixed suid bits on plugin for Debian packaging. (#8996) by MrFreezeex
  • Fixed zombie processes in Docker image by restoring SIGCHLD signal handler. (#9107) by mfundul
  • Fixed static installation to not overwrite netdata.conf when updating. (#9046) by Ferroin
  • Fixed typo in the dashboard's description of the mem.kernel chart. (#9096) by Neamar
  • Fixed incorrectly formatted TYPE lines in the Prometheus backend/exporter. (#9086) by jeffgdotorg
  • Fixed error handling in the exporting connector. (#8910) by vlvkobal
  • Added a missing bracket to the Netdata API swagger .json file. (#8814) by dpsy4
  • Fixed the health entity calculation used for ram_in_use and used_ram_to_ignore in systems using ZFS. (#8913) by araemo
  • Fixed incorrect hostnames in the exporting engine. (#8892) by vlvkobal
  • Fixed an issue with the PostgreSQL collector to correctly ignore template1/template0 databases. (#8929) by slavaGanzin

@netdatabot netdatabot released this May 12, 2020 · 624 commits to master since this release

Netdata v1.22.1

Release v1.22.1 is a hotfix release to address issues related to packaging and how Agents connect to Netdata Cloud.

With packaging, we fixed an error that caused DEB and RPM packages to only display the old dashboard and not the new React version. We also fixed an issue that caused Netdata Docker containers to fail due to incorrect permissions. Finally, we ensured JSON-C is correctly fetched and built for compatibility with Netdata Cloud.

We appreciate our community's help in identifying and diagnosing these issues so we could fix them quickly.

For Netdata Cloud, we optimized the on-connect payload sent through the Agent-Cloud link to improve latency between Agents and Cloud. We also removed a check for old alarm status when sending alarms to Cloud via the ACLK.

Finally, we made a fix that ensures Agents running on systems using the musl C library can receive auto-updates.

Bug fixes

  • Fixed the latency issue on the ACLK and suppressed the diagnostics. (#8992) by amoss and stelfrag
  • Restored old semantics of "netdata -W set" command. (#8987) by mfundul
  • Added JSON-C packaging files to make dist. (#8986) by Ferroin
  • Fixed bundling of React dashboard in DEB and RPM packages. (#8988) by Ferroin
  • Removed check for old alarm status. (#8978) by stelfrag
  • Fixed shutdown via netdatacli with musl C library. (#8931) by mfundul

@netdatabot netdatabot released this May 11, 2020 · 646 commits to master since this release

Release v1.22.0

Release v1.22.0 marks the official launch of our rearchitected Netdata Cloud! This Agent release contains both backend and interface changes necessary to connect your distributed nodes to this dramatically improved experience.

Netdata Cloud builds on top of our open source monitoring Agent to give you real-time visibility for your entire infrastructure. Once you've connected your Agents to Cloud, you can view key metrics, insightful charts, and active alarms from all your nodes in a single web interface. When an anomaly strikes, seamlessly navigate to any node to troubleshoot and discover the root cause with the familiar Netdata dashboard.

Sign in to Cloud and read our Get started with Cloud guide for details on updating your nodes, claiming them, and navigating the new Cloud.

While Netdata Cloud offers a centralized method of monitoring your Agents, your metrics data is not stored or centralized in any way. Metrics data remains with your nodes and is only streamed to your browser through Cloud.

In addition, Cloud only expands on the functionality of the wildly popular free and open source Agent. We will never make any of our open source Agent features Cloud-exclusive, and we will actively continue to develop the Agent so that we can integrate new features with Netdata Cloud.

This release also contains 1 new collector, 1 new exporting connector, 1 new alarm notification method, 27 improvements, 16 documentation updates, and 22 bug fixes.

At a glance

We added a new collector called whoisquery that helps you monitor a domain name's expiration date. You can track as many domains as you'd like, and set custom warning and critical thresholds for each. For more information on setup and configuration, see the Whois domain expiry monitoring documentation.
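
The idea behind expiry monitoring with warning and critical thresholds can be sketched in a few lines. The expiry_status function and the 30-day and 15-day defaults below are hypothetical. They are not the whoisquery collector's code or its shipped alarm values, and a real check would first obtain the expiration date from a WHOIS lookup.

```python
from datetime import datetime, timezone


def expiry_status(expires_at, now, warning_days=30, critical_days=15):
    """Return (days_remaining, status) for a domain expiration date.

    status is "critical", "warning", or "clear". The default thresholds
    are illustrative assumptions only.
    """
    remaining = (expires_at - now).days
    if remaining <= critical_days:
        return remaining, "critical"
    if remaining <= warning_days:
        return remaining, "warning"
    return remaining, "clear"


# Fixed dates keep the example deterministic.
now = datetime(2020, 5, 11, tzinfo=timezone.utc)
soon = expiry_status(datetime(2020, 6, 1, tzinfo=timezone.utc), now)
later = expiry_status(datetime(2021, 1, 1, tzinfo=timezone.utc), now)
```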

We added a new connector to our experimental exporting engine: Prometheus remote write. You can use this connector to send Netdata metrics to your choice of more than 20 external storage providers for long-term archiving and further analysis.

Our new documentation experience is now available at Netdata Learn! We encourage you to try it out and give us feedback or ask questions in our GitHub issues. Learn features documentation for both the Agent and Cloud in separate-but-connected vaults, which streamlines the experience of learning about both products.

While Learn only features documentation for now, we plan on releasing more types of educational content serving the Agent's open-source community of developers, sysadmins, and DevOps folks. We'll have more to announce soon, but in the meantime, we hope you enjoy what we believe is a smoother (and prettier) docs experience.

Acknowledgments

  • amishmm for updating netdata.conf and netdata.service.v235.in.
  • adamwolf for fixing a typo in netdata-installer.sh.
  • lassebm for fixing a crash when shutting down an Agent with the ACLK disabled.
  • yasharne for adding a new whoisquery collector and for adding health alarm templates for both the whoisquery and x509check collectors.
  • illumine for adding Dynatrace as a new alarm notification method.
  • slavaGanzin, carehart, Jiab77, and IceCodeNew for documentation fixes and improvements.

Breaking changes

  • The previous iteration of Netdata Cloud, accessible through various Sign in and Nodes view (beta) buttons on the Agent dashboard, is deprecated in favor of the new Cloud experience.
  • Our old documentation site (docs.netdata.cloud) was replaced with Netdata Learn. All existing backlinks redirect to the new site.
  • Our localization project is no longer actively maintained. We're grateful for the hard work of its contributors.

Improvements

Netdata Cloud

Collectors

Packaging/installation

  • Added missing NETDATA_STOP_CMD in netdata-installer.sh. (#8897) by prologic
  • Added JSON-C dependency handling to installation and packaging. (#8776) by Ferroin
  • Added a check to wait for a recently-published tag to appear in Docker Hub before publishing new images. (#8713) by knatsakis
  • Removed obsolete scripts from Docker images. (#8704) by knatsakis
  • Removed obsolete DEVEL support from Docker images. (#8702) by knatsakis
  • Improved how we publish Docker images by pushing synchronously. (#8701) by knatsakis

Exporting

  • Enabled internal statistics for the exporting engine in the Agent dashboard. (#8635) by vlvkobal
  • Implemented a Prometheus exporter web API endpoint. (#8540) by vlvkobal

Notifications

  • Added a certificate revocation alarm for the x509check collector. (#8684) by yasharne
  • Added the ability to send Agent alarm notifications to Dynatrace. (#8476) by illumine

CI/CD

  • Disabled document-start yamllint check. (#8522) by ilyam8
  • Simplified Docker build/publish scripts to support only a single architecture. (#8747) by knatsakis
  • Added Fedora 32 to build checks. (#8417) by Ferroin
  • Added libffi to ArchLinux CI tests as a workaround for an upstream bug. (#8476) by Ferroin

Other

  • Updated main copyright and links for the year 2020 in daemon help output. (#8937) by zack-shoylev
  • Moved the bind to option to the [web] section and updated netdata.service.v235.in to sync it with recent changes. (#8454) by amishmm
  • Put old dashboard behind a prefix instead of using a script to switch. (#8754) by Ferroin
  • Enabled the truthy rule in yamllint. (#8698) by ilyam8
  • Added Borg backup, Squeezebox servers, Hiawatha web server, and Microsoft SQL to apps.plugin so that it can appropriately group them by type of service. (#8646), (#8655), (#8656), and (#8659) by vlvkobal

Documentation

  • Added a custom label to collectors frontmatter to fix sidebar titles in generated docs site at learn.netdata.cloud. (#8936) by joelhans
  • Added instructions to persist metrics and restart policy in Docker installations. (#8813) by joelhans
  • Fixed modifier in Nginx guide to ensure correct paths and filenames. (#8880) by slavaGanzin
  • Added documentation for working around Clang build errors. (#8867) by Ferroin
  • Fixed typo in Docker installation instructions. (#8861) by carehart
  • Added Docker instructions to claiming docs. (#8755) by joelhans
  • Capitalized title in streaming doc. (#8712) by zack-shoylev
  • Updated pfSense doc and added warning for apcupsd users. (#8686) by cryptoluks
  • Improved offline installation instructions to point to correct installation scripts and clarify process. (#8680) by IceCodeNew
  • Added missing path to the process of editing charts.d.conf. (#8740) by Jiab77
  • Added combined claiming and ACLK documentation. (#8724) by joelhans
  • Standardized how we link between various Agent-specific documentation. (#8638) by joelhans
  • Pinned mkdocs-material to re-enable Netlify builds of documentation site. (#8639) by joelhans
  • Updated main README.md with v1.21 release news. (#8619) by joelhans
  • Changed references of MacOS to macOS. (#8562) by joelhans

Bug fixes

  • Fixed kickstart error by removing old cron symlink. (#8849) by prologic
  • Fixed bundling of old dashboard in binary packages. (#8844) by Ferroin
  • Fixed typo in netdata-installer.sh. (#8811) by adamwolf
  • Fixed failure output during installations by removing old function call. (#8824) by Ferroin
  • Fixed bundle-dashboard.sh script to prevent broken package builds. (#8823) by prologic
  • Fixed mdstat failed devices alarm. (#8752) by ilyam8
  • Fixed rare race condition in old Cloud iframe. (#8786) by jacekkolasa
  • Removed no-clear-notification options from portcheck health templates. (#8748) by ilyam8
  • Fixed issue in system-info.sh regarding the parsing of lscpu output. (#8754) by Ferroin
  • Fixed old URLs to silence Netlify's mixed content warnings. (#8759) by knatsakis
  • Fixed master streaming fatal exits. (#8780) by thiagoftsm
  • Fixed email authentication to Cloud/Nodes View. (#8757) by jacekkolasa
  • Fixed non-escaped characters in private registry URLs. (#8757) by jacekkolasa
  • Fixed crash when shutting down an Agent with the ACLK disabled. (#8725) by lassebm
  • Fixed Docker-based builder image. (#8718) by ilyam8
  • Fixed status checks for UPS devices using the apcupsd collector. (#8688) by ilyam8
  • Fixed the build matrix in the build and install GitHub Actions checks. (#8715) by Ferroin
  • Fixed eBPF collector compatibility with the 7.x family of RedHat. (#8694) by thiagoftsm
  • Fixed alarm notification script by adding a check to the Dynatrace notification method. (#8654) by ilyam8
  • Fixed threads_creation_rate chart context in the python.d MySQL collector. (#8636) by ilyam8
  • Fixed errors shown when running install-required-packages.sh on certain Linux systems. (#8606) by ilyam8
  • Fixed sudo check in charts.d libreswan collector to prevent daily security notices. (#8569) by ilyam8

@netdatabot netdatabot released this Apr 13, 2020 · 812 commits to master since this release

Netdata v1.21.1

Release v1.21.1 is a hotfix release to improve the performance of the new React dashboard, which was merged and enabled by default in v1.21.0.

The React dashboard shipped in v1.21.0 did not properly freeze charts that were outside of the browser's viewport. If a user loaded many charts by scrolling through the dashboard, charts outside of their browser's viewport continued updating. This excess of chart updates caused all charts to update more slowly than every second.

v1.21.1 includes improvements to the way the Netdata dashboard freezes, maintains state, and restores charts as users scroll.

@netdatabot netdatabot released this Apr 6, 2020 · 836 commits to master since this release

Netdata v1.21.0

Release v1.21.0 contains 2 new collectors, 3 new exporting connectors, 37 bug fixes, 46 improvements, and 25 documentation updates. We also made 26 bug fixes or improvements related to the upcoming release of Netdata Cloud.

At a glance

We added a new collector for Apache Pulsar, a popular open-source distributed pub-sub messaging system. We use Pulsar in our Netdata Cloud infrastructure (more on that later this month!), and are excited to start sharing metrics about our own Pulsar systems when the time comes. The Pulsar collector attempts to auto-detect any running Pulsar processes, but you can always configure the collector based on your setup.

Also new in v1.21 is a VerneMQ collector. We use the open-source MQ Telemetry Transport (MQTT) broker for Netdata Cloud as well. As with Pulsar, you can configure the VerneMQ collector to auto-detect your installation in just a few steps.

Our experimental exporting engine received significant updates with new connectors for Prometheus remote write, MongoDB, and AWS Kinesis Data Streams. You can now send Netdata metrics to more than 20 additional external storage providers for long-term archiving and deeper analysis. Learn more about the exporting engine in our documentation.

We upgraded our TLS compatibility to include 1.3, which applies to HTTPS for both Netdata's web server and streaming connections. TLS 1.3 is the most up-to-date version of the TLS protocol, and contains important fixes and improvements to ensure strong encryption. If you enabled TLS in the web server or streaming, Netdata attempts to use 1.3 by default, but you can also set the version and ciphers explicitly. Learn more in the documentation.

The Netdata dashboard has been completely re-written in React. While the look and behavior haven't changed, these under-the-hood changes enable a suite of new features, UX improvements, and design overhauls. With React, we'll be able to work faster and better resource our talented engineers.

As part of the ongoing work to polish our eBPF collector tech preview, we've now proven the collector's performance is very good, and have vastly expanded the number of operating system versions the collector works on. Learn how to enable it in our documentation. We've also extensively stress-tested the eBPF collector and found that it's impressively fast given the depth of metrics it collects! Read up on our benchmarking analysis on GitHub.

Acknowledgments

  • Jiab77 for helping remove extra printed \n in various installation methods.
  • SamK for fixing missing folders in /var/ for .deb installations.
  • kevenwyld for improving Netdata's support of RHEL distributions.
  • WoozyMasta for adding in the ability to get Kubernetes pod names with kubectl in bare-metal deployments.
  • paulmezz for adding the ability to connect to non-admin user IDs when trying to collect metrics from a Ceph storage cluster.
  • ManuelPombo for adding additional charts to our Postgres collector, and anayrat for helping review the changes.
  • Default for adding lsyncd to the backup group in apps.plugin.
  • bceylan, peroxy, toadjaune, grinapo, m-rey, and YorikSar for documentation fixes.

Breaking changes

None.

Improvements

  • Extended TLS support for 1.3. (#8505) by thiagoftsm
  • Switched to the React dashboard code as the default dashboard. (#8363) by Ferroin

Collectors

  • Added a new Pulsar collector. (#8364) by ilyam8
  • Added a new VerneMQ collector. (#8236) by ilyam8
  • Added high precision timer support for plugins such as idlejitter. (#8441) by mfundul
  • Added an alarm to the dns_query collector that detects DNS query failure. (#8434) by ilyam8
  • Added the ability to get the pod name from cgroup with kubectl in bare-metal deployments. (#7416) by WoozyMasta
  • Added the ability to connect to non-admin user IDs for a Ceph storage cluster. (#8276) by paulmezz
  • Added connections (backend) usage to Postgres monitoring. (#8126) by ManuelPombo
  • eBPF: Added support for additional Linux kernels found in Debian 10.2 and Ubuntu 18.04. (#8192) by thiagoftsm

Packaging/installation

  • Added missing override for Ubuntu Eoan. (#8547) by prologic
  • Added Docker build arguments to pass extra options to Netdata installer. (#8472) by Ferroin
  • Added deferred error message handling to the installer. (#8381) by Ferroin
  • Fixed cosmetic error checking for CentOS 8 version in install-required-packages.sh. (#8339) by prologic
  • Added various fixes and improvements to the installers. (#8315) by Ferroin
  • Migrated to installing only Python 3 packages during installation. (#8318) by Ferroin
  • Improved support for RHEL by not installing the CUPS plugin when v1.7 of CUPS cannot be installed. (#7216) by kevenwyld
  • Added support for Clear Linux in install-required-packages.sh. (#8154) by Ferroin
  • Removed Fedora 29 from CI and packaging. (#8100) by Ferroin
  • Removed Ubuntu 19.04 from CI and packaging. (#8040) by Ferroin
  • Removed OpenSUSE Leap 15.0 from CI. (#7990) by Ferroin

Exporting

  • Added a MongoDB connector to the exporting engine. (#8416) by vlvkobal
  • Added a Prometheus Remote Write connector to the exporting engine. (#8292) by vlvkobal
  • Added an AWS Kinesis connector to the exporting engine. (#8145) by vlvkobal

Documentation

  • Fixed typo in main README.md. (#8547) by bceylan
  • Updated the update instructions with per-method details. (#8394) by joelhans
  • Updated paragraph on install-required-packages.sh. (#8347) by prologic
  • Added Patti's dashboard video to the documentation. (#8385) by joelhans
  • Fixed go.d modules in the COLLECTORS.md. (#8380) by ilyam8
  • Added frontmatter to all documentation in bulk. (#8354) and (#8372) by joelhans
  • Fixed MDX parsing in installation guide. (#8362) by joelhans
  • Fixed typo in eBPF documentation. (#8360) by ilyam8
  • Fixed links in packaging/installer to work on GitHub and docs. (#8319) by joelhans
  • Fixed typo in main README.md. (#8335) by peroxy
  • Removed mention saying that .deb packages are experimental. (#8250) by toadjaune
  • Added standards for abbreviations/acronyms to docs style guide. (#8313) by joelhans
  • Tweaked eBPF documentation, and added performance data. (#8261) by joelhans
  • Added requirements for the exim collector. (#8096) by petarkozic
  • Fixed misspelling of openSUSE and SUSE. (#8233) by m-rey
  • Added OpenGraph tags to documentation pages. (#8224) by joelhans
  • Fixed typo in custom dashboard documentation. (#8213) by shortpatti
  • Removed extra asterisks in main README. (#8193) by grinapo
  • Added eBPF README to documentation navigation and improved page title. (#8191) by joelhans
  • Fixed figure+image without closing tag in new documentation. (#8177) by joelhans
  • Corrected instructions for running Netdata behind Apache. (#8169) by cakrit
  • Added PR title guidelines to the contribution guidelines to make CHANGELOG.md more meaningful. (#8150) by cakrit
  • Fixed formatting in Custom dashboards documentation. (#8102) by YorikSar
  • Updated the manual install documentation with better information about CentOS 6. (#8088) by Ferroin
  • Added tutorials to support v1.20 release. (#7943) by joelhans

CI/CD

  • Added logic to bail early on LWS build if cmake is not present. (#8559) by Ferroin
  • Added python.d configuration files to YAML linting CI process and increase line limit to 120 characters. (#8541) and (#8542) by ilyam8
  • Cleaned up GitHub Actions workflows. (#8383) by Ferroin
  • Migrated tests from Travis CI to Github Workflows. (#8331) by prologic
  • Covered install-required-packages.sh with Coverity scan. (#8388) by prologic
  • Added support for cross-host docker-compose builds. (#7754) by amoss
  • Reconfigured Travis CI to retry transient failures on lifecycle tests. (#8203) by prologic
  • Switched to checkout@v2 in GitHub Actions. (#8170) by ilyam8

Other

  • Added lsyncd to the backup group in apps.plugin. (#8159) by Default

Netdata Cloud

  • Fixed compiler warnings in the claiming code. (#8567) by vlvkobal
  • Fixed regressions in cloud functionality (build, CI, claiming). (#8568) by underhood
  • Switched over to soft feature flag. (#8545) by amoss
  • Improved claiming behavior to run as netdata user by default, or override if necessary. (#8516) by amoss
  • Updated the info endpoint for Cloud notifications. (#8519) by amoss
  • Added correct error logging for ACLK challenge/response. (#8538) by stelfrag
  • Cleaned up Cloud configuration files to move [agent_cloud_link] settings to [cloud]. (#8501) by underhood
  • Enhanced ACLK header payload to include timestamp-offset-usec. (#8499) by stelfrag
  • Added ACLK build failures to anonymous statistics. (#8429) by underhood
  • Added ACLK connection failures to anonymous statistics. (#8456) by underhood
  • Added HTTP proxy support to ACLK. (#8406)/(#8418) by underhood
  • Improved ownership of the claim.d directory. (#8475) by amoss
  • Fixed the ACLK response payload to match the new specification. (#8420) by stelfrag
  • Added the new cloud info in the info endpoint. (#8430) by amoss
  • Implemented ACLK Last Will and Testament. (#8410) by stelfrag
  • Fixed JSON parsing in ACLK. (#8426) by stelfrag
  • Fixed outstanding problems in claiming and added SOCKS5 support. (#8406)/(#8404) by amoss and underhood
  • Fixed the type value for alarm updates in the ACLK. (#8403) by stelfrag
  • Improved performance of ACLK. (#8399)/(#8401) by amoss
  • Improved the ACLK's agent "pop-corning" phase. (#8398) by stelfrag
  • Improved ACLK according to results of the smoke-test. (#8358) by amoss and underhood
  • Added code to bundle LWS in binary packages. (#8255) by Ferroin
  • Added libwebsockets files to make dist. (#8275) by Ferroin
  • Adapted the claiming script to new API responses. (#8245) by hmoragrega
  • Fixed claiming script to reflect Netdata Cloud API changes. (#8220) by cosmix
  • Added libwebsockets bundling code to netdata-installer.sh. (#8144) by Ferroin

Bug fixes

  • Removed notifications from the dashboard and fixed the /default.html route. (#8599) by jacekkolasa
  • Fixed help-tooltips styling, private registry node deletion, and the right-hand sidebar "jumping" on document clicks. (#8553) by jacekkolasa
  • Fixed errors reported by Coverity. (#8593) by thiagoftsm, (#8579) by amoss, and (#8586) by thiagoftsm
  • Added netdata.service.* to .gitignore to hide system/netdata.service.v235 file. (#8556) by vlvkobal
  • Fixed Debian 8 (Jessie) support. (#8590) and (#8593) by prologic
  • Fixed broken Fedora 30/31 RPM builds. (#8572) by prologic
  • Fixed broken pipe ignoring in apps.plugin. (#8554) by vlvkobal
  • Fixed the bytespersec chart context in the Python Apache collector. (#8550) by ilyam8
  • Fixed charts.d.plugin to exit properly during Netdata service restart. (#8529) by ilyam8
  • Fixed minimist dependency vulnerability. (#8537) by jacekkolasa
  • Fixed our Debian/Ubuntu packages to package the expected systemd unit files. (#8468) by prologic
  • Fixed auto-updates for static (kickstart-static64.sh) installs. (#8507) by prologic
  • Fixed openSUSE 15.1 RPM package builds. (#8494) by prologic
  • Fixed how SimpleService truncates Python module names. (#8492) by ilyam8
  • Removed erroneous \n in uninstaller output. (#8446) by prologic
  • Fixed install-required-packages script to self-update apt. (#8491) by prologic
  • Added proper prefix to Python module names during loading. (#8474) by ilyam8
  • Fixed how the Netdata updater script cleans up after being run. (#8414) by prologic
  • Fixed the flushing error threshold with the database engine. (#8425) by mfundul
  • Fixed memory leak for host labels streaming from slaves to master. (#8460) by thiagoftsm
  • Fixed support for uninstalling the eBPF collector in the uninstaller. (#8444) by prologic
  • Fixed a bug involving stop_all_netdata uv_pipe_connect() in the installer. (#8444) by prologic
  • Fixed installer output regarding newlines. (#8447) by prologic
  • Fixed broken dependencies for Ubuntu 19.10. (#8397) by prologic
  • Fixed streaming scaling. (#8375) by mfundul
  • Fixed missing characters in kernel version field by encoding slave fields. (#8216) by thiagoftsm
  • Fixed installation for Ubuntu 14.04. (#7690) by Ehekatl
  • Fixed dependencies for Debian Jessie. (#8290) by Ferroin
  • Fixed dependency names for Arch Linux. (#8334) by Ferroin
  • Removed extra printed \n in various installers. (#8324)/(#8325)/(#8326) by Jiab77
  • Fixed missing folders in /var/ for .deb packages. (#8314) by SamK
  • Fixed Ceph collector to get osd_perf_infos in versions 14.2 and higher. (#8248) by ilyam8
  • Fixed RHEL / CentOS 8.x dependencies for Judy-devel and others. (#8202) by prologic
  • Removed extraneous commas from chart information in dashboard. (#8266) by FlyingSixtySix
  • Removed tmem collection from xenstat_plugin to allow Netdata on Xen 4.13 to compile successfully. (#7951) by rushikeshjadhav
  • Fixed get_latest_version for nightly channel update script. (#8172) by ilyam8
  • Restricted messages to Google Analytics. (#8161) by thiagoftsm
  • Fixed Python 3 dict access in OpenLDAP collector module. (#8162) by Mic92

@netdatabot netdatabot released this Feb 21, 2020 · 1167 commits to master since this release

Netdata v1.20.0

Release v1.20.0 contains 3 new collectors, 54 bug fixes, 89 improvements, and 38 documentation updates.

At a glance

Our first major release of 2020 comes with an alpha version of our new eBPF collector. eBPF (extended Berkeley Packet Filter) is a virtual bytecode machine, built directly into the Linux kernel, that you can use for advanced monitoring and tracing.

With this release, the eBPF collector monitors system calls inside your kernel to help you understand and visualize the behavior of your file descriptors, virtual file system (VFS) actions, and process/thread interactions. You can already use it for debugging applications and better understanding how the Linux kernel handles I/O and process management.

The eBPF collector is in a technical preview, and doesn't come enabled out of the box. If you'd like to learn more about why eBPF metrics are such an important addition to Netdata, see our blog post: Linux eBPF monitoring with Netdata. When you're ready to get started, enable the eBPF collector by following the steps in our documentation.

This release also introduces host labels, a powerful new way of organizing your Netdata-monitored systems. Netdata automatically creates a handful of labels for essential information, but you can supplement the defaults by segmenting your systems based on their location, purpose, operating system, or even when they went live.

You can use host labels to create alarms that apply only to systems with specific labels, or apply labels to metrics you archive to other databases with our exporting engine. Because labels are streamed from slave to master systems, you can now find critical information about your entire infrastructure directly from the master system.

Our host labels tutorial will walk you through creating your first host labels and putting them to use in Netdata's other features.

Finally, we introduced a new CockroachDB collector. Because we use CockroachDB internally, we wanted a better way of keeping tabs on the health and performance of our databases. Given how popular CockroachDB is right now, we know we're not alone, and are excited to share this collector with our community. See our tutorial on monitoring CockroachDB metrics for set-up details.

We also added a new squid access log collector that parses and visualizes requests, bandwidth, responses, and much more. Our apps.plugin collector has a new and improved way of processing groups together, and our cgroups collector is better at LXC (Linux container) monitoring.

Speaking of collectors, we revamped our collectors documentation to simplify how users learn about metrics collection. You can now view a collectors quickstart to learn the process of enabling collectors and monitoring more applications and services with Netdata, and see everything Netdata collects in our supported collectors list.

Acknowledgements

We're extremely grateful to the following contributors for their help since our last major release in November 2019. Whether it's their first or fiftieth contribution, insights from our users not only help make Netdata better, but also remind us why we're so lucky to be part of a vibrant open-source community.

  • k0ste and DefauIt for improving the application groups of the apps plugin.
  • gmeszaros for a fix to the broken updater.
  • blaines for an elasticsearch collector fix.
  • stevenh for adding freeipmi support to our Docker image and lassebm for related fixes and documentation.
  • yasharne for helping us improve the httpcheck collector.
  • candrews for the introduction of -fno-common in CFLAGS.
  • Jiab77 for fixing a typo in the installer options.
  • amishmm for improvements to the systemd service files.
  • tnyeanderson for continuing to improve his multi-host sample dashboard.
  • yasharne and especially schneiderl for corrections to the docs.
  • lucasRolff for improvements to the litespeed collector.
  • Ehekatl for the improvements to the Prometheus remote write API and the fix to the softnet alarm.
  • wonsangki for translating several docs into Korean.
  • candrews for fixing the option to disable the Prometheus remote API from configure.
  • kkoomen for improvements to the Apache proxy guide.
  • vzDevelopment for assistance with the unicode support in the python.d plugin.
  • hexchain for the addition of pressure stall information to the proc plugin.
  • nabijaczleweli and rex4539 for documentation fixes.

Breaking Changes

  • Removed deprecated bash collectors apache, cpu_apps, cpufreq, exim, hddtemp, load_average, mem_apps, mysql, nginx, phpfpm, postfix, squid, tomcat #7962 (ilyam8). If you were still using one of these collectors with custom configurations, you can find the new collector that replaces it in the supported collectors list.
  • Modified the Netdata updater to prevent unnecessary updates right after installation and to avoid updates via local tarballs #7939 (prologic). These changes introduced a critical bug to the updater, which was fixed via #8057 #8076 (prologic) and #8028 (gmeszaros). See issue 8056 if your Netdata is stuck on v1.19.0-432.

Improvements

Host Labels

New Collectors

Collector improvements

  • apps.plugin
  • varnish: Added SMF metrics (cache on disk) #7926 (ilyam8)
  • phpfpm: Fixed per process chart titles and readme #7876 (ilyam8)
  • python.d: Formatted the code in all modules #7832 (ilyam8)
  • node.d/snmp:
      - Added snmpv3 support #7802 (ilyam8)
      - Formatted the code in snmp.node.js #7816 (ilyam8)
  • cgroups: Improved LXC monitoring by filtering out irrelevant LXC cgroups #7760 (vlvkobal)
  • litespeed: Added support for different .rtreport format #7705 (lucasRolff)
  • freeipmi: Added support to the docker image #7081 (stevenh)
  • proc.plugin: Added pressure stall information #7209 #7547 (hexchain)
  • sensors: Improved collection logic #7447 (ilyam8)
  • proc: Started monitoring network interface speed, duplex, operstate #7395 (stelfrag)
  • smartd_log: Fixed the setting in the reallocated sectors count, by setting ATTR5 chart algorithm to absolute #7384 (ilyam8)
  • nvidia-smi: Allow executing nvidia-smi in normal instead of loop mode #7372 (ilyam8)
  • wmi: collect logon metrics, collect logical_disk disk latency metrics
  • weblog: handle MKCOL, PROPFIND, MOVE, SEARCH http request methods
  • scaleio: storage pools and sdcs metrics. (#294)

Exporting Engine

  • Implemented the main flow for the Exporting Engine #7149 (vlvkobal)

Streaming

Installation/Packages

  • Fixed missing directory when creating the symbolic link during eBPF installation and removed future options. #8133 (prologic)
  • Fixed Netdata installer on *BSD systems after libmosquitto and eBPF functionality was enabled. #8121 (prologic)
  • Fixed issues with the RPM nightly builds resulting from the bundled libmosquitto functionality that was recently merged. #8109 (Ferroin)
  • Corrected the invocations of mktemp so that they produce temporary directories in $TEMPDIR instead of the current directory, in a way that is compatible with busybox. #8066 (Ferroin)
  • Improved CI/CD workflow to install required packages and build the agent across all the OS/Distro(s) we support #7969 #7949 (prologic)
  • Updated the installer to download go.d.plugin, only if we have a new version #7946 (ilyam8)
  • Assorted cleanup items in the RPM spec file. #7927 (Ferroin)
  • Added a new, simpler, Alpine based Dockerfile for quick dev and testing #7914 (prologic)
  • Added minor fixes and improvements to the installer/updater shell scripts. #7847 (prologic)
  • Added ReviewDog CI checks
  • Stopped removing netdata groups/users during uninstall (Debian postrm) #7817 (prologic)
  • Started using the system service manager to shut down Netdata. #7814 (Ferroin)
  • Improved the systemd service files, by removing unnecessary ExecStartPre lines and moving global options to netdata.conf #7790 (amishmm)
  • Removed unnecessary echo calls from the updater. #7783 (Ferroin)
  • Fixed warnings in the Debian package build process and enabled the builds to work with older versions of dpkg-buildpackage by modifying the formatting of the trailer line in the Debian changelog template. #7763 (Ferroin)
  • Cleaned up static build process, by using /bin/sh and removing use of sudo #7725 (prologic)
  • Added auto-updates to kickstart-static64 installations. #7704 (Ferroin)
  • Added static build support for Prometheus remote write #7691 (Ehekatl)
  • Moved the script for installing required packages into the main repo. #7563 (Ferroin)
  • Updated the distribution support matrix. #7636 (Ferroin)
  • Added Ubuntu 19.10 to packaging and lifecycle checks. #7629 (Ferroin)
  • Removed EOL distros from CI jobs. #7628 (Ferroin)
  • Made the netdata installer more flexible, to accommodate install with ssl on MacOS #6922 (paulkatsoulakis)
  • Improved shutdown of the Netdata agent on update and uninstall. #7595 (Ferroin)
  • Added Fedora 31 CI integrations. #7524 (Ferroin)
  • Removed CentOS 6 package building and lifecycle tests #7425 (knatsakis), #7430 (ncmans)
  • Removed -f option from groupdel in uninstaller. #7507 (Ferroin)
  • Injected archived backports repository on Debian Jessie for CI package builds. #7495 (Ferroin)
  • Set the default release channel to stable #7399 (ncmans)
  • Removed EOL'd Ubuntu Trusty (14.04) from build #7481 (ncmans)
  • Corrected installer instructions during a non-privileged install #7393 (julidegulen)

Documentation

Privacy

  • Added support for opting out of telemetry via the DO_NOT_TRACK environment variable #7846 #7929 (prologic)
  • Fixed typo in the installer options to disable telemetry #7843 (Jiab77)
  • Improved documentation of opting out of anonymous statistics #7597 (joelhans)
  • Added anon tracking notice for installers #7437 (ncmans)
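
As an illustration of the DO_NOT_TRACK opt-out mentioned above, the following is a minimal Python sketch of how a tool might honor that environment variable. It is not Netdata's implementation: the function name `telemetry_opted_out` is hypothetical, and the rule that empty, `0`, and `false` values do not count as an opt-out is an assumption made for this example.

```python
import os


def telemetry_opted_out(environ=None) -> bool:
    """Return True when the DO_NOT_TRACK environment variable requests an opt-out.

    Assumption for this sketch: any value other than empty, "0", or "false"
    (case-insensitive) counts as an opt-out request.
    """
    env = os.environ if environ is None else environ
    value = env.get("DO_NOT_TRACK", "").strip().lower()
    return value not in ("", "0", "false")


# Passing a plain dict keeps the example independent of the real environment.
print(telemetry_opted_out({"DO_NOT_TRACK": "1"}))  # True
print(telemetry_opted_out({}))                     # False
```

Accepting an optional mapping instead of reading `os.environ` directly makes the check easy to test without modifying the process environment.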

Other

Bug fixes

  • Fixed problems reported by Coverity for eBPF collector plugin. #8135 (thiagoftsm)
  • Fixed invalid literal for float(): NN.NNt error in the elasticsearch python plugin, by adding terabyte unit parsing. #8013 (blaines)
  • Fixed timeout failing in docker containers which broke some python.d collectors #8002 (ilyam8)
  • Fixed python collectors to work on synology6 #7980 (ilyam8)
  • Fixed problem with the httpcheck python collector not being able to check URLs with the POST method, by adding body to the URLService #7956 (ilyam8). Also record the new options in httpcheck.conf #7952 (yasharne)
  • Fixed netdata-updater.sh appearing to fail #7955 (ilyam8)
  • Fixed error/warnings found by shellcheck for the netdata-updater.sh #7938 (prologic)
  • Fixed editing configuration via edit-config, when Netdata is installed to a symlinked /opt #7933 (prologic)
  • Fixed installation failures due to .keep files #7829 (prologic)
  • Fixed installation on FreeBSD systems with non GNU sed #7796 (prologic)
  • Fixed Source0 URL in RPM spec #7794 (prologic)
  • Fixed text if current version is >= latest version and already installed #8078 (prologic)
  • Fixed CentOS 7 RPM build failures. #7993 (Ferroin)
  • Fixed wrong messages during the build process #7989 (Ferroin)
  • Fixed the unit tests for the exporting engine #7784 (vlvkobal)
  • Fixed a Coverity issue with an unchecked return value #7780 (vlvkobal)
  • Fixed port in use after uninstall issue, by resolving a libuv IPC pipe cleanup problem #7778 (mfundul)
  • Fixed dbengine repeated global flushing errors and collectors being blocked, by dropping dirty dbengine pages if the disk cannot keep up #7777 (mfundul)
  • Fixed issue with alarm notifications occasionally ignoring the configured severity filter when the ROLE was set to root. #7769 (thiagoftsm)
  • Fixed Netlink Connection Tracker charts in the nfacct plugin #7727 (vlvkobal)
  • Fixed support for read-only /lib on SystemD systems like CoreOS in static build installation #7726 (prologic)
  • Fixed invalid shell installer error and netdata not starting from its installed location. #7698 (Ferroin)
  • Fixed metric values sent via remote write to Prometheus backends, when using average/sum #7694 (Ehekatl)
  • Fixed unclosed brackets in softnet alarm #7693 (Ehekatl)
  • Fixed SEGFAULT when localhost initialization failed #7663 (underhood)
  • Fixed the handling of permissions in the installer script and the RPM spec file so that they are consistent with each other and with a clean install done with make install. #7632 (Ferroin)
  • Reduced the number of broken pipe error log entries, after a SIGKILL #7588 (thiagoftsm)
  • Fixed a syntax error in the packaging functions. #7686 (Ferroin)
  • Fixed Coverity errors by restoring support for protobuf 3.0 #7683 (vlvkobal)
  • Fixed inability to disable Prometheus remote API #7674 (candrews)
  • Fixed SEGFAULT from the cpuidle plugin #7664 (Saruspete)
  • Fixed samba collector not working, due to inability to run sudo #7655 (ilyam8)
  • Fixed invalid css/js resource errors when URL for slave node has no final / on streaming master #7643 (underhood)
  • Fixed keys_redis chart in the redis collector, by populating keys at runtime #7639 (ilyam8)
  • Fixed UrlService bytes decoding and logger unicode encoding in the python.d plugin #7601 #7614 (ilyam8), #7376 (vzDevelopment)
  • Fixed a warning in the prometheus remote write backend #7609 (vlvkobal)
  • Fixed not detecting more than one adapter in the hpssa collector #7580 (gnoddep)
  • Fixed race condition in dbengine #7565 (thiagoftsm)
  • Fixed race condition with the dbengine page cache descriptors #7478 (mfundul)
  • Fixed dbengine dirty page flushing warning #7469 (mfundul)
  • Fixed missing parenthesis on alarm softnet.conf #7476 (Steve8291)
  • Fixed race condition in the dbengine #7533 (mfundul)
  • Fixed "Master thread EXPORTING takes too long to exit. Giving up" error, by cleaning up the main exporting engine thread on exit #7558 (vlvkobal)
  • Fixed rabbitmq error "update() unhandled exception: invalid literal for int() with base 10" #7464 (ilyam8)
  • Fixed some LGTM alerts #7441 (jacekkolasa)
  • Fixed valgrind errors #7532 (mfundul)
  • Fixed monit collector LGTM warnings #7387 (ilyam8)
  • Fixed the following go.d.plugin collector issues:
      - mysql: panic in Cleanup (#326)
      - unbound: gather metrics via unix socket (#319)
      - logstash: pipelines chart (#317)
      - unbound: configuration file parsing. Support include mechanism. (#298)
      - logstash: pipelines metrics parsing (#293)
      - phpfpm: processes metrics parsing (#297)
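
The elasticsearch fix listed above (#8013) concerns size strings with a terabyte suffix, which a bare `float()` call rejects. The Python sketch below shows one way such strings could be converted to bytes. It is not the plugin's actual code: the `size_to_bytes` name, the accepted suffixes, and the use of binary (1024-based) multipliers are all assumptions for illustration.

```python
# Hypothetical multipliers; binary (1024-based) units are an assumption of this sketch.
UNIT_MULTIPLIERS = {
    "k": 1024,
    "m": 1024 ** 2,
    "g": 1024 ** 3,
    "t": 1024 ** 4,
}


def size_to_bytes(text: str) -> float:
    """Convert strings such as '12.5t', '2kb', or '512b' into a number of bytes."""
    text = text.strip().lower()
    if text.endswith("b"):
        # Drop an optional trailing 'b' so 'tb' and 't' are handled alike.
        text = text[:-1]
    if text and text[-1] in UNIT_MULTIPLIERS:
        return float(text[:-1]) * UNIT_MULTIPLIERS[text[-1]]
    # No recognized unit suffix: treat the value as plain bytes.
    return float(text)


print(size_to_bytes("1.5t"))  # 1649267441664.0
print(size_to_bytes("2kb"))   # 2048.0
```

Without the `"t"` entry in the table, `size_to_bytes("1.5t")` would fall through to `float("1.5t")` and raise the same kind of "invalid literal" error described in the fix.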

@netdatabot netdatabot released this Nov 28, 2019 · 1711 commits to master since this release

Netdata v1.19.0

Release v1.19.0 contains 2 new collectors, 19 bug fixes, 17 improvements, and 19 documentation updates.

At a glance

We completed a major rewrite of our web log collector to dramatically improve its flexibility and performance. The new collector, written entirely in Go, can parse and chart logs from Nginx and Apache servers, and combines numerous improvements. Netdata now supports the LTSV log format, creates charts for TLS and cipher usage, and is amazingly fast. In a test using SSD storage, the collector parsed the logs for 200,000 requests in about 200ms, using 30% of a single core.
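
For readers unfamiliar with LTSV (Labeled Tab-Separated Values), each log line is a set of tab-separated fields, and each field is a `label:value` pair split at the first colon. The Python sketch below parses a single line. It only illustrates the format and is not the Go collector's parser; the function name and the sample log line are made up for this example.

```python
def parse_ltsv_line(line: str) -> dict:
    """Parse one LTSV record into a dict of label -> value."""
    record = {}
    for field in line.rstrip("\r\n").split("\t"):
        if not field:
            continue  # tolerate empty fields, e.g. from a trailing tab
        # Split at the first colon only, since values may contain colons.
        label, sep, value = field.partition(":")
        if not sep:
            raise ValueError(f"field without ':' separator: {field!r}")
        record[label] = value
    return record


# Hypothetical access log line; the labels are illustrative.
sample = (
    "host:192.0.2.10\ttime:28/Nov/2019:10:00:00 +0000\t"
    "req:GET /index.html HTTP/1.1\tstatus:200\tsize:5316"
)
entry = parse_ltsv_line(sample)
print(entry["status"], entry["req"])  # 200 GET /index.html HTTP/1.1
```

Because every value carries its own label, fields can appear in any order and new fields can be added without breaking existing parsers, which is the main appeal of the format for access logs.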

This Go-based collector also has powerful custom log parsing capabilities, which means we're one step closer to a generic application log parser for Netdata. We're continuing to work on this parser to support more application log formatting in the future.

We have a new tutorial on enabling the Go web log collector and using it with Nginx and/or Apache access logs with minimal configuration. Thanks to Wing924 for starting the Go rewrite!

We introduced more cmocka unit testing to Netdata. In this release, we're testing how Netdata's internal web server processes HTTP requests—the first step to improve the quality of code throughout, reduce bugs, and make refactoring easier. We wanted to validate the web server's behavior but needed to build a layer of parametric testing on top of the CMocka test runner. Read all about our process of testing and selecting cmocka on our blog post: Building an agile team's 'safety harness' with cmocka and FOSS.

Netdata's Unbound collector was also completely rewritten in Go to improve how it collects and displays metrics. This new version can get dozens of metrics, including details on queries, cache, uptime, and even show per-thread metrics. See our tutorial on enabling the new collector via Netdata's amazing auto-detection feature.

We fixed an error where invalid spikes appeared on certain charts by improving the incremental counter reset/wraparound detection algorithm.
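
To show why reset and wraparound handling matters for incremental counters, here is a simplified Python sketch. It is not Netdata's algorithm: the `counter_delta` function, the `max_plausible_delta` threshold, and the choice to report the new value as the delta after a reset are assumptions made for illustration.

```python
from typing import Optional


def counter_delta(previous: int, current: int, counter_bits: int = 32,
                  max_plausible_delta: Optional[int] = None) -> int:
    """Return the increase between two samples of an incrementing counter.

    When the new sample is smaller than the old one, the counter either
    wrapped past its maximum value or was reset to zero.
    """
    if current >= previous:
        return current - previous
    # Assume a wraparound first: distance to the maximum plus the new value.
    wrapped = (1 << counter_bits) - previous + current
    if max_plausible_delta is not None and wrapped > max_plausible_delta:
        # An implausibly large jump suggests a reset, so count from zero
        # instead of charting a huge spike.
        return current
    return wrapped


print(counter_delta(100, 250))                                  # 150
print(counter_delta(4294967290, 5))                             # 11
print(counter_delta(1_000_000, 40, max_plausible_delta=10_000)) # 40
```

In the last call, treating the drop as a 32-bit wraparound would yield a delta of more than four billion, which is the kind of invalid spike a chart would show if resets were not detected.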

Netdata can now send health alarm notifications to IRC channels thanks to Strykar!

And, Netdata can now monitor AM2320 sensors, thanks to hard work from Tom Buck.

Acknowledgements

Our thanks go to:

  • andyundso for fixing the packagecloud binary installation in Debian 8.
  • Strykar for adding support for IRC health notifications.
  • tommybuck for the new AM2320 sensors collector.
  • Saruspete for the new ability to provide metrics on fragmentation of free memory pages.
  • OdysLam for improving the documentation for new collector plugins.
  • k0ste, xginn8 and nodiscc for improving the configuration of the apps plugin.
  • amichelic for improving the web_log collector.
  • cherouvim, arkamar, half-duplex and CtrlAltDel64 for improving the documentation.
  • mniestroj for the fix to the dbengine compilation with musl standard C.
  • arkamar for an improvement to the xenstat collector.
  • vakartel for improving the cgroup network interfaces detection in Proxmox 6.

Improvements

New Collectors

Collector improvements

  • We rewrote our web log parser in Go, drastically improving its flexibility and performance. go.d.plugin/#141 (ilyam8)
  • The Kubernetes kubelet collector now reads the service account token and uses it for authorization. We also added a new default job to collect metrics from https://localhost:10250/metrics. go.d.plugin/#285
  • Added a new default job to the Kubernetes coredns collector to collect metrics from http://kube-dns.kube-system.svc.cluster.local:9153/metrics. go.d.plugin/#285
  • apps.plugin: Synced FRRouting daemons configuration with the frr 7.2 release. #7333 (k0ste)
  • apps.plugin: Added process group for git-related processes. #7289 (nodiscc)
  • apps.plugin: Added balena to the container-engines application group. #7287 (xginn8)
  • web_log: Treat 401 Unauthorized requests as successful. #7256 (amichelic)
  • xenstat.plugin: Prepare for xen 4.13 by checking for xenstat_vbd_error presence. #7103 (arkamar)
  • mysql: Added galera cluster_status alarm. #6989 (ilyam8)

Metrics Database

  • Netdata generates alarms if the disk cannot keep up with data collection. #7139 (mfundul)

Health

  • Fine tune various default alarm configurations. #7322 (Ferroin)
  • Update SYN cookie alarm to be less aggressive. #7250 (Ferroin)
  • Added support for IRC alarm notifications #7148 (Strykar)

Installation/Packages

  • Corrected the Makefile.am files indentation, to prevent unexpected errors. #7252 (knatsakis)
  • Rationalized ownership and permissions of /etc/netdata. #7244 (knatsakis)
  • Made various improvements to the installer script netdata-installer.sh. #7200 (knatsakis)
  • Include go.d.plugin version v0.11.0 #7365 (ilyam8)

Documentation

  • Correct versions of FreeNAS that Netdata is available on. #7355 (knatsakis)
  • Update plugins.d/README.md. #7335 (OdysLam)
  • Fixed the note regarding stable vs. nightly, which was accidentally shown as a code fragment in the installation documentation. #7330 (cakrit)
  • Properly link to translated documents from netdata-security.md. #7343 (cakrit)
  • Update documentation of the netdata-updater, to properly cover kickstart-static64.sh and kickstart.sh installations. #7262 (knatsakis)
  • Converted the swagger documentation to OpenAPI3.0. #7257 (amoss)
  • Minor corrections to the netdata installer documentation. #7246 (paulkatsoulakis)
  • Fix typo in collectors README. #7242 (cherouvim)
  • Clarified database engine/RAM in getting started guide. #7225 (joelhans)
  • Suggest using /var/run/netdata for the unix socket, in running behind nginx documentation. #7206 (CtrlAltDel64)
  • Added GA links to new documents. #7194 (joelhans)
  • Added a page for metrics archiving to TimescaleDB. #7180 (joelhans)
  • Fixed typo in the contrib/debian descriptions for cupsd. #7154 (arkamar)
  • Added user information to MySQL Python module documentation. #7128 (prhomhyse)
  • Document the results of the spike investigation into CMake. #7114 (amoss)
  • Fix to docker-compose+Caddy installation. #7088 (joelhans)
  • Fixed broken links and added setup instructions for Telegram health notifications. #7033 (half-duplex)
  • Minor grammar change in /web/gui documentation. #7363 (eviemsrs)

Other

Bug fixes

  • Fixed packagecloud binary installation in Debian 8. #7342 (andyundso)
  • Fixed missing libraries in certain compilations, by adding missing trailing backslash to Makefile.am. #7326 (oxplot)
  • Prevented freezes due to isolated CPUs. #7318 (stelfrag)
  • Fixed missing streaming when slave has SSL activated. #7306 (thiagoftsm)
  • Fixed error 421 in IRC notifications, by removing a line break from the message. #7243 (thiagoftsm)
  • proc/pagetypeinfo collection could under particular circumstances cause high CPU load. As a workaround, we disabled pagetypeinfo by default. #7230 (vlvkobal)
  • Fixed incorrect memory allocation in proc plugin’s pagetypeinfo collector. #7187 (thiagoftsm)
  • Eliminated cached responses from the postgres collector. #7228 (ilyam8)
  • rabbitmq: Fixed "disk_free": "disk_free_monitoring_disabled" error. #7226 (ilyam8)
  • Fixed build with musl standard C library by including limits.h before using LONG_MAX. #7224 (mniestroj)
  • Fixed Apache module not working with letsencrypt certificate by allowing the python UrlService to skip tls_verify for http scheme. #7223 (ilyam8)
  • Fixed invalid spikes appearing in certain charts, by improving the incremental counter reset/wraparound detection algorithm. #7220 (mfundul)
  • Fixed DNS-lookup performance issue on FreeBSD. #7132 (amoss)
  • Fixed handling of the stable option, so that the installers and automatic updater respect it. #7083 (knatsakis), #7051 (oxplot)
  • Fixed the static binary installer’s handling of the --auto-update option. #7076 (knatsakis)
  • Fixed cgroup network interfaces classification on Proxmox 6. #7037 (vakartel)
  • Added missing dbengine flags to the installer. #7027 (paulkatsoulakis)
  • Fixed issue with unknown variables in alarm configuration expressions always being evaluated to zero. #6984 (thiagoftsm)
  • Fixed Netdata automatically picking up stats from a Pi-hole instance installed on another device, by disabling the default job that collects metrics from http://pi.hole. go.d.plugin/#289 (ilyam8)
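
The fix for invalid spikes listed above (#7220) concerns telling a counter wraparound apart from a counter reset. The sketch below illustrates the general technique only and is not Netdata's implementation. The function name counter_delta, the 32-bit default for max_value, and the reset_fraction threshold are all assumptions made for this example.

```python
def counter_delta(previous, current, max_value=2**32 - 1, reset_fraction=0.5):
    """Return the increase between two raw samples of an incremental counter.

    Assumed policy for illustration: if the counter went backwards, treat
    it as a wraparound unless the implied increase is implausibly large,
    in which case treat it as a reset and report no increase.
    """
    if current >= previous:
        return current - previous

    # The counter went backwards: compute the increase as if it had
    # wrapped past max_value and restarted from zero.
    wrapped = (max_value - previous) + current + 1

    # A wrapped increase covering most of the counter's range is more
    # likely a reset (for example, a restarted service) than an overflow.
    if wrapped > max_value * reset_fraction:
        return 0  # assumed policy: drop the sample rather than chart a spike
    return wrapped
```

With these defaults, a sample that moves from 10 below the 32-bit maximum to 5 is treated as a wraparound, while a drop from 1,000,000 to 10 is treated as a reset and contributes nothing to the chart.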

@netdatabot netdatabot released this Oct 18, 2019 · 1904 commits to master since this release

Netdata v1.18.1

Release v1.18.1 contains 17 bug fixes, 5 improvements, and 5 documentation updates.

At a glance

Patch release 1.18.1 contains several bug fixes, mainly related to FreeBSD and the binary package generation process.

Netdata can now send notifications to Google Hangouts Chat!

On certain systems, the slabinfo plugin introduced in v1.18.0 added thousands of new metrics. We decided the collector's usefulness to most users didn't justify the increase in resource requirements. This release disables the collector by default.

Finally, we added a chart under Netdata Monitoring to present a better view of the RAM used by the database engine (dbengine). The chart doesn't currently take into consideration the RAM used for slave nodes, so we intend to add more related charts in the future.

Acknowledgements

We'd like to thank:

  • hendrikhofstadt for the Google Hangouts notifications
  • stevenh for the awesome zombie process reaper and the fix for the freeipmi collector
  • samm-git for the addition of the VMware VMXNET3 driver to the default interfaces list for FreeBSD
  • sz4bi for a documentation fix

Improvements

  • Disable slabinfo plugin by default to reduce the total number of metrics collected #7056 (vlvkobal)
  • Add dbengine RAM usage statistics #7038 (mfundul)
  • Support Google Hangouts chat notifications #7013 (hendrikhofstadt)
  • Add CMocka unit tests #6985 (vlvkobal)
  • Add prerequisites to enable automatic updates for installations via the static binary (kickstart-static64.sh) #7060 (knatsakis)

Documentation

Bug fixes

  • Fix unbound collector timings: Convert recursion timings to milliseconds. #7121 (Ferroin)
  • Fix unbound collector unhandled exceptions #7112 (ilyam8)
  • Fix upgrade path from v1.17.1 to v1.18.x for deb packages #7118 (knatsakis)
  • Fix CPU charts in apps plugin on FreeBSD #7115 (vlvkobal)
  • Fix megacli collector binary search and sudo check #7108 (ilyam8)
  • Fix missing packages, by running the triggers for DEB and RPM package build in separate stages #7105 (knatsakis)
  • Fix segmentation fault in FreeBSD when statsd is disabled #7102 (vlvkobal)
  • Fix Clang warnings #7090 (thiagoftsm)
  • Fix python.d error logging: change chart suppress msg level from ERROR to INFO #7085 (ilyam8)
  • Fix freeipmi update frequency check: it warned that a frequency of 5 was too frequent, then set the frequency to 5. #7078 (stevenh)
  • Fix alarm configurations not getting loaded, via better handling of chart names with special characters #7069 (thiagoftsm)
  • Fix dbengine not working when mmap fails - mostly with BSD kernels #7065 (mfundul)
  • Fix FreeBSD issue due to incorrect size of a zeroed block #7061 (vlvkobal)
  • Don't write HTTP response 204 messages to the logs #7035 (vlvkobal)
  • Fix build when CMocka isn't installed #7129 (vlvkobal)
  • FreeBSD plugin: Add VMware VMXNET3 driver to the default interfaces list #7109 (samm-git)
  • Prevent zombie processes when a child is re-parented to netdata while it's running in a container, by adding a child process reaper #7059 (stevenh)