WIP-DEVOPS-7488: update node-exporter #8

Open
wants to merge 870 commits into
base: master


kirandark

No description provided.

SuperQ and others added 30 commits September 4, 2020 11:15
Fix capitalization of CPU acronym throughout
…etheus#1835)

* Add: configure 2 thresholds for NodeFilesystemAlmostOutOfSpace alert

Signed-off-by: Nicolas Lamirault <nicolas.lamirault@gmail.com>
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
* Bump Go modules to latest.
* Update to Go 1.15.
* Remove obsolete darwin/386 build.

Signed-off-by: Ben Kochie <superq@gmail.com>
…#1810)

* Expose metric for state=check for node_md_state
* Added new e2e output fixture including md201, which is in the checking state, and the new state=check labeled metric for all other md devices

Signed-off-by: Christian Rohmann <github@frittentheke.de>
We've gathered enough evidence that the CPU counter bug workaround is
working as intended. Downgrade the message from Warning to Debug.

Signed-off-by: Ben Kochie <superq@gmail.com>
Signed-off-by: paulfantom <pawel@krupa.net.pl>
Signed-off-by: fsschmitt <492108+fsschmitt@users.noreply.github.com>
…hronized clock

Signed-off-by: paulfantom <pawel@krupa.net.pl>
docs/node-mixin/alerts: use ratio for network alerts
This should be the way forward when importing libraries in jsonnet. It's
closer to how Go imports look and makes it more obvious where packages
live.

This is not breaking anything, as the old imports were already symlinks
to the now directly used directories.

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
…lute-import-paths

Use absolute jsonnet import paths
* Expose XFS inode statistics (prometheus#1869)

Also fixes prometheus#1177

@SuperQ @discordianfish

Signed-off-by: Ondrej Baudys <obaudys@gmail.com>
Co-authored-by: obaudys@gmail.com <ondrej.baudys@nextgen.net>
Signed-off-by: Julien Pivotto <roidelapluie@inuits.eu>
Create a metric node_zfs_zpool_state.

Signed-off-by: Artur Molchanov <artur.molchanov@easybrain.com>
This change fixes the log message for the filesystem ignored-fs-types flag so that it prints that flag instead of the mountpoints flag.

Signed-off-by: xinau <felix.ehrenpfort@protonmail.com>
Signed-off-by: Trey Dockendorf <tdockendorf@osc.edu>
I have rewritten all CGO dependencies for OpenBSD amd64
in pure Go, to be able to cross-compile node_exporter.

Signed-off-by: ston1th <ston1th@giftfish.de>
Signed-off-by: ston1th <ston1th@giftfish.de>
Signed-off-by: ston1th <ston1th@giftfish.de>
Signed-off-by: ston1th <ston1th@giftfish.de>
Signed-off-by: ston1th <ston1th@giftfish.de>
new type: `netDevStats map[string]map[string]uint64`

Signed-off-by: ston1th <ston1th@giftfish.de>
Signed-off-by: Paul Manley <paul.manley@wholefoods.com>
Signed-off-by: Paul Manley <paul.manley@wholefoods.com>
The txt was changed to rst:

    https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/accounting/psi.rst

But it's probably better to link to the rendered docs, since the link
should be more stable.

Signed-off-by: Louis Taylor <louis@kragniz.eu>
* Modest doc improvements

Signed-off-by: Anthony D'Atri <anthony.datri@gmail.com>
Move end-user install instructions to the top of the README.
* Add a Docker Compose example.
* Improve some wording.
* Link to the Cloud Alchemy Ansible role.
* Update to git clone method for dev/building

Signed-off-by: Ben Kochie <superq@gmail.com>
Signed-off-by: kamijin_fanta <kamijin@live.jp>
jordy1024 and others added 23 commits October 24, 2021 12:48
Use `time.NewTimer()` and explicit `Stop()` to avoid memory bloat / GC problems with `time.After()` in the Linux filesystem collector timeout handling.

Signed-off-by: bawenmao <bawenmao@sogou-inc.com>
Signed-off-by: prombot <prometheus-team@googlegroups.com>
Upstream is replacing `golint` with `revive`.
* Cleanup unused mixin go files.

Signed-off-by: Ben Kochie <superq@gmail.com>
* Exclude mountpoints under /run/credentials

Signed-off-by: ml <ml@visu.li>
Add a DMI collector to expose the Desktop Management Interface (DMI)
info from `/sys/class/dmi/id/`. This will expose information about the
BIOS, mainboard, chassis, and product.

Closes: prometheus#303
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
* feat: new collector about thermal conditions on macOS

Signed-off-by: STRRL <str_ruiling@outlook.com>
Signed-off-by: Alessio Caiazza <nolith@abisso.org>
* Extract powersupply linux code from collector common file.
* Add Darwin powersupply collector.

Signed-off-by: Alessio Caiazza <nolith@abisso.org>
The master branch of `ethtool` merged the fix for
safchain/ethtool#39

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Disable `collector/zfs_linux_test.go` in case `!nozfs` is set to
completely disable ZFS.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Add test case for ethtool metrics with leading spaces reported in prometheus#2185:

```
$ ethtool -S
NIC statistics:
     Tx Queue#: 0
       TSO pkts tx: 0
       TSO bytes tx: 0
       ucast pkts tx: 20487
       ucast bytes tx: 1908107
       mcast pkts tx: 83
       mcast bytes tx: 5906
       bcast pkts tx: 4
       bcast bytes tx: 168
       pkts tx err: 0
       pkts tx discard: 0
       drv dropped tx total: 0
          too many frags: 0
          giant hdr: 0
          hdr err: 0
          tso: 0
       ring full: 0
       pkts linearized: 0
       hdr cloned: 0
       giant hdr: 0
     Rx Queue#: 0
       LRO pkts rx: 0
       LRO byte rx: 0
       ucast pkts rx: 25086
       ucast bytes rx: 2404103
       mcast pkts rx: 0
       mcast bytes rx: 0
       bcast pkts rx: 0
       bcast bytes rx: 0
       pkts rx OOB: 0
       pkts rx err: 0
       drv dropped rx total: 0
          err: 0
          fcs: 0
       rx buf alloc fail: 0
     tx timeout count: 0
```

Bug: prometheus#2185
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
'iowait' and 'steal' indicate specific idle/wait states, which shouldn't
be counted towards CPU utilisation. Also see
prometheus-operator/kube-prometheus#796 and
kubernetes-monitoring/kubernetes-mixin#667.

Per the iostat man page:

%idle
    Show the percentage of time that the CPU or CPUs were idle and the
    system did not have an outstanding disk I/O request.

%iowait
     Show the percentage of time that the CPU or CPUs were idle during
     which the system had an outstanding disk I/O request.

%steal
     Show the percentage of time spent in involuntary wait by the
     virtual CPU or CPUs while the hypervisor was servicing another
     virtual processor.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
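The effect of excluding these states can be illustrated with a small calculation. The change itself lives in the mixin's queries; the Go sketch below only mirrors the arithmetic, with a hypothetical `utilisation` helper that takes per-mode CPU time deltas keyed by mode name.

```go
package main

import "fmt"

// notBusy lists the modes that this sketch does not count as utilisation:
// plain idle time plus the two idle/wait states described above.
var notBusy = map[string]bool{"idle": true, "iowait": true, "steal": true}

// utilisation returns the fraction of total CPU time spent in busy modes.
// deltas maps a mode name to the seconds spent in it over some interval.
func utilisation(deltas map[string]float64) float64 {
	var busy, total float64
	for mode, seconds := range deltas {
		total += seconds
		if !notBusy[mode] {
			busy += seconds
		}
	}
	if total == 0 {
		return 0
	}
	return busy / total
}

func main() {
	deltas := map[string]float64{
		"user": 30, "system": 10, "idle": 50, "iowait": 8, "steal": 2,
	}
	fmt.Printf("%.2f\n", utilisation(deltas)) // prints "0.40"
}
```

If only `idle` were excluded, the same sample would report 0.50, so a host waiting on disk or on the hypervisor would look busier than it is.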
LLVM/Clang 11.0 adds a `-Wundef-prefix=TARGET_OS_` build flag which
breaks this build flag.

Signed-off-by: Ben Kochie <superq@gmail.com>
* Add clocksource metrics to time collector

This closes prometheus#1336

Signed-off-by: Johannes 'fish' Ziemke <github@freigeist.org>
Use SysctlTimeval from the golang.org/x/sys/unix package to
simplify the implementation of the boottime collector for the BSDs and
allow building it without cgo.

Tested on macOS 11.6, FreeBSD 13 and OpenBSD 7.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Sanitizing the metric names can lead to duplicate metric names:

```
caller=level.go:63 level=error caller="error gathering metrics: [from Gatherer #2] collected metric \"node_ethtool_giant_hdr\" { label:<name:\"device\" value:\"ens192\" > untyped:<value:0" msg=" > } was collected before with the same name and label values"
```

Generate a map from the sanitized metric names to the metric names from
ethtool. In case of duplicate sanitized metric names drop both metrics,
because it is unknown which one to take.

Fixes: prometheus#2185
Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
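The map-based approach described in the commit message above might look roughly like this. It is a sketch under assumptions: the `sanitize` rule (trim, replace other characters with `_`, lowercase) and the helper name `uniqueSanitized` are invented for illustration and are not taken from the collector.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// invalidChars matches runs of characters outside a conservative
// metric-name alphabet. The exact rule is an assumption for this sketch.
var invalidChars = regexp.MustCompile(`[^a-zA-Z0-9_]+`)

// sanitize turns a raw ethtool statistic name into a metric-name suffix.
func sanitize(name string) string {
	trimmed := strings.TrimSpace(name)
	return strings.ToLower(invalidChars.ReplaceAllString(trimmed, "_"))
}

// uniqueSanitized maps each sanitized name to its original name. When two
// or more originals collide on one sanitized name, all of them are dropped,
// because it is unknown which one to keep.
func uniqueSanitized(names []string) map[string]string {
	result := make(map[string]string)
	duplicate := make(map[string]bool)
	for _, name := range names {
		key := sanitize(name)
		if duplicate[key] {
			continue
		}
		if _, seen := result[key]; seen {
			delete(result, key)
			duplicate[key] = true
			continue
		}
		result[key] = name
	}
	return result
}

func main() {
	// The two "giant hdr" entries differ only in leading spaces.
	names := []string{"     giant hdr", "  giant hdr", "  ring full"}
	metrics := uniqueSanitized(names)
	fmt.Println(len(metrics)) // prints "1"
	_, kept := metrics["giant_hdr"]
	fmt.Println(kept) // prints "false"
}
```

Tracking duplicates in a separate set ensures a third colliding name does not sneak back in after the first two were removed.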
Signed-off-by: computerphilosopher <bspark@jam2in.com>
ethtool got its first release.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
Signed-off-by: Andrei Marin <hedrox53@gmail.com>
The new `lnstat` collector produces a high number of metrics, per-cpu,
and results in approximately double the number of metrics previously
scraped. For example, a typical server with 64 cores produces 3832
lnstat metrics compared to 4147 metrics for the remaining collectors.

Therefore disable the `lnstat` collector by default.

Signed-off-by: Benjamin Drung <benjamin.drung@ionos.com>
TCP timeouts count is a useful signal to show
abnormal network performance and is another
signal to aid debugging. This metric can be
used to generate proactive alerts for host
network namespace workloads.

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
NOTE: In order to support globs in the textfile collector path, filenames exposed by
      `node_textfile_mtime_seconds` now contain the full path name.

* [CHANGE] Add path label to rapl collector prometheus#2146
* [CHANGE] Exclude filesystems under /run/credentials prometheus#2157
* [FEATURE] Add lnstat collector for metrics from /proc/net/stat/ prometheus#1771
* [FEATURE] Add darwin powersupply collector prometheus#1777
* [FEATURE] Add support for monitoring GPUs on Linux prometheus#1998
* [FEATURE] Add Darwin thermal collector prometheus#2032
* [FEATURE] Add os release collector prometheus#2094
* [FEATURE] Add netdev.address-info collector prometheus#2105
* [ENHANCEMENT] Support glob textfile collector directories prometheus#1985
* [ENHANCEMENT] ethtool: Expose node_ethtool_info metric prometheus#2080
* [ENHANCEMENT] Use include/exclude flags for ethtool filtering prometheus#2165
* [ENHANCEMENT] Add flag to disable guest CPU metrics prometheus#2123
* [ENHANCEMENT] Add DMI collector prometheus#2131
* [ENHANCEMENT] Add threads metrics to processes collector prometheus#2164
* [ENHANCEMENT] Reduce timer GC delays in the Linux filesystem collector prometheus#2169
* [BUGFIX] ethtool: Sanitize metric names prometheus#2093
* [BUGFIX] Fix ethtool collector for multiple interfaces prometheus#2126
* [BUGFIX] Fix possible panic on macOS prometheus#2133
* [BUGFIX] Collect flag_info and bug_info only for one core prometheus#2156

Signed-off-by: Ben Kochie <superq@gmail.com>
@kirandark kirandark force-pushed the DEVOPS-7488-update-node-exporter branch 2 times, most recently from 5dec9a3 to baaaf91 Compare January 11, 2022 14:49
@kirandark kirandark force-pushed the DEVOPS-7488-update-node-exporter branch from baaaf91 to 78d7730 Compare January 12, 2022 19:14