Processes exporter logs error message: no such process #1901

@acastong

Host operating system: output of uname -a

Linux 3.10.0-1160.6.1.el7.x86_64 #1 SMP Tue Nov 17 13:59:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
CentOS 7

node_exporter version: output of node_exporter --version

node_exporter --version
node_exporter, version 1.0.1 (branch: HEAD, revision: 3715be6)
build user: root@1f76dbbcfa55
build date: 20200616-12:44:12
go version: go1.14.4

node_exporter command line flags

node_exporter --collector.processes --collector.qdisc --collector.systemd

Are you running node_exporter in Docker?

No

What did you do that produced an error?

Simply ran node_exporter for an extended period of time

What did you expect to see?

No error logs

What did you see instead?

node_exporter[15746]: level=error ts=2020-11-25T13:12:02.291Z caller=collector.go:161 msg="collector failed" name=processes duration_seconds=0.027611184 err="unable to retrieve number of allocated threads: "read /proc/2054/stat: no such process""

Analysis

This is very closely related to #1043: that change fixed the case where a process disappears between listing the /proc directory and reading the actual process stats. But a second race condition is possible: the process can exit between opening the /proc/<process id>/stat file and actually reading it, and in that case the error code returned is different (ESRCH, "no such process"). Below is a small code snippet to reproduce that race condition.

The recommended fix is to modify getAllocatedThreads() in collector/processes_linux.go so that, after stat, err := pid.Stat(), the loop continues to the next process if the error meets this condition: strings.Contains(err.Error(), syscall.ESRCH.Error()).

package main

import (
	"fmt"
	"io"
	"io/ioutil"
	"log"
	"os"
	"os/exec"
	"strconv"
	"strings"
	"syscall"
)

func main() {
	const maxBufferSize = 1024 * 512

	// Start a short-lived child process whose stat file we can open.
	fmt.Println("Starting process sleep")
	cmd := exec.Command("sleep", "1")
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}

	procPath := "/proc/" + strconv.Itoa(cmd.Process.Pid) + "/stat"

	// Open the stat file while the process is still alive.
	fmt.Printf("Opening stat file %s\n", procPath)
	f, err := os.Open(procPath)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Wait for the process to exit before reading from the open file.
	if err := cmd.Wait(); err != nil {
		log.Fatal(err)
	}
	fmt.Println("Sleep process exited, reading opened stat file")
	_, err = ioutil.ReadAll(io.LimitReader(f, maxBufferSize))

	if err != nil {
		if strings.Contains(err.Error(), syscall.ESRCH.Error()) {
			fmt.Println("Got error no such process:", err)
		} else {
			fmt.Println("Read stat failed:", err)
		}
	} else {
		fmt.Println("No error reading stat")
	}
}
