Prometheus exits when systemd restarts systemd-journald #5171

Open
eahydra opened this Issue Feb 1, 2019 · 3 comments

eahydra commented Feb 1, 2019

Proposal

Use case. Why is this important?
I run several Prometheus instances as systemd services. Three days ago, all of them exited.
According to systemctl status, the instances were killed by SIGPIPE.

I confirmed that systemd has a bug here: the service is still killed even with IgnoreSIGPIPE=true.

Docker ran into the same problem and now has code
that ignores SIGPIPE: pkg/signal/trap.go#L39

Bug Report

What did you do?
Nothing.

What did you expect to see?
Prometheus handles SIGPIPE correctly.

What did you see instead? Under which circumstances?
When systemd restarted systemd-journald, the Prometheus service exited.

Environment
CentOS 7 with systemd 219 (admittedly an old systemd version).

  • System information:

    An internal distribution based on CentOS 7

  • Prometheus version:

prometheus, version 2.5.0 (branch: release/20181129-11-33, revision: 67dc912)
build user: admin@rs7h13559.et2sqa
build date: 20181129-03:34:17
go version: go1.11

  • Prometheus configuration file:
global:
  scrape_interval:     30s
  evaluation_interval: 30s

scrape_configs:
- job_name: 'inspector'
  scrape_interval: 30s
  scheme: http
  metrics_path: "/metrics"
  static_configs:
      - targets: ["target1:8070", "target2:8070"]

simonpasquier (Member) commented Feb 1, 2019

Looking at a similar issue reported for node_exporter, it doesn't seem easy to fix properly. Even if we were to catch and ignore SIGPIPE in Prometheus as moby does, Prometheus would stop logging.

simonpasquier (Member) commented Feb 1, 2019

I quickly hacked together a Prometheus binary that catches SIGPIPE. When I restart or stop journald, Prometheus starts eating lots of CPU...

eahydra (Author) commented Feb 1, 2019

Maybe we could write logs to a file and support rotating the log file by size or count.
