Error indexing label pair to fingerprints batch: no space left on device #5103

Closed
cauwulixuan opened this Issue Jan 17, 2019 · 2 comments


cauwulixuan commented Jan 17, 2019

What did you do?
I am using Prometheus with node exporter and other exporters to monitor my cluster. Prometheus has become very unstable and some metrics are missing: every metric is exposed by the exporters at their /metrics URLs, but some do not show up in Prometheus via the HTTP API or the /graph page. The exporters work well, so the problem appears to be in Prometheus.

Environment

  • System information:

    Linux 3.10.0-514.el7.x86_64 x86_64

  • Prometheus version:

        prometheus, version 1.7.2 (branch: HEAD, revision: 22eadbe635528fa17b99a7635fed6b6018103042)
        build user:       root@05b1548df2cc
        build date:       20170926-16:41:43
        go version:       go1.8.3
  • Prometheus configuration file:
global:
  scrape_interval:     60s # Scrape targets every 60 seconds. The default is every 1 minute.
  evaluation_interval: 30s # Evaluate rules every 30 seconds. The default is every 1 minute.
  external_labels:
      monitor: 'indata-monitor'

rule_files:
  - "/etc/prometheus/rules/*.rules"

alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - 'localhost:9093'

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 15s
    static_configs:
      - targets: ['10.250.50.19:9500']

  - job_name: 'consul_node'
    consul_sd_configs:
      - server: '127.0.0.1:8500'

    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*,monitor,.*
        action: keep
      - source_labels: [__meta_consul_service]
        target_label: service

  - job_name: 'consul_hadoop'
    scrape_interval: 120s
    consul_sd_configs:
      - server: '127.0.0.1:8500'

    relabel_configs:
      - source_labels: [__meta_consul_tags]
        regex: .*,hadoop.*,.*
        action: keep
      - source_labels: [__meta_consul_service]
        target_label: service
      - source_labels: [__meta_consul_service_address]
        target_label: hostname
      - source_labels: [__meta_consul_address,__meta_consul_service_port]
        separator: ","
        regex: (.*),(.*)
        replacement: ${1}:${2}
        target_label: __address__
  • Logs:
Dec  9 06:58:17 prometheus: time="2018-12-09T06:58:17+08:00" level=error msg="Error indexing label pair to fingerprints batch: write /var/lib/prometheus/labelpair_to_fingerprints/000495.log: no space left on device" source="persistence.go:1410"
Dec  9 06:58:27 prometheus: time="2018-12-09T06:58:27+08:00" level=error msg="Error indexing label pair to fingerprints batch: write /var/lib/prometheus/labelpair_to_fingerprints/000495.log: no space left on device" source="persistence.go:1410"
Dec  9 06:58:47 prometheus: time="2018-12-09T06:58:47+08:00" level=error msg="Error indexing label pair to fingerprints batch: write /var/lib/prometheus/labelpair_to_fingerprints/000495.log: no space left on device" source="persistence.go:1410"
  • More info:
[root@inspurinsight-server-19 ~]# df -h
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/cl-root   50G  7.4G   43G  15% /
devtmpfs              63G     0   63G   0% /dev
tmpfs                 63G   12K   63G   1% /dev/shm
tmpfs                 63G  4.4G   59G   7% /run
tmpfs                 63G     0   63G   0% /sys/fs/cgroup
/dev/mapper/cl-usr    64G   26G   39G  40% /usr
/dev/sdb             1.7T  206G  1.5T  13% /data1
/dev/mapper/cl-var   100G   79G   22G  79% /var
/dev/sda1            197M  150M   47M  77% /boot
tmpfs                 13G     0   13G   0% /run/user/0
tmpfs                 13G     0   13G   0% /run/user/1024


[root@inspurinsight-server-19 ~]# df -i
Filesystem             Inodes  IUsed     IFree IUse% Mounted on
/dev/mapper/cl-root  26214400  10050  26204350    1% /
devtmpfs             16427685    665  16427020    1% /dev
tmpfs                16431618      4  16431614    1% /dev/shm
tmpfs                16431618   1580  16430038    1% /run
tmpfs                16431618     16  16431602    1% /sys/fs/cgroup
/dev/mapper/cl-usr   33554432 212582  33341850    1% /usr
/dev/sdb            175585856  12189 175573667    1% /data1
/dev/mapper/cl-var   44792416  97212  44695204    1% /var
/dev/sda1               96272    330     95942    1% /boot
tmpfs                16431618      1  16431617    1% /run/user/0
tmpfs                16431618      1  16431617    1% /run/user/1024


[root@inspurinsight-server-19 ~]# ps -ef|grep prometheus
root      6647     1  3 11:30 ?        00:08:05 /usr/local/bin/prometheus -config.file /etc/prometheus/prometheus.yml -storage.local.path=/var/lib/prometheus 


[root@inspurinsight-server-19 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           125G         60G        7.4G        4.4G         57G         60G
Swap:           63G        1.7G         62G

Any suggestions? Thanks a lot.
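
The `df -h` and `df -i` figures above can also be sampled programmatically, which may help when the error is intermittent: the log lines are dated 2018-12-09, so a snapshot taken later does not necessarily show the state of /var at the time of the failed writes. A minimal sketch, assuming Python 3 on a Unix-like system; the function name `filesystem_headroom` is hypothetical and not part of Prometheus:

```python
import os


def filesystem_headroom(path):
    """Return free/total bytes and inodes for the filesystem that holds `path`.

    Uses statvfs(2), the same kind of per-filesystem counters that df reports.
    """
    st = os.statvfs(path)
    return {
        # Space available to unprivileged processes.
        "free_bytes": st.f_bavail * st.f_frsize,
        "total_bytes": st.f_blocks * st.f_frsize,
        # Inodes available to unprivileged processes.
        "free_inodes": st.f_favail,
        "total_inodes": st.f_files,
    }
```

Calling `filesystem_headroom("/var/lib/prometheus")` on a schedule and logging the result would show whether free space or free inodes on that filesystem ever approach zero between manual checks.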

simonpasquier commented Jan 17, 2019

1.7.2 is an old version that isn't maintained anymore; you should definitely consider moving to 2.x. df might return "inaccurate" information if deleted files are still opened by some process.

I'm closing it for now. If you have further questions, please use our user mailing list, which you can also search.

brian-brazil commented Jan 17, 2019

df might return "inaccurate" information if deleted files are still opened by some process.

It'll still show the correct information in that case, but du would be inaccurate.
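
To see whether deleted-but-still-open files account for space that df counts and du misses, one option is to walk /proc/<pid>/fd and look for link targets that the kernel marks with " (deleted)". A minimal sketch, assuming Linux with procfs and Python 3; `deleted_open_files` is a hypothetical helper name, and without root it only sees processes owned by the current user:

```python
import os
import stat


def deleted_open_files(proc_root="/proc"):
    """Return (pid, fd, target, size_bytes) for open regular files that were unlinked."""
    results = []
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or permission denied
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                target = os.readlink(link)
                if not target.endswith(" (deleted)"):
                    continue
                st = os.stat(link)  # follows the link to the still-open file
            except OSError:
                continue  # descriptor closed while we were looking
            if stat.S_ISREG(st.st_mode):
                results.append((int(pid), int(fd), target, st.st_size))
    return results


if __name__ == "__main__":
    # Largest files first: these hold space that du cannot see.
    for pid, fd, target, size in sorted(deleted_open_files(), key=lambda r: -r[3]):
        print(f"pid={pid} fd={fd} size={size} {target}")
```

The space held by such files is only released when the owning process closes them or exits. `lsof +L1` reports similar information where lsof is installed.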
