etcd_debugging_mvcc_db_total_size_in_bytes metric freeze #8146

Closed
mwf opened this issue Jun 21, 2017 · 1 comment

mwf commented Jun 21, 2017

Hi guys, maybe it's already fixed, but I decided to ask anyway.

etcd server version 3.1.5

I ran a compaction + defrag on the cluster and found that the etcd_debugging_mvcc_db_total_size_in_bytes metric is "frozen" somehow.

It returns the old DB size, from before the defrag.
It returns the real size only after etcdctl endpoint status is called for that particular node.

curl http://<host>:4001/metrics | grep db_total
# HELP etcd_debugging_mvcc_db_total_size_in_bytes Total size of the underlying database in bytes.
# TYPE etcd_debugging_mvcc_db_total_size_in_bytes gauge
etcd_debugging_mvcc_db_total_size_in_bytes 2.19680768e+09

ETCDCTL_API=3 etcdctl -w=table endpoint status
+----------------+------------------+---------+---------+-----------+-----------+------------+
|    ENDPOINT    |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------+------------------+---------+---------+-----------+-----------+------------+
| 127.0.0.1:2379 | e273c0dc9c7d588b | 3.1.5   | 807 kB  | false     |      1999 | 2023933200 |
+----------------+------------------+---------+---------+-----------+-----------+------------+

curl http://<host>:4001/metrics | grep db_total
# HELP etcd_debugging_mvcc_db_total_size_in_bytes Total size of the underlying database in bytes.
# TYPE etcd_debugging_mvcc_db_total_size_in_bytes gauge
etcd_debugging_mvcc_db_total_size_in_bytes 806912

Is it fixed already?
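
For anyone comparing the two readings above: the metric is reported in bytes, sometimes in scientific notation, while etcdctl prints a rounded DB SIZE. A minimal Go sketch for pulling the raw value out of the /metrics text follows. The helper name gaugeValue is hypothetical, and the sketch assumes the sample line has no labels, as in the output above.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// gaugeValue scans Prometheus text-format output for an unlabelled sample
// called name and returns its value. Hypothetical helper, not part of etcd.
func gaugeValue(metrics, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(metrics))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Skip blank lines, "# HELP" / "# TYPE" comments, and other metrics.
		if len(fields) < 2 || fields[0] != name {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	const name = "etcd_debugging_mvcc_db_total_size_in_bytes"
	// The two samples reported in this issue: before and after
	// "etcdctl endpoint status" was called.
	before := "# TYPE " + name + " gauge\n" + name + " 2.19680768e+09\n"
	after := "# TYPE " + name + " gauge\n" + name + " 806912\n"
	for _, out := range []string{before, after} {
		if v, ok := gaugeValue(out, name); ok {
			fmt.Printf("%d bytes\n", int64(v))
		}
	}
}
```

The second value, 806912 bytes, is consistent with the 807 kB that endpoint status shows. The first is roughly 2.2 GB, the size from before the defrag.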


mwf commented Jun 21, 2017

On some of our clusters it got "unfrozen" by itself after ~20 minutes.
On others it showed old data for 40+ minutes, and only a manual endpoint status call fixed the metric.

So the bug may not be easy to reproduce, but we saw the time lag on every cluster we have.

heyitsanthony added a commit to heyitsanthony/etcd that referenced this issue Jun 21, 2017
heyitsanthony added a commit to heyitsanthony/etcd that referenced this issue Jun 22, 2017
Relying on mvcc to set the db size metric can cause it to
miss size changes when a txn commits after the last write
completes before a quiescent period. Instead, load the
db size on demand.

Fixes etcd-io#8146
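
The difference between a gauge that is set from a write path and one that is loaded on demand can be sketched in a few lines of Go. This is a simplified model, not the etcd code: the types backend, pushedGauge, and gaugeFunc and the functions write and defrag are all hypothetical, and which code path leaves the gauge stale is an assumption made for illustration. The Prometheus Go client provides GaugeFunc for the on-demand pattern; the sketch uses only the standard library so it runs on its own.

```go
package main

import "fmt"

// backend is a hypothetical stand-in for the on-disk database.
// It is not safe for concurrent use; the sketch is single-threaded.
type backend struct{ size int64 }

// Size reports the current size, as an on-demand reader would see it.
func (b *backend) Size() int64 { return b.size }

// pushedGauge models a metric that changes only when some code path sets it.
type pushedGauge struct{ value float64 }

func (g *pushedGauge) Set(v float64)    { g.value = v }
func (g *pushedGauge) Collect() float64 { return g.value }

// gaugeFunc models a metric whose value is computed at scrape time.
type gaugeFunc func() float64

func (f gaugeFunc) Collect() float64 { return f() }

// write models a write path that also refreshes the pushed gauge.
func write(b *backend, g *pushedGauge, n int64) {
	b.size += n
	g.Set(float64(b.size))
}

// defrag shrinks the backend without touching the pushed gauge
// (an assumption of this sketch), so the pushed value goes stale.
func defrag(b *backend, newSize int64) { b.size = newSize }

func main() {
	b := &backend{}
	pushed := &pushedGauge{}
	onDemand := gaugeFunc(func() float64 { return float64(b.Size()) })

	write(b, pushed, 2196807680) // size before the defrag, from the report
	defrag(b, 806912)            // size after the defrag, from the report

	fmt.Println("pushed:   ", pushed.Collect())
	fmt.Println("on demand:", onDemand.Collect())
}
```

With no further writes, the pushed gauge keeps reporting the pre-defrag size, while the on-demand gauge reads the current size on every scrape. That matches the symptom described above, where the metric stayed stale until something else refreshed it.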
gyuho pushed a commit that referenced this issue Jun 22, 2017
Relying on mvcc to set the db size metric can cause it to
miss size changes when a txn commits after the last write
completes before a quiescent period. Instead, load the
db size on demand.

Fixes #8146
yudai pushed a commit to yudai/etcd that referenced this issue Oct 5, 2017
Relying on mvcc to set the db size metric can cause it to
miss size changes when a txn commits after the last write
completes before a quiescent period. Instead, load the
db size on demand.

Fixes etcd-io#8146