Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcd clientv3 WithCountOnly performence #11036

Closed
ilixiaocui opened this issue Aug 15, 2019 · 7 comments
Closed

etcd clientv3 WithCountOnly performence #11036

ilixiaocui opened this issue Aug 15, 2019 · 7 comments

Comments

@ilixiaocui
Copy link

I find following OpOption can get number of records in a range.
// WithCountOnly makes the 'Get' request return only the count of keys.
func WithCountOnly() OpOption {
return func(op *Op) { op.countOnly = true }
}

I'd like to know its performance. if i had 10w+ records in etcd, how long it take to calculate?

@jingyih
Copy link
Contributor

jingyih commented Aug 15, 2019

Roughly speaking, it should be very fast compared to w/o this option. With WithCountOnly, a range request traverses the in-memory indexing tree and return the key count - no disk I/O involved.

Exact performance number depends on cluster configuration and hardware.

@ilixiaocui
Copy link
Author

ilixiaocui commented Aug 16, 2019

Roughly speaking, it should be very fast compared to w/o this option. With WithCountOnly, a range request traverses the in-memory indexing tree and return the key count - no disk I/O involved.

Exact performance number depends on cluster configuration and hardware.

  1. my etcd version as following
    etcd Version: 3.3.10
    Git SHA: 27fc7e2
    Go Version: go1.10.4
    Go OS/Arch: linux/amd64

  2. cpu infomation
    56 Intel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz

  3. I start single node etcd cluster by command "etcd --quota-backend-bytes=8589934592"

  4. I put 50w records in range["01", "02"), WithCountOnly cost 1.423867s
    and put 250w records in range ["02", "03"), WithCountOnly cost 281.2565ms

and question is:

  1. Is indexing tree sort considering dictionary order?
  2. Is this performance normal?
  3. Is there official doc introducing implemention of indexing tree?

@jingyih
Copy link
Contributor

jingyih commented Aug 16, 2019

  1. Is indexing tree sort considering dictionary order?

Yes. It is sorted by user key in lexical byte-order. [1]

  1. Is this performance normal?

In your test, ranging 500K keys takes longer than 2,500K keys?

  1. Is there official doc introducing implemention of indexing tree?

I am not aware of any doc regarding implementation details of the index tree. The code is in mvcc package. And there is a brief description on etcd data model [2].

ref:
[1]

etcd/mvcc/key_index.go

Lines 327 to 328 in 9b29151

func (ki *keyIndex) Less(b btree.Item) bool {
return bytes.Compare(ki.key, b.(*keyIndex).key) == -1

[2] https://github.com/etcd-io/etcd/blob/master/Documentation/learning/data_model.md

@ilixiaocui
Copy link
Author

I am not aware of any doc regarding implementation details of the index tree. The code is in mvcc package. And there is a brief description on etcd data model [2].

thanks for your answer.

its a mistake:
3 . I put 50w records in range["01", "02"), WithCountOnly cost 1.423867s
and put 250w records in range ["02", "03"), WithCountOnly cost 281.2565ms

correct is:
50w records in range["01", "02"), WithCountOnly cost 281.2565ms
250w records in range ["02", "03"), WithCountOnly cost 281.2565ms

@ilixiaocui
Copy link
Author

  1. Is indexing tree sort considering dictionary order?

Yes. It is sorted by user key in lexical byte-order. [1]

  1. Is this performance normal?

In your test, ranging 500K keys takes longer than 2,500K keys?

  1. Is there official doc introducing implemention of indexing tree?

I am not aware of any doc regarding implementation details of the index tree. The code is in mvcc package. And there is a brief description on etcd data model [2].

ref:
[1]

etcd/mvcc/key_index.go

Lines 327 to 328 in 9b29151

func (ki *keyIndex) Less(b btree.Item) bool {
return bytes.Compare(ki.key, b.(*keyIndex).key) == -1

[2] https://github.com/etcd-io/etcd/blob/master/Documentation/learning/data_model.md

the new question is:

  1. does WithCountOnly count records before the point of request arrival? does any put/delete/txn happend in counting progress affect couting?

  2. does put/delete/txn block by counting?

@jingyih
Copy link
Contributor

jingyih commented Aug 16, 2019

  1. does WithCountOnly count records before the point of request arrival? does any put/delete/txn happend in counting progress affect couting?

It provides the same guarantee as a regular range request (by default range requests are linearizable reads, as opposed to serializable [1]). Your range result is guaranteed to see the latest finished mutating requests. During counting / ranging keys, the view of the data store is fixed.

  1. does put/delete/txn block by counting?

Generally speaking, no. etcd supports 1 write + N reads. But there are caveats. I noticed you are using 3.3.10. Here are some of the improvements to backend concurrency since that version: #10523, #9296

[1] https://github.com/etcd-io/etcd/blob/master/Documentation/learning/api_guarantees.md#linearizability

@jingyih
Copy link
Contributor

jingyih commented Aug 21, 2019

Closing for now. Feel free to reopen if there are followup questions.

@jingyih jingyih closed this as completed Aug 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants