In-Memory Engine: Eviction, Algorithmic loading #17023
[REVIEW NOTIFICATION] This pull request has not been approved. To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review.
Hi @afeinberg. Thanks for your PR. I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign @tonyxuqqi
7a41a13
to
4a7dad7
Compare
Just a few minor comments that I expect: I know I will need to add some documentation and make an attempt at tests.
if remaining == 0 {
    break;
}
if self.engine.write().mut_range_manager().evict_range(range) {
TODO: I need to add a check to see if this was recently loaded. It should be easy here.
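One way the "recently loaded" check from the TODO above could look, as a minimal sketch. `LoadTracker`, the `u64` range id, and `min_residency` are hypothetical names introduced for illustration; they are not part of the range cache engine.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical tracker of when each range (keyed by an id here) was loaded.
pub struct LoadTracker {
    loaded_at: HashMap<u64, Instant>,
    min_residency: Duration,
}

impl LoadTracker {
    pub fn new(min_residency: Duration) -> Self {
        Self {
            loaded_at: HashMap::new(),
            min_residency,
        }
    }

    /// Records that the range was loaded at `now`.
    pub fn record_load(&mut self, range_id: u64, now: Instant) {
        self.loaded_at.insert(range_id, now);
    }

    /// True if the range was loaded less than `min_residency` before `now`.
    /// Unknown ranges are treated as not recently loaded.
    pub fn recently_loaded(&self, range_id: u64, now: Instant) -> bool {
        self.loaded_at
            .get(&range_id)
            .map_or(false, |t| now.saturating_duration_since(*t) < self.min_residency)
    }
}
```

An eviction loop could skip any candidate for which `recently_loaded` returns true, which avoids evicting a range right after paying the cost of loading it.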
Should call engine.evict_range directly, as it also does some cleanup work.
When self.engine.write().mut_range_manager().evict_range(range) returns true, the range can be deleted from the skiplist directly, without waiting for any snapshots to be dropped. So we have to delete the range if it returns true, and this logic already exists in RangeCacheMemoryEngine::evict_range. It seems we cannot get the engine here, so we have to delete the range manually.
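The contract described in this comment can be sketched as follows. This is a simplified model, not the TiKV implementation: `Range`, `RangeManager`, and `Engine` are hypothetical stand-ins, a `BTreeMap` stands in for the skiplist, and deferred deletion is reduced to "do nothing".

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for a cached key range (not TiKV's `CacheRange`).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Range {
    pub start: Vec<u8>,
    pub end: Vec<u8>,
}

/// Hypothetical range manager: tracks how many snapshots still pin each range.
#[derive(Default)]
pub struct RangeManager {
    snapshots: HashMap<Range, usize>,
}

impl RangeManager {
    pub fn add_range(&mut self, range: Range, pinned_snapshots: usize) {
        self.snapshots.insert(range, pinned_snapshots);
    }

    /// Stops tracking the range. Returns `true` only when no snapshot pins it,
    /// meaning the caller may delete its data right away.
    pub fn evict_range(&mut self, range: &Range) -> bool {
        match self.snapshots.remove(range) {
            Some(pinned) => pinned == 0,
            None => false,
        }
    }
}

/// Hypothetical engine; the ordered map stands in for the skiplist.
#[derive(Default)]
pub struct Engine {
    pub manager: RangeManager,
    pub data: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl Engine {
    /// Evicts the range and, if nothing pins it, deletes its keys immediately.
    /// Returns the number of keys deleted.
    pub fn evict_range(&mut self, range: &Range) -> usize {
        if !self.manager.evict_range(range) {
            // A real engine would delete later, once the snapshots are dropped.
            return 0;
        }
        let keys: Vec<Vec<u8>> = self
            .data
            .range(range.start.clone()..range.end.clone())
            .map(|(k, _)| k.clone())
            .collect();
        for k in &keys {
            self.data.remove(k);
        }
        keys.len()
    }
}
```

Calling only the manager-level `evict_range` and ignoring a `true` result would leave the keys in the map, which is the leak the reviewer is pointing at.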
Thanks for bringing this to my attention. Will do.
@SpadeA-Tang Done. Is what I am doing now what you had in mind?
/cc @overvenus
e0ea397
to
d203840
Compare
}
if self.engine.write().mut_range_manager().evict_range(range) {
    info!("evict on soft limit reached"; "range" => ?&range, "approx_size" => approx_size, "remaining" => remaining);
    remaining = remaining
There may be a huge difference between the memory usage of the region in the in-memory engine and the region's approximate size.
Yes, that is true. I understand this is imprecise, so this is a best-effort approach. The approximate size comes from region info. Do we have a preferred, more direct way to check this?
@SpadeA-Tang Looks like there's no clean API to get memory usage of a range directly from the range cache engine. I do have an idea on this, but it might take some time to implement. On the other hand, approximate size is most likely to be smaller than the actual size - which would lead to over eviction, which seems safer than under eviction. Any thoughts?
It's indeed over eviction. Actually, the approximate region size is generally estimated by including all MVCC versions as well as RocksDB tombstones, whereas only a small fraction of them may be in the memory engine. So this difference may be huge.
@SpadeA-Tang How difficult would it be to add an API for accurate size? Adding a quick, lock-free way to keep track of a region's memory seems like the right way to do so, perhaps by using a thread local to set the region id (or start and end keys) when allocating memory for that region, or by keeping track of all allocations for a specific range some other way. I can create an issue for this.
@SpadeA-Tang What do you think about making delete_range return the number of bytes it has deleted? I can make this change in this PR. It might have a slight overhead; other than that, I don't see any drawback.
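A minimal sketch of that proposal, assuming a simplified store: `MemStore`, its `delete_range` signature, and `evict_until` are hypothetical and only illustrate how an eviction loop could subtract the bytes actually freed instead of the region's approximate size.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory store; the ordered map stands in for the skiplist.
#[derive(Default)]
pub struct MemStore {
    pub data: BTreeMap<Vec<u8>, Vec<u8>>,
}

impl MemStore {
    pub fn put(&mut self, key: &[u8], value: &[u8]) {
        self.data.insert(key.to_vec(), value.to_vec());
    }

    /// Deletes keys in `[start, end)` and returns the bytes freed
    /// (key length plus value length; per-node overhead is ignored here).
    pub fn delete_range(&mut self, start: &[u8], end: &[u8]) -> usize {
        let keys: Vec<Vec<u8>> = self
            .data
            .range(start.to_vec()..end.to_vec())
            .map(|(k, _)| k.clone())
            .collect();
        let mut freed = 0;
        for k in keys {
            if let Some(v) = self.data.remove(&k) {
                freed += k.len() + v.len();
            }
        }
        freed
    }
}

/// Evicts candidate ranges in order until at least `to_free` bytes are
/// released. Returns how many bytes are still outstanding (0 on success).
pub fn evict_until(
    store: &mut MemStore,
    candidates: &[(Vec<u8>, Vec<u8>)],
    to_free: usize,
) -> usize {
    let mut remaining = to_free;
    for (start, end) in candidates {
        if remaining == 0 {
            break;
        }
        let freed = store.delete_range(start, end);
        // saturating_sub keeps `remaining` from underflowing when a range
        // frees more than was still needed.
        remaining = remaining.saturating_sub(freed);
    }
    remaining
}
```

With the freed byte count coming from the deletion itself, the loop stops as soon as enough memory is released, regardless of how far the region's approximate size is from its in-memory footprint.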
We may add integration tests for this.
5efa46b
to
5db0bea
Compare
dc7fb92
to
a44d420
Compare
Signed-off-by: Alex Feinberg <alex@strlen.net>
acf13db
to
1b5775a
Compare
Fixed the merge conflict.
LGTM
/merge
@zhangjinpeng87: We have migrated to builtin 👉 Please use
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: overvenus, zhangjinpeng87 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/test pull-unit-test
Signed-off-by: Alex Feinberg <alex@strlen.net>
Fixed integration test failure.
ref tikv#16141, ref tikv#16764 In-Memory Engine: Algorithmic Load and Eviction Signed-off-by: Alex Feinberg <alex@strlen.net>
// TODO (afeinberg): This is inefficient.
let _ = self
    .info_provider
    .find_region_by_key(&evicted_range.start)
We should not use evicted_range.start directly here, as the key in CacheRange contains the DATA_PREFIX, whereas the key in the region does not. We should use evicted_range.start[1..] instead. @afeinberg
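The prefix stripping suggested above can be sketched as below. The constant is defined locally for illustration, on the assumption that the data prefix is the single byte `b'z'`; `strip_data_prefix` is a hypothetical helper, not an existing TiKV function.

```rust
/// Assumed single-byte data prefix; defined locally for illustration.
const DATA_PREFIX: u8 = b'z';

/// Returns the key without the data prefix, or `None` if the key does not
/// start with it. Unlike `key[1..]`, this does not panic on an empty key and
/// does not silently drop a first byte that is not the prefix.
pub fn strip_data_prefix(key: &[u8]) -> Option<&[u8]> {
    match key.split_first() {
        Some((&DATA_PREFIX, rest)) => Some(rest),
        _ => None,
    }
}
```

A lookup such as `find_region_by_key` would then receive the stripped key, and a `None` result signals a malformed range boundary instead of producing a wrong region.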
* build: bump tikv pkg version (#17063) Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io> * resolved_ts: refactor log for the unexpected path (#17064) (#17068) ref #16818 Refactor logs for the unexpected path, print both the exsiting row and input key/value. Signed-off-by: cfzjywxk <lsswxrxr@163.com> Co-authored-by: cfzjywxk <lsswxrxr@163.com> * chore: add prow OWNERS files for critial configuration files (#17071) (#17108) close #17004 Signed-off-by: wuhuizuo <wuhuizuo@126.com> Co-authored-by: wuhuizuo <wuhuizuo@126.com> * OWNERS: Auto Sync OWNERS files from community membership (#16973) (#17120) Signed-off-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> Signed-off-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com> Co-authored-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com> Co-authored-by: wuhuizuo <wuhuizuo@126.com> * In-memory engine: strip off read related code from engine.rs to a separte file (#16792) ref #16141 strip off read related code from engine.rs to a separte file Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: collector iteration related metrics (#16735) ref #16141 collector iteration related metrics Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: expose range cache engine configuration (#16791) ref #16141 expose range cache engine configuration Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: do not load a range overlapped with an unresolved range (#16794) ref #16141 do not load a range overlapped with an unresolved range Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * server/node: rename Node to MultiRaftServer (#16783) close #16782 Signed-off-by: zhangjinpeng87 <zzzhangjinpeng@gmail.com> Signed-off-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com> Co-authored-by: lucasliang <nkcs_lykx@hotmail.com> * In-memory engine: add metrics for memory usage (#16822) ref #16141 add metrics for memory usage Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> 
Co-authored-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com> * In-memory engine: add metrics for gc duration, range load duration, and write duration (#16831) ref #16141 add metrics for gc duration, range load duration, and write duration Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: delete lock physically rather than writting tombstone (#16852) ref #16141 delete lock physically rather than writting tombstone Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: fix missing memory record when snapshot loading (#16850) ref #16141 fix missing memory record when snapshot loading Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-Memory Engine: add a tag to the range for easier troubleshooting (#16816) ref #16141 In-Memory Engine: add a tag into Range for diagnostics Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Signed-off-by: Qi Xu <tonyxuqqi@outlook.com> Co-authored-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: Qi Xu <tonyxuqqi@outlook.com> * In-memory engine: remove out of date pending ranges (#16872) ref #16141 remove out of date pending ranges Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * compile memory-engine by default and support dynamic config update for memory-engine (#16897) ref #16141 compile the in-memory-engine by default and support dynamic config update for memory-engine. The enabled flag would support enabled:true -> enabled:false and soft/hard limit also supports online change. But if enabled:false is set initially, the config change to make enabled:true needs the TiKV restart. 
Signed-off-by: tonyxuqqi <tonyxuqi@outlook.com> * In-memory engine: skiplist tombstone should be handled separetely in gc (#16944) ref #16141 skiplist tombstone should be handled separetely in gc Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: manually implement partial_eq for CacheRange to avoid the consideration of tag (#16959) ref #16141 manually implement partial_eq for CacheRange to avoid the consideration of tag Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: tonyxuqqi <tonyxuqi@outlook.com> * In-memory engine: implement core local array and collect read flow statistics (#16945) ref #16141 implement core local array and collect read flow statistics Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: fix seek does not set self.valid properly (#16991) ref #16141 fix seek does not set self.valid properly Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * In-memory engine: add metrics for different types of iteration operations (#16997) ref #16141 add metrics for different types of iteration operations Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: add metrics for range count (#16971) ref #16141 add metrics for range count Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Signed-off-by: Spade A <71589810+SpadeA-Tang@users.noreply.github.com> Co-authored-by: lucasliang <nkcs_lykx@hotmail.com> * In-memory engine: fix gc worker not used (#17032) ref #16141 fix gc worker not used Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * raftstore: fix online config change panic for periodic-full-compact-start-time (#17069) close #17066 fix online config change panic for periodic-full-compact-start-time Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * config: test +HHMM offset in ReadableSchedule (#17081) close #17080 * Test both formats +HHMM offsets and as well as +HH:MM for ReadableSchedule. 
Signed-off-by: Alex Feinberg <alex@strlen.net> * In-memory engine: mvcc keys with different sequence number should only be handled once in GC (#17083) ref #16141, close #17060 mvcc keys with different sequen number should only be handled once in GC Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Signed-off-by: Spade A <71589810+SpadeA-Tang@users.noreply.github.com> Co-authored-by: Neil Shen <overvenus@gmail.com> Co-authored-by: lucasliang <nkcs_lykx@hotmail.com> * In-memory engine: enable compaction filter for HybridEngine (#17087) ref #16141, close #17086 enable compaction filter for HybridEngine Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * In-memory Engine: sequence number should be increased for each key (#17122) ref #16141, close #17114 sequence number should be increased for each key Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: support changing the direction of iteration (#17129) ref #16141, close #17079 support reverse direction when iterating Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * In-memory engine: clean lock cf tombstone in a background worker (#17128) ref #16141, close #17127 clean lock cf tombstone in a background worker Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory Engine: encode internal key trailer in little endian. (#17125) close #17082 An internal key trailer is `(sequence_number << 8) | value_type as u64` and it should be encoded in little endian to be consistent with RocksDB. 
Signed-off-by: Neil Shen <overvenus@gmail.com> * In-memory engine: set boundries when gc (#17145) ref #16141, close #17143 set boundries when gc Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * In-memory engine: encode key for lock cf without mvcc version in delete range (#17142) ref #16141, close #17140 encode key for lock cf without mvcc version in delete range Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * In-memory Engine: get GC safe_point from PD (#17144) close #17123 The `safe_point` is used to check if a cache is eligible to serve a read request. If the `safe_point` drifts into the future, for example by 10 minutes, the cache will not be able to serve any requests for those 10 minutes. This commit fixes such issue by getting the `safe_point` from PD. Signed-off-by: Neil Shen <overvenus@gmail.com> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * In-Memory Engine: Eviction, Algorithmic loading (#17023) ref #16141, ref #16764 In-Memory Engine: Algorithmic Load and Eviction Signed-off-by: Alex Feinberg <alex@strlen.net> * In-memory engine: consider range overlaps when eviction (#17137) ref #16141, close #17131 consider range overlaps when eviction Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: support delete range (#17151) ref #16141 support delete range Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: evict range should consider loading range (#17154) close #17153 evict range should consider loading range Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * In-memory engine: clear write batch when range load failed (#17156) ref #16141, close #17103 clear write batch when range load failed Signed-off-by: SpadeA-Tang 
<u6748471@anu.edu.au> * In-memory engine: fix concurrency issue between delete range and write to memory (#17149) close #17104 fix concurrency issue between delete range and write to memory Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * In-memory engine: add seek duration metrics (#17160) ref #16141 add seek duration metrics Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: add metrics for prepare for write duration (#17161) ref #16141 add metrics for prepare for write duration Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: set disk engine when start (#17165) ref #16141 set disk engine when start Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> * In-memory engine: avoid using snapshot cache (#16863) ref #16141 avoid using snapshot cache Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: evict range after some raft commands (#17159) ref #16141 evict range when becomeing follower, merge, and deleting data ranges. 
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * In-memory engine: `filter` plus one when any key is removed during GC (#17171) ref #16141 filter plus one when any key is removed during GC Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Signed-off-by: Spade A <71589810+SpadeA-Tang@users.noreply.github.com> * periodic reload Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * avoid mem negative Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * make period of reload config Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * add overhead for delte entry size Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> * Fix missing field reload_period Signed-off-by: Neil Shen <overvenus@gmail.com> * Fix get_regions_in_range and BackgroundWorker compile Signed-off-by: Neil Shen <overvenus@gmail.com> --------- Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io> Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au> Signed-off-by: tonyxuqqi <tonyxuqi@outlook.com> Signed-off-by: Alex Feinberg <alex@strlen.net> Signed-off-by: Neil Shen <overvenus@gmail.com> Signed-off-by: Spade A <71589810+SpadeA-Tang@users.noreply.github.com> Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io> Co-authored-by: cfzjywxk <lsswxrxr@163.com> Co-authored-by: wuhuizuo <wuhuizuo@126.com> Co-authored-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com> Co-authored-by: Spade A <71589810+SpadeA-Tang@users.noreply.github.com> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com> Co-authored-by: lucasliang <nkcs_lykx@hotmail.com> Co-authored-by: tonyxuqqi <tonyxuqi@outlook.com> Co-authored-by: SpadeA-Tang <u6748471@anu.edu.au> Co-authored-by: Qi Xu <tonyxuqqi@outlook.com> Co-authored-by: Alex Feinberg <alex@strlen.net>
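One of the referenced commits (#17125) describes the internal key trailer as `(sequence_number << 8) | value_type as u64`, encoded in little endian. A minimal sketch of that encoding follows; the `ValueType` enum and its tag values are assumptions made for illustration, as are the function names.

```rust
/// Hypothetical value-type tags, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum ValueType {
    Deletion = 0,
    Value = 1,
}

/// Packs the sequence number and value type into 8 little-endian bytes.
/// The sequence number must fit in 56 bits; higher bits are shifted out.
pub fn encode_trailer(seq: u64, ty: ValueType) -> [u8; 8] {
    ((seq << 8) | ty as u64).to_le_bytes()
}

/// Unpacks a trailer into `(sequence_number, raw_value_type)`.
pub fn decode_trailer(bytes: [u8; 8]) -> (u64, u8) {
    let packed = u64::from_le_bytes(bytes);
    (packed >> 8, (packed & 0xff) as u8)
}
```

In little endian the value type lands in the first byte of the trailer, followed by the low bytes of the sequence number.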
What is changed and how it works?
Issue Number: Ref #16764 Ref #16141
What's Changed:
Related changes
pingcap/docs
pingcap/docs-cn
Check List
Tests
Configure with range cache engine enabled and load a dataset larger than the soft limit.
Side effects
Release note