Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In-memory engine: implement core local array and collect read flow statistics #16945

Merged
merged 20 commits into from
May 9, 2024

Conversation

SpadeA-Tang
Copy link
Member

@SpadeA-Tang SpadeA-Tang commented May 6, 2024

What is changed and how it works?

Issue Number: Ref #16141

What's Changed:

implement core local array and collect read flow statistics

This PR implements CoreLocalArray (which is implemented in RocksDB) for collecting metrics where, ideally, each thread only report stats to the cpu core executing it to readuce contention. In addition, this PR collects read flow and report to grafana like below:
image

Related changes

  • PR to update pingcap/docs/pingcap/docs-cn:
  • Need to cherry-pick to the release branch

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Release note

None

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Copy link
Contributor

ti-chi-bot bot commented May 6, 2024

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • LykxSassinator
  • tonyxuqqi

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note-none size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 6, 2024
SpadeA-Tang and others added 6 commits May 6, 2024 10:57
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
u
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
u
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Copy link
Contributor

@afeinberg afeinberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @tonyxuqqi any thoughts on RangeCacheOptions?

Copy link
Contributor

ti-chi-bot bot commented May 8, 2024

@afeinberg: Thanks for your review. The bot only counts approvals from reviewers and higher roles in list, but you're still welcome to leave your comments.

In response to this:

LGTM. @tonyxuqqi any thoughts on RangeCacheOptions?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@@ -417,6 +435,14 @@ impl Iterator for RangeCacheIterator {
if self.valid {
self.find_next_visible_key(true, guard);
}

if self.valid {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not merge with the self.valid above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the code is duplicated by many times, make it an shared inline function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after find_next_visible_key, the self.valid may not be true.

Copy link
Contributor

@tonyxuqqi tonyxuqqi May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment to mention that self.valid can be changed. It's not straightforward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we should not have duplicated code when possible

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have polished that.

Copy link
Contributor

@tonyxuqqi tonyxuqqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM. Some details can be polished.

@ti-chi-bot ti-chi-bot bot added the status/LGT1 Indicates that a PR has LGTM 1. label May 8, 2024
@@ -701,15 +704,15 @@ impl<T: fmt::Display + Send + 'static> Stop for LazyWorker<T> {

pub trait KvEngineBuilder: KvEngine {
fn build(
range_cache_engine_config: Arc<VersionTrack<RangeCacheEngineConfig>>,
range_cache_engine_config: RangeCacheEngineOptions,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the name and type does not match.

@@ -99,3 +104,21 @@ impl RangeCacheEngineConfig {
}
}
}

pub struct RangeCacheEngineOptions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming of RangeCacheEngineOptions is not appropriate. Statistics should not be part of Option.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think statistics should be just a member of RangeCacheMemoryEngine directly and initialized inside its constructor.

@tonyxuqqi
Copy link
Contributor

LGTM. @tonyxuqqi any thoughts on RangeCacheOptions?

RangeCacheOptions is not necessary and it's also impropriate to include statistics as part of Options

@SpadeA-Tang
Copy link
Member Author

LGTM. @tonyxuqqi any thoughts on RangeCacheOptions?

RangeCacheOptions is not necessary and it's also impropriate to include statistics as part of Options

RocksDB put the Statistics in its db option. Also, we need this so we can get the statistics at the TiKVServer::init_raw_engines.

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
u
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
@SpadeA-Tang
Copy link
Member Author

/merge

Copy link
Contributor

ti-chi-bot bot commented May 9, 2024

@SpadeA-Tang: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

Copy link
Contributor

ti-chi-bot bot commented May 9, 2024

@SpadeA-Tang: /merge in this pull request requires 2 approval(s).

In response to this:

/merge

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 9, 2024
@SpadeA-Tang
Copy link
Member Author

/merge

Copy link
Contributor

ti-chi-bot bot commented May 9, 2024

@SpadeA-Tang: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

You only need to trigger /merge once, and if the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

If you have any questions about the PR merge process, please refer to pr process.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

Copy link
Contributor

ti-chi-bot bot commented May 9, 2024

This pull request has been accepted and is ready to merge.

Commit hash: 95e8ef6

@ti-chi-bot ti-chi-bot bot added the status/can-merge Indicates a PR has been approved by a committer. label May 9, 2024
@ti-chi-bot ti-chi-bot bot merged commit f7ccc07 into tikv:master May 9, 2024
6 of 7 checks passed
@ti-chi-bot ti-chi-bot bot added this to the Pool milestone May 9, 2024
overvenus pushed a commit to overvenus/tikv that referenced this pull request Jun 17, 2024
…atistics (tikv#16945)

ref tikv#16141

implement core local array and collect read flow statistics

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
overvenus added a commit that referenced this pull request Jun 21, 2024
* build: bump tikv pkg version (#17063)

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

* resolved_ts: refactor log for the unexpected path (#17064) (#17068)

ref #16818

Refactor logs for the unexpected path, print both the exsiting row and input key/value.

Signed-off-by: cfzjywxk <lsswxrxr@163.com>

Co-authored-by: cfzjywxk <lsswxrxr@163.com>

* chore: add prow OWNERS files for critial configuration files (#17071) (#17108)

close #17004

Signed-off-by: wuhuizuo <wuhuizuo@126.com>

Co-authored-by: wuhuizuo <wuhuizuo@126.com>

* OWNERS: Auto Sync OWNERS files from community membership (#16973) (#17120)

 

Signed-off-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>

Co-authored-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>
Co-authored-by: wuhuizuo <wuhuizuo@126.com>

* In-memory engine: strip off read related code from engine.rs to a separte file (#16792)

ref #16141

strip off read related code from engine.rs to a separte file

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: collector iteration related metrics (#16735)

ref #16141

collector iteration related metrics

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: expose range cache engine configuration (#16791)

ref #16141

expose range cache engine configuration

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: do not load a range overlapped with an unresolved range (#16794)

ref #16141

do not load a range overlapped with an unresolved range

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* server/node: rename Node to MultiRaftServer (#16783)

close #16782

Signed-off-by: zhangjinpeng87 <zzzhangjinpeng@gmail.com>
Signed-off-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>

Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>

* In-memory engine: add metrics for memory usage (#16822)

ref #16141

add metrics for memory usage

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>

* In-memory engine: add metrics for gc duration, range load duration, and write duration (#16831)

ref #16141

add metrics for gc duration, range load duration, and write duration

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: delete lock physically rather than writting tombstone (#16852)

ref #16141

delete lock physically rather than writting tombstone

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: fix missing memory record when snapshot loading (#16850)

ref #16141

fix missing memory record when snapshot loading

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-Memory Engine: add a tag to the range for easier troubleshooting (#16816)

ref #16141

In-Memory Engine: add a tag into Range for diagnostics

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: Qi Xu <tonyxuqqi@outlook.com>

Co-authored-by: SpadeA-Tang <u6748471@anu.edu.au>
Co-authored-by: Qi Xu <tonyxuqqi@outlook.com>

* In-memory engine: remove out of date pending ranges (#16872)

ref #16141

remove out of date pending ranges

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* compile memory-engine by default and support dynamic config update for memory-engine (#16897)

ref #16141

compile the in-memory-engine by default and support dynamic config update for memory-engine.
The enabled flag would support enabled:true -> enabled:false and soft/hard limit also supports online change. 
But if enabled:false is set initially, the config change to make enabled:true needs the TiKV restart.

Signed-off-by: tonyxuqqi <tonyxuqi@outlook.com>

* In-memory engine: skiplist tombstone should be handled separetely in gc (#16944)

ref #16141

skiplist tombstone should be handled separetely in gc

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: manually implement partial_eq for CacheRange to avoid the consideration of tag (#16959)

ref #16141

manually implement partial_eq for CacheRange to avoid the consideration of tag

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: tonyxuqqi <tonyxuqi@outlook.com>

* In-memory engine: implement core local array and collect read flow statistics (#16945)

ref #16141

implement core local array and collect read flow statistics

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: fix seek does not set self.valid properly (#16991)

ref #16141

fix seek does not set self.valid properly

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: add metrics for different types of iteration operations (#16997)

ref #16141

add metrics for different types of iteration operations

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: add metrics for range count (#16971)

ref #16141

add metrics for range count

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>

Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>

* In-memory engine: fix gc worker not used (#17032)

ref #16141

fix gc worker not used

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* raftstore: fix online config change panic for periodic-full-compact-start-time (#17069)

close #17066

fix online config change panic for periodic-full-compact-start-time

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* config: test +HHMM offset in ReadableSchedule (#17081)

close #17080

* Test both formats +HHMM offsets and as well as +HH:MM for ReadableSchedule.

Signed-off-by: Alex Feinberg <alex@strlen.net>

* In-memory engine: mvcc keys with different sequence number should only be handled once in GC (#17083)

ref #16141, close #17060

mvcc keys with different sequen number should only be handled once in GC

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>

Co-authored-by: Neil Shen <overvenus@gmail.com>
Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>

* In-memory engine: enable compaction filter for HybridEngine (#17087)

ref #16141, close #17086

enable compaction filter for HybridEngine

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory Engine: sequence number should be increased for each key (#17122)

ref #16141, close #17114

sequence number should be increased for each key

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: support changing the direction of iteration (#17129)

ref #16141, close #17079

support reverse direction when iterating

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: clean lock cf tombstone in a background worker (#17128)

ref #16141, close #17127

clean lock cf tombstone in a background worker

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory Engine: encode internal key trailer in little endian. (#17125)

close #17082

An internal key trailer is `(sequence_number << 8) | value_type as u64`
and it should be encoded in little endian to be consistent with RocksDB.

Signed-off-by: Neil Shen <overvenus@gmail.com>

* In-memory engine: set boundries when gc (#17145)

ref #16141, close #17143

set boundries when gc

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: encode key for lock cf without mvcc version in delete range (#17142)

ref #16141, close #17140

encode key for lock cf without mvcc version in delete range

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory Engine: get GC safe_point from PD (#17144)

close #17123

The `safe_point` is used to check if a cache is eligible to serve a read request.
If the `safe_point` drifts into the future, for example by 10 minutes, the cache
will not be able to serve any requests for those 10 minutes.

This commit fixes such issue by getting the `safe_point` from PD.

Signed-off-by: Neil Shen <overvenus@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-Memory Engine: Eviction, Algorithmic loading (#17023)

ref #16141, ref #16764

In-Memory Engine: Algorithmic Load and Eviction

Signed-off-by: Alex Feinberg <alex@strlen.net>

* In-memory engine: consider range overlaps when eviction (#17137)

ref #16141, close #17131

consider range overlaps when eviction

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: support delete range (#17151)

ref #16141

support delete range

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: evict range should consider loading range (#17154)

close #17153

evict range should consider loading range

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: clear write batch when range load failed (#17156)

ref #16141, close #17103

clear write batch when range load failed

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: fix concurrency issue between delete range and write to memory (#17149)

close #17104

fix concurrency issue between delete range and write to memory

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: add seek duration metrics (#17160)

ref #16141

add seek duration metrics

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: add metrics for prepare for write duration (#17161)

ref #16141

add metrics for prepare for write duration

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: set disk engine when start (#17165)

ref #16141

set disk engine when start

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: avoid using snapshot cache (#16863)

ref #16141

avoid using snapshot cache

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: evict range after some raft commands (#17159)

ref #16141

evict range when becomeing follower, merge, and deleting data ranges.

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: `filter` plus one when any key is removed during GC (#17171)

ref #16141

filter plus one when any key is removed during GC

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>

* periodic reload

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* avoid mem negative

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* make period of reload config

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* add overhead for delte entry size

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* Fix missing field reload_period

Signed-off-by: Neil Shen <overvenus@gmail.com>

* Fix get_regions_in_range and BackgroundWorker compile

Signed-off-by: Neil Shen <overvenus@gmail.com>

---------

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: tonyxuqqi <tonyxuqi@outlook.com>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Neil Shen <overvenus@gmail.com>
Signed-off-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>
Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Co-authored-by: cfzjywxk <lsswxrxr@163.com>
Co-authored-by: wuhuizuo <wuhuizuo@126.com>
Co-authored-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>
Co-authored-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
Co-authored-by: tonyxuqqi <tonyxuqi@outlook.com>
Co-authored-by: SpadeA-Tang <u6748471@anu.edu.au>
Co-authored-by: Qi Xu <tonyxuqqi@outlook.com>
Co-authored-by: Alex Feinberg <alex@strlen.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note-none size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants