Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In-memory Engine: lock should not be deleted directly #17127

Closed
Tracked by #16141
SpadeA-Tang opened this issue Jun 12, 2024 · 2 comments · Fixed by #17128
Closed
Tracked by #16141

In-memory Engine: lock should not be deleted directly #17127

SpadeA-Tang opened this issue Jun 12, 2024 · 2 comments · Fixed by #17128
Labels
severity/critical type/bug The issue is confirmed as a bug.

Comments

@SpadeA-Tang
Copy link
Member

Bug Report

For delete cmd of lock cf, the delete is performed directly on in-memory engine rather than write tombsone. This is considered that we always want to read the latest value of the lock cf. However, it volates the write atomicity of the in-memory engine and rocksdb.
The complete process for writing a write batch is as follows:
rocksdb:write_callback -> .. -> rocksdb:write to memtable and callback with seqno. -> in-memory engine: write_impl -> rocksdb:publish seqno.
We can see that only when the last step that rocksdb publish the seqno will the write be visible. So we cannot avoid to write tombstone in lock-cf to guarantee the write atomicity.

One of the abnormal caused by this issue is: #17018

@seiya-annie
Copy link

/found customer

@ti-chi-bot ti-chi-bot bot added the report/customer Customers have encountered this bug. label Jun 19, 2024
@seiya-annie
Copy link

/remove-found customer

@ti-chi-bot ti-chi-bot bot removed the report/customer Customers have encountered this bug. label Jun 19, 2024
overvenus added a commit that referenced this issue Jun 21, 2024
* build: bump tikv pkg version (#17063)

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

* resolved_ts: refactor log for the unexpected path (#17064) (#17068)

ref #16818

Refactor logs for the unexpected path, print both the exsiting row and input key/value.

Signed-off-by: cfzjywxk <lsswxrxr@163.com>

Co-authored-by: cfzjywxk <lsswxrxr@163.com>

* chore: add prow OWNERS files for critial configuration files (#17071) (#17108)

close #17004

Signed-off-by: wuhuizuo <wuhuizuo@126.com>

Co-authored-by: wuhuizuo <wuhuizuo@126.com>

* OWNERS: Auto Sync OWNERS files from community membership (#16973) (#17120)

 

Signed-off-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Signed-off-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>

Co-authored-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>
Co-authored-by: wuhuizuo <wuhuizuo@126.com>

* In-memory engine: strip off read related code from engine.rs to a separte file (#16792)

ref #16141

strip off read related code from engine.rs to a separte file

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: collector iteration related metrics (#16735)

ref #16141

collector iteration related metrics

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: expose range cache engine configuration (#16791)

ref #16141

expose range cache engine configuration

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: do not load a range overlapped with an unresolved range (#16794)

ref #16141

do not load a range overlapped with an unresolved range

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* server/node: rename Node to MultiRaftServer (#16783)

close #16782

Signed-off-by: zhangjinpeng87 <zzzhangjinpeng@gmail.com>
Signed-off-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>

Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>

* In-memory engine: add metrics for memory usage (#16822)

ref #16141

add metrics for memory usage

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>

* In-memory engine: add metrics for gc duration, range load duration, and write duration (#16831)

ref #16141

add metrics for gc duration, range load duration, and write duration

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: delete lock physically rather than writting tombstone (#16852)

ref #16141

delete lock physically rather than writting tombstone

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: fix missing memory record when snapshot loading (#16850)

ref #16141

fix missing memory record when snapshot loading

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-Memory Engine: add a tag to the range for easier troubleshooting (#16816)

ref #16141

In-Memory Engine: add a tag into Range for diagnostics

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: Qi Xu <tonyxuqqi@outlook.com>

Co-authored-by: SpadeA-Tang <u6748471@anu.edu.au>
Co-authored-by: Qi Xu <tonyxuqqi@outlook.com>

* In-memory engine: remove out of date pending ranges (#16872)

ref #16141

remove out of date pending ranges

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* compile memory-engine by default and support dynamic config update for memory-engine (#16897)

ref #16141

compile the in-memory-engine by default and support dynamic config update for memory-engine.
The enabled flag would support enabled:true -> enabled:false and soft/hard limit also supports online change. 
But if enabled:false is set initially, the config change to make enabled:true needs the TiKV restart.

Signed-off-by: tonyxuqqi <tonyxuqi@outlook.com>

* In-memory engine: skiplist tombstone should be handled separetely in gc (#16944)

ref #16141

skiplist tombstone should be handled separetely in gc

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: manually implement partial_eq for CacheRange to avoid the consideration of tag (#16959)

ref #16141

manually implement partial_eq for CacheRange to avoid the consideration of tag

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: tonyxuqqi <tonyxuqi@outlook.com>

* In-memory engine: implement core local array and collect read flow statistics (#16945)

ref #16141

implement core local array and collect read flow statistics

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: fix seek does not set self.valid properly (#16991)

ref #16141

fix seek does not set self.valid properly

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: add metrics for different types of iteration operations (#16997)

ref #16141

add metrics for different types of iteration operations

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: add metrics for range count (#16971)

ref #16141

add metrics for range count

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>

Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>

* In-memory engine: fix gc worker not used (#17032)

ref #16141

fix gc worker not used

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* raftstore: fix online config change panic for periodic-full-compact-start-time (#17069)

close #17066

fix online config change panic for periodic-full-compact-start-time

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* config: test +HHMM offset in ReadableSchedule (#17081)

close #17080

* Test both formats +HHMM offsets and as well as +HH:MM for ReadableSchedule.

Signed-off-by: Alex Feinberg <alex@strlen.net>

* In-memory engine: mvcc keys with different sequence number should only be handled once in GC (#17083)

ref #16141, close #17060

mvcc keys with different sequen number should only be handled once in GC

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>

Co-authored-by: Neil Shen <overvenus@gmail.com>
Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>

* In-memory engine: enable compaction filter for HybridEngine (#17087)

ref #16141, close #17086

enable compaction filter for HybridEngine

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory Engine: sequence number should be increased for each key (#17122)

ref #16141, close #17114

sequence number should be increased for each key

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: support changing the direction of iteration (#17129)

ref #16141, close #17079

support reverse direction when iterating

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: clean lock cf tombstone in a background worker (#17128)

ref #16141, close #17127

clean lock cf tombstone in a background worker

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory Engine: encode internal key trailer in little endian. (#17125)

close #17082

An internal key trailer is `(sequence_number << 8) | value_type as u64`
and it should be encoded in little endian to be consistent with RocksDB.

Signed-off-by: Neil Shen <overvenus@gmail.com>

* In-memory engine: set boundries when gc (#17145)

ref #16141, close #17143

set boundries when gc

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: encode key for lock cf without mvcc version in delete range (#17142)

ref #16141, close #17140

encode key for lock cf without mvcc version in delete range

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory Engine: get GC safe_point from PD (#17144)

close #17123

The `safe_point` is used to check if a cache is eligible to serve a read request.
If the `safe_point` drifts into the future, for example by 10 minutes, the cache
will not be able to serve any requests for those 10 minutes.

This commit fixes such issue by getting the `safe_point` from PD.

Signed-off-by: Neil Shen <overvenus@gmail.com>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-Memory Engine: Eviction, Algorithmic loading (#17023)

ref #16141, ref #16764

In-Memory Engine: Algorithmic Load and Eviction

Signed-off-by: Alex Feinberg <alex@strlen.net>

* In-memory engine: consider range overlaps when eviction (#17137)

ref #16141, close #17131

consider range overlaps when eviction

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: support delete range (#17151)

ref #16141

support delete range

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: evict range should consider loading range (#17154)

close #17153

evict range should consider loading range

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: clear write batch when range load failed (#17156)

ref #16141, close #17103

clear write batch when range load failed

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: fix concurrency issue between delete range and write to memory (#17149)

close #17104

fix concurrency issue between delete range and write to memory

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: add seek duration metrics (#17160)

ref #16141

add seek duration metrics

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: add metrics for prepare for write duration (#17161)

ref #16141

add metrics for prepare for write duration

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: set disk engine when start (#17165)

ref #16141

set disk engine when start

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

* In-memory engine: avoid using snapshot cache (#16863)

ref #16141

avoid using snapshot cache

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: evict range after some raft commands (#17159)

ref #16141

evict range when becomeing follower, merge, and deleting data ranges.

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* In-memory engine: `filter` plus one when any key is removed during GC (#17171)

ref #16141

filter plus one when any key is removed during GC

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>

* periodic reload

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* avoid mem negative

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* make period of reload config

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* add overhead for delte entry size

Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>

* Fix missing field reload_period

Signed-off-by: Neil Shen <overvenus@gmail.com>

* Fix get_regions_in_range and BackgroundWorker compile

Signed-off-by: Neil Shen <overvenus@gmail.com>

---------

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
Signed-off-by: SpadeA-Tang <u6748471@anu.edu.au>
Signed-off-by: tonyxuqqi <tonyxuqi@outlook.com>
Signed-off-by: Alex Feinberg <alex@strlen.net>
Signed-off-by: Neil Shen <overvenus@gmail.com>
Signed-off-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>
Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
Co-authored-by: cfzjywxk <lsswxrxr@163.com>
Co-authored-by: wuhuizuo <wuhuizuo@126.com>
Co-authored-by: Jinpeng Zhang <zzzhangjinpeng@gmail.com>
Co-authored-by: Spade  A <71589810+SpadeA-Tang@users.noreply.github.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Co-authored-by: lucasliang <nkcs_lykx@hotmail.com>
Co-authored-by: tonyxuqqi <tonyxuqi@outlook.com>
Co-authored-by: SpadeA-Tang <u6748471@anu.edu.au>
Co-authored-by: Qi Xu <tonyxuqqi@outlook.com>
Co-authored-by: Alex Feinberg <alex@strlen.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/critical type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants