Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

txn refactoring part 3 #5857

Merged
merged 3 commits into from Nov 21, 2019
Merged

txn refactoring part 3 #5857

merged 3 commits into from Nov 21, 2019

Conversation

@nrc
Copy link
Contributor

nrc commented Nov 11, 2019

What have you changed?

The first commit is mostly a simple renaming, plus some minor code motion. The second commit moves from using u64 to represent time stamps to using a TimeStamp newtype. This improves type safety and readability.

What is the type of the changes?

  • Engineering (engineering change which doesn't change any feature or fix any issue)

PTAL @AndreMouche @breeswish

@nrc nrc requested review from AndreMouche and breeswish Nov 11, 2019
@nrc

This comment has been minimized.

Copy link
Contributor Author

nrc commented Nov 11, 2019

Note that the noisy intos are mostly in test code. We also get to move some functions onto methods of TimeStamp which improves readability.

TimeStamp((physical << TSO_PHYSICAL_SHIFT_BITS) + logical)
}

pub fn now_sec() -> TimeStamp {

This comment has been minimized.

Copy link
@breeswish

breeswish Nov 11, 2019

Member

I don't think this method makes sense, since TimeStamp is a composition of the physical part and the logical part. This function should be only useful in vary rare and limited cases.

This comment has been minimized.

Copy link
@nrc

nrc Nov 11, 2019

Author Contributor

It's not a new method, it used to be a function in one of the util functions which I just moved to being an inherent method,

components/keys/src/types.rs Outdated Show resolved Hide resolved
Copy link
Member

breeswish left a comment

I doubt the necessarily of replacing the TS in the MVCC/Txn layer. TimeStamp is a composition of the logical part and the physical part, while the MVCC/Txn layer does not rely on this feature to work. Instead they only need a monotonically increasing transaction number. Using TimeStamp type in the MVCC/Txn layer may cause misleading implications. @zhangjinpeng1987 @BusyJay PTAL

@nrc

This comment has been minimized.

Copy link
Contributor Author

nrc commented Nov 11, 2019

Currently we use one u64 everywhere as the timestamp afaict, whether that is logical + physical or just a number (and we usually call it ts or ..._ts, there are safepoints and versions, but these are freely interchanged with other timestamps, there is no conversion step). This PR makes that distinct from region ids or other u64s. I think we can improve further by making two kinds of timestamp, for the two kinds you mention. This PR seems to be a step in the right direction and an improvement over the current situation.

@nrc nrc force-pushed the nrc:refactor3 branch from 9e06def to 01e66f9 Nov 11, 2019
@nrc nrc added the C: Txn label Nov 11, 2019
#[derive(Clone, Copy, Debug, Default, Eq, PartialEq, Ord, PartialOrd, Hash)]
pub struct TimeStamp(u64);

const TSO_PHYSICAL_SHIFT_BITS: u64 = 18;

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Nov 12, 2019

Member

I don't think it is a good idea to expose so many details of ts in TiKV.

This comment has been minimized.

Copy link
@zhangjinpeng1987

zhangjinpeng1987 Nov 12, 2019

Member

If some day PD change the rule or change the meaning of ts, TiKV should keep updated, but it is easy to forget.

This comment has been minimized.

Copy link
@nrc

nrc Nov 12, 2019

Author Contributor

This constant existed before it is not new, it has only been moved. Also it is not public.

This comment has been minimized.

Copy link
@MyonKeminta

MyonKeminta Nov 19, 2019

Contributor

Our check_txn_status and txn_heart_beat API needs to determine whether a ts has its TTL expired, so TiKV needs to know how ts represents physical time. This thing is added to TiKV when we are implementing that two APIs.

@sticnarf

This comment has been minimized.

Copy link
Contributor

sticnarf commented Nov 12, 2019

@zhangjinpeng1987 @breeswish FYI, the current transaction model of TiKV depends on the physical part of the timestamp (and the lock's TTL) to decide whether to cleanup a lock. The timestamp semantics is no longer opaque to TiKV.

@@ -325,7 +324,7 @@ impl<T: PdClient> Runner<T> {
is_hb_receiver_scheduled: false,
region_peers: HashMap::default(),
store_stat: StoreStat::default(),
start_ts: time_now_sec(),
start_ts: TimeStamp::now_sec(),

This comment has been minimized.

Copy link
@breeswish

breeswish Nov 12, 2019

Member

I don't think this is a correct refactor.

The ts used in transaction is a defined as a monotonically increasing transaction id, which is implemented by using a physical timestamp and logical counter.

The timestamp in this line however, is a real unix timestamp (unix epoch) which is not a transaction id.

This comment has been minimized.

Copy link
@nrc

nrc Nov 12, 2019

Author Contributor

The function called is the same, it is just named differently. before after

This comment has been minimized.

Copy link
@breeswish

breeswish Nov 12, 2019

Member

I know it's the same. However you made two mistakes.

Let's simply give the structure TimeStamp a more accurate name TransactionId, then you can see the problem:

Mistake 1. The return value of time_now_sec is a unix epoch, but not a thing that is compatible with TransactionId because of missing the logical part.

Mistake 2. The PD client needs a unix epoch, instead of a transaction id.

This comment has been minimized.

Copy link
@nrc

nrc Nov 12, 2019

Author Contributor

I see, thanks. I think this comes back to the question of whether there are two kinds of timestamp. Let me make a proposal in a top-level comment.

@nrc

This comment has been minimized.

Copy link
Contributor Author

nrc commented Nov 12, 2019

I think the best solution here is to have two timestamp types - one for PD time stamps (physical + logical) and one for txn timestamps (local, seconds in Unix epoch). I propose to do that as a follow-up PR. Doing it before landing will mean doing a lot of work rebasing (this PR bitrots very quickly). I think it is OK to land this PR first because it makes things better, even though it is not 100% correct. Currently not only do we not distinguish between the two kinds of timestamp (the variable names are mostly ts for both kinds and the types are u64), but we also don't distinguish the types from other u64s such as region IDs, ttls, or counts. I'll work on the timestamp split as soon as this lands, so we will not have the one-timestamp situation for very long.

@breeswish @sticnarf @zhangjinpeng1987 does that sound OK?

@nrc nrc force-pushed the nrc:refactor3 branch 4 times, most recently from 6f87d22 to b97c75c Nov 12, 2019
@nrc nrc force-pushed the nrc:refactor3 branch from b97c75c to b23ce2d Nov 14, 2019
src/storage/mvcc/txn.rs Outdated Show resolved Hide resolved
src/server/service/kv.rs Outdated Show resolved Hide resolved
src/storage/mod.rs Outdated Show resolved Hide resolved
src/storage/mvcc/txn.rs Outdated Show resolved Hide resolved
Copy link
Contributor

youjiali1995 left a comment

Rest lgtm.

@nrc nrc force-pushed the nrc:refactor3 branch 3 times, most recently from 45410b1 to 930169e Nov 18, 2019
@nrc

This comment has been minimized.

Copy link
Contributor Author

nrc commented Nov 18, 2019

@youjiali1995 comments are addressed in the last commit. PTAL

@nrc nrc force-pushed the nrc:refactor3 branch 2 times, most recently from 10952ed to 1ff977f Nov 18, 2019
components/keys/src/time.rs Outdated Show resolved Hide resolved
components/keys/src/time.rs Outdated Show resolved Hide resolved
benches/deadlock_detector/mod.rs Outdated Show resolved Hide resolved
entry.set_key_hash(self.rng.gen());
entries.push(entry);
});
self.timestamp += 1;
self.timestamp = self.timestamp.incr();

This comment has been minimized.

Copy link
@MyonKeminta

MyonKeminta Nov 19, 2019

Contributor

How about impl Add<u64> for TimeStamp(Output = TimeStamp) and impl Sub<TimeStamp> for TimeStamp(Output = u64) , and also AddAssign and SubAssign so that it will be more convenient to write arithmetic calculations for ts?

This comment has been minimized.

Copy link
@nrc

nrc Nov 20, 2019

Author Contributor

This particular instance has gone, but I think an increment in place function would be nice. I prefer incr to using + 1 the way that a timestamp is incremented should be an implementation detail.

components/backup/src/endpoint.rs Outdated Show resolved Hide resolved
let len = key.len();
if len < number::U64_SIZE {
return Err(codec::Error::KeyLength);
}
let mut ts = &key[len - number::U64_SIZE..];
number::decode_u64_desc(&mut ts)
Ok(TimeStamp(number::decode_u64_desc(&mut ts)?))

This comment has been minimized.

Copy link
@MyonKeminta

MyonKeminta Nov 19, 2019

Contributor

Can we write number::decode_u64_desc(&mut ts).map(TimeStamp::new) here?

This comment has been minimized.

Copy link
@nrc

nrc Nov 20, 2019

Author Contributor

No because the error types are different we would need to map the error through into too.

let bypass_locks = TsSet::new(
context
.take_resolved_locks()
.into_iter()
.map(Into::into)
.collect(),
);
Comment on lines 120 to 126

This comment has been minimized.

Copy link
@MyonKeminta

MyonKeminta Nov 19, 2019

Contributor

How about keep TsSet's inner Vec unchanged, and do the conversion in TsSet's contains method? Otherwise, this conversion seems causes an extra allocation.

This comment has been minimized.

Copy link
@nrc

nrc Nov 20, 2019

Author Contributor

I've used transmute.

@nrc nrc force-pushed the nrc:refactor3 branch from 1ff977f to 8c8bbb2 Nov 19, 2019
@nrc nrc force-pushed the nrc:refactor3 branch from 8c8bbb2 to f627b6a Nov 20, 2019
pub fn from_u64s(ts: Vec<u64>) -> Self {
Self::new(ts.into_iter().map(Into::into).collect())
// This conversion is safe because TimeStamp is a transparent wrapper over u64.
let ts = unsafe { ::std::mem::transmute::<Vec<u64>, Vec<TimeStamp>>(ts) };
Self::new(ts)
}
Comment on lines 62 to 65

This comment has been minimized.

Copy link
@MyonKeminta

MyonKeminta Nov 20, 2019

Contributor

It seems that this requires that the TimeStamp is the same with u64 im memory, which relies on TimeStamp's internal implementation. How do you think?

This comment has been minimized.

Copy link
@nrc

nrc Nov 20, 2019

Author Contributor

I think it is OK - it would be hard to change TimeStamp without causing a compile error and TsSet seems very closely related to TimeStamp. I could move the Vec -> Vec conversion to TimeStamp if you prefer that?

Copy link
Contributor

MyonKeminta left a comment

This PR mostly LGTM. I'm not sure if TimeStamp's next and prev, and TsSet's transmute are good ideas. I'll accept this if other ones think it's ok.

Copy link
Contributor

sticnarf left a comment

LGTM

nrc added 3 commits Nov 7, 2019
Signed-off-by: Nick Cameron <nrc@ncameron.org>
Signed-off-by: Nick Cameron <nrc@ncameron.org>
Signed-off-by: Nick Cameron <nrc@ncameron.org>
@nrc nrc dismissed stale reviews from sticnarf and MyonKeminta via 56cf016 Nov 20, 2019
@nrc nrc force-pushed the nrc:refactor3 branch from f627b6a to 56cf016 Nov 20, 2019
Copy link
Contributor

youjiali1995 left a comment

LGTM

Copy link
Contributor

sticnarf left a comment

LGTM

@youjiali1995 youjiali1995 merged commit 6fff85c into tikv:master Nov 21, 2019
3 checks passed
3 checks passed
DCO All commits are signed off!
Details
idc-jenkins-ci-tikv/integration-common-test Jenkins job succeeded.
Details
idc-jenkins-ci/test Jenkins job succeeded.
Details
hawkingrei added a commit to hawkingrei/tikv that referenced this pull request Nov 22, 2019
Signed-off-by: Nick Cameron <nrc@ncameron.org>
hawkingrei added a commit to hawkingrei/tikv that referenced this pull request Nov 22, 2019
Signed-off-by: Nick Cameron <nrc@ncameron.org>
hawkingrei added a commit to hawkingrei/tikv that referenced this pull request Nov 22, 2019
Signed-off-by: Nick Cameron <nrc@ncameron.org>
hawkingrei added a commit to hawkingrei/tikv that referenced this pull request Dec 1, 2019
Signed-off-by: Nick Cameron <nrc@ncameron.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.