Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Record mutations in transactions #97

Merged
merged 12 commits into from Aug 15, 2019

Conversation

@sticnarf
Copy link
Contributor

sticnarf commented Aug 7, 2019

A BTreeMap (prior to HashMap because scan needs supporting) is added to record all key mutations. Mutation inside the library now doesn't contain the key. The Mutations used in RPC is generated in prewrite.

Mutations in a transaction should be queried first on read operations. Although Snapshot is not implemented yet, we can return values in the mutation map now.

@sticnarf sticnarf requested review from nrc, AndreMouche and MyonKeminta Aug 7, 2019
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@sticnarf sticnarf force-pushed the sticnarf:handle-mutations branch from 59d7f45 to ad49e78 Aug 8, 2019
keys_from_snapshot.push(key);
result_indices_from_snapshot.push(result.len());
// Push a placeholder
result.push(KvPair::default());

This comment has been minimized.

Copy link
@nrc

nrc Aug 8, 2019

Contributor

Can we use an Option rather than a null value please>

This comment has been minimized.

Copy link
@sticnarf

sticnarf Aug 8, 2019

Author Contributor

Finally we have to return a Vec<KvPair>. If we use Option, we need one more Vec.

This comment has been minimized.

Copy link
@nrc

nrc Aug 8, 2019

Contributor

I think that we should be returning a pair of a Key and Option, and that that would be fine to use here too. There might be opportunity later to do some interesting optimisations here (if we end up using streams, or with whatever the type signatures end up looking like).

Anyway, I think it is better to use an Option for now since that is the 'right' thing to do, using a null value is premature optimisation, IMO. We can optimise later.

src/transaction/transaction.rs Outdated Show resolved Hide resolved
unimplemented!()
let mut result = Vec::new();
let mut keys_from_snapshot = Vec::new();
let mut result_indices_from_snapshot = Vec::new();

This comment has been minimized.

Copy link
@nrc

nrc Aug 8, 2019

Contributor

Prefer using one Vec of (KvPair, usize)

This comment has been minimized.

Copy link
@sticnarf

sticnarf Aug 8, 2019

Author Contributor

Tt's not convenient to use a Vec<(KvPair, usize)> now. If we want to avoid cloning the key, we have to consume the Vec to get an iterator. Meanwhile, the indices are also consumed but we want to use them sometime later.

This comment has been minimized.

Copy link
@nrc

nrc Aug 12, 2019

Contributor

Hmm, yeah. Makes sense.

@@ -198,6 +244,28 @@ impl Transaction {
pub fn snapshot(&self) -> &Snapshot {
&self.snapshot
}

async fn prewrite(&mut self) -> Result<()> {
// TODO: Too many clones. Consider using bytes::Byte.

This comment has been minimized.

Copy link
@nrc

nrc Aug 8, 2019

Contributor

Depending what we do in the unimplemented parts, we could take self by value to avoid cloning.

use crate::{Key, Value};

mod client;
mod requests;
#[allow(clippy::module_inception)]
mod transaction;

#[derive(Debug, Clone)]
pub enum Mutation {

This comment has been minimized.

Copy link
@nrc

nrc Aug 8, 2019

Contributor

Does Mutation need to be pub any more?

This comment has been minimized.

Copy link
@sticnarf

sticnarf Aug 8, 2019

Author Contributor

No. I'll remove it.

unimplemented!()
pub async fn get(&self, key: impl Into<Key>) -> Result<Value> {
let key = key.into();
if let Some(value) = self.get_from_mutations(&key) {

This comment has been minimized.

Copy link
@nrc

nrc Aug 8, 2019

Contributor

The raw API returns Result<Option<Value>> and I think the transactional one should too. That means get_value will need to be changed to handle deleted values properly.

// Try to fill the result vector from mutations
for key in keys {
let key = key.into();
if let Some(value) = self.get_from_mutations(&key) {

This comment has been minimized.

Copy link
@nrc

nrc Aug 8, 2019

Contributor

nit: prefer match to if let ... else

}

impl Mutation {
fn with_key(self, key: impl Into<Key>) -> kvrpcpb::Mutation {

This comment has been minimized.

Copy link
@nrc

nrc Aug 8, 2019

Contributor

I think this will only ever be called with Keys so you can skip the impl Into.

sticnarf added 3 commits Aug 12, 2019
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@sticnarf sticnarf force-pushed the sticnarf:handle-mutations branch from a41fb07 to 52e9cb2 Aug 12, 2019
Fix tests
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@sticnarf

This comment has been minimized.

Copy link
Contributor Author

sticnarf commented Aug 12, 2019

Now, batch_get API returns a HashMap. I hope this will make it easier for both users and us. Unit tests for the added part of this PR are missing. However, I want to refactor the kv api to be request based first.
PTAL @nrc @youjiali1995

@sticnarf sticnarf requested a review from youjiali1995 Aug 12, 2019
@nrc

This comment has been minimized.

Copy link
Contributor

nrc commented Aug 12, 2019

Now, batch_get API returns a HashMap. I hope this will make it easier for both users and us

Do you have any examples of how real users use batch_get? Another possibility is to return an iterator, I'm not sure which would be more ergonomic.

@sticnarf

This comment has been minimized.

Copy link
Contributor Author

sticnarf commented Aug 12, 2019

Now, batch_get API returns a HashMap. I hope this will make it easier for both users and us

Do you have any examples of how real users use batch_get? Another possibility is to return an iterator, I'm not sure which would be more ergonomic.

BatchGet in the go client of TiKV returns a map: https://github.com/tikv/client-go/blob/master/txnkv/txn.go#L81

I have little idea how real users use batch_get. The only examples I can find is TiDB: https://github.com/pingcap/tidb/blob/master/ddl/index.go#L767 and titan in Meitu: https://github.com/distributedio/titan/blob/3af3679f4ed5d1ec6cc0527ec17243f010539faf/db/db.go#L87

I think it's also fine to return an Iterator of (Key, Option<Value>). Do you think the type will be too complex here?

sticnarf added 2 commits Aug 14, 2019
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@sticnarf

This comment has been minimized.

Copy link
Contributor Author

sticnarf commented Aug 14, 2019

I've changed the return type of batch_get to an iterator of (Key, Option<Value>). And I also add a unit test for it.
@nrc @youjiali1995 PTAL again! Thanks.

Copy link

youjiali1995 left a comment

LGTM

@sticnarf sticnarf referenced this pull request Aug 14, 2019
1 of 1 task complete
pb
}

/// Returns a `Some` if the value can be determined by this mutation. Otherwise, returns `None`.

This comment has been minimized.

Copy link
@mapleFU

mapleFU Aug 14, 2019

Maybe Determined and Undetermined here?

This comment has been minimized.

Copy link
@sticnarf

sticnarf Aug 14, 2019

Author Contributor

Oh, I forgot to modify it.

This comment has been minimized.

Copy link
@sticnarf

sticnarf Aug 14, 2019

Author Contributor

I think the type itself is clear enough so let's just remove this comment.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@nrc
nrc approved these changes Aug 14, 2019
Copy link
Contributor

nrc left a comment

A few minor comments, otherwise it lgtm.

}

impl Mutation {
fn with_key(self, key: Key) -> kvrpcpb::Mutation {

This comment has been minimized.

Copy link
@nrc

nrc Aug 14, 2019

Contributor

I would rename this into_proto or into_proto_with_key (with_* is the idiomatic name for a method which takes a function to operate on an internal value).

(key, mutation_value)
})
.collect::<Vec<_>>()
.into_iter();

This comment has been minimized.

Copy link
@nrc

nrc Aug 14, 2019

Contributor

There should be a better way than to collect then immediately iterate again. Maybe cloned() or add a to_owned to the map function? If you're doing this for the side effects, then please add a comment

This comment has been minimized.

Copy link
@sticnarf

sticnarf Aug 15, 2019

Author Contributor

It's just an eager evaluation of the iterator. Then, the new iterator does not hold a reference to self.

This comment has been minimized.

Copy link
@sticnarf

sticnarf Aug 15, 2019

Author Contributor

Yes, and there is side effect too. Now I think it's better to use a for loop...

MutationValue::Undetermined => match results_from_snapshot.peek() {
Some((key_from_snapshot, _)) if &key == key_from_snapshot => {
results_from_snapshot.next()
}

This comment has been minimized.

Copy link
@nrc

nrc Aug 14, 2019

Contributor

I think it would be good to add a comment inside this method somewhere to explain what you're doing.

This comment has been minimized.

Copy link
@sticnarf

sticnarf Aug 15, 2019

Author Contributor

Sorry, I find I am doing wrong here. These logic should belong to the batch_get in snapshot instead of here.

Signed-off-by: Yilin Chen <sticnarf@gmail.com>
@sticnarf

This comment has been minimized.

Copy link
Contributor Author

sticnarf commented Aug 15, 2019

@nrc @youjiali1995 PTAL again.

@nrc
nrc approved these changes Aug 15, 2019
Copy link
Contributor

nrc left a comment

lgtm

Copy link

youjiali1995 left a comment

LGTM

@sticnarf sticnarf merged commit 54b14b8 into tikv:master Aug 15, 2019
2 checks passed
2 checks passed
DCO All commits are signed off!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@sticnarf sticnarf deleted the sticnarf:handle-mutations branch Aug 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.