Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TiKV Rust Client RFC #7

Open
wants to merge 36 commits into
base: master
from
Open
Changes from 7 commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
7419c9b
TiKV Rust Client
sunxiaoguang Oct 25, 2018
64aad0b
Add more about API usage
sunxiaoguang Oct 31, 2018
57cc178
Fix some gramma mistakes
sunxiaoguang Nov 2, 2018
e82ac4e
Fix indentation
sunxiaoguang Nov 2, 2018
43ed161
Fixed a typo
sunxiaoguang Nov 5, 2018
5368649
Do not move key for get, scan and delete methods
sunxiaoguang Nov 8, 2018
0cfcaff
Remove KeyRange and use RangeBounds trait instead
sunxiaoguang Nov 9, 2018
aea8d85
Use builder style interface for raw kv
sunxiaoguang Nov 14, 2018
510791f
Use impl Trait whenever possible
sunxiaoguang Nov 14, 2018
bf8ddab
Update example code
sunxiaoguang Nov 15, 2018
5cb6085
Update crate name to 'tikv-client'
sunxiaoguang Nov 15, 2018
584e88a
Add example code to create Config with security
sunxiaoguang Nov 15, 2018
b239698
Add transaction isolation level example
sunxiaoguang Nov 15, 2018
7dd712d
Merge branch 'master' into tikv-client-rust
Hoverbear Nov 15, 2018
78d6100
Clarify that Raw/Transaction APIs are mutually exclusive
sunxiaoguang Nov 16, 2018
cbfceb5
Do not use trait in APIs
sunxiaoguang Nov 16, 2018
9215f8c
Fix formatting issue with wrong indention
sunxiaoguang Nov 16, 2018
6b22b3e
Fix indention and threading model
sunxiaoguang Nov 18, 2018
9b892a3
Use concrete future type for transactional api
sunxiaoguang Nov 18, 2018
c8b0167
Merge branch 'master' into tikv-client-rust
Hoverbear Nov 19, 2018
80457b2
Change example to reflect changes of Config
sunxiaoguang Nov 19, 2018
b68e34b
Remove `Programming Model` section and make it TBD
sunxiaoguang Nov 20, 2018
bd05fde
Fix a typo
sunxiaoguang Nov 21, 2018
e3c080c
Format text to make lines fit into 80 characters
sunxiaoguang Nov 22, 2018
04569e8
provide a C-compatible binding for future clients
sunxiaoguang Nov 22, 2018
e692e80
Merge branch 'master' into tikv-client-rust
Hoverbear Dec 20, 2018
1a7c67f
Fix lints
Hoverbear Dec 20, 2018
e4ef825
add schedule limit design (#13)
disksing Dec 24, 2018
aa950ff
Unified Log Format RFC (#18)
breeswish Jan 7, 2019
5eb89c3
RFC: PD Simulator (#20)
rleungx Jan 28, 2019
252eb33
Update RFC to match code
Hoverbear Feb 20, 2019
9d2e4c3
Merge branch 'master' into tikv-client-rust
Hoverbear Feb 21, 2019
41d029d
Fix lints
Hoverbear Feb 25, 2019
2c77911
Merge branch 'tikv-client-rust' of github.com:sunxiaoguang/rfcs into …
Hoverbear Feb 25, 2019
bbb504c
Merge branch 'master' into tikv-client-rust
Hoverbear May 9, 2019
1ff37ef
Merge branch 'master' into tikv-client-rust
Hoverbear Jul 29, 2019
File filter...
Filter file types
Jump to…
Jump to file or symbol
Failed to load files and symbols.

Always

Just for now

@@ -0,0 +1,395 @@
# Summary

Introduce a full featured, official TiKV (and PD) Rust client. It is intended to be used as a reference implementation, or to provide C-compatible binding for future clients.

# Motivation

Currently, users of TiKV must use [TiDB's Go Client](https://github.com/pingcap/tidb/blob/master/store/tikv/client.go), which is not well packaged or documented. We would like to ensure that users can easily use TiKV and PD without needing to use TiDB.

This comment has been minimized.

Copy link
@crazycoder1988

crazycoder1988 Nov 14, 2018

Suggested change
Currently, users of TiKV must use [TiDB's Go Client](https://github.com/pingcap/tidb/blob/master/store/tikv/client.go), which is not well packaged or documented. We would like to ensure that users can easily use TiKV and PD without needing to use TiDB.
Currently, TiKV users must use [TiKV's Go Client provided by TiDB](https://github.com/pingcap/tidb/blob/master/store/tikv/client.go), which is not well packaged or documented. We would like to ensure that users can easily use TiKV and PD without needing to use TiDB.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

@lilin90 Any comments?

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Nov 15, 2018

Member

Both are OK. But I think we can add the Java client now: https://github.com/tikv/client-java


We think this would help encourage community participation in the TiKV project and associated libraries. During talks with several potential corporate users we discovered that there was an interest in using TiKV to resolve concerns such as caching and raw key-value stores.

# Detailed design
## Supported targets
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@crazycoder1988

We will target the `stable` channel of Rust starting in the Rust 2018 edition. We choose to begin with Rust 2018 so we do not need to concern ourselves with an upgrade path.

We will also support the most recent `nightly` version of Rust, but users should not feel the need to reach for nightly unless they are already using it.

## Naming

The [Rust API Guidelines](https://rust-lang-nursery.github.io/api-guidelines/naming.html) do not perscribe any particular crate name convention.
This conversation was marked as resolved by Hoverbear

This comment has been minimized.

Copy link
@sticnarf

sticnarf Nov 21, 2018

Suggested change
The [Rust API Guidelines](https://rust-lang-nursery.github.io/api-guidelines/naming.html) do not perscribe any particular crate name convention.
The [Rust API Guidelines](https://rust-lang-nursery.github.io/api-guidelines/naming.html) do not prescribe any particular crate name convention.

Fix a typo

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 21, 2018

Author

Fixed, Thanks a lot


We choose to name the crate `tikv_client` to conform to the constraints presented to us by the Rust compiler. Cargo permits `tikv-client` and `tikv_client`, but `rustc` does not permit `tikv-client` so we choose to use `tikv_client` to reduce mental overhead.
This conversation was marked as resolved by Hoverbear

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Oct 31, 2018

Member

We chose tikv-client.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

@Hoverbear PTAL about this line. I'm not sure if I made it clear.


Choosing to seperate `tikv` and `client` helps potentially unfamiliar users to immediately understand the intent of the package. `tikvclient`, while understandable, is not immediately parsable by a human.

All structures and functions will otherwise follow the [Rust API Guidelines](https://rust-lang-nursery.github.io/api-guidelines/), some of which will be enforced by `clippy`.

## Installation

To utilize the client programmatically, users will be able to add the `tikv-client` crate to their `Cargo.toml`'s dependencies. Then they must use the crate with `use tikv_client;`. Unfortunately due to Rust’s naming conventions this inconsistency is in place.

To utilize the command line client, users will be able to install the binary via `cargo install tikv-client`. They will then be able to access the client through the binary `tikv-client`. If they wish for a different name they can alias it in their shell.
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Oct 31, 2018

Member

Maybe we should leave this for a future RFC to prevent this one from becoming huge. Besides we cannot effectively start on this until after the library is done.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

This CLI section was removed.

This conversation was marked as resolved by Hoverbear

This comment has been minimized.

Copy link
@crazycoder1988

crazycoder1988 Nov 14, 2018

What's the scope of a command line client? Is it like tikv-ctl or something else?

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Nov 14, 2018

Member

@sunxiaoguang I think we talked about leaving the CLI client to another RFC, do you agree? If so lets remove this.

This comment has been minimized.

Copy link
@gregwebs

gregwebs Nov 14, 2018

tikv-ctl means administrative operations of TiKV.
I think it is best to work on that independently from this data API to the extent it is possible.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

Sure. Let me remove the tikv-ctl part.


## Usage

## Two types of APIs

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Nov 15, 2018

Member

If I recall correctly we need to advise users that they are mutually exclusive, and they must choose one or the other. I think that is worth mentioning in this section.

This comment has been minimized.

Copy link
@sunxiaoguang
TiKV provides two types of APIs for developers:
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@crazycoder1988
- The Raw Key-Value API

If your application scenario does not need distributed transactions or MVCC (Multi-Version Concurrency Control) and only needs to guarantee the atomicity towards one key, you can use the Raw Key-Value API.

- The Transactional Key-Value API

If your application scenario requires distributed ACID transactions and the atomicity of multiple keys within a transaction, you can use the Transactional Key-Value API.

Generally, the Raw Key-Value API has higher throughput and lower latency compared to the Transactional Key-Value API. If distributed ACID transactions are not required, Raw Key-Value API is preferred over Transactional Key-Value API for better performance and ease of use.

The client provides two types of APIs in two separate modules for developers to choose from.

### The common data types

- Key: raw binary data
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@crazycoder1988

crazycoder1988 Nov 14, 2018

Better to leave the actual Rust type as well. If it is already a type name, please surround it with ``.

- Value: raw binary data
- KvPair: Key-value pair type
- KeyRange: Half-open interval of keys
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@crazycoder1988

crazycoder1988 Nov 14, 2018

Please define "Half-open"

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

Sorry that I forgot to remove KeyRange here. It's been removed and changed to standard RangeBounds trait so it can represent both half-open and closed intervals after edition.

- Config: Configuration for client

### Raw Key-Value API Basic Usage

To use the Raw Key-Value API, take the following steps:

1. Create an instance of Config to specify endpoints of PD (Placement Driver) and optional security config

```rust
let config = Config::new(vec!["127.0.0.1:2379"]);
```

2. Create a Raw Key-Value client.

```rust
let client = RawClient::new(&config);
```

3. Call the Raw Key-Value client methods to access the data on TiKV. The Raw Key-Value API contains following methods

```rust
fn get<K, C>(&self, key: K, cf: C) -> KvFuture<Value>
This conversation was marked as resolved by Hoverbear

This comment has been minimized.

Copy link
@gregwebs

gregwebs Nov 12, 2018

what is the cf parameter? As someone new to TiKV I don't know what this abbreviation means.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 12, 2018

Author

It means column family. Maybe adding document is better than column_family?

This comment has been minimized.

Copy link
@gregwebs

gregwebs Nov 12, 2018

A newtype could be helpful here so that the type is ColumnFamily rather than String.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 13, 2018

Author

That sounds great.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 14, 2018

Author

ColumnFamily is added. PTAL @gregwebs

where
K: AsRef<Key>,
C: Into<Option<String>>;
fn batch_get<K, C>(&self, keys: K, cf: C) -> KvFuture<Vec<KvPair>>
where
K: AsRef<[Key]>,
C: Into<Option<String>>;
fn put<P, C>(&self, pair: P, cf: C) -> KvFuture<()>
where
P: Into<KvPair>,
C: Into<Option<String>>;
fn batch_put<I, P, C>(&self, pairs: I, cf: C) -> KvFuture<()>
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@gregwebs

gregwebs Nov 12, 2018

These type signatures are aesthetically pleasing but at the cost of performing multiple lookups/substitutions. That is, <I, P, C> has no meaning without doing a lookup in the where clause.

I think this mostly comes down to the rust community often adopting one letter type variable names. I am familiar with a community adopting one-letter type variables from my experience with Haskell (that is often done without any explanatory bound), and the practice was never friendly to new users, or just new users of the code.

Coming from first principles, one-letter variables create more reading difficulty, particularly when they are not 100% consistent. It would help me greatly to expand some of them to the point where the variable name is at least a syllable of a concept instead of a one-letter abbreviation for a concept (which has no meaning without doing a lookup).

The easiest change to make here would be

  • P -> Pair
  • I -> Pairs or Iter or IterPair

Type variable consistency: there are two meanings for K

  • K: AsRef
  • K: AsRef<[Key]>

The minimum change to distinguish these is to pluralize the list version:

  • Keys: AsRef<[Key]>

For he single key variable, Key is already taken as a type, so I think there are three options

  • keep using K
  • AKey or KeyRef.
  • inline K (in both forms) instead of putting it in where.

In general I would actually prefer in-lining these in a vertical format rather than using where clauses, but the doc generators won't preserve that formatting?

fn batch_put(&self,
    pairs: IntoIterator<Item = Into<KvPair>>,
    cf: Into<Option<String>>
) -> KvFuture<()>

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 12, 2018

Author

Good idea. It definitely looks better than many single letter type variables.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 13, 2018

Author

BTW: The inlining style you mentioned? Did you mean this?

    fn batch_put<
        IntoIter: IntoIterator<Item = Pair>,
        Pair: Into<KvPair>,
        ColumnFamily: Into<Option<String>>,
    >(  
        &self,
        _pairs: Pair,
        _cf: ColumnFamily,
    ) -> KvFuture<()>;

Using trait bound directly as argument type is prohibited.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 13, 2018

Author

Oh, I see what you mean now. It would be something like this

    fn batch_put(                                                                                                                                              
        &self,                                                                                                                                                 
        pairs: impl IntoIterator<Item = impl Into<KvPair>>,                                                                                                    
        cf: impl Into<Option<String>>,                                                                                                                         
    ) -> KvFuture<()>;

It's the first time I realize the reason why impl trait exists for arguments. It's definitely easier to read than where syntax. Thanks.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 13, 2018

Author

Sadly, this particular case isn't supported.

    fn batch_scan(
        &self,
        ranges: impl AsRef<[impl RangeBounds<Key>]>,
        each_limit: u32,
        key_only: bool,
        cf: impl Into<Option<String>>,
    ) -> KvFuture<Vec<KvPair>>;

I'm going to leave this one with the old generic syntax and change all other methods to impl trait syntax.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 13, 2018

Author

And this way

    fn batch_scan<Ranges: AsRef<[Bounds]>, Bounds: RangeBounds<Key>, CF: Into<Option<String>>>(                                                                
        &self,
        ranges: Ranges,
        each_limit: u32,
        key_only: bool,
        cf: CF,
    ) -> KvFuture<Vec<KvPair>>;  

looks less readable than where style for me

    fn batch_scan<Ranges, Bounds, CF>(
        &self,
        ranges: Ranges,
        each_limit: u32,
        key_only: bool,
        cf: CF,
    ) -> KvFuture<Vec<KvPair>>
    where
        Ranges: AsRef<[Bounds]>,
        Bounds: RangeBounds<Key>,
        CF: Into<Option<String>>;

What do you think? @gregwebs

This comment has been minimized.

Copy link
@gregwebs

gregwebs Nov 13, 2018

Great! I agree on batch_scan. I am not against where at all, just against the common practice of using one letter variable names, particularly when they are not 100% consistent in their meaning.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 14, 2018

Author

It's somewhat like this now

pub struct BatchScan<'a, AClient: Client>;

impl<'a, AClient: Client> BatchScan<'a, AClient> {
    fn new(client: &'a AClient, ranges: Vec<(Key, Key)>, each_limit: u32) -> Self;
    pub fn key_only(mut self) -> Self; 
    pub fn cf(mut self, cf: impl Into<ColumnFamily>) -> Self;
}

impl<'a, AClient: Client> Future for BatchScan<'a, AClient> {
    type Item = Vec<KvPair>;
    type Error = (); 
    fn poll(&mut self) -> Poll<Self::Item, Self::Error>;
}

pub trait Client {
    type AClient: Client;

    fn batch_scan<Ranges, Bounds>(
        &self,
        ranges: Ranges,
        each_limit: u32,
    ) -> BatchScan<Self::AClient>
    where
        Ranges: AsRef<[Bounds]>,
        Bounds: RangeBounds<Key>;
}

You can find the complete interface at here

PLTA @gregwebs

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Nov 14, 2018

Member

Hmmm, I don't think the AClient: Client reads very well. In this case since it's pretty clear we expect a Client (there aren't multiple bounds, and the trait name is very clear) so I think just C: Client is sufficient. This is fairly idiomatic for Rust code.

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Nov 14, 2018

Member

Notably this looks strange as the associated type in trait Client { type ... }

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

Makes sense, let's change it first and see how it looks.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

Oh, just want to double check if you meant only the AClient: Client inside Client trait.
There are quite some AClient: Client out there. @Hoverbear

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Nov 15, 2018

Member

I think the AFoo: Foo semantic in general is not something I've seen much in Rust, and it reads quite strange.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 16, 2018

Author

I changed them to Impl: Client. Maybe that looks better and Impl is widely used to represent an implementation of something which could be appropriate for this scenario.

where
I: IntoIterator<Item = P>,
P: Into<KvPair>,
C: Into<Option<String>>;
fn delete<K, C>(&self, key: K, cf: C) -> KvFuture<()>
where
K: AsRef<Key>,
C: Into<Option<String>>;
fn batch_delete<K, C>(&self, keys: K, cf: C) -> KvFuture<()>
where
K: AsRef<[Key]>,
C: Into<Option<String>>;
fn scan<R, C>(&self, range: R, limit: u32, key_only: bool, cf: C) -> KvFuture<Vec<KvPair>>
where
R: RangeBounds<Key>,
C: Into<Option<String>>;
fn batch_scan<R, B, C>(
&self,
ranges: R,
each_limit: u32,
key_only: bool,
cf: C,
) -> KvFuture<Vec<KvPair>>
where
R: AsRef<[B]>,
B: RangeBounds<Key>,
C: Into<Option<String>>;
fn delete_range<R, C>(&self, range: R, cf: C) -> KvFuture<()>
where
R: RangeBounds<Key>,
C: Into<Option<String>>;
```

#### Usage example of the Raw Key-Value API

```rust
extern crate futures;
extern crate tikv_client;
use futures::future::Future;
use tikv_client::raw::Client;
use tikv_client::*;
fn main() {
let config = Config::new(vec!["127.0.0.1:3379"]);
let raw = raw::RawClient::new(&config)
.wait()
.expect("Could not connect to tikv");
let key: Key = b"Company".to_vec().into();
let value: Value = b"PingCAP".to_vec().into();
raw.put((Clone::clone(&key), Clone::clone(&value)), None)
This conversation was marked as resolved by Hoverbear

This comment has been minimized.

Copy link
@gregwebs

gregwebs Nov 12, 2018

Adding None to all the API calls is unsatisfying. Does it make sense to put the column family in the Config? Alternatively, if we have such a small core API surface we could expose two functions with the different signatures.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 12, 2018

Author

Hm, maybe adding another method sounds better? User may use multiple column family for different data, so it makes better sense to let user choose which cf to use at per call basis.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 12, 2018

Author

Like put_with_cf or put_with_column_family. Or maybe switch back to use a builder style interface which could benefit when we want to add more optional arguments later on.

This comment has been minimized.

Copy link
@gregwebs

gregwebs Nov 12, 2018

I would probably favor put_cf so the method name is not really long. Or if we should encourage use of column families, make that the shorter api and have put_def for using the default column family.

I would be interested in seeing the builder style. I am actually wondering if it makes sense to try to build it into the Key type since it partitions keys.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 13, 2018

Author

CF is only available in Raw KV interface. So we can move Key/Value/KvPair definition to raw and transactional module so they can have different definition.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 13, 2018

Author

I'm planning to change it to request builder style interface. Haven't done it yet, will try to fix couple of issues we discussed all together.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 14, 2018

Author

None is eliminated by builder api

.wait()
.expect("Could not put kv pair to tikv");
println!("Successfully put {:?}:{:?} to tikv", key, value);
let value = raw.get(&key, None).wait().expect("Could not get value");
println!("Found val: {:?} for key: {:?}", value, key);
raw.delete(&key, None)
.wait()
.expect("Could not delete value");
println!("Key: {:?} deleted", key);
raw.get(&key, None)
.wait()
.expect_err("Get returned value for not existing key");
let keys = vec![b"k1".to_vec().into(), b"k2".to_vec().into()];
let values = raw
.batch_get(&keys, None)
.wait()
.expect("Could not get values");
println!("Found values: {:?} for keys: {:?}", values, keys);
let start: Key = b"k1".to_vec().into();
let end: Key = b"k2".to_vec().into();
raw.scan(&start..&end, 10, false, None);
let ranges = [&start..&end, &start..&end];
raw.batch_scan(&ranges, 10, false, None);
}
```

The result is like:

```bash
Successfully put Key([67, 111, 109, 112, 97, 110, 121]):Value([80, 105, 110, 103, 67, 65, 80]) to tikv
Found val: Value([80, 105, 110, 103, 67, 65, 80]) for key: Key([67, 111, 109, 112, 97, 110, 121])
Key: Key([67, 111, 109, 112, 97, 110, 121]) deleted
```

Raw Key-Value client is a client of the TiKV server and only supports the GET/BATCH_GET/PUT/BATCH_PUT/DELETE/BATCH_DELETE/SCAN/BATCH_SCAN/DELETE_RANGE commands. The Raw Key-Value client can be safely and concurrently accessed by multiple threads. Therefore, for one process, one client is enough generally.
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Nov 15, 2018

Member

"...a client of the TiKV server" I feel this is obvious. Maybe it can be removed.

This comment has been minimized.

Copy link
@sunxiaoguang

### Try the Transactional Key-Value API
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@crazycoder1988

crazycoder1988 Nov 14, 2018

The previous section is called "Usage ... " and this section might follow the same style as well


The Transactional Key-Value API is more complicated than the Raw Key-Value API. Some transaction related concepts are listed as follows.

- Client

Like the Raw Key-Value client, a Client is client to a TiKV cluster.

- Snapshot

A Snapshot is the state of a Client at a particular point of time, which provides some readonly methods. The multiple reads of the same Snapshot is guaranteed consistent.
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@crazycoder1988

crazycoder1988 Nov 14, 2018

Is it really "the state of a Client at a particular point of time" instead of "the state of the TiKV data ..."?


- Transaction

Like the Transaction in SQL, a Transaction symbolizes a series of read and write operations performed within the Client. Internally, a Transaction consists of a Snapshot for reads, and a buffer for all writes. The default isolation level of a Transaction is Snapshot Isolation.
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Oct 31, 2018

Member

Since we're talking about a default we should detail how to configure it

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

Sure, let me add it.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

Added an optional step in example.

This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@Hoverbear

Hoverbear Nov 15, 2018

Member
Suggested change
Like the Transaction in SQL, a Transaction symbolizes a series of read and write operations performed within the Client. Internally, a Transaction consists of a Snapshot for reads, and a buffer for all writes. The default isolation level of a Transaction is Snapshot Isolation.
Like Transactions in SQL, a Transaction in TiKV symbolizes a series of read and write operations performed within the service. Internally, a Transaction consists of a Snapshot for reads, and a buffer for all writes. The default isolation level of a Transaction is Snapshot Isolation.

This comment has been minimized.

Copy link
@sunxiaoguang

To use the Transactional Key-Value API, take the following steps:

1. Create an instance of Config to specify endpoints of PD (Placement Driver) and optional security config

```rust
let config = Config::new(vec!["127.0.0.1:2379"]);
```

2. Create a Transactional Key-Value client.

```rust
let client = TxnClient::new(&config);
```

3. (Optional) Modify data using a Transaction.

The lifecycle of a Transaction is: _begin → {get, set, delete, scan} → {commit, rollback}_.

4. Call the Transactional Key-Value API's methods to access the data on TiKV. The Transactional Key-Value API contains the following methods:

```rust
fn begin(&self) -> KvFuture<Transaction>;
fn get<K>(&self, key: K) -> KvFuture<Value>
where
K: AsRef<Key>;
fn set<P>(&mut self, pair: P) -> KvFuture<()>
where
P: Into<KvPair>;
fn delete<K>(&mut self, key: K) -> KvFuture<()>
where
K: AsRef<Key>;
fn seek<K>(&self, key: K) -> KvFuture<Scanner>
where
K: AsRef<Key>;
fn commit(self) -> KvFuture<()>;
fn rollback(self) -> KvFuture<()>;
This conversation was marked as resolved by Hoverbear

This comment has been minimized.

Copy link
@gregwebs

gregwebs Nov 12, 2018

Will there be a higher-level transaction function ?

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 12, 2018

Author

Hm, there should be. Let me add it.

```

### Usage example of the Transactional Key-Value API

```rust
extern crate futures;
extern crate tikv_client;
use futures::{Async, Future, Stream};
use tikv_client::transaction::{Client, Mutator, Retriever, TxnClient};
use tikv_client::*;
fn puts<P, I>(client: &TxnClient, pairs: P)
where
P: IntoIterator<Item = I>,
I: Into<KvPair>,
{
let mut txn = client.begin().wait().expect("Could not begin transaction");
let _: Vec<()> = pairs
.into_iter()
.map(Into::into)
.map(|p| {
txn.set(p).wait().expect("Could not set key value pair");
}).collect();
txn.commit().wait().expect("Could not commit transaction");
}
fn get(client: &TxnClient, key: &Key) -> Value {
let txn = client.begin().wait().expect("Could not begin transaction");
txn.get(key).wait().expect("Could not get value")
This conversation was marked as resolved by Hoverbear

This comment has been minimized.

Copy link
@gregwebs

gregwebs Nov 12, 2018

What happens to the transaction at this point?

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 12, 2018

Author

I assume that transaction will be dropped before returning to caller. So basically it is trying to read something in a transaction.

}
fn scan(client: &TxnClient, start: &Key, limit: usize) {
let txn = client.begin().wait().expect("Could not begin transaction");
let mut scanner = txn.seek(start).wait().expect("Could not seek to start key");
let mut limit = limit;
loop {
if limit == 0 {
break;
}
match scanner.poll() {
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@crazycoder1988

crazycoder1988 Nov 14, 2018

As a usage example, prefer to avoid calling the low level poll() API.

Ok(Async::Ready(None)) => return,
Ok(Async::Ready(Some(pair))) => {
limit -= 1;
println!("{:?}", pair);
}
_ => break,
}
}
}
fn dels<P>(client: &TxnClient, pairs: P)
This conversation was marked as resolved by sunxiaoguang

This comment has been minimized.

Copy link
@crazycoder1988

crazycoder1988 Nov 14, 2018

It is not pairs, only keys?

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 15, 2018

Author

Oops. That was a copy paste mistake. Fixed.

where
P: IntoIterator<Item = Key>,
{
let mut txn = client.begin().wait().expect("Could not begin transaction");
let _: Vec<()> = pairs
.into_iter()
.map(|p| {
txn.delete(p).wait().expect("Could not delete key");
}).collect();
txn.commit().wait().expect("Could not commit transaction");
}
fn main() {
let config = Config::new(vec!["127.0.0.1:3379"]);
let txn = TxnClient::new(&config)
.wait()
.expect("Could not connect to tikv");
// set
let key1: Key = b"key1".to_vec().into();
let value1: Value = b"value1".to_vec().into();
let key2: Key = b"key2".to_vec().into();
let value2: Value = b"value2".to_vec().into();
puts(&txn, vec![(key1, value1), (key2, value2)]);
// get
let key1: Key = b"key1".to_vec().into();
let value1 = get(&txn, &key1);
println!("{:?}", (key1, value1));
// scan
let key1: Key = b"key1".to_vec().into();
scan(&txn, &key1, 10);
// delete
let key1: Key = b"key1".to_vec().into();
let key2: Key = b"key2".to_vec().into();
dels(&txn, vec![key1, key2]);
}
```

The result is like:

```bash
(Key([107, 101, 121, 49]), Value([118, 97, 108, 117, 101, 49]))
(Key([107, 101, 121, 49]), Value([118, 97, 108, 117, 101, 49]))
(Key([107, 101, 121, 49]), Value([118, 97, 108, 117, 101, 49]))
```

## Programming model

The client instance is thread safe and all the interfaces return futures so users can use the client in either asynchronous or synchronous way. A dedicated event loop thread is created at a per client instance basis to drive the reactor to make progress.
This conversation was marked as resolved by overvenus

This comment has been minimized.

Copy link
@overvenus

overvenus Nov 18, 2018

Contributor

The client instance is thread safe ...

Being thread safe is kind of expensive. If the client has some mutable state we may end up with a Mutex inside and every APIs require the Mutex. I suggest the client should impls Send + Clone(another concurrency schema but more flexible). It allows private mutable state, eg, we change client config on the fly.

dedicated event loop thread is created at a per client instance basis to drive the reactor to make progress.

I suggest we should not specify thread model, because it is implementation detail. We should leave some space to make sure we dont back ourselves into a corner.


Based on the model above, here is the code I (as a user) can imagine:

let client = Arc::new(TikvClient::new());
for _ in 0..4 { // My app has 4 workers.
  let cli = client.clone();
  thread::spawn(move || {
    loop {
      let resp = cli.request().wait();
      handle(resp);
    }
  }).unwrap();
}

Users may always wrap the client in an Arc to avoid redundant event loop threads.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 18, 2018

Author

The reason that there is one dedicated thread is because the tso interface of PD is a bidirectional stream. Using a dedicated thread is a way I figured out to keep the order of responses the same as requests yet can still do batching. Not sure if there are other ways to implement that.

Agree with Send + Clone part. It actually has methods that takes &mut self so I think it is not correct to say client is thread safe. Let me fix it.

This comment has been minimized.

Copy link
@overvenus

overvenus Nov 20, 2018

Contributor

If we keep the thread model while impls Send + Clone, my intuition is clone spawns a new event loop thread. It looks odd to me though.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 20, 2018

Author

How about we remove this section for now and let it to be determined? We might figure out a better way later on.

This comment has been minimized.

Copy link
@overvenus

overvenus Nov 20, 2018

Contributor

Yes, I agree.

This comment has been minimized.

Copy link
@sunxiaoguang

sunxiaoguang Nov 20, 2018

Author

It's removed


## Tooling

The `tikv_client` crate will be tested with Travis CI using Rust's standard testing framework. We will also include benchmark with criterion in the future. For public functions which process user input, we will seek to use fuzz testing such as `quickcheck` to find subtle bugs.

The CI will validate all code is warning free, passes `rustfmt`, and passes a `clippy` check without lint warnings.

All code that reaches the `master` branch should not output errors when the following commands are run:

```shell
cargo fmt --all -- --check
cargo clippy --all -- -D clippy
cargo test --all -- --nocapture
cargo bench --all -- --test
```

# Drawbacks

Choosing not to create a Rust TiKV client would mean the current state of clients remains the same.

It is likely that in the future we would end up creating a client in some other form due to customer demand.

# Alternatives

## Package the Go client

Choosing to do this would likely be considerably much less work. The code is already written, so most of the work would be documenting and packaging. Unfortunately, Go does not share the same performance characteristics and FFI capabilities as Rust, so it is a poor core binding for future libraries. Due to the limited abstractions available in Go (it does not have a Linear Type System) we may not be able to create the semantic abstractions possible in a Rust client, reducing the quality of implementations referencing the client.

## Choose another language

We can choose another language such as C, C++, or Python.

A C client would be the most portable and allow future users and customers to bind to the library as they wish. This quality is maintained in Rust, so it is not an advantage for C. Choosing to implement this client in C or C++ means we must take extra steps to support multiple packaging systems, string libraries, and other choices which would not be needed in languages like Ruby, Node.js, or Python.

Choosing to use Python or Ruby for this client would likely be considerably less work than C/C++ as there is a reduced error surface. These languages do not offer good FFI bindings, and often require starting up a language runtime. We suspect that if we implement a C/C++/Rust client, future dynamic language libraries will be able to bind to the Rust client, allowing them to be written quickly and easily.

# Unresolved questions

There are some concerns about integration testing. While we can use the mock `Pd` available in TiKV to mock PD, we do not currently have something similar for TiKV. We suspect implementing a mock TiKV will be the easiest method.

ProTip! Use n and p to navigate between commits in a pull request.
You can’t perform that action at this time.