diff --git a/README.md b/README.md
index 891c545e15e5f..306481c3506ca 100644
--- a/README.md
+++ b/README.md
@@ -111,7 +111,15 @@
   - [PD Control](tools/pd-control.md)
   - [TiKV Control](tools/tikv-control.md)
   - [TiDB Controller](tools/tidb-controller.md)
-+ TiSpark
++ TiKV Documentation
+  - [Overview](tikv/tikv-overview.md)
+  + Install and Deploy TiKV
+    - [Prerequisites](op-guide/recommendation.md)
+    - [Install and Deploy TiKV Using Docker Compose](tikv/deploy-tikv-docker-compose.md)
+    - [Install and Deploy TiKV Using Binary Files](tikv/deploy-tikv-using-binary.md)
+  + Client Drivers
+    - [Go](tikv/go-client-api.md)
++ TiSpark Documentation
   - [Quick Start Guide](tispark/tispark-quick-start-guide.md)
   - [User Guide](tispark/tispark-user-guide.md)
 - [Frequently Asked Questions (FAQ)](FAQ.md)
diff --git a/media/tikv_stack.png b/media/tikv_stack.png
new file mode 100644
index 0000000000000..4f8b1b6d4d45e
Binary files /dev/null and b/media/tikv_stack.png differ
diff --git a/tikv/deploy-tikv-docker-compose.md b/tikv/deploy-tikv-docker-compose.md
new file mode 100644
index 0000000000000..6f5ab9d3e58e6
--- /dev/null
+++ b/tikv/deploy-tikv-docker-compose.md
@@ -0,0 +1,57 @@
+---
+title: Install and Deploy TiKV Using Docker Compose
+category: user guide
+---
+
+# Install and Deploy TiKV Using Docker Compose
+
+This guide describes how to quickly deploy a TiKV cluster using [Docker Compose](https://github.com/pingcap/tidb-docker-compose/). Currently, this installation method only supports the Linux system.
+
+## Prerequisites
+
+- Install Docker and Docker Compose.
+
+    ```
+    sudo yum install docker docker-compose
+    ```
+
+- Install Helm.
+
+    ```
+    curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get | bash
+    ```
+
+## Install and deploy
+
+1. Download `tidb-docker-compose`.
+
+    ```
+    git clone https://github.com/pingcap/tidb-docker-compose.git
+    ```
+
+2. Edit the `compose/values.yaml` file to configure `networkMode` to `host` and comment the TiDB section out.
+
+    ```
+    cd tidb-docker-compose/compose
+    networkMode: host
+    ```
+
+3. Generate the `generated-docker-compose.yml` file.
+
+    ```
+    helm template compose > generated-docker-compose.yml
+    ```
+
+4. Create and start the cluster using the `generated-docker-compose.yml` file.
+
+    ```
+    docker-compose -f generated-docker-compose.yml up -d
+    ```
+
+You can check whether the TiKV cluster has been successfully deployed using the following command:
+
+```
+curl localhost:2379/pd/api/v1/stores
+```
+
+If the state of all the TiKV instances is "Up", you have successfully deployed a TiKV cluster.
\ No newline at end of file
diff --git a/tikv/deploy-tikv-using-binary.md b/tikv/deploy-tikv-using-binary.md
new file mode 100644
index 0000000000000..eb73053b895a2
--- /dev/null
+++ b/tikv/deploy-tikv-using-binary.md
@@ -0,0 +1,148 @@
+---
+title: Install and Deploy TiKV Using Binary Files
+category: user guide
+---
+
+# Install and Deploy TiKV Using Binary Files
+
+This guide describes how to deploy a TiKV cluster using binary files.
+
+- To quickly understand and try TiKV, see [Deploy the TiKV cluster on a single machine](#deploy-the-tikv-cluster-on-a-single-machine).
+- To try TiKV out and explore the features, see [Deploy the TiKV cluster on multiple nodes for test](#deploy-the-tikv-cluster-on-multiple-nodes-for-test).
+
+## Deploy the TiKV cluster on a single machine
+
+This section describes how to deploy TiKV on a single machine running Linux. Take the following steps:
+
+1. Download the official binary package.
+
+    ```bash
+    # Download the package.
+    wget https://download.pingcap.org/tidb-latest-linux-amd64.tar.gz
+    wget https://download.pingcap.org/tidb-latest-linux-amd64.sha256
+
+    # Check the file integrity. If the result is OK, the file is correct.
+    sha256sum -c tidb-latest-linux-amd64.sha256
+
+    # Extract the package.
+    tar -xzf tidb-latest-linux-amd64.tar.gz
+    cd tidb-latest-linux-amd64
+    ```
+
+2. Start PD.
+
+    ```bash
+    ./bin/pd-server --name=pd1 \
+                    --data-dir=pd1 \
+                    --client-urls="http://127.0.0.1:2379" \
+                    --peer-urls="http://127.0.0.1:2380" \
+                    --initial-cluster="pd1=http://127.0.0.1:2380" \
+                    --log-file=pd1.log
+    ```
+
+3. Start TiKV.
+
+    To start the three TiKV instances, open a new terminal tab or window for each instance, go to the `tidb-latest-linux-amd64` directory, and run one of the following commands in each:
+
+    ```bash
+    ./bin/tikv-server --pd-endpoints="127.0.0.1:2379" \
+                      --addr="127.0.0.1:20160" \
+                      --data-dir=tikv1 \
+                      --log-file=tikv1.log
+
+    ./bin/tikv-server --pd-endpoints="127.0.0.1:2379" \
+                      --addr="127.0.0.1:20161" \
+                      --data-dir=tikv2 \
+                      --log-file=tikv2.log
+
+    ./bin/tikv-server --pd-endpoints="127.0.0.1:2379" \
+                      --addr="127.0.0.1:20162" \
+                      --data-dir=tikv3 \
+                      --log-file=tikv3.log
+    ```
+
+You can use the [pd-ctl](https://github.com/pingcap/pd/tree/master/pdctl) tool to verify whether PD and TiKV are successfully deployed:
+
+```
+./bin/pd-ctl store -d -u http://127.0.0.1:2379
+```
+
+If the state of all the TiKV instances is "Up", you have successfully deployed a TiKV cluster.
+
+## Deploy the TiKV cluster on multiple nodes for test
+
+This section describes how to deploy TiKV on multiple nodes. If you want to test TiKV with a limited number of nodes, you can use one PD instance to test the entire cluster.
+
+Assuming that you have four nodes, you can deploy one PD instance and three TiKV instances. For details, see the following table:
+
+| Name | Host IP | Services |
+| :-- | :-- | :------------------- |
+| Node1 | 192.168.199.113 | PD1 |
+| Node2 | 192.168.199.114 | TiKV1 |
+| Node3 | 192.168.199.115 | TiKV2 |
+| Node4 | 192.168.199.116 | TiKV3 |
+
+To deploy a TiKV cluster with multiple nodes for testing, take the following steps:
+
+1. Download the official binary package on each node.
+
+    ```bash
+    # Download the package.
+    wget https://download.pingcap.org/tidb-latest-linux-amd64.tar.gz
+    wget https://download.pingcap.org/tidb-latest-linux-amd64.sha256
+
+    # Check the file integrity. If the result is OK, the file is correct.
+    sha256sum -c tidb-latest-linux-amd64.sha256
+
+    # Extract the package.
+    tar -xzf tidb-latest-linux-amd64.tar.gz
+    cd tidb-latest-linux-amd64
+    ```
+
+2. Start PD on Node1.
+
+    ```bash
+    ./bin/pd-server --name=pd1 \
+                    --data-dir=pd1 \
+                    --client-urls="http://192.168.199.113:2379" \
+                    --peer-urls="http://192.168.199.113:2380" \
+                    --initial-cluster="pd1=http://192.168.199.113:2380" \
+                    --log-file=pd1.log
+    ```
+
+3. Log in to the other nodes (Node2, Node3, and Node4) and start TiKV on each.
+
+    Node2:
+
+    ```bash
+    ./bin/tikv-server --pd-endpoints="192.168.199.113:2379" \
+                      --addr="192.168.199.114:20160" \
+                      --data-dir=tikv1 \
+                      --log-file=tikv1.log
+    ```
+
+    Node3:
+
+    ```bash
+    ./bin/tikv-server --pd-endpoints="192.168.199.113:2379" \
+                      --addr="192.168.199.115:20160" \
+                      --data-dir=tikv2 \
+                      --log-file=tikv2.log
+    ```
+
+    Node4:
+
+    ```bash
+    ./bin/tikv-server --pd-endpoints="192.168.199.113:2379" \
+                      --addr="192.168.199.116:20160" \
+                      --data-dir=tikv3 \
+                      --log-file=tikv3.log
+    ```
+
+You can use the [pd-ctl](https://github.com/pingcap/pd/tree/master/pdctl) tool to verify whether PD and TiKV are successfully deployed:
+
+```
+./bin/pd-ctl store -d -u http://192.168.199.113:2379
+```
+
+The result displays the store count and detailed information regarding each store. If the state of all the TiKV instances is "Up", you have successfully deployed a TiKV cluster.
\ No newline at end of file
diff --git a/tikv/go-client-api.md b/tikv/go-client-api.md
new file mode 100644
index 0000000000000..5f1c7078d13aa
--- /dev/null
+++ b/tikv/go-client-api.md
@@ -0,0 +1,295 @@
+---
+title: Try Two Types of APIs
+category: user guide
+---
+
+# Try Two Types of APIs
+
+To suit different scenarios, TiKV provides [two types of APIs](tikv-overview.md#two-types-of-apis) for developers: the Raw Key-Value API and the Transactional Key-Value API. This document shows how to use both APIs in TiKV through two examples.
+
+The usage examples are based on the [deployment of TiKV using binary files on multiple nodes for test](deploy-tikv-using-binary.md#deploy-the-tikv-cluster-on-multiple-nodes-for-test). You can also quickly try the two types of APIs on a single machine.
+
+## Try the Raw Key-Value API
+
+To use the Raw Key-Value API in applications developed in Go, take the following steps:
+
+1. Install the necessary packages.
+
+    ```bash
+    go get -v -u github.com/pingcap/tidb/store/tikv
+    ```
+
+2. Import the dependency packages.
+
+    ```go
+    import (
+        "fmt"
+        "github.com/pingcap/tidb/config"
+        "github.com/pingcap/tidb/store/tikv"
+    )
+    ```
+
+3. Create a Raw Key-Value client.
+
+    ```go
+    cli, err := tikv.NewRawKVClient([]string{"192.168.199.113:2379"}, config.Security{})
+    ```
+
+    The two parameters in the above call are:
+
+    - `[]string`: a list of PD servers’ addresses
+    - `config.Security`: used for establishing TLS connections, usually left empty when you do not need TLS
+
+4. Call the Raw Key-Value client methods to access the data on TiKV. The Raw Key-Value API contains the following methods, and you can also find them at [GoDoc](https://godoc.org/github.com/pingcap/tidb/store/tikv#RawKVClient).
+
+    ```go
+    type RawKVClient struct
+    func (c *RawKVClient) Close() error
+    func (c *RawKVClient) ClusterID() uint64
+    func (c *RawKVClient) Delete(key []byte) error
+    func (c *RawKVClient) Get(key []byte) ([]byte, error)
+    func (c *RawKVClient) Put(key, value []byte) error
+    func (c *RawKVClient) Scan(startKey []byte, limit int) (keys [][]byte, values [][]byte, err error)
+    ```
+
+### Usage example of the Raw Key-Value API
+
+```go
+package main
+
+import (
+    "fmt"
+
+    "github.com/pingcap/tidb/config"
+    "github.com/pingcap/tidb/store/tikv"
+)
+
+func main() {
+    cli, err := tikv.NewRawKVClient([]string{"192.168.199.113:2379"}, config.Security{})
+    if err != nil {
+        panic(err)
+    }
+    defer cli.Close()
+
+    fmt.Printf("cluster ID: %d\n", cli.ClusterID())
+
+    key := []byte("Company")
+    val := []byte("PingCAP")
+
+    // put key into tikv
+    err = cli.Put(key, val)
+    if err != nil {
+        panic(err)
+    }
+    fmt.Printf("Successfully put %s:%s to tikv\n", key, val)
+
+    // get key from tikv
+    val, err = cli.Get(key)
+    if err != nil {
+        panic(err)
+    }
+    fmt.Printf("found val: %s for key: %s\n", val, key)
+
+    // delete key from tikv
+    err = cli.Delete(key)
+    if err != nil {
+        panic(err)
+    }
+    fmt.Printf("key: %s deleted\n", key)
+
+    // get key again from tikv
+    val, err = cli.Get(key)
+    if err != nil {
+        panic(err)
+    }
+    fmt.Printf("found val: %s for key: %s\n", val, key)
+}
+```
+
+The output is similar to the following:
+
+```bash
+INFO[0000] [pd] create pd client with endpoints [192.168.199.113:2379]
+INFO[0000] [pd] leader switches to: http://127.0.0.1:2379, previous:
+INFO[0000] [pd] init cluster id 6554145799874853483
+cluster ID: 6554145799874853483
+Successfully put Company:PingCAP to tikv
+found val: PingCAP for key: Company
+key: Company deleted
+found val:  for key: Company
+```
+
+RawKVClient is a client of the TiKV server and only supports the GET/PUT/DELETE/SCAN commands. The RawKVClient can be safely and concurrently accessed by multiple goroutines, as long as it is not closed. Therefore, one client per process is generally enough.
+
+## Try the Transactional Key-Value API
+
+The Transactional Key-Value API is more complicated than the Raw Key-Value API. Some transaction-related concepts are listed below. For more details, see the [KV package](https://github.com/pingcap/tidb/tree/master/kv).
+
+- Storage
+
+    Like the RawKVClient, a Storage is an abstraction of a TiKV cluster.
+
+- Snapshot
+
+    A Snapshot is the state of a Storage at a particular point in time and provides read-only methods. Multiple reads from the same Snapshot are guaranteed to be consistent.
+
+- Transaction
+
+    Like a transaction in SQL, a Transaction represents a series of read and write operations performed within the Storage. Internally, a Transaction consists of a Snapshot for reads and a MemBuffer for all writes. The default isolation level of a Transaction is Snapshot Isolation.
+
+To use the Transactional Key-Value API in applications developed in Go, take the following steps:
+
+1. Install the necessary packages.
+
+    ```bash
+    go get -v -u github.com/pingcap/tidb/kv
+    go get -v -u github.com/pingcap/tidb/store/tikv
+    ```
+
+2. Import the dependency packages.
+
+    ```go
+    import (
+        "fmt"
+        "github.com/pingcap/tidb/kv"
+        "github.com/pingcap/tidb/store/tikv"
+    )
+    ```
+
+3. Create Storage using a URL scheme.
+
+    ```go
+    driver := tikv.Driver{}
+    storage, err := driver.Open("tikv://192.168.199.113:2379")
+    ```
+
+4. (Optional) Modify the Storage using a Transaction.
+
+    The lifecycle of a Transaction is: _begin → {get, set, delete, scan} → {commit, rollback}_.
+
+5. Call the Transactional Key-Value API's methods to access the data on TiKV. The Transactional Key-Value API contains the following methods:
+
+    ```
+    Begin() -> Txn
+    Txn.Get(key []byte) -> (value []byte)
+    Txn.Set(key []byte, value []byte)
+    Txn.Seek(begin []byte) -> Iterator
+    Txn.Delete(key []byte)
+    Txn.Commit()
+    ```
+
+### Usage example of the Transactional Key-Value API
+
+```go
+package main
+
+import (
+    "context"
+    "fmt"
+    "strconv"
+
+    "github.com/pingcap/tidb/kv"
+    "github.com/pingcap/tidb/store/tikv"
+)
+
+// If the key is not found, treat the old value as zero;
+// then store the old value plus one.
+func increase(storage kv.Storage, key []byte) error {
+    txn, err := storage.Begin()
+    if err != nil {
+        return err
+    }
+    defer txn.Rollback()
+    var oldValue int
+    val, err := txn.Get(key)
+    if err != nil {
+        if !kv.ErrNotExist.Equal(err) {
+            return err
+        }
+    } else {
+        oldValue, err = strconv.Atoi(string(val))
+        if err != nil {
+            return err
+        }
+    }
+
+    err = txn.Set(key, []byte(strconv.Itoa(oldValue+1)))
+    if err != nil {
+        return err
+    }
+    // Return the commit error, if any, instead of discarding it.
+    return txn.Commit(context.Background())
+}
+
+// lookup value for key
+func lookup(storage kv.Storage, key []byte) (int, error) {
+    var value int
+    txn, err := storage.Begin()
+    if err != nil {
+        return value, err
+    }
+    defer txn.Rollback()
+    val, err := txn.Get(key)
+    if err != nil {
+        return value, err
+    }
+    value, err = strconv.Atoi(string(val))
+    if err != nil {
+        return value, err
+    }
+    return value, nil
+}
+
+func main() {
+    driver := tikv.Driver{}
+    storage, err := driver.Open("tikv://192.168.199.113:2379")
+    if err != nil {
+        panic(err)
+    }
+    defer storage.Close()
+
+    key := []byte("Account")
+    // lookup account
+    account, err := lookup(storage, key)
+    if err != nil {
+        fmt.Printf("failed to lookup key %s: %v\n", key, err)
+    } else {
+        fmt.Printf("Account is %d\n", account)
+    }
+
+    // increase account
+    err = increase(storage, key)
+    if err != nil {
+        panic(err)
+    }
+
+    // lookup account again
+    account, err = lookup(storage, key)
+    if err != nil {
+        fmt.Printf("failed to lookup key %s: %v\n", key, err)
+    } else {
+        fmt.Printf("Account increased to %d\n", account)
+    }
+}
+```
+
+The output is similar to the following:
+
+```bash
+INFO[0000] [pd] create pd client with endpoints [192.168.199.113:2379]
+INFO[0000] [pd] leader switches to: http://127.0.0.1:2379, previous:
+INFO[0000] [pd] init cluster id 6554145799874853483
+INFO[0000] [kv] Rollback txn 400197262324006914
+failed to lookup key Account: [kv:2]Error: key not exist
+INFO[0000] [kv] Rollback txn 400197262324006917
+Account increased to 1
+
+# run the program again
+INFO[0000] [pd] create pd client with endpoints [192.168.199.113:2379]
+INFO[0000] [pd] leader switches to: http://127.0.0.1:2379, previous:
+INFO[0000] [pd] init cluster id 6554145799874853483
+INFO[0000] [kv] Rollback txn 400198364324954114
+Account is 1
+INFO[0000] [kv] Rollback txn 400198364324954117
+Account increased to 2
+```
\ No newline at end of file
diff --git a/tikv/tikv-overview.md b/tikv/tikv-overview.md
new file mode 100644
index 0000000000000..726e6c2445828
--- /dev/null
+++ b/tikv/tikv-overview.md
@@ -0,0 +1,59 @@
+---
+title: Overview of TiKV
+category: overview
+---
+
+# Overview of TiKV
+
+TiKV (pronounced /'taɪkeɪvi:/, tai-K-V; etymology: titanium) is a distributed Key-Value database based on the design of Google Spanner and HBase, but it is much simpler because it does not depend on any distributed file system.
+
+As the storage layer of TiDB, TiKV can work separately and does not depend on the SQL layer of TiDB. To suit different scenarios, TiKV provides [two types of APIs](#two-types-of-apis) for developers: the Raw Key-Value API and the Transactional Key-Value API.
+
+The key features of TiKV are as follows:
+
+- **Geo-Replication**
+
+    TiKV uses [Raft](http://raft.github.io/) and the [Placement Driver](https://github.com/pingcap/pd/) to support Geo-Replication.
+
+- **Horizontal scalability**
+
+    With Placement Driver and carefully designed Raft groups, TiKV excels in horizontal scalability and can easily scale to 100+ TBs of data.
+
+- **Consistent distributed transactions**
+
+    Similar to Google's Spanner, TiKV supports externally-consistent distributed transactions.
+
+- **Coprocessor support**
+
+    Similar to HBase, TiKV implements a Coprocessor framework to support distributed computing.
+
+- **Cooperates with [TiDB](https://github.com/pingcap/tidb)**
+
+    Thanks to the internal optimization, TiKV and TiDB can work together to be a compelling database solution with high horizontal scalability, externally-consistent transactions, and support for RDBMS and NoSQL design patterns.
+
+## Architecture
+
+The TiKV server software stack is as follows:
+
+![The TiKV software stack](../media/tikv_stack.png)
+
+- **Placement Driver:** Placement Driver (PD) is the cluster manager of TiKV. PD periodically checks replication constraints to balance load and data automatically.
+- **Store:** Each Store contains a RocksDB instance, which stores data on the local disk.
+- **Region:** Region is the basic unit of Key-Value data movement. Each Region is replicated to multiple Nodes. These multiple replicas form a Raft group.
+- **Node:** A physical node in the cluster. Within each node, there are one or more Stores. Within each Store, there are many Regions.
+
+When a node starts, the metadata of the Node, Store and Region is recorded in PD. The status of each Region and Store is reported to PD regularly.
+
+## Two types of APIs
+
+TiKV provides two types of APIs for developers:
+
+- [The Raw Key-Value API](go-client-api.md#try-the-raw-key-value-api)
+
+    If your application scenario does not need distributed transactions or MVCC (Multi-Version Concurrency Control) and only needs to guarantee atomicity for a single key, you can use the Raw Key-Value API.
+
+- [The Transactional Key-Value API](go-client-api.md#try-the-transactional-key-value-api)
+
+    If your application scenario requires distributed ACID transactions and the atomicity of multiple keys within a transaction, you can use the Transactional Key-Value API.
+
+Compared to the Transactional Key-Value API, the Raw Key-Value API is more performant with lower latency and easier to use.
\ No newline at end of file
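
The overview contrasts the two APIs by their atomicity guarantees: the Raw Key-Value API is atomic for a single key, while the Transactional Key-Value API is atomic across multiple keys. The sketch below is a toy, in-memory illustration of that difference only. It does not use TiKV or any TiKV API, and every name in it (`toyStore`, `toyTxn`, `rawPut`, and so on) is hypothetical. It assumes a buffered-write design that only loosely mirrors the "MemBuffer for all writes" idea from the Go client guide, without snapshots or conflict detection.

```go
package main

import (
	"fmt"
	"sync"
)

// toyStore is a hypothetical in-memory stand-in for a key-value cluster.
// It is not TiKV and does not use any TiKV API.
type toyStore struct {
	mu   sync.Mutex
	data map[string]string
}

func newToyStore() *toyStore {
	return &toyStore{data: make(map[string]string)}
}

// rawPut models a raw write: each call is atomic for its own key only.
func (s *toyStore) rawPut(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

// get returns the committed value for key, if any.
func (s *toyStore) get(key string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	v, ok := s.data[key]
	return v, ok
}

// toyTxn buffers writes locally until commit.
type toyTxn struct {
	store  *toyStore
	writes map[string]string
}

func (s *toyStore) begin() *toyTxn {
	return &toyTxn{store: s, writes: make(map[string]string)}
}

// set records a write in the buffer; the store is untouched until commit.
func (t *toyTxn) set(key, value string) {
	t.writes[key] = value
}

// commit applies every buffered write under one lock, so readers going
// through get see either none of the writes or all of them.
func (t *toyTxn) commit() {
	t.store.mu.Lock()
	defer t.store.mu.Unlock()
	for k, v := range t.writes {
		t.store.data[k] = v
	}
	t.writes = make(map[string]string)
}

// rollback discards the buffered writes.
func (t *toyTxn) rollback() {
	t.writes = make(map[string]string)
}

func main() {
	s := newToyStore()
	s.rawPut("alice", "100")
	s.rawPut("bob", "0")

	// A rolled-back transaction leaves the store unchanged.
	txn := s.begin()
	txn.set("alice", "50")
	txn.set("bob", "50")
	txn.rollback()
	a, _ := s.get("alice")
	b, _ := s.get("bob")
	fmt.Println(a, b) // prints "100 0"

	// A committed transaction applies both keys together.
	txn = s.begin()
	txn.set("alice", "50")
	txn.set("bob", "50")
	txn.commit()
	a, _ = s.get("alice")
	b, _ = s.get("bob")
	fmt.Println(a, b) // prints "50 50"
}
```

With two separate `rawPut` calls, a reader could observe the first key updated and the second not yet updated. The buffered commit removes that window in this toy model, at the cost of the extra bookkeeping that makes a transactional path slower than a raw one.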