*: update tispark document for tispark-2.1.1 (#1409)
* *: update tispark document for tispark-2.1.1

* Apply suggestions from code review

update links
dcalvin committed Aug 5, 2019
1 parent 852a3ea commit 1d79564
Showing 8 changed files with 12 additions and 918 deletions.
4 changes: 2 additions & 2 deletions dev/how-to/deploy/tispark.md
@@ -14,7 +14,7 @@ To make it easy to [try TiSpark](/reference/tispark.md), the TiDB cluster instal
- The TiSpark jar package is deployed by default in the `jars` folder in the Spark deployment directory.

```
-spark/jars/tispark-SNAPSHOT-jar-with-dependencies.jar
+spark/jars/tispark-${name_with_version}.jar
```

- TiSpark sample data and import scripts are deployed by default in the TiDB-Ansible directory.
@@ -183,4 +183,4 @@ The result is:
-----------------+---------+------------+--------+-----------+
```

-See [more examples](https://github.com/ilovesoup/tpch/tree/master/sparksql).
+See [more examples](https://github.com/pingcap/tispark-test/tree/master/tpch/sparksql)).
17 changes: 4 additions & 13 deletions dev/reference/tispark.md
@@ -23,8 +23,7 @@ TiSpark is an OLAP solution that runs Spark SQL directly on TiKV, the distribute

## Environment setup

-+ The TiSpark 2.x supports Spark 2.3.x. It does not support any versions earlier than 2.3.x. If you want to use Spark 2.1.x, use TiSpark 1.x instead.
-+ There are small changes when TiSpark works with different minor versions of Spark 2.3.x. The default version TiSpark supports is Spark 2.3.2. If you want to use TiSpark with Spark 2.3.1 or Spark 2.3.0, you need to build from sources to avoid conflicting APIs. For more details, see [How to build from sources](https://github.com/pingcap/tispark#how-to-build-from-sources).
++ The TiSpark 2.x supports Spark 2.3.x and Spark 2.4.x. If you want to use Spark 2.1.x, use TiSpark 1.x instead.
+ TiSpark requires JDK 1.8+ and Scala 2.11 (Spark2.0 + default Scala version).
+ TiSpark runs in any Spark mode such as YARN, Mesos, and Standalone.

@@ -75,14 +74,14 @@ For the hybrid deployment of TiKV and TiSpark, add TiSpark required resources to

## Deploy the TiSpark cluster

-Download TiSpark's jar package [here](http://download.pingcap.org/tispark-latest-linux-amd64.tar.gz), decompress it, and copy the content to the appropriate folder.
+Download TiSpark's jar package [here](https://github.com/pingcap/tispark/releases). Download your desired version of jar package and copy the content to the appropriate folder.

### Deploy TiSpark on the existing Spark cluster

Running TiSpark on an existing Spark cluster does not require a reboot of the cluster. You can use Spark's `--jars` parameter to introduce TiSpark as a dependency:

```sh
-spark-shell --jars $TISPARK_FOLDER/tispark-core-${version}-SNAPSHOT-jar-with-dependencies.jar
+spark-shell --jars $TISPARK_FOLDER/tispark-${name_with_version}.jar
```

### Deploy TiSpark without the Spark cluster
@@ -93,7 +92,7 @@ If you do not have a Spark cluster, we recommend using the standalone mode. To u

You can download [Apache Spark](https://spark.apache.org/downloads.html)

-For the Standalone mode without Hadoop support, use Spark **2.3.x** and any version of Pre-build with Apache Hadoop 2.x with Hadoop dependencies. If you need to use the Hadoop cluster, please choose the corresponding Hadoop version. You can also choose to build from the [source code](https://spark.apache.org/docs/2.3.0/building-spark.html) to match the previous version of the official Hadoop 2.x.
+For the Standalone mode without Hadoop support, use Spark **2.3.x** and any version of Pre-build with Apache Hadoop 2.x with Hadoop dependencies. If you need to use the Hadoop cluster, please choose the corresponding Hadoop version. You can also choose to build from the [source code](https://spark.apache.org/docs/latest/building-spark.html) to match the previous version of the official Hadoop 2.x.

Suppose you already have a Spark binaries, and the current PATH is `SPARKPATH`, please copy the TiSpark jar package to the `${SPARKPATH}/jars` directory.

@@ -187,14 +186,6 @@ select count(*) from account;
1 row selected (1.97 seconds)
```
-## TiSparkR
-TiSparkR is a thin layer built to support the R language with TiSpark. Refer to [this document](https://github.com/pingcap/tispark/blob/master/R/README.md) for usage.
-## TiSpark on PySpark
-TiSpark on PySpark is a Python package built to support the Python language with TiSpark. Refer to [this document](https://github.com/pingcap/tispark/blob/master/python/README.md) for usage.
## Use TiSpark together with Hive
You can use TiSpark together with Hive.
194 changes: 0 additions & 194 deletions dev/tispark/tispark-quick-start-guide_v1.x.md

This file was deleted.
