[skip travis] Update docs with 2.6.1
maziyarpanahi committed Sep 10, 2020
1 parent dc822f6 commit 1b4fa52
Showing 7 changed files with 693 additions and 314 deletions.
66 changes: 36 additions & 30 deletions README.md
@@ -81,7 +81,7 @@ Take a look at our official Spark NLP page: [http://nlp.johnsnowlabs.com/](http:
* GPU Support
* Full integration with Spark ML functions
* +250 pre-trained models in 45 languages!
-* +88 pre-trained pipelines in 13 languages!
+* +90 pre-trained pipelines in 13 languages!
* Multi-lingual NER models: Danish, Dutch, English, Finnish, French, German, Italian, Norwegian, Polish, Portuguese, Russian, Spanish, and Swedish

## Requirements
@@ -100,7 +100,7 @@ $ java -version
# should be Java 8 (Oracle or OpenJDK)
$ conda create -n sparknlp python=3.6 -y
$ conda activate sparknlp
-$ pip install spark-nlp==2.6.0 pyspark==2.4.4
+$ pip install spark-nlp==2.6.1 pyspark==2.4.4
```

In a Python console or Jupyter `Python3` kernel:
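
A minimal quick start, as a sketch (assuming the standard `sparknlp` Python API of this release; `explain_document_dl` is one of the documented pretrained pipelines):

```python
import sparknlp
from sparknlp.pretrained import PretrainedPipeline

# Start a Spark session with Spark NLP on the classpath
spark = sparknlp.start()

# Download a pretrained pipeline and annotate a string
pipeline = PretrainedPipeline("explain_document_dl", lang="en")
result = pipeline.annotate("Spark NLP is an open-source text processing library.")
print(result.keys())  # output columns such as 'token', 'pos', 'entities'
```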
@@ -144,7 +144,7 @@ For more examples, you can visit our dedicated [repository](https://github.com/J

## Apache Spark Support

-Spark NLP *2.6.0* has been built on top of Apache Spark 2.4.x and fully supports Apache Spark 2.3.x:
+Spark NLP *2.6.1* has been built on top of Apache Spark 2.4.x and fully supports Apache Spark 2.3.x:

| Spark NLP | Apache Spark 2.3.x | Apache Spark 2.4.x |
|-------------|-----------------------|--------------------|
@@ -162,7 +162,7 @@ Find out more about `Spark NLP` versions from our [release notes](https://github

## Databricks Support

-Spark NLP 2.6.0 has been tested and is compatible with the following runtimes:
+Spark NLP 2.6.1 has been tested and is compatible with the following runtimes:

- 6.2
- 6.2 ML
@@ -175,7 +175,7 @@ Spark NLP 2.6.0 has been tested and is compatible with the following runtimes:

## EMR Support

-Spark NLP 2.6.0 has been tested and is compatible with the following EMR releases:
+Spark NLP 2.6.1 has been tested and is compatible with the following EMR releases:

- 5.26.0
- 5.27.0
@@ -192,28 +192,34 @@ This library has been uploaded to the [spark-packages repository](https://spark-

The benefit of spark-packages is that it makes the library available for both Scala/Java and Python

-To use the most recent version on Apache Spark 2.4.x, just add `--packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0` to your spark command:
+To use the most recent version on Apache Spark 2.4.x, just add `--packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1` to your spark command:

```sh
-spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1
```

```sh
-pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1
```

```sh
-spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1
```

This can also be used to create a SparkSession manually by using the `spark.jars.packages` option in both Python and Scala.

-**NOTE**: To use Spark NLP with GPU, use the dedicated GPU package `com.johnsnowlabs.nlp:spark-nlp-gpu_2.11:2.6.0`
+**NOTE**: To use Spark NLP with GPU, use the dedicated GPU package `com.johnsnowlabs.nlp:spark-nlp-gpu_2.11:2.6.1`

**NOTE**: To use Spark NLP on Apache Spark 2.3.x you should instead use the following packages:

-- CPU: `com.johnsnowlabs.nlp:spark-nlp-spark23_2.11:2.6.0`
-- GPU: `com.johnsnowlabs.nlp:spark-nlp-gpu-spark23_2.11:2.6.0`
+- CPU: `com.johnsnowlabs.nlp:spark-nlp-spark23_2.11:2.6.1`
+- GPU: `com.johnsnowlabs.nlp:spark-nlp-gpu-spark23_2.11:2.6.1`

+**NOTE**: If you are using large pretrained models like UniversalSentenceEncoder, you need the following settings in your SparkSession (shown here as `spark-shell` flags):
+
+```sh
+spark-shell --driver-memory 16g --conf spark.kryoserializer.buffer.max=1000M --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1
+```

## Scala

@@ -228,7 +234,7 @@ Our package is deployed to maven central. To add this package as a dependency in
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp_2.11</artifactId>
-<version>2.6.0</version>
+<version>2.6.1</version>
</dependency>
```

@@ -239,7 +245,7 @@ Our package is deployed to maven central. To add this package as a dependency in
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-gpu_2.11</artifactId>
-<version>2.6.0</version>
+<version>2.6.1</version>
</dependency>
```

@@ -250,7 +256,7 @@ Our package is deployed to maven central. To add this package as a dependency in
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-spark23_2.11</artifactId>
-<version>2.6.0</version>
+<version>2.6.1</version>
</dependency>
```

@@ -261,7 +267,7 @@ Our package is deployed to maven central. To add this package as a dependency in
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-gpu-spark23_2.11</artifactId>
-<version>2.6.0</version>
+<version>2.6.1</version>
</dependency>
```

@@ -271,28 +277,28 @@ Our package is deployed to maven central. To add this package as a dependency in

```sbtshell
// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp
-libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp" % "2.6.0"
+libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp" % "2.6.1"
```

**spark-nlp-gpu:**

```sbtshell
// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-gpu
-libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu" % "2.6.0"
+libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu" % "2.6.1"
```

**spark-nlp** on Apache Spark 2.3.x:

```sbtshell
// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-spark23
-libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-spark23" % "2.6.0"
+libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-spark23" % "2.6.1"
```

**spark-nlp-gpu** on Apache Spark 2.3.x:

```sbtshell
// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-gpu-spark23
-libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu-spark23" % "2.6.0"
+libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu-spark23" % "2.6.1"
```

Maven Central: [https://mvnrepository.com/artifact/com.johnsnowlabs.nlp](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp)
@@ -308,7 +314,7 @@ If you installed pyspark through pip/conda, you can install `spark-nlp` through
Pip:

```bash
-pip install spark-nlp==2.6.0
+pip install spark-nlp==2.6.1
```

Conda:
@@ -335,7 +341,7 @@ spark = SparkSession.builder \
.master("local[4]")\
.config("spark.driver.memory","16G")\
.config("spark.driver.maxResultSize", "2G") \
.config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0")\
.config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1")\
.config("spark.kryoserializer.buffer.max", "1000M")\
.getOrCreate()
```
@@ -410,7 +416,7 @@ Use either one of the following options
- Add the following Maven Coordinates to the interpreter's library list

```bash
-com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1
```

- Add the path to the pre-built jar from [here](#compiled-jars) to the interpreter's library list, making sure the jar is available on the driver path
@@ -420,7 +426,7 @@ com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
Apart from the previous step, install the Python module through pip:

```bash
-pip install spark-nlp==2.6.0
+pip install spark-nlp==2.6.1
```

Or you can install `spark-nlp` from inside Zeppelin by using Conda:
@@ -445,7 +451,7 @@ export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook

-pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1
```

Alternatively, you can use the `--jars` option for pyspark combined with `pip install spark-nlp`
@@ -471,7 +477,7 @@ os.environ["PATH"] = os.environ["JAVA_HOME"] + "/bin:" + os.environ["PATH"]
! pip install --ignore-installed pyspark==2.4.4

# Install Spark NLP
-! pip install --ignore-installed spark-nlp==2.6.0
+! pip install --ignore-installed spark-nlp==2.6.1

# Quick SparkSession start
import sparknlp
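# Start the session (a sketch; assumes the standard sparknlp API,
# and the collapsed block may configure Spark differently)
spark = sparknlp.start()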
@@ -500,7 +506,7 @@ spark.serializer org.apache.spark.serializer.KryoSerializer

3.1. Install New -> PyPI -> `spark-nlp` -> Install

-3.2. Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0` -> Install
+3.2. Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1` -> Install

4. Now you can attach your notebook to the cluster and use Spark NLP!

@@ -531,7 +537,7 @@ sparknlp {

### Pipelines

-Spark NLP offers more than `88 pre-trained pipelines` in `13 languages`.
+Spark NLP offers more than `90 pre-trained pipelines` in `13 languages`.

**English pipelines:**
| Pipeline | Name | Build | lang |
@@ -601,7 +607,7 @@ annotation.select("entities.result").show(false)
*/
```

-#### Please check our dedicated repository for the full list of [pre-trained pipelines](https://github.com/JohnSnowLabs/spark-nlp-models)
+#### Please check our dedicated repository for the full list of 90+ [pre-trained pipelines](https://github.com/JohnSnowLabs/spark-nlp-models)

### Models

@@ -683,7 +689,7 @@ val french_pos = PerceptronModel.load("/tmp/pos_ud_gsd_fr_2.0.2_2.4_155653145734
.setOutputCol("pos")
```
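
The same offline-loading pattern works from Python; a sketch, assuming a POS model already downloaded and unpacked locally (the path below is a hypothetical placeholder):

```python
from sparknlp.annotator import PerceptronModel

# Load a locally stored model (offline mode) and wire its columns
french_pos = PerceptronModel.load("/tmp/pos_ud_gsd_fr") \
    .setInputCols(["document", "token"]) \
    .setOutputCol("pos")
```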

-#### Please check our dedicated repository for the full list of [pre-trained models](https://github.com/JohnSnowLabs/spark-nlp-models)
+#### Please check our dedicated repository for the full list of 250+ [pre-trained models](https://github.com/JohnSnowLabs/spark-nlp-models)

## Examples

12 changes: 6 additions & 6 deletions docs/_layouts/landing.html
@@ -49,22 +49,22 @@ <h1>{{ _section.title }}</h1>
<div class="cell cell--12 cell--lg-12" style="text-align: left; background-color: #2d2d2d; padding: 10px">
{% highlight bash %}
# Install Spark NLP from PyPI
-$ pip install spark-nlp==2.6.0
+$ pip install spark-nlp==2.6.1

# Install Spark NLP from Anaconda/Conda
$ conda install -c johnsnowlabs spark-nlp

# Load Spark NLP with Spark Shell
-$ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+$ spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1

# Load Spark NLP with PySpark
-$ pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+$ pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1

# Load Spark NLP with Spark Submit
-$ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+$ spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1

# Load Spark NLP as external JAR after compiling and building Spark NLP by `sbt assembly`
-$ spark-shell --jars spark-nlp-assembly-2.6.0
+$ spark-shell --jars spark-nlp-assembly-2.6.1
{% endhighlight %}
</div>
</div>
@@ -105,7 +105,7 @@ <h3 style="margin-left:-30px">NLP Features</h3>
<li>Easy <strong>TensorFlow</strong> integration</li>
<li>Full integration with <strong>Spark ML</strong> functions</li>
<li><strong>+250</strong> pre-trained <strong>models </strong> in <strong>46 languages! </strong>
-<li><strong>+88</strong> pre-trained <strong>pipelines </strong> in <strong>13 languages! </strong>
+<li><strong>+90</strong> pre-trained <strong>pipelines </strong> in <strong>13 languages! </strong>
</ul>
</div>
<div class="cell cell--12 cell--lg-7" style="text-align: left; padding: 10px">
12 changes: 6 additions & 6 deletions docs/en/concepts.md
@@ -47,25 +47,25 @@ The easiest way to run the python examples is by starting a pyspark
jupyter notebook including the spark-nlp package:

```sh
-spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1
```

```sh
-pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1
```

```sh
-spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.0
+spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.11:2.6.1
```

This can also be used to create a SparkSession manually by using the `spark.jars.packages` option in both Python and Scala.

-**NOTE**: To use Spark NLP with GPU, use the dedicated GPU package `com.johnsnowlabs.nlp:spark-nlp-gpu_2.11:2.6.0`
+**NOTE**: To use Spark NLP with GPU, use the dedicated GPU package `com.johnsnowlabs.nlp:spark-nlp-gpu_2.11:2.6.1`

**NOTE**: To use Spark NLP on Apache Spark 2.3.x you should instead use the following packages:

-- CPU: `com.johnsnowlabs.nlp:spark-nlp-spark23_2.11:2.6.0`
-- GPU: `com.johnsnowlabs.nlp:spark-nlp-gpu-spark23_2.11:2.6.0`
+- CPU: `com.johnsnowlabs.nlp:spark-nlp-spark23_2.11:2.6.1`
+- GPU: `com.johnsnowlabs.nlp:spark-nlp-gpu-spark23_2.11:2.6.1`

### Explain Document ML
