Skip to content
Branch: master
Find file History
Permalink
Type Name Latest commit message Commit time
..
Failed to load latest commit information.
annotation/english Update for Spark NLP 2.0.4 May 20, 2019
training Add new examples to Jupyter Mar 22, 2019
README.md Update Spark NLP version to 2.0.1 Apr 12, 2019
quick_start.ipynb Add quick_strat notebook Apr 12, 2019

README.md

Spark-NLP Python

There are two directories: training and annotation. Inside traning you will find all the examples which help you to train Spark-NLP models and pipelines. On the other hand, the examples inside annotation demonstrate how to use Spark-NLP annotators, pre-trained models, and pre-trained pipelines.

Setup Spark-NLP

Pip

If you installed pyspark through pip, you can install spark-nlp through pip as well.

pip install spark-nlp==2.0.1

PyPI spark-nlp package

Conda

If you are using Anaconda/Conda for managing Python packages, you can install spark-nlp as follow:

conda install -c johnsnowlabs spark-nlp

Anaconda spark-nlp package

Then you'll have to create a SparkSession manually, for example:

spark = SparkSession.builder \
    .appName("ner")\
    .master("local[*]")\
    .config("spark.driver.memory","6G")\
    .config("spark.driver.maxResultSize", "2G") \
    .config("spark.jars.packages", "JohnSnowLabs:spark-nlp:2.0.1")\
    .config("spark.kryoserializer.buffer.max", "500m")\
    .getOrCreate()

If using local jars, you can use spark.jars instead for a comma delimited jar files. For cluster setups, of course you'll have to put the jars in a reachable location for all driver and executor nodes

Setup Jupyter Notebook

Prerequisite: Python

While Jupyter runs code in many programming languages, Python is a requirement (Python 3.3 or greater, or Python 2.7) for installing the Jupyter Notebook itself.

Installing Jupyter using Anaconda

We strongly recommend installing Python and Jupyter using the Anaconda Distribution, which includes Python, the Jupyter Notebook, and other commonly used packages for scientific computing and data science.

First, download Anaconda. We recommend downloading Anaconda’s latest Python 3 version.

Second, install the version of Anaconda which you downloaded, following the instructions on the download page.

Congratulations, you have installed Jupyter Notebook! To run the notebook, run the following command at the Terminal (Mac/Linux) or Command Prompt (Windows):

jupyter notebook

Installing Jupyter with pip

As an existing or experienced Python user, you may wish to install Jupyter using Python’s package manager, pip, instead of Anaconda.

If you have Python 3 installed (which is recommended):

python3 -m pip install --upgrade pip
python3 -m pip install jupyter

Congratulations, you have installed Jupyter Notebook! To run the notebook, run the following command at the Terminal (Mac/Linux) or Command Prompt (Windows):

jupyter notebook

Original reference: https://jupyter.org/install

You can’t perform that action at this time.