Unclear documentation on how to properly use the POSTagger #41

@rylanhalteman

Description

The documentation does not provide a clear way to run the POSTagger. The annotator documentation gives the following snippet:

val posTagger = new PerceptronApproach()
  .setInputCols(Array("sentence", "token"))
  .setOutputCol("pos")

However, using this snippet results in a NullPointerException rather than running.

Expected Behavior

Adding this snippet to a reasonable workflow, such as the one provided in the Quickstart documentation, should add the POSTagger to the pipeline without crashing.

Current Behavior

Adding the POSTagger to the pipeline results in a NullPointerException:

scala> pipeline.fit(data).transform(data).show()
java.lang.NullPointerException
  at java.io.FilterInputStream.read(FilterInputStream.java:133)
  at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
  at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
  at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
  at java.io.InputStreamReader.read(InputStreamReader.java:184)
  at java.io.BufferedReader.fill(BufferedReader.java:161)
  at java.io.BufferedReader.readLine(BufferedReader.java:324)
  at java.io.BufferedReader.readLine(BufferedReader.java:389)
  at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:72)
  at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:389)
  at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
  at scala.collection.Iterator$class.foreach(Iterator.scala:893)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
  at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:59)
  at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:104)
  at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:48)
  at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:310)
  at scala.collection.AbstractIterator.to(Iterator.scala:1336)
  at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:302)
  at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1336)
  at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:289)
  at scala.collection.AbstractIterator.toArray(Iterator.scala:1336)
  at com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronApproach$.parsePOSCorpusFromDir(PerceptronApproach.scala:227)
  at com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronApproach$.retrievePOSCorpus(PerceptronApproach.scala:246)
  at com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronApproach.train(PerceptronApproach.scala:84)
  at com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronApproach.train(PerceptronApproach.scala:22)
  at com.johnsnowlabs.nlp.AnnotatorApproach.fit(AnnotatorApproach.scala:28)
  at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:153)
  at org.apache.spark.ml.Pipeline$$anonfun$fit$2.apply(Pipeline.scala:149)
  at scala.collection.Iterator$class.foreach(Iterator.scala:893)
  at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
  at scala.collection.IterableViewLike$Transformed$class.foreach(IterableViewLike.scala:44)
  at scala.collection.SeqViewLike$AbstractTransformed.foreach(SeqViewLike.scala:37)
  at org.apache.spark.ml.Pipeline.fit(Pipeline.scala:149)
  ... 54 elided

Possible Solution

The documentation mentions a setCorpusPath config method. From my brief perusal of the code, it appears that setting the corpus path is required, since it has no default value. If that is the case, the documentation should explain how to set the corpus path and include a full example. Ideally one would not need to specify a corpus at all, or the library would provide pre-trained models for various corpora.
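If so, presumably something along these lines is needed before fitting the pipeline (the corpus path below is a placeholder and I have not verified that this works or what format the corpus must be in):

val posTagger = new PerceptronApproach()
  .setInputCols(Array("sentence", "token"))
  .setOutputCol("pos")
  .setCorpusPath("/path/to/pos/corpus")  // placeholder path; expected corpus format is undocumented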

Steps to Reproduce

Enter the Spark shell using

spark-shell  --packages JohnSnowLabs:spark-nlp:1.2.2

and then run the following code

import com.johnsnowlabs.nlp._
import com.johnsnowlabs.nlp.annotators._
import com.johnsnowlabs.nlp.annotators.pos.perceptron.PerceptronApproach
import com.johnsnowlabs.nlp.annotators.sbd.pragmatic.SentenceDetectorModel
import org.apache.spark.ml.Pipeline

import spark.implicits._
import spark.sql

// Used my own data; shown here with the data from the notebook as an example
val data = spark.read.parquet("../sentiment.parquet").limit(1000)

val documentAssembler = new DocumentAssembler().setInputCol("text").setOutputCol("document")


val sentenceDetector = new SentenceDetectorModel().setInputCols(Array("document")).setOutputCol("sentence")

val regexTokenizer = new RegexTokenizer().setInputCols(Array("sentence")).setOutputCol("token")

val posTagger = new PerceptronApproach().setInputCols(Array("sentence", "token")).setOutputCol("pos")

val finisher = new Finisher().setInputCols("pos").setCleanAnnotations(false)

val pipeline = new Pipeline().setStages(Array(
        documentAssembler,
        sentenceDetector,
        regexTokenizer,
        posTagger,
        finisher
    ))

pipeline.fit(data).transform(data).show()

Context

I was trying to pass data to what I assumed was a pre-trained model for use in an NLP pipeline.

Your Environment

Spark version: 2.1.1
spark-nlp version: 1.2.2
Running on Amazon EMR
