### NLP : Named Entity Tagger in TransmogrifAI
In this sample we look at how to use TransmogrifAI to extract named entities: Person, Date and Organization.

First we import the jars for `transmogrifai-core`, `spark-mllib` and `transmogrifai-models`. `transmogrifai-models` encapsulates the OpenNLP models.

In [1]:
%classpath add mvn com.salesforce.transmogrifai transmogrifai-core_2.11 0.6.0

In [2]:
%classpath add mvn org.apache.spark spark-mllib_2.11 2.3.2

In [12]:
%classpath add mvn com.salesforce.transmogrifai transmogrifai-models_2.11 0.6.0

#### NameEntityRecognizer
`NameEntityRecognizer` is the named entity (`NameEntityType`) text recognizer class. It encapsulates `com.salesforce.op.utils.text.OpenNLPAnalyzer`, which loads OpenNLP models from disk using the `com.salesforce.op.utils.text.OpenNLPModels` class.

In [13]:
import com.salesforce.op.stages.impl.feature.NameEntityRecognizer



Create a `Seq` of input text, which is then fed into `NameEntityRecognizer.Analyzer.analyze(..)` to split the plain text into tokens.

In [15]:
import com.salesforce.op.features.types._
import com.salesforce.op.utils.text.Language

val input = Seq(
    "Salesforce was founded in 1999 by former Oracle executive Marc Benioff, Parker Harris, Dave Moellenhoff, and " + 
    "Frank Dominguez as a company specializing in software as a service. Harris, Moellenhoff, and Dominguez,"+
    " three software developers previously at consulting firm Left Coast Software, were introduced to Benioff through"+
    " a friend and former Oracle colleague Bobby Yazdani. Harris and team wrote the initial sales automation software,"+
    " which launched to its first customers during Sept-Nov 1999."
    )
val tokens: Seq[TextList] = input.map(x => NameEntityRecognizer.Analyzer.analyze(x, Language.English).toTextList)

[[TextList(Salesforce, was, founded, in, 1999, by, former, Oracle, executive, Marc, Benioff, ,, Parker, Harris, ,, Dave, Moellenhoff, ,, and, Frank, Dominguez, as, a, company, specializing, in, software, as, a, service, ., Harris, ,, Moellenhoff, ,, and, Dominguez, ,, three, software, developers, previously, at, consulting, firm, Left, Coast, Software, ,, were, introduced, to, Benioff, through, a, friend, and, former, Oracle, colleague, Bobby, Yazdani, ., Harris, and, team, wrote, the, initial, sales, automation, software, ,, which, launched, to, its, first, customers, during, Sept-Nov, 1999, .)]]

Instantiate `OpenNLPNameEntityTagger`, which is then used to tag tokens as `Person`, `Organization` or `Date`.

In [23]:
import com.salesforce.op.utils.text.OpenNLPNameEntityTagger
import com.salesforce.op.utils.text.NameEntityType
import com.salesforce.op.features.types._

val nerTagger = new OpenNLPNameEntityTagger()

com.salesforce.op.utils.text.OpenNLPNameEntityTagger@2a4edd6d

#### Extract Person tags

We extract person entities by passing the tokens, the language and the entity type to the `nerTagger` instance defined above:

`nerTagger.tag(token, Language.English, Seq(NameEntityType.Person))`

In [53]:
val personEntities = tokens.map { tokenInput => 
      nerTagger.tag(tokenInput.value, Language.English, Seq(NameEntityType.Person)).tokenTags
}
personEntities

[[Map(Parker -> Set(Person), Dominguez -> Set(Person), Benioff -> Set(Person), Yazdani -> Set(Person), Frank -> Set(Person), Marc -> Set(Person), Bobby -> Set(Person), Moellenhoff -> Set(Person), Dave -> Set(Person), Harris -> Set(Person))]]
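
The result above is keyed by individual token, so "Marc" and "Benioff" appear as separate entries. As a rough sketch of how full names could be rebuilt, the snippet below joins runs of consecutive Person-tagged tokens. It uses plain Scala collections with stand-in data copied from the outputs above; `groupNames`, `sampleTokens` and `personTokens` are hypothetical names for illustration, not part of TransmogrifAI.

```scala
// Stand-in data: a slice of the token list and the Person-tagged tokens from the outputs above.
val sampleTokens: Seq[String] = Seq(
  "former", "Oracle", "executive", "Marc", "Benioff", ",", "Parker", "Harris", ",",
  "Dave", "Moellenhoff", ",", "and", "Frank", "Dominguez", "as", "a", "company")
val personTokens: Set[String] =
  Set("Marc", "Benioff", "Parker", "Harris", "Dave", "Moellenhoff", "Frank", "Dominguez")

// Hypothetical helper: joins each run of consecutive tagged tokens into one name.
def groupNames(tokens: Seq[String], tagged: Set[String]): Seq[String] = {
  val (done, current) = tokens.foldLeft((Vector.empty[String], Vector.empty[String])) {
    // Tagged token: extend the current run.
    case ((names, run), tok) if tagged.contains(tok) => (names, run :+ tok)
    // Untagged token after a run: close the run and emit it as a name.
    case ((names, run), _) if run.nonEmpty => (names :+ run.mkString(" "), Vector.empty[String])
    // Untagged token with no open run: nothing to do.
    case (acc, _) => acc
  }
  // Emit a run that reaches the end of the token list.
  if (current.nonEmpty) done :+ current.mkString(" ") else done
}

val fullNames = groupNames(sampleTokens, personTokens)
```

Because the tags are looked up by token text rather than by position, a surname that appears on its own later in the text (for example "Harris") would come out as a single-token name.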

#### Extract Date

In [55]:
val dateEntities = tokens.map { tokenInput => 
      nerTagger.tag(tokenInput.value, Language.English, Seq(NameEntityType.Date)).tokenTags
}
dateEntities

[[Map(1999 -> Set(Date))]]

#### Extract Organization

In [56]:
val organizationEntities = tokens.map { tokenInput =>
      nerTagger.tag(tokenInput.value, Language.English, Seq(NameEntityType.Organization)).tokenTags
}
organizationEntities

[[Map(Oracle -> Set(Organization))]]
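
The three passes above each return a separate token-to-tags map. One possible way to combine them into a single lookup, and to invert that lookup into entity type to tokens, is sketched below. It is written against plain `Map[String, Set[String]]` stand-ins (tag names as strings, data abbreviated from the outputs above) rather than the TransmogrifAI result types; `mergeTags` and `byType` are hypothetical helpers.

```scala
// Stand-in data: abbreviated copies of the three outputs above, with tags as plain strings.
val persons: Map[String, Set[String]] = Map("Marc" -> Set("Person"), "Benioff" -> Set("Person"))
val dates: Map[String, Set[String]] = Map("1999" -> Set("Date"))
val orgs: Map[String, Set[String]] = Map("Oracle" -> Set("Organization"))

// Merge per-type maps; a token present in several maps gets the union of its tag sets.
def mergeTags(maps: Seq[Map[String, Set[String]]]): Map[String, Set[String]] =
  maps.flatten
    .groupBy(_._1)
    .map { case (tok, pairs) => tok -> pairs.flatMap(_._2).toSet }

// Invert token -> tags into tag -> tokens.
def byType(tags: Map[String, Set[String]]): Map[String, Set[String]] =
  tags.toSeq
    .flatMap { case (tok, ts) => ts.map(t => t -> tok) }
    .groupBy(_._1)
    .map { case (t, pairs) => t -> pairs.map(_._2).toSet }

val merged = mergeTags(Seq(persons, dates, orgs))
val tokensByType = byType(merged)
```

With the real results, the same idea would need the tag values converted to strings (or the helper signatures changed to the library's entity type) first.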