This repository has been archived by the owner. It is now read-only.

Add entity extraction capability #21

Merged
merged 6 commits into from Jun 22, 2017

3 participants
@c-w
Member

commented Jun 20, 2017

The Cassandra schema contains a field for entities that is separate from places, so this pull request ensures that we have the necessary data available to write the entity information to Cassandra.

@c-w c-w requested a review from kevinhartman Jun 20, 2017

@c-w c-w added the in progress label Jun 20, 2017

@c-w c-w force-pushed the extract-entities branch from 09ff41d to 8ea6649 Jun 21, 2017

@c-w c-w requested a review from erikschlegel Jun 21, 2017

@c-w c-w force-pushed the extract-entities branch 2 times, most recently from dfac34a to 4f00634 Jun 21, 2017

@kevinhartman
Contributor

left a comment

LGTM


import scala.collection.JavaConversions._
import scala.util.{Failure, Success, Try}
import com.microsoft.partnercatalyst.fortis.spark.transforms.nlp.OpeNER.entityIsPlace

@kevinhartman

kevinhartman Jun 22, 2017

Contributor

nit: In this case, I think qualifying the name inline would make it more obvious to the reader that this helper is part of OpeNER.

@c-w

c-w Jun 22, 2017

Author Member

Done in 2934106.
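
To illustrate the nit above, here is a minimal sketch of what qualifying the helper inline could look like. The `Entity` case class, the body of `entityIsPlace`, and the `"LOCATION"` label are hypothetical stand-ins, since the real OpeNER helper and its entity type are not shown in this thread; only the names `OpeNER` and `entityIsPlace` come from the import under review.

```scala
// Hypothetical stand-ins: the real helper and entity type are not shown in this thread.
final case class Entity(value: String, kind: String)

object OpeNER {
  // Assumed predicate shape: true when the entity is labelled as a location.
  def entityIsPlace(entity: Entity): Boolean = entity.kind == "LOCATION"
}

object QualifiedNameExample {
  val entities: List[Entity] =
    List(Entity("Seattle", "LOCATION"), Entity("Ada Lovelace", "PERSON"))

  // Qualifying the helper at the call site shows that it belongs to OpeNER,
  // instead of relying on a member import at the top of the file.
  val places: List[Entity] = entities.filter(OpeNER.entityIsPlace)
}
```

The trade-off is a slightly longer call site in exchange for not having to scroll to the imports to find where the predicate is defined.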


def extractPeople(text: String, language: String): List[Tag] = {
  entityRecognizer.extractEntities(text, language).filter(entityIsPerson)
    .map(entity => Tag(name = entity.getStr, confidence = 1.0))

@kevinhartman

kevinhartman Jun 22, 2017

Contributor

For my own understanding, why are we using tags for people but not place entities?

@c-w

c-w Jun 22, 2017

Author Member

Yeah, I'm not happy with that inconsistency either. Fixed in e0daddc.
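
For readers following the exchange above, one possible way to remove the inconsistency is to route both people and places through the same conversion to `Tag`. This is only a sketch under assumptions and not necessarily what e0daddc does: `Tag(name, confidence)` mirrors the constructor call in the reviewed snippet, while `NamedEntity`, the `kind` labels, and the predicates are hypothetical stand-ins for the OpeNER types.

```scala
// Tag mirrors the constructor call in the reviewed snippet; the rest is hypothetical.
final case class Tag(name: String, confidence: Double)
final case class NamedEntity(text: String, kind: String)

object ConsistentTagsExample {
  // Assumed predicates; the real entityIsPerson/entityIsPlace work on OpeNER types.
  private def isPerson(e: NamedEntity): Boolean = e.kind == "PERSON"
  private def isPlace(e: NamedEntity): Boolean = e.kind == "LOCATION"

  // A single shared conversion keeps people and places in the same representation.
  private def toTag(e: NamedEntity): Tag = Tag(name = e.text, confidence = 1.0)

  def extractPeople(entities: List[NamedEntity]): List[Tag] =
    entities.filter(isPerson).map(toTag)

  def extractPlaces(entities: List[NamedEntity]): List[Tag] =
    entities.filter(isPlace).map(toTag)
}
```

With both extractors returning `List[Tag]`, downstream code that writes to the entity and place fields can treat the two results uniformly.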

@jcjimenez
Contributor

left a comment

LGTM

@c-w c-w force-pushed the extract-entities branch from 4f00634 to 2934106 Jun 22, 2017

@c-w c-w merged commit a86af0d into master Jun 22, 2017

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed

@c-w c-w deleted the extract-entities branch Jun 22, 2017

@c-w c-w removed the in progress label Jun 22, 2017
