Skip to content
This repository was archived by the owner on Mar 7, 2018. It is now read-only.

Add entity extraction capability#21

Merged
c-w merged 6 commits intomasterfrom
extract-entities
Jun 22, 2017
Merged

Add entity extraction capability#21
c-w merged 6 commits intomasterfrom
extract-entities

Conversation

@c-w
Copy link
Contributor

@c-w c-w commented Jun 20, 2017

The Cassandra schema contains a field for entities separate from places so this pull request ensures that we have the necessary data available to write the entity information to Cassandra.

@c-w c-w requested a review from kevinhartman June 20, 2017 20:39
@c-w c-w force-pushed the extract-entities branch from 09ff41d to 8ea6649 Compare June 21, 2017 03:24
@c-w c-w requested a review from erikschlegel June 21, 2017 17:51
@c-w c-w force-pushed the extract-entities branch 2 times, most recently from dfac34a to 4f00634 Compare June 21, 2017 21:47
Copy link
Contributor

@kevinhartman kevinhartman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


import scala.collection.JavaConversions._
import scala.util.{Failure, Success, Try}
import com.microsoft.partnercatalyst.fortis.spark.transforms.nlp.OpeNER.entityIsPlace
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: In this case, I think qualifying the name inline would make it more obvious to the reader that this helper is part of OpeNER.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 2934106.


def extractPeople(text: String, language: String): List[Tag] = {
entityRecognizer.extractEntities(text, language).filter(entityIsPerson)
.map(entity => Tag(name = entity.getStr, confidence = 1.0))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For my own understanding, why are we using tags for people but not place entities?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not happy with that inconsistency too. Fixed in e0daddc.

Copy link
Contributor

@jcjimenez jcjimenez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@c-w c-w force-pushed the extract-entities branch from 4f00634 to 2934106 Compare June 22, 2017 15:25
@c-w c-w merged commit a86af0d into master Jun 22, 2017
@c-w c-w deleted the extract-entities branch June 22, 2017 15:48
@c-w c-w removed the in progress label Jun 22, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants