**Relationship extraction (RE)** is the IE task that deals with extracting entities and relationships between them from text documents.

![alt text](https://learning.oreilly.com/library/view/practical-natural-language/9781492054047/assets/pnlp_0510.png)

A common supervised approach models RE as a two-step classification problem:
1. Whether two entities in a text are related (binary classification).
2. If they are related, what is the relation between them (multiclass classification)?
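
The two steps above can be sketched as a small cascade. This is only an illustration of the control flow: `extract_relation`, `toy_is_related`, and `toy_relation_type` are hypothetical names, and the keyword rules stand in for trained binary and multiclass classifiers.

```python
from typing import Callable, Optional, Tuple

EntityPair = Tuple[str, str]


def extract_relation(
    sentence: str,
    pair: EntityPair,
    is_related: Callable[[str, EntityPair], bool],
    relation_type: Callable[[str, EntityPair], str],
) -> Optional[str]:
    """Run the cascade: binary check first, multiclass labeling second."""
    if not is_related(sentence, pair):
        return None  # step 1 says "not related", so step 2 is skipped
    return relation_type(sentence, pair)


# Toy stand-ins for trained models (hypothetical, keyword-based).
def toy_is_related(sentence: str, pair: EntityPair) -> bool:
    # Treat the pair as related only if both mentions occur in the sentence.
    return all(mention in sentence for mention in pair)


def toy_relation_type(sentence: str, pair: EntityPair) -> str:
    lowered = sentence.lower()
    if "ceo of" in lowered:
        return "ceoOf"
    if "born in" in lowered:
        return "bornIn"
    return "other"


sentence = "Satya Nadella is the CEO of Microsoft."
related = extract_relation(
    sentence, ("Satya Nadella", "Microsoft"), toy_is_related, toy_relation_type
)
unrelated = extract_relation(
    sentence, ("Satya Nadella", "Hyderabad"), toy_is_related, toy_relation_type
)
print(related)
print(unrelated)
```

In a real system, both callables would be replaced by classifiers trained on labeled entity pairs; the cascade itself stays the same.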

In scenarios where we cannot procure training data for supervised approaches, we can resort to unsupervised approaches. Unsupervised RE (also known as “open IE”) aims to extract relations from the web without relying on any training data or any list of relations. The relations extracted are in the form of <verb, argument1, argument2> tuples. Sometimes, a verb may have more arguments.

The challenge with this approach lies in mapping the extracted tuples to some standardized set of relations (e.g., fatherOf, motherOf, inventorOf, etc.) from a database.
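
One simple way to picture this mapping step is a lookup from normalized verb phrases to canonical relation names. The sketch below is an assumption-laden illustration, not a standard method: the `VERB_TO_RELATION` table and the `map_to_schema` helper are hypothetical, and a real system would need far broader coverage (synonyms, paraphrases, argument order).

```python
from typing import Dict, Optional, Tuple

OpenIETuple = Tuple[str, str, str]  # (verb, argument1, argument2)

# Hypothetical lookup from normalized verb phrases to canonical relation names.
VERB_TO_RELATION: Dict[str, str] = {
    "invented": "inventorOf",
    "is the father of": "fatherOf",
    "is the mother of": "motherOf",
}


def normalize(verb: str) -> str:
    """Lowercase and collapse whitespace so surface variants share one key."""
    return " ".join(verb.lower().split())


def map_to_schema(extracted: OpenIETuple) -> Optional[Tuple[str, str, str]]:
    """Return (relation, argument1, argument2), or None if the verb is unknown."""
    verb, arg1, arg2 = extracted
    relation = VERB_TO_RELATION.get(normalize(verb))
    if relation is None:
        return None
    return (relation, arg1, arg2)


mapped = map_to_schema(("Invented", "Alexander Graham Bell", "the telephone"))
unmapped = map_to_schema(("admired", "Alexander Graham Bell", "Helen Keller"))
print(mapped)
print(unmapped)
```

Tuples whose verb has no entry come back as `None`, which makes the coverage gap of such a table easy to measure.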

A solution commonly used in industry NLP projects is to rely on the Natural Language Understanding service provided by IBM Watson:

In [2]:
import json
from watson_developer_cloud import NaturalLanguageUnderstandingV1
from watson_developer_cloud.natural_language_understanding_v1 import Features, RelationsOptions

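# Placeholder credentials: replace 'XXXXX' with a valid IAM API key,
# otherwise every analyze() call below fails with an API key error.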
natural_language_understanding = NaturalLanguageUnderstandingV1(
    version='2018-11-16',
    iam_apikey='XXXXX',
    url='https://gateway-wdc.watsonplatform.net/natural-language-understanding/api'
)

response = natural_language_understanding.analyze(
    text='Leonardo DiCaprio won Best Actor in a Leading Role for his performance.',
    features=Features(relations=RelationsOptions())).get_result()

print(json.dumps(response, indent=2))

WatsonApiException: Error: Provided API key could not be found, Code: 400

In [3]:
mytext = "Satya Narayana Nadella currently serves as the Chief Executive Officer (CEO) of Microsoft."

response = natural_language_understanding.analyze(
    text=mytext,
    features=Features(relations=RelationsOptions())).get_result()

result = json.dumps(response)
result

WatsonApiException: Error: Provided API key could not be found, Code: 400

In [4]:
for item in response['relations']:
    print(item['type'])
    for subitem in item['arguments']:
        print(subitem['entities'])
print()

NameError: name 'response' is not defined
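
Because the cells above need a valid API key, the loop can also be exercised offline against a hand-written dictionary. The `sample_response` below is NOT real Watson output: it mirrors only the fields the loop reads (`relations`, `type`, `arguments`, `entities`), and the inner `type` and `text` keys, the relation label, and the `flatten_relations` helper are assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

# Hand-written stand-in for a response; NOT real Watson output.
sample_response: Dict[str, Any] = {
    "relations": [
        {
            "type": "employedBy",  # assumed label, for illustration only
            "arguments": [
                {"entities": [{"type": "Person", "text": "Satya Narayana Nadella"}]},
                {"entities": [{"type": "Organization", "text": "Microsoft"}]},
            ],
        }
    ]
}


def flatten_relations(response: Dict[str, Any]) -> List[Tuple[str, List[str]]]:
    """Collect (relation type, entity mention texts) for each relation."""
    rows = []
    for item in response.get("relations", []):
        mentions = [
            entity["text"]
            for argument in item["arguments"]
            for entity in argument["entities"]
        ]
        rows.append((item["type"], mentions))
    return rows


rows = flatten_relations(sample_response)
for relation, mentions in rows:
    print(relation, mentions)
```

Using `response.get("relations", [])` keeps the helper from raising when a response contains no relations.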

In [5]:
mytext2 = "Nadella was born in Hyderabad. His father, Bukkapuram Nadella Yugandher, was a civil servant who worked for the Indian Administrative Service of the Government of India. His mother was a Sanskrit scholar. "
response = natural_language_understanding.analyze(
    text=mytext2,
    features=Features(relations=RelationsOptions())).get_result()
for item in response['relations']:
    print(item['type'])
    for subitem in item['arguments']:
        print(subitem['entities'])
print()

WatsonApiException: Error: Provided API key could not be found, Code: 400

In [6]:
mytext3 = """Nadella attended the Hyderabad Public School, Begumpet [12] before receiving
a bachelor's in electrical engineering[13] from the Manipal Institute of Technology 
(then part of Mangalore University) in Karnataka in 1988."""
response = natural_language_understanding.analyze(
    text=mytext3,
    features=Features(relations=RelationsOptions())).get_result()
for item in response['relations']:
    print(item['type'])
    for subitem in item['arguments']:
        print(subitem['entities'])

WatsonApiException: Error: Provided API key could not be found, Code: 400

TIP: Start with pattern-based approaches and use some form of weak supervision in scenarios where pre-trained supervised models may not work.
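
As a rough illustration of the pattern-based starting point, the sketch below matches two hand-written regular expressions against the example sentences used earlier. The pattern list, the relation names `bornIn` and `ceoOf`, and the `extract` helper are all hypothetical, and the capitalized-word heuristic for names is a simplification; the weak supervision part of the tip is not covered here.

```python
import re
from typing import List, Tuple

# Naive name heuristic: one or more capitalized words separated by spaces.
NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"

# Hypothetical hand-written patterns, one per target relation.
PATTERNS = [
    ("bornIn", re.compile(rf"(?P<arg1>{NAME}) was born in (?P<arg2>{NAME})")),
    (
        "ceoOf",
        re.compile(
            rf"(?P<arg1>{NAME})(?: currently)? (?:is|serves as) the "
            rf"(?:CEO|Chief Executive Officer(?: \(CEO\))?) of (?P<arg2>{NAME})"
        ),
    ),
]


def extract(text: str) -> List[Tuple[str, str, str]]:
    """Return (relation, argument1, argument2) for every pattern match."""
    found = []
    for relation, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((relation, match.group("arg1"), match.group("arg2")))
    return found


born = extract("Nadella was born in Hyderabad.")
ceo = extract(
    "Satya Narayana Nadella currently serves as the "
    "Chief Executive Officer (CEO) of Microsoft."
)
print(born)
print(ceo)
```

Patterns like these are precise but brittle, so their matches are typically used to bootstrap labeled examples rather than as the final extractor.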