# 1. Introduction
## 1.1 Definition
**Relation Extraction (RE)** in **Natural Language Processing (NLP)** is the process of identifying and classifying semantic relationships between entities in text.
* It aims to extract meaningful connections, such as relationships between people, organizations, locations, or events.
* These relationships are often represented in the form of triplets: (entity1, relation, entity2), where "relation" describes how the two entities are connected.
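As a minimal sketch of the triplet form described above, a relation can be held in a small named tuple. The class name `RelationTriple` and the relation labels `born_in` and `president_of` are assumptions chosen for illustration, and the triples are written by hand rather than produced by a model.

```python
from typing import NamedTuple


class RelationTriple(NamedTuple):
    """One extracted relation in (entity1, relation, entity2) form."""
    entity1: str
    relation: str
    entity2: str


# Hand-written triples for illustration; they are not model output.
triples = [
    RelationTriple("Barack Obama", "born_in", "Honolulu"),
    RelationTriple("Barack Obama", "president_of", "United States"),
]

for t in triples:
    print(f"({t.entity1}, {t.relation}, {t.entity2})")
```

Because a named tuple is still a tuple, each triple can be unpacked positionally or accessed by field name, which keeps downstream code readable.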

## 1.2 Description
**Relation Extraction** is crucial for tasks such as knowledge graph construction, information retrieval, question answering, and enhancing the performance of chatbots. There are different approaches to RE, including:

1. **Rule-Based Methods:** Manually crafted patterns or rules to identify relationships.
2. **Supervised Learning:** Training machine learning models on labeled datasets containing examples of relationships.
3. **Unsupervised and Semi-Supervised Learning:** Identifying relations without extensive labeled data, often using clustering techniques.
4. **Deep Learning Methods:** Utilizing neural networks, such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), or transformers, for end-to-end relation extraction.
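To make the first approach concrete, here is a minimal rule-based sketch using only the standard `re` module. The pattern, the `born_in` label, and the function name `extract_born_in` are hypothetical; a practical rule set would need many more patterns and would usually work on linguistic annotations rather than raw strings.

```python
import re

# Hypothetical rule: "<Name> was born in <Place>", where both slots are
# runs of capitalized words. Real rule sets are far larger than this.
BORN_IN = re.compile(
    r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)*) was born in "
    r"(?P<place>[A-Z][a-z]+(?: [A-Z][a-z]+)*)"
)


def extract_born_in(text):
    """Return (person, 'born_in', place) triples matched by the rule."""
    return [
        (m.group("person"), "born_in", m.group("place"))
        for m in BORN_IN.finditer(text)
    ]


sample = "Barack Obama was born in Honolulu. He was the president of the United States."
print(extract_born_in(sample))
```

The rule finds the birthplace relation but misses the presidency in the second sentence, because "He" is a pronoun and the wording differs. That brittleness is the usual motivation for the learned approaches in the list.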

Relation Extraction automates the conversion of unstructured text into structured data, which supports applications such as automated document analysis and AI-driven systems that understand and use textual information.

# 2. Set Up the Environment
Install **spaCy** and a pre-trained **NLP** model.

In [None]:
!pip install spacy
!python -m spacy download en_core_web_sm

Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.


# 3. Import Libraries

In [None]:
import spacy  # Import spaCy, a popular NLP library

# 4. Loading a Pre-trained NLP Model
* Load a small pre-trained English NLP model from **spaCy**.
* This model includes capabilities like:
  * tokenization,
  * part-of-speech tagging, and
  * named entity recognition (NER).

**en_core_web_sm:** *This is a small English model that comes pre-trained with built-in pipelines for processing English text.*


In [None]:
nlp = spacy.load("en_core_web_sm")
nlp

<spacy.lang.en.English at 0x7e3ceadafaf0>

# 5. Defining the Input Text
The input text includes entities and possible relationships between them that we want to identify.


In [None]:
# Example input text for relation extraction
text = "Barack Obama was born in Honolulu. He was the president of the United States."

# 6. Processing the Text
Run the input text through the **spaCy** NLP pipeline.

In [None]:
# doc is the processed version of the input text; it holds information
# such as tokens, entities, and their positions in the text.
doc = nlp(text)
doc

Barack Obama was born in Honolulu. He was the president of the United States.

# 7. Iterating Over Extracted Entities
* The nested loop pairs each entity (`ent1`) with every other entity (`ent2`), so each pair is visited in both orders.
* We skip cases where `ent1` and `ent2` are the same entity span.
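The pairing logic described above can be sketched on plain strings, without spaCy. The `entities` list is a hand-written stand-in for the detected entities, not actual model output, and `itertools.permutations` is shown only as an equivalent way to enumerate the same ordered pairs.

```python
from itertools import permutations

# Stand-in strings for the entities; the notebook iterates over spaCy
# spans instead, so these values are assumptions made for illustration.
entities = ["Barack Obama", "Honolulu", "the United States"]

# Nested loop with an explicit skip. Positions are compared so that
# repeated mentions with identical text would still count as distinct.
nested_pairs = []
for i, ent1 in enumerate(entities):
    for j, ent2 in enumerate(entities):
        if i == j:
            continue
        nested_pairs.append((ent1, ent2))

# permutations(..., 2) yields the same ordered pairs in the same order.
assert nested_pairs == list(permutations(entities, 2))
print(len(nested_pairs))  # 6 ordered pairs for 3 entities
```

Keeping both orders matters when relations are directional: ("Barack Obama", "Honolulu") and ("Honolulu", "Barack Obama") are different candidates. For n entities this produces n * (n - 1) candidate pairs.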

In [None]:
# Loop through all detected named entities in the text
for ent1 in doc.ents:
    # Loop through entities again to compare each entity with others
    for ent2 in doc.ents:
        # Skip if the entities being compared are the same
        if ent1 == ent2:
            continue