<h1>Information Extraction in NLP</h1>

<p>Information extraction (IE) aims to extract structured data from unstructured text, which makes that data easier to query, analyze, and reuse.</p>

<h2>Named Entity Recognition (NER)</h2>
<p>NER is a subtask of IE in NLP that identifies and classifies named entities in text into predefined categories.</p>

import spacy

# Load spaCy's small English pipeline, which includes an NER component.
nlp = spacy.load("en_core_web_sm")

doc = nlp("It was 7 minutes after midnight. The dog was lying on the grass in the middle of the lawn in front of Mrs Shears’ house.")

# Print each detected entity span with its predicted label.
for ent in doc.ents:
    print(f"Entity: {ent.text}, Type: {ent.label_}")

Entity: 7 minutes after midnight, Type: TIME
Entity: Mrs Shears, Type: PERSON


<h2>NER Methods</h2>
<h4>Rule-Based Approaches: Use handcrafted rules based on patterns and dictionaries.</h4>
<h4>Statistical Machine Learning Approaches: Train models on text labeled with named entities (for example, HMMs and linear-chain CRFs).</h4>
<h4>Deep Learning Approaches: Use deep neural networks, such as bidirectional LSTMs or Transformers, to learn from large amounts of text data.</h4>
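
<p>The rule-based approach can be sketched with nothing more than regular expressions. The title list and the two patterns below are hypothetical and chosen only for illustration; a real system would rely on much larger dictionaries and rule sets.</p>

```python
import re

# Hypothetical title list and patterns, chosen only for illustration.
TITLES = r"(?:Mr|Mrs|Ms|Dr)\.?"
PATTERNS = [
    ("PERSON", re.compile(rf"\b{TITLES} [A-Z][a-z]+")),
    ("TIME", re.compile(r"\b\d{1,2} minutes (?:after|before) (?:midnight|noon)\b")),
]


def rule_based_ner(text):
    """Return (entity_text, label) pairs in order of appearance."""
    found = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.start(), match.group(), label))
    # Sort by start offset so entities come out in reading order.
    return [(span, label) for _, span, label in sorted(found)]


entities = rule_based_ner(
    "It was 7 minutes after midnight. The dog was lying in front of Mrs Shears' house."
)
print(entities)
```

<p>Rules like these are transparent and need no training data, but they only cover the patterns someone has written down, which is the main motivation for the statistical and neural approaches.</p>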

<h2>Relation Extraction</h2>
<p>Relation extraction (RE) identifies and extracts relationships between entities in a text.</p>

<h3>Binary Relationship Extraction</h3>
<p>Binary relationship extraction identifies and categorizes relationships between pairs of entities. It is the most basic form of relationship extraction and suits cases where each fact can be expressed as a pair, such as a person and an employer.</p>
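
<p>As a minimal sketch, a single surface pattern can extract one binary relation type. The <code>works_at</code> label, the pattern, and the example sentence are all assumptions made for illustration; practical systems usually run on top of NER output rather than raw capitalization.</p>

```python
import re

# Hypothetical surface pattern for one relation type, "works_at".
# A capitalized word sequence stands in for a proper entity mention.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
WORKS_AT = re.compile(rf"(?P<person>{NAME}) works (?:at|for) (?P<org>{NAME})")


def extract_works_at(text):
    """Return (person, relation, organization) tuples found in the text."""
    return [
        (m.group("person"), "works_at", m.group("org"))
        for m in WORKS_AT.finditer(text)
    ]


relations = extract_works_at("Alice Smith works at Acme Corp. Bob works for Globex.")
print(relations)
```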

<h3>Ternary Relationship Extraction</h3>
<p>Ternary relationship extraction extends binary relationship extraction to relationships that involve three entities, for example a buyer, a seller, and the item sold.</p>

<h3>Nested Relationship Extraction</h3>
<p>Nested relationship extraction applies when relationships in a text are hierarchical or embedded within one another, so that one relationship serves as an argument of another.</p>
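
<p>One way to represent such embedded relationships is a recursive data structure in which an argument may be either an entity string or another relation. The <code>Relation</code> class and the <code>depth</code> helper below are hypothetical names used only to illustrate the idea.</p>

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Relation:
    """A relation whose arguments are entities (str) or other relations."""
    predicate: str
    subject: Union[str, "Relation"]
    obj: Union[str, "Relation"]


def depth(node):
    """Nesting depth: 0 for a plain entity, 1 for a flat relation, and so on."""
    if isinstance(node, str):
        return 0
    return 1 + max(depth(node.subject), depth(node.obj))


# "Reuters reported that Acme acquired Globex." (fictional example)
inner = Relation("acquired", "Acme", "Globex")
outer = Relation("reported", "Reuters", inner)
print(depth(inner), depth(outer))
```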

<h3>Temporal Relationship Extraction</h3>
<p>Temporal relationship extraction identifies relationships that have a temporal dimension, such as when an event occurred or the period during which a relationship was valid.</p>
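
<p>A minimal sketch of attaching a validity period to a relation, assuming the text states it with an explicit "from YEAR to YEAR" phrase. The pattern, the two verbs it covers, and the field names are illustrative assumptions; real temporal extraction also has to handle relative and implicit time expressions.</p>

```python
import re

NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
# Hypothetical pattern: "<subject> <verb> <object> from <year> to <year>".
SPAN = re.compile(
    rf"(?P<subj>{NAME}) (?P<verb>worked at|lived in) (?P<obj>{NAME}) "
    r"from (?P<start>\d{4}) to (?P<end>\d{4})"
)


def extract_temporal(text):
    """Return relations together with the years in which they held."""
    return [
        {
            "subject": m.group("subj"),
            "relation": m.group("verb"),
            "object": m.group("obj"),
            "valid_from": int(m.group("start")),
            "valid_to": int(m.group("end")),
        }
        for m in SPAN.finditer(text)
    ]


facts = extract_temporal("Alice Smith worked at Acme from 2015 to 2019.")
print(facts)
```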

<h2>Event Extraction</h2>
<p>Event extraction identifies and extracts structured representations of events from unstructured text. An event consists of a trigger (the word or phrase that signals the event) and a set of arguments that provide additional information such as the entities involved, locations, times, and causes.</p>
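
<p>The trigger-plus-arguments structure can be sketched as follows. The trigger lexicon, the role names (<code>agent</code>, <code>target</code>, <code>time</code>), and the pattern are hypothetical; they only illustrate how a trigger selects an event type and how arguments are attached to it.</p>

```python
import re
from dataclasses import dataclass, field

# Hypothetical trigger lexicon mapping trigger words to event types.
TRIGGERS = {"acquired": "Acquisition", "hired": "Hiring"}


@dataclass
class Event:
    event_type: str
    trigger: str
    arguments: dict = field(default_factory=dict)


def extract_events(sentence):
    """Find known triggers and collect the arguments around them."""
    events = []
    for trigger, event_type in TRIGGERS.items():
        pattern = (
            rf"(?P<agent>[A-Z]\w+) {trigger} (?P<target>[A-Z]\w+)"
            r"(?: in (?P<time>\d{4}))?"
        )
        m = re.search(pattern, sentence)
        if m:
            # Keep only the roles that were actually filled.
            args = {role: value for role, value in m.groupdict().items() if value}
            events.append(Event(event_type, trigger, args))
    return events


events = extract_events("Acme acquired Globex in 2021.")
print(events)
```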

<h2>Coreference Resolution</h2>
<p>Coreference resolution identifies expressions in a text that refer to the same entity. It helps NLP models understand context by linking pronouns, noun phrases, or other references back to the correct entity.</p>

<h3>Pronominal Coreference</h3>
<p>Resolving pronouns to their corresponding nouns.</p>

<h3>Nominal Coreference</h3>
<p>Resolving different noun phrases referring to the same entity.</p>

<h3>Demonstrative Coreference</h3>
<p>Resolving words like <em>this</em>, <em>that</em>, <em>these</em>, and <em>those</em> when they refer to a previously mentioned entity.</p>

<h3>Cataphoric Reference</h3>
<p>Resolving a pronoun that appears before the noun it refers to, as in "Before she left, Alice locked the door."</p>
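
<p>As a rough illustration of pronominal coreference, the sketch below links each pronoun to the most recent preceding mention that agrees in gender. The mention lexicon and the heuristic are assumptions made for this example; the heuristic is deliberately naive and is not how modern coreference systems work.</p>

```python
import re

# Hypothetical toy lexicon: candidate mentions and their gender feature.
MENTIONS = {"Alice": "f", "Bob": "m"}
PRONOUNS = {"she": "f", "her": "f", "he": "m", "him": "m", "his": "m"}


def resolve_pronouns(text):
    """Link each pronoun to the most recent preceding mention of the same gender."""
    links = []
    seen = []  # mentions encountered so far, in reading order
    for token in re.findall(r"[A-Za-z]+", text):
        if token in MENTIONS:
            seen.append(token)
        elif token.lower() in PRONOUNS:
            gender = PRONOUNS[token.lower()]
            antecedent = next(
                (m for m in reversed(seen) if MENTIONS[m] == gender), None
            )
            links.append((token, antecedent))
    return links


links = resolve_pronouns("Alice met Bob before she gave him the report.")
print(links)
```

<p>Because the sketch only looks backwards, a cataphoric pronoun such as the one in "Before she left, Alice locked the door." is left unresolved (its antecedent is <code>None</code>).</p>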

<h2>Template Filling</h2>
<p>Template filling is an information extraction task where a system identifies key pieces of information from text and fills predefined templates with relevant data. It helps convert unstructured text into structured information, making it useful for databases, reports, and AI-driven applications.</p>
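
<p>A minimal sketch of template filling for a hypothetical seminar-announcement template with three slots. The slot names and patterns are assumptions for illustration; slots with no match stay empty rather than being guessed.</p>

```python
import re

# Hypothetical template: one pattern per slot.
SLOT_PATTERNS = {
    "speaker": re.compile(r"((?:Dr\.|Prof\.) [A-Z][a-z]+ [A-Z][a-z]+) will speak"),
    "date": re.compile(r"on (\d{4}-\d{2}-\d{2})"),
    "location": re.compile(r"in (Room \d+)"),
}


def fill_template(text):
    """Fill each slot with the first matching span, or None if nothing matches."""
    template = {slot: None for slot in SLOT_PATTERNS}
    for slot, pattern in SLOT_PATTERNS.items():
        m = pattern.search(text)
        if m:
            template[slot] = m.group(1)
    return template


filled = fill_template("Prof. Jane Doe will speak on 2024-05-02 in Room 101.")
print(filled)
```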

<h2>Open Information Extraction (OpenIE)</h2>
<p>OpenIE extracts structured subject-predicate-object triples from unstructured text. Because it does not require a predefined schema, the set of relations is not fixed in advance, and it can extract facts from any domain.</p>
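
<p>The schema-free idea can be illustrated with a toy extractor that takes whatever lowercase words lie between two capitalized phrases as the predicate. This is an assumption-laden sketch, not a real OpenIE system: actual systems rely on syntactic analysis or learned models, and this pattern would wrongly treat any capitalized sentence-initial word as an argument.</p>

```python
import re

# Toy stand-in for a noun phrase: a sequence of capitalized words.
NP = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
# The predicate is not drawn from a fixed list; it is any run of lowercase words.
TRIPLE = re.compile(rf"(?P<subj>{NP}) (?P<pred>[a-z]+(?: [a-z]+)*) (?P<obj>{NP})")


def open_triples(text):
    """Return (subject, predicate, object) triples read off the surface text."""
    return [(m["subj"], m["pred"], m["obj"]) for m in TRIPLE.finditer(text)]


triples = open_triples("Alice Smith founded Acme Corp. Bob moved to New York.")
print(triples)
```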