In [1]:
import yaml
from openai import OpenAI

# Read the OpenAI API key from a local YAML secrets file.
with open('../../secrets.yaml', 'r') as f:
    secrets = yaml.safe_load(f)

OPENAI_API_KEY = secrets['OPENAI_API_KEY']
client = OpenAI(api_key=OPENAI_API_KEY)

MODEL = "gpt-4-turbo"

In [2]:
ENTITY_TYPES = '["ORGANIZATION", "PERSON", "GEO", "EVENT"]'

INPUT_TEXT = """
WASHINGTON—Joe Biden made the central purpose of his presidency clear in 
his Inauguration Day address: "We have learned again that democracy is 
precious. Democracy is fragile," he said at the U.S. Capitol, where a 
violent mob had tried that month to overturn his 2020 election victory. 
Biden’s aim would be to unify the nation and shore up its democratic 
institutions.

That is one reason why the president’s pardon of his son, Hunter Biden, 
on Sunday may further damage his already tarnished legacy: The reprieve he 
ordered threatens to undercut one of the main propositions he offered for 
his election.

Biden’s political brand as a presidential candidate—his value proposition 
as a leader—was largely a promise to restore democratic norms and to fight 
the cynicism that had helped Donald Trump build his MAGA movement on claims 
that self-dealing leaders had corrupted the government. Biden had repeatedly 
promised to respect the independence of the justice system and avoid 
interfering with the prosecution of his son, including by issuing a pardon.

His reversal "is not fully consonant with what he ran on," said Jim Kessler, 
executive vice president for policy at Third Way, a centrist Democratic group. 
While Kessler said he empathized with Biden’s impulses to protect his son, 
the pardon comes as Trump will soon retake office on promises to overhaul a 
criminal justice system that he says unfairly targeted him and his followers. 
To lead the Federal Bureau of Investigation, Trump has nominated Kash Patel, 
a loyalist who has said he would fire its senior leaders and prosecute agents 
he thinks abused their authority.
"""
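
`ENTITY_TYPES` in the cell above is a hand-written string rather than a Python list, because it is substituted verbatim into the prompt. If the types are easier to maintain as a list, `json.dumps` should render the same text. This is a minimal sketch; `entity_type_list` and `entity_types_str` are illustrative names that do not appear in the notebook.

```python
import json

# Hypothetical name: the notebook stores the types directly as a string.
entity_type_list = ["ORGANIZATION", "PERSON", "GEO", "EVENT"]

# json.dumps renders the list with double quotes and ", " separators,
# which matches the hand-written ENTITY_TYPES string.
entity_types_str = json.dumps(entity_type_list)
print(entity_types_str)
```

Keeping the list form also makes it possible to validate the `entity_type` field of each extracted record against the same source of truth.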

In [3]:
# Load the prompt template, then fill in the entity types and the input text.
with open("entity_extraction.txt", "r") as f:
    my_prompt = f.read()

my_prompt = my_prompt.format(
    entity_types=ENTITY_TYPES,
    input_text=INPUT_TEXT,
)

print(my_prompt)


-Goal-
Given a text document that is potentially relevant to this activity and a list of entity types, identify all entities of those types from the text and all relationships among the identified entities.
 
-Steps-
1. Identify all entities. For each identified entity, extract the following information:
- entity_name: Name of the entity, capitalized
- entity_type: One of the following types: ["ORGANIZATION", "PERSON", "GEO", "EVENT"]
- entity_description: Comprehensive description of the entity's attributes and activities
Format each entity as ("entity"<|><entity_name><|><entity_type><|><entity_description>)
 
2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.
For each pair of related entities, extract the following information:
- source_entity: name of the source entity, as identified in step 1
- target_entity: name of the target entity, as identified in step 1
- relationship_description: explanat
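
The cell above fills the template with `str.format`, which treats every `{` and `}` in the file as part of a replacement field. The sketch below uses a made-up inline template, not the real `entity_extraction.txt`, to show the two placeholders being filled and how a literal brace has to be doubled to survive formatting. Whether the real template contains literal braces is not visible here, so treat this as a precaution rather than a description of the file.

```python
# Illustrative stand-in for the template file; the real
# entity_extraction.txt is not reproduced here.
template = (
    "entity_type: One of the following types: {entity_types}\n"
    "Literal braces must be doubled: {{example}}\n"
    "Text: {input_text}"
)

# Named fields are replaced; "{{" and "}}" collapse to single braces.
filled = template.format(
    entity_types='["PERSON", "GEO"]',
    input_text="Joe Biden spoke at the U.S. Capitol.",
)
print(filled)
```

An undoubled literal brace in the template would typically surface as a `KeyError`, `IndexError`, or `ValueError` from `format`, which is a useful hint when a prompt file fails to load.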

In [4]:
def get_gpt_completion(prompt):
    """Send the prompt as a single user message and return the reply text."""
    completion = client.chat.completions.create(
        model=MODEL,
        messages=[
            {'role': 'user', 'content': prompt}
        ],
        temperature=0,  # keep sampling variation to a minimum
    )
    return completion.choices[0].message.content

my_completion = get_gpt_completion(my_prompt)

In [5]:
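# Drop the completion marker, then split the reply into "##"-separated records.
results = my_completion.replace("<|COMPLETE|>", "").split("##")
for result in results:
    # Strip the enclosing parentheses and split the record into its fields.
    my_tuple = result.strip()[1:-1].split("<|>")
    for text in my_tuple:
        print(text)
    print("*****************")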


"entity"
JOE BIDEN
PERSON
President of the United States who aimed to unify the nation and shore up its democratic institutions, but whose pardon of his son might undermine his presidency's purpose
*****************
"entity"
HUNTER BIDEN
PERSON
Son of Joe Biden, whose pardon by his father has raised questions about the independence of the justice system
*****************
"entity"
U.S. CAPITOL
GEO
Location where Joe Biden delivered his Inauguration Day address and where a violent mob tried to overturn the 2020 election results
*****************
"entity"
THIRD WAY
ORGANIZATION
A centrist Democratic group whose executive vice president commented on Biden's actions regarding his son's pardon
*****************
"entity"
FEDERAL BUREAU OF INVESTIGATION
ORGANIZATION
U.S. government agency involved in law enforcement, mentioned in context of potential leadership changes under Donald Trump's administration
*****************
"entity"
KASH PATEL
PERSON
Nominee by Donald Trump to lead the Federal B
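
Printing the fields is enough for inspection, but downstream code usually wants structured records. The sketch below groups records by their first field (for example `entity`) and returns the remaining fields as lists. The delimiters are taken from the parsing cell above; `parse_records` and the sample string are illustrative, and the sample is hand-written in the same shape as the output, not a real model response. No assumption is made about the field layout of relationship records.

```python
from collections import defaultdict

TUPLE_DELIMITER = "<|>"
RECORD_DELIMITER = "##"
COMPLETION_MARKER = "<|COMPLETE|>"


def parse_records(completion):
    """Group delimited records by their first field (e.g. "entity")."""
    grouped = defaultdict(list)
    body = completion.replace(COMPLETION_MARKER, "")
    for raw in body.split(RECORD_DELIMITER):
        record = raw.strip()
        if not record:
            continue  # skip empty chunks such as a trailing delimiter
        # Drop the enclosing parentheses only when both are present.
        if record.startswith("(") and record.endswith(")"):
            record = record[1:-1]
        fields = [field.strip() for field in record.split(TUPLE_DELIMITER)]
        kind = fields[0].strip('"')
        grouped[kind].append(fields[1:])
    return dict(grouped)


# Hand-written sample in the same shape as the output above.
sample = (
    '("entity"<|>JOE BIDEN<|>PERSON<|>President of the United States)\n##\n'
    '("entity"<|>THIRD WAY<|>ORGANIZATION<|>A centrist Democratic group)\n##\n'
    '<|COMPLETE|>'
)
parsed = parse_records(sample)
for name, entity_type, description in parsed["entity"]:
    print(f"{name} [{entity_type}]: {description}")
```

Checking for both parentheses before slicing avoids silently trimming the first and last characters of a malformed record, which the unconditional `[1:-1]` slice would do.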