### **Method 1: Prompt engineering**

This is the simpler method: we explicitly ask ChatGPT to return the formatted query.

In [10]:
# Define the function that extracts the formatted query from ChatGPT's response
def extract_president_info(sentence):
    # Find the index of "QUERY HERE"
    query_index = sentence.find("QUERY HERE")

    if query_index == -1:
        return None

    # Extract the substring after "QUERY HERE"
    info_str = sentence[query_index + len("QUERY HERE"):].strip()

    # Remove the outer double quotes and split the information string to extract individual components
    parts = [part.strip() for part in info_str.strip('"').split('", "')]
    
    # Assigning variables
    if len(parts) == 3:
        president_name, pres_num, begin_office = parts
        end_office = None
    elif len(parts) == 4:
        president_name, pres_num, begin_office, end_office = parts
    else:
        raise ValueError("Invalid format")

    return president_name, pres_num, begin_office, end_office
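The split on `'", "'` above depends on ChatGPT putting exactly one space after each comma. As a minimal sketch of a more tolerant alternative, the quoted line can be read with the standard `csv` module. The helper name `parse_query_line` and the constant `MARKER` are hypothetical and not part of the original notebook.

```python
import csv

MARKER = "QUERY HERE"  # hypothetical constant: the marker the prompt asks ChatGPT to emit

def parse_query_line(response):
    """Return the quoted fields that follow MARKER, or None if they are missing."""
    _, found, tail = response.partition(MARKER)
    if not found:
        return None
    # Keep only the first non-empty line after the marker
    lines = [line.strip() for line in tail.splitlines() if line.strip()]
    if not lines:
        return None
    # skipinitialspace lets the reader accept both '", "' and '","' separators
    return next(csv.reader([lines[0]], skipinitialspace=True))

print(parse_query_line('Some answer.\nQUERY HERE\n"Barack Obama","44", "2009", "2017"'))
# → ['Barack Obama', '44', '2009', '2017']
```

The returned list can then be unpacked into three or four variables exactly as `extract_president_info` does.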

In [5]:
# Formatting instructions appended to every question below
answer_to_query = """
Notice: For the answer, please just write the answer, then please add at the end of the answer in plain text - not python: "QUERY HERE" and add below the answer in the format:
"president name", "number of the president in the list of president", "Year of begin office", "Year of end office" (for the number of the president, just the number without suffixe). Please also use the same name used in the query.
""" 

#### Example 1: Barack Obama

In [6]:
question_1 = "From when to when was Barack Obama the president of the United States?"+answer_to_query
# Print the full prompt that is sent to ChatGPT
print(question_1)

From when to when was Barack Obama the president of the United States?
Notice: For the answer, please just write the answer, then please add at the end of the answer in plain text - not python: "QUERY HERE" and add below the answer in the format:
"president name", "number of the president in the list of president", "Year of begin office", "Year of end office" (for the number of the president, just the number without suffixe). Please also use the same name used in the query.



In [25]:
input_sentence = """
Barack Obama was the President of the United States from January 20, 2009, to January 20, 2017.
QUERY HERE
"Barack Obama", "44", "2009", "2017"
"""
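# Another sample response in the same format (kept for reference, not parsed below)
alternative_response = """
Barack Obama was the 44th president of the United States, and he served from January 20, 2009, to January 20, 2017.
QUERY HERE
"Barack Obama", "44", "2009", "2017"
"""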

result = extract_president_info(input_sentence)

if result:
    president_name, pres_num, begin_office, end_office = result
    print("President Name:", president_name)
    print("President Number:", pres_num)
    print("Begin Office:", begin_office)
    print("End Office:", end_office)
else:
    print("Error: 'QUERY HERE' not found in the input sentence.")

President Name: Barack Obama
President Number: 44
Begin Office: 2009
End Office: 2017


#### Example 2: William (Bill) Clinton

In [7]:
question_1 = "From when to when was William (Bill) Clinton the president of the United States?"+answer_to_query
# Print the full prompt that is sent to ChatGPT
print(question_1)

From when to when was William (Bill) Clinton the president of the United States?
Notice: For the answer, please just write the answer, then please add at the end of the answer in plain text - not python: "QUERY HERE" and add below the answer in the format:
"president name", "number of the president in the list of president", "Year of begin office", "Year of end office" (for the number of the president, just the number without suffixe). Please also use the same name used in the query.



In [27]:

input_sentence = """
William (Bill) Clinton was the President of the United States from January 20, 1993, to January 20, 2001.
QUERY HERE
"William (Bill) Clinton", "42", "1993", "2001"
"""
result = extract_president_info(input_sentence)

if result:
    president_name, pres_num, begin_office, end_office = result
    print("President Name:", president_name)
    print("President Number:", pres_num)
    print("Begin Office:", begin_office)
    print("End Office:", end_office)
else:
    print("Error: 'QUERY HERE' not found in the input sentence.")

President Name: William (Bill) Clinton
President Number: 42
Begin Office: 1993
End Office: 2001


#### Example 3: George W. Bush

In [8]:
question_1 = "From when to when was George W. Bush the president of the United States?"+answer_to_query
# Print the full prompt that is sent to ChatGPT
print(question_1)

From when to when was George W. Bush the president of the United States?
Notice: For the answer, please just write the answer, then please add at the end of the answer in plain text - not python: "QUERY HERE" and add below the answer in the format:
"president name", "number of the president in the list of president", "Year of begin office", "Year of end office" (for the number of the president, just the number without suffixe). Please also use the same name used in the query.



In [29]:

input_sentence = """
George W. Bush was the President of the United States from January 20, 2001, to January 20, 2009.
QUERY HERE
"George W. Bush", "43", "2001", "2009"
"""
result = extract_president_info(input_sentence)

if result:
    president_name, pres_num, begin_office, end_office = result
    print("President Name:", president_name)
    print("President Number:", pres_num)
    print("Begin Office:", begin_office)
    print("End Office:", end_office)
else:
    print("Error: 'QUERY HERE' not found in the input sentence.")

President Name: George W. Bush
President Number: 43
Begin Office: 2001
End Office: 2009


**Limitations:** Name handling needs more work. For example, if we write William (Bill) Clinton, ChatGPT may answer with Bill Clinton instead of the name used in the query.
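One way to detect this kind of name mismatch is to compare the name in the query with the name ChatGPT returns after light normalization. The sketch below is only a heuristic under stated assumptions: `normalize_name` and `same_president` are hypothetical helpers, and the comparison uses the surname only, so it cannot tell George W. Bush from George H. W. Bush. Checking the president number as well would be needed for that case.

```python
import re

def normalize_name(name):
    """Lowercase, drop parenthetical nicknames and punctuation, collapse spaces."""
    name = re.sub(r"\([^)]*\)", " ", name)   # remove "(Bill)" style nicknames
    name = re.sub(r"[^\w\s]", "", name)      # remove punctuation such as the dot in "W."
    return " ".join(name.lower().split())

def same_president(query_name, returned_name):
    """Heuristic check: both names end with the same surname after normalization."""
    query_tokens = normalize_name(query_name).split()
    returned_tokens = normalize_name(returned_name).split()
    if not query_tokens or not returned_tokens:
        return False
    return query_tokens[-1] == returned_tokens[-1]

print(same_president("William (Bill) Clinton", "Bill Clinton"))  # → True
print(same_president("Barack Obama", "Bill Clinton"))            # → False
```

When the check passes, the name from the query can be kept in the formatted result regardless of the variant ChatGPT chose.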

### **Method 2: Named Entity Recognition**

We will use NER (named entity recognition) from spaCy to try to create the formatted query.

In [37]:
## Install Libraries
# !pip install spacy
# !python -m spacy download en_core_web_sm

In [53]:
import spacy
import re

# Load the model once; reloading it on every call is slow
nlp = spacy.load("en_core_web_sm")

def extract_year(date_text):
    # Return the first four-digit year in a date string, or None if there is none
    match = re.search(r"\d{4}", date_text) if date_text else None
    return match.group() if match else None

def extract_president_info(text):
    doc = nlp(text)

    president_name = None
    begin_office = None
    end_office = None

    # Extract entities: the last PERSON found is kept as the name,
    # the first DATE is the start of office and the last DATE is the end
    for ent in doc.ents:
        if ent.label_ == "PERSON":
            president_name = ent.text
        elif ent.label_ == "DATE":
            if not begin_office:
                begin_office = ent.text
            else:
                end_office = ent.text

    # Keep only the year part of each date (None when no DATE entity was found)
    return president_name, extract_year(begin_office), extract_year(end_office)

#### Example 1: Barack Obama

In [56]:
question_1 = "From when to when was Barack Obama the president of the United States?"
print(question_1)

From when to when was Barack Obama the president of the United States?


In [62]:
answer_to_query = "Barack Obama was the President of the United States from January 20, 2009, to January 20, 2017."
result = extract_president_info(answer_to_query)

if any(result):  # a tuple is always truthy, so check its fields instead
    print(result)
else:
    print("Information not found.")

('Barack Obama', '2009', '2017')


#### Example 2: William (Bill) Clinton

In [63]:
question_1 = "From when to when was William (Bill) Clinton the president of the United States?"
print(question_1)

From when to when was William (Bill) Clinton the president of the United States?


In [64]:
answer_to_query = "William (Bill) Clinton was the President of the United States from January 20, 1993, to January 20, 2001."
result = extract_president_info(answer_to_query)

if any(result):  # a tuple is always truthy, so check its fields instead
    print(result)
else:
    print("Information not found.")

('Clinton', '1993', '2001')


#### Example 3: George W. Bush

In [65]:
question_1 = "From when to when was George W. Bush the president of the United States?"
print(question_1)

From when to when was George W. Bush the president of the United States?


In [66]:
answer_to_query = "George W. Bush was the President of the United States from January 20, 2001, to January 20, 2009."
result = extract_president_info(answer_to_query)

if any(result):  # a tuple is always truthy, so check its fields instead
    print(result)
else:
    print("Information not found.")

('George W. Bush', '2001', '2009')


**Limitations:** We need to be careful not to put too much or too little information in the question. Depending on how it is asked, ChatGPT might answer differently, e.g. including or omitting the president number.
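When the answer does mention the president number, for instance "Barack Obama was the 44th president of the United States", it can be picked up with a small regular expression, with no dependence on spaCy. This is a sketch under the assumption that the number appears as an ordinal directly before the word "president". The helper name `find_president_number` is hypothetical.

```python
import re

# An ordinal such as "44th" or "1st" immediately followed by the word "president"
ORDINAL_PRESIDENT = re.compile(r"\b(\d{1,2})(?:st|nd|rd|th)\s+president\b", re.IGNORECASE)

def find_president_number(answer):
    """Return the president number as a string, or None if the answer omits it."""
    match = ORDINAL_PRESIDENT.search(answer)
    return match.group(1) if match else None

with_number = "Barack Obama was the 44th president of the United States, and he served from January 20, 2009, to January 20, 2017."
without_number = "Barack Obama was the President of the United States from January 20, 2009, to January 20, 2017."

print(find_president_number(with_number))     # → 44
print(find_president_number(without_number))  # → None
```

Returning `None` when the number is missing keeps the result shape stable whichever phrasing ChatGPT uses.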