# Extracting quotes using OpenAI and Citron

The script below is a short demonstration of how OpenAI's chat completion model can be used to extract quotes from news articles. As we will see, the model is fairly accurate and picks up indirect quotes as well as direct ones, but it is difficult to get its output in a reliably structured format.

Citron, on the other hand, returns well-structured output, but it is less accurate when it comes to naming the person being quoted.

Part 1. OpenAI

In [17]:
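import os
import time  # needed for time.sleep() in generate_response

import openai

# Note: openai.ChatCompletion (used below) is the interface of openai versions before 1.0.
api_key = 'xxxx'  # Placeholder; replace with your own API key
GPT_MODEL = "gpt-3.5-turbo"
openai.api_key = api_key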

In [23]:
def generate_response(article_body):
    """
    Generates a response by using OpenAI Chat Completion API to process the given article body.

    Parameters:
    - article_body (str): The body of the article.

    Returns:
    - str: The generated response, or an error message if every attempt fails.

    Note:
    - Exceptions raised by the API call are caught and printed, not re-raised.
    """
    retries = 1  # Number of retries in case of a disconnection
    for retry in range(1, retries + 2):
        try:
            response = openai.ChatCompletion.create(
                model=GPT_MODEL,
                messages=[
                    {"role": "system", "content": """You are a journalist finding direct & indirect quotes from news articles. Your answer should be in the form of a python dictionary "{Person:Quote}". The quotes should match the article exactly"""},
                    {"role": "user", "content": f"Body: {article_body}"}
                ],
                max_tokens=500,
                temperature=0.5
            )
            return response.choices[0].message['content']
        except Exception as e:
            print(f"An error occurred: {str(e)}")
            if retry <= retries:
                print(f"Retrying in 2 seconds... (Attempt {retry})")
                time.sleep(2)
            else:
                print("Max retries reached. Skipping this article.")
    return "Error: Unable to generate response"
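
The retry loop in `generate_response` can also be written as a small reusable helper, which makes it easier to test without calling the API. The block below is only a sketch under assumptions: `call_with_retries`, its parameters, and the stand-in `flaky` function are hypothetical names introduced for illustration and are not part of the original notebook or of the `openai` package.

```python
import time


def call_with_retries(func, retries=1, delay=2.0, sleep=time.sleep):
    """Call func(); on an exception, wait `delay` seconds and try again.

    Hypothetical helper: makes one initial attempt plus up to `retries` retries.
    Returns func()'s result, or None if every attempt failed.
    """
    for attempt in range(1, retries + 2):
        try:
            return func()
        except Exception as exc:
            print(f"An error occurred: {exc}")
            if attempt <= retries:
                print(f"Retrying in {delay} seconds... (Attempt {attempt})")
                sleep(delay)
    return None


# Stand-in for an API call that fails once and then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated disconnection")
    return "ok"


# The sleep function is replaced by a no-op so the example runs instantly.
result = call_with_retries(flaky, retries=1, delay=0, sleep=lambda seconds: None)
print(result)
```

Passing the sleep function in as a parameter is a design choice made here so that the waiting can be switched off in tests; the notebook's own function calls `time.sleep` directly.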

In [24]:
article = """Title: ‘I warned Muscat against high-rise buildings. I got silence’
Description: Trident Park and the Brewhouse designed to avoid compromising the island’s skyline and historic sightlines between Valletta and Mdina

Celebrated British architect IAN RITCHIE, involved in projects such as the Louvre’s glass pyramid in Paris and now Farsons’ Trident Park, tells Fiona Galea Debono about what he learned from 10 years in Malta.

The world-renowned architect behind Farsons’ low-lying Trident Park and regenerated Brewhouse believes the day will come when high-rise buildings will be torn down as he contemplates the “disfiguration” of Malta.

It has been a 10-year journey on the island for Ian Ritchie, who has worked on the newly inaugurated office space and iconic brewery transformation since 2014, and who now weighs in on the direction the country has headed along the way.

Ritchie recalls alerting former prime minister Joseph Muscat to the impact of high-rise structures in the vicinity of the Farsons project in a meeting about the improvement of Mrieħel through a master plan for the industrial area.

Trident Park and the Brewhouse were designed to avoid compromising the island’s skyline and historic sightlines between Valletta and Mdina.

“I presented to Muscat the view from Hastings Gardens towards Mdina, saying high-rise buildings would destroy one of the world’s great views.”

But that message seemed “contrary to the economic drive of those in charge of the country at the time”, Ritchie relates.

What he got in return from Muscat was “silence”."""

response = generate_response(article)
print(response)


{"Ian Ritchie": "I presented to Muscat the view from Hastings Gardens towards Mdina, saying high-rise buildings would destroy one of the world’s great views.", "Ian Ritchie": "But that message seemed “contrary to the economic drive of those in charge of the country at the time”", "Ian Ritchie": "What he got in return from Muscat was “silence”."}
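
The output above shows why the structured format is a problem: the same key, "Ian Ritchie", appears three times, so loading the string into an ordinary Python dictionary would keep only the last quote. One possible workaround, sketched below, is to parse the response with `json.loads` and its `object_pairs_hook` argument, which receives every key/value pair in order. The sketch assumes the model returned valid JSON with double-quoted strings, which is not guaranteed. The helper names `parse_quote_pairs` and `is_verbatim` and the sample strings are hypothetical and were introduced for illustration.

```python
import json


def parse_quote_pairs(raw):
    """Return the response as a list of (speaker, quote) pairs, keeping repeated speakers.

    Returns an empty list when the text is not a valid JSON object.
    """
    if not raw.strip().startswith("{"):
        return []
    try:
        # object_pairs_hook=list keeps every pair instead of collapsing duplicate keys.
        return json.loads(raw, object_pairs_hook=list)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return []


def is_verbatim(quote, article_text):
    """Check that the quote appears character-for-character in the article."""
    return quote in article_text


# Shortened, made-up sample in the same shape as the model output above.
sample = '{"Ian Ritchie": "first quote", "Ian Ritchie": "second quote"}'
article_text = "He gave a first quote and nothing else."

pairs = parse_quote_pairs(sample)
for speaker, quote in pairs:
    print(speaker, "|", quote, "|", is_verbatim(quote, article_text))

# For comparison, a plain dict silently drops the first quote.
print(json.loads(sample))
```

The `is_verbatim` check is a simple substring test; it relates to the instruction in the system prompt that quotes should match the article exactly, and it would flag any quote the model has paraphrased.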


Part 2. Using Citron

In [20]:
import sys
sys.path.append(r"C:\Users\grupp\Python Files\citron-main")

from citron.citron import Citron
from citron import utils

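# get_parser() can raise a ValueError when this cell is run again in the same session;
# handle it here and fall back to loading the spaCy model directly.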
try:
    nlp = utils.get_parser()
except ValueError:
    # If 'to_json' extension is already set, reload the 'en_core_web_sm' model
    import spacy
    nlp = spacy.load('en_core_web_sm')

citron = Citron(r"C:\Users\grupp\Python Files\citron-main\models\en_2021-11-15", nlp)


2023-07-16 13:00:33 INFO utils: Loading spacy model
2023-07-16 13:00:34 INFO citron: Loading Citron model: C:\Users\grupp\Python Files\citron-main\models\en_2021-11-15
https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations


In [21]:
# Define the text you want to analyze
text = """Title: ‘I warned Muscat against high-rise buildings. I got silence’
Description: Trident Park and the Brewhouse designed to avoid compromising the island’s skyline and historic sightlines between Valletta and Mdina

Celebrated British architect IAN RITCHIE, involved in projects such as the Louvre’s glass pyramid in Paris and now Farsons’ Trident Park, tells Fiona Galea Debono about what he learned from 10 years in Malta.

The world-renowned architect behind Farsons’ low-lying Trident Park and regenerated Brewhouse believes the day will come when high-rise buildings will be torn down as he contemplates the “disfiguration” of Malta.

It has been a 10-year journey on the island for Ian Ritchie, who has worked on the newly inaugurated office space and iconic brewery transformation since 2014, and who now weighs in on the direction the country has headed along the way.

Ritchie recalls alerting former prime minister Joseph Muscat to the impact of high-rise structures in the vicinity of the Farsons project in a meeting about the improvement of Mrieħel through a master plan for the industrial area.

Trident Park and the Brewhouse were designed to avoid compromising the island’s skyline and historic sightlines between Valletta and Mdina.

“I presented to Muscat the view from Hastings Gardens towards Mdina, saying high-rise buildings would destroy one of the world’s great views.”

But that message seemed “contrary to the economic drive of those in charge of the country at the time”, Ritchie relates.

What he got in return from Muscat was “silence”."""


In [30]:
# Use the `extract` method to analyze the text
result = citron.extract(text)

# Alternative (not used below): `get_quotes` works on a spaCy Doc object rather than
# a plain string, so the text would first need to be parsed:
doc = nlp(text)

# Print the extracted quotes from the result
quotes = result['quotes']
for quote in quotes:
    sources_text = [source['text'] for source in quote['sources']]
    contents_text = [content['text'] for content in quote['contents']]
    print(f"Quote: {' '.join(contents_text)}, Speaker: {' '.join(sources_text)}")


Quote: the day will come when high-rise buildings will be torn down as he contemplates the “disfiguration” of Malta, Speaker: The world-renowned architect behind Farsons
Quote: alerting former prime minister Joseph Muscat to the impact of high-rise structures in the vicinity of the Farsons project in a meeting about the improvement of Mrieħel through a master plan for the industrial area, Speaker: Ian Ritchie
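
Because Citron already returns nested dictionaries, its output can be flattened into one record per quote with very little code. The sketch below assumes only the structure that the loop above relies on (`result['quotes']`, each quote holding `sources` and `contents` lists whose items have a `text` field). The function name `quotes_to_records` and the hand-written `sample_result` are hypothetical and stand in for a real `citron.extract(text)` result.

```python
def quotes_to_records(result):
    """Flatten a Citron-style result into a list of {"speaker", "quote"} dicts."""
    records = []
    for quote in result.get("quotes", []):
        # A quote may have several source or content spans; join them with spaces.
        speaker = " ".join(source["text"] for source in quote.get("sources", []))
        content = " ".join(part["text"] for part in quote.get("contents", []))
        records.append({"speaker": speaker, "quote": content})
    return records


# Hand-written stand-in that mimics the structure used in the cell above.
sample_result = {
    "quotes": [
        {
            "sources": [{"text": "Ian Ritchie"}],
            "contents": [{"text": "alerting former prime minister Joseph Muscat"}],
        }
    ]
}

records = quotes_to_records(sample_result)
print(records)
```

Unlike the dictionary returned by the chat model, a list of records can hold several quotes from the same speaker without any of them being overwritten.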


Part 3. More complex articles and topic extraction