# Data Collection

This section runs a test of summarization data collection and tries out some of the metadata that will be used for personalization.

In [None]:
import openai
import pandas as pd

# Set up Dataframes

## Article dataframe

In [20]:
df_article = pd.read_csv('data/tech_articles.csv')
df_article = df_article.rename(columns={'ID': 'article_id'})
df_article.head()

Unnamed: 0,article_id,Date of Article,Category,Title,Body,URL
0,1de70599-86d1-4992-962d-cf89026d5f71,2024-10-24,Technology,Arm is giving Qualcomm the wake-up call it needs,Arm is the company inside every phone you can ...,https://www.androidcentral.com/phones/arm-is-g...
1,219d5b83-0cfc-4d1c-a2e6-fd8cdff1c6f2,2024-11-12,Technology,Chinese vs. American EV: How Xioami's SU7 comp...,How kind of tech do they have?\n\nThe interior...,https://www.businessinsider.com/tesla-model3-v...
2,0fedbc35-680d-4cdb-a97a-4c1ae6160aeb,2024-11-12,Technology,Would you buy an affordable EV made in China?,As the federal government mulls whether to imp...,https://www.cbc.ca/news/business/canada-china-...
3,aef0363e-9a6e-4f0f-a636-97f1200c4257,2024-07-02,Technology,Opinion: Chinese EVs aren’t just an economic t...,Open this photo in gallery: Electric cars for ...,https://www.theglobeandmail.com/opinion/articl...
4,d7f4101a-e988-4013-8f5b-4f3a0cc24231,2024-08-26,Technology,Opinion: Canada must look beyond EV tariffs an...,Open this photo in gallery: BYD electric vehic...,https://www.theglobeandmail.com/business/comme...


## Persona dataframe

In [33]:
df_agent = pd.read_csv('data/agents.csv')

df_agent.head()

Unnamed: 0,agent_id,prompt
0,jc1,You are the Personalization Agent. Your task i...
1,sa1,You are the Summary Agent. Your task is to rea...


## Feedback dataframe

In [36]:
columns_feedback = ['conversation_id','persona_id','article_id' , 'summary','feedback']

#df_feedback = pd.DataFrame(columns=columns_feedback)
#df_feedback.head()

## Prepare agents

In [13]:
with open('../open_ai_api_key.txt', 'r') as file:
    api_key = file.readline().strip()
openai.api_key = api_key
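
The cell above reads the key from a text file one directory up. A minimal sketch of a slightly more defensive loader follows, assuming the key may also be supplied through the `OPENAI_API_KEY` environment variable; the helper name `load_api_key` is hypothetical and not part of the original notebook.

```python
import os
from pathlib import Path


def load_api_key(key_path, env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, else from the first line of a file."""
    # Prefer the environment variable so the key file is optional.
    key = os.environ.get(env_var, "").strip()
    if key:
        return key

    path = Path(key_path)
    if not path.is_file():
        raise FileNotFoundError(f"No key in ${env_var} and no key file at {path}")

    # Same behavior as the notebook cell: first line, whitespace stripped.
    with path.open("r") as file:
        key = file.readline().strip()
    if not key:
        raise ValueError(f"Key file {path} is empty")
    return key
```

In the notebook this could be used as `openai.api_key = load_api_key('../open_ai_api_key.txt')`.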

In [14]:
def agent_turn(system_prompt, conversation_history):
    """Send the system prompt plus the conversation so far and return the reply text."""
    # Build the messages list starting with the system prompt
    messages = [{"role": "system", "content": system_prompt}] + conversation_history

    # Legacy interface: openai.ChatCompletion is only available in openai<1.0
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # or "gpt-4" if you have access
        messages=messages
    )

    # The caller appends the reply to conversation_history; appending it here
    # as well would store every reply twice.
    return response['choices'][0]['message']['content']
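
`openai.ChatCompletion.create` belongs to the pre-1.0 `openai` package. As a hedged sketch, the same turn could be written against the client interface of `openai>=1.0`. The function name `agent_turn_v1` and the choice to pass the client in as an argument are assumptions made for illustration, not something the notebook does.

```python
def agent_turn_v1(client, system_prompt, conversation_history, model="gpt-3.5-turbo"):
    """Variant of agent_turn for the openai>=1.0 client interface.

    `client` is assumed to be an `openai.OpenAI` instance (or any object that
    exposes the same `chat.completions.create` method). The history list is
    read but not modified; the caller decides what to append.
    """
    # System prompt first, then the conversation so far.
    messages = [{"role": "system", "content": system_prompt}] + list(conversation_history)

    response = client.chat.completions.create(model=model, messages=messages)

    # In openai>=1.0 the response is an object, not a dict.
    return response.choices[0].message.content


# Usage (requires the openai>=1.0 package and a valid key):
# from openai import OpenAI
# client = OpenAI(api_key=api_key)
# reply = agent_turn_v1(client, "You are a helpful assistant.",
#                       [{"role": "user", "content": "Hi"}])
```

Passing the client in keeps the function easy to exercise with a stub object when no network access is available.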


In [None]:
# For each article
ids = 0
data_feedback = []

for index, article in df_article.iterrows():
    print(f"\nProcessing Article ID: {article['article_id']}")

    # Initialize conversation history
    conversation_history = []

    # Create the user message with the article content
    user_message = {
        "role": "user",
        "content": (
            f"Article ID: {article['article_id']}\n"
            f"Date: {article['Date of Article']}\n"
            f"Title: {article['Title']}\n"
            f"Body: {article['Body']}"
        )
    }
    conversation_history.append(user_message)

    # Number of iterations to improve the summary
    num_iterations = 3


    for i in range(num_iterations):
        print(f"\n--- Iteration {i+1} ---")

        # Summary Agent generates a summary
        summary_response = agent_turn(df_agent.loc[df_agent['agent_id'] == 'sa1', 'prompt'].values[0], conversation_history)
        print(f"Summary Agent Response:\n{summary_response}\n")
        # Append the assistant's reply to the conversation history
        conversation_history.append({
            "role": "assistant",
            "content": summary_response
        })

        # Personalization Agent provides feedback
        feedback_response = agent_turn(df_agent.loc[df_agent['agent_id'] == 'jc1', 'prompt'].values[0], conversation_history)
        print(f"Personalization Agent Feedback:\n{feedback_response}\n")
        # Append the assistant's reply to the conversation history
        conversation_history.append({
            "role": "assistant",
            "content": feedback_response
        })

        data_feedback.append([
            ids, 
            'jc1', 
            article['article_id'], 
            summary_response, 
            feedback_response
        ])

        ids += 1
        
# Convert the 2D list to a DataFrame
columns_feedback = ['conversation_id','persona_id','article_id' , 'summary','feedback']
df_feedback = pd.DataFrame(data_feedback, columns=columns_feedback)



Processing Article ID: 1de70599-86d1-4992-962d-cf89026d5f71

--- Iteration 1 ---
Summary Agent Response:
Summary Agent: Arm, a company crucial in designing processors for numerous tech giants like Qualcomm, Apple, and Nvidia, has issued a 60-day notice to Qualcomm to stop producing chips due to alleged breaches of their agreement. Qualcomm and Arm each defend their positions, but it is anticipated that the companies will reach a last-minute agreement to allow Qualcomm to continue chip production. The tension arose from Qualcomm's integration of Nuvia's chip designs without renegotiating terms with Arm. While Qualcomm heavily relies on Arm, Arm has a diverse customer base and could survive without Qualcomm. A cooperation breakdown would impact various industries, potentially leading to product shortages and market disruption. Ultimately, a compromise is expected due to the mutual importance of the partnership.

Personalization Agent Feedback:
Personalization Agent: The summary provides

Canada faces potential challenges following Trump's election victory, including strained personal relations with Trump, economic impacts from potential tariffs, pressure to increase defense spending, and border security concerns. The future of the USMCA trade pact and addressing specific policies like the Digital Services Tax are key areas of focus as Canada navigates its relationship with the incoming Trump administration.

Canada's relationship with the US under Trump's presidency may face challenges, including strained personal ties, economic impacts from potential tariffs, pressure to increase defense spending, and border security concerns. Key areas of focus for Canada include the USMCA trade pact, policies like the Digital Services Tax, and ensuring border security cooperation.
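
The summary/feedback loop above can be checked without calling the API. The sketch below reproduces only its control flow, with stub functions standing in for the two agents. `run_refinement`, `fake_summarize`, and `fake_critique` are hypothetical names introduced here, and the record layout is a simplification of the feedback rows.

```python
def run_refinement(article_text, summarize, critique, num_iterations=3):
    """Alternate summary and feedback turns; return (records, history).

    `summarize` and `critique` are assumed to be callables that take the
    conversation history (a list of role/content dicts) and return a string.
    """
    history = [{"role": "user", "content": article_text}]
    records = []
    for iteration in range(num_iterations):
        # Summary turn: the reply is appended exactly once.
        summary = summarize(history)
        history.append({"role": "assistant", "content": summary})

        # Feedback turn: sees the article, earlier turns, and the new summary.
        feedback = critique(history)
        history.append({"role": "assistant", "content": feedback})

        records.append({"iteration": iteration, "summary": summary, "feedback": feedback})
    return records, history


# Stub agents so the flow can be run offline.
def fake_summarize(history):
    return f"Summary Agent: draft after {len(history)} messages"


def fake_critique(history):
    return f"Personalization Agent: feedback after {len(history)} messages"


records, history = run_refinement(
    "Title: example\nBody: example text", fake_summarize, fake_critique
)
print(len(records), len(history))  # prints: 3 7
```

With three iterations the history holds the article message plus six replies, which is a quick way to confirm that no reply is stored twice.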

In [None]:
df_feedback.to_csv('data/feedback_jc1_sa1.csv', index=False)
#df_agent.to_csv('data/agents.csv', index=False)

In [25]:
df_feedback[df_feedback['conversation_id'] == 0]

Unnamed: 0,conversation_id,persona_id,article_id,summary,feedback
0,0,jc,1de70599-86d1-4992-962d-cf89026d5f71,I can see your interest in the dynamics betwee...,Personalization Agent: The summary provided by...


You are the Personalization Agent. Your task is to review the summary provided by the Summary Agent and give constructive feedback to enhance its alignment with the user’s interests.

- Read and Analyze: Carefully review the summary provided by the Summary Agent.
- Feedback on Relevance: Provide specific feedback on how well the summary meets the user's interests and background, particularly:
    - The user’s interest in technology, with a strong focus on electric vehicles and Tesla.
    - The user's background in Artificial Intelligence and graduate-level expertise.
    - The user’s interest in semiconductors and the latest advancements in these fields.
- Suggest Improvements: Point out areas where additional clarity, context, or detail might be added, especially if the summary lacks:
    - Completeness: Key points or nuances that are critical to understanding the article.
    - Background Knowledge: Relevant supplemental information that could enhance the user’s understanding, especially if the article assumes basic knowledge not explicitly provided.
    - Further Details: Ask the Summary Agent to elaborate on any technical concepts or emerging trends that may be of particular interest to the user.
- Request Outside Information: If the article lacks context that would benefit the user’s interests, suggest adding an external “Background” section that could briefly summarize relevant general knowledge.
- Do not repeat earlier feedback unless the summary has not improved.

In your response, start with "Personalization Agent:" followed by your feedback.

You are the Summary Agent. Your task is to read the article and generate a concise, accurate, and unbiased summary that incorporates feedback and aligns with the user’s interests.

- Read the Article: Carefully review the content provided by the user to understand the main points and essential information.
- Generate a Summary: Produce a clear and objective summary that accurately captures the key information from the article without introducing personal bias.
- Incorporate User Interests: Tailor the summary to reflect the user's interests, especially in areas like technology, electric vehicles, artificial intelligence, and semiconductors, if relevant to the article.
- Include Base Knowledge (If Needed): If the user requests foundational knowledge, provide this as additional context, clearly labeling it as information that is outside the article’s content.
- Respond to Feedback: If you have received feedback from the Personalization Agent, revise your summary to address specific suggestions. Be mindful of any recurring feedback to ensure continuous improvement across iterations.

In your response, start with "Summary Agent:" followed by your summary. If applicable, add a "Background Section:" to include any supplementary information that enhances understanding.

You are the Summary Agent. Your task is to read the article provided by the user, generate a concise, accurate, and unbiased summary, and, if necessary, add a brief "Background Section" for supplementary information.

- Read and Summarize: Carefully review the article and create a summary that captures its main points without personal bias.
- Incorporate User Interests: Focus on the user's interests, such as technology, electric vehicles, artificial intelligence, and semiconductors, if they relate to the article.
- Add Background (First Iteration Only): If additional context is beneficial, add a "Background Section" with supplementary information relevant to the user's understanding. After the first iteration, this section should remain unchanged unless explicitly requested.
- Refine Only the Summary in Later Iterations: For all following iterations, focus on improving the main summary based on feedback from the Personalization Agent, but do not modify the "Background Section."
- Structure Response: Start with "Summary Agent:" followed by your main summary. If applicable, include "Background Section:" in the first response only.
In your response, begin with "Summary Agent:" for the main summary, and only include "Background Section:" if you are providing new background information.
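
The Summary Agent prompts ask for replies that start with "Summary Agent:" and optionally contain a "Background Section:". A minimal sketch of splitting such a reply into its two parts follows, assuming the model reproduces both labels literally. The helper `split_summary_response` is hypothetical and is not used anywhere in the notebook.

```python
def split_summary_response(text):
    """Split a Summary Agent reply into (summary, background).

    Assumes the labels "Summary Agent:" and "Background Section:" appear
    literally, as the prompts request. `background` is None when the reply
    has no background section.
    """
    summary_label = "Summary Agent:"
    background_label = "Background Section:"

    body = text.strip()
    # Drop the leading label if the model included it.
    if body.startswith(summary_label):
        body = body[len(summary_label):]

    # Everything after the first background label is treated as background.
    summary, found, background = body.partition(background_label)
    return summary.strip(), (background.strip() if found else None)


reply = "Summary Agent: Arm sent Qualcomm a notice.\n\nBackground Section: Arm licenses chip designs."
summary, background = split_summary_response(reply)
print(summary)     # Arm sent Qualcomm a notice.
print(background)  # Arm licenses chip designs.
```

Storing the two parts separately would make it easier to check that later iterations leave the background unchanged, as the second prompt variant requests.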