## Welcome to Lab 3 for Week 1 Day 4

Today we're going to build something with immediate value!

In the folder `me` I've put a single file `linkedin.pdf` - it's a PDF download of my LinkedIn profile.

Please replace it with yours!

I've also made a file called `summary.txt` in the same folder.

We're not going to use Tools just yet - we'll add them tomorrow.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/tools.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Looking up packages</h2>
            <span style="color:#00bfff;">In this lab, we're going to use the wonderful Gradio package for building quick UIs, 
            and we're also going to use the popular PyPDF PDF reader. You can get guides to these packages by asking 
            ChatGPT or Claude, and you can find open-source Python packages on the PyPI repository <a href="https://pypi.org">https://pypi.org</a>.
            </span>
        </td>
    </tr>
</table>

In [10]:
# If you don't know what any of these packages do - you can always ask ChatGPT for a guide!

import os
from dotenv import load_dotenv
from openai import OpenAI
from pypdf import PdfReader
# Gradio is a Python library for creating web-based UIs and demos for machine learning models and data science workflows
import gradio as gr

In [2]:
load_dotenv(override=True)
openai = OpenAI()

In [3]:
reader = PdfReader("me/linkedin.pdf")
linkedin = ""
for page in reader.pages:
    text = page.extract_text()
    if text:
        linkedin += text

Ignoring wrong pointing object 32 0 (offset 0)
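
The `if text:` guard in the loop above matters because a page with no extractable text (a scanned image, for example) can yield an empty result. Here is a minimal sketch of the same loop as a reusable function. `collect_text` and `FakePage` are illustrative names, not part of pypdf, and the stand-in pages let it run without a PDF. Unlike the cell above, this version separates pages with a newline.

```python
def collect_text(pages, separator="\n"):
    """Join the text of every page that has any, skipping empty pages."""
    chunks = []
    for page in pages:
        text = page.extract_text()
        if text:  # skips None and "" (e.g. image-only pages)
            chunks.append(text)
    return separator.join(chunks)


class FakePage:
    """Hypothetical stand-in for a pypdf page, for illustration only."""

    def __init__(self, text):
        self._text = text

    def extract_text(self):
        return self._text


pages = [FakePage("Page one"), FakePage(""), FakePage(None), FakePage("Page two")]
print(collect_text(pages))
```

With a real file, `collect_text(PdfReader("me/linkedin.pdf").pages)` should give the same text as the cell above, apart from the added newlines between pages.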


In [4]:
print(linkedin)

Federico Tognetti 
Via Brigata Granatieri di Sardegna 1 
36061, Bassano del Grappa (VI) 
+39 3485459877 
Federico.Tognetti@gmail.com 
Italian 
24.07.1978 - Legnago (VR) 
Male 
Strategy / Business Development 
February 2009 - Present 
DIESEL - Group Strategic Planning Manager 
Collaboration with Group CEO in deﬁning strategies, programs and plans. Teamwork with 
top line management to grant  execution of primarily business initiatives 
Member of the Group Leadership Team 
Project Leader of organisational, development and efﬁciency projects such as: 
• Global Supply Chain optimisation 
• Group Brand Plan 
• European Countries Regionalisation 
• Group Retail Plan 
• Global Customers Relationship Management (CRM) program implementation 
• In store Category Management implementation 
• Business Control Model review 
• Global Off-price products optimisation 
• USA Denim production platform settlement 
• Stores proﬁtability evaluation model and many others 
April 2005 - January 2009 
AUTOGRIL

In [6]:
# Open the summary.txt file in read mode with UTF-8 encoding to handle special characters
with open("me/summary.txt", "r", encoding="utf-8") as f:
    summary = f.read()

print(summary)

My name is Federico Tognetti. I'm 47 and I'm father of three children. My passions are music and technology. I'm a preatty advance guitar player and in my youth I founded an heavy metal band called Kronos. I also love motorbikes and I'm the owner of a wonderful Ducati 959 Panigale. I like beers too, my favourites are Indian Pale Ale and American Pale Ale.


In [7]:
name = "Federico Tognetti"

In [8]:
system_prompt = f"You are acting as {name}. You are answering questions on {name}'s website, \
particularly questions related to {name}'s career, background, skills and experience. \
Your responsibility is to represent {name} for interactions on the website as faithfully as possible. \
You are given a summary of {name}'s background and LinkedIn profile which you can use to answer questions. \
Be professional and engaging, as if talking to a potential client or future employer who came across the website. \
If you don't know the answer, say so."

system_prompt += f"\n\n## Summary:\n{summary}\n\n## LinkedIn Profile:\n{linkedin}\n\n"
system_prompt += f"With this context, please chat with the user, always staying in character as {name}."
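
The prompt above has three parts: role instructions, the context sections, and a closing reminder to stay in character. One possible way to make that structure explicit is to wrap it in a function. This is only a sketch: `build_system_prompt` is a hypothetical helper, the instruction text is shortened, and the inputs are placeholders.

```python
def build_system_prompt(name, summary, linkedin):
    """Assemble role instructions, context sections, and a closing reminder."""
    instructions = (
        f"You are acting as {name}. You are answering questions on {name}'s website. "
        "If you don't know the answer, say so."
    )
    context = f"## Summary:\n{summary}\n\n## LinkedIn Profile:\n{linkedin}"
    closing = f"With this context, please chat with the user, always staying in character as {name}."
    return "\n\n".join([instructions, context, closing])


# Placeholder inputs, not real profile data
prompt = build_system_prompt("Ada Example", "Placeholder summary.", "Placeholder profile.")
print(prompt)
```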


In [9]:
system_prompt

"You are acting as Federico Tognetti. You are answering questions on Federico Tognetti's website, particularly questions related to Federico Tognetti's career, background, skills and experience. Your responsibility is to represent Federico Tognetti for interactions on the website as faithfully as possible. You are given a summary of Federico Tognetti's background and LinkedIn profile which you can use to answer questions. Be professional and engaging, as if talking to a potential client or future employer who came across the website. If you don't know the answer, say so.\n\n## Summary:\nMy name is Federico Tognetti. I'm 47 and I'm father of three children. My passions are music and technology. I'm a preatty advance guitar player and in my youth I founded an heavy metal band called Kronos. I also love motorbikes and I'm the owner of a wonderful Ducati 959 Panigale. I like beers too, my favourites are Indian Pale Ale and American Pale Ale.\n\n## LinkedIn Profile:\nFederico Tognetti \nVia

In [None]:
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
# Initialize an OpenAI client configured to use DeepSeek's API endpoint instead of OpenAI's default
# This lets us use the OpenAI client library interface while connecting to DeepSeek's models
deepseek = OpenAI(
    api_key=deepseek_api_key,  # Authentication key for Deepseek's API
    base_url="https://api.deepseek.com/v1"  # Override default OpenAI URL with Deepseek's endpoint
)
model_name = "deepseek-chat"
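
`os.getenv` returns `None` when a variable is missing, so a forgotten key tends to show up later as a less obvious error. A small sketch of failing early instead follows. `require_env` is a hypothetical helper, and `EXAMPLE_API_KEY` is a throwaway name used only for the demonstration.

```python
import os


def require_env(var_name):
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(var_name)
    if not value:
        raise RuntimeError(f"{var_name} is not set; add it to your .env file")
    return value


# Demonstration with a throwaway variable (hypothetical, not a real key)
os.environ["EXAMPLE_API_KEY"] = "sk-placeholder"
print(require_env("EXAMPLE_API_KEY"))
```

In the cell above, this could replace the plain `os.getenv('DEEPSEEK_API_KEY')` call if you prefer an immediate, readable failure.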

In [12]:
def chat(message, history):
    # Construct the messages array by combining:
    # 1. The system prompt that defines the assistant's role and context
    # 2. The chat history from previous messages
    # 3. The new user message being sent
    messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": message}]
    response = deepseek.chat.completions.create(model=model_name, messages=messages)
    return response.choices[0].message.content

## Special note for people not using OpenAI

Some providers, like Groq, might give an error when you send your second message in the chat.

This is because Gradio shoves some extra fields into the history object. OpenAI doesn't mind, but some other providers complain.

If this happens, the solution is to add this first line to the chat() function above. It cleans up the history variable:

```python
history = [{"role": h["role"], "content": h["content"]} for h in history]
```

You may need to add this in other chat() callback functions in the future, too.
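
To see what the cleanup does, here is a small sketch that runs on a hand-written history. The extra `metadata` and `options` keys are assumed for illustration and may not match exactly what your Gradio version adds. `clean_history` is a hypothetical wrapper around the one-liner above.

```python
def clean_history(history):
    """Keep only the role and content fields of each history entry."""
    return [{"role": h["role"], "content": h["content"]} for h in history]


# Illustrative history; the extra keys are assumed, not taken from Gradio's docs
history = [
    {"role": "user", "content": "Hi", "metadata": None, "options": None},
    {"role": "assistant", "content": "Hello!", "metadata": None, "options": None},
]
print(clean_history(history))
```

The function builds a new list, so the history that Gradio passed in is left untouched.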

In [None]:
gr.ChatInterface(chat, type="messages").launch() # Launch the chat interface with Gradio

* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.




## A lot is about to happen...

1. Be able to ask an LLM to evaluate an answer
2. Be able to rerun if the answer fails evaluation
3. Put this together into 1 workflow

All without any Agentic framework!

In [14]:
# Create a Pydantic model for the Evaluation

# Pydantic models are data validation classes that:
# - Define expected data types and structure
# - Automatically validate data at runtime
# - Convert input data to the declared types
# - Provide clear error messages for invalid data

from pydantic import BaseModel

class Evaluation(BaseModel):
    is_acceptable: bool
    feedback: str
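
To see the validation described in the comments above in action, here is a minimal sketch. It assumes Pydantic v2 (for `model_validate_json`), and the JSON strings are made-up examples, not real model output.

```python
from pydantic import BaseModel, ValidationError


class Evaluation(BaseModel):
    is_acceptable: bool
    feedback: str


# Well-formed JSON is parsed into a typed object
good = Evaluation.model_validate_json('{"is_acceptable": false, "feedback": "Too vague."}')
print(good.is_acceptable, good.feedback)

# A missing field is rejected with a descriptive error
try:
    Evaluation.model_validate_json('{"is_acceptable": true}')
except ValidationError as error:
    print(f"Rejected with {len(error.errors())} validation error(s)")
```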


In [15]:
evaluator_system_prompt = f"You are an evaluator that decides whether a response to a question is acceptable. \
You are provided with a conversation between a User and an Agent. Your task is to decide whether the Agent's latest response is acceptable quality. \
The Agent is playing the role of {name} and is representing {name} on their website. \
The Agent has been instructed to be professional and engaging, as if talking to a potential client or future employer who came across the website. \
The Agent has been provided with context on {name} in the form of their summary and LinkedIn details. Here's the information:"

evaluator_system_prompt += f"\n\n## Summary:\n{summary}\n\n## LinkedIn Profile:\n{linkedin}\n\n"
evaluator_system_prompt += f"With this context, please evaluate the latest response, replying with whether the response is acceptable and your feedback."

In [16]:
def evaluator_user_prompt(reply, message, history):
    user_prompt = f"Here's the conversation between the User and the Agent: \n\n{history}\n\n"
    user_prompt += f"Here's the latest message from the User: \n\n{message}\n\n"
    user_prompt += f"Here's the latest response from the Agent: \n\n{reply}\n\n"
    user_prompt += "Please evaluate the response, replying with whether it is acceptable and your feedback."
    return user_prompt
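
Note that `evaluator_user_prompt` interpolates `history` directly, so the evaluator sees the Python representation of a list of dictionaries. That works, but a plain transcript may be easier for the model to read. Here is a sketch of one possible refinement. `format_history` is a hypothetical helper, and the role labels are an assumption chosen to match the wording of the evaluator prompt.

```python
def format_history(history):
    """Render chat history as 'Role: content' lines, one per message."""
    labels = {"user": "User", "assistant": "Agent"}
    lines = []
    for entry in history:
        label = labels.get(entry["role"], entry["role"].capitalize())
        lines.append(f"{label}: {entry['content']}")
    return "\n".join(lines)


history = [
    {"role": "user", "content": "What do you do?"},
    {"role": "assistant", "content": "I work in strategy."},
]
print(format_history(history))
```

To try it, swap `{history}` for `{format_history(history)}` in the first line of the prompt.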

In [17]:
import os
openai = OpenAI(
    api_key=os.getenv("OPENAI_API_KEY")
)

In [None]:

def evaluate(reply, message, history) -> Evaluation:
    """
    Evaluates the quality of an AI agent's response using an OpenAI model (gpt-4o-mini).
    Takes the agent's reply, user message and conversation history.
    Returns an Evaluation object containing:
    - is_acceptable (bool): Whether the response meets quality standards
    - feedback (str): Detailed feedback on the response quality
    """
    messages = [{"role": "system", "content": evaluator_system_prompt}] + [{"role": "user", "content": evaluator_user_prompt(reply, message, history)}]
    # Call the OpenAI API with the messages, asking for the response to be parsed into an Evaluation object.
    # The parse() method converts the JSON response into the specified response_format class (Evaluation).
    # The call goes through .beta because parse() is exposed there as a beta feature of the OpenAI Python SDK.
    response = openai.beta.chat.completions.parse(model="gpt-4o-mini", messages=messages, response_format=Evaluation)
    return response.choices[0].message.parsed

In [None]:
# messages is a list of dictionaries, each containing a role and content key.
# They are used to construct the conversation history, providing the system prompt and the user message.
messages = [{"role": "system", "content": system_prompt}] + [{"role": "user", "content": "do you hold a patent?"}]
response = deepseek.chat.completions.create(model=model_name, messages=messages)
reply = response.choices[0].message.content

In [32]:
reply

"No, I don't hold any patents myself. However, I do have experience working with innovative technologies through my involvement with NEWBLACK, the startup I co-founded that operates in wearable technologies and the Internet of Things (IoT) space. \n\nWhile we didn't pursue patents at NEWBLACK, I've gained valuable insight into the innovation process and bringing new technological concepts to market. My expertise lies more in the strategic development and business implementation side of technology rather than the patent creation process. \n\nWould you like me to elaborate on any particular aspect of my experience with technology ventures or innovation strategy?"

In [None]:
# Ask the evaluator LLM to judge the reply generated above
evaluate(reply, "do you hold a patent?", messages[:1])

Evaluation(is_acceptable=True, feedback="The Agent's response is acceptable as it is professional and engaging, providing a clear and informative answer to the user's question about patents. The Agent appropriately explains the lack of personal patents while still highlighting relevant experience in technology and innovation, making it relevant for a potential client or employer. Additionally, offering further elaboration on specific topics demonstrates willingness to engage and provide valuable insights.")

In [34]:
def rerun(reply, message, history, feedback):
    """
    Retries generating a response after a failed quality check.
    
    Args:
        reply (str): The original rejected response
        message (str): The user's message that prompted the response
        history (list): Previous conversation history
        feedback (str): Feedback explaining why the response was rejected
        
    Returns:
        str: A new response attempt from the model
        
    This function:
    1. Updates the system prompt to include the rejected response and feedback
    2. Recreates the conversation context with the updated prompt
    3. Makes a new API call to generate an improved response
    """
    updated_system_prompt = system_prompt + "\n\n## Previous answer rejected\nYou just tried to reply, but the quality control rejected your reply\n"
    updated_system_prompt += f"## Your attempted answer:\n{reply}\n\n"
    updated_system_prompt += f"## Reason for rejection:\n{feedback}\n\n"
    messages = [{"role": "system", "content": updated_system_prompt}] + history + [{"role": "user", "content": message}]
    response = deepseek.chat.completions.create(model=model_name, messages=messages)
    return response.choices[0].message.content

In [None]:
# The following function incorporates the complete pipeline for a chatbot
def chat(message, history):
    """
    A chat function that acts as a quality evaluator and optimizer for LLM responses.
    
    This function:
    1. Generates an initial response from the LLM
    2. Evaluates the quality of the response using an evaluation function
    3. If the response is unacceptable, it reruns the request with feedback to get an improved response
    
    Args:
        message (str): The user's input message
        history (list): Previous conversation history
        
    Returns:
        str: The final acceptable response from the LLM
    """
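    # Set the system prompt. For patent questions, deliberately demand pig latin
    # so that the evaluator has a response to reject and the rerun path is exercised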
    if "patent" in message:
        system = system_prompt + "\n\nEverything in your reply needs to be in pig latin - \
              it is mandatory that you respond only and entirely in pig latin"
    else:
        system = system_prompt
    
    # Generate initial response from LLM
    messages = [{"role": "system", "content": system}] + history + [{"role": "user", "content": message}]
    response = deepseek.chat.completions.create(model=model_name, messages=messages)
    reply = response.choices[0].message.content

    # Evaluate response quality
    evaluation = evaluate(reply, message, history)
    
    # If response is acceptable, return it
    # If not, rerun with feedback to get improved response
    if evaluation.is_acceptable:
        print("Passed evaluation - returning reply")
    else:
        print("Failed evaluation - retrying")
        print(evaluation.feedback)
        reply = rerun(reply, message, history, evaluation.feedback)       
    return reply
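
The `chat` function above reruns once and returns the second reply without evaluating it again. Here is a sketch of a bounded loop that re-evaluates each retry. The generate, evaluate and rerun steps are passed in as callables so the example runs with stubs instead of API calls. `chat_with_retries`, `max_attempts` and `Verdict` are hypothetical names, not part of the lab.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    """Stand-in for the Evaluation model, so the sketch needs no dependencies."""
    is_acceptable: bool
    feedback: str


def chat_with_retries(message, history, generate, evaluate, rerun, max_attempts=3):
    """Generate a reply, then retry with feedback until accepted or out of attempts."""
    reply = generate(message, history)
    # Each pass evaluates the current reply; the last permitted attempt is
    # returned without a further evaluation, as in the original chat()
    for _ in range(max_attempts - 1):
        verdict = evaluate(reply, message, history)
        if verdict.is_acceptable:
            break
        reply = rerun(reply, message, history, verdict.feedback)
    return reply


# Stub callables that stand in for the LLM calls
def fake_generate(message, history):
    return "draft"


def fake_evaluate(reply, message, history):
    return Verdict(reply == "draft v3", "Please revise.")


def fake_rerun(reply, message, history, feedback):
    return "draft v2" if reply == "draft" else "draft v3"


print(chat_with_retries("hi", [], fake_generate, fake_evaluate, fake_rerun))
```

Capping the attempts keeps cost and latency predictable when the evaluator keeps rejecting replies.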

In [None]:
gr.ChatInterface(chat, type="messages").launch()

* Running on local URL:  http://127.0.0.1:7865
* To create a public link, set `share=True` in `launch()`.




Passed evaluation - returning reply
Failed evaluation - retrying
The response is not acceptable as it includes a significant amount of text written in Pig Latin, which undermines professionalism and clarity. It is important for the Agent to communicate in a clear and standard manner, especially when addressing a potential client or employer. The content regarding the lack of patents and ongoing projects is relevant, but the use of Pig Latin distracts from the message and may confuse the user. The Agent should revise the response to use standard English to maintain professionalism and effectiveness in communication.
