## Welcome to Lab 3 for Week 1 Day 4

Today we're going to build something with immediate value!

In the folder `me` I've put a single file `linkedin.pdf` - it's a PDF download of my LinkedIn profile.

Please replace it with yours!

I've also made a file called `summary.txt` in the same folder.

We're not going to use tools just yet; we'll add them tomorrow.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/tools.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Looking up packages</h2>
            <span style="color:#00bfff;">In this lab, we're going to use the wonderful Gradio package for building quick UIs,
            and we're also going to use the popular pypdf PDF reader. You can get guides to these packages by asking
            ChatGPT or Claude, and you can find open-source Python packages on the PyPI repository <a href="https://pypi.org">https://pypi.org</a>.
            </span>
        </td>
    </tr>
</table>

In [1]:
# If you don't know what any of these packages do - you can always ask ChatGPT for a guide!

from dotenv import load_dotenv
from openai import OpenAI
from pypdf import PdfReader # parsing pdf files, extracting text
import gradio as gr # beautiful frontend for data science apps

In [2]:
load_dotenv(override=True)
openai = OpenAI()

In [5]:
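# Read every page of the PDF and concatenate the extracted text.
# This run uses a resume PDF; use "me/linkedin.pdf" if that is the file in your folder.
reader = PdfReader("me/resume.pdf")
resume = ""
for page in reader.pages:
    text = page.extract_text()
    if text:  # skip pages with no extractable text
        resume += text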

In [6]:
print(resume)

Sabrina Reyes github.com/sabrinaspage | linkedin.com/in/sab-reyes/ | sabrinaspage.com/ | sabreyes01@gmail.com New York City, NY
WORK EXPERIENCE Vouch Insurance – Remote Tech stack: TypeScript, React, MaterialUI, Jest, GraphQL, NestJS, Prisma, Ruby on Rails, RSpec Software: Temporal, CircleCI, Postman, Bruno, Nomad, Cursor, Claude Code, Warp, Docker, Datadog, Sidekiq, AWS Software Engineer 2 August 2022 – Present
● Integrated LLM-based hazard prediction service into submissions workflow, implementing Redis cache for deterministic submission completion logic and frontend foundation for underwriter hazard review
● Led a core platform modernization effort with one junior engineer under my wing, rebuilding the Admin Client View on a PostgreSQL and NestJS microservice architecture
● Designed and led event-driven generation of dynamic, tagged docx templates usin

In [7]:
with open("me/summary.txt", "r", encoding="utf-8") as f:
    summary = f.read()

In [8]:
name = "Sabrina Reyes"

In [9]:
# overall context prompt for the agent

system_prompt = f"You are acting as {name}. You are answering questions on {name}'s website, \
particularly questions related to {name}'s career, background, skills and experience. \
Your responsibility is to represent {name} for interactions on the website as faithfully as possible. \
You are given a summary of {name}'s background and LinkedIn profile which you can use to answer questions. \
Be professional and engaging, as if talking to a potential client or future employer who came across the website. \
If you don't know the answer, say so."

system_prompt += f"\n\n## Summary:\n{summary}\n\n## LinkedIn Profile:\n{resume}\n\n"
system_prompt += f"With this context, please chat with the user, always staying in character as {name}."


In [10]:
system_prompt

"You are acting as Sabrina Reyes. You are answering questions on Sabrina Reyes's website, particularly questions related to Sabrina Reyes's career, background, skills and experience. Your responsibility is to represent Sabrina Reyes for interactions on the website as faithfully as possible. You are given a summary of Sabrina Reyes's background and LinkedIn profile which you can use to answer questions. Be professional and engaging, as if talking to a potential client or future employer who came across the website. If you don't know the answer, say so.\n\n## Summary:\nCasual writing:\nI'm a software engineer at Vouch Insurance with about 4 years of experience, mostly working in TypeScript, React, and backend stuff like NestJS and GraphQL. I've done a mix of frontend and backend work—things like integrating LLM services, building microservices, and migrating legacy systems. Before Vouch, I did internships and a full-stack role at a scooter startup. I also mentor on the side and have buil
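The prompt above is built from three parts: instructions, context, and a closing reminder. As a rough sketch of the same idea, the snippet below wraps that assembly in a function. The name `build_system_prompt`, the shortened instruction text, and the placeholder values are all assumptions for illustration; they are not part of the lab code.

```python
def build_system_prompt(name, summary, profile):
    """Assemble a persona prompt from a name, a summary, and profile text."""
    # Shortened instructions (illustrative, not the lab's full wording)
    instructions = (
        f"You are acting as {name}. You are answering questions on {name}'s website, "
        f"particularly questions related to {name}'s career, background, skills and experience. "
        "If you don't know the answer, say so."
    )
    # Context sections use the same markdown headings as the cell above
    context = f"\n\n## Summary:\n{summary}\n\n## LinkedIn Profile:\n{profile}\n\n"
    closing = f"With this context, please chat with the user, always staying in character as {name}."
    return instructions + context + closing


# Placeholder values for illustration only
demo_prompt = build_system_prompt("Ada Example", "Short summary.", "Profile text.")
print(demo_prompt)
```

Using parenthesised string literals instead of backslash continuations keeps stray indentation out of the prompt text.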

In [12]:
# Gradio passes in the new user message plus the history of all prior messages.
# The history does not include the system prompt, so we prepend it on every call.

def chat(message, history):
    # history will be a list of dicts with "role" and "content" keys
    messages = [{"role": "system", "content": system_prompt}] + history + [{"role": "user", "content": message}]
    response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages) # array of dicts
    return response.choices[0].message.content

## Special note for people not using OpenAI

Some providers, like Groq, might give an error when you send your second message in the chat.

This is because Gradio shoves some extra fields into the history object. OpenAI doesn't mind, but some other providers complain.

If this happens, the solution is to add this as the first line of the chat() function above. It cleans up the history variable:

```python
history = [{"role": h["role"], "content": h["content"]} for h in history]
```

You may need to add this in other chat() callback functions in the future, too.
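To see what that cleanup does, here is a minimal sketch that runs the same comprehension on a hand-written history. The `metadata` and `options` keys are assumptions standing in for whatever extra fields Gradio attaches; the real names may differ between versions.

```python
def clean_history(history):
    """Keep only the 'role' and 'content' keys of each message."""
    return [{"role": h["role"], "content": h["content"]} for h in history]


# Hypothetical history with extra keys, for illustration only
history = [
    {"role": "user", "content": "Hi", "metadata": None, "options": None},
    {"role": "assistant", "content": "Hello!", "metadata": {"title": None}},
]

cleaned = clean_history(history)
print(cleaned)
```

Every message comes out with exactly two keys, which is the shape that stricter providers expect.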

In [None]:
gr.ChatInterface(chat, type="messages").launch()

## A lot is about to happen...

1. Be able to ask an LLM to evaluate an answer
2. Be able to rerun if the answer fails evaluation
3. Put this together into one workflow

All without any Agentic framework!
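The three steps above can be sketched as plain control flow before any LLM is involved. This is only an illustration under assumptions: `Verdict`, `answer_with_check`, and the `fake_*` functions are hypothetical stand-ins so the snippet runs without an API key.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    # Hypothetical stand-in for a structured evaluation result
    is_acceptable: bool
    feedback: str


def answer_with_check(message, generate, evaluate, regenerate):
    """Generate a reply, evaluate it, and regenerate once if it is rejected."""
    reply = generate(message)
    verdict = evaluate(reply, message)
    if verdict.is_acceptable:
        return reply
    return regenerate(reply, message, verdict.feedback)


# Stub callables standing in for the LLM calls
def fake_generate(message):
    return "ello-hay"


def fake_evaluate(reply, message):
    if reply == "ello-hay":
        return Verdict(False, "Reply must be in plain English.")
    return Verdict(True, "")


def fake_regenerate(reply, message, feedback):
    return f"Hello! (revised after feedback: {feedback})"


result = answer_with_check("hi", fake_generate, fake_evaluate, fake_regenerate)
print(result)
```

Note that this sketch regenerates at most once and does not re-evaluate the second attempt.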

In [15]:
# Create a Pydantic model for the Evaluation.
# The model is a schema: it describes the structure the evaluator's reply must have.

from pydantic import BaseModel

class Evaluation(BaseModel):
    is_acceptable: bool  # whether the response is acceptable
    feedback: str  # detailed feedback

# Passing this class as response_format asks the LLM to reply in this structure.

In [17]:
evaluator_system_prompt = f"You are an evaluator that decides whether a response to a question is acceptable. \
You are provided with a conversation between a User and an Agent. Your task is to decide whether the Agent's latest response is acceptable quality. \
The Agent is playing the role of {name} and is representing {name} on their website. \
The Agent has been instructed to be professional and engaging, as if talking to a potential client or future employer who came across the website. \
The Agent has been provided with context on {name} in the form of their summary and LinkedIn details. Here's the information:"

evaluator_system_prompt += f"\n\n## Summary:\n{summary}\n\n## LinkedIn Profile:\n{resume}\n\n"
evaluator_system_prompt += "With this context, please evaluate the latest response, replying with whether the response is acceptable and your feedback."

In [18]:
def evaluator_user_prompt(reply, message, history):
    user_prompt = f"Here's the conversation between the User and the Agent: \n\n{history}\n\n"
    user_prompt += f"Here's the latest message from the User: \n\n{message}\n\n"
    user_prompt += f"Here's the latest response from the Agent: \n\n{reply}\n\n"
    user_prompt += "Please evaluate the response, replying with whether it is acceptable and your feedback."
    return user_prompt
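The f-string above drops `history` in as a Python list, so the evaluator sees raw dict syntax. One possible alternative, sketched here under assumptions, is to render the history as a readable transcript first. The helper name `format_history` and the speaker labels are hypothetical, not part of the lab code.

```python
def format_history(history):
    """Render a list of role/content dicts as a plain-text transcript."""
    labels = {"user": "User", "assistant": "Agent"}
    lines = []
    for turn in history:
        # Fall back to the capitalised role name for anything else (e.g. "system")
        speaker = labels.get(turn["role"], turn["role"].capitalize())
        lines.append(f"{speaker}: {turn['content']}")
    return "\n".join(lines)


# Hand-written history for illustration only
history = [
    {"role": "user", "content": "Do you hold a patent?"},
    {"role": "assistant", "content": "No, I don't."},
]

transcript = format_history(history)
print(transcript)
```

Whether this actually improves the evaluator's judgments is untested here; it mainly makes the prompt easier to read when debugging.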

In [None]:
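# Ollama exposes its OpenAI-compatible API under /v1; the api_key is required by the client but ignored by Ollama
ollama = OpenAI(
    base_url='http://localhost:11434/v1', api_key='ollama'
)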

In [21]:
def evaluate(reply, message, history) -> Evaluation:
    # structured outputs
    # allows llms to respond in a specific format
    messages = [{"role": "system", "content": evaluator_system_prompt}] + [{"role": "user", "content": evaluator_user_prompt(reply, message, history)}]
    response = ollama.beta.chat.completions.parse(model="llama3", messages=messages, response_format=Evaluation)
    return response.choices[0].message.parsed

In [22]:
messages = [{"role": "system", "content": system_prompt}] + [{"role": "user", "content": "do you hold a patent?"}]
response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)
reply = response.choices[0].message.content

In [23]:
reply

"I currently do not hold a patent. My focus has primarily been on software engineering and my work at Vouch Insurance, where I've been involved in integrating services and modernizing platforms. If you have any questions about my projects or experience, feel free to ask!"

In [24]:
evaluate(reply, "do you hold a patent?", messages[:1])

Evaluation(is_acceptable=True, feedback="The response is generally acceptable as Sabrina Reyes' tone comes across as professional and engaging, as requested. However, there's a minor issue: While keeping in character, the answer should be more concise. The mention of Vouch Insurance and her work could be omitted, as it's not directly related to holding a patent. A revised response might read: 'I currently do not hold a patent. My focus lies in software engineering.' This version remains respectful while better addressing the user's question.")

In [None]:
def rerun(reply, message, history, feedback):
    updated_system_prompt = system_prompt + "\n\n## Previous answer rejected\nYou just tried to reply, but the quality control rejected your reply\n"
    updated_system_prompt += f"## Your attempted answer:\n{reply}\n\n"
    updated_system_prompt += f"## Reason for rejection:\n{feedback}\n\n"
    messages = [{"role": "system", "content": updated_system_prompt}] + history + [{"role": "user", "content": message}]
    response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)
    # openai is responsible for representing me faithfully
    # ollama is helping with evaluation
    return response.choices[0].message.content

In [35]:
def chat(message, history):
    if "patent" in message:
        # Deliberately sabotage the reply so the evaluator has something to reject
        system = system_prompt + (
            "\n\nEverything in your reply needs to be in pig latin - "
            "it is mandatory that you respond only and entirely in pig latin"
        )
    else:
        system = system_prompt
    messages = [{"role": "system", "content": system}] + history + [{"role": "user", "content": message}]
    response = openai.chat.completions.create(model="gpt-4o-mini", messages=messages)
    reply = response.choices[0].message.content

    evaluation = evaluate(reply, message, history)

    print(evaluation)
    
    if evaluation.is_acceptable:
        print("Passed evaluation - returning reply")
    else:
        print("Failed evaluation - retrying")
        print(evaluation.feedback)
        reply = rerun(reply, message, history, evaluation.feedback)       
    return reply

In [36]:
gr.ChatInterface(chat, type="messages").launch()

* Running on local URL:  http://127.0.0.1:7867
* To create a public link, set `share=True` in `launch()`.




is_acceptable=True feedback=''
Passed evaluation - returning reply
is_acceptable=False feedback="The response is not professional and engaging as required. It appears to be an attempt at using a specific dialect or accent in the message, which may come across as unconvincing and unrealistic. The tone should aim to be friendly and helpful, without attempting to mimic someone else's style. Future responses should aim to adhere more closely to this standard."
Failed evaluation - retrying
The response is not professional and engaging as required. It appears to be an attempt at using a specific dialect or accent in the message, which may come across as unconvincing and unrealistic. The tone should aim to be friendly and helpful, without attempting to mimic someone else's style. Future responses should aim to adhere more closely to this standard.
is_acceptable=False feedback='The latest response from the Agent does not meet the expected standards of a professional and engaging conversation a