# Method 3: Use Examples

In this series we are exploring how to match the tone of a sample. My goal is to instruct and tune the LLM to talk like me. I'll use a few podcasts I've been on as examples.

Check out the [full video](link_to_video) overview of this for more context.

In [1]:
import os
from dotenv import load_dotenv
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
from langchain import PromptTemplate
from langchain.vectorstores import Chroma
from langchain.document_loaders import TextLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

load_dotenv()

True

In [2]:
chat = ChatOpenAI(model='gpt-4', openai_api_key=os.getenv("OPENAI_API_KEY", "YOUR_API_KEY_HERE"))

### Tone Instructions
In a [previous video](https://www.youtube.com/watch?v=miBG-a3FuhU) I outlined the process below in much more detail.

In Method 2 I asked the LLM to describe my tone; check that out for the full process. Below I just load the output I got from there.

In [3]:
with open("gregs_tone_description.txt", 'r') as file:
    gregs_tone_description = file.read()

Awesome - now I have instructions I can give the LLM so it can try to copy me.

### Background information

I can use the sections of transcript where I speak as documents to retrieve. This will help the clone answer questions where I've already answered them on a podcast.

I'll use Chroma and set up a quick vectorstore.

In [4]:
# Load the Greg transcription as one document
loader = TextLoader("Transcripts/GregLines.txt")
documents = loader.load()

# Then split it into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=300)
docs = text_splitter.split_documents(documents)

# Embedding function and Chroma DB
embedding_function = OpenAIEmbeddings()

# load it into Chroma
db = Chroma.from_documents(docs, embedding_function)
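
To make `chunk_size` and `chunk_overlap` a bit more concrete, here is a minimal sketch of overlapping chunks in plain Python. It is a simplification and not what `RecursiveCharacterTextSplitter` does internally (that splitter also tries to break on separators such as paragraphs and sentences). The helper `sliding_window_chunks` and the toy text are hypothetical names made up for illustration.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Toy input (assumption): the alphabet stands in for a transcript
toy_text = "abcdefghijklmnopqrstuvwxyz"
toy_chunks = sliding_window_chunks(toy_text, chunk_size=10, chunk_overlap=3)
print(toy_chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each chunk repeats the last 3 characters of the previous one, which is the same idea as the 300-character overlap above: a thought that straddles a boundary still shows up whole in at least one chunk.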

In [5]:
# Now let's get the relevant sections where I talk about our test question
query = "What was your first job out of college?"
relevant_docs = db.similarity_search(query)
relevant_docs = "\n\n".join([x.page_content for x in relevant_docs]) # To get the text and not the document object
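
For intuition, a similarity search boils down to embedding the query, scoring it against every stored chunk vector, and keeping the closest few. The sketch below does that with cosine similarity over made-up 3-dimensional vectors. It is an illustration only: the vectors, the chunk names, and the `top_k` helper are assumptions, and the metric and index Chroma actually uses may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the names of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy "embeddings" (assumption): real OpenAI embeddings have far more dimensions
toy_docs = {
    "first_job": [0.9, 0.1, 0.0],
    "hobbies": [0.0, 0.2, 0.9],
    "college": [0.7, 0.6, 0.1],
}
toy_query = [1.0, 0.2, 0.0]

ranked = top_k(toy_query, toy_docs, k=2)
print(ranked)
# ['first_job', 'college']
```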

In [6]:
# Let's save the relevant docs so we don't have to query the vector DB again later
with open('first_job_college_relevant_docs.txt', 'w') as file:
    file.write(relevant_docs)

### Examples

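Here's where we try to add examples of how I talk by using excerpts from the podcasts. I want the examples to be medium length, about 2-3 sentences, so I'll keep only chunks between 200 and 400 characters. 3 examples should do, though note that the filter below keeps every chunk in that range rather than capping at three.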

In [7]:
examples = "\n\n".join(
    x.page_content for x in docs if 200 < len(x.page_content) < 400
)
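
If you do want to hold the prompt to roughly three examples, one option is to slice the filtered list before joining. This is a sketch on toy strings, not the cell that produced the output later in this notebook; `toy_chunks` and `max_examples` are hypothetical names.

```python
# Toy chunks (assumption): plain strings stand in for x.page_content
toy_chunks = [
    "Too short.",
    "a" * 250,
    "b" * 300,
    "c" * 500,  # too long, gets filtered out
    "d" * 350,
    "e" * 210,
]

max_examples = 3  # hypothetical cap

# Keep medium-length chunks, then take only the first few
medium = [c for c in toy_chunks if 200 < len(c) < 400]
selected = medium[:max_examples]
toy_examples = "\n\n".join(selected)

print(len(medium), len(selected))
# 4 3
```

Taking the first three is the simplest choice. Sampling at random or picking by hand would work too and might give a more representative set.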

In [8]:
# Let's also save the examples so we don't need to run the code above again

with open('greg_example_writing.txt', 'w') as file:
    file.write(examples)

### Final Output

In [9]:
# Finally, let's load it all up into a good prompt for us to use!
template = """
    You are a person named Greg Kamradt
    
    Here are instructions on how to match his tone
    {tone_instructions}
    
    Here is relevant information to the question you'll be asked
    {background_information}
    -- End of relevant information --
    
    Here are some examples of how Greg Kamradt talks, mimic the tone you see here
    {examples}
    -- End of examples information --
    
    Answer this question: {question}
    """

prompt = PromptTemplate(
    input_variables=["tone_instructions", "background_information", "question", "examples"],
    template=template,
)

final_prompt = prompt.format(
    tone_instructions = gregs_tone_description,
    background_information = relevant_docs,
    examples=examples,
    question = query
)

llm_answer = chat.predict(final_prompt)

print(llm_answer)

My first foray into the professional world was straight out of college, fresh-faced and eager to make my mark. I found myself in the territory of corporate FP&A, which basically meant I was handling finance and budgets for large corporations and their departments. I know it may not sound like the most thrilling job, but it was my first step into the world of data analysis. 

Can you imagine this? I was handed this massive Excel spreadsheet, so huge that it had more rows and columns than I could count. It was a list of transactions, and my task was to extract information and find answers within this vast sea of numbers. I know it may sound mundane to some, but I found that I had a knack for digging into these spreadsheets and finding the answers my bosses were looking for. 

Now, I know it may not seem like much, but this was the beginning of my journey into data analysis. This experience was one of the pivotal moments in my career, as it planted the seeds for my interest and aptitude i

Hm, this isn't great. It's too verbose and doesn't sound like me. Feels like we took a step backwards.

This feels like the limit of prompt-only methods. On to fine-tuning next.