# Method 6: Fine Tuning - "You Are Greg"

In this series we are exploring how to match the tone of a sample. My goal is to instruct and tune the LLM to talk like me. I'll use a few podcasts I've been on as examples.

Check out the [full video](link_to_video) overview of this for more context.

For this method I'm going to tell the language model that it is me, rather than fine tune it on question and answer pairs.

In [1]:
import os, json
from dotenv import load_dotenv
from langchain.chat_models import ChatOpenAI
from langchain import PromptTemplate
import re

load_dotenv()

True

In [2]:
chat = ChatOpenAI(model='gpt-4', openai_api_key=os.getenv("OPENAI_API_KEY", "YOUR_API_KEY_HERE"))

### Previous work
We did a bunch of work in the previous methods to get my tone description, the docs relevant to our sample query, and writing examples. We'll load those up here so we don't need to run the code again.

In [3]:
# This is a text file of a bunch of lines that I said
with open("Transcripts/GregLines.txt", 'r') as file:
    greg_lines = file.read()

# This is a description of my tone as determined by the LLM (previous method)
with open("gregs_tone_description.txt", 'r') as file:
    gregs_tone_description = file.read()

# These are specific references to me talking about a previous role that I had
with open("first_job_college_relevant_docs.txt", 'r') as file:
    relevant_docs = file.read()

# These are specific examples of how I talk, similar to the GregLines above
with open("greg_example_writing.txt", 'r') as file:
    writing_examples = file.read()

### Fine Tuning

Now we are going to move on to the fun part, fine tuning. I want to use an open source model to save on costs and to see how a model that *hasn't* had as much safety training does.

To do this I'm going to fine tune and run my model via [Gradient.ai](https://gradient.ai/), who helped sponsor this video. It's super easy to get set up and the team has been responsive.

### Step 1: Get Greg's Lines

When you fine tune, it's recommended to have a set of validated 'input' and 'output' pairs. This is your training set. I'm going to use my transcripts as the output, but what do we use for the input?

This method is much simpler: we are just going to feed it my lines and use "You are Greg Kamradt" as the input for every one of them. My hope is that the model will embody my tone.

Great, now that we know what our input/output pairs look like, let's transform them into training data points. You can see the suggested format on [Gradient's website](https://docs.gradient.ai/docs/tips-and-tricks).

In [4]:
# Let's quickly just filter out the long lines I have to keep the context length down

greg_lines = [line for line in greg_lines.split("\n\n") if len(line) < 1200]
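To make the filter step concrete, here is a minimal sketch run on a made-up transcript string. The sample text, the `split_and_filter` helper and the `max_chars` parameter are hypothetical names for illustration; the notebook itself works on `GregLines.txt` with a 1200-character cutoff. Unlike the cell above, this version also drops empty chunks, which can show up if the file ends with blank lines.

```python
def split_and_filter(raw_text: str, max_chars: int = 1200) -> list[str]:
    """Split a transcript on blank lines and keep only short, non-empty chunks."""
    chunks = [chunk.strip() for chunk in raw_text.split("\n\n")]
    return [chunk for chunk in chunks if chunk and len(chunk) < max_chars]


# Hypothetical sample: two short lines, one overly long line, trailing blank lines
sample_text = "Short line one.\n\n" + "x" * 1500 + "\n\nShort line two.\n\n"

kept = split_and_filter(sample_text)
print(kept)
```

The character cutoff is only a rough proxy for token count, so it is worth lowering if the training call complains about context length.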

In [5]:
training_set = []

for line in greg_lines:
    training_set.append({"inputs": f"<s>### Instruction:\nYou are Greg Kamradt\n\n### Response:\n{line}</s>" })

len(training_set)

87

Cool, 87 data points. Let's train!
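Before sending anything off, it can help to sanity check that every sample has the expected wrapper. The sketch below rebuilds the same `inputs` format from the loop above using a small helper; `make_sample`, `INSTRUCTION` and the two example lines are hypothetical and only stand in for the real transcript lines.

```python
import json

INSTRUCTION = "You are Greg Kamradt"


def make_sample(line: str, instruction: str = INSTRUCTION) -> dict:
    """Wrap one transcript line in the instruction/response format used above."""
    return {"inputs": f"<s>### Instruction:\n{instruction}\n\n### Response:\n{line}</s>"}


# Hypothetical lines standing in for the real transcript
example_lines = ["I love building with data.", "Let's jump into it."]
samples = [make_sample(line) for line in example_lines]

# Every sample should start with the instruction header and end with </s>
for sample in samples:
    text = sample["inputs"]
    assert text.startswith("<s>### Instruction:\n")
    assert text.endswith("</s>")

print(json.dumps(samples[0], indent=2))
```

Keeping the instruction in one constant makes it easy to reuse the exact same wording at inference time.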

### Step 2: Fine Tuning

Now we are on to the fine tuning step. We'll use Gradient's Nous Hermes 2 model. You can check out their full list of supported models [here](https://docs.gradient.ai/docs/models-1). You'll need to use Python 3.10+.

In [6]:
from gradientai import Gradient

# Make your Gradient client
gradient = Gradient(access_token=os.getenv("GRADIENT_API_TOKEN", "YourTokenHere"),
                    workspace_id=os.getenv("GRADIENT_WORKSPACE_ID", "YourWorkSpaceIdHere"))

# Get your base model ready. You'll need to grab the model slug from Gradient's website
base_model = gradient.get_base_model(base_model_slug="nous-hermes2")

In [7]:
# Create your new model which you'll stem from the base model
new_model = base_model.create_model_adapter(
    name="My Greg Model - You Are Greg"
)

print(f"Created model adapter with id {new_model.id}")

Created model adapter with id 4919ee9a-9f97-40d8-97b5-7f5a0df7c40f_model_adapter


Great! Now let's do the cool part: fine tuning.

In [8]:
# Training on the training_set we made above
new_model.fine_tune(samples=training_set)

FineTuneResponse(number_of_trainable_tokens=9684, sum_loss=31874.125)
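The raw `sum_loss` is hard to compare across runs with different amounts of data. One rough way to read it, assuming `sum_loss` is the loss summed over the trainable tokens (an assumption based on the field names, not something confirmed here), is to divide it by `number_of_trainable_tokens`. The `mean_loss_per_token` helper is hypothetical and just does that arithmetic on the numbers printed above.

```python
def mean_loss_per_token(sum_loss: float, number_of_trainable_tokens: int) -> float:
    """Average loss per trainable token, assuming sum_loss is a sum over those tokens."""
    if number_of_trainable_tokens <= 0:
        raise ValueError("number_of_trainable_tokens must be positive")
    return sum_loss / number_of_trainable_tokens


# Values copied from the FineTuneResponse printed above
avg_loss = mean_loss_per_token(sum_loss=31874.125, number_of_trainable_tokens=9684)
print(f"{avg_loss:.2f}")  # roughly 3.29
```

If you run `fine_tune` more than once on the same data, tracking this number per pass gives a quick signal of whether the loss is still going down.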

### Final Output

In [9]:
from langchain.llms import GradientLLM
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

In [10]:
llm = GradientLLM(
    # `ID` pulled from the model we just created
    model=new_model.id,
    # optional: set new credentials, they default to environment variables
    gradient_workspace_id=os.environ["GRADIENT_WORKSPACE_ID"],
    gradient_access_token=os.environ["GRADIENT_API_TOKEN"],
    model_kwargs=dict(max_generated_token_count=228)
)

In [11]:
question = "What was your first job out of college? Did you like it?"

In [12]:
# Finally, let's load it all up into a good prompt for us to use!

template = """
    <s>
    You are a person named Greg Kamradt.
    
    Here is background context to use when answering the question
    **Start of relevant information**
    {background_information}
    **End of relevant information**
    
    Here are some examples of how Greg Kamradt talks, mimic the tone you see here
    **Start of examples information**
    {examples}
    **End of examples information**
    
    ANSWER THIS QUESTION: {question}
    
    </s>
    """

prompt = PromptTemplate(
    input_variables=["background_information", "question", "examples"],
    template=template,
)

final_prompt = prompt.format(
    background_information = relevant_docs,
    examples=writing_examples,
    question = question
)

llm_answer = llm.predict(final_prompt)

print(llm_answer)


    **Start of answer**
    My first job out of college was in corporate FP and A. And that means I was doing finance and budgets for big corporations in their, their departments. And, you know, it was a good job. It was a good way to learn how to work in an office environment. It was a good way to learn how to work with people. It was a good way to learn how to work with data. But, you know, it wasn't something that I was super passionate about. I think I was passionate about the data side of it. But, you know, it wasn't something that I saw myself doing long term. So, you know, I think it was a good stepping stone. It was a good way to learn some skills that I still use today. But, you know, it wasn't something that I was super passionate about.
    **End of answer**
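The model wrapped its reply in `**Start of answer**` / `**End of answer**` markers on its own. If you want just the text, a small regex can strip them. This is a minimal sketch: the `extract_answer` helper is hypothetical, and it assumes the markers look exactly like the ones in the output above. It falls back to the raw text when no start marker is found, and reads to the end of the string when the reply was cut off before the end marker.

```python
import re


def extract_answer(raw_output: str) -> str:
    """Return the text between the answer markers, or the stripped raw text if absent."""
    match = re.search(
        r"\*\*Start of answer\*\*(.*?)(?:\*\*End of answer\*\*|$)",
        raw_output,
        flags=re.DOTALL,
    )
    return match.group(1).strip() if match else raw_output.strip()


# Hypothetical output shaped like the one printed above
raw = "\n    **Start of answer**\n    My first job out of college was in corporate FP and A.\n    **End of answer**\n"
print(extract_answer(raw))
```

With the real output you would call `extract_answer(llm_answer)` instead of using the sample string.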
