# Method 7: Fine Tuning - "Talk in the tone and style of Greg Kamradt"

In this series we are exploring how to match the tone of a sample. My goal is to instruct and tune the LLM to talk like me. I'll use a few podcasts I've been on as examples.

Check out the [full video](link_to_video) overview of this for more context.

For this last method I'm going to instruct the language model to talk in my tone and style rather than to be me. Let's see how this one does.

In [2]:
import os, json
from dotenv import load_dotenv
from langchain.chat_models import ChatOpenAI
from langchain import PromptTemplate
import re

load_dotenv()

True

In [3]:
chat = ChatOpenAI(model='gpt-4', openai_api_key=os.getenv("OPENAI_API_KEY", "YOUR_API_KEY_HERE"))

### Previous work
We did a bunch of work in the previous methods to get my tone description, the docs relevant to our sample query, and writing examples. We'll load those up here so we don't need to run the code again.

In [4]:
# This is a text file of a bunch of lines that I said
with open("Transcripts/GregLines.txt", 'r') as file:
    greg_lines = file.read()

# This is a description of my tone as determined by the LLM (previous method)
with open("gregs_tone_description.txt", 'r') as file:
    gregs_tone_description = file.read()

# These are specific references to me talking about a previous role that I had
with open("first_job_college_relevant_docs.txt", 'r') as file:
    relevant_docs = file.read()

# These are specific examples of how I talk, similar to the GregLines above
with open("greg_example_writing.txt", 'r') as file:
    writing_examples = file.read()

### Fine Tuning

Now we are going to move on to the fun part: fine tuning. I want to use an open-source model to save on costs and see how a model that *hasn't* been trained on so much safety does.

To do this I'm going to fine tune and run my model via [Gradient.ai](https://gradient.ai/), who helped sponsor this video. It was super easy to get set up and the team has been responsive.

### Step 1: Get Greg's Lines

When you fine tune it's recommended to have a set of validated 'input' and 'output' pairs. This is your training set. I'm going to use my transcripts as the output, but what do we use for the input?

This method is much simpler: we are just going to feed it my lines and use "Respond in the speaking style & tone of Greg Kamradt" as the input. My hope is that the model will embody my tone.

Great, now that we know what our input/output pairs look like, let's transform them into training data points. You can see the suggested format on [Gradient's website](https://docs.gradient.ai/docs/tips-and-tricks).

In [5]:
# Split the transcript on blank lines, then drop the very long lines (to keep the
# context length down) and the very short ones (not enough tone to learn from)

greg_lines = [line for line in greg_lines.split("\n\n") if 20 < len(line) < 1600]
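
To make the effect of that filter concrete, here is a small sketch run on a made-up transcript. The `filter_lines` helper and the `sample_transcript` text are hypothetical and only for illustration; the length thresholds are the same ones used in the cell above.

```python
def filter_lines(transcript: str, min_chars: int = 20, max_chars: int = 1600) -> list[str]:
    """Split a transcript on blank lines and keep chunks within the length bounds."""
    chunks = transcript.split("\n\n")
    return [chunk for chunk in chunks if min_chars < len(chunk) < max_chars]


# Hypothetical sample: one too-short line, one usable line, one too-long line
sample_transcript = "\n\n".join([
    "Yeah.",
    "So I started out in corporate finance and it taught me a ton.",
    "x" * 2000,
])

kept = filter_lines(sample_transcript)
print(f"Kept {len(kept)} of {len(sample_transcript.split(chr(10) * 2))} chunks")
```

Only the middle chunk survives here. Character counts are a rough stand-in for token counts, so the thresholds may need tuning for a different model or transcript.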

In [6]:
training_set = []

for line in greg_lines:
    training_set.append({ "inputs": f"<s>### Instruction:\nRespond in the speaking style & tone of Greg Kamradt\n\n### Response:\n{line}</s>" })
len(training_set)

72

Cool, 72 data points. Let's train!
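
Before spending money on a training run, it can be worth checking that every sample follows the layout built above. The sketch below is one way to do that, not part of Gradient's tooling: `make_sample` and `find_malformed` are hypothetical helpers, and the demo samples are invented.

```python
INSTRUCTION = "Respond in the speaking style & tone of Greg Kamradt"
PREFIX = "<s>### Instruction:\n"
RESPONSE_MARKER = "\n\n### Response:\n"
SUFFIX = "</s>"


def make_sample(line: str) -> dict:
    """Build one training sample in the same layout as the loop above."""
    return {"inputs": f"{PREFIX}{INSTRUCTION}{RESPONSE_MARKER}{line}{SUFFIX}"}


def is_well_formed(text: str) -> bool:
    """True if the text has the prefix, the response marker, a non-empty response, and the suffix."""
    if not (text.startswith(PREFIX) and text.endswith(SUFFIX)):
        return False
    _, marker, response = text.partition(RESPONSE_MARKER)
    return bool(marker) and bool(response.removesuffix(SUFFIX).strip())


def find_malformed(samples: list[dict]) -> list[int]:
    """Return the indexes of samples that do not match the expected layout."""
    return [i for i, s in enumerate(samples) if not is_well_formed(s.get("inputs", ""))]


# Invented demo data: one good sample, one missing the markers, one with an empty response
demo_samples = [
    make_sample("Totally, I love that question."),
    {"inputs": "missing markers"},
    make_sample("   "),
]

bad = find_malformed(demo_samples)
print(f"Malformed sample indexes: {bad}")
```

On the real data you would call `find_malformed(training_set)` and expect an empty list.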

### Step 2: Fine Tune the Model

Now we are on to the fine tuning step. We'll use Gradient's Nous Hermes 2 model. You can check out their full list of supported models [here](https://docs.gradient.ai/docs/models-1). You'll need to use Python 3.10+.

In [7]:
from gradientai import Gradient

# Make your Gradient client
gradient = Gradient(access_token=os.getenv("GRADIENT_API_TOKEN", "YourTokenHere"),
                    workspace_id=os.getenv("GRADIENT_WORKSPACE_ID", "YourWorkSpaceIdHere"))

# Get your base model ready. You'll need to grab the model slug from Gradient's website
base_model = gradient.get_base_model(base_model_slug="nous-hermes2")

In [8]:
# Create your new model which you'll stem from the base model
new_model = base_model.create_model_adapter(
    name="My Greg Model - You Are Greg v1"
)

print(f"Created model adapter with id {new_model.id}")

Created model adapter with id b2456eae-637a-4643-9904-46aebf8d00e8_model_adapter


Great! Now let's do the cool part: fine tuning.

In [9]:
# Training on the training_set we made above
new_model.fine_tune(samples=training_set)

FineTuneResponse(number_of_trainable_tokens=13612, sum_loss=42890.523)
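
The response reports a summed loss and a token count. Assuming `sum_loss` is the loss added up over the trainable tokens (that is my reading of the field names, not something confirmed here), dividing one by the other gives a rough per-token number that is easier to compare across runs. The two values are copied from the output above.

```python
# Values copied from the FineTuneResponse printed above
number_of_trainable_tokens = 13612
sum_loss = 42890.523

# Assumption: sum_loss is summed over the trainable tokens,
# so this ratio approximates the mean loss per token
mean_loss_per_token = sum_loss / number_of_trainable_tokens
print(f"Approx. loss per trainable token: {mean_loss_per_token:.3f}")
```

If you fine tune for more than one pass, tracking this ratio after each call is a simple way to see whether the loss is still coming down.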

### Final Output

In [10]:
from langchain.llms import GradientLLM
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

In [16]:
llm = GradientLLM(
    # `ID` pulled from the model we just created
    model=new_model.id,
    # optional: set new credentials, they default to environment variables
    gradient_workspace_id=os.environ["GRADIENT_WORKSPACE_ID"],
    gradient_access_token=os.environ["GRADIENT_API_TOKEN"],
    model_kwargs=dict(max_generated_token_count=200, temperature=0)
)

In [17]:
question = "What was your first job out of college? Did you like it?"

Finally, let's load it all up into a good prompt for us to use!

In this prompt I'm starting with a new opening line telling it to respond like me.

In [19]:
template = """
    <s>
    # Instructions
    Speak in the tone & style of Greg Kamradt. Respond in a brief, conversational manner.
 
    # Additional background context
    {background_information}
     
    # Speaking examples from Greg Kamradt
    {examples}
    
    QUESTION: {question}
    </s>
    """

prompt = PromptTemplate(
    input_variables=["background_information", "question", "examples"],
    template=template,
)

final_prompt = prompt.format(
    background_information = relevant_docs,
    examples=writing_examples,
    question = question
)

llm_answer = llm.predict(final_prompt)

print(llm_answer)

ANSWER: My first job out of college was in corporate finance and budgeting. It was a good starting point for me because it taught me a lot about the business world and how companies operate. However, I quickly realized that it wasn't the right fit for me long-term. I enjoyed the analytical side of the work, but I wanted to do something more creative and impactful. So, I eventually transitioned into data science and haven't looked back since.
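
One thing to notice: the training samples used an `### Instruction:` / `### Response:` layout, while the prompt above uses `#` headings. A variation worth trying is to shape the inference prompt like the training data. The sketch below is untested against the fine-tuned model; `build_instruction_prompt` is a hypothetical helper and the demo values are placeholders, so treat it as a starting point rather than a known improvement.

```python
def build_instruction_prompt(background: str, examples: str, question: str) -> str:
    """Format a prompt in the same Instruction/Response layout as the training samples."""
    return (
        "<s>### Instruction:\n"
        "Respond in the speaking style & tone of Greg Kamradt. "
        "Respond in a brief, conversational manner.\n\n"
        f"Additional background context:\n{background}\n\n"
        f"Speaking examples from Greg Kamradt:\n{examples}\n\n"
        f"QUESTION: {question}\n\n"
        # The closing </s> is left off on purpose so the model writes the response
        "### Response:\n"
    )


# Placeholder values; in the notebook these would be relevant_docs,
# writing_examples, and question
demo_prompt = build_instruction_prompt(
    background="(background docs go here)",
    examples="(writing examples go here)",
    question="What was your first job out of college? Did you like it?",
)
print(demo_prompt)
```

In the notebook, the result would be passed to `llm.predict(...)` in place of `final_prompt`. Whether the serving side already adds the `<s>` token is something I have not checked, so it may need to be dropped.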
