# Fine-tuning GPT-4o-mini to Write LinkedIn Posts (in my style)
## ABB #2 - Session 5

Code authored by: Shaw Talebi

### imports

In [1]:
import os
import csv
import json
import random

from openai import OpenAI
from dotenv import load_dotenv

In [2]:
# import sk from .env file
load_dotenv()
my_sk = os.getenv("OPENAI_API_KEY")

# connect to openai API
client = OpenAI(api_key=my_sk)

### Read data

In [3]:
# load csv of LinkedIn posts (idea, post copy, media type)
idea_list = []
copy_list = []
media_list = []

with open('data/LI_posts.csv', mode='r', newline='') as file:
    reader = csv.reader(file)

    # read file row by row
    for line in reader:
        # skip header row
        if line[0]=='Idea':
            continue

        # append idea, post copy, and media type to respective lists
        idea_list.append(line[0])
        copy_list.append(line[1])
        media_list.append(line[2])

In [4]:
print(len(idea_list))
print(len(copy_list))
print(len(media_list))

50
50
50


### Create training examples

In [5]:
# construct training examples
example_list = []

system_prompt = "LinkedIn Post Writer for Shaw Talebi, AI educator and entrepreneur"

prompt_template = lambda idea_string : f"""Write a LinkedIn post based on the following idea:
{idea_string}

Include:
- A compelling opening line that hooks the reader
- Copy that expands upon the idea in valuable way
- A call to action or share relevant content

Output:
"""

for i in range(len(idea_list)):    
    system_dict = {"role": "system", "content": system_prompt}
    user_dict = {"role": "user", "content": prompt_template(idea_list[i])}
    assistant_dict = {"role": "assistant", "content": copy_list[i] + "\n\n--\nMedia: " + media_list[i]}
    
    messages_list = [system_dict, user_dict, assistant_dict]
    
    example_list.append({"messages": messages_list})

In [6]:
print(example_list[0]['messages'][0]['content'])
print(example_list[0]['messages'][1]['content'])
print(example_list[0]['messages'][2]['content'])

LinkedIn Post Writer for Shaw Talebi, AI educator and entrepreneur
Write a LinkedIn post based on the following idea:
3 types of AI Tik Tok

Include:
- A compelling opening line that hooks the reader
- Copy that expands upon the idea in valuable way
- A call to action or share relevant content

Output:

A problem with AI today is that it means different things to different people. 

This framework from Andrej Karpathy helped give me much more clarity 👇 

Software 1.0 = Rule-based software systems. Humans program computers to solve problems step-by-step. 

Software 2.0 = Computers program themselves by seeing examples (i.e. machine learning) 

Software 3.0 = Repurposing general-purpose ML models for specific use cases (i.e. GenAI + Foundation Models) 

But… what’s Software 4.0 going to be? 🤔

--
Media: Video


In [7]:
len(example_list)

50

### Create train/validation split

In [8]:
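# randomly pick out validation examples
num_examples = 10
# range() already excludes its end value; subtracting 1 would keep the last example out of the validation set
validation_index_list = random.sample(range(len(example_list)), num_examples)
validation_data_list = [example_list[index] for index in validation_index_list]

# keep the remaining examples for training (filter by index so duplicate examples are handled correctly)
validation_index_set = set(validation_index_list)
example_list = [example for index, example in enumerate(example_list) if index not in validation_index_set]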

In [9]:
print(len(example_list))
print(len(validation_data_list))

40
10
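
The split above draws a different validation set on every run. A minimal sketch of a reproducible variant, assuming a fixed seed is acceptable here; the helper name `split_examples`, its `seed` parameter, and the placeholder data are hypothetical and not part of the original notebook.

```python
import random

def split_examples(examples, num_validation, seed=0):
    """Return (train, validation) lists chosen with a seeded random generator.

    Hypothetical helper, not from the original notebook.
    """
    rng = random.Random(seed)  # local generator, leaves the global random state untouched
    validation_indices = set(rng.sample(range(len(examples)), num_validation))
    train = [ex for i, ex in enumerate(examples) if i not in validation_indices]
    validation = [ex for i, ex in enumerate(examples) if i in validation_indices]
    return train, validation

# placeholder data standing in for example_list
dummy_examples = [{"messages": [{"role": "user", "content": f"idea {i}"}]} for i in range(50)]
train, validation = split_examples(dummy_examples, num_validation=10, seed=42)
print(len(train), len(validation))
```

Calling the helper again with the same seed returns the same split, which makes it easier to compare fine-tuning runs against one another.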


In [10]:
# write examples to file
with open('data/train-data.jsonl', 'w') as train_file:
    for example in example_list:
        json.dump(example, train_file)
        train_file.write('\n')

with open('data/valid-data.jsonl', 'w') as valid_file:
    for example in validation_data_list:
        json.dump(example, valid_file)
        valid_file.write('\n')
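
Before uploading, it can help to confirm that each line parses as JSON and follows the chat format built above (a `messages` list of `role`/`content` dicts ending with an assistant turn). A rough sketch; `check_chat_jsonl` is a hypothetical helper, and these checks are only a subset of whatever validation OpenAI runs on its side.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}  # roles used in this notebook's examples

def check_chat_jsonl(lines):
    """Return a list of (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    for line_number, line in enumerate(lines, start=1):
        try:
            example = json.loads(line)
        except json.JSONDecodeError:
            problems.append((line_number, "not valid JSON"))
            continue
        messages = example.get("messages") if isinstance(example, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((line_number, "missing 'messages' list"))
            continue
        for message in messages:
            if not isinstance(message, dict) or message.get("role") not in ALLOWED_ROLES:
                problems.append((line_number, "unexpected role"))
                break
            if not isinstance(message.get("content"), str):
                problems.append((line_number, "content is not a string"))
                break
        else:
            # only reached when every message passed the checks above
            if messages[-1]["role"] != "assistant":
                problems.append((line_number, "last message is not from the assistant"))
    return problems

# illustrative lines: one well-formed example and two broken ones
sample_lines = [
    json.dumps({"messages": [
        {"role": "system", "content": "LinkedIn Post Writer"},
        {"role": "user", "content": "Write a post"},
        {"role": "assistant", "content": "Post text"},
    ]}),
    '{"messages": []}',
    "not json",
]
print(check_chat_jsonl(sample_lines))
```

To check the files written above, pass an open file handle for `data/train-data.jsonl` or `data/valid-data.jsonl` instead of `sample_lines`.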

### Upload data to OpenAI

In [11]:
# upload files (the context managers close the local file handles after upload)
with open("data/train-data.jsonl", "rb") as f:
    train_file = client.files.create(file=f, purpose="fine-tune")

with open("data/valid-data.jsonl", "rb") as f:
    valid_file = client.files.create(file=f, purpose="fine-tune")

### Fine-tune model

In [12]:
client.fine_tuning.jobs.create(
    training_file = train_file.id,
    validation_file = valid_file.id,
    suffix = "LI-post-writer",
    model = "gpt-4o-mini-2024-07-18"
)

FineTuningJob(id='ftjob-8aR9ZAPj2qFvGm5YmgdUfWeS', created_at=1739477333, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(n_epochs='auto', batch_size='auto', learning_rate_multiplier='auto'), model='gpt-4o-mini-2024-07-18', object='fine_tuning.job', organization_id='org-KjWERyZ9WLUqIdrdMeJh4zC0', result_files=[], seed=1688267878, status='validating_files', trained_tokens=None, training_file='file-HDad2PgmnvryckA97pzYR6', validation_file='file-YGEvGdP2rq1Hdwmwp1FNJS', estimated_finish=None, integrations=[], user_provided_suffix='LI-post-writer', method={'type': 'supervised', 'supervised': {'hyperparameters': {'batch_size': 'auto', 'learning_rate_multiplier': 'auto', 'n_epochs': 'auto'}}})
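
The job object is returned while its status is still `validating_files`, so `fine_tuned_model` is `None` at this point. A minimal polling sketch, assuming the `client.fine_tuning.jobs.retrieve` call from the OpenAI Python SDK; the helper `wait_for_job`, its parameters, and the stub client used to demonstrate it offline are hypothetical.

```python
import time
from types import SimpleNamespace

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def wait_for_job(client, job_id, poll_seconds=30, max_polls=240):
    """Poll a fine-tuning job until it reaches a terminal status or max_polls is exhausted."""
    job = client.fine_tuning.jobs.retrieve(job_id)
    for _ in range(max_polls):
        if job.status in TERMINAL_STATUSES:
            break
        time.sleep(poll_seconds)
        job = client.fine_tuning.jobs.retrieve(job_id)
    return job

# Offline stand-in for the real client (hypothetical, for illustration only):
# it replays a fixed sequence of statuses instead of calling the API.
class _StubJobs:
    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def retrieve(self, job_id):
        return SimpleNamespace(id=job_id, status=next(self._statuses), fine_tuned_model=None)

stub_client = SimpleNamespace(
    fine_tuning=SimpleNamespace(jobs=_StubJobs(["validating_files", "running", "succeeded"]))
)
job = wait_for_job(stub_client, "ftjob-example", poll_seconds=0)
print(job.status)
```

With the real `client`, pass the `id` of the job created above; once the status is `succeeded`, `job.fine_tuned_model` holds the model name to use for evaluation.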

### Evaluate fine-tuned model

In [13]:
def generate_post(system_prompt, model_name, idea):
    """Generate a LinkedIn post for the given idea with the specified model."""
    response = client.chat.completions.create(
        model=model_name,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt_template(idea)},
        ],
        temperature=1,
    )
    return response.choices[0].message.content

In [14]:
# alternative idea (commented out because the next line overrides it)
# idea = "Python was hard until I learned these 5 things"
idea = "How I’d Learn AI in 2025 (If I Knew Nothing). Step 1: Use ChatGPT (or the like).Step 2: Install Python.Step 3: Build an Automation (Beginner).Step 4: Build an ML Project (Intermediate).Step 5: Build a Real-world Project (Advanced)Promote blog."

In [15]:
# GPT-4o (no fine-tuning)
model_name = "gpt-4o"
system_prompt_long = "You are an AI assistant helping Shaw Talebi, an AI educator and entrepreneur, craft LinkedIn posts. Your goal is to generate posts \
that reflect Shaw Talebi's voice: authoritative yet approachable, insightful yet concise. Shaw Talebi's posts aim to educate and inspire professionals \
in the tech and AI space. Focus on providing value, discussing new trends, or offering actionable advice, while keeping the tone professional but \
conversational. The target audience includes entrepreneurs, tech professionals, and decision-makers in AI and data science. Always ensure the post is \
relevant, engaging, and on-brand for Shaw Talebi's public persona."

# print(system_prompt_long, "\n--")
print(generate_post(system_prompt_long, model_name, idea))

🚀 Diving Into AI in 2025: My Beginner’s Blueprint to Mastery

Imagine understanding AI not just as a concept, but as a craft. If I were starting fresh with AI in 2025, I'd take a step-by-step journey to transform from a novice to a practitioner. Here's my roadmap:

1️⃣ **Start with Conversational AI**: Embrace tools like ChatGPT as your learning companions. These models offer accessible insights into language processing and problem-solving, allowing you to build intuition from the ground up.

2️⃣ **Set the Foundation with Python**: It's the lingua franca of AI. Install Python and explore the basics. Play around with data processing libraries like Pandas and NumPy—they're your first steps into the mechanical wonders of AI.

3️⃣ **Automate the Mundane**: Start small with automation projects. Streamline tasks in your daily workflow using Python scripts. It’s a gratifying way to see immediate results while solidifying your coding skills.

4️⃣ **Step Up to Machine Learning**: Dive into crea

In [16]:
# GPT-4o-mini (fine-tuned)
model_name = "ft:gpt-4o-mini-2024-07-18:shawhin-talebi-ventures-llc:li-post-writer:Adk6A5Pd"

# print(system_prompt, "\n--")
print(generate_post(system_prompt, model_name, idea))

The world of AI has changed drastically in the past 6 months. 

This has made it difficult for beginners to know how to learn AI.

Fortunately, I believe that by 2025, there will be a clear learning path for beginners, i.e. a step-by-step guide... 

Here's what I think that would look like 👇 

Step 1: Use ChatGPT (or the like) 

Step 2: Install Python 

Step 3: Build an Automation (Beginner) 

Step 4: Build a ML Project (Intermediate) 

Step 5: Build a Real-world Project (Advanced) 

What's missing from this learning path? 

👉 Check out my blog (link in comments)


In [17]:
# # delete files (after fine-tuning is done)
# client.files.delete(train_file.id)
# client.files.delete(valid_file.id)